Science.gov

Sample records for concatenative speech synthesis

  1. Text-to-speech from concatenation of articulatory units derived from natural speech

    NASA Astrophysics Data System (ADS)

    Sinder, Daniel J.; Sondhi, M. Mohan

    2003-04-01

    It has been conjectured that articulatory synthesis possesses the greatest potential for generating high quality synthetic speech. However, for text-to-speech (TTS), waveform concatenation techniques have proven more practical due in part to the challenge of generating appropriate trajectories of articulatory parameters. A waveform generation method for TTS that combines the practical success of concatenative methods with the quality potential of articulatory synthesis is under development. The system concatenates articulatory units derived from natural speech using an articulatory voice mimic. The mimic estimates articulatory parameters by minimizing a cost function that includes a spectral distance between natural and synthetic speech and a geometric distance that penalizes rapid or discontinuous changes in articulator positions. A database of articulatory trajectories representing phonetic units is constructed from the estimated parameters. For TTS, phonetic units generated by text analysis are used to select the corresponding articulatory units from the database. Duration modification, concatenation, and smoothing across units are performed in the articulatory domain resulting in a single articulatory trajectory for the complete utterance. Speech is synthesized from the trajectory using a two mass model for voicing, achieving a high degree of acoustic continuity across unit boundaries while also allowing for source-tract interaction.

  2. Speech Synthesis

    NASA Astrophysics Data System (ADS)

    Dutoit, Thierry; Bozkurt, Baris

    Text-to-speech (TTS) synthesis is the art of designing talking machines. It is often seen by engineers as an easy task, compared to speech recognition.1 It is true, indeed, that it is easier to create a bad, first trial text-to-speech (TTS) system than to design a rudimentary speech recognizer.

  3. Models of speech synthesis.

    PubMed

    Carlson, R

    1995-10-24

    The term "speech synthesis" has been used for diverse technical approaches. In this paper, some of the approaches used to generate synthetic speech in a text-to-speech system are reviewed, and some of the basic motivations for choosing one method over another are discussed. It is important to keep in mind, however, that speech synthesis models are needed not just for speech generation but to help us understand how speech is created, or even how articulation can explain language structure. General issues such as the synthesis of different voices, accents, and multiple languages are discussed as special challenges facing the speech synthesis community. PMID:7479805

  4. Models of speech synthesis.

    PubMed Central

    Carlson, R

    1995-01-01

    The term "speech synthesis" has been used for diverse technical approaches. In this paper, some of the approaches used to generate synthetic speech in a text-to-speech system are reviewed, and some of the basic motivations for choosing one method over another are discussed. It is important to keep in mind, however, that speech synthesis models are needed not just for speech generation but to help us understand how speech is created, or even how articulation can explain language structure. General issues such as the synthesis of different voices, accents, and multiple languages are discussed as special challenges facing the speech synthesis community. PMID:7479805

  5. Linguistic aspects of speech synthesis.

    PubMed

    Allen, J

    1995-10-24

    The conversion of text to speech is seen as an analysis of the input text to obtain a common underlying linguistic description, followed by a synthesis of the output speech waveform from this fundamental specification. Hence, the comprehensive linguistic structure serving as the substrate for an utterance must be discovered by analysis from the text. The pronunciation of individual words in unrestricted text is determined by morphological analysis or letter-to-sound conversion, followed by specification of the word-level stress contour. In addition, many text character strings, such as titles, numbers, and acronyms, are abbreviations for normal words, which must be derived. To further refine these pronunciations and to discover the prosodic structure of the utterance, word part of speech must be computed, followed by a phrase-level parsing. From this structure the prosodic structure of the utterance can be determined, which is needed in order to specify the durational framework and fundamental frequency contour of the utterance. In discourse contexts, several factors such as the specification of new and old information, contrast, and pronominal reference can be used to further modify the prosodic specification. When the prosodic correlates have been computed and the segmental sequence is assembled, a complete input suitable for speech synthesis has been determined. Lastly, multilingual systems utilizing rule frameworks are mentioned, and future directions are characterized. PMID:7479807

  6. Speech Synthesis Applied to Language Teaching.

    ERIC Educational Resources Information Center

    Sherwood, Bruce

    1981-01-01

    The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…

  7. Computer speech synthesis: its status and prospects.

    PubMed Central

    Liberman, M

    1995-01-01

    Computer speech synthesis has reached a high level of performance, with increasingly sophisticated models of linguistic structure, low error rates in text analysis, and high intelligibility in synthesis from phonemic input. Mass market applications are beginning to appear. However, the results are still not good enough for the ubiquitous application that such technology will eventually have. A number of alternative directions of current research aim at the ultimate goal of fully natural synthetic speech. One especially promising trend is the systematic optimization of large synthesis systems with respect to formal criteria of evaluation. Speech recognition has progressed rapidly in the past decade through such approaches, and it seems likely that their application in synthesis will produce similar improvements. PMID:7479804

  8. Preparation of sound base for a text-to-speech synthesis system

    NASA Astrophysics Data System (ADS)

    Degtyarev, Vladimir M.; Gusev, Mikhail N.

    2005-04-01

    We are giving several recommendations for the choice of parameters of the sound fragments in this report. The sound fragments are components of the sound base, used in Russian speech synthesis system by a text. It isn't the secret that quality of concatenation synthesis in many respects is defined at the stage of a speaker choice and preparation of base of speaker's voice samples. Formulated recommendations are received on the basis of the statistic analysis of big amount of various types of texts and concern both separate sound fragments and their groups. Parameters of sounds were taken with the help of the automatic linguistic processor including phonetic and prosodic transcriptors. The duration, intensity and main pitch frequency of sounds in various contexts and intonational contours were analyzed. The sound base produced according to the worked out recommendations, allows to make better intelligibility and naturalness of synthetic speech due to minimization of changes of speaker's voice samples.

  9. Segmental intelligibility of four currently used text-to-speech synthesis methods

    NASA Astrophysics Data System (ADS)

    Venkatagiri, Horabail S.

    2003-04-01

    The study investigated the segmental intelligibility of four currently available text-to-speech (TTS) products under 0-dB and 5-dB signal-to-noise ratios. The products were IBM ViaVoice™ version 5.1, which uses formant coding, Festival version 1.4.2, a diphone-based LPC TTS product, AT&T Next-Gen™, a half-phone-based TTS product that uses harmonic-plus-noise method for synthesis, and FlexVoice™2, a hybrid TTS product that combines concatenative and formant coding techniques. Overall, concatenative techniques were more intelligible than formant or hybrid techniques, with formant coding slightly better at modeling vowels and concatenative techniques marginally better at synthesizing consonants. No TTS product was better at resisting noise interference than others, although all were more intelligible at 5 dB than at 0-dB SNR. The better TTS products in this study were, on the average, 22% less intelligible and had about 3 times more phoneme errors than human voice under comparable listening conditions. The hybrid TTS technology of FlexVoice had the lowest intelligibility and highest error rates. There were discernible patterns of errors for stops, fricatives, and nasals. Unrestricted TTS output-e-mail messages, news reports, and so on-under high noise conditions prevalent in automobiles, airports, etc. will likely challenge the listeners.

  10. Segmental intelligibility of four currently used text-to-speech synthesis methods.

    PubMed

    Venkatagiri, Horabail S

    2003-04-01

    The study investigated the segmental intelligibility of four currently available text-to-speech (TTS) products under 0-dB and 5-dB signal-to-noise ratios. The products were IBM ViaVoice version 5.1, which uses formant coding, Festival version 1.4.2, a diphone-based LPC TTS product, AT&T Next-Gen, a half-phone-based TTS product that uses harmonic-plus-noise method for synthesis, and FlexVoice2, a hybrid TTS product that combines concatenative and formant coding techniques. Overall, concatenative techniques were more intelligible than formant or hybrid techniques, with formant coding slightly better at modeling vowels and concatenative techniques marginally better at synthesizing consonants. No TTS product was better at resisting noise interference than others, although all were more intelligible at 5 dB than at 0-dB SNR. The better TTS products in this study were, on the average, 22% less intelligible and had about 3 times more phoneme errors than human voice under comparable listening conditions. The hybrid TTS technology of FlexVoice had the lowest intelligibility and highest error rates. There were discernible patterns of errors for stops, fricatives, and nasals. Unrestricted TTS output--e-mail messages, news reports, and so on--under high noise conditions prevalent in automobiles, airports, etc. will likely challenge the listeners. PMID:12703720

  11. Segmental intelligibility of four currently used text-to-speech synthesis methods.

    PubMed

    Venkatagiri, Horabail S

    2003-04-01

    The study investigated the segmental intelligibility of four currently available text-to-speech (TTS) products under 0-dB and 5-dB signal-to-noise ratios. The products were IBM ViaVoice version 5.1, which uses formant coding, Festival version 1.4.2, a diphone-based LPC TTS product, AT&T Next-Gen, a half-phone-based TTS product that uses harmonic-plus-noise method for synthesis, and FlexVoice2, a hybrid TTS product that combines concatenative and formant coding techniques. Overall, concatenative techniques were more intelligible than formant or hybrid techniques, with formant coding slightly better at modeling vowels and concatenative techniques marginally better at synthesizing consonants. No TTS product was better at resisting noise interference than others, although all were more intelligible at 5 dB than at 0-dB SNR. The better TTS products in this study were, on the average, 22% less intelligible and had about 3 times more phoneme errors than human voice under comparable listening conditions. The hybrid TTS technology of FlexVoice had the lowest intelligibility and highest error rates. There were discernible patterns of errors for stops, fricatives, and nasals. Unrestricted TTS output--e-mail messages, news reports, and so on--under high noise conditions prevalent in automobiles, airports, etc. will likely challenge the listeners.

  12. Expressive facial animation synthesis by learning speech coarticulation and expression spaces.

    PubMed

    Deng, Zhigang; Neumann, Ulrich; Lewis, J P; Kim, Tae-Yong; Bulut, Murtaza; Narayanan, Shrikanth

    2006-01-01

    Synthesizing expressive facial animation is a very challenging topic within the graphics community. In this paper, we present an expressive facial animation synthesis system enabled by automated learning from facial motion capture data. Accurate 3D motions of the markers on the face of a human subject are captured while he/she recites a predesigned corpus, with specific spoken and visual expressions. We present a novel motion capture mining technique that "learns" speech coarticulation models for diphones and triphones from the recorded data. A Phoneme-Independent Expression Eigenspace (PIEES) that encloses the dynamic expression signals is constructed by motion signal processing (phoneme-based time-warping and subtraction) and Principal Component Analysis (PCA) reduction. New expressive facial animations are synthesized as follows: First, the learned coarticulation models are concatenated to synthesize neutral visual speech according to novel speech input, then a texture-synthesis-based approach is used to generate a novel dynamic expression signal from the PIEES model, and finally the synthesized expression signal is blended with the synthesized neutral visual speech to create the final expressive facial animation. Our experiments demonstrate that the system can effectively synthesize realistic expressive facial animation.

  13. Vocoders and Speech Perception: Uses of Computer-Based Speech Analysis-Synthesis in Stimulus Generation.

    ERIC Educational Resources Information Center

    Tierney, Joseph; Mack, Molly

    1987-01-01

    Stimuli used in research on the perception of the speech signal have often been obtained from simple filtering and distortion of the speech waveform, sometimes accompanied by noise. However, for more complex stimulus generation, the parameters of speech can be manipulated, after analysis and before synthesis, using various types of algorithms to…

  14. Infants' brain responses to speech suggest analysis by synthesis.

    PubMed

    Kuhl, Patricia K; Ramírez, Rey R; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

    2014-08-01

    Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners' knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca's area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of "motherese" on early language learning, and (iii) the "social-gating" hypothesis and humans' development of social understanding.

  15. Speech synthesis using an aeroacoustic fricative model

    NASA Astrophysics Data System (ADS)

    Sinder, Daniel Jared

    isolation and in a vowel context. The results show the strong potential for this approach to produce high quality unvoiced speech without the need to estimate source strength, spectra, or location for different vocal tract geometries. That is, the synthesis of unvoiced sounds is gained automatically from the articulatory description. (Abstract shortened by UMI.)

  16. Voice Quality Modelling for Expressive Speech Synthesis

    PubMed Central

    Socoró, Joan Claudi

    2014-01-01

    This paper presents the perceptual experiments that were carried out in order to validate the methodology of transforming expressive speech styles using voice quality (VoQ) parameters modelling, along with the well-known prosody (F0, duration, and energy), from a neutral style into a number of expressive ones. The main goal was to validate the usefulness of VoQ in the enhancement of expressive synthetic speech in terms of speech quality and style identification. A harmonic plus noise model (HNM) was used to modify VoQ and prosodic parameters that were extracted from an expressive speech corpus. Perception test results indicated the improvement of obtained expressive speech styles using VoQ modelling along with prosodic characteristics. PMID:24587738

  17. Establishing a Methodology for Benchmarking Speech Synthesis for Computer-Assisted Language Learning (CALL)

    ERIC Educational Resources Information Center

    Handley, Zoe; Hamel, Marie-Josee

    2005-01-01

    Despite the new possibilities that speech synthesis brings about, few Computer-Assisted Language Learning (CALL) applications integrating speech synthesis have found their way onto the market. One potential reason is that the suitability and benefits of the use of speech synthesis in CALL have not been proven. One way to do this is through…

  18. Speech, Speech!

    ERIC Educational Resources Information Center

    McComb, Gordon

    1982-01-01

    Discussion focuses on the nature of computer-generated speech and voice synthesis today. State-of-the-art devices for home computers are called text-to-speech (TTS) systems. Details about the operation and use of TTS synthesizers are provided, and the time saving in programing over previous methods is emphasized. (MP)

  19. Towards personalized speech synthesis for augmentative and alternative communication.

    PubMed

    Mills, Timothy; Bunnell, H Timothy; Patel, Rupal

    2014-09-01

    Text-to-speech options on augmentative and alternative communication (AAC) devices are limited. Often, several individuals in a group setting use the same synthetic voice. This lack of customization may limit technology adoption and social integration. This paper describes our efforts to generate personalized synthesis for users with profoundly limited speech motor control. Existing voice banking and voice conversion techniques rely on recordings of clearly articulated speech from the target talker, which cannot be obtained from this population. Our VocaliD approach extracts prosodic properties from the target talker's source function and applies these features to a surrogate talker's database, generating a synthetic voice with the vocal identity of the target talker and the clarity of the surrogate talker. Promising intelligibility results suggest areas of further development for improved personalization. PMID:25025818

  20. Alternative Speech Communication System for Persons with Severe Speech Disorders

    NASA Astrophysics Data System (ADS)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.

  1. Modeling Consonant-Vowel Coarticulation for Articulatory Speech Synthesis

    PubMed Central

    Birkholz, Peter

    2013-01-01

    A central challenge for articulatory speech synthesis is the simulation of realistic articulatory movements, which is critical for the generation of highly natural and intelligible speech. This includes modeling coarticulation, i.e., the context-dependent variation of the articulatory and acoustic realization of phonemes, especially of consonants. Here we propose a method to simulate the context-sensitive articulation of consonants in consonant-vowel syllables. To achieve this, the vocal tract target shape of a consonant in the context of a given vowel is derived as the weighted average of three measured and acoustically-optimized reference vocal tract shapes for that consonant in the context of the corner vowels /a/, /i/, and /u/. The weights are determined by mapping the target shape of the given context vowel into the vowel subspace spanned by the corner vowels. The model was applied for the synthesis of consonant-vowel syllables with the consonants /b/, /d/, /g/, /l/, /r/, /m/, /n/ in all combinations with the eight long German vowels. In a perception test, the mean recognition rate for the consonants in the isolated syllables was 82.4%. This demonstrates the potential of the approach for highly intelligible articulatory speech synthesis. PMID:23613734

  2. Speech synthesis with pitch modification using harmonic plus noise model

    NASA Astrophysics Data System (ADS)

    Lehana, Parveen K.; Pandey, Prem C.

    2003-10-01

    In harmonic plus noise model (HNM) based speech synthesis, the input signal is modeled as two parts: the harmonic part using amplitudes and phases of the harmonics of the fundamental and the noise part using an all-pole filter excited by random white Gaussian noise. This method requires relatively less number of parameters and computations, provides good quality output, and permits pitch and time scaling without explicit estimation of vocal tract parameters. Pitch scaling to synthesize the speech with interpolated original amplitudes and phases at the multiples of the scaled pitch frequency results in an unnatural quality. Our investigation for obtaining natural quality output showed that the frequency scale of the amplitudes and phases of the harmonics of the original signal needed to be modified by a speaker dependent warping function. The function was obtained by studying the relationship between pitch frequency and formant frequencies for the three cardinal vowels naturally occurring with different pitches in a passage with intonation. Listening tests showed that good quality speech was obtained by linear frequency scaling of the amplitude and phase spectra, by the same factor as the pitch-scaling.

  3. Design and performance of an analysis-by-synthesis class of predictive speech coders

    NASA Technical Reports Server (NTRS)

    Rose, Richard C.; Barnwell, Thomas P., III

    1990-01-01

    The performance of a broad class of analysis-by-synthesis linear predictive speech coders is quantified experimentally. The class of coders includes a number of well-known techniques as well as a very large number of speech coders which have not been named or studied. A general formulation for deriving the parametric representation used in all of the coders in the class is presented. A new coder, named the self-excited vocoder, is discussed because of its good performance with low complexity, and because of the insight this coder gives to analysis-by-synthesis coders in general. The results of a study comparing the performances of different members of this class are presented. The study takes the form of a series of formal subjective and objective speech quality tests performed on selected coders. The results of this study lead to some interesting and important observations concerning the controlling parameters for analysis-by-synthesis speech coders.

  4. The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006

    NASA Astrophysics Data System (ADS)

    Zen, Heiga; Toda, Tomoki; Tokuda, Keiichi

    We describe a statistical parametric speech synthesis system developed by a joint group from the Nagoya Institute of Technology (Nitech) and the Nara Institute of Science and Technology (NAIST) for the annual open evaluation of text-to-speech synthesis systems named Blizzard Challenge 2006. To improve our 2005 system (Nitech-HTS 2005), we investigated new features such as mel-generalized cepstrum-based line spectral pairs (MGC-LSPs), maximum likelihood linear transform (MLLT), and a full covariance global variance (GV) probability density function (pdf). A combination of mel-cepstral coefficients, MLLT, and full covariance GV pdf scored highest in subjective listening tests, and the 2006 system performed significantly better than the 2005 system. The Blizzard Challenge 2006 evaluations show that Nitech-NAIST-HTS 2006 is competitive even when working with relatively large speech databases.

  5. Application of speech recognition and synthesis in the general aviation cockpit

    NASA Technical Reports Server (NTRS)

    North, R. A.; Mountford, S. J.; Bergeron, H.

    1984-01-01

    Interactive speech recognition/synthesis technology is assessed as a method for the aleviation of single-pilot IFR flight workloads. Attention was given during this series of evaluations to the conditions typical of general aviation twin-engine aircrft cockpits, covering several commonly encountered IFR flight condition scenarios. The most beneficial speech command tasks are noted to be in the data retrieval domain, which would allow the pilot access to uplinked data, checklists, and performance charts. Data entry tasks also appear to benefit from this technology.

  6. Review of text-to-speech conversion for English.

    PubMed

    Klatt, D H

    1987-09-01

    The automatic conversion of English text to synthetic speech is presently being performed, remarkably well, by a number of laboratory systems and commercial devices. Progress in this area has been made possible by advances in linguistic theory, acoustic-phonetic characterization of English sound patterns, perceptual psychology, mathematical modeling of speech production, structured programming, and computer hardware design. This review traces the early work on the development of speech synthesizers, discovery of minimal acoustic cues for phonetic contrasts, evolution of phonemic rule programs, incorporation of prosodic rules, and formulation of techniques for text analysis. Examples of rules are used liberally to illustrate the state of the art. Many of the examples are taken from Klattalk, a text-to-speech system developed by the author. A number of scientific problems are identified that prevent current systems from achieving the goal of completely human-sounding speech. While the emphasis is on rule programs that drive a format synthesizer, alternatives such as articulatory synthesis and waveform concatenation are also reviewed. An extensive bibliography has been assembled to show both the breadth of synthesis activity and the wealth of phenomena covered by rules in the best of these programs. A recording of selected examples of the historical development of synthetic speech, enclosed as a 33 1/3-rpm record, is described in the Appendix. PMID:2958525

  7. Synthesis of Speaker Facial Movement to Match Selected Speech Sequences

    NASA Technical Reports Server (NTRS)

    Scott, K. C.; Kagels, D. S.; Watson, S. H.; Rom, H.; Wright, J. R.; Lee, M.; Hussey, K. J.

    1994-01-01

    A system is described which allows for the synthesis of a video sequence of a realistic-appearing talking human head. A phonic based approach is used to describe facial motion; image processing rather than physical modeling techniques are used to create video frames.

  8. Thresholds for Universal Concatenated Quantum Codes.

    PubMed

    Chamberland, Christopher; Jochym-O'Connor, Tomas; Laflamme, Raymond

    2016-07-01

    Quantum error correction and fault tolerance make it possible to perform quantum computations in the presence of imprecision and imperfections of realistic devices. An important question is to find the noise rate at which errors can be arbitrarily suppressed. By concatenating the 7-qubit Steane and 15-qubit Reed-Muller codes, the 105-qubit code enables a universal set of fault-tolerant gates despite not all of them being transversal. Importantly, the cnot gate remains transversal in both codes, and as such has increased error protection relative to the other single qubit logical gates. We show that while the level-1 pseudothreshold for the concatenated scheme is limited by the logical Hadamard gate, the error suppression of the logical cnot gates allows for the asymptotic threshold to increase by orders of magnitude at higher levels. We establish a lower bound of 1.28×10^{-3} for the asymptotic threshold of this code, which is competitive with known concatenated models and does not rely on ancillary magic state preparation for universal computation. PMID:27419549

  9. Thresholds for Universal Concatenated Quantum Codes

    NASA Astrophysics Data System (ADS)

    Chamberland, Christopher; Jochym-O'Connor, Tomas; Laflamme, Raymond

    2016-07-01

    Quantum error correction and fault tolerance make it possible to perform quantum computations in the presence of imprecision and imperfections of realistic devices. An important question is to find the noise rate at which errors can be arbitrarily suppressed. By concatenating the 7-qubit Steane and 15-qubit Reed-Muller codes, the 105-qubit code enables a universal set of fault-tolerant gates despite not all of them being transversal. Importantly, the cnot gate remains transversal in both codes, and as such has increased error protection relative to the other single qubit logical gates. We show that while the level-1 pseudothreshold for the concatenated scheme is limited by the logical Hadamard gate, the error suppression of the logical cnot gates allows for the asymptotic threshold to increase by orders of magnitude at higher levels. We establish a lower bound of 1.28 ×10-3 for the asymptotic threshold of this code, which is competitive with known concatenated models and does not rely on ancillary magic state preparation for universal computation.

  10. Soft context clustering for F0 modeling in HMM-based speech synthesis

    NASA Astrophysics Data System (ADS)

    Khorram, Soheil; Sameti, Hossein; King, Simon

    2015-12-01

    This paper proposes the use of a new binary decision tree, which we call a soft decision tree, to improve generalization performance compared to the conventional `hard' decision tree method that is used to cluster context-dependent model parameters in statistical parametric speech synthesis. We apply the method to improve the modeling of fundamental frequency, which is an important factor in synthesizing natural-sounding high-quality speech. Conventionally, hard decision tree-clustered hidden Markov models (HMMs) are used, in which each model parameter is assigned to a single leaf node. However, this `divide-and-conquer' approach leads to data sparsity, with the consequence that it suffers from poor generalization, meaning that it is unable to accurately predict parameters for models of unseen contexts: the hard decision tree is a weak function approximator. To alleviate this, we propose the soft decision tree, which is a binary decision tree with soft decisions at the internal nodes. In this soft clustering method, internal nodes select both their children with certain membership degrees; therefore, each node can be viewed as a fuzzy set with a context-dependent membership function. The soft decision tree improves model generalization and provides a superior function approximator because it is able to assign each context to several overlapped leaves. In order to use such a soft decision tree to predict the parameters of the HMM output probability distribution, we derive the smoothest (maximum entropy) distribution which captures all partial first-order moments and a global second-order moment of the training samples. Employing such a soft decision tree architecture with maximum entropy distributions, a novel speech synthesis system is trained using maximum likelihood (ML) parameter re-estimation and synthesis is achieved via maximum output probability parameter generation. In addition, a soft decision tree construction algorithm optimizing a log-likelihood measure

  11. An Interactive Concatenated Turbo Coding System

    NASA Technical Reports Server (NTRS)

    Liu, Ye; Tang, Heng; Lin, Shu; Fossorier, Marc

    1999-01-01

    This paper presents a concatenated turbo coding system in which a Reed-Solomon outer code is concatenated with a binary turbo inner code. In the proposed system, the outer code decoder and the inner turbo code decoder interact to achieve both good bit error and frame error performances. The outer code decoder helps the inner turbo code decoder to terminate its decoding iteration while the inner turbo code decoder provides soft-output information to the outer code decoder to carry out a reliability-based soft- decision decoding. In the case that the outer code decoding fails, the outer code decoder instructs the inner code decoder to continue its decoding iterations until the outer code decoding is successful or a preset maximum number of decoding iterations is reached. This interaction between outer and inner code decoders reduces decoding delay. Also presented in the paper are an effective criterion for stopping the iteration process of the inner code decoder and a new reliability-based decoding algorithm for nonbinary codes.

  12. (abstract) Synthesis of Speaker Facial Movements to Match Selected Speech Sequences

    NASA Technical Reports Server (NTRS)

    Scott, Kenneth C.

    1994-01-01

    We are developing a system for synthesizing image sequences the simulate the facial motion of a speaker. To perform this synthesis, we are pursuing two major areas of effort. We are developing the necessary computer graphics technology to synthesize a realistic image sequence of a person speaking selected speech sequences. Next, we are developing a model that expresses the relation between spoken phonemes and face/mouth shape. A subject is video taped speaking an arbitrary text that contains expression of the full list of desired database phonemes. The subject is video taped from the front speaking normally, recording both audio and video detail simultaneously. Using the audio track, we identify the specific video frames on the tape relating to each spoken phoneme. From this range we digitize the video frame which represents the extreme of mouth motion/shape. Thus, we construct a database of images of face/mouth shape related to spoken phonemes. A selected audio speech sequence is recorded which is the basis for synthesizing a matching video sequence; the speaker need not be the same as used for constructing the database. The audio sequence is analyzed to determine the spoken phoneme sequence and the relative timing of the enunciation of those phonemes. Synthesizing an image sequence corresponding to the spoken phoneme sequence is accomplished using a graphics technique known as morphing. Image sequence keyframes necessary for this processing are based on the spoken phoneme sequence and timing. We have been successful in synthesizing the facial motion of a native English speaker for a small set of arbitrary speech segments. Our future work will focus on advancement of the face shape/phoneme model and independent control of facial features.

  13. Use of the magnitude estimation technique for assessing the performance of text-to-speech synthesis systems.

    PubMed

    Pavlovic, C V; Rossi, M; Espesser, R

    1990-01-01

    As text-to-speech systems develop, it becomes necessary to compare various solutions and to evaluate whether a change in the synthesis procedure has an effect on the listener's attitude to the system. The possibility of directly scaling intelligibility, naturalness, and user's satisfaction (i.e., acceptability) with the magnitude estimation technique is investigated. A magnitude estimation protocol suitable for this purpose is described. In general, within the limits of the methodological constraints discussed in this paper, the procedure appears to be reliable and valid for quantifying the perceived attributes of synthesized speech. PMID:2137144

  14. Performance Bounds on Two Concatenated, Interleaved Codes

    NASA Technical Reports Server (NTRS)

    Moision, Bruce; Dolinar, Samuel

    2010-01-01

    A method has been developed of computing bounds on the performance of a code comprised of two linear binary codes generated by two encoders serially concatenated through an interleaver. Originally intended for use in evaluating the performances of some codes proposed for deep-space communication links, the method can also be used in evaluating the performances of short-block-length codes in other applications. The method applies, more specifically, to a communication system in which following processes take place: At the transmitter, the original binary information that one seeks to transmit is first processed by an encoder into an outer code (Co) characterized by, among other things, a pair of numbers (n,k), where n (n > k)is the total number of code bits associated with k information bits and n k bits are used for correcting or at least detecting errors. Next, the outer code is processed through either a block or a convolutional interleaver. In the block interleaver, the words of the outer code are processed in blocks of I words. In the convolutional interleaver, the interleaving operation is performed bit-wise in N rows with delays that are multiples of B bits. The output of the interleaver is processed through a second encoder to obtain an inner code (Ci) characterized by (ni,ki). The output of the inner code is transmitted over an additive-white-Gaussian- noise channel characterized by a symbol signal-to-noise ratio (SNR) Es/No and a bit SNR Eb/No. At the receiver, an inner decoder generates estimates of bits. Depending on whether a block or a convolutional interleaver is used at the transmitter, the sequence of estimated bits is processed through a block or a convolutional de-interleaver, respectively, to obtain estimates of code words. Then the estimates of the code words are processed through an outer decoder, which generates estimates of the original information along with flags indicating which estimates are presumed to be correct and which are found to

  15. Advancements in text-to-speech technology and implications for AAC applications

    NASA Astrophysics Data System (ADS)

    Syrdal, Ann K.

    2003-10-01

    Intelligibility was the initial focus in text-to-speech (TTS) research, since it is clearly a necessary condition for the application of the technology. Sufficiently high intelligibility (approximating human speech) has been achieved in the last decade by the better formant-based and concatenative TTS systems. This led to commercially available TTS systems for highly motivated users, particularly the blind and vocally impaired. Some unnatural qualities of TTS were exploited by these users, such as very fast speaking rates and altered pitch ranges for flagging relevant information. Recently, the focus in TTS research has turned to improving naturalness, so that synthetic speech sounds more human and less robotic. Unit selection approaches to concatenative synthesis have dramatically improved TTS quality, although at the cost of larger and more complex systems. This advancement in naturalness has made TTS technology more acceptable to the general public. The vocally impaired appreciate a more natural voice with which to represent themselves when communicating with others. Unit selection TTS does not achieve such high speaking rates as the earlier TTS systems, however, which is a disadvantage to some AAC device users. An important new research emphasis is to improve and increase the range of emotional expressiveness of TTS.

  16. Speech research directions

    SciTech Connect

    Atal, B.S.; Rabiner, L.R.

    1986-09-01

    This paper presents an overview of the current activities in speech research. The authors discuss the state of the art in speech coding, text-to-speech synthesis, speech recognition, and speaker recognition. In the speech coding area, current algorithms perform well at bit rates down to 9.6 kb/s, and the research is directed at bringing the rate for high-quality speech coding down to 2.4 kb/s. In text-to-speech synthesis, what we currently are able to produce is very intelligible but not yet completely natural. Current research aims at providing higher quality and intelligibility to the synthetic speech that these systems produce. Finally, today's systems for speech and speaker recognition provide excellent performance on limited tasks; i.e., limited vocabulary, modest syntax, small talker populations, constrained inputs, etc.

  17. Hardware Implementation of Serially Concatenated PPM Decoder

    NASA Technical Reports Server (NTRS)

    Moision, Bruce; Hamkins, Jon; Barsoum, Maged; Cheng, Michael; Nakashima, Michael

    2009-01-01

    A prototype decoder for a serially concatenated pulse position modulation (SCPPM) code has been implemented in a field-programmable gate array (FPGA). At the time of this reporting, this is the first known hardware SCPPM decoder. The SCPPM coding scheme, conceived for free-space optical communications with both deep-space and terrestrial applications in mind, is an improvement of several dB over the conventional Reed-Solomon PPM scheme. The design of the FPGA SCPPM decoder is based on a turbo decoding algorithm that requires relatively low computational complexity while delivering error-rate performance within approximately 1 dB of channel capacity. The SCPPM encoder consists of an outer convolutional encoder, an interleaver, an accumulator, and an inner modulation encoder (more precisely, a mapping of bits to PPM symbols). Each code is describable by a trellis (a finite directed graph). The SCPPM decoder consists of an inner soft-in-soft-out (SISO) module, a de-interleaver, an outer SISO module, and an interleaver connected in a loop (see figure). Each SISO module applies the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm to compute a-posteriori bit log-likelihood ratios (LLRs) from apriori LLRs by traversing the code trellis in forward and backward directions. The SISO modules iteratively refine the LLRs by passing the estimates between one another much like the working of a turbine engine. Extrinsic information (the difference between the a-posteriori and a-priori LLRs) is exchanged rather than the a-posteriori LLRs to minimize undesired feedback. All computations are performed in the logarithmic domain, wherein multiplications are translated into additions, thereby reducing complexity and sensitivity to fixed-point implementation roundoff errors. To lower the required memory for storing channel likelihood data and the amounts of data transfer between the decoder and the receiver, one can discard the majority of channel likelihoods, using only the remainder in

  18. The Neural Basis of Speech Parsing in Children and Adults

    ERIC Educational Resources Information Center

    McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

    2010-01-01

    Word segmentation, detecting word boundaries in continuous speech, is a fundamental aspect of language learning that can occur solely by the computation of statistical and speech cues. Fifty-four children underwent functional magnetic resonance imaging (fMRI) while listening to three streams of concatenated syllables that contained either high…

  19. Automated recognition of helium speech. Phase I: Investigation of microprocessor based analysis/synthesis system

    NASA Astrophysics Data System (ADS)

    Jelinek, H. J.

    1986-01-01

    This is the Final Report of Electronic Design Associates on its Phase I SBIR project. The purpose of this project is to develop a method for correcting helium speech, as experienced in diver-surface communication. The goal of the Phase I study was to design, prototype, and evaluate a real time helium speech corrector system based upon digital signal processing techniques. The general approach was to develop hardware (an IBM PC board) to digitize helium speech and software (a LAMBDA computer based simulation) to translate the speech. As planned in the study proposal, this initial prototype may now be used to assess expected performance from a self contained real time system which uses an identical algorithm. The Final Report details the work carried out to produce the prototype system. Four major project tasks were: a signal processing scheme for converting helium speech to normal sounding speech was generated. The signal processing scheme was simulated on a general purpose (LAMDA) computer. Actual helium speech was supplied to the simulation and the converted speech was generated. An IBM-PC based 14 bit data Input/Output board was designed and built. A bibliography of references on speech processing was generated.

  20. Speech processing using maximum likelihood continuity mapping

    DOEpatents

    Hogden, John E.

    2000-01-01

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  1. Speech processing using maximum likelihood continuity mapping

    SciTech Connect

    Hogden, J.E.

    2000-04-18

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  2. A Comparison of Speech Synthesis Intelligibility with Listeners from Three Age Groups.

    ERIC Educational Resources Information Center

    Mirenda, Pat; Beukelman, David R.

    1987-01-01

    The study evaluated the intelligibility of three different types of speech synthesizers (Echo II+, Votrax Personal Speech System, and DECtalk) with children at two different age groups and adults. Results are reported in terms of their educational application with communication disordered persons. (Author/DB)

  3. Prosody Production and Perception with Conversational Speech

    ERIC Educational Resources Information Center

    Mo, Yoonsook

    2010-01-01

    Speech utterances are more than the linear concatenation of individual phonemes or words. They are organized by prosodic structures comprising phonological units of different sizes (e.g., syllable, foot, word, and phrase) and the prominence relations among them. As the linguistic structure of spoken languages, prosody serves an important function…

  4. Building a Prototype Text to Speech for Sanskrit

    NASA Astrophysics Data System (ADS)

    Mahananda, Baiju; Raju, C. M. S.; Patil, Ramalinga Reddy; Jha, Narayana; Varakhedi, Shrinivasa; Kishore, Prahallad

    This paper describes about the work done in building a prototype text to speech system for Sanskrit. A basic prototype text-to-speech is built using a simplified Sanskrit phone set, and employing a unit selection technique, where prerecorded sub-word units are concatenated to synthesize a sentence. We also discuss the issues involved in building a full-fledged text-to-speech for Sanskrit.

  5. Multilevel Analysis in Analyzing Speech Data

    ERIC Educational Resources Information Center

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  6. Shor-Preskill-type security proof for concatenated Bennett-Brassard 1984 quantum-key-distribution protocol

    SciTech Connect

    Hwang, Won-Young; Matsumoto, Keiji; Imai, Hiroshi; Kim, Jaewan; Lee, Hai-Woong

    2003-02-01

    We discuss a long code problem in the Bennett-Brassard 1984 (BB84) quantum-key-distribution protocol and describe how it can be overcome by concatenation of the protocol. Observing that concatenated modified Lo-Chau protocol finally reduces to the concatenated BB84 protocol, we give the unconditional security of the concatenated BB84 protocol.

  7. Designing robust unitary gates: Application to concatenated composite pulses

    SciTech Connect

    Ichikawa, Tsubasa; Bando, Masamitsu; Kondo, Yasushi; Nakahara, Mikio

    2011-12-15

    We propose a simple formalism to design unitary gates robust against given systematic errors. This formalism generalizes our previous observation [Y. Kondo and M. Bando, J. Phys. Soc. Jpn. 80, 054002 (2011)] that vanishing dynamical phase in some composite gates is essential to suppress pulse-length errors. By employing our formalism, we derive a composite unitary gate which can be seen as a concatenation of two known composite unitary operations. The obtained unitary gate has high fidelity over a wider range of error strengths compared to existing composite gates.

  8. Concatenation of 'alert' and 'identity' segments in dingoes' alarm calls.

    PubMed

    Déaux, Eloïse C; Allen, Andrew P; Clarke, Jennifer A; Charrier, Isabelle

    2016-01-01

    Multicomponent signals can be formed by the uninterrupted concatenation of multiple call types. One such signal is found in dingoes, Canis familiaris dingo. This stereotyped, multicomponent 'bark-howl' vocalisation is formed by the concatenation of a noisy bark segment and a tonal howl segment. Both segments are structurally similar to bark and howl vocalisations produced independently in other contexts (e.g. intra- and inter-pack communication). Bark-howls are mainly uttered in response to human presence and were hypothesized to serve as alarm calls. We investigated the function of bark-howls and the respective roles of the bark and howl segments. We found that dingoes could discriminate between familiar and unfamiliar howl segments, after having only heard familiar howl vocalisations (i.e. different calls). We propose that howl segments could function as 'identity signals' and allow receivers to modulate their responses according to the caller's characteristics. The bark segment increased receivers' attention levels, providing support for earlier observational claims that barks have an 'alerting' function. Lastly, dingoes were more likely to display vigilance behaviours upon hearing bark-howl vocalisations, lending support to the alarm function hypothesis. Canid vocalisations, such as the dingo bark-howl, may provide a model system to investigate the selective pressures shaping complex communication systems. PMID:27460289

  9. Serial turbo trellis coded modulation using a serially concatenated coder

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush (Inventor); Dolinar, Samuel J. (Inventor); Pollara, Fabrizio (Inventor)

    2011-01-01

    Serial concatenated trellis coded modulation (SCTCM) includes an outer coder, an interleaver, a recursive inner coder and a mapping element. The outer coder receives data to be coded and produces outer coded data. The interleaver permutes the outer coded data to produce interleaved data. The recursive inner coder codes the interleaved data to produce inner coded data. The mapping element maps the inner coded data to a symbol. The recursive inner coder has a structure which facilitates iterative decoding of the symbols at a decoder system. The recursive inner coder and the mapping element are selected to maximize the effective free Euclidean distance of a trellis coded modulator formed from the recursive inner coder and the mapping element. The decoder system includes a demodulation unit, an inner SISO (soft-input soft-output) decoder, a deinterleaver, an outer SISO decoder, and an interleaver.

  10. Multilevel Concatenated Block Modulation Codes for the Frequency Non-selective Rayleigh Fading Channel

    NASA Technical Reports Server (NTRS)

    Lin, Shu; Rhee, Dojun

    1996-01-01

    This paper is concerned with construction of multilevel concatenated block modulation codes using a multi-level concatenation scheme for the frequency non-selective Rayleigh fading channel. In the construction of multilevel concatenated modulation code, block modulation codes are used as the inner codes. Various types of codes (block or convolutional, binary or nonbinary) are being considered as the outer codes. In particular, we focus on the special case for which Reed-Solomon (RS) codes are used as the outer codes. For this special case, a systematic algebraic technique for constructing q-level concatenated block modulation codes is proposed. Codes have been constructed for certain specific values of q and compared with the single-level concatenated block modulation codes using the same inner codes. A multilevel closest coset decoding scheme for these codes is proposed.

  11. Advances in speech processing

    NASA Astrophysics Data System (ADS)

    Ince, A. Nejat

    1992-10-01

    The field of speech processing is undergoing a rapid growth in terms of both performance and applications and this is fueled by the advances being made in the areas of microelectronics, computation, and algorithm design. The use of voice for civil and military communications is discussed considering advantages and disadvantages including the effects of environmental factors such as acoustic and electrical noise and interference and propagation. The structure of the existing NATO communications network and the evolving Integrated Services Digital Network (ISDN) concept are briefly reviewed to show how they meet the present and future requirements. The paper then deals with the fundamental subject of speech coding and compression. Recent advances in techniques and algorithms for speech coding now permit high quality voice reproduction at remarkably low bit rates. The subject of speech synthesis is next treated where the principle objective is to produce natural quality synthetic speech from unrestricted text input. Speech recognition where the ultimate objective is to produce a machine which would understand conversational speech with unrestricted vocabulary, from essentially any talker, is discussed. Algorithms for speech recognition can be characterized broadly as pattern recognition approaches and acoustic phonetic approaches. To date, the greatest degree of success in speech recognition has been obtained using pattern recognition paradigms. It is for this reason that the paper is concerned primarily with this technique.

  12. A Statistical Approach to Automatic Speech Summarization

    NASA Astrophysics Data System (ADS)

    Hori, Chiori; Furui, Sadaoki; Malkin, Rob; Yu, Hua; Waibel, Alex

    2003-12-01

    This paper proposes a statistical approach to automatic speech summarization. In our method, a set of words maximizing a summarization score indicating the appropriateness of summarization is extracted from automatically transcribed speech and then concatenated to create a summary. The extraction process is performed using a dynamic programming (DP) technique based on a target compression ratio. In this paper, we demonstrate how an English news broadcast transcribed by a speech recognizer is automatically summarized. We adapted our method, which was originally proposed for Japanese, to English by modifying the model for estimating word concatenation probabilities based on a dependency structure in the original speech given by a stochastic dependency context free grammar (SDCFG). We also propose a method of summarizing multiple utterances using a two-level DP technique. The automatically summarized sentences are evaluated by summarization accuracy based on a comparison with a manual summary of speech that has been correctly transcribed by human subjects. Our experimental results indicate that the method we propose can effectively extract relatively important information and remove redundant and irrelevant information from English news broadcasts.

  13. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, J.F.; Ng, L.C.

    1998-03-17

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.

  14. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, John F.; Ng, Lawrence C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.

  15. Campbell's monkeys concatenate vocalizations into context-specific call sequences

    PubMed Central

    Ouattara, Karim; Lemasson, Alban; Zuberbühler, Klaus

    2009-01-01

    Primate vocal behavior is often considered irrelevant in modeling human language evolution, mainly because of the caller's limited vocal control and apparent lack of intentional signaling. Here, we present the results of a long-term study on Campbell's monkeys, which has revealed an unrivaled degree of vocal complexity. Adult males produced six different loud call types, which they combined into various sequences in highly context-specific ways. We found stereotyped sequences that were strongly associated with cohesion and travel, falling trees, neighboring groups, nonpredatory animals, unspecific predatory threat, and specific predator classes. Within the responses to predators, we found that crowned eagles triggered four and leopards three different sequences, depending on how the caller learned about their presence. Callers followed a number of principles when concatenating sequences, such as nonrandom transition probabilities of call types, addition of specific calls into an existing sequence to form a different one, or recombination of two sequences to form a third one. We conclude that these primates have overcome some of the constraints of limited vocal control by combinatorial organization. As the different sequences were so tightly linked to specific external events, the Campbell's monkey call system may be the most complex example of ‘proto-syntax’ in animal communication known to date. PMID:20007377

  16. Digression and Value Concatenation to Enable Privacy-Preserving Regression

    PubMed Central

    Li, Xiao-Bai; Sarkar, Sumit

    2015-01-01

    Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals’ sensitive data. This problem, which we call a “regression attack,” has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis. PMID:26752802

  17. Cyanuric acid hydrolase: evolutionary innovation by structural concatenation

    PubMed Central

    Peat, Thomas S; Balotra, Sahil; Wilding, Matthew; French, Nigel G; Briggs, Lyndall J; Panjikar, Santosh; Cowieson, Nathan; Newman, Janet; Scott, Colin

    2013-01-01

    The cyanuric acid hydrolase, AtzD, is the founding member of a newly identified family of ring-opening amidases. We report the first X-ray structure for this family, which is a novel fold (termed the ‘Toblerone’ fold) that likely evolved via the concatenation of monomers of the trimeric YjgF superfamily and the acquisition of a metal binding site. Structures of AtzD with bound substrate (cyanuric acid) and inhibitors (phosphate, barbituric acid and melamine), along with mutagenesis studies, allowed the identification of the active site. The AtzD monomer, active site and substrate all possess threefold rotational symmetry, to the extent that the active site possesses three potential Ser–Lys catalytic dyads. A single catalytic dyad (Ser85–Lys42) is hypothesized, based on biochemical evidence and crystallographic data. A plausible catalytic mechanism based on these observations is also presented. A comparison with a homology model of the related barbiturase, Bar, was used to infer the active-site residues responsible for substrate specificity, and the phylogeny of the 68 AtzD-like enzymes in the database were analysed in light of this structure–function relationship. PMID:23651355

  18. The Neural Basis of Speech Parsing in Children and Adults

    PubMed Central

    McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

    2011-01-01

    Word segmentation, detecting word boundaries in continuous speech, is a fundamental aspect of language learning that can occur solely by the computation of statistical and speech cues. Fifty-four children underwent functional magnetic resonance imaging (fMRI) while listening to three streams of concatenated syllables, which contained either high statistical regularities, high statistical regularities and speech cues, or no easily-detectable cues. Significant signal increases over time in temporal cortices suggest that children utilized the cues to implicitly segment the speech streams. This was confirmed by the findings of a second fMRI run where children displayed reliably greater activity in left inferior frontal gyrus when listening to ‘words’ that occurred more frequently in the streams of speech they just heard. Finally, comparisons between activity observed in these children vs. previously-studied adults indicate significant developmental changes in the neural substrate of speech parsing. PMID:20136936

  19. Research in speech communication.

    PubMed Central

    Flanagan, J

    1995-01-01

    Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker. Images Fig. 1 Fig. 2 Fig. 5 Fig. 8 Fig. 11 Fig. 12 Fig. 13 PMID:7479806

  20. Methods of Teaching Speech Recognition

    ERIC Educational Resources Information Center

    Rader, Martha H.; Bailey, Glenn A.

    2010-01-01

    Objective: This article introduces the history and development of speech recognition, addresses its role in the business curriculum, outlines related national and state standards, describes instructional strategies, and discusses the assessment of student achievement in speech recognition classes. Methods: Research methods included a synthesis of…

  1. Computer-generated speech

    SciTech Connect

    Aimthikul, Y.

    1981-12-01

    This thesis reviews the essential aspects of speech synthesis and distinguishes between the two prevailing techniques: compressed digital speech and phonemic synthesis. It then presents the hardware details of the five speech modules evaluated. FORTRAN programs were written to facilitate message creation and retrieval with four of the modules driven by a PDP-11 minicomputer. The fifth module was driven directly by a computer terminal. The compressed digital speech modules (T.I. 990/306, T.S.I. Series 3D and N.S. Digitalker) each contain a limited vocabulary produced by the manufacturers while both the phonemic synthesizers made by Votrax permit an almost unlimited set of sounds and words. A text-to-phoneme rules program was adapted for the PDP-11 (running under the RSX-11M operating system) to drive the Votrax Speech Pac module. However, the Votrax Type'N Talk unit has its own built-in translator. Comparison of these modules revealed that the compressed digital speech modules were superior in pronouncing words on an individual basis but lacked the inflection capability that permitted the phonemic synthesizers to generate more coherent phrases. These findings were necessarily highly subjective and dependent on the specific words and phrases studied. In addition, the rapid introduction of new modules by manufacturers will necessitate new comparisons. However, the results of this research verified that all of the modules studied do possess reasonable quality of speech that is suitable for man-machine applications. Furthermore, the development tools are now in place to permit the addition of computer speech output in such applications.

  2. Speech Development

    MedlinePlus

    ... W View More… Donate Donor Spotlight Fundraising Ideas Vehicle Donation Volunteer Efforts Speech Development skip to submenu ... Lip and Palate . Bzoch (1997). Cleft Palate Speech Management: A Multidisciplinary Approach . Shprintzen, Bardach (1995). Cleft Palate: ...

  3. Speech Problems

    MedlinePlus

    ... a person's ability to speak clearly. Some Common Speech Disorders Stuttering is a problem that interferes with fluent ... is a language disorder, while stuttering is a speech disorder. A person who stutters has trouble getting out ...

  4. Overview of speech technology of the 80's

    SciTech Connect

    Crook, S.B.

    1981-01-01

    The author describes the technology innovations necessary to accommodate the market need which is the driving force toward greater perceived computer intelligence. The author discusses aspects of both speech synthesis and speech recognition.

  5. High pH reversed-phase chromatography with fraction concatenation for 2D proteomic analysis

    SciTech Connect

    Yang, Feng; Shen, Yufeng; Camp, David G.; Smith, Richard D.

    2012-04-01

    Orthogonal high-resolution separations are critical for attaining improved analytical dynamic ranges of proteome measurements. Concatenated high pH reversed phase liquid chromatography affords better separations than the strong cation exchange conventionally applied for two-dimensional shotgun proteomic analysis. For example, concatenated high pH reversed phase liquid chromatography increased identification coverage for peptides (e.g., by 1.8-fold) and proteins (e.g., by 1.6-fold) in shotgun proteomics analyses of a digested human protein sample. Additional advantages of concatenated high pH RPLC include improved protein sequence coverage, simplified sample processing, and reduced sample losses, making this an attractive first dimension separation strategy for two-dimensional proteomics analyses.

  6. Punctured Parallel and Serial Concatenated Convolutional Codes for BPSK/QPSK Channels

    NASA Technical Reports Server (NTRS)

    Acikel, Omer Fatih

    1999-01-01

    As available bandwidth for communication applications becomes scarce, bandwidth-efficient modulation and coding schemes become ever important. Since their discovery in 1993, turbo codes (parallel concatenated convolutional codes) have been the center of the attention in the coding community because of their bit error rate performance near the Shannon limit. Serial concatenated convolutional codes have also been shown to be as powerful as turbo codes. In this dissertation, we introduce algorithms for designing bandwidth-efficient rate r = k/(k + 1),k = 2, 3,..., 16, parallel and rate 3/4, 7/8, and 15/16 serial concatenated convolutional codes via puncturing for BPSK/QPSK (Binary Phase Shift Keying/Quadrature Phase Shift Keying) channels. Both parallel and serial concatenated convolutional codes have initially, steep bit error rate versus signal-to-noise ratio slope (called the -"cliff region"). However, this steep slope changes to a moderate slope with increasing signal-to-noise ratio, where the slope is characterized by the weight spectrum of the code. The region after the cliff region is called the "error rate floor" which dominates the behavior of these codes in moderate to high signal-to-noise ratios. Our goal is to design high rate parallel and serial concatenated convolutional codes while minimizing the error rate floor effect. The design algorithm includes an interleaver enhancement procedure and finds the polynomial sets (only for parallel concatenated convolutional codes) and the puncturing schemes that achieve the lowest bit error rate performance around the floor for the code rates of interest.

  7. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

    PubMed

    Greene, Beth G; Logan, John S; Pisoni, David B

    1986-03-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916

  8. Concatenative and Nonconcatenative Plural Formation in L1, L2, and Heritage Speakers of Arabic

    ERIC Educational Resources Information Center

    Albirini, Abdulkafi; Benmamoun, Elabbas

    2014-01-01

    This study compares Arabic L1, L2, and heritage speakers' (HS) knowledge of plural formation, which involves concatenative and nonconcatenative modes of derivation. Ninety participants (divided equally among L1, L2, and heritage speakers) completed two oral tasks: a picture naming task (to measure proficiency) and a plural formation task. The…

  9. Speech disorders - children

    MedlinePlus

    ... of speech disorders may disappear on their own. Speech therapy may help with more severe symptoms or speech ... the disorder. Speech can often be improved with speech therapy. Early treatment is likely to have better results.

  10. Symbolic Speech

    ERIC Educational Resources Information Center

    Podgor, Ellen S.

    1976-01-01

    The concept of symbolic speech emanates from the 1967 case of United States v. O'Brien. These discussions of flag desecration, grooming and dress codes, nude entertainment, buttons and badges, and musical expression show that the courts place symbolic speech in different strata from verbal communication. (LBH)

  11. Speech Aids

    NASA Technical Reports Server (NTRS)

    1987-01-01

    Designed to assist deaf and hearing impaired-persons in achieving better speech, Resnick Worldwide Inc.'s device provides a visual means of cuing the deaf as a speech-improvement measure. This is done by electronically processing the subjects' sounds and comparing them with optimum values which are displayed for comparison.

  12. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained to be the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of speech signal getting corrupted by noise, cross-talk and distortion Long haul transmissions which use repeaters to compensate for the loss in signal strength on transmission links also increase the associated noise and distortion. On the other hand digital transmission is relatively immune to noise, cross-talk and distortion primarily because of the capability to faithfully regenerate digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link Hence from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modem requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding is often referred to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques that is often interchangeably used with speech coding is the term voice coding. This term is more generic in the sense that the

  13. Research on Speech Perception. Progress Report No. 12.

    ERIC Educational Resources Information Center

    Pisoni, David B.; And Others

    Summarizing research activities in 1986, this is the twelfth annual report of research on speech perception, analysis, synthesis, and recognition conducted in the Speech Research Laboratory of the Department of Psychology at Indiana University. The report contains the following 23 articles: "Comprehension of Digitally Encoded Natural Speech Using…

  14. Use of Computer Speech Technologies To Enhance Learning.

    ERIC Educational Resources Information Center

    Ferrell, Joe

    1999-01-01

    Discusses the design of an innovative learning system that uses new technologies for the man-machine interface, incorporating a combination of Automatic Speech Recognition (ASR) and Text To Speech (TTS) synthesis. Highlights include using speech technologies to mimic the attributes of the ideal tutor and design features. (AEF)

  15. Research on Speech Perception. Progress Report No. 15.

    ERIC Educational Resources Information Center

    Pisoni, David B.

    Summarizing research activities in 1989, this is the fifteenth annual report of research on speech perception, analysis, synthesis, and recognition conducted in the Speech Research Laboratory of the Department of Psychology at Indiana University. The report contains the following 21 articles: "Perceptual Learning of Nonnative Speech Contrasts:…

  16. Acousto-optic fiber interferometer based on concatenated flexural wave modulation

    NASA Astrophysics Data System (ADS)

    Kang, Shouxin; Zhang, Hao; Liu, Bo; Zhang, Ning; Miao, Yinping

    2015-07-01

    An acousto-optic fiber interferometer has been proposed and experimentally demonstrated by employing two MgF2 sandwiches to implement concatenated flexural acoustic wave modulation onto single-mode optical fibers. The transmission spectrum of the acoustic grating pair has been experimentally investigated. Experimental results indicate that interferometric spectral fringes possess a frequency sensitivity as large as -499.0 nm/MHz due to the Mach-Zehnder interference. Moreover, the applied radio frequency signal voltage for flexural wave generation has a great impact on the transmission spectral properties. The work presented would be of importance for the understanding of the acousto-optic interaction mechanism in concatenated acoustic fiber gratings and is helpful for the design of related acousto-optic fiber devices.

  17. Performance analysis of a concatenated erbium-doped fiber amplifier supporting four mode groups

    NASA Astrophysics Data System (ADS)

    Qin, Zujun; Fan, Di; Zhang, Wentao; Xiong, Xianming

    2016-05-01

    An erbium-doped fiber amplifier (EDFA) supporting four mode groups has been theoretically designed by concatenating two sections of erbium-doped fibers (EDFs). Each EDF has a simple erbium doping profile for the purpose of reducing its fabrication complexity. We propose a modified genetic algorithm (GA) to provide detailed investigations on the concatenated amplifier. Both the optimal fiber length and erbium doping radius in each EDF have been found to minimize the gain difference between signal modes. Results show that the parameters of the central-doped EDF have a greater impact on the amplifier performance compared to those of the annular-doped one. We then investigate the influence of the small deviations of the erbium fiber length, doping radius and doping concentration of each EDF from their optimal values upon the amplifier performance, and discuss their design tolerances in obtaining a desirable amplification characteristics.

  18. Concatenated shift registers generating maximally spaced phase shifts of PN-sequences

    NASA Technical Reports Server (NTRS)

    Hurd, W. J.; Welch, L. R.

    1977-01-01

    A large class of linearly concatenated shift registers is shown to generate approximately maximally spaced phase shifts of pn-sequences, for use in pseudorandom number generation. A constructive method is presented for finding members of this class, for almost all degrees for which primitive trinomials exist. The sequences which result are not normally characterized by trinomial recursions, which is desirable since trinomial sequences can have some undesirable randomness properties.

  19. Probability of undetected error after decoding for a concatenated coding scheme

    NASA Technical Reports Server (NTRS)

    Costello, D. J., Jr.; Lin, S.

    1984-01-01

    A concatenated coding scheme for error control in data communications is analyzed. In this scheme, the inner code is used for both error correction and detection, however the outer code is used only for error detection. A retransmission is requested if the outer code detects the presence of errors after the inner code decoding. Probability of undetected error is derived and bounded. A particular example, proposed for NASA telecommand system is analyzed.

  20. Vapor pressure measurements on low-volatility terpenoid compounds by the concatenated gas saturation method.

    PubMed

    Widegren, Jason A; Bruno, Thomas J

    2010-01-01

    The atmospheric oxidation of monoterpenes plays a central role in the formation of secondary organic aerosols (SOAs), which have important effects on the weather and climate. However, models of SOA formation have large uncertainties. One reason for this is that SOA formation depends directly on the vapor pressures of the monoterpene oxidation products, but few vapor pressures have been reported for these compounds. As a result, models of SOA formation have had to rely on estimated values of vapor pressure. To alleviate this problem, we have developed the concatenated gas saturation method, which is a simple, reliable, high-throughput method for measuring the vapor pressures of low-volatility compounds. The concatenated gas saturation method represents a significant advance over traditional gas saturation methods. Instead of a single saturator and trap, the concatenated method uses several pairs of saturators and traps linked in series. Consequently, several measurements of vapor pressure can be made simultaneously, which greatly increases the rate of data collection. It also allows for the simultaneous measurement of a control compound, which is important for ensuring data quality. In this paper we demonstrate the use of the concatenated gas saturation method by determination of the vapor pressures of five monoterpene oxidation products and n-tetradecane (the control compound) over the temperature range 283.15-313.15 K. Over this temperature range, the vapor pressures ranged from about 0.5 Pa to about 70 Pa. The standard molar enthalpies of vaporization or sublimation were determined by use of the Clausius-Clapeyron equation.

  1. Coalescence vs. concatenation: Sophisticated analyses vs. first principles applied to rooting the angiosperms.

    PubMed

    Simmons, Mark P; Gatesy, John

    2015-10-01

    It has recently been concluded that phylogenomic data from 310 nuclear genes support the clade of (Amborellales, Nymphaeales) as sister to the remaining angiosperms and that shortcut coalescent phylogenetic methods outperformed concatenation for these data. We falsify both of those conclusions here by demonstrating that discrepant results between the coalescent and concatenation analyses are primarily caused by the coalescent methods applied (MP-EST and STAR) not being robust to the highly divergent and often mis-rooted gene trees that were used. This result reinforces the expectation that low amounts of phylogenetic signal and methodological artifacts in gene-tree reconstruction can be more problematic for shortcut coalescent methods than is the assumption of a single hierarchy for all genes by concatenation methods when these approaches are applied to ancient divergences in empirical studies. We also demonstrate that a third coalescent method, ASTRAL, is more robust to mis-rooted gene trees than MP-EST or STAR, and that both Observed Variability (OV) and Tree Independent Generation of Evolutionary Rates (TIGER), which are two character subsampling procedures, are biased in favor of characters with highly asymmetrical distributions of character states when applied to this dataset. We conclude that enthusiastic application of novel tools is not a substitute for rigorous application of first principles, and that trending methods (e.g., shortcut coalescent methods applied to ancient divergences, tree-independent character subsampling), may be novel sources of previously under-appreciated, systematic errors.

  2. Coalescent versus concatenation methods and the placement of Amborella as sister to water lilies.

    PubMed

    Xi, Zhenxiang; Liu, Liang; Rest, Joshua S; Davis, Charles C

    2014-11-01

    The molecular era has fundamentally reshaped our knowledge of the evolution and diversification of angiosperms. One outstanding question is the phylogenetic placement of Amborella trichopoda Baill., commonly thought to represent the first lineage of extant angiosperms. Here, we leverage publicly available data and provide a broad coalescent-based species tree estimation of 45 seed plants. By incorporating 310 nuclear genes, our coalescent analyses strongly support a clade containing Amborella plus water lilies (i.e., Nymphaeales) that is sister to all other angiosperms across different nucleotide rate partitions. Our results also show that commonly applied concatenation methods produce strongly supported, but incongruent placements of Amborella: slow-evolving nucleotide sites corroborate results from coalescent analyses, whereas fast-evolving sites place Amborella alone as the first lineage of extant angiosperms. We further explored the performance of coalescent versus concatenation methods using nucleotide sequences simulated on (i) the two alternate placements of Amborella with branch lengths and substitution model parameters estimated from each of the 310 nuclear genes and (ii) three hypothetical species trees that are topologically identical except with respect to the degree of deep coalescence and branch lengths. Our results collectively suggest that the Amborella alone placement inferred using concatenation methods is likely misled by fast-evolving sites. This appears to be exacerbated by the combination of long branches in stem group angiosperms, Amborella, and Nymphaeales with the short internal branch separating Amborella and Nymphaeales. In contrast, coalescent methods appear to be more robust to elevated substitution rates.

  3. Coalescent versus concatenation methods and the placement of Amborella as sister to water lilies.

    PubMed

    Xi, Zhenxiang; Liu, Liang; Rest, Joshua S; Davis, Charles C

    2014-11-01

    The molecular era has fundamentally reshaped our knowledge of the evolution and diversification of angiosperms. One outstanding question is the phylogenetic placement of Amborella trichopoda Baill., commonly thought to represent the first lineage of extant angiosperms. Here, we leverage publicly available data and provide a broad coalescent-based species tree estimation of 45 seed plants. By incorporating 310 nuclear genes, our coalescent analyses strongly support a clade containing Amborella plus water lilies (i.e., Nymphaeales) that is sister to all other angiosperms across different nucleotide rate partitions. Our results also show that commonly applied concatenation methods produce strongly supported, but incongruent placements of Amborella: slow-evolving nucleotide sites corroborate results from coalescent analyses, whereas fast-evolving sites place Amborella alone as the first lineage of extant angiosperms. We further explored the performance of coalescent versus concatenation methods using nucleotide sequences simulated on (i) the two alternate placements of Amborella with branch lengths and substitution model parameters estimated from each of the 310 nuclear genes and (ii) three hypothetical species trees that are topologically identical except with respect to the degree of deep coalescence and branch lengths. Our results collectively suggest that the Amborella alone placement inferred using concatenation methods is likely misled by fast-evolving sites. This appears to be exacerbated by the combination of long branches in stem group angiosperms, Amborella, and Nymphaeales with the short internal branch separating Amborella and Nymphaeales. In contrast, coalescent methods appear to be more robust to elevated substitution rates. PMID:25077515

  4. Viterbi decoder node synchronization losses in the Reed-Solomon/Veterbi concatenated channel

    NASA Technical Reports Server (NTRS)

    Deutsch, L. J.; Miller, R. L.

    1982-01-01

    The Viterbi decoders currently used by the Deep Space Network (DSN) employ an algorithm for maintaining node synchronization that significantly degrades at bit signal-to-noise ratios (SNRs) of below 2.0 dB. In a recent report by the authors, it was shown that the telemetry receiving system, which uses a convolutionally encoded downlink, will suffer losses of 0.85 dB and 1.25 dB respectively at Voyager 2 Uranus and Neptune encounters. This report extends the results of that study to a concatenated (255,223) Reed-Solomon/(7, 1/2) convolutionally coded channel, by developing a new radio loss model for the concatenated channel. It is shown here that losses due to improper node synchronization of 0.57 dB at Uranus and 1.0 dB at Neptune can be expected if concatenated coding is used along with an array of one 64-meter and three 34-meter antennas.

  5. Coalescence vs. concatenation: Sophisticated analyses vs. first principles applied to rooting the angiosperms.

    PubMed

    Simmons, Mark P; Gatesy, John

    2015-10-01

    It has recently been concluded that phylogenomic data from 310 nuclear genes support the clade of (Amborellales, Nymphaeales) as sister to the remaining angiosperms and that shortcut coalescent phylogenetic methods outperformed concatenation for these data. We falsify both of those conclusions here by demonstrating that discrepant results between the coalescent and concatenation analyses are primarily caused by the coalescent methods applied (MP-EST and STAR) not being robust to the highly divergent and often mis-rooted gene trees that were used. This result reinforces the expectation that low amounts of phylogenetic signal and methodological artifacts in gene-tree reconstruction can be more problematic for shortcut coalescent methods than is the assumption of a single hierarchy for all genes by concatenation methods when these approaches are applied to ancient divergences in empirical studies. We also demonstrate that a third coalescent method, ASTRAL, is more robust to mis-rooted gene trees than MP-EST or STAR, and that both Observed Variability (OV) and Tree Independent Generation of Evolutionary Rates (TIGER), which are two character subsampling procedures, are biased in favor of characters with highly asymmetrical distributions of character states when applied to this dataset. We conclude that enthusiastic application of novel tools is not a substitute for rigorous application of first principles, and that trending methods (e.g., shortcut coalescent methods applied to ancient divergences, tree-independent character subsampling), may be novel sources of previously under-appreciated, systematic errors. PMID:26002829

  6. Concatenated block codes for unequal error protection of embedded bit streams.

    PubMed

    Arslan, Suayb S; Cosman, Pamela C; Milstein, Laurence B

    2012-03-01

    A state-of-the-art progressive source encoder is combined with a concatenated block coding mechanism to produce a robust source transmission system for embedded bit streams. The proposed scheme efficiently trades off the available total bit budget between information bits and parity bits through efficient information block size adjustment, concatenated block coding, and random block interleavers. The objective is to create embedded codewords such that, for a particular information block, the necessary protection is obtained via multiple channel encodings, contrary to the conventional methods that use a single code rate per information block. This way, a more flexible protection scheme is obtained. The information block size and concatenated coding rates are judiciously chosen to maximize system performance, subject to a total bit budget. The set of codes is usually created by puncturing a low-rate mother code so that a single encoder-decoder pair is used. The proposed scheme is shown to effectively enlarge this code set by providing more protection levels than is possible using the code rate set directly. At the expense of complexity, average system performance is shown to be significantly better than that of several known comparison systems, particularly at higher channel bit error rates.

  7. Speech processing: An evolving technology

    SciTech Connect

    Crochiere, R.E.; Flanagan, J.L.

    1986-09-01

    As we enter the information age, speech processing is emerging as an important technology for making machines easier and more convenient for humans to use. It is both an old and a new technology - dating back to the invention of the telephone and forward, at least in aspirations, to the capabilities of HAL in 2001. Explosive advances in microelectronics now make it possible to implement economical real-time hardware for sophisticated speech processing - processing that formerly could be demonstrated only in simulations on main-frame computers. As a result, fundamentally new product concepts - as well as new features and functions in existing products - are becoming possible and are being explored in the marketplace. As the introductory piece to this issue, the authors draw a brief perspective on the evolving field of speech processing and assess the technology in the the three constituent sectors: speech coding, synthesis, and recognition.

  8. Listen up! Speech is for thinking during infancy.

    PubMed

    Vouloumanos, Athena; Waxman, Sandra R

    2014-12-01

    Infants' exposure to human speech within the first year promotes more than speech processing and language acquisition: new developmental evidence suggests that listening to speech shapes infants' fundamental cognitive and social capacities. Speech streamlines infants' learning, promotes the formation of object categories, signals communicative partners, highlights information in social interactions, and offers insight into the minds of others. These results, which challenge the claim that for infants, speech offers no special cognitive advantages, suggest a new synthesis. Far earlier than researchers had imagined, an intimate and powerful connection between human speech and cognition guides infant development, advancing infants' acquisition of fundamental psychological processes.

  9. Concatenated coding systems employing a unit-memory convolutional code and a byte-oriented decoding algorithm

    NASA Technical Reports Server (NTRS)

    Lee, L. N.

    1976-01-01

    Concatenated coding systems utilizing a convolutional code as the inner code and a Reed-Solomon code as the outer code are considered. In order to obtain very reliable communications over a very noisy channel with relatively small coding complexity, it is proposed to concatenate a byte oriented unit memory convolutional code with an RS outer code whose symbol size is one byte. It is further proposed to utilize a real time minimal byte error probability decoding algorithm, together with feedback from the outer decoder, in the decoder for the inner convolutional code. The performance of the proposed concatenated coding system is studied, and the improvement over conventional concatenated systems due to each additional feature is isolated.

  10. Concatenated coding systems employing a unit-memory convolutional code and a byte-oriented decoding algorithm

    NASA Technical Reports Server (NTRS)

    Lee, L.-N.

    1977-01-01

    Concatenated coding systems utilizing a convolutional code as the inner code and a Reed-Solomon code as the outer code are considered. In order to obtain very reliable communications over a very noisy channel with relatively modest coding complexity, it is proposed to concatenate a byte-oriented unit-memory convolutional code with an RS outer code whose symbol size is one byte. It is further proposed to utilize a real-time minimal-byte-error probability decoding algorithm, together with feedback from the outer decoder, in the decoder for the inner convolutional code. The performance of the proposed concatenated coding system is studied, and the improvement over conventional concatenated systems due to each additional feature is isolated.

  11. Free Speech Yearbook: 1972.

    ERIC Educational Resources Information Center

    Tedford, Thomas L., Ed.

    This book is a collection of essays on free speech issues and attitudes, compiled by the Commission on Freedom of Speech of the Speech Communication Association. Four articles focus on freedom of speech in classroom situations as follows: a philosophic view of teaching free speech, effects of a course on free speech on student attitudes,…

  12. Speech Research

    NASA Astrophysics Data System (ADS)

    Several articles addressing topics in speech research are presented. The topics include: exploring the functional significance of physiological tremor: A biospectroscopic approach; differences between experienced and inexperienced listeners to deaf speech; a language-oriented view of reading and its disabilities; Phonetic factors in letter detection; categorical perception; Short-term recall by deaf signers of American sign language; a common basis for auditory sensory storage in perception and immediate memory; phonological awareness and verbal short-term memory; initiation versus execution time during manual and oral counting by stutterers; trading relations in the perception of speech by five-year-old children; the role of the strap muscles in pitch lowering; phonetic validation of distinctive features; consonants and syllable boundaires; and vowel information in postvocalic frictions.

  13. Functional inositol 1,4,5-trisphosphate receptors assembled from concatenated homo- and heteromeric subunits.

    PubMed

    Alzayady, Kamil J; Wagner, Larry E; Chandrasekhar, Rahul; Monteagudo, Alina; Godiska, Ronald; Tall, Gregory G; Joseph, Suresh K; Yule, David I

    2013-10-11

    Vertebrate genomes code for three subtypes of inositol 1,4,5-trisphosphate (IP3) receptors (IP3R1, -2, and -3). Individual IP3R monomers are assembled to form homo- and heterotetrameric channels that mediate Ca(2+) release from intracellular stores. IP3R subtypes are regulated differentially by IP3, Ca(2+), ATP, and various other cellular factors and events. IP3R subtypes are seldom expressed in isolation in individual cell types, and cells often express different complements of IP3R subtypes. When multiple subtypes of IP3R are co-expressed, the subunit composition of channels cannot be specifically defined. Thus, how the subunit composition of heterotetrameric IP3R channels contributes to shaping the spatio-temporal properties of IP3-mediated Ca(2+) signals has been difficult to evaluate. To address this question, we created concatenated IP3R linked by short flexible linkers. Dimeric constructs were expressed in DT40-3KO cells, an IP3R null cell line. The dimeric proteins were localized to membranes, ran as intact dimeric proteins on SDS-PAGE, and migrated as an ∼1100-kDa band on blue native gels exactly as wild type IP3R. Importantly, IP3R channels formed from concatenated dimers were fully functional as indicated by agonist-induced Ca(2+) release. Using single channel "on-nucleus" patch clamp, the channels assembled from homodimers were essentially indistinguishable from those formed by the wild type receptor. However, the activity of channels formed from concatenated IP3R1 and IP3R2 heterodimers was dominated by IP3R2 in terms of the characteristics of regulation by ATP. These studies provide the first insight into the regulation of heterotetrameric IP3R of defined composition. Importantly, the results indicate that the properties of these channels are not simply a blend of those of the constituent IP3R monomers.

  14. Investigation of the Use of Erasures in a Concatenated Coding Scheme

    NASA Technical Reports Server (NTRS)

    Kwatra, S. C.; Marriott, Philip J.

    1997-01-01

    A new method for declaring erasures in a concatenated coding scheme is investigated. This method is used with the rate 1/2 K = 7 convolutional code and the (255, 223) Reed Solomon code. Errors and erasures Reed Solomon decoding is used. The erasure method proposed uses a soft output Viterbi algorithm and information provided by decoded Reed Solomon codewords in a deinterleaving frame. The results show that a gain of 0.3 dB is possible using a minimum amount of decoding trials.

  15. Space communication system for compressed data with a concatenated Reed-Solomon-Viterbi coding channel

    NASA Technical Reports Server (NTRS)

    Rice, R. F.; Hilbert, E. E. (Inventor)

    1976-01-01

    A space communication system incorporating a concatenated Reed Solomon Viterbi coding channel is discussed for transmitting compressed and uncompressed data from a spacecraft to a data processing center on Earth. Imaging (and other) data are first compressed into source blocks which are then coded by a Reed Solomon coder and interleaver, followed by a convolutional encoder. The received data is first decoded by a Viterbi decoder, followed by a Reed Solomon decoder and deinterleaver. The output of the latter is then decompressed, based on the compression criteria used in compressing the data in the spacecraft. The decompressed data is processed to reconstruct an approximation of the original data-producing condition or images.

  16. Extended electrical tuning of quantum cascade lasers with digital concatenated gratings

    SciTech Connect

    Slivken, S.; Bandyopadhyay, N.; Bai, Y.; Lu, Q. Y.; Razeghi, M.

    2013-12-02

    In this report, the sampled grating distributed feedback laser architecture is modified with digital concatenated gratings to partially compensate for the wavelength dependence of optical gain in a standard high efficiency quantum cascade laser core. This allows equalization of laser threshold over a wide wavelength range and demonstration of wide electrical tuning. With only two control currents, a full tuning range of 500 nm (236 cm{sup −1}) has been demonstrated. Emission is single mode, with a side mode suppression of >20 dB.

  17. Computational Benefits Using an Advanced Concatenation Scheme Based on Reduced Order Models for RF Structures

    NASA Astrophysics Data System (ADS)

    Heller, Johann; Flisgen, Thomas; van Rienen, Ursula

    The computation of electromagnetic fields and parameters derived thereof for lossless radio frequency (RF) structures filled with isotropic media is an important task for the design and operation of particle accelerators. Unfortunately, these computations are often highly demanding with regard to computational effort. The entire computational demand of the problem can be reduced using decomposition schemes in order to solve the field problems on standard workstations. This paper presents one of the first detailed comparisons between the recently proposed state-space concatenation approach (SSC) and a direct computation for an accelerator cavity with coupler-elements that break the rotational symmetry.

  18. Speech Intelligibility

    NASA Astrophysics Data System (ADS)

    Brand, Thomas

    Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena like the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, benefit using hearing aids or combinations of these things.

  19. Phrase-level speech simulation with an airway modulation model of speech production.

    PubMed

    Story, Brad H

    2013-06-01

    Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and simulation of words and phrases are demonstrated.

  20. Type of Speech Material Affects Acceptable Noise Level Test Outcome

    PubMed Central

    Koch, Xaver; Dingemanse, Gertjan; Goedegebure, André; Janse, Esther

    2016-01-01

    The acceptable noise level (ANL) test, in which individuals indicate what level of noise they are willing to put up with while following speech, has been used to guide hearing aid fitting decisions and has been found to relate to prospective hearing aid use. Unlike objective measures of speech perception ability, ANL outcome is not related to individual hearing loss or age, but rather reflects an individual’s inherent acceptance of competing noise while listening to speech. As such, the measure may predict aspects of hearing aid success. Crucially, however, recent studies have questioned its repeatability (test–retest reliability). The first question for this study was whether the inconsistent results regarding the repeatability of the ANL test may be due to differences in speech material types used in previous studies. Second, it is unclear whether meaningfulness and semantic coherence of the speech modify ANL outcome. To investigate these questions, we compared ANLs obtained with three types of materials: the International Speech Test Signal (ISTS), which is non-meaningful and semantically non-coherent by definition, passages consisting of concatenated meaningful standard audiology sentences, and longer fragments taken from conversational speech. We included conversational speech as this type of speech material is most representative of everyday listening. Additionally, we investigated whether ANL outcomes, obtained with these three different speech materials, were associated with self-reported limitations due to hearing problems and listening effort in everyday life, as assessed by a questionnaire. ANL data were collected for 57 relatively good-hearing adult participants with an age range representative for hearing aid users. Results showed that meaningfulness, but not semantic coherence of the speech material affected ANL. Less noise was accepted for the non-meaningful ISTS signal than for the meaningful speech materials. ANL repeatability was comparable

  1. Concatenation of ‘alert’ and ‘identity’ segments in dingoes’ alarm calls

    PubMed Central

    Déaux, Eloïse C.; Allen, Andrew P.; Clarke, Jennifer A.; Charrier, Isabelle

    2016-01-01

    Multicomponent signals can be formed by the uninterrupted concatenation of multiple call types. One such signal is found in dingoes, Canis familiaris dingo. This stereotyped, multicomponent ‘bark-howl’ vocalisation is formed by the concatenation of a noisy bark segment and a tonal howl segment. Both segments are structurally similar to bark and howl vocalisations produced independently in other contexts (e.g. intra- and inter-pack communication). Bark-howls are mainly uttered in response to human presence and were hypothesized to serve as alarm calls. We investigated the function of bark-howls and the respective roles of the bark and howl segments. We found that dingoes could discriminate between familiar and unfamiliar howl segments, after having only heard familiar howl vocalisations (i.e. different calls). We propose that howl segments could function as ‘identity signals’ and allow receivers to modulate their responses according to the caller’s characteristics. The bark segment increased receivers’ attention levels, providing support for earlier observational claims that barks have an ‘alerting’ function. Lastly, dingoes were more likely to display vigilance behaviours upon hearing bark-howl vocalisations, lending support to the alarm function hypothesis. Canid vocalisations, such as the dingo bark-howl, may provide a model system to investigate the selective pressures shaping complex communication systems. PMID:27460289

  2. Voice synthesis application

    NASA Astrophysics Data System (ADS)

    Lightstone, P. C.; Davidson, W. M.

    1982-01-01

    Selection of a speech synthesis system as an augmentation for a perimeter security device is described. Criteria used in selection of a system are discussed. The final system is a speech 1000 speech synthesizer board that has a 2000 word speech lexicon, a first time charge for a 32 K EPROM of custom words, and extra features such as an alternate command to adjust desired listening level.

  3. Voice synthesis application

    SciTech Connect

    Lightstone, P.C.; Davidson, W.M.

    1982-01-27

    Selection of a speech synthesis system as an augmentation for a perimeter security device is described. Criteria used in selection of a system are discussed. The final system is a speech 1000 speech synthesizer board that has a 2000 word speech lexicon, a first time charge of $75 for a 32 K EPROM of custom words, and extra features such as an alternate command to adjust desired listening level.

  4. Speech communications in noise

    NASA Technical Reports Server (NTRS)

    1984-01-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.

  5. The effects of receiver tracking phase error on the performance of the concatenated Reed-Solomon/Viterbi channel coding system

    NASA Technical Reports Server (NTRS)

    Liu, K. Y.

    1981-01-01

    Analytical and experimental results are presented of the effects of receiver tracking phase error, caused by weak signal conditions on either the uplink or the downlink or both, on the performance of the concatenated Reed-Solomon (RS) Viterbi channel coding system. The test results were obtained under an emulated S band uplink and X band downlink, two way space communication channel in the telecommunication development laboratory of JPL with data rates ranging from 4 kHz to 20 kHz. It is shown that, with ideal interleaving, the concatenated RS/Viterbi coding system is capable of yielding large coding gains at very low bit error probabilities over the Viterbi decoded convolutional only coding system. Results on the effects of receiver tracking phase errors on the performance of the concatenated coding system with antenna array combining are included.

  6. Self-Similar Conformations and Dynamics of Non-Concatenated Entangled Ring Polymers

    NASA Astrophysics Data System (ADS)

    Ge, Ting

    A scaling model of self-similar conformations and dynamics of non-concatenated entangled ring polymers is developed. Topological constraints force these ring polymers into compact conformations with fractal dimension D =3 that we call fractal loopy globules (FLGs). This result is based on the conjecture that the overlap parameter of loops on all length scales is equal to the Kavassalis-Noolandi number 10-20. The dynamics of entangled rings is self-similar, and proceeds as loops of increasing sizes are rearranged progressively at their respective diffusion times. The topological constraints associated with smaller rearranged loops affect the dynamics of larger loops by increasing the effective friction coefficient, but have no influence on the tubes confining larger loops. Therefore, the tube diameter defined as the average spacing between relevant topological constraints increases with time, leading to ``tube dilation''. Analysis of the primitive paths in molecular dynamics (MD) simulations suggests complete tube dilation with the tube diameter on the order of the time-dependent characteristic loop size. A characteristic loop at time t is defined as a ring section that has diffused a distance of its size during time t. We derive dynamic scaling exponents in terms of fractal dimensions of an entangled ring and the underlying primitive path and a parameter characterizing the extent of tube dilation. The results reproduce the predictions of different dynamic models of a single non-concatenated entangled ring. We demonstrate that traditional generalization of single-ring models to multi-ring dynamics is not self-consistent and develop a FLG model with self-consistent multi-ring dynamics and complete tube dilation. Various dynamic scaling exponents predicted by the self-consistent FLG model are consistent with recent computer simulations and experiments. We also perform MD simulations of nanoparticle (NP) diffusion in melts of non-concatenated entangled ring polymers

  7. Insufficient chunk concatenation may underlie changes in sleep-dependent consolidation of motor sequence learning in older adults.

    PubMed

    Bottary, Ryan; Sonni, Akshata; Wright, David; Spencer, Rebecca M C

    2016-09-01

    Sleep enhances motor sequence learning (MSL) in young adults by concatenating subsequences ("chunks") formed during skill acquisition. To examine whether this process is reduced in aging, we assessed performance changes on the MSL task following overnight sleep or daytime wake in healthy young and older adults. Young adult performance enhancement was correlated with nREM2 sleep, and facilitated by preferential improvement of slowest within-sequence transitions. This effect was markedly reduced in older adults, and accompanied by diminished sigma power density (12-15 Hz) during nREM2 sleep, suggesting that diminished chunk concatenation following sleep may underlie reduced consolidation of MSL in older adults. PMID:27531835

  8. Sample-based engine noise synthesis using an enhanced pitch-synchronous overlap-and-add method.

    PubMed

    Jagla, Jan; Maillard, Julien; Martin, Nadine

    2012-11-01

    An algorithm for the real time synthesis of internal combustion engine noise is presented. Through the analysis of a recorded engine noise signal of continuously varying engine speed, a dataset of sound samples is extracted allowing the real time synthesis of the noise induced by arbitrary evolutions of engine speed. The sound samples are extracted from a recording spanning the entire engine speed range. Each sample is delimitated such as to contain the sound emitted during one cycle of the engine plus the necessary overlap to ensure smooth transitions during the synthesis. The proposed approach, an extension of the PSOLA method introduced for speech processing, takes advantage of the specific periodicity of engine noise signals to locate the extraction instants of the sound samples. During the synthesis stage, the sound samples corresponding to the target engine speed evolution are concatenated with an overlap and add algorithm. It is shown that this method produces high quality audio restitution with a low computational load. It is therefore well suited for real time applications. PMID:23145595

  9. Sample-based engine noise synthesis using an enhanced pitch-synchronous overlap-and-add method.

    PubMed

    Jagla, Jan; Maillard, Julien; Martin, Nadine

    2012-11-01

    An algorithm for the real time synthesis of internal combustion engine noise is presented. Through the analysis of a recorded engine noise signal of continuously varying engine speed, a dataset of sound samples is extracted allowing the real time synthesis of the noise induced by arbitrary evolutions of engine speed. The sound samples are extracted from a recording spanning the entire engine speed range. Each sample is delimitated such as to contain the sound emitted during one cycle of the engine plus the necessary overlap to ensure smooth transitions during the synthesis. The proposed approach, an extension of the PSOLA method introduced for speech processing, takes advantage of the specific periodicity of engine noise signals to locate the extraction instants of the sound samples. During the synthesis stage, the sound samples corresponding to the target engine speed evolution are concatenated with an overlap and add algorithm. It is shown that this method produces high quality audio restitution with a low computational load. It is therefore well suited for real time applications.

  10. Hyperbranched Hybridization Chain Reaction for Triggered Signal Amplification and Concatenated Logic Circuits.

    PubMed

    Bi, Sai; Chen, Min; Jia, Xiaoqiang; Dong, Ying; Wang, Zonghua

    2015-07-01

    A hyper-branched hybridization chain reaction (HB-HCR) is presented herein, which consists of only six species that can metastably coexist until the introduction of an initiator DNA to trigger a cascade of hybridization events, leading to the self-sustained assembly of hyper-branched and nicked double-stranded DNA structures. The system can readily achieve ultrasensitive detection of target DNA. Moreover, the HB-HCR principle is successfully applied to construct three-input concatenated logic circuits with excellent specificity and extended to design a security-mimicking keypad lock system. Significantly, the HB-HCR-based keypad lock can alarm immediately if the "password" is incorrect. Overall, the proposed HB-HCR with high amplification efficiency is simple, homogeneous, fast, robust, and low-cost, and holds great promise in the development of biosensing, in the programmable assembly of DNA architectures, and in molecular logic operations. PMID:26012841

  11. Multidimensional Trellis Coded Phase Modulation Using a Multilevel Concatenation Approach. Part 1; Code Design

    NASA Technical Reports Server (NTRS)

    Rajpal, Sandeep; Rhee, Do Jun; Lin, Shu

    1997-01-01

    The first part of this paper presents a simple and systematic technique for constructing multidimensional M-ary phase shift keying (MMK) trellis coded modulation (TCM) codes. The construction is based on a multilevel concatenation approach in which binary convolutional codes with good free branch distances are used as the outer codes and block MPSK modulation codes are used as the inner codes (or the signal spaces). Conditions on phase invariance of these codes are derived and a multistage decoding scheme for these codes is proposed. The proposed technique can be used to construct good codes for both the additive white Gaussian noise (AWGN) and fading channels as is shown in the second part of this paper.

  12. Functional brain fluorescence plurimetry in rat by implantable concatenated CMOS imaging system.

    PubMed

    Kobayashi, Takuma; Masuda, Hiroyuki; Kitsumoto, Chikara; Haruta, Makito; Motoyama, Mayumi; Ohta, Yasumi; Noda, Toshihiko; Sasagawa, Kiyotaka; Tokuda, Takashi; Shiosaka, Sadao; Ohta, Jun

    2014-03-15

    Measurement of brain activity in multiple areas simultaneously by minimally invasive methods contributes to the study of neuroscience and development of brain machine interfaces. However, this requires compact wearable instruments that do not inhibit natural movements. Application of optical potentiometry with voltage-sensitive fluorescent dye using an implantable image sensor is also useful. However, the increasing number of leads required for the multiple wired sensors to measure larger domains inhibits natural behavior. For imaging broad areas by numerous sensors without excessive wiring, a web-like sensor that can wrap the brain was developed. Kaleidoscopic potentiometry is possible using the imaging system with concatenated sensors by changing the alignment of the sensors. This paper describes organization of the system, evaluation of the system by a fluorescence imaging, and finally, functional brain fluorescence plurimetry by the sensor. The recorded data in rat somatosensory cortex using the developed multiple-area imaging system compared well with electrophysiology results.

  13. BoD services in layer 1 VPN with dynamic virtual concatenation group

    NASA Astrophysics Data System (ADS)

    Du, Shu; Peng, Yunfeng; Long, Keping

    2008-11-01

    Bandwidth-on-Demand (BoD) services are characteristic of dynamic bandwidth provisioning based on customers' resource requirement, which will be a must for future networks. BoD services become possible with the development of make-before-break, Virtual Concatenation (VCAT) and Link Capacity Adjustment Scheme (LCAS). In this paper, we introduce BoD services into L1VPN, thus the resource assigned to a L1VPN can be gracefully adjusted at various bandwidth granularities based on customers' requirement. And we propose a dynamic bandwidth adjustment scheme, which is compromise between make-before-break and VCAT&LCAS and mainly based on the latter. The scheme minimizes the number of distinct paths to support a connection between a source-destination pair, and uses make-beforebreak technology for re-optimization.

  14. Efficient entanglement concentration for concatenated Greenberger-Horne-Zeilinger state with the cross-Kerr nonlinearity

    NASA Astrophysics Data System (ADS)

    Pan, Jun; Zhou, Lan; Gu, Shi-Pu; Wang, Xing-Fu; Sheng, Yu-Bo; Wang, Qin

    2016-04-01

    Concatenated Greenberger-Horne-Zeilinger (C-GHZ) state, which encodes physical qubits in a logic qubit, has great application in the future quantum communication. We present an efficient entanglement concentration protocol (ECP) for recovering less-entangled C-GHZ state into the maximally entangled C-GHZ state with the help of cross-Kerr nonlinearities and photon detectors. With the help of the cross-Kerr nonlinearity, the obtained maximally entangled C-GHZ state can be remained for other applications. Moreover, the ECP can be used repeatedly, which can increase the success probability largely. Based on the advantages above, our ECP may be useful in the future long-distance quantum communication.

  15. Temperature insensitive refractive index sensor based on concatenated long period fiber gratings

    NASA Astrophysics Data System (ADS)

    Tripathi, Saurabh M.; Bock, Wojtek J.; Mikulic, Predrag

    2013-10-01

    We propose and demonstrate a temperature immune biosensor based on two concatenated LPGs incorporating a suitable inter-grating-space (IGS). Compensating the thermal induced phase changes in the grating region by use of an appropriate length of the IGS the temperature insensitivity has been achieved. Using standard telecommunication grade single-mode fibers we show that a length ratio of ~8.2 is sufficient to realize the proposed temperature insensitivity. The resulting sensor shows a refractive index sensitivity of 423.28 nm/RIU displaying the capability of detecting an index variation of 2.36 × 10-6 RIU in the bio-samples. The sensor can also be applied as a temperature insensitive WMD channel isolation filter in the optical communication systems, removing the necessity of any external thermal insulation packaging.

  16. Fundamental GABAergic amacrine cell circuitries in the retina: nested feedback, concatenated inhibition, and axosomatic synapses.

    PubMed

    Marc, R E; Liu, W

    2000-10-01

    Presynaptic gamma-aminobutyrate-immunoreactive (GABA+) profiles were mapped in the cyprinid retina with overlay microscopy: a fusion of electron and optical imaging affording high-contrast ultrastructural and immunocytochemical visualization. GABA+ synapses, deriving primarily from amacrine cells (ACs), compose 92% of conventional synapses and 98% of the input to bipolar cells (BCs) in the inner plexiform layer. GABA+ AC synapses, the sign-inverting elements of signal processing, are deployed in micronetworks and distinctive synaptic source/target topologies. Nested feedback micronetworks are formed by three types of links (BC --> AC, reciprocal BC <-- AC, and AC --> AC synapses) arranged as nested BC<--> [AC --> AC] loops. Circuits using nested feedback can possess better temporal performance than those using simple reciprocal feedback loops. Concatenated GABA+ micronetworks of AC --> AC and AC --> AC --> AC chains are common and must be key elements for lateral spatial, temporal, and spectral signal processing. Concatenated inhibitions may represent exceptionally stable, low-gain, sign-conserving devices for receptive field construction. Some chain elements are GABA immunonegative (GABA-) and are, thus, likely glycinergic synapses. GABA+ synaptic baskets target the somas of certain GABA+ and GABA- cells, resembling cortical axosomatic synapses. Finally, all myelinated intraretinal profiles are GABA+, suggesting that some efferent systems are sources of GABAergic inhibition in the cyprinid retina and may comprise all axosomatic synapses. These micronetworks are likely the fundamental elements for receptive field shaping in the inner plexiform layer, although few receptive field models incorporate them as functional components. Conversely, simple feedback and feedforward synapses may often be chimeras: the result of an incomplete view of synaptic topology.

  17. Inter-Calibration and Concatenation of Climate Quality Infrared Cloudy Radiances from Multiple Instruments

    NASA Technical Reports Server (NTRS)

    Behrangi, Ali; Aumann, Hartmut H.

    2013-01-01

    A change in climate is not likely captured from any single instrument, since no single instrument can span decades of time. Therefore, to detect signals of global climate change, observations from many instruments on different platforms have to be concatenated. This requires careful and detailed consideration of instrumental differences such as footprint size, diurnal cycle of observations, and relative biases in the spectral brightness temperatures. Furthermore, a common basic assumption is that the data quality is independent of the observed scene and therefore can be determined using clear scene data. However, as will be demonstrated, this is not necessarily a valid assumption as the globe is mostly cloudy. In this study we highlight challenges in inter-calibration and concatenation of infrared radiances from multiple instruments by focusing on the analysis of deep convective or anvil clouds. TRMM/VIRS is potentially useful instrument to make correction for observational differences in the local time and foot print sizes, and thus could be applied retroactively to vintage instruments such as AIRS, IASI, IRIS, AVHRR, and HIRS. As the first step, in this study, we investigate and discuss to what extent AIRS and VIRS agree in capturing deep cloudy radiances at the same local time. The analysis also includes comparisons with one year observations from CrIS. It was found that the instruments show calibration differences of about 1K under deep cloudy scenes that can vary as a function of land type and local time of observation. The sensitivity of footprint size, view angle, and spectral band-pass differences cannot fully explain the observed differences. The observed discrepancies can be considered as a measure of the magnitude of issues which will arise in the comparison of legacy data with current data.

  18. Speech recognition and understanding

    SciTech Connect

    Vintsyuk, T.K.

    1983-05-01

    This article discusses the automatic processing of speech signals with the aim of finding a sequence of works (speech recognition) or a concept (speech understanding) being transmitted by the speech signal. The goal of the research is to develop an automatic typewriter that will automatically edit and type text under voice control. A dynamic programming method is proposed in which all possible class signals are stored, after which the presented signal is compared to all the stored signals during the recognition phase. Topics considered include element-by-element recognition of words of speech, learning speech recognition, phoneme-by-phoneme speech recognition, the recognition of connected speech, understanding connected speech, and prospects for designing speech recognition and understanding systems. An application of the composition dynamic programming method for the solution of basic problems in the recognition and understanding of speech is presented.

  19. Insufficient Chunk Concatenation May Underlie Changes in Sleep-Dependent Consolidation of Motor Sequence Learning in Older Adults

    ERIC Educational Resources Information Center

    Bottary, Ryan; Sonni, Akshata; Wright, David; Spencer, Rebecca M. C.

    2016-01-01

    Sleep enhances motor sequence learning (MSL) in young adults by concatenating subsequences ("chunks") formed during skill acquisition. To examine whether this process is reduced in aging, we assessed performance changes on the MSL task following overnight sleep or daytime wake in healthy young and older adults. Young adult performance…

  20. Research on Speech Perception. Progress Report No. 14.

    ERIC Educational Resources Information Center

    Pisoni, David B.; And Others

    Summarizing research activities in 1988, this is the fourteenth annual report of research on speech perception, analysis, synthesis, and recognition conducted in the Speech Research Laboratory of the Department of Psychology at Indiana University. The report includes extended manuscripts, short reports, and progress reports. The report contains…

  1. Opportunities in Speech Pathology.

    ERIC Educational Resources Information Center

    Newman, Parley W.

    The importance of speech is discussed and speech pathology is described. Types of communication disorders considered are articulation disorders, aphasia, facial deformity, hearing loss, stuttering, delayed speech, voice disorders, and cerebral palsy; examples of five disorders are given. Speech pathology is investigated from these aspects: the…

  2. Careers in Speech Communication.

    ERIC Educational Resources Information Center

    Speech Communication Association, New York, NY.

    Brief discussions in this pamphlet suggest educational and career opportunities in the following fields of speech communication: rhetoric, public address, and communication; theatre, drama, and oral interpretation; radio, television, and film; speech pathology and audiology; speech science, phonetics, and linguistics; and speech education.…

  3. Processing of Speech Signals for Physical and Sensory Disabilities

    NASA Astrophysics Data System (ADS)

    Levitt, Harry

    1995-10-01

    Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.

  4. Processing of speech signals for physical and sensory disabilities.

    PubMed

    Levitt, H

    1995-10-24

    Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities. PMID:7479816

  5. Segmental concatenation of individual signatures and context cues in banded mongoose (Mungos mungo) close calls

    PubMed Central

    2012-01-01

    Background All animals are anatomically constrained in the number of discrete call types they can produce. Recent studies suggest that by combining existing calls into meaningful sequences, animals can increase the information content of their vocal repertoire despite these constraints. Additionally, signalers can use vocal signatures or cues correlated to other individual traits or contexts to increase the information encoded in their vocalizations. However, encoding multiple vocal signatures or cues using the same components of vocalizations usually reduces the signals' reliability. Segregation of information could effectively circumvent this trade-off. In this study we investigate how banded mongooses (Mungos mungo) encode multiple vocal signatures or cues in their frequently emitted graded single syllable close calls. Results The data for this study were collected on a wild, but habituated, population of banded mongooses. Using behavioral observations and acoustical analysis we found that close calls contain two acoustically different segments. The first being stable and individually distinct, and the second being graded and correlating with the current behavior of the individual, whether it is digging, searching or moving. This provides evidence of Marler's hypothesis on temporal segregation of information within a single syllable call type. Additionally, our work represents an example of an identity cue integrated as a discrete segment within a single call that is independent from context. This likely functions to avoid ambiguity between individuals or receivers having to keep track of several context-specific identity cues. Conclusions Our study provides the first evidence of segmental concatenation of information within a single syllable in non-human vocalizations. By reviewing descriptions of call structures in the literature, we suggest a general application of this mechanism. Our study indicates that temporal segregation and segmental concatenation of

  6. Testing the performance of feedback concatenated decoder with a nonideal receiver

    NASA Technical Reports Server (NTRS)

    Feria, Y.; Dolinar, S.

    1995-01-01

    One of the inherent problems in testing the feedback concatenated decoder (FCD) at our operating symbol signal-to-noise ratio (SSNR) is that the bit-error rate is so low that we cannot measure it directly through simulations in a reasonable time period. This article proposes a test procedure that will give a reasonable estimate of the expected losses even though the number of frames tested is much smaller than needed for a direct measurement. This test procedure provides an organized robust methodology for extrapolating small amounts of test data to give reasonable estimates of FCD loss increments at unmeasurable miniscule error rates. Using this test procedure, we have run some preliminary tests on the FCD to quantify the losses due to the fact that the input signal contains multiplicative non-white non-Gaussian noises resulting from the buffered telemetry demodulator (BTD). Besides the losses in the BTD, we have observed additional loss increments of 0.3 to 0.4 dB at the output of the FCD for several test cases with loop signal-to-noise ratios (SNR's) lower than 20 dB. In contrast, these loss increments were less than 0.1 dB for a test case with the subcarrier loop SNR at about 28 dB. This test procedure can be applied to more extensive test data to determine thresholds on the loop SNRs above which the FCD will not suffer substantial loss increments.

  7. Physical properties of modification of speech signal fragments

    NASA Astrophysics Data System (ADS)

    Gusev, Mikhail N.

    2004-04-01

    The methods used for modification of separate speech signals fragments in the process of speech synthesis by arbitrary text are described in this report. Three groups of sounds differ in the modification methods of frequency characteristics. Two groups of sounds differ in that they need different methods of duration changes. To modify the samples of a speaker's voice by the methods used it is necessary to make pre-marking, so called segementation. As variable speech fragments, the allophones are taken. The modification methods described allow form arbitrary speech successions in the wide intonation diapason on the basis of limited amount of the speaker's voice patterns.

  8. A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading

    PubMed Central

    Price, Cathy J.

    2012-01-01

    The anatomy of language has been investigated with PET or fMRI for more than 20 years. Here I attempt to provide an overview of the brain areas associated with heard speech, speech production and reading. The conclusions of many hundreds of studies were considered, grouped according to the type of processing, and reported in the order that they were published. Many findings have been replicated time and time again leading to some consistent and undisputable conclusions. These are summarised in an anatomical model that indicates the location of the language areas and the most consistent functions that have been assigned to them. The implications for cognitive models of language processing are also considered. In particular, a distinction can be made between processes that are localized to specific structures (e.g. sensory and motor processing) and processes where specialisation arises in the distributed pattern of activation over many different areas that each participate in multiple functions. For example, phonological processing of heard speech is supported by the functional integration of auditory processing and articulation; and orthographic processing is supported by the functional integration of visual processing, articulation and semantics. Future studies will undoubtedly be able to improve the spatial precision with which functional regions can be dissociated but the greatest challenge will be to understand how different brain regions interact with one another in their attempts to comprehend and produce language. PMID:22584224

  9. Delayed Speech or Language Development

    MedlinePlus

    ... to Know About Zika & Pregnancy Delayed Speech or Language Development KidsHealth > For Parents > Delayed Speech or Language ... your child is right on schedule. Normal Speech & Language Development It's important to discuss early speech and ...

  10. Data concatenation, Bayesian concordance and coalescent-based analyses of the species tree for the rapid radiation of Triturus newts.

    PubMed

    Wielstra, Ben; Arntzen, Jan W; van der Gaag, Kristiaan J; Pabijan, Maciej; Babik, Wieslaw

    2014-01-01

    The phylogenetic relationships for rapid species radiations are difficult to disentangle. Here we study one such case, namely the genus Triturus, which is composed of the marbled and crested newts. We analyze data for 38 genetic markers, positioned in 3-prime untranslated regions of protein-coding genes, obtained with 454 sequencing. Our dataset includes twenty Triturus newts and represents all nine species. Bayesian analysis of population structure allocates all individuals to their respective species. The branching patterns obtained by data concatenation, Bayesian concordance analysis and coalescent-based estimations of the species tree differ from one another. The data concatenation based species tree shows high branch support but branching order is considerably affected by allele choice in the case of heterozygotes in the concatenation process. Bayesian concordance analysis expresses the conflict between individual gene trees for part of the Triturus species tree as low concordance factors. The coalescent-based species tree is relatively similar to a previously published species tree based upon morphology and full mtDNA and any conflicting internal branches are not highly supported. Our findings reflect high gene tree discordance due to incomplete lineage sorting (possibly aggravated by hybridization) in combination with low information content of the markers employed (as can be expected for relatively recent species radiations). This case study highlights the complexity of resolving rapid radiations and we acknowledge that to convincingly resolve the Triturus species tree even more genes will have to be consulted.

  11. High-speed concatenation of frequency ramps using sampled grating distributed Bragg reflector laser diode sources for OCT resolution enhancement

    NASA Astrophysics Data System (ADS)

    George, Brandon; Derickson, Dennis

    2010-02-01

    Wavelength tunable sampled grating distributed Bragg reflector (SG-DBR) lasers used for telecommunications applications have previously demonstrated the ability for linear frequency ramps covering the entire tuning range of the laser at 100 kHz repetition rates1. An individual SG-DBR laser has a typical tuning range of 50 nm. The InGaAs/InP material system often used with SG-DBR lasers allows for design variations that cover the 1250 to 1650 nm wavelength range. This paper addresses the possibility of concatenating the outputs of tunable SGDBR lasers covering adjacent wavelength ranges for enhancing the resolution of OCT measurements. This laser concatenation method is demonstrated by combining the 1525 nm to 1575 nm wavelength range of a "C Band" SG-DBR laser with the 1570nm to 1620 nm wavelength coverage of an "L-Band" SG-DBR laser. Measurements show that SGDBR lasers can be concatenated with a transition switching time of less than 50 ns with undesired leakage signals attenuated by 50 dB.

  12. Expansion and concatenation of nonmuscle myosin IIA filaments drive cellular contractile system formation during interphase and mitosis

    PubMed Central

    Fenix, Aidan M.; Taneja, Nilay; Buttler, Carmen A.; Lewis, John; Van Engelenburg, Schuyler B.; Ohi, Ryoma; Burnette, Dylan T.

    2016-01-01

    Cell movement and cytokinesis are facilitated by contractile forces generated by the molecular motor, nonmuscle myosin II (NMII). NMII molecules form a filament (NMII-F) through interactions of their C-terminal rod domains, positioning groups of N-terminal motor domains on opposite sides. The NMII motors then bind and pull actin filaments toward the NMII-F, thus driving contraction. Inside of crawling cells, NMIIA-Fs form large macromolecular ensembles (i.e., NMIIA-F stacks), but how this occurs is unknown. Here we show NMIIA-F stacks are formed through two non–mutually exclusive mechanisms: expansion and concatenation. During expansion, NMIIA molecules within the NMIIA-F spread out concurrent with addition of new NMIIA molecules. Concatenation occurs when multiple NMIIA-Fs/NMIIA-F stacks move together and align. We found that NMIIA-F stack formation was regulated by both motor activity and the availability of surrounding actin filaments. Furthermore, our data showed expansion and concatenation also formed the contractile ring in dividing cells. Thus interphase and mitotic cells share similar mechanisms for creating large contractile units, and these are likely to underlie how other myosin II–based contractile systems are assembled. PMID:26960797

  13. Children's perception of their synthetically corrected speech production.

    PubMed

    Strömbergsson, Sofia; Wengelin, Asa; House, David

    2014-06-01

    We explore children's perception of their own speech - in its online form, in its recorded form, and in synthetically modified forms. Children with phonological disorder (PD) and children with typical speech and language development (TD) performed tasks of evaluating accuracy of the different types of speech stimuli, either immediately after having produced the utterance or after a delay. In addition, they performed a task designed to assess their ability to detect synthetic modification. Both groups showed high performance in tasks involving evaluation of other children's speech, whereas in tasks of evaluating one's own speech, the children with PD were less accurate than their TD peers. The children with PD were less sensitive to misproductions in immediate conjunction with their production of an utterance, and more accurate after a delay. Within-category modification often passed undetected, indicating a satisfactory quality of the generated speech. Potential clinical benefits of using corrective re-synthesis are discussed.

  14. Concatenation of cyan and yellow fluorescent proteins for efficient resonance energy transfer.

    PubMed

    Shimozono, Satoshi; Hosoi, Haruko; Mizuno, Hideaki; Fukano, Takashi; Tahara, Tahei; Miyawaki, Atsushi

    2006-05-23

    Highly efficient fluorescence resonance energy transfer between cyan(CFP) and yellow fluorescent proteins (YFP), the cyan- and yellow-emitting variants of the Aequorea green fluorescent protein, respectively, was achieved by tightly concatenating the two proteins. After the C-terminus of CFP and the N-terminus of YFP were truncated by 11 and 5 amino acids, respectively, the proteins were fused through a leucine-glutamate dipeptide. The resulting chimeric protein, which we called Cy11.5, exhibited a simple emission spectrum that peaked at 527 nm when the protein was excited at 436 nm. The time-resolved emission of Cy11.5 was measured using a streak camera. After excitation of Cy11.5 with a 400 nm ultrashort pulse, a fast decay of the CFP emission and a concomitant rise of the YFP emission were observed with a lifetime of 66 ps. By contrast, the emission from CFP alone showed a decay component with a lifetime of 2.9 ns. We concluded that in fully folded Cy11.5 molecules, intramolecular FRET occurred with an efficiency of 98%. Importantly, most Cy11.5 molecules were properly folded, and the protein was highly resistant to all of the tested proteases. In living cells, therefore, Cy11.5 behaved as a single fluorescent protein with a broad excitation spectrum. Moreover, Cy11.5 was used as an optical highlighter after photobleaching of YFP. When HeLa cells expressing Cy11.5 were irradiated at 514.5 nm, a 10-fold increase in the 475 nm fluorescence intensity was observed. These features make Cy11.5 useful as an optical highlighter and a new-colored fluorescent protein for multicolor imaging.

  15. Concatenation of observed grasp phases with observer's distal movements: a behavioural and TMS study.

    PubMed

    De Stefani, Elisa; Innocenti, Alessandro; De Marco, Doriana; Gentilucci, Maurizio

    2013-01-01

    The present study aimed at determining how actions executed by two conspecifics can be coordinated with each other, or more specifically, how the observation of different phases of a reaching-grasping action is temporary related to the execution of a movement of the observer. Participants observed postures of initial finger opening, maximal finger aperture, and final finger closing of grasp after observation of an initial hand posture. Then, they opened or closed their right thumb and index finger (experiments 1, 2 and 3). Response times decreased, whereas acceleration and velocity of actual finger movements increased when observing the two late phases of grasp. In addition, the results ruled out the possibility that this effect was due to salience of the visual stimulus when the hand was close to the target and confirmed an effect of even hand postures in addition to hand apparent motion due to the succession of initial hand posture and grasp phase. In experiments 4 and 5, the observation of grasp phases modulated even foot movements and pronunciation of syllables. Finally, in experiment 6, transcranial magnetic stimulation applied to primary motor cortex 300 ms post-stimulus induced an increase in hand motor evoked potentials of opponens pollicis muscle when observing the two late phases of grasp. These data suggest that the observation of grasp phases induced simulation which was stronger during observation of finger closing. This produced shorter response times, greater acceleration and velocity of the successive movement. In general, our data suggest best concatenation between two movements (one observed and the other executed) when the observed (and simulated) movement was to be accomplished. The mechanism joining the observation of a conspecific's action with our own movement may be precursor of social functions. It may be at the basis for interactions between conspecifics, and related to communication between individuals.

  16. NINJA-OPS: Fast Accurate Marker Gene Alignment Using Concatenated Ribosomes.

    PubMed

    Al-Ghalith, Gabriel A; Montassier, Emmanuel; Ward, Henry N; Knights, Dan

    2016-01-01

    The explosion of bioinformatics technologies in the form of next generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short read mapping is therefore a fundamental component of next generation sequencing pipelines which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been applied to microbial marker gene sequencing studies, which have mostly relied on novel heuristic approaches. We propose NINJA Is Not Just Another OTU-Picking Solution (NINJA-OPS, or NINJA for short), a fast and highly accurate novel method enabling reference-based marker gene matching (picking Operational Taxonomic Units, or OTUs). NINJA takes advantage of the Burrows-Wheeler (BW) alignment using an artificial reference chromosome composed of concatenated reference sequences, the "concatesome," as the BW input. Other features include automatic support for paired-end reads with arbitrary insert sizes. NINJA is also free and open source and implements several pre-filtering methods that elicit substantial speedup when coupled with existing tools. We applied NINJA to several published microbiome studies, obtaining accuracy similar to or better than previous reference-based OTU-picking methods while achieving an order of magnitude or more speedup and using a fraction of the memory footprint. NINJA is a complete pipeline that takes a FASTA-formatted input file and outputs a QIIME-formatted taxonomy-annotated BIOM file for an entire MiSeq run of human gut microbiome 16S genes in under 10 minutes on a dual-core laptop.

  17. NINJA-OPS: Fast Accurate Marker Gene Alignment Using Concatenated Ribosomes

    PubMed Central

    Al-Ghalith, Gabriel A.; Montassier, Emmanuel; Ward, Henry N.; Knights, Dan

    2016-01-01

    The explosion of bioinformatics technologies in the form of next generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short read mapping is therefore a fundamental component of next generation sequencing pipelines which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been applied to microbial marker gene sequencing studies, which have mostly relied on novel heuristic approaches. We propose NINJA Is Not Just Another OTU-Picking Solution (NINJA-OPS, or NINJA for short), a fast and highly accurate novel method enabling reference-based marker gene matching (picking Operational Taxonomic Units, or OTUs). NINJA takes advantage of the Burrows-Wheeler (BW) alignment using an artificial reference chromosome composed of concatenated reference sequences, the “concatesome,” as the BW input. Other features include automatic support for paired-end reads with arbitrary insert sizes. NINJA is also free and open source and implements several pre-filtering methods that elicit substantial speedup when coupled with existing tools. We applied NINJA to several published microbiome studies, obtaining accuracy similar to or better than previous reference-based OTU-picking methods while achieving an order of magnitude or more speedup and using a fraction of the memory footprint. NINJA is a complete pipeline that takes a FASTA-formatted input file and outputs a QIIME-formatted taxonomy-annotated BIOM file for an entire MiSeq run of human gut microbiome 16S genes in under 10 minutes on a dual-core laptop. PMID:26820746

  18. Speech impairment (adult)

    MedlinePlus

    Language impairment; Impairment of speech; Inability to speak; Aphasia; Dysarthria; Slurred speech; Dysphonia voice disorders ... environment and keep external stimuli to a minimum. Speak in a normal tone of voice (this condition ...

  19. SPEECH HANDICAPPED SCHOOL CHILDREN.

    ERIC Educational Resources Information Center

    JOHNSON, WENDELL; AND OTHERS

    THIS BOOK IS DESIGNED PRIMARILY FOR STUDENTS WHO ARE BEING TRAINED TO WORK WITH SPEECH HANDICAPPED SCHOOL CHILDREN, EITHER AS SPEECH CORRECTIONISTS OR AS CLASSROOM TEACHERS. THE BOOK DEALS WITH FOUR MAJOR QUESTIONS--(1) WHAT KINDS OF SPEECH DISORDERS ARE FOUND AMONG SCHOOL CHILDREN, (2) WHAT ARE THE PHYSICAL, PSYCHOLOGICAL AND SOCIAL CONDITIONS,…

  20. Free Speech Yearbook 1978.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The 17 articles in this collection deal with theoretical and practical freedom of speech issues. The topics include: freedom of speech in Marquette Park, Illinois; Nazis in Skokie, Illinois; freedom of expression in the Confederate States of America; Robert M. LaFollette's arguments for free speech and the rights of Congress; the United States…

  1. Talking Speech Input.

    ERIC Educational Resources Information Center

    Berliss-Vincent, Jane; Whitford, Gigi

    2002-01-01

    This article presents both the factors involved in successful speech input use and the potential barriers that may suggest that other access technologies could be more appropriate for a given individual. Speech input options that are available are reviewed and strategies for optimizing use of speech recognition technology are discussed. (Contains…

  2. Speech 7 through 12.

    ERIC Educational Resources Information Center

    Nederland Independent School District, TX.

    GRADES OR AGES: Grades 7 through 12. SUBJECT MATTER: Speech. ORGANIZATION AND PHYSICAL APPEARANCE: Following the foreward, philosophy and objectives, this guide presents a speech curriculum. The curriculum covers junior high and Speech I, II, III (senior high). Thirteen units of study are presented for junior high, each unit is divided into…

  3. Speech and Language Delay

    MedlinePlus

    MENU Return to Web version Speech and Language Delay Overview How do I know if my child has speech delay? Every child develops at his or her ... of the same age, the problem may be speech delay. Your doctor may think your child has ...

  4. The Limits of Speech.

    ERIC Educational Resources Information Center

    Shea, Christopher

    1993-01-01

    Colleges and universities are finding it difficult to mold administrative policy concerning freedom of speech on campus. Even when speech or harassment codes mirror federal guidelines for antidiscrimination policy, controversy is common. Potential infringement on rights to free speech is the central issue. (MSE)

  5. Status Report on Speech Research: A Report on the Status and Progress of Studies on the Nature of Speech, Instrumentation for Its Investigation, and Practical Applications, April 1-June 30, 1977.

    ERIC Educational Resources Information Center

    Haskins Labs., New Haven, CT.

    This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. The ten papers treat the following topics: speech synthesis as a tool for the study of speech production; the study of articulatory organization; phonetic perception; cardiac…

  6. A Statistical Quality Model for Data-Driven Speech Animation.

    PubMed

    Ma, Xiaohan; Deng, Zhigang

    2012-11-01

    In recent years, data-driven speech animation approaches have achieved significant successes in terms of animation quality. However, how to automatically evaluate the realism of novel synthesized speech animations has been an important yet unsolved research problem. In this paper, we propose a novel statistical model (called SAQP) to automatically predict the quality of on-the-fly synthesized speech animations by various data-driven techniques. Its essential idea is to construct a phoneme-based, Speech Animation Trajectory Fitting (SATF) metric to describe speech animation synthesis errors and then build a statistical regression model to learn the association between the obtained SATF metric and the objective speech animation synthesis quality. Through delicately designed user studies, we evaluate the effectiveness and robustness of the proposed SAQP model. To the best of our knowledge, this work is the first-of-its-kind, quantitative quality model for data-driven speech animation. We believe it is the important first step to remove a critical technical barrier for applying data-driven speech animation techniques to numerous online or interactive talking avatar applications.

  7. Machine Translation from Speech

    NASA Astrophysics Data System (ADS)

    Schwartz, Richard; Olive, Joseph; McCary, John; Christianson, Caitlin

    This chapter describes approaches for translation from speech. Translation from speech presents two new issues. First, of course, we must recognize the speech in the source language. Although speech recognition has improved considerably over the last three decades, it is still far from being a solved problem. In the best of conditions, when the speech comes from high quality, carefully enunciated speech, on common topics (such as speech read by a trained news broadcaster), the word error rate is typically on the order of 5%. Humans can typically transcribe speech like this with less than 1% disagreement between annotators, so even this best number is still far worse than human performance. However, the task gets much harder when anything changes from this ideal condition. Some of the conditions that cause higher error rate are, if the topic is somewhat unusual, or the speakers are not reading so that their speech is more spontaneous, or if the speakers have an accent or are speaking a dialect, or if there is any acoustic degradation, such as noise or reverberation. In these cases, the word error can increase significantly to 20%, 30%, or higher. Accordingly, most of this chapter discusses techniques for improving speech recognition accuracy, while one section discusses techniques for integrating speech recognition with translation.

  8. Speech in spinocerebellar ataxia.

    PubMed

    Schalling, Ellika; Hartelius, Lena

    2013-12-01

    Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria but symptoms related to phonation may be more prominent. One study to date has shown an association between differences in speech and voice symptoms related to genotype. More studies of speech and voice phenotypes are motivated, to possibly aid in clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia.

  9. Auditory cortical deactivation during speech production and following speech perception: an EEG investigation of the temporal dynamics of the auditory alpha rhythm.

    PubMed

    Jenson, David; Harkrider, Ashley W; Thornton, David; Bowers, Andrew L; Saltuklaroglu, Tim

    2015-01-01

    Sensorimotor integration (SMI) across the dorsal stream enables online monitoring of speech. Jenson et al. (2014) used independent component analysis (ICA) and event related spectral perturbation (ERSP) analysis of electroencephalography (EEG) data to describe anterior sensorimotor (e.g., premotor cortex, PMC) activity during speech perception and production. The purpose of the current study was to identify and temporally map neural activity from posterior (i.e., auditory) regions of the dorsal stream in the same tasks. Perception tasks required "active" discrimination of syllable pairs (/ba/ and /da/) in quiet and noisy conditions. Production conditions required overt production of syllable pairs and nouns. ICA performed on concatenated raw 68 channel EEG data from all tasks identified bilateral "auditory" alpha (α) components in 15 of 29 participants localized to pSTG (left) and pMTG (right). ERSP analyses were performed to reveal fluctuations in the spectral power of the α rhythm clusters across time. Production conditions were characterized by significant α event related synchronization (ERS; pFDR < 0.05) concurrent with EMG activity from speech production, consistent with speech-induced auditory inhibition. Discrimination conditions were also characterized by α ERS following stimulus offset. Auditory α ERS in all conditions temporally aligned with PMC activity reported in Jenson et al. (2014). These findings are indicative of speech-induced suppression of auditory regions, possibly via efference copy. The presence of the same pattern following stimulus offset in discrimination conditions suggests that sensorimotor contributions following speech perception reflect covert replay, and that covert replay provides one source of the motor activity previously observed in some speech perception tasks. To our knowledge, this is the first time that inhibition of auditory regions by speech has been observed in real-time with the ICA/ERSP technique. PMID:26500519

  10. Auditory cortical deactivation during speech production and following speech perception: an EEG investigation of the temporal dynamics of the auditory alpha rhythm

    PubMed Central

    Jenson, David; Harkrider, Ashley W.; Thornton, David; Bowers, Andrew L.; Saltuklaroglu, Tim

    2015-01-01

    Sensorimotor integration (SMI) across the dorsal stream enables online monitoring of speech. Jenson et al. (2014) used independent component analysis (ICA) and event related spectral perturbation (ERSP) analysis of electroencephalography (EEG) data to describe anterior sensorimotor (e.g., premotor cortex, PMC) activity during speech perception and production. The purpose of the current study was to identify and temporally map neural activity from posterior (i.e., auditory) regions of the dorsal stream in the same tasks. Perception tasks required “active” discrimination of syllable pairs (/ba/ and /da/) in quiet and noisy conditions. Production conditions required overt production of syllable pairs and nouns. ICA performed on concatenated raw 68 channel EEG data from all tasks identified bilateral “auditory” alpha (α) components in 15 of 29 participants localized to pSTG (left) and pMTG (right). ERSP analyses were performed to reveal fluctuations in the spectral power of the α rhythm clusters across time. Production conditions were characterized by significant α event related synchronization (ERS; pFDR < 0.05) concurrent with EMG activity from speech production, consistent with speech-induced auditory inhibition. Discrimination conditions were also characterized by α ERS following stimulus offset. Auditory α ERS in all conditions temporally aligned with PMC activity reported in Jenson et al. (2014). These findings are indicative of speech-induced suppression of auditory regions, possibly via efference copy. The presence of the same pattern following stimulus offset in discrimination conditions suggests that sensorimotor contributions following speech perception reflect covert replay, and that covert replay provides one source of the motor activity previously observed in some speech perception tasks. To our knowledge, this is the first time that inhibition of auditory regions by speech has been observed in real-time with the ICA/ERSP technique. PMID

  11. FPGA implementation of concatenated non-binary QC-LDPC codes for high-speed optical transport.

    PubMed

    Zou, Ding; Djordjevic, Ivan B

    2015-06-01

    In this paper, we propose a soft-decision-based FEC scheme that is the concatenation of a non-binary LDPC code and hard-decision FEC code. The proposed NB-LDPC + RS with overhead of 27.06% provides a superior NCG of 11.9dB at a post-FEC BER of 10-15. As a result, the proposed NB-LDPC codes represent the strong FEC candidate of soft-decision FEC for beyond 100Gb/s optical transmission systems.

  12. RibAlign: a software tool and database for eubacterial phylogeny based on concatenated ribosomal protein subunits

    PubMed Central

    Teeling, Hanno; Gloeckner, Frank Oliver

    2006-01-01

    Background Until today, analysis of 16S ribosomal RNA (rRNA) sequences has been the de-facto gold standard for the assessment of phylogenetic relationships among prokaryotes. However, the branching order of the individual phlya is not well-resolved in 16S rRNA-based trees. In search of an improvement, new phylogenetic methods have been developed alongside with the growing availability of complete genome sequences. Unfortunately, only a few genes in prokaryotic genomes qualify as universal phylogenetic markers and almost all of them have a lower information content than the 16S rRNA gene. Therefore, emphasis has been placed on methods that are based on multiple genes or even entire genomes. The concatenation of ribosomal protein sequences is one method which has been ascribed an improved resolution. Since there is neither a comprehensive database for ribosomal protein sequences nor a tool that assists in sequence retrieval and generation of respective input files for phylogenetic reconstruction programs, RibAlign has been developed to fill this gap. Results RibAlign serves two purposes: First, it provides a fast and scalable database that has been specifically adapted to eubacterial ribosomal protein sequences and second, it provides sophisticated import and export capabilities. This includes semi-automatic extraction of ribosomal protein sequences from whole-genome GenBank and FASTA files as well as exporting aligned, concatenated and filtered sequence files that can directly be used in conjunction with the PHYLIP and MrBayes phylogenetic reconstruction programs. Conclusion Up to now, phylogeny based on concatenated ribosomal protein sequences is hampered by the limited set of sequenced genomes and high computational requirements. However, hundreds of full and draft genome sequencing projects are on the way, and advances in cluster-computing and algorithms make phylogenetic reconstructions feasible even with large alignments of concatenated marker genes. Rib

  13. Detection of terahertz radiation by tightly concatenated InGaAs field-effect transistors integrated on a single chip

    SciTech Connect

    Popov, V. V.; Yermolaev, D. M.; Shapoval, S. Yu.; Maremyanin, K. V.; Gavrilenko, V. I.; Zemlyakov, V. E.; Bespalov, V. A.; Yegorkin, V. I.; Maleev, N. A.; Ustinov, V. M.

    2014-04-21

    A tightly concatenated chain of InGaAs field-effect transistors with an asymmetric T-gate in each transistor demonstrates strong terahertz photovoltaic response without using supplementary antenna elements. We obtain the responsivity above 1000 V/W and up to 2000 V/W for unbiased and drain-biased transistors in the chain, respectively, with the noise equivalent power below 10{sup −11} W/Hz{sup 0.5} in the unbiased mode of the detector operation.

  14. Early recognition of speech

    PubMed Central

    Remez, Robert E; Thomas, Emily F

    2013-01-01

    Classic research on the perception of speech sought to identify minimal acoustic correlates of each consonant and vowel. In explaining perception, this view designated momentary components of an acoustic spectrum as cues to the recognition of elementary phonemes. This conceptualization of speech perception is untenable given the findings of phonetic sensitivity to modulation independent of the acoustic and auditory form of the carrier. The empirical key is provided by studies of the perceptual organization of speech, a low-level integrative function that finds and follows the sensory effects of speech amid concurrent events. These projects have shown that the perceptual organization of speech is keyed to modulation; fast; unlearned; nonsymbolic; indifferent to short-term auditory properties; and organization requires attention. The ineluctably multisensory nature of speech perception also imposes conditions that distinguish language among cognitive systems. WIREs Cogn Sci 2013, 4:213–223. doi: 10.1002/wcs.1213 PMID:23926454

  15. Speech Alarms Pilot Study

    NASA Technical Reports Server (NTRS)

    Sandor, Aniko; Moses, Haifa

    2016-01-01

    Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.

  16. Military applications of automatic speech recognition and future requirements

    NASA Technical Reports Server (NTRS)

    Beek, Bruno; Cupples, Edward J.

    1977-01-01

    An updated summary of the state-of-the-art of automatic speech recognition and its relevance to military applications is provided. A number of potential systems for military applications are under development. These include: (1) digital narrowband communication systems; (2) automatic speech verification; (3) on-line cartographic processing unit; (4) word recognition for militarized tactical data system; and (5) voice recognition and synthesis for aircraft cockpit.

  17. Neural network based speech synthesizer: A preliminary report

    NASA Technical Reports Server (NTRS)

    Villarreal, James A.; Mcintire, Gary

    1987-01-01

    A neural net based speech synthesis project is discussed. The novelty is that the reproduced speech was extracted from actual voice recordings. In essence, the neural network learns the timing, pitch fluctuations, connectivity between individual sounds, and speaking habits unique to that individual person. The parallel distributed processing network used for this project is the generalized backward propagation network which has been modified to also learn sequences of actions or states given in a particular plan.

  18. An experimental study of the concatenated Reed-Solomon/Viterbi channel coding system performance and its impact on space communications

    NASA Technical Reports Server (NTRS)

    Liu, K. Y.; Lee, J. J.

    1981-01-01

    The need for efficient space communication at very low bit error probabilities to the specification and implementation of a concatenated coding system using an interleaved Reed-Solomon code as the outer code and a Viterbi-decoded convolutional code as the inner code. Experimental results of this channel coding system are presented under an emulated S-band uplink and X-band downlink two-way space communication channel, where both uplink and downlink have strong carrier power. This work was performed under the NASA End-to-End Data Systems program at JPL. Test results verify that at a bit error probability of 10 to the -6 power or less, this concatenated coding system does provide a coding gain of 2.5 dB or more over the Viterbi-decoded convolutional-only coding system. These tests also show that a desirable interleaving depth for the Reed-Solomon outer code is 8 or more. The impact of this "virtually" error-free space communication link on the transmission of images is discussed and examples of simulation results are given.

  19. Reversed-Phase Chromatography with Multiple Fraction Concatenation Strategy for Proteome Profiling of Human MCF10A Cells

    SciTech Connect

    Wang, Yuexi; Yang, Feng; Gritsenko, Marina A.; Wang, Yingchun; Clauss, Therese RW; Liu, Tao; Shen, Yufeng; Monroe, Matthew E.; Lopez-Ferrer, Daniel; Reno, Theresa; Moore, Ronald J.; Klemke, Richard L.; Camp, David G.; Smith, Richard D.

    2011-05-01

    Two dimensional liquid chromatography (2D LC) is commonly used for shotgun proteomics to improve the analysis dynamic range. Reversed phase liquid chromatography (RPLC) has been routinely employed as the second dimensional separation prior to the mass spectrometric analysis. Construction of 2D separation with RP-RP arises a concern for the separation orthogonality. In this study, we applied a novel concatenation strategy to improve the orthogonality of 2D RP-RP formed by low pH (i.e., pH 3) and high pH (i.e., pH 10) RPLC. We confidently identified 3753 proteins (18570 unique peptides) and 5907 proteins (37633 unique peptides) from low pH RPLC-RP and high pH RPLC-RP, respectively, for a trypsin-digested human MCF10A cell sample. Compared with SCX-RP, the high pH-low pH RP-RP approach resulted in 1.8-fold and 1.6-fold in the number of peptide and protein identifications, respectively. In addition to the broader identifications, the High pH-low pH RP-RP approach has advantages including the improved protein sequence coverage, the simplified sample processing, and the reduced sample loss. These results demonstrated that the concatenation high pH-low pH RP-RP strategy is an attractive alternative to SCX for 2D LC shotgun proteomic analysis.

  20. Distributed processing for speech understanding

    SciTech Connect

    Bronson, E.C.; Siegel, L.

    1983-01-01

    Continuous speech understanding is a highly complex artificial intelligence task requiring extensive computation. This complexity precludes real-time speech understanding on a conventional serial computer. Distributed processing technique can be applied to the speech understanding task to improve processing speed. In the paper, the speech understanding task and several speech understanding systems are described. Parallel processing techniques are presented and a distributed processing architecture for speech understanding is outlined. 35 references.

  1. Speech-Language Therapy (For Parents)

    MedlinePlus

    ... 5 Things to Know About Zika & Pregnancy Speech-Language Therapy KidsHealth > For Parents > Speech-Language Therapy Print ... with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders A speech disorder refers ...

  2. Speech Processing Applications Using AN Am-Fm Modulation Model.

    NASA Astrophysics Data System (ADS)

    Potamianos, Alexandros

    1995-01-01

    sum of the reconstructed formant bands. The AM-FM analysis-synthesis system produces speech of very natural quality. Currently, the vocoder operates in the 4.8-9.6 kbits/sec range. Future applications of these modeling/coding ideas include text-to-speech synthesis and speaker identification. Overall, the AM-FM modulation model and multiband demodulation analysis are a general nonlinear approach to speech processing with a wide range of successful applications.

  3. The development of gesture and speech as an integrated system.

    PubMed

    Goldin-Meadow, S

    1998-01-01

    transpose that knowledge to a new level and thus express those pieces of information within a single modality? More work is needed to investigate whether the act of producing gesture-speech mismatches itself facilitates transition. Even if it turns out that the production of gesture-speech mismatches has little role to play in facilitating cognitive change, mismatch remains a reliable marker of the speaker's potential for cognitive growth. As such, an understanding of the relationship between gesture and speech may prove useful in clinical settings. For example, there is some evidence that children with delayed onset of two-word speech fall naturally into two groups: children who eventually achieve two-word speech, albeit later than the norm (that is, late bloomers), and children who continue to have serious difficulties with spoken language and may never be able to combine words into a single string (Feldman, Holland, Kemp, and Janosky, 1992; Thal, Tobias, and Morrison, 1991). Observation of combinations in which gesture and speech convey different information may prove a useful clinical tool for distinguishing, at a relatively young age, children who will be late bloomers from those who will have great difficulty mastering spoken language without intervention (see Stare, 1996, for preliminary evidence that the relationship between gesture and speech in children with unilateral brain damage correlates with early versus late onset of two-word combinations. In sum, for both speakers and listeners, gesture and speech are two aspects of a single process, with each modality contributing its own unique level of representation. Gesture conveys information in the global, imagistic form for which it is well suited, and speech conveys information in the segmented, combinatorial fashion that characterizes linguistic structures. The total representation of any message is therefore a synthesis of the analog gestural mode and the discrete speech mode. (ABSTRACT TRUNCATED)

  4. Free Speech Yearbook 1977.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The eleven articles in this collection explore various aspects of freedom of speech. Topics include the lack of knowledge on the part of many judges regarding the complex act of communication; the legislatures and free speech in colonial Connecticut and Rhode Island; contributions of sixteenth century Anabaptist heretics to First Amendment…

  5. Improving Alaryngeal Speech Intelligibility.

    ERIC Educational Resources Information Center

    Christensen, John M.; Dwyer, Patricia E.

    1990-01-01

    Laryngectomized patients using esophageal speech or an electronic artificial larynx have difficulty producing correct voicing contrasts between homorganic consonants. This paper describes a therapy technique that emphasizes "pushing harder" on voiceless consonants to improve alaryngeal speech intelligibility and proposes focusing on the production…

  6. Illustrated Speech Anatomy.

    ERIC Educational Resources Information Center

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  7. Chief Seattle's Speech Revisited

    ERIC Educational Resources Information Center

    Krupat, Arnold

    2011-01-01

    Indian orators have been saying good-bye for more than three hundred years. John Eliot's "Dying Speeches of Several Indians" (1685), as David Murray notes, inaugurates a long textual history in which "Indians... are most useful dying," or, as in a number of speeches, bidding the world farewell as they embrace an undesired but apparently inevitable…

  8. Free Speech Yearbook 1981.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    1982-01-01

    The nine articles in this collection deal with theoretical and practical freedom of speech issues. Topics discussed include the following: (1) freedom of expression in Thailand and India; (2) metaphors and analogues in several landmark free speech cases; (3) Supreme Court Justice William O. Douglas's views of the First Amendment; (4) the San…

  9. Free Speech Yearbook 1980.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    The 11 articles in this collection deal with theoretical and practical freedom of speech issues. The topics covered are (1) the United States Supreme Court and communication theory; (2) truth, knowledge, and a democratic respect for diversity; (3) denial of freedom of speech in Jock Yablonski's campaign for the presidency of the United Mine…

  10. Free Speech Yearbook: 1971.

    ERIC Educational Resources Information Center

    Tedford, Thomas L., Editor

    This publication of ten scholarly articles provides perspectives on problems and forces that inhibit freedom of speech. 1) "Freedom of Speech and Change in American Education" suggests that a more communicative society, and increasing academic freedoms, helps schools adapt to social change; 2) "Syllabus and Bibliography for 'Issues in Freedom of…

  11. Free Speech Yearbook 1976.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The articles collected in this annual address several aspects of First Amendment Law. The following titles are included: "Freedom of Speech As an Academic Discipline" (Franklyn S. Haiman), "Free Speech and Foreign-Policy Decision Making" (Douglas N. Freeman), "The Supreme Court and the First Amendment: 1975-1976" (William A. Linsley), "'Arnett v.…

  12. Tracking Speech Sound Acquisition

    ERIC Educational Resources Information Center

    Powell, Thomas W.

    2011-01-01

    This article describes a procedure to aid in the clinical appraisal of child speech. The approach, based on the work by Dinnsen, Chin, Elbert, and Powell (1990; Some constraints on functionally disordered phonologies: Phonetic inventories and phonotactics. "Journal of Speech and Hearing Research", 33, 28-37), uses a railway idiom to track gains in…

  13. The Discipline of Speech.

    ERIC Educational Resources Information Center

    Reid, Loren

    1967-01-01

    In spite of the diversity of subjects subsumed under the generic term speech, all areas of this discipline are based on oral communication with its essential elements--voice, action, thought, and language. Speech may be viewed as a community of persons with a common tradition participating in a common dialog, described in part by the memberships…

  14. Free Speech. No. 38.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schoor Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tome Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds Voted For Schorr Inquiry" by Richard Lyons, "Erosion of the…

  15. Private Speech in Ballet

    ERIC Educational Resources Information Center

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  16. Automatic speech recognition

    NASA Astrophysics Data System (ADS)

    Espy-Wilson, Carol

    2005-04-01

    Great strides have been made in the development of automatic speech recognition (ASR) technology over the past thirty years. Most of this effort has been centered around the extension and improvement of Hidden Markov Model (HMM) approaches to ASR. Current commercially-available and industry systems based on HMMs can perform well for certain situational tasks that restrict variability such as phone dialing or limited voice commands. However, the holy grail of ASR systems is performance comparable to humans-in other words, the ability to automatically transcribe unrestricted conversational speech spoken by an infinite number of speakers under varying acoustic environments. This goal is far from being reached. Key to the success of ASR is effective modeling of variability in the speech signal. This tutorial will review the basics of ASR and the various ways in which our current knowledge of speech production, speech perception and prosody can be exploited to improve robustness at every level of the system.

  17. Commercial applications of speech interface technology: an industry at the threshold.

    PubMed Central

    Oberteuffer, J A

    1995-01-01

    Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loud-speakers to give easy accessibility to increasingly intelligent machines. PMID:7479717

  18. Annealed lattice animal model and Flory theory for the melt of non-concatenated rings: towards the physics of crumpling.

    PubMed

    Grosberg, Alexander Y

    2014-01-28

    A Flory theory is constructed for a long polymer ring in a melt of unknotted and non-concatenated rings. The theory assumes that the ring forms an effective annealed branched object and computes its primitive path. It is shown that the primitive path follows self-avoiding statistics and is characterized by the corresponding Flory exponent of a polymer with excluded volume. Based on that, it is shown that rings in the melt are compact objects with overall size proportional to their length raised to the 1/3 power. Furthermore, the contact probability exponent γcontact is estimated, albeit by a poorly controlled approximation, with the result close to 1.1 consistent with both numerical and experimental data.

  19. SPEECH FRIGHT IN THE ELEMENTARY SCHOOL, ITS RELATIONSHIP TO SPEECH ABILITY AND ITS POSSIBLE IMPLICATION FOR SPEECH READINESS.

    ERIC Educational Resources Information Center

    SHAW, IRWIN

    THE RELATIONSHIP OF ELEMENTARY SCHOOL STUDENTS' SPEECH FRIGHT TO THEIR SPEECH ABILITY, SPEECH ATTITUDES, AND SPEECH READINESS WAS STUDIED. SURVEYS WERE CONDUCTED AND DESCRIPTIVE DATA WERE COLLECTED ON SPEECH FRIGHT LEVELS AND SPEECH ABILITY OF 1,166 STUDENTS IN SELECTED ELEMENTARY GRADES. ATTITUDES OF TEACHERS TOWARD SPEECH FRIGHT WERE ALSO…

  20. Speech Correction in the Schools.

    ERIC Educational Resources Information Center

    Eisenson, Jon; Ogilvie, Mardel

    An introduction to the problems and therapeutic needs of school age children whose speech requires remedial attention, the text is intended for both the classroom teacher and the speech correctionist. General considerations include classification and incidence of speech defects, speech correction services, the teacher as a speaker, the mechanism…

  1. Sperry Univac speech communications technology

    NASA Technical Reports Server (NTRS)

    Medress, Mark F.

    1977-01-01

    Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.

  2. Voice and Speech after Laryngectomy

    ERIC Educational Resources Information Center

    Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

    2006-01-01

    The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

  3. Environmental Contamination of Normal Speech.

    ERIC Educational Resources Information Center

    Harley, Trevor A.

    1990-01-01

    Environmentally contaminated speech errors (irrelevant words or phrases derived from the speaker's environment and erroneously incorporated into speech) are hypothesized to occur at a high level of speech processing, but with a relatively late insertion point. The data indicate that speech production processes are not independent of other…

  4. Portable Speech Synthesizer

    NASA Technical Reports Server (NTRS)

    Leibfritz, Gilbert H.; Larson, Howard K.

    1987-01-01

    Compact speech synthesizer useful traveling companion to speech-handicapped. User simply enters statement on board, and synthesizer converts statement into spoken words. Battery-powered and housed in briefcase, easily carried on trips. Unit used on telephones and face-to-face communication. Synthesizer consists of micro-computer with memory-expansion module, speech-synthesizer circuit, batteries, recharger, dc-to-dc converter, and telephone amplifier. Components, commercially available, fit neatly in 17-by 13-by 5-in. briefcase. Weighs about 20 lb (9 kg) and operates and recharges from ac receptable.

  5. Fluid Dynamics of Human Phonation and Speech

    NASA Astrophysics Data System (ADS)

    Mittal, Rajat; Erath, Byron D.; Plesniak, Michael W.

    2013-01-01

    This article presents a review of the fluid dynamics, flow-structure interactions, and acoustics associated with human phonation and speech. Our voice is produced through the process of phonation in the larynx, and an improved understanding of the underlying physics of this process is essential to advancing the treatment of voice disorders. Insights into the physics of phonation and speech can also contribute to improved vocal training and the development of new speech compression and synthesis schemes. This article introduces the key biomechanical features of the laryngeal physiology, reviews the basic principles of voice production, and summarizes the progress made over the past half-century in understanding the flow physics of phonation and speech. Laryngeal pathologies, which significantly enhance the complexity of phonatory dynamics, are discussed. After a thorough examination of the state of the art in computational modeling and experimental investigations of phonatory biomechanics, we present a synopsis of the pacing issues in this arena and an outlook for research in this fascinating subject.

  6. Speech and Communication Disorders

    MedlinePlus

    ... or understand speech. Causes include Hearing disorders and deafness Voice problems, such as dysphonia or those caused ... language therapy can help. NIH: National Institute on Deafness and Other Communication Disorders

  7. Auditory speech preprocessors

    SciTech Connect

    Zweig, G.

    1989-01-01

    A nonlinear transmission line model of the cochlea (Zweig 1988) is proposed as the basis for a novel speech preprocessor. Sounds of different intensities, such as voiced and unvoiced speech, are preprocessed in radically different ways. The Q's of the preprocessor's nonlinear filters vary with input amplitude, higher Q's (longer integration times) corresponding to quieter sounds. Like the cochlea, the preprocessor acts as a ''subthreshold laser'' that traps and amplifies low level signals, thereby aiding in their detection and analysis. 17 refs.

  8. Human speech articulator measurements using low power, 2GHz Homodyne sensors

    SciTech Connect

    Barnes, T; Burnett, G C; Holzrichter, J F

    1999-06-29

    Very low power, short-range microwave ''radar-like'' sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate (and also in inanimate acoustic systems) microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications.

  9. Musician advantage for speech-on-speech perception.

    PubMed

    Başkent, Deniz; Gaudrain, Etienne

    2016-03-01

    Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech-in-noise, speech-on-speech perception utilizes many of the skills that musical training improves, such as better pitch perception and stream segregation, as well as use of higher-level auditory cognitive functions, such as attention. Indeed, despite the few non-musicians who performed as well as musicians, on a group level, there was a strong musician benefit for speech perception in a speech masker. This benefit does not seem to result from better voice processing and could instead be related to better stream segregation or enhanced cognitive functions. PMID:27036287

  10. Robust Speech Rate Estimation for Spontaneous Speech

    PubMed Central

    Wang, Dagen; Narayanan, Shrikanth S.

    2010-01-01

    In this paper, we propose a direct method for speech rate estimation from acoustic features without requiring any automatic speech transcription. We compare various spectral and temporal signal analysis and smoothing strategies to better characterize the underlying syllable structure to derive speech rate. The proposed algorithm extends the methods of spectral subband correlation by including temporal correlation and the use of prominent spectral subbands for improving the signal correlation essential for syllable detection. Furthermore, to address some of the practical robustness issues in previously proposed methods, we introduce some novel components into the algorithm such as the use of pitch confidence for filtering spurious syllable envelope peaks, magnifying window for tackling neighboring syllable smearing, and relative peak measure thresholds for pseudo peak rejection. We also describe an automated approach for learning algorithm parameters from data, and find the optimal settings through Monte Carlo simulations and parameter sensitivity analysis. Final experimental evaluations are conducted based on a portion of the Switchboard corpus for which manual phonetic segmentation information, and published results for direct comparison are available. The results show a correlation coefficient of 0.745 with respect to the ground truth based on manual segmentation. This result is about a 17% improvement compared to the current best single estimator and a 11% improvement over the multiestimator evaluated on the same Switchboard database. PMID:20428476

  11. Speech Alarms Pilot Study

    NASA Technical Reports Server (NTRS)

    Sandor, A.; Moses, H. R.

    2016-01-01

    Currently on the International Space Station (ISS) and other space vehicles Caution & Warning (C&W) alerts are represented with various auditory tones that correspond to the type of event. This system relies on the crew's ability to remember what each tone represents in a high stress, high workload environment when responding to the alert. Furthermore, crew receive a year or more in advance of the mission that makes remembering the semantic meaning of the alerts more difficult. The current system works for missions conducted close to Earth where ground operators can assist as needed. On long duration missions, however, they will need to work off-nominal events autonomously. There is evidence that speech alarms may be easier and faster to recognize, especially during an off-nominal event. The Information Presentation Directed Research Project (FY07-FY09) funded by the Human Research Program included several studies investigating C&W alerts. The studies evaluated tone alerts currently in use with NASA flight deck displays along with candidate speech alerts. A follow-on study used four types of speech alerts to investigate how quickly various types of auditory alerts with and without a speech component - either at the beginning or at the end of the tone - can be identified. Even though crew were familiar with the tone alert from training or direct mission experience, alerts starting with a speech component were identified faster than alerts starting with a tone. The current study replicated the results from the previous study in a more rigorous experimental design to determine if the candidate speech alarms are ready for transition to operations or if more research is needed. Four types of alarms (caution, warning, fire, and depressurization) were presented to participants in both tone and speech formats in laboratory settings and later in the Human Exploration Research Analog (HERA). In the laboratory study, the alerts were presented by software and participants were

  12. Using concatenated subunits to investigate the functional consequences of heterotetrameric inositol 1,4,5-trisphosphate receptors.

    PubMed

    Chandrasekhar, Rahul; Alzayady, Kamil J; Yule, David I

    2015-06-01

    Inositol 1,4,5-trisphosphate receptors (IP3Rs) are a family of ubiquitous, ER localized, tetrameric Ca2+ release channels. There are three subtypes of the IP3Rs (R1, R2, R3), encoded by three distinct genes, that share ∼60-70% sequence identity. The diversity of Ca2+ signals generated by IP3Rs is thought to be largely the result of differential tissue expression, intracellular localization and subtype-specific regulation of the three subtypes by various cellular factors, most significantly InsP3, Ca2+ and ATP. However, largely unexplored is the notion of additional signal diversity arising from the assembly of both homo and heterotetrameric InsP3Rs. In the present article, we review the biochemical and functional evidence supporting the existence of homo and heterotetrameric populations of InsP3Rs. In addition, we consider a strategy that utilizes genetically concatenated InsP3Rs to study the functional characteristics of heterotetramers with unequivocally defined composition. This approach reveals that the overall properties of IP3R are not necessarily simply a blend of the constituent monomers but that specific subtypes appear to dominate the overall characteristics of the tetramer. It is envisioned that the ability to generate tetramers with defined wild type and mutant subunits will be useful in probing fundamental questions relating to IP3R structure and function.

  13. Differential Diagnosis of Severe Speech Disorders Using Speech Gestures

    ERIC Educational Resources Information Center

    Bahr, Ruth Huntley

    2005-01-01

    The differentiation of childhood apraxia of speech from severe phonological disorder is a common clinical problem. This article reports on an attempt to describe speech errors in children with childhood apraxia of speech on the basis of gesture use and acoustic analyses of articulatory gestures. The focus was on the movement of articulators and…

  14. Why Go to Speech Therapy?

    MedlinePlus

    ... Teachers Speech-Language Pathologists Physicians Employers Tweet Why Go To Speech Therapy? Parents of Preschoolers Parents of ... types of therapy work best when you can go on an intensive schedule (i.e., every day ...

  15. Development of a speech autocuer

    NASA Technical Reports Server (NTRS)

    Bedles, R. L.; Kizakvich, P. N.; Lawson, D. T.; Mccartney, M. L.

    1980-01-01

    A wearable, visually based prosthesis for the deaf based upon the proven method for removing lipreading ambiguity known as cued speech was fabricated and tested. Both software and hardware developments are described, including a microcomputer, display, and speech preprocessor.

  16. Hearing or speech impairment - resources

    MedlinePlus

    Resources - hearing or speech impairment ... The following organizations are good resources for information on hearing impairment or speech impairment: Alexander Graham Bell Association for the Deaf and Hard of Hearing -- www.agbell. ...

  17. Development of a speech autocuer

    NASA Astrophysics Data System (ADS)

    Bedles, R. L.; Kizakvich, P. N.; Lawson, D. T.; McCartney, M. L.

    1980-12-01

    A wearable, visually based prosthesis for the deaf based upon the proven method for removing lipreading ambiguity known as cued speech was fabricated and tested. Both software and hardware developments are described, including a microcomputer, display, and speech preprocessor.

  18. Speech spectrogram expert

    SciTech Connect

    Johannsen, J.; Macallister, J.; Michalek, T.; Ross, S.

    1983-01-01

    Various authors have pointed out that humans can become quite adept at deriving phonetic transcriptions from speech spectrograms (as good as 90percent accuracy at the phoneme level). The authors describe an expert system which attempts to simulate this performance. The speech spectrogram expert (spex) is actually a society made up of three experts: a 2-dimensional vision expert, an acoustic-phonetic expert, and a phonetics expert. The visual reasoning expert finds important visual features of the spectrogram. The acoustic-phonetic expert reasons about how visual features relates to phonemes, and about how phonemes change visually in different contexts. The phonetics expert reasons about allowable phoneme sequences and transformations, and deduces an english spelling for phoneme strings. The speech spectrogram expert is highly interactive, allowing users to investigate hypotheses and edit rules. 10 references.

  19. Development and Perceptual Evaluation of Amplitude-Based F0 Control in Electrolarynx Speech

    PubMed Central

    Saikachi, Yoko; Stevens, Kenneth N.; Hillman, Robert E.

    2013-01-01

    Purpose Current electrolarynx (EL) devices produce a mechanical speech quality that has been largely attributed to the lack of natural fundamental frequency (F0) variation. In order to improve the quality of EL speech, in the present study the authors aimed to develop and evaluate an automatic F0 control scheme, in which F0 was modulated based on variations in the root-mean-square (RMS) amplitude of the EL speech signal. Method Recordings of declarative sentences produced by 2 male participants before and after total laryngectomy were used to develop procedures for calculating F0 contours for EL speech. Specifically, the positive linear relationship between F0 and RMS amplitude observed in pre-laryngectomy speech was used as the basis for generating an F0 contour based on the amplitude variation of EL speech. An analysis-by-synthesis approach was used to modify the F0 contour, and a perceptual experiment was conducted to examine its impact on the quality of the EL speech. Results The results of perceptual experiments showed that modulating the F0 of EL speech using a linear relationship between amplitude and frequency made it significantly more natural sounding than EL speech with constant F0. Conclusions The current study provides preliminary support for amplitude-based control of F0 in EL speech. PMID:19564438

  20. ADMINISTRATIVE GUIDE IN SPEECH CORRECTION.

    ERIC Educational Resources Information Center

    HEALEY, WILLIAM C.

    WRITTEN PRIMARILY FOR SCHOOL SUPERINTENDENTS, PRINCIPALS, SPEECH CLINICIANS, AND SUPERVISORS, THIS GUIDE OUTLINES THE MECHANICS OF ORGANIZING AND CONDUCTING SPEECH CORRECTION ACTIVITIES IN THE PUBLIC SCHOOLS. IT INCLUDES THE REQUIREMENTS FOR CERTIFICATION OF A SPEECH CLINICIAN IN MISSOURI AND DESCRIBES ESSENTIAL STEPS FOR THE DEVELOPMENT OF A…

  1. "Zero Tolerance" for Free Speech.

    ERIC Educational Resources Information Center

    Hils, Lynda

    2001-01-01

    Argues that school policies of "zero tolerance" of threatening speech may violate a student's First Amendment right to freedom of expression if speech is less than a "true threat." Suggests a two-step analysis to determine if student speech is a "true threat." (PKP)

  2. Abortion and compelled physician speech.

    PubMed

    Orentlicher, David

    2015-01-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading. PMID:25846035

  3. Abortion and compelled physician speech.

    PubMed

    Orentlicher, David

    2015-01-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading.

  4. Signed Soliloquy: Visible Private Speech

    ERIC Educational Resources Information Center

    Zimmermann, Kathrin; Brugger, Peter

    2013-01-01

    Talking to oneself can be silent (inner speech) or vocalized for others to hear (private speech, or soliloquy). We investigated these two types of self-communication in 28 deaf signers and 28 hearing adults. With a questionnaire specifically developed for this study, we established the visible analog of vocalized private speech in deaf signers.…

  5. Study of acoustic correlates associate with emotional speech

    NASA Astrophysics Data System (ADS)

    Yildirim, Serdar; Lee, Sungbok; Lee, Chul Min; Bulut, Murtaza; Busso, Carlos; Kazemzadeh, Ebrahim; Narayanan, Shrikanth

    2004-10-01

    This study investigates the acoustic characteristics of four different emotions expressed in speech. The aim is to obtain detailed acoustic knowledge on how a speech signal is modulated by changes from neutral to a certain emotional state. Such knowledge is necessary for automatic emotion recognition and classification and emotional speech synthesis. Speech data obtained from two semi-professional actresses are analyzed and compared. Each subject produces 211 sentences with four different emotions; neutral, sad, angry, happy. We analyze changes in temporal and acoustic parameters such as magnitude and variability of segmental duration, fundamental frequency and the first three formant frequencies as a function of emotion. Acoustic differences among the emotions are also explored with mutual information computation, multidimensional scaling and acoustic likelihood comparison with normal speech. Results indicate that speech associated with anger and happiness is characterized by longer duration, shorter interword silence, higher pitch and rms energy with wider ranges. Sadness is distinguished from other emotions by lower rms energy and longer interword silence. Interestingly, the difference in formant pattern between [happiness/anger] and [neutral/sadness] are better reflected in back vowels such as /a/(/father/) than in front vowels. Detailed results on intra- and interspeaker variability will be reported.

  6. Methods for real-time speech processing on Unix

    SciTech Connect

    Romberger, A.

    1982-01-01

    The author discusses computer programming done at the University of California, Berkeley, in support of research work in the area of speech analysis and synthesis. The purpose of this programming is to set up a system for doing real-time speech sampling using the Unix operating system. Two alternative approaches to real time work on Unix are discussed. The first approach is to do the real-time input/output on a secondary (satellite) machine that is not running Unix. The second approach is to do the real-time input/output on the main machine with the aid of special hardware.

  7. Free Speech Yearbook 1973.

    ERIC Educational Resources Information Center

    Barbour, Alton, Ed.

    The first article in this collection examines civil disobedience and the protections offered by the First Amendment. The second article discusses a study on antagonistic expressions in a free society. The third essay deals with attitudes toward free speech and treatment of the United States flag. There are two articles on media; the first examines…

  8. Black History Speech

    ERIC Educational Resources Information Center

    Noldon, Carl

    2007-01-01

    The author argues in this speech that one cannot expect students in the school system to know and understand the genius of Black history if the curriculum is Eurocentric, which is a residue of racism. He states that his comments are designed for the enlightenment of those who suffer from a school system that "hypocritically manipulates Black…

  9. Speech and Language Impairments

    MedlinePlus

    ... SLP) who can help you identify strategies for teaching and supporting this student, ways to adapt the ... ASHA | American Speech-Language-Hearing Association Information in Spanish | Información en español. 1.800.638.8255 | actioncenter@ ...

  10. Expectations and speech intelligibility.

    PubMed

    Babel, Molly; Russell, Jamie

    2015-05-01

    Socio-indexical cues and paralinguistic information are often beneficial to speech processing as this information assists listeners in parsing the speech stream. Associations that particular populations speak in a certain speech style can, however, make it such that socio-indexical cues have a cost. In this study, native speakers of Canadian English who identify as Chinese Canadian and White Canadian read sentences that were presented to listeners in noise. Half of the sentences were presented with a visual-prime in the form of a photo of the speaker and half were presented in control trials with fixation crosses. Sentences produced by Chinese Canadians showed an intelligibility cost in the face-prime condition, whereas sentences produced by White Canadians did not. In an accentedness rating task, listeners rated White Canadians as less accented in the face-prime trials, but Chinese Canadians showed no such change in perceived accentedness. These results suggest a misalignment between an expected and an observed speech signal for the face-prime trials, which indicates that social information about a speaker can trigger linguistic associations that come with processing benefits and costs.

  11. Mandarin Visual Speech Information

    ERIC Educational Resources Information Center

    Chen, Trevor H.

    2010-01-01

    While the auditory-only aspects of Mandarin speech are heavily-researched and well-known in the field, this dissertation addresses its lesser-known aspects: The visual and audio-visual perception of Mandarin segmental information and lexical-tone information. Chapter II of this dissertation focuses on the audiovisual perception of Mandarin…

  12. Perceptual Learning in Speech

    ERIC Educational Resources Information Center

    Norris, Dennis; McQueen, James M.; Cutler, Anne

    2003-01-01

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g.,…

  13. Microprocessor for speech recognition

    SciTech Connect

    Ishizuka, H.; Watari, M.; Sakoe, H.; Chiba, S.; Iwata, T.; Matsuki, T.; Kawakami, Y.

    1983-01-01

    A new single-chip microprocessor for speech recognition has been developed utilizing multi-processor architecture and pipelined structure. By DP-matching algorithm, the processor recognizes up to 340 isolated words or 40 connected words in realtime. 6 references.

  14. From the Speech Files

    ERIC Educational Resources Information Center

    Can Vocat J, 1970

    1970-01-01

    In a speech, Looking Ahead in Vocational Education", to a group of Hamilton educators, D.O. Davis, Vice-President, Engineering, Dominion Foundries and Steel Limited, Hamilton, Ontario spoke of the challenge of change and what educators and industry must do to help the future of vocational education. (Editor)

  15. Speech to schoolchildren

    NASA Astrophysics Data System (ADS)

    Angell, C. Austen

    2013-02-01

    Prof. C. A. Angell from Arizona State University read the following short and simple speech, saying the sentences in Italics in the best Japanese he could manage (after earnest coaching from a Japanese colleague). The rest was translated on the bus ride, and then spoken, as I spoke, by Ms. Yukako Endo- to whom the author is very grateful.

  16. Measuring Speech Communication Skills.

    ERIC Educational Resources Information Center

    Carpenter, Edwin C.

    Improving the quality of undergraduate speech communication education depends to a large extent on effectively measuring student achievement in college level communication skills. While formal tests are not as well developed for speaking skills as for other areas of the curriculum, they are available. The two used most frequently are the…

  17. Speech intelligibility in hospitals.

    PubMed

    Ryherd, Erica E; Moeller, Michael; Hsu, Timothy

    2013-07-01

    Effective communication between staff members is key to patient safety in hospitals. A variety of patient care activities including admittance, evaluation, and treatment rely on oral communication. Surprisingly, published information on speech intelligibility in hospitals is extremely limited. In this study, speech intelligibility measurements and occupant evaluations were conducted in 20 units of five different U.S. hospitals. A variety of unit types and locations were studied. Results show that overall, no unit had "good" intelligibility based on the speech intelligibility index (SII > 0.75) and several locations found to have "poor" intelligibility (SII < 0.45). Further, occupied spaces were found to have 10%-15% lower SII than unoccupied spaces on average. Additionally, staff perception of communication problems at nurse stations was significantly correlated with SII ratings. In a targeted second phase, a unit treated with sound absorption had higher SII ratings for a larger percentage of time as compared to an identical untreated unit. Taken as a whole, the study provides an extensive baseline evaluation of speech intelligibility across a variety of hospitals and unit types, offers some evidence of the positive impact of absorption on intelligibility, and identifies areas for future research.

  18. Expectations and speech intelligibility.

    PubMed

    Babel, Molly; Russell, Jamie

    2015-05-01

    Socio-indexical cues and paralinguistic information are often beneficial to speech processing as this information assists listeners in parsing the speech stream. Associations that particular populations speak in a certain speech style can, however, make it such that socio-indexical cues have a cost. In this study, native speakers of Canadian English who identify as Chinese Canadian and White Canadian read sentences that were presented to listeners in noise. Half of the sentences were presented with a visual-prime in the form of a photo of the speaker and half were presented in control trials with fixation crosses. Sentences produced by Chinese Canadians showed an intelligibility cost in the face-prime condition, whereas sentences produced by White Canadians did not. In an accentedness rating task, listeners rated White Canadians as less accented in the face-prime trials, but Chinese Canadians showed no such change in perceived accentedness. These results suggest a misalignment between an expected and an observed speech signal for the face-prime trials, which indicates that social information about a speaker can trigger linguistic associations that come with processing benefits and costs. PMID:25994710

  19. Unique Regulatory Properties of Heterotetrameric Inositol 1,4,5-Trisphosphate Receptors Revealed by Studying Concatenated Receptor Constructs.

    PubMed

    Chandrasekhar, Rahul; Alzayady, Kamil J; Wagner, Larry E; Yule, David I

    2016-03-01

    The ability of inositol 1,4,5-trisphosphate receptors (IP3R) to precisely initiate and generate a diverse variety of intracellular Ca(2+) signals is in part mediated by the differential regulation of the three subtypes (R1, R2, and R3) by key functional modulators (IP3, Ca(2+), and ATP). However, the contribution of IP3R heterotetramerization to Ca(2+) signal diversity has largely been unexplored. In this report, we provide the first definitive biochemical evidence of endogenous heterotetramer formation. Additionally, we examine the contribution of individual subtypes within defined concatenated heterotetramers to the shaping of Ca(2+) signals. Under conditions where key regulators of IP3R function are optimal for Ca(2+) release, we demonstrate that individual monomers within heteromeric IP3Rs contributed equally toward generating a distinct 'blended' sensitivity to IP3 that is likely dictated by the unique IP3 binding affinity of the heteromers. However, under suboptimal conditions where [ATP] were varied, we found that one subtype dictated the ATP regulatory properties of heteromers. We show that R2 monomers within a heterotetramer were both necessary and sufficient to dictate the ATP regulatory properties. Finally, the ATP-binding site B in R2 critical for ATP regulation was mutated and rendered non-functional to address questions relating to the stoichiometry of IP3R regulation. Two intact R2 monomers were sufficient to maintain ATP regulation in R2 homotetramers. In summary, we demonstrate that heterotetrameric IP3R do not necessarily behave as the sum of the constituent subunits, and these properties likely extend the versatility of IP3-induced Ca(2+) signaling in cells expressing multiple IP3R isoforms.

  20. Evaluating syntactic constraints to speech recognition in a fighter aircraft environment

    NASA Astrophysics Data System (ADS)

    Stockton, D. B.

    1984-12-01

    A flexible software system has been developed to test the effects of adding syntactic knowledge to an isolated speech phoneme-based word recognizer. Words from seventy-word fighter plane vocabulary, spoken by five pilots at four different levels of background noise, are automatically concatenated into commands randomly chosen from a set of over seven trillion. These commands are then recognized using an existing word recognizer together with grammars of differing specificity. Results are compiled automatically. The system is flexible in that system components such as the command generator, parser, grammar, or word recognizer can be interchanged with very little software modification. Preliminary testing demonstrated that, although the modified word recognizer exhibited very poor performance, the use of more specific grammars enhanced recognition accuracy, sometimes drastically.

  1. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    ERIC Educational Resources Information Center

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  2. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2002-01-01

    Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.

  3. Toward the ultimate synthesis/recognition system.

    PubMed Central

    Furui, S

    1995-01-01

    This paper predicts speech synthesis, speech recognition, and speaker recognition technology for the year 2001, and it describes the most important research problems to be solved in order to arrive at these ultimate synthesis and recognition systems. The problems for speech synthesis include natural and intelligible voice production, prosody control based on meaning, capability of controlling synthesized voice quality and choosing individual speaking style, multilingual and multidialectal synthesis, choice of application-oriented speaking styles, capability of adding emotion, and synthesis from concepts. The problems for speech recognition include robust recognition against speech variations, adaptation/normalization to variations due to environmental conditions and speakers, automatic knowledge acquisition for acoustic and linguistic modeling, spontaneous speech recognition, naturalness and ease of human-machine interaction, and recognition of emotion. The problems for speaker recognition are similar to those for speech recognition. The research topics related to all these techniques include the use of articulatory and perceptual constraints and evaluation methods for measuring the quality of technology and systems. Images Fig. 3 PMID:7479723

  4. Seeking a reading machine for the blind and discovering the speech code.

    PubMed

    Shankweiler, Donald; Fowler, Carol A

    2015-02-01

    A machine that can read printed material to the blind became a priority at the end of World War II with the appointment of a U.S. Government committee to instigate research on sensory aids to improve the lot of blinded veterans. The committee chose Haskins Laboratories to lead a multisite research program. Initially, Haskins researchers overestimated the capacities of users to learn an acoustic code based on the letters of a text, resulting in unsuitable designs. Progress was slow because the researchers clung to a mistaken view that speech is a sound alphabet and because of persisting gaps in man-machine technology. The tortuous route to a practical reading machine transformed the scientific understanding of speech perception and reading at Haskins Labs and elsewhere, leading to novel lines of basic research and new technologies. Research at Haskins Laboratories made valuable contributions in clarifying the physical basis of speech. Researchers recognized that coarticulatory overlap eliminated the possibility of alphabet-like discrete acoustic segments in speech. This work advanced the study of speech perception and contributed to our understanding of the relation of speech perception to production. Basic findings on speech enabled the development of speech synthesis, part science and part technology, essential for development of a reading machine, which has found many applications. Findings on the nature of speech further stimulated a new understanding of word recognition in reading across languages and scripts and contributed to our understanding of reading development and reading disabilities. PMID:25528275

  5. Influence of mothers' slower speech on their children's speech rate.

    PubMed

    Guitar, B; Marchinkoski, L

    2001-08-01

    This study investigated the effects on children's speech rate when their mothers talked more slowly. Six mothers and their normally speaking 3-year-olds (3 girls and 3 boys) were studied using single-subject A-B-A-B designs. Conversational speech rates of mothers were reduced by approximately half in the experimental (B) conditions. Five of the six children appeared to reduce their speech rates when their mothers spoke more slowly. This was confirmed by paired t tests (p < .05) that showed significant decreases in the 5 children's speech rate over the two B conditions. These findings suggest that when mothers substantially decrease their speech rates in a controlled situation, their children also decrease their speech rates. Clinical implications are discussed.

  6. A Java speech implementation of the Mini Mental Status Exam.

    PubMed

    Wang, S S; Starren, J

    1999-01-01

    The Folstein Mini Mental Status Exam (MMSE) is a simple, widely used, verbally administered test to assess cognitive function. The Java Speech Application Programming Interface (JSAPI) is a new, cross-platform interface for both speech recognition and speech synthesis in the Java environment. To evaluate the suitability of the JSAPI for interactive, patient interview applications, a JSAPI implementation of the MMSE was developed. The MMSE contains questions that vary in structure in order to assess different cognitive functions. This question variability provided an excellent test-bed to evaluate the strengths and weaknesses of JSAPI. The application is based on Java platform 2 and a JSAPI interface to the IBM ViaVoice recognition engine. Design and implementations issues are discussed. Preliminary usability studies demonstrate that an automated MMSE maybe a useful screening tool for cognitive disorders and changes.

  7. Selected military applications of automatic speech recognition technology

    NASA Astrophysics Data System (ADS)

    Woodard, J. P.; Cupples, E. J.

    1983-12-01

    Voice input for command and control, message sorting by voice, and advanced low-bit-rate voice communications systems are discussed in relation to automatic speech recognition (ASR) technology applications. Several research efforts in the areas of ASR and speech synthesis technology, which forms the basis for voice input/output systems, are described for military airborne and ground-based applications. The success of the speech enhancement unit in noise reduction for both human and machine listeners is documented and shown to be indispensible for the ASR used in message sorting, and for many command-and-control applications in harsh environments. Two experimental low-bit-rate systems, one phonetically based, the other based on vector, or block quantization, are compared. ASR technology is shown to have the potential to increase the effectiveness of man-machine communications in a variety of military applications, such as equipment maintenance and computers. Further research is necessary to solve some basic problems.

  8. Hate Speech: Power in the Marketplace.

    ERIC Educational Resources Information Center

    Harrison, Jack B.

    1994-01-01

    A discussion of hate speech and freedom of speech on college campuses examines the difference between hate speech from normal, objectionable interpersonal comments and looks at Supreme Court decisions on the limits of student free speech. Two cases specifically concerning regulation of hate speech on campus are considered: Chaplinsky v. New…

  9. Headphone localization of speech

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1993-01-01

    Three-dimensional acoustic display systems have recently been developed that synthesize virtual sound sources over headphones based on filtering by head-related transfer functions (HRTFs), the direction-dependent spectral changes caused primarily by the pinnae. In this study, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with nonindividualized HRTFs. About half of the subjects 'pulled' their judgments toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgments; 15 to 46 percent of stimuli were heard inside the head, with the shortest estimates near the median plane. The results suggest that most listeners can obtain useful azimuth information from speech stimuli filtered by nonindividualized HRTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.

  10. The feasibility of miniaturizing the versatile portable speech prosthesis: A market survey of commercial products

    NASA Technical Reports Server (NTRS)

    Walklet, T.

    1981-01-01

    The feasibility of a miniature versatile portable speech prosthesis (VPSP) was analyzed and information on its potential users and on other similar devices was collected. The VPSP is a device that incorporates speech synthesis technology. The objective is to provide sufficient information to decide whether there is valuable technology to contribute to the miniaturization of the VPSP. The needs of potential users are identified, the development status of technologies similar or related to those used in the VPSP are evaluated. The VPSP, a computer based speech synthesis system fits on a wheelchair. The purpose was to produce a device that provides communication assistance in educational, vocational, and social situations to speech impaired individuals. It is expected that the VPSP can be a valuable aid for persons who are also motor impaired, which explains the placement of the system on a wheelchair.

  11. Neurophysiology of speech differences in childhood apraxia of speech.

    PubMed

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes. PMID:25090016

  12. Neurophysiology of Speech Differences in Childhood Apraxia of Speech

    PubMed Central

    Preston, Jonathan L.; Molfese, Peter J.; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes. PMID:25090016

  13. [Improving speech comprehension using a new cochlear implant speech processor].

    PubMed

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise.In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  14. Neurophysiology of speech differences in childhood apraxia of speech.

    PubMed

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes.

  15. Speech audiometry by a speech synthesizer. I. A preliminary report.

    PubMed

    Rahko, T; Karjalainen, M A; Laine, U K; Lavonen, S

    1979-01-01

    A preliminary report on speech test results with a portable, text-to-speech synthesizer is presented. The differentiation scores achieved at speed 80 words/min vary. So far the best mean differentiation scores in normal material are 75%. The increase of the presentation level improves the differentiation score, as does the decrease of word speed and training. The future and present uses of this system are discussed. These include: devices for the handicapped, e.g. to produce speech for the mute, man-machine communication through speech in industry control, data processing systems and uses in audiological diagnostics. The study is continued. PMID:435169

  16. Speech rhythm: a metaphor?

    PubMed Central

    Nolan, Francis; Jeon, Hae-Sung

    2014-01-01

    Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep ‘prominence gradient’, i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a ‘stress-timed’ language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow ‘syntagmatic contrast’ between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence of alternation. We conclude that speech is not incontestibly rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms. PMID:25385774

  17. Speech rhythm: a metaphor?

    PubMed

    Nolan, Francis; Jeon, Hae-Sung

    2014-12-19

    Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep 'prominence gradient', i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a 'stress-timed' language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow 'syntagmatic contrast' between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence of alternation. We conclude that speech is not incontestibly rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms.

  18. Applications for Subvocal Speech

    NASA Technical Reports Server (NTRS)

    Jorgensen, Charles; Betts, Bradley

    2007-01-01

    A research and development effort now underway is directed toward the use of subvocal speech for communication in settings in which (1) acoustic noise could interfere excessively with ordinary vocal communication and/or (2) acoustic silence or secrecy of communication is required. By "subvocal speech" is meant sub-audible electromyographic (EMG) signals, associated with speech, that are acquired from the surface of the larynx and lingual areas of the throat. Topics addressed in this effort include recognition of the sub-vocal EMG signals that represent specific original words or phrases; transformation (including encoding and/or enciphering) of the signals into forms that are less vulnerable to distortion, degradation, and/or interception; and reconstruction of the original words or phrases at the receiving end of a communication link. Potential applications include ordinary verbal communications among hazardous- material-cleanup workers in protective suits, workers in noisy environments, divers, and firefighters, and secret communications among law-enforcement officers and military personnel in combat and other confrontational situations.

  19. [A method of speech donorship and speech discourse for the speech restoration in aphasia].

    PubMed

    Rudnev, V A; Shteĭnerdt, V V

    2012-01-01

    An objective of the study was to evaluate the effectiveness of speech restoration in aphasia in outpatients using audiovisual samples of the speech of first-degree relatives of the patient with the following transformation of the restoration into the feedback with the own audiovisual material (a method of speech donorship and speech discourse). We studied 53 outpatients with different severity of aphasia (28 patients with moderate severity, 12 patients with mild severity and 13 patients with marked severity) that was pathogenetically associated with stroke or brain injury. We used the following algorithm of speech restoration: 1) the work in the regime of biological feedback with the audiovisual sample of the speech of the close relative (7th-14th days); 2) the DVD recording of the own speech of the patient and the work with the own audiovisual sample (14th-21st days). Sessions were carried out twice a day. After the rehabilitation, there was a significant improvement (p<0,001) in the speech function including the decrease in the frequency of literal and verbal paraphasias, literal perseverations as well as the improvement of speech initiation and nonverbal speech component (intonation and kinesthetic appearances). The results of the restoration were worse in patients with severe aphasia than in those with moderate and mild aphasia, for the latter patients the method was very effective.

  20. Two Methods of Mechanical Noise Reduction of Recorded Speech During Phonation in an MRI device

    NASA Astrophysics Data System (ADS)

    Přibil, J.; Horáček, J.; Horák, P.

    2011-01-01

    The paper presents two methods of noise reduction of speech signal recorded in an MRI device during phonation for the human vocal tract modelling. The applied approach of noise speech signal cleaning is based on cepstral speech analysis and synthesis because the noise is mainly produced by gradient coils, has a mechanical character, and can be processed in spectral domain. Our first noise reduction method is using real cepstrum limitation and clipping the "peaks" corresponding to the harmonic frequencies of mechanical noise. The second method is coming out from substation of the short-time spectra of two signals recorded withal: the first includes speech and noise, and the second consists of noise only. The resulting speech quality was compared by spectrogram and mean periodogram methods.

  1. Speech perception at the interface of neurobiology and linguistics.

    PubMed

    Poeppel, David; Idsardi, William J; van Wassenhove, Virginie

    2008-03-12

    Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate discrete representations that make contact with the lexical representations stored in long-term memory as output. Because the perceptual objects that are recognized by the speech perception enter into subsequent linguistic computation, the format that is used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20-80 ms, approx. 150-300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an 'analysis-by-synthesis' approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.

  2. Hemispheric Asymmetries in Speech Perception: Sense, Nonsense and Modulations

    PubMed Central

    Rosen, Stuart; Wise, Richard J. S.; Chadha, Shabneet; Conway, Eleanor-Jayne; Scott, Sophie K.

    2011-01-01

    Background The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialization for specific acoustic features in speech, particularly regarding ‘rapid temporal processing’. Methodology A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds. Conclusions Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features. PMID:21980349

  3. A Cool Approach to Probing Speech Cortex.

    PubMed

    Flinker, Adeen; Knight, Robert T

    2016-03-16

    In this issue of Neuron, Long et al. (2016) employ a novel technique of intraoperative cortical cooling in humans during speech production. They demonstrate that cooling Broca's area interferes with speech timing but not speech quality. PMID:26985719

  4. A Cool Approach to Probing Speech Cortex

    PubMed Central

    Flinker, Adeen; Knight, Robert T.

    2016-01-01

    In this issue of Neuron, Long et al. (2016) employ a novel technique of intraoperative cortical cooling in humans during speech production. They demonstrate that cooling Broca’s area interferes with speech timing but not speech quality. PMID:26985719

  5. Speech Recognition: How Do We Teach It?

    ERIC Educational Resources Information Center

    Barksdale, Karl

    2002-01-01

    States that growing use of speech recognition software has made voice writing an essential computer skill. Describes how to present the topic, develop basic speech recognition skills, and teach speech recognition outlining, writing, proofreading, and editing. (Contains 14 references.) (SK)

  6. General American Speech and Phonic Symbols.

    ERIC Educational Resources Information Center

    Calvert, Donald R.

    1982-01-01

    General American Symbols, speech and phonic symbols adapted from the Northampton symbols, are presented as a simplified system for teaching reading and speech to deaf children. Ways to use symbols for indicating features of speech production are suggested. (Author)

  7. Huntington's Disease: Speech, Language and Swallowing

    MedlinePlus

    ... the course of the disease. What do speech-language pathologists do when working with people with Huntington's ... of Neurological Disorders and Stroke Typical Speech and Language Development Learning More Than One Language Adult Speech ...

  8. Activities to Encourage Speech and Language Development

    MedlinePlus

    ... and Swallowing / Development Activities to Encourage Speech and Language Development Birth to 2 Years Encourage your baby ... or light) of the packages. Typical Speech and Language Development Learning More Than One Language Adult Speech ...

  9. What Is Language? What Is Speech?

    MedlinePlus

    ... Public / Speech, Language and Swallowing / Development What Is Language? What Is Speech? [ en Español ] Kelly's 4-year-old son, Tommy, has speech and language problems. Friends and family have a hard time ...

  10. Speech systems research at Texas Instruments

    NASA Technical Reports Server (NTRS)

    Doddington, George R.

    1977-01-01

    An assessment of automatic speech processing technology is presented. Fundamental problems in the development and the deployment of automatic speech processing systems are defined and a technology forecast for speech systems is presented.

  11. Multilocus phylogeny of the lichen-forming fungal genus Melanohalea (Parmeliaceae, Ascomycota): insights on diversity, distributions, and a comparison of species tree and concatenated topologies.

    PubMed

    Leavitt, Steven D; Esslinger, Theodore L; Spribille, Toby; Divakar, Pradeep K; Thorsten Lumbsch, H

    2013-01-01

    Accurate species circumscriptions are central for many biological disciplines and have critical implications for ecological and conservation studies. An increasing body of evidence suggests that in some cases traditional morphology-based taxonomy have underestimated diversity in lichen-forming fungi. Therefore, genetic data play an increasing role for recognizing distinct lineages of lichenized fungi that it would otherwise be improbable to recognize using classical phenotypic characters. Melanohalea (Parmeliaceae, Ascomycota) is one of the most widespread and common lichen-forming genera in the northern Hemisphere. In this study, we assess traditional phenotype-based species boundaries, identify previously unrecognized species-level lineages and discuss biogeographic patterns in Melanohalea. We sampled 487 individuals worldwide, representing 18 of the 22 described Melanohalea species, and generated DNA sequence data from mitochondrial, nuclear ribosomal, and protein-coding markers. Diversity previously hidden within traditional species was identified using a genealogical concordance approach. We inferred relationships among sampled species-level lineages within Melanohalea using both concatenated phylogenetic methods and a coalescent-based multilocus species tree approach. Although lineages identified from genetic data are largely congruent with traditional taxonomy, we found strong evidence supporting the presence of previously unrecognized species in six of the 18 sampled taxa. Strong nodal support and overall congruence among independent loci suggest long-term reproductive isolation among most species-level lineages. While some Melanohalea taxa are truly widespread, a limited number of clades appear to have much more restricted distributional ranges. In most instances the concatenated gene tree and multilocus species tree approaches provided similar estimates of relationships. However, nodal support was generally higher in the phylogeny estimated from

  12. Multilocus phylogeny of the lichen-forming fungal genus Melanohalea (Parmeliaceae, Ascomycota): insights on diversity, distributions, and a comparison of species tree and concatenated topologies.

    PubMed

    Leavitt, Steven D; Esslinger, Theodore L; Spribille, Toby; Divakar, Pradeep K; Thorsten Lumbsch, H

    2013-01-01

    Accurate species circumscriptions are central for many biological disciplines and have critical implications for ecological and conservation studies. An increasing body of evidence suggests that in some cases traditional morphology-based taxonomy have underestimated diversity in lichen-forming fungi. Therefore, genetic data play an increasing role for recognizing distinct lineages of lichenized fungi that it would otherwise be improbable to recognize using classical phenotypic characters. Melanohalea (Parmeliaceae, Ascomycota) is one of the most widespread and common lichen-forming genera in the northern Hemisphere. In this study, we assess traditional phenotype-based species boundaries, identify previously unrecognized species-level lineages and discuss biogeographic patterns in Melanohalea. We sampled 487 individuals worldwide, representing 18 of the 22 described Melanohalea species, and generated DNA sequence data from mitochondrial, nuclear ribosomal, and protein-coding markers. Diversity previously hidden within traditional species was identified using a genealogical concordance approach. We inferred relationships among sampled species-level lineages within Melanohalea using both concatenated phylogenetic methods and a coalescent-based multilocus species tree approach. Although lineages identified from genetic data are largely congruent with traditional taxonomy, we found strong evidence supporting the presence of previously unrecognized species in six of the 18 sampled taxa. Strong nodal support and overall congruence among independent loci suggest long-term reproductive isolation among most species-level lineages. While some Melanohalea taxa are truly widespread, a limited number of clades appear to have much more restricted distributional ranges. In most instances the concatenated gene tree and multilocus species tree approaches provided similar estimates of relationships. However, nodal support was generally higher in the phylogeny estimated from

  13. A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM

    NASA Astrophysics Data System (ADS)

    Nose, Takashi; Kobayashi, Takao

    In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. We derive an algorithm for estimating explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in acoustic features of speech, based on a maximum likelihood criterion. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data, simulated emotional speech and spontaneous speech with different speaking styles. It is found that the estimated values have correlation with human perception.

  14. Speech-in-Speech Recognition: A Training Study

    ERIC Educational Resources Information Center

    Van Engen, Kristin J.

    2012-01-01

    This study aims to identify aspects of speech-in-noise recognition that are susceptible to training, focusing on whether listeners can learn to adapt to target talkers ("tune in") and learn to better cope with various maskers ("tune out") after short-term training. Listeners received training on English sentence recognition in speech-shaped noise…

  15. Enhancing Peer Feedback and Speech Preparation: The Speech Video Activity

    ERIC Educational Resources Information Center

    Opt, Susan

    2012-01-01

    In the typical public speaking course, instructors or assistants videotape or digitally record at least one of the term's speeches in class or lab to offer students additional presentation feedback. Students often watch and self-critique their speeches on their own. Peers often give only written feedback on classroom presentations or completed…

  16. Auditory detection of non-speech and speech stimuli in noise: Native speech advantage.

    PubMed

    Huo, Shuting; Tao, Sha; Wang, Wenjing; Li, Mingshuang; Dong, Qi; Liu, Chang

    2016-05-01

    Detection thresholds of Chinese vowels, Korean vowels, and a complex tone, with harmonic and noise carriers were measured in noise for Mandarin Chinese-native listeners. The harmonic index was calculated as the difference between detection thresholds of the stimuli with harmonic carriers and those with noise carriers. The harmonic index for Chinese vowels was significantly greater than that for Korean vowels and the complex tone. Moreover, native speech sounds were rated significantly more native-like than non-native speech and non-speech sounds. The results indicate that native speech has an advantage over other sounds in simple auditory tasks like sound detection. PMID:27250202

  17. Statistical assessment of speech system performance

    NASA Technical Reports Server (NTRS)

    Moshier, Stephen L.

    1977-01-01

    Methods for the normalization of performance tests results of speech recognition systems are presented. Technological accomplishments in speech recognition systems, as well as planned research activities are described.

  18. Overview of requirements and networks for voice communications and speech processing

    NASA Astrophysics Data System (ADS)

    Ince, A. Nejat

    1990-05-01

    The use of voice for military and civil communications are discussed. The military operational requirements are outlined in relation to air operations, including the effects of propagational factors and electronic warfare. Structures of the existing NATO communications network and the evolving Integrated Service Digital Network (ISDN) are reviewed to show how they meet the requirements. It is concluded that speech coding at low-bit rates is a growing need for transmitting speech messages with a high level of security and reliability over low data-rate channels and for memory-efficient systems for voice storage, voice response, and voice mail. Furthermore, it is pointed out that the low-bit rate voice coding can ease the transition to shared channels for voice and data and can readily adopt voice messages for packet switching. The speech processing techniques and systems are then outlined as an introduction to the lectures of this series in terms of: the character of the speech signal, its generation and perception; speech coding which is mainly concerned with man-to-man voice communication; speech synthesis which deals with machine-to-man communication; speech recognition which is related to man-to-machine communication; and quality assessment of speech system and standards.

  19. SPEECH--MAN'S NATURAL COMMUNICATION.

    ERIC Educational Resources Information Center

    DUDLEY, HOMER; AND OTHERS

    SESSION 63 OF THE 1967 INSTITUTE OF ELECTRICAL AND ELECTRONIC ENGINEERS INTERNATIONAL CONVENTION BROUGHT TOGETHER SEVEN DISTINGUISHED MEN WORKING IN FIELDS RELEVANT TO LANGUAGE. THEIR TOPICS INCLUDED ORIGIN AND EVOLUTION OF SPEECH AND LANGUAGE, LANGUAGE AND CULTURE, MAN'S PHYSIOLOGICAL MECHANISMS FOR SPEECH, LINGUISTICS, AND TECHNOLOGY AND…

  20. Taking a Stand for Speech.

    ERIC Educational Resources Information Center

    Moore, Wayne D.

    1995-01-01

    Asserts that freedom of speech issues were among the first major confrontations in U.S. constitutional law. Maintains that lessons from the controversies surrounding the Sedition Act of 1798 have continuing practical relevance. Describes and discusses the significance of freedom of speech to the U.S. political system. (CFR)

  1. SILENT SPEECH DURING SILENT READING.

    ERIC Educational Resources Information Center

    MCGUIGAN, FRANK J.

    EFFORTS WERE MADE IN THIS STUDY TO (1) RELATE THE AMOUNT OF SILENT SPEECH DURING SILENT READING TO LEVEL OF READING PROFICIENCY, INTELLIGENCE, AGE, AND GRADE PLACEMENT OF SUBJECTS, AND (2) DETERMINE WHETHER THE AMOUNT OF SILENT SPEECH DURING SILENT READING IS AFFECTED BY THE LEVEL OF DIFFICULTY OF PROSE READ AND BY THE READING OF A FOREIGN…

  2. Speech Restoration: An Interactive Process

    ERIC Educational Resources Information Center

    Grataloup, Claire; Hoen, Michael; Veuillet, Evelyne; Collet, Lionel; Pellegrino, Francois; Meunier, Fanny

    2009-01-01

    Purpose: This study investigates the ability to understand degraded speech signals and explores the correlation between this capacity and the functional characteristics of the peripheral auditory system. Method: The authors evaluated the capability of 50 normal-hearing native French speakers to restore time-reversed speech. The task required them…

  3. Audiovisual Speech Recalibration in Children

    ERIC Educational Resources Information Center

    van Linden, Sabine; Vroomen, Jean

    2008-01-01

    In order to examine whether children adjust their phonetic speech categories, children of two age groups, five-year-olds and eight-year-olds, were exposed to a video of a face saying /aba/ or /ada/ accompanied by an auditory ambiguous speech sound halfway between /b/ and /d/. The effect of exposure to these audiovisual stimuli was measured on…

  4. Speech Prosody in Cerebellar Ataxia

    ERIC Educational Resources Information Center

    Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

  5. Perceptual Aspects of Cluttered Speech

    ERIC Educational Resources Information Center

    St. Louis, Kenneth O.; Myers, Florence L.; Faragasso, Kristine; Townsend, Paula S.; Gallaher, Amanda J.

    2004-01-01

    The purpose of this descriptive investigation was to explore perceptual judgments of speech naturalness, compared to judgments of articulation, language, disfluency, and speaking rate, in the speech of two youths who differed in cluttering severity. Two groups of listeners, 48 from New York and 48 from West Virginia, judged 93 speaking samples on…

  6. Speech Analysis Systems: An Evaluation.

    ERIC Educational Resources Information Center

    Read, Charles; And Others

    1992-01-01

    Performance characteristics are reviewed for seven computerized systems marketed for acoustic speech analysis: CSpeech, CSRE, ILS-PC, Kay Elemetrics model 550 Sona-Graph, MacSpeech Lab II, MSL, and Signalyze. Characteristics reviewed include system components, basic capabilities, documentation, user interface, data formats and journaling, and…

  7. Interpersonal Orientation and Speech Behavior.

    ERIC Educational Resources Information Center

    Street, Richard L., Jr.; Murphy, Thomas L.

    1987-01-01

    Indicates that (1) males with low interpersonal orientation (IO) were least vocally active and expressive and least consistent in their speech performances, and (2) high IO males and low IO females tended to demonstrate greater speech convergence than either low IO males or high IO females. (JD)

  8. Interactions between distal speech rate, linguistic knowledge, and speech environment.

    PubMed

    Morrill, Tuuli; Baese-Berk, Melissa; Heffner, Christopher; Dilley, Laura

    2015-10-01

    During lexical access, listeners use both signal-based and knowledge-based cues, and information from the linguistic context can affect the perception of acoustic speech information. Recent findings suggest that the various cues used in lexical access are implemented with flexibility and may be affected by information from the larger speech context. We conducted 2 experiments to examine effects of a signal-based cue (distal speech rate) and a knowledge-based cue (linguistic structure) on lexical perception. In Experiment 1, we manipulated distal speech rate in utterances where an acoustically ambiguous critical word was either obligatory for the utterance to be syntactically well formed (e.g., Conner knew that bread and butter (are) both in the pantry) or optional (e.g., Don must see the harbor (or) boats). In Experiment 2, we examined identical target utterances as in Experiment 1 but changed the distribution of linguistic structures in the fillers. The results of the 2 experiments demonstrate that speech rate and linguistic knowledge about critical word obligatoriness can both influence speech perception. In addition, it is possible to alter the strength of a signal-based cue by changing information in the speech environment. These results provide support for models of word segmentation that include flexible weighting of signal-based and knowledge-based cues.

  9. Hate Speech or Free Speech: Can Broad Campus Speech Regulations Survive Current Judicial Reasoning?

    ERIC Educational Resources Information Center

    Heiser, Gregory M.; Rossow, Lawrence F.

    1993-01-01

    Federal courts have found speech regulations overbroad in suits against the University of Michigan and the University of Wisconsin System. Attempts to assess the theoretical justification and probable fate of broad speech regulations that have not been explicitly rejected by the courts. Concludes that strong arguments for broader regulation will…

  10. Hate Speech/Free Speech: Using Feminist Perspectives To Foster On-Campus Dialogue.

    ERIC Educational Resources Information Center

    Cornwell, Nancy; Orbe, Mark P.; Warren, Kiesha

    1999-01-01

    Explores the complex issues inherent in the tension between hate speech and free speech, focusing on the phenomenon of hate speech on college campuses. Describes the challenges to hate speech made by critical race theorists and explains how a feminist critique can reorient the parameters of hate speech. (SLD)

  11. Is Birdsong More Like Speech or Music?

    PubMed

    Shannon, Robert V

    2016-04-01

    Music and speech share many acoustic cues but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. PMID:26944220

  12. ON THE NATURE OF SPEECH SCIENCE.

    ERIC Educational Resources Information Center

    PETERSON, GORDON E.

    IN THIS ARTICLE THE NATURE OF THE DISCIPLINE OF SPEECH SCIENCE IS CONSIDERED AND THE VARIOUS BASIC AND APPLIED AREAS OF THE DISCIPLINE ARE DISCUSSED. THE BASIC AREAS ENCOMPASS THE VARIOUS PROCESSES OF THE PHYSIOLOGY OF SPEECH PRODUCTION, THE ACOUSTICAL CHARACTERISTICS OF SPEECH, INCLUDING THE SPEECH WAVE TYPES AND THE INFORMATION-BEARING ACOUSTIC…

  13. Infant Perception of Atypical Speech Signals

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Gelfand, Hanna M.

    2013-01-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

  14. Preschool Children's Awareness of Private Speech

    ERIC Educational Resources Information Center

    Manfra, Louis; Winsler, Adam

    2006-01-01

    The present study explored: (a) preschool children's awareness of their own talking and private speech (speech directed to the self); (b) differences in age, speech use, language ability, and mentalizing abilities between children with awareness and those without; and (c) children's beliefs and attitudes about private speech. Fifty-one children…

  15. Linguistic Units and Speech Production Theory.

    ERIC Educational Resources Information Center

    MacNeilage, Peter F.

    This paper examines the validity of the concept of linguistic units in a theory of speech production. Substantiating data are drawn from the study of the speech production process itself. Secondarily, an attempt is made to reconcile the postulation of linguistic units in speech production theory with their apparent absence in the speech signal.…

  16. Multifractal nature of unvoiced speech signals

    SciTech Connect

    Adeyemi, O.A.; Hartt, K.; Boudreaux-Bartels, G.F.

    1996-06-01

    A refinement is made in the nonlinear dynamic modeling of speech signals. Previous research successfully characterized speech signals as chaotic. Here, we analyze fricative speech signals using multifractal measures to determine various fractal regimes present in their chaotic attractors. Results support the hypothesis that speech signals have multifractal measures. {copyright} {ital 1996 American Institute of Physics.}

  17. Phonetic Recalibration Only Occurs in Speech Mode

    ERIC Educational Resources Information Center

    Vroomen, Jean; Baart, Martijn

    2009-01-01

    Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…

  18. Automated Speech Rate Measurement in Dysarthria

    ERIC Educational Resources Information Center

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-01-01

    Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…

  19. Connected Speech Processes in Australian English.

    ERIC Educational Resources Information Center

    Ingram, J. C. L.

    1989-01-01

    Explores the role of Connected Speech Processes (CSP) in accounting for sociolinguistically significant dimensions of speech variation, and presents initial findings on the distribution of CSPs in the speech of Australian adolescents. The data were gathered as part of a wider survey of speech of Brisbane school children. (Contains 26 references.)…

  20. Speech Patterns and Racial Wage Inequality

    ERIC Educational Resources Information Center

    Grogger, Jeffrey

    2011-01-01

    Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the…

  1. Speech recovery device

    DOEpatents

    Frankle, Christen M.

    2004-04-20

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  2. Speech recovery device

    SciTech Connect

    Frankle, Christen M.

    2000-10-19

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  3. Applications of Hilbert Spectral Analysis for Speech and Sound Signals

    NASA Technical Reports Server (NTRS)

    Huang, Norden E.

    2003-01-01

    A new method for analyzing nonlinear and nonstationary data has been developed, and the natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition method with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same numbers of zero-crossing and extrema, and also having symmetric envelopes defined by the local maxima and minima respectively. The IMF also admits well-behaved Hilbert transform. This decomposition method is adaptive, and, therefore, highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identifications of imbedded structures. This method invention can be used to process all acoustic signals. Specifically, it can process the speech signals for Speech synthesis, Speaker identification and verification, Speech recognition, and Sound signal enhancement and filtering. Additionally, as the acoustical signals from machinery are essentially the way the machines are talking to us. Therefore, the acoustical signals, from the machines, either from sound through air or vibration on the machines, can tell us the operating conditions of the machines. Thus, we can use the acoustic signal to diagnosis the problems of machines.

  4. Turbo Processing for Speech Recognition.

    PubMed

    Moon, Todd K; Gunther, Jacob H; Broadus, Cortnie; Hou, Wendy; Nelson, Nils

    2014-01-01

    Speech recognition is a classic example of a human/machine interface, typifying many of the difficulties and opportunities of human/machine interaction. In this paper, speech recognition is used as an example of applying turbo processing principles to the general problem of human/machine interface. Speech recognizers frequently involve a model representing phonemic information at a local level, followed by a language model representing information at a nonlocal level. This structure is analogous to the local (e.g., equalizer) and nonlocal (e.g., error correction decoding) elements common in digital communications. Drawing from the analogy of turbo processing for digital communications, turbo speech processing iteratively feeds back the output of the language model to be used as prior probabilities for the phonemic model. This analogy is developed here, and the performance of this turbo model is characterized by using an artificial language model. Using turbo processing, the relative error rate improves significantly, especially in high-noise settings.

  5. Statistical Analysis of Spectral Properties and Prosodic Parameters of Emotional Speech

    NASA Astrophysics Data System (ADS)

    Přibil, J.; Přibilová, A.

    2009-01-01

    The paper addresses reflection of microintonation and spectral properties in male and female acted emotional speech. Microintonation component of speech melody is analyzed regarding its spectral and statistical parameters. According to psychological research of emotional speech, different emotions are accompanied by different spectral noise. We control its amount by spectral flatness according to which the high frequency noise is mixed in voiced frames during cepstral speech synthesis. Our experiments are aimed at statistical analysis of cepstral coefficient values and ranges of spectral flatness in three emotions (joy, sadness, anger), and a neutral state for comparison. Calculated histograms of spectral flatness distribution are visually compared and modelled by Gamma probability distribution. Histograms of cepstral coefficient distribution are evaluated and compared using skewness and kurtosis. Achieved statistical results show good correlation comparing male and female voices for all emotional states portrayed by several Czech and Slovak professional actors.

  6. A maximum likelihood approach to estimating articulator positions from speech acoustics

    SciTech Connect

    Hogden, J.

    1996-09-23

    This proposal presents an algorithm called maximum likelihood continuity mapping (MALCOM) which recovers the positions of the tongue, jaw, lips, and other speech articulators from measurements of the sound-pressure waveform of speech. MALCOM differs from other techniques for recovering articulator positions from speech in three critical respects: it does not require training on measured or modeled articulator positions, it does not rely on any particular model of sound propagation through the vocal tract, and it recovers a mapping from acoustics to articulator positions that is linearly, not topographically, related to the actual mapping from acoustics to articulation. The approach categorizes short-time windows of speech into a finite number of sound types, and assumes the probability of using any articulator position to produce a given sound type can be described by a parameterized probability density function. MALCOM then uses maximum likelihood estimation techniques to: (1) find the most likely smooth articulator path given a speech sample and a set of distribution functions (one distribution function for each sound type), and (2) change the parameters of the distribution functions to better account for the data. Using this technique improves the accuracy of articulator position estimates compared to continuity mapping -- the only other technique that learns the relationship between acoustics and articulation solely from acoustics. The technique has potential application to computer speech recognition, speech synthesis and coding, teaching the hearing impaired to speak, improving foreign language instruction, and teaching dyslexics to read. 34 refs., 7 figs.

  7. New Ideas for Speech Recognition and Related Technologies

    SciTech Connect

    Holzrichter, J F

    2002-06-17

    The ideas relating to the use of organ motion sensors for the purposes of speech recognition were first described by.the author in spring 1994. During the past year, a series of productive collaborations between the author, Tom McEwan and Larry Ng ensued and have lead to demonstrations, new sensor ideas, and algorithmic descriptions of a large number of speech recognition concepts. This document summarizes the basic concepts of recognizing speech once organ motions have been obtained. Micro power radars and their uses for the measurement of body organ motions, such as those of the heart and lungs, have been demonstrated by Tom McEwan over the past two years. McEwan and I conducted a series of experiments, using these instruments, on vocal organ motions beginning in late spring, during which we observed motions of vocal folds (i.e., cords), tongue, jaw, and related organs that are very useful for speech recognition and other purposes. These will be reviewed in a separate paper. Since late summer 1994, Lawrence Ng and I have worked to make many of the initial recognition ideas more rigorous and to investigate the applications of these new ideas to new speech recognition algorithms, to speech coding, and to speech synthesis. I introduce some of those ideas in section IV of this document, and we describe them more completely in the document following this one, UCRL-UR-120311. For the design and operation of micro-power radars and their application to body organ motions, the reader may contact Tom McEwan directly. The capability for using EM sensors (i.e., radar units) to measure body organ motions and positions has been available for decades. Impediments to their use appear to have been size, excessive power, lack of resolution, and lack of understanding of the value of organ motion measurements, especially as applied to speech related technologies. However, with the invention of very low power, portable systems as demonstrated by McEwan at LLNL researchers have begun

  8. Phylogeny of the cycads based on multiple single-copy nuclear genes: congruence of concatenated parsimony, likelihood and species tree inference methods

    PubMed Central

    Salas-Leiva, Dayana E.; Meerow, Alan W.; Calonje, Michael; Griffith, M. Patrick; Francisco-Ortega, Javier; Nakamura, Kyoko; Stevenson, Dennis W.; Lewis, Carl E.; Namoff, Sandra

    2013-01-01

    Background and aims Despite a recent new classification, a stable phylogeny for the cycads has been elusive, particularly regarding resolution of Bowenia, Stangeria and Dioon. In this study, five single-copy nuclear genes (SCNGs) are applied to the phylogeny of the order Cycadales. The specific aim is to evaluate several gene tree–species tree reconciliation approaches for developing an accurate phylogeny of the order, to contrast them with concatenated parsimony analysis and to resolve the erstwhile problematic phylogenetic position of these three genera. Methods DNA sequences of five SCNGs were obtained for 20 cycad species representing all ten genera of Cycadales. These were analysed with parsimony, maximum likelihood (ML) and three Bayesian methods of gene tree–species tree reconciliation, using Cycas as the outgroup. A calibrated date estimation was developed with Bayesian methods, and biogeographic analysis was also conducted. Key Results Concatenated parsimony, ML and three species tree inference methods resolve exactly the same tree topology with high support at most nodes. Dioon and Bowenia are the first and second branches of Cycadales after Cycas, respectively, followed by an encephalartoid clade (Macrozamia–Lepidozamia–Encephalartos), which is sister to a zamioid clade, of which Ceratozamia is the first branch, and in which Stangeria is sister to Microcycas and Zamia. Conclusions A single, well-supported phylogenetic hypothesis of the generic relationships of the Cycadales is presented. However, massive extinction events inferred from the fossil record that eliminated broader ancestral distributions within Zamiaceae compromise accurate optimization of ancestral biogeographical areas for that hypothesis. While major lineages of Cycadales are ancient, crown ages of all modern genera are no older than 12 million years, supporting a recent hypothesis of mostly Miocene radiations. This phylogeny can contribute to an accurate infrafamilial

  9. Infant perception of atypical speech signals.

    PubMed

    Vouloumanos, Athena; Gelfand, Hanna M

    2013-05-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how visual context influences infant speech perception. Nine-month-olds heard speech and nonspeech sounds produced by either a human or a parrot, concurrently with 1 of 2 visual displays: a static checkerboard or a static image of a human face. Using an infant-controlled looking task, we examined infants' preferences for speech and nonspeech sounds. Infants listened equally to parrot speech and nonspeech when paired with a checkerboard. However, in the presence of faces, infants listened longer to parrot speech than to nonspeech sounds, such that their preference for parrot speech was similar to their preference for human speech sounds. These data are consistent with the possibility that infants treat parrot speech similarly to human speech relative to nonspeech vocalizations but only in some visual contexts. Like adults, infants may perceive a range of signals as speech.

  10. Effect of Speaking Rate on Recognition of Synthetic and Natural Speech by Normal-Hearing and Cochlear Implant Listeners

    PubMed Central

    Ji, Caili; Galvin, John J.; Xu, Anting; Fu, Qian-Jie

    2012-01-01

    Objective Most studies have evaluated cochlear implant (CI) performance using “clear” speech materials, which are highly intelligible and well-articulated. CI users may encounter much greater variability in speech patterns in the “real-world,” including synthetic speech. In this study, we measured normal-hearing (NH) and CI listeners’ sentence recognition with multiple talkers and speaking rates, and with naturally produced and synthetic speech. Design NH and CI subjects were asked to recognize naturally produced or synthetic sentences, presented at a slow, normal, or fast speaking rate. Natural speech was produced by one male and one female talker; synthetic speech was generated to simulate a male and female talker. For natural speech, the speaking rate was time-scaled while preserving voice pitch and formant frequency information. For synthetic speech, the speaking rate was adjusted within the speech synthesis engine. NH subjects were tested while listening to unprocessed speech or to an 8-channel acoustic CI simulation. CI subjects were tested while listening with their clinical processors and the recommended microphone sensitivity and volume settings. Results The NH group performed significantly better than the CI simulation group, and the CI simulation group performed significantly better than the CI group. For all subject groups, sentence recognition was significantly better with natural than with synthetic speech. The performance deficit with synthetic speech was relatively small for NH subjects listening to unprocessed speech. However, the performance deficit with synthetic speech was much greater for CI subjects and for CI simulation subjects. There was significant effect of talker gender, with slightly better performance with the female talker for CI subjects and slightly better performance with the male talker for the CI simulations. For all subject groups, sentence recognition was significantly poorer only at the fast rate. CI performance was

  11. Text-to-Speech, Text, and Hypertext: Reading and Spelling with the Computer.

    ERIC Educational Resources Information Center

    Leong, Che Kan

    1992-01-01

    Introduces this special issue. Discusses the analysis-by-synthesis principle of text-to-speech conversion; some classroom and research issues in designing "usable" computer texts; and the criteria of hypertext. Emphasizes the importance of the contextual aspects of text and hypertext and situated learning. (RS)

  12. Neural pathways for visual speech perception

    PubMed Central

    Bernstein, Lynne E.; Liebenthal, Einat

    2014-01-01

    This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA. PMID:25520611

  13. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    PubMed

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse. PMID:16521772

  14. Speech-on-speech masking with variable access to the linguistic content of the masker speech.

    PubMed

    Calandruccio, Lauren; Dhar, Sumitrajit; Bradlow, Ann R

    2010-08-01

    It has been reported that listeners can benefit from a release in masking when the masker speech is spoken in a language that differs from the target speech compared to when the target and masker speech are spoken in the same language [Freyman, R. L. et al. (1999). J. Acoust. Soc. Am. 106, 3578-3588; Van Engen, K., and Bradlow, A. (2007), J. Acoust. Soc. Am. 121, 519-526]. It is unclear whether listeners benefit from this release in masking due to the lack of linguistic interference of the masker speech, from acoustic and phonetic differences between the target and masker languages, or a combination of these differences. In the following series of experiments, listeners' sentence recognition was evaluated using speech and noise maskers that varied in the amount of linguistic content, including native-English, Mandarin-accented English, and Mandarin speech. Results from three experiments indicated that the majority of differences observed between the linguistic maskers could be explained by spectral differences between the masker conditions. However, when the recognition task increased in difficulty, i.e., at a more challenging signal-to-noise ratio, a greater decrease in performance was observed for the maskers with more linguistically relevant information than what could be explained by spectral differences alone. PMID:20707455

  15. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    PubMed

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.

  16. Revisiting speech interference in classrooms.

    PubMed

    Picard, M; Bradley, J S

    2001-01-01

    A review of the effects of ambient noise and reverberation on speech intelligibility in classrooms has been completed because of the long-standing lack of agreement on preferred acoustical criteria for unconstrained speech accessibility and communication in educational facilities. An overwhelming body of evidence has been collected to suggest that noise levels in particular are usually far in excess of any reasonable prescription for optimal conditions for understanding speech in classrooms. Quite surprisingly, poor classroom acoustics seem to be the prevailing condition for both normally-hearing and hearing-impaired students with reported A-weighted ambient noise levels 4-37 dB above values currently agreed upon to provide optimal understanding. Revision of currently proposed room acoustic performance criteria to ensure speech accessibility for all students indicates the need for a guideline weighted for age and one for more vulnerable groups. For teens (12-year-olds and older) and young adults having normal speech processing in noise, ambient noise levels not exceeding 40 dBA are suggested as acceptable, and reverberation times of about 0.5 s are concluded to be optimum. Younger students, having normal speech processing in noise for their age, would require noise levels ranging from 39 dBA for 10-11-year-olds to only 28.5 dBA for 6-7-year-olds. By contrast, groups suspected of delayed speech processing in noise may require levels as low as only 21.5 dBA at age 6-7. As one would expect, these more vulnerable students would include the hearing-impaired in the course of language development and non-native listeners. PMID:11688542

  17. Speech prosody in cerebellar ataxia

    NASA Astrophysics Data System (ADS)

    Casper, Maureen

    The present study sought an acoustic signature for the speech disturbance recognized in cerebellar degeneration. Magnetic resonance imaging was used for a radiological rating of cerebellar involvement in six cerebellar ataxic dysarthric speakers. Acoustic measures of the [pap] syllables in contrastive prosodic conditions and of normal vs. brain-damaged patients were used to further our understanding both of the speech degeneration that accompanies cerebellar pathology and of speech motor control and movement in general. Pair-wise comparisons of the prosodic conditions within the normal group showed statistically significant differences for four prosodic contrasts. For three of the four contrasts analyzed, the normal speakers showed both longer durations and higher formant and fundamental frequency values in the more prominent first condition of the contrast. The acoustic measures of the normal prosodic contrast values were then used as a model to measure the degree of speech deterioration for individual cerebellar subjects. This estimate of speech deterioration as determined by individual differences between cerebellar and normal subjects' acoustic values of the four prosodic contrasts was used in correlation analyses with MRI ratings. Moderate correlations between speech deterioration and cerebellar atrophy were found in the measures of syllable duration and f0. A strong negative correlation was found for F1. Moreover, the normal model presented by these acoustic data allows for a description of the flexibility of task- oriented behavior in normal speech motor control. These data challenge spatio-temporal theory which explains movement as an artifact of time wherein longer durations predict more extreme movements and give further evidence for gestural internal dynamics of movement in which time emerges from articulatory events rather than dictating those events. This model provides a sensitive index of cerebellar pathology with quantitative acoustic

  18. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-08-08

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  19. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2004-03-23

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  20. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-02-14

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  1. Contextual variability during speech-in-speech recognition

    PubMed Central

    Brouwer, Susanne; Bradlow, Ann R.

    2014-01-01

    This study examined the influence of background language variation on speech recognition. English listeners performed an English sentence recognition task in either “pure” background conditions in which all trials had either English or Dutch background babble or in mixed background conditions in which the background language varied across trials (i.e., a mix of English and Dutch or one of these background languages mixed with quiet trials). This design allowed the authors to compare performance on identical trials across pure and mixed conditions. The data reveal that speech-in-speech recognition is sensitive to contextual variation in terms of the target-background language (mis)match depending on the relative ease/difficulty of the test trials in relation to the surrounding trials. PMID:24993234

  2. Production and perception of clear speech

    NASA Astrophysics Data System (ADS)

    Bradlow, Ann R.

    2003-04-01

    When a talker believes that the listener is likely to have speech perception difficulties due to a hearing loss, background noise, or a different native language, she or he will typically adopt a clear speaking style. Previous research has established that, with a simple set of instructions to the talker, ``clear speech'' can be produced by most talkers under laboratory recording conditions. Furthermore, there is reliable evidence that adult listeners with either impaired or normal hearing typically find clear speech more intelligible than conversational speech. Since clear speech production involves listener-oriented articulatory adjustments, a careful examination of the acoustic-phonetic and perceptual consequences of the conversational-to-clear speech transformation can serve as an effective window into talker- and listener-related forces in speech communication. Furthermore, clear speech research has considerable potential for the development of speech enhancement techniques. After reviewing previous and current work on the acoustic properties of clear versus conversational speech, this talk will present recent data from a cross-linguistic study of vowel production in clear speech and a cross-population study of clear speech perception. Findings from these studies contribute to an evolving view of clear speech production and perception as reflecting both universal, auditory and language-specific, phonological contrast enhancement features.

  3. Determining the threshold for usable speech within co-channel speech with the SPHINX automated speech recognition system

    NASA Astrophysics Data System (ADS)

    Hicks, William T.; Yantorno, Robert E.

    2004-10-01

    Much research has been and is continuing to be done in the area of separating the original utterances of two speakers from co-channel speech. This is very important in the area of automated speech recognition (ASR), where the current state of technology is not nearly as accurate as human listeners when the speech is co-channel. It is desired to determine what types of speech (voiced, unvoiced, and silence) and at what target to interference ratio (TIR) two speakers can speak at the same time and not reduce speech intelligibility of the target speaker (referred to as usable speech). Knowing which segments of co-channel speech are usable in ASR can be used to improve the reconstruction of single speaker speech. Tests were performed using the SPHINX ASR software and the TIDIGITS database. It was found that interfering voiced speech with a TIR of 6 dB or greater (on a per frame basis) did not significantly reduce the intelligibility of the target speaker in co-channel speech. It was further found that interfering unvoiced speech with a TIR of 18 dB or greater (on a per frame basis) did not significantly reduce the intelligibility of the target speaker in co-channel speech.

  4. The Effect of Speech Rate on Stuttering Frequency, Phonated Intervals, Speech Effort, and Speech Naturalness during Chorus Reading

    ERIC Educational Resources Information Center

    Davidow, Jason H.; Ingham, Roger J.

    2013-01-01

    Purpose: This study examined the effect of speech rate on phonated intervals (PIs), in order to test whether a reduction in the frequency of short PIs is an important part of the fluency-inducing mechanism of chorus reading. The influence of speech rate on stuttering frequency, speaker-judged speech effort, and listener-judged naturalness was also…

  5. A causal test of the motor theory of speech perception: A case of impaired speech production and spared speech perception

    PubMed Central

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E.; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z.

    2015-01-01

    In the last decade, the debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. However, the exact role of the motor system in auditory speech processing remains elusive. Here we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. The patient’s spontaneous speech was marked by frequent phonological/articulatory errors, and those errors were caused, at least in part, by motor-level impairments with speech production. We found that the patient showed a normal phonemic categorical boundary when discriminating two nonwords that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the nonword stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labeling impairment. These data suggest that the identification (i.e. labeling) of nonword speech sounds may involve the speech motor system, but that the perception of speech sounds (i.e., discrimination) does not require the motor system. This means that motor processes are not causally involved in perception of the speech signal, and suggest that the motor system may be used when other cues (e.g., meaning, context) are not available. PMID:25951749

  6. The Contribution of Sensitivity to Speech Rhythm and Non-Speech Rhythm to Early Reading Development

    ERIC Educational Resources Information Center

    Holliman, Andrew J.; Wood, Clare; Sheehy, Kieron

    2010-01-01

    Both sensitivity to speech rhythm and non-speech rhythm have been associated with successful phonological awareness and reading development in separate studies. However, the extent to which speech rhythm, non-speech rhythm and literacy skills are interrelated has not been examined. As a result, five- to seven-year-old English-speaking children…

  7. Perceived Liveliness and Speech Comprehensibility in Aphasia: The Effects of Direct Speech in Auditory Narratives

    ERIC Educational Resources Information Center

    Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

    2014-01-01

    Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in "healthy" communication direct speech constructions contribute to the liveliness, and indirectly to the comprehensibility, of speech.…

  8. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    ERIC Educational Resources Information Center

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  9. Common neural substrates support speech and non-speech vocal tract gestures.

    PubMed

    Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M J; Poletto, Christopher J; Ludlow, Christy L

    2009-08-01

    The issue of whether speech is supported by the same neural substrates as non-speech vocal tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as non-sense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, was compared to the production of speech syllables without meaning. Brain activation related to overt production was captured with BOLD fMRI using a sparse sampling design for both conditions. Speech and non-speech were compared using voxel-wise whole brain analyses, and ROI analyses focused on frontal and temporoparietal structures previously reported to support speech production. Results showed substantial activation overlap between speech and non-speech function in regions. Although non-speech gesture production showed greater extent and amplitude of activation in the regions examined, both speech and non-speech showed comparable left laterality in activation for both target perception and production. These findings posit a more general role of the previously proposed "auditory dorsal stream" in the left hemisphere--to support the production of vocal tract gestures that are not limited to speech processing.

  10. Speech Perception and Short-Term Memory Deficits in Persistent Developmental Speech Disorder

    ERIC Educational Resources Information Center

    Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

    2006-01-01

    Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech…

  11. The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech

    ERIC Educational Resources Information Center

    Wayne, Rachel V.; Johnsrude, Ingrid S.

    2012-01-01

    Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…

  12. President Kennedy's Speech at Rice University

    NASA Technical Reports Server (NTRS)

    1988-01-01

    This video tape presents unedited film footage of President John F. Kennedy's speech at Rice University, Houston, Texas, September 12, 1962. The speech expresses the commitment of the United States to landing an astronaut on the Moon.

  13. A statistical model based fundamental frequency synthesizer for Mandarin speech.

    PubMed

    Chen, S H; Chang, S; Lee, S M

    1992-07-01

    A novel method based on a statistical model for the fundamental-frequency (F0) synthesis in Mandarin text-to-speech is proposed. Specifically, a statistical model is employed to determine the relationship between F0 contour patterns of syllables and linguistic features representing the context. Parameters of the model were empirically estimated from a large training set of sentential utterances. Phonologic rules are then automatically deduced through the training process and implicitly memorized in the model. In the synthesis process, contextual features are extracted from a given input text, and the best estimates of F0 contour patterns of syllable are then found by a Viterbi algorithm using the well-trained model. This method can be regarded as employing a stochastic grammar to reduce the number of candidates of F0 contour pattern at each decision point of synthesis. Although linguistic features on various levels of input text can be incorporated into the model, only some relevant contextual features extracted from neighboring syllables were used in this study. Performance of this method was examined by simulation using a database composed of nine repetitions of 112 declarative sentential utterances of the same text, all spoken by a single speaker. By closely examining the well-trained model, some evidence was found to show that the declination effect as well as several sandhi rules are implicitly contained in the model. Experimental results show that 77.56% of synthesized F0 contours coincide with the VQ-quantized counterpart of the original natural speech. Naturalness of the synthesized speech was confirmed by an informal listening test. PMID:1387408

  14. Amplitude-temporal method of speech coding

    NASA Astrophysics Data System (ADS)

    Ababii, Victor; Sudacevschi, Viorica

    2005-02-01

    A method of speech coding and decoding is proposed. The speech coding algorithm is based on first derivate calculation of input speech signal, identification of critical points and input signal amplitude in these points, time period measurement between critical points. The result of codification represents a sequence of amplitudes and time periods. The decoding algorithm utilizes values of COS or SIN functions for reconstruction on the input speech. The codec structure that consists from encoder and decoder units is proposed.

  15. A Chimpanzee Recognizes Synthetic Speech With Significantly Reduced Acoustic Cues to Phonetic Content

    PubMed Central

    Heimbauer, Lisa A.; Beran, Michael J.; Owren, Michael J.

    2011-01-01

    Summary A long-standing debate concerns whether humans are specialized for speech perception [1–7], which some researchers argue is demonstrated by the ability to understand synthetic speech with significantly reduced acoustic cues to phonetic content [2–4,7]. We tested a chimpanzee (Pan troglodytes) that recognizes 128 spoken words [8,9], asking whether she could understand such speech. Three experiments presented 48 individual words, with the animal selecting a corresponding visuo-graphic symbol from among four alternatives. Experiment 1 tested spectrally reduced, noise-vocoded (NV) synthesis, originally developed to simulate input received by human cochlear-implant users [10]. Experiment 2 tested “impossibly unspeechlike” [3] sine-wave (SW) synthesis, which reduces speech to just three moving tones [11]. Although receiving only intermittent and non-contingent reward, the chimpanzee performed well above chance level, including when hearing synthetic versions for the first time. Recognition of SW words was least accurate, but improved in Experiment 3 when natural words in the same session were rewarded. The chimpanzee was more accurate with NV than SW versions, as were 32 human participants hearing these items. The chimpanzee's ability to spontaneously recognize acoustically reduced synthetic words suggests that experience rather than specialization is critical for speech-perception capabilities that some have suggested are uniquely human [12–14]. PMID:21723125

  16. Nonlinear Statistical Modeling of Speech

    NASA Astrophysics Data System (ADS)

    Srinivasan, S.; Ma, T.; May, D.; Lazarou, G.; Picone, J.

    2009-12-01

    Contemporary approaches to speech and speaker recognition decompose the problem into four components: feature extraction, acoustic modeling, language modeling and search. Statistical signal processing is an integral part of each of these components, and Bayes Rule is used to merge these components into a single optimal choice. Acoustic models typically use hidden Markov models based on Gaussian mixture models for state output probabilities. This popular approach suffers from an inherent assumption of linearity in speech signal dynamics. Language models often employ a variety of maximum entropy techniques, but can employ many of the same statistical techniques used for acoustic models. In this paper, we focus on introducing nonlinear statistical models to the feature extraction and acoustic modeling problems as a first step towards speech and speaker recognition systems based on notions of chaos and strange attractors. Our goal in this work is to improve the generalization and robustness properties of a speech recognition system. Three nonlinear invariants are proposed for feature extraction: Lyapunov exponents, correlation fractal dimension, and correlation entropy. We demonstrate an 11% relative improvement on speech recorded under noise-free conditions, but show a comparable degradation occurs for mismatched training conditions on noisy speech. We conjecture that the degradation is due to difficulties in estimating invariants reliably from noisy data. To circumvent these problems, we introduce two dynamic models to the acoustic modeling problem: (1) a linear dynamic model (LDM) that uses a state space-like formulation to explicitly model the evolution of hidden states using an autoregressive process, and (2) a data-dependent mixture of autoregressive (MixAR) models. Results show that LDM and MixAR models can achieve comparable performance with HMM systems while using significantly fewer parameters. Currently we are developing Bayesian parameter estimation and

  17. Speech Communication and Telephone Networks

    NASA Astrophysics Data System (ADS)

    Gierlich, H. W.

    Speech communication over telephone networks has one major constraint: The communication has to be “real time”. The basic principle since the beginning of all telephone networks has been to provide a communication system capable of substituting the air path between two persons having a conversation at 1-m distance. This is the so-called orthotelephonic reference position [7]. Although many technical compromises must be made to enable worldwide communication over telephone networks, it is still the goal to achieve speech quality performance which is close to this reference.

  18. Auditory models for speech analysis

    NASA Astrophysics Data System (ADS)

    Maybury, Mark T.

    This paper reviews the psychophysical basis for auditory models and discusses their application to automatic speech recognition. First an overview of the human auditory system is presented, followed by a review of current knowledge gleaned from neurological and psychoacoustic experimentation. Next, a general framework describes established peripheral auditory models which are based on well-understood properties of the peripheral auditory system. This is followed by a discussion of current enhancements to that models to include nonlinearities and synchrony information as well as other higher auditory functions. Finally, the initial performance of auditory models in the task of speech recognition is examined and additional applications are mentioned.

  19. Campus Speech Codes Said to Violate Rights

    ERIC Educational Resources Information Center

    Lipka, Sara

    2007-01-01

    Most college and university speech codes would not survive a legal challenge, according to a report released in December by the Foundation for Individual Rights in Education, a watchdog group for free speech on campuses. The report labeled many speech codes as overly broad or vague, and cited examples such as Furman University's prohibition of…

  20. DEVELOPMENT AND DISORDERS OF SPEECH IN CHILDHOOD.

    ERIC Educational Resources Information Center

    KARLIN, ISAAC W.; AND OTHERS

    THE GROWTH, DEVELOPMENT, AND ABNORMALITIES OF SPEECH IN CHILDHOOD ARE DESCRIBED IN THIS TEXT DESIGNED FOR PEDIATRICIANS, PSYCHOLOGISTS, EDUCATORS, MEDICAL STUDENTS, THERAPISTS, PATHOLOGISTS, AND PARENTS. THE NORMAL DEVELOPMENT OF SPEECH AND LANGUAGE IS DISCUSSED, INCLUDING THEORIES ON THE ORIGIN OF SPEECH IN MAN AND FACTORS INFLUENCING THE NORMAL…

  1. NEW SPEECH PATTERNS IN THE FRENCH QUARTER.

    ERIC Educational Resources Information Center

    BRADDOCK, CLAYTON

    BUSINESSMEN IN NEW ORLEANS CITED POOR SPEECH AMONG NEGRO APPLICANTS FOR SECRETARIAL AND STENOGRAPHIC POSITIONS AS THE MAJOR REASON FOR NOT HIRING THEM. AS A RESULT, ST. MARY'S DOMINICAN COLLEGE EMBARKED ON AN 8-MONTH PROGRAM IN 1965 TO TEACH STANDARD SPEECH TO 90 YOUNG WOMEN, 75 OF WHOM WERE NEGRO. STANDARD SPEECH WAS TAUGHT AS A SECOND LANGUAGE.…

  2. Liberalism, Speech Codes, and Related Problems.

    ERIC Educational Resources Information Center

    Sunstein, Cass R.

    1993-01-01

    It is argued that universities are pervasively and necessarily engaged in regulation of speech, which complicates many existing claims about hate speech codes on campus. The ultimate test is whether the restriction on speech is a legitimate part of the institution's mission, commitment to liberal education. (MSE)

  3. Hate Speech on Campus: A Practical Approach.

    ERIC Educational Resources Information Center

    Hogan, Patrick

    1997-01-01

    Looks at arguments concerning hate speech and speech codes on college campuses, arguing that speech codes are likely to be of limited value in achieving civil rights objectives, and that there are alternatives less harmful to civil liberties and more successful in promoting civil rights. Identifies specific goals, and considers how restriction of…

  4. Intelligibility of Speech Produced during Simultaneous Communication

    ERIC Educational Resources Information Center

    Whitehead, Robert L.; Schiavetti, Nicholas; MacKenzie, Douglas J.; Metz, Dale Evan

    2004-01-01

    This study investigated the overall intelligibility of speech produced during simultaneous communication (SC). Four hearing, experienced sign language users were recorded under SC and speech alone (SA) conditions speaking Boothroyd's (1985) forced-choice phonetic contrast material designed for measurement of speech intelligibility. Twelve…

  5. Speech-Song Interface of Chinese Speakers

    ERIC Educational Resources Information Center

    Mang, Esther

    2007-01-01

    Pitch is a psychoacoustic construct crucial in the production and perception of speech and songs. This article is an exploration of the interface of speech and song performance of Chinese speakers. Although parallels might be drawn from the prosodic and sound structures of the linguistic and musical systems, perceiving and producing speech and…

  6. Audiovisual Asynchrony Detection in Human Speech

    ERIC Educational Resources Information Center

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  7. Audiovisual Speech Integration and Lipreading in Autism

    ERIC Educational Resources Information Center

    Smith, Elizabeth G.; Bennetto, Loisa

    2007-01-01

    Background: During speech perception, the ability to integrate auditory and visual information causes speech to sound louder and be more intelligible, and leads to quicker processing. This integration is important in early language development, and also continues to affect speech comprehension throughout the lifespan. Previous research shows that…

  8. Free Speech in the College Community.

    ERIC Educational Resources Information Center

    O'Neil, Robert M.

    This book discusses freedom of speech issues affecting the college community, in light of "speech codes" imposed by some institutions, new electronic technology such as the Internet, and recent court decisions. Chapter 1 addresses campus speech codes, the advantages and disadvantages of such codes, and their conflict with the First Amendment of…

  9. Syllable Structure in Dysfunctional Portuguese Children's Speech

    ERIC Educational Resources Information Center

    Candeias, Sara; Perdigao, Fernando

    2010-01-01

    The goal of this work is to investigate whether children with speech dysfunctions (SD) show a deficit in planning some Portuguese syllable structures (PSS) in continuous speech production. Knowledge of which aspects of speech production are affected by SD is necessary for efficient improvement in the therapy techniques. The case-study is focused…

  10. Communicating by Language: The Speech Process.

    ERIC Educational Resources Information Center

    House, Arthur S., Ed.

    This document reports on a conference focused on speech problems. The main objective of these discussions was to facilitate a deeper understanding of human communication through interaction of conference participants with colleagues in other disciplines. Topics discussed included speech production, feedback, speech perception, and development of…

  11. Interventions for Speech Sound Disorders in Children

    ERIC Educational Resources Information Center

    Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

    2010-01-01

    With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

  12. Vygotskian Inner Speech and the Reading Process

    ERIC Educational Resources Information Center

    Ehrich, J. F.

    2006-01-01

    There is a paucity of Vygotskian influenced inner speech research in relation to the reading process. Those few studies which have examined Vygotskian inner speech from a reading perspective tend to support the notion that inner speech is an important covert function that is crucial to the reading process and to reading acquisition in general.…

  13. Cognitive Functions in Childhood Apraxia of Speech

    ERIC Educational Resources Information Center

    Nijland, Lian; Terband, Hayo; Maassen, Ben

    2015-01-01

    Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…

  14. Acoustics of Clear Speech: Effect of Instruction

    ERIC Educational Resources Information Center

    Lam, Jennifer; Tjaden, Kris; Wilding, Greg

    2012-01-01

    Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…

  15. Speech and Hearing Science, Anatomy and Physiology.

    ERIC Educational Resources Information Center

    Zemlin, Willard R.

    Written for those interested in speech pathology and audiology, the text presents the anatomical, physiological, and neurological bases for speech and hearing. Anatomical nomenclature used in the speech and hearing sciences is introduced and the breathing mechanism is defined and discussed in terms of the respiratory passage, the framework and…

  16. Hate Speech and the First Amendment.

    ERIC Educational Resources Information Center

    Rainey, Susan J.; Kinsler, Waren S.; Kannarr, Tina L.; Reaves, Asa E.

    This document is comprised of California state statutes, federal legislation, and court litigation pertaining to hate speech and the First Amendment. The document provides an overview of California education code sections relating to the regulation of speech; basic principles of the First Amendment; government efforts to regulate hate speech,…

  17. Speech Perception in Individuals with Auditory Neuropathy

    ERIC Educational Resources Information Center

    Zeng, Fan-Gang; Liu, Sheng

    2006-01-01

    Purpose: Speech perception in participants with auditory neuropathy (AN) was systematically studied to answer the following 2 questions: Does noise present a particular problem for people with AN: Can clear speech and cochlear implants alleviate this problem? Method: The researchers evaluated the advantage in intelligibility of clear speech over…

  18. The Varieties of Speech to Young Children

    ERIC Educational Resources Information Center

    Huttenlocher, Janellen; Vasilyeva, Marina; Waterfall, Heidi R.; Vevea, Jack L.; Hedges, Larry V.

    2007-01-01

    This article examines caregiver speech to young children. The authors obtained several measures of the speech used to children during early language development (14-30 months). For all measures, they found substantial variation across individuals and subgroups. Speech patterns vary with caregiver education, and the differences are maintained over…

  19. The Dynamic Nature of Speech Perception

    ERIC Educational Resources Information Center

    McQueen, James M.; Norris, Dennis; Cutler, Anne

    2006-01-01

    The speech perception system must be flexible in responding to the variability in speech sounds caused by differences among speakers and by language change over the lifespan of the listener. Indeed, listeners use lexical knowledge to retune perception of novel speech (Norris, McQueen, & Cutler, 2003). In that study, Dutch listeners made lexical…

  20. Speech Perception in Children with Speech Output Disorders

    ERIC Educational Resources Information Center

    Nijland, Lian

    2009-01-01

    Research in the field of speech production pathology is dominated by describing deficits in output. However, perceptual problems might underlie, precede, or interact with production disorders. The present study hypothesizes that the level of the production disorders is linked to level of perception disorders, thus lower-order production problems…

  1. Critical Thinking in Speech Communication: Survey of Speech Communication Educators.

    ERIC Educational Resources Information Center

    Ruminski, Henry; And Others

    The interests of all communication educators would best be served if those educators could agree on a single broad definition of critical thinking that incorporates a variety of perspectives. Toward this end, questionnaires were sent to a ramdom sampling of 300 members of the Speech Communication Association; 88 were returned. The questionnaire…

  2. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    ERIC Educational Resources Information Center

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  3. Perception of Speech Reflects Optimal Use of Probabilistic Speech Cues

    ERIC Educational Resources Information Center

    Clayards, Meghan; Tanenhaus, Michael K.; Aslin, Richard N.; Jacobs, Robert A.

    2008-01-01

    Listeners are exquisitely sensitive to fine-grained acoustic detail within phonetic categories for sounds and words. Here we show that this sensitivity is optimal given the probabilistic nature of speech cues. We manipulated the probability distribution of one probabilistic cue, voice onset time (VOT), which differentiates word initial labial…

  4. Speech entrainment enables patients with Broca's aphasia to produce fluent speech.

    PubMed

    Fridriksson, Julius; Hubbard, H Isabel; Hudspeth, Sarah Grace; Holland, Audrey L; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris

    2012-12-01

    A distinguishing feature of Broca's aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect 'speech entrainment' and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca's aphasia. In Experiment 1, 13 patients with Broca's aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca's area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production. Behavioural and

  5. Physiologically-Based Speech Simulation Using AN Enhanced Wave-Reflection Model of the Vocal Tract.

    NASA Astrophysics Data System (ADS)

    Story, Brad Hudson

    1995-01-01

    Speech simulation is defined as a form of speech synthesis in which movement of air and tissue in the vocal tract is under experimental control rather than the acoustic signal. Successful simulation requires both a sophisticated model of the acoustics of the vocal tract and detailed geometric dimensions of the many vocal tract shapes used to produce speech. The first aim of this study was to improve the current state of a wave-reflection type vocal tract model by implementing yielding walls, frequency dependent viscous losses, sound radiation from skin surfaces, and multiple branches within the vocal and nasal tracts. The second aim was to acquire, using MRI and EBCT, an inventory of three dimensional vocal tract airway shapes that correspond to a particular set of vowels and consonants. A set of 18 shapes was completed for one male subject while a second set of 2 shapes was completed for a second subject. The 3-D shapes were analyzed to find the cross-sectional areas along the centerline of the long axis extending from the glottis to the mouth to produce an "area function". These area functions were then used as input to the enhanced wave-reflection model which resulted in a simulation of the speech waveform radiated from the mouth. The acoustic similarity of the simulated speech sounds and those recorded from the subjects was tested by spectral analysis. The simulations produced acoustic resonances that fell nearly within the range of variation for the three separate recordings of the natural speech.

  6. Comparing the single-word intelligibility of two speech synthesizers for small computers

    SciTech Connect

    Cochran, P.S.

    1986-01-01

    Previous research on the intelligibility of synthesized speech has placed emphasis on the segmental intelligibility (rather than word or sentence intelligibility) of expensive and sophisticated synthesis systems. There is a need for more information about the intelligibility of low-to-moderately priced speech synthesizers because they are the most likely to be widely purchase for clinical and educational use. This study was to compare the word intelligibility of two such synthesizers for small computers, the Votrax Personal Speech System (PSS) and the Echo GP (General Purpose). A multiple-choice word identification task was used in a two-part study in which 48 young adults served as listeners. Groups of subjects in Part I completed one trial listening to taped natural speech followed by one trial with each synthesizer. Subjects in Part II listened to the taped human speech followed by two trials with the same synthesizer. Under the quiet listening conditions used for this study, taped human speech was 30% more intelligible than the Votrax PSS, and 53% more intelligible than the Echo GP.

  7. Pulse Vector-Excitation Speech Encoder

    NASA Technical Reports Server (NTRS)

    Davidson, Grant; Gersho, Allen

    1989-01-01

    Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high quality of reconstructed speech, but with less computation than required by comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually-based error criterion to synthesize natural-sounding speech.

  8. Feasibility of Technology Enabled Speech Disorder Screening.

    PubMed

    Duenser, Andreas; Ward, Lauren; Stefani, Alessandro; Smith, Daniel; Freyne, Jill; Morgan, Angela; Dodd, Barbara

    2016-01-01

    One in twenty Australian children suffers from a speech disorder. Early detection of such problems can significantly improve literacy and academic outcomes for these children, reduce health and educational burden and ongoing social costs. Here we present the development of a prototype and feasibility tests of a screening and decision support tool to assess speech disorders in young children. The prototype incorporates speech signal processing, machine learning and expert knowledge to automatically classify phonemes of normal and disordered speech. We discuss these results and our future work towards the development of a mobile tool to facilitate broad, early speech disorder screening by non-experts. PMID:27440284

  9. Free Speech Advocates at Berkeley.

    ERIC Educational Resources Information Center

    Watts, William A.; Whittaker, David

    1966-01-01

    This study compares highly committed members of the Free Speech Movement (FSM) at Berkeley with the student population at large on 3 sociopsychological foci: general biographical data, religious orientation, and rigidity-flexibility. Questionnaires were administered to 172 FSM members selected by chance from the 10 to 1200 who entered and "sat-in"…

  10. Speech Errors across the Lifespan

    ERIC Educational Resources Information Center

    Vousden, Janet I.; Maylor, Elizabeth A.

    2006-01-01

    Dell, Burger, and Svec (1997) proposed that the proportion of speech errors classified as anticipations (e.g., "moot and mouth") can be predicted solely from the overall error rate, such that the greater the error rate, the lower the anticipatory proportion (AP) of errors. We report a study examining whether this effect applies to changes in error…

  11. Embedding speech into virtual realities

    NASA Technical Reports Server (NTRS)

    Bohn, Christian-Arved; Krueger, Wolfgang

    1993-01-01

    In this work a speaker-independent speech recognition system is presented, which is suitable for implementation in Virtual Reality applications. The use of an artificial neural network in connection with a special compression of the acoustic input leads to a system, which is robust, fast, easy to use and needs no additional hardware, beside a common VR-equipment.

  12. The Segmentation of Impromptu Speech.

    ERIC Educational Resources Information Center

    Svartvik, Jan

    A computer program for classifying elements of a language corpus for large-scale analysis is discussed. The approach is based on the assumption that there is a natural unit in speech processing and production, called a tone unit. The program "tags" the five grammatical phrase types (verb, adverb, adjective, noun, and prepositional) to provide a…

  13. Prosodic Contrasts in Ironic Speech

    ERIC Educational Resources Information Center

    Bryant, Gregory A.

    2010-01-01

    Prosodic features in spontaneous speech help disambiguate implied meaning not explicit in linguistic surface structure, but little research has examined how these signals manifest themselves in real conversations. Spontaneously produced verbal irony utterances generated between familiar speakers in conversational dyads were acoustically analyzed…

  14. Speech and Language Developmental Milestones

    MedlinePlus

    ... What are the milestones for speech and language development? The first signs of communication occur when an infant learns that a cry will bring food, comfort, and companionship. Newborns also begin to recognize important sounds in their environment, such as the voice of their mother or ...

  15. Inner Speech Impairments in Autism

    ERIC Educational Resources Information Center

    Whitehouse, Andrew J. O.; Maybery, Murray T.; Durkin, Kevin

    2006-01-01

    Background: Three experiments investigated the role of inner speech deficit in cognitive performances of children with autism. Methods: Experiment 1 compared children with autism with ability-matched controls on a verbal recall task presenting pictures and words. Experiment 2 used pictures for which the typical names were either single syllable or…

  16. Speech Research. Interim Scientific Report.

    ERIC Educational Resources Information Center

    Cooper, Franklin S.

    The status and progress of several studies dealing with the nature of speech, instrumentation for its investigation, and instrumentation for practical applications is reported on. The period of January 1 through June 30, 1969 is covered. Extended reports and manuscripts cover the following topics: programing for the Glace-Holmes synthesizer,…

  17. Vector adaptive predictive coder for speech and audio

    NASA Technical Reports Server (NTRS)

    Chen, Juin-Hwey (Inventor); Gersho, Allen (Inventor)

    1990-01-01

    A real-time vector adaptive predictive coder which approximates each vector of K speech samples by using each of M fixed vectors in a first codebook to excite a time-varying synthesis filter and picking the vector that minimizes distortion. Predictive analysis for each frame determines parameters used for computing from vectors in the first codebook zero-state response vectors that are stored at the same address (index) in a second codebook. Encoding of input speech vectors s.sub.n is then carried out using the second codebook. When the vector that minimizes distortion is found, its index is transmitted to a decoder which has a codebook identical to the first codebook of the decoder. There the index is used to read out a vector that is used to synthesize an output speech vector s.sub.n. The parameters used in the encoder are quantized, for example by using a table, and the indices are transmitted to the decoder where they are decoded to specify transfer characteristics of filters used in producing the vector s.sub.n from the receiver codebook vector selected by the vector index transmitted.

  18. Phrase-programmable digital speech system

    SciTech Connect

    Raymond, W.J.; Morgan, R.L.; Miller, R.L.

    1987-01-27

    This patent describes a phrase speaking computer system having a programmable digital computer and a speech processor, the speech processor comprising: a voice synthesizer; a read/write speech data segment memory; a read/write command memory; control processor means including processor control programs and logic connecting to the memories and to the voice synthesizer. It is arranged to scan the command memory and to respond to command data entries stored therein by transferring corresponding speech data segments from the speech data segment memory to the voice synthesizer; data conveyance means, connecting the computer to the command memory and the speech data segment memory, for transferring the command data entries supplied by the computer into the command memory and for transferring the speech data segments supplied by the computer into the speech data segment memory; and an enable signal line connecting the computer to the speech processor and arranged to initiate the operation of the processor control programs and logic when the enable signal line is enabled by the computer; the programmable computer including speech control programs controlling the operation of the computer including data conveyance command sequences that cause the computer to supply command data entries to the data conveyance means and speech processor enabling command sequences that cause computer to energize the enable signal line.

  19. Speech Motor Learning in Profoundly Deaf Adults

    PubMed Central

    Nasir, Sazzad M.; Ostry, David J.

    2008-01-01

    Speech production, like other sensorimotor behaviors, relies on multiple sensory inputs — audition, proprioceptive inputs from muscle spindles, and cutaneous inputs from mechanoreceptors in the skin and soft tissues of the vocal tract. However, the capacity for intelligible speech by deaf speakers suggests that somatosensory input on its own may contribute to speech motor control and perhaps even to speech learning. We assessed speech motor learning in cochlear implant recipients who were tested with their implants turned off. A robotic device was used to alter somatosensory feedback by displacing the jaw during speech. We found that with training implant subjects progressively adapted to the mechanical perturbation. Moreover, the corrections we observed were for movement deviations that were exceedingly small, on the order of millimetres, indicating that speakers have precise somatosensory expectations. Speech motor learning is significantly dependent on somatosensory input. PMID:18794839

  20. Speech recognition with amplitude and frequency modulations

    NASA Astrophysics Data System (ADS)

    Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

    2005-02-01

    Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance. auditory analysis | cochlear implant | neural code | phase | scene analysis

  1. Speaker identification utilizing noncontemporary speech.

    PubMed

    Hollien, H; Schwartz, R

    2001-01-01

    The noncontemporariness of speech is important to both of the two general approaches to speaker identification. Ear-witness identification is one of them; in that instance, the time at which the identification is made is noncontemporary. A substantial amount of research has been carried out on this relationship and it now is well established that an auditor's memory for a voice decays sharply over time. It is the second approach to speaker identification which is of present interest. In this case, samples of a speaker's utterances are obtained at different points in time. For example, a threat call will be recorded and then sometime later (often very much later), a suspect' s exemplar recording will be obtained. In this instance, it is the speech samples that are noncontemporary and they are the materials that are subjected to some form of speaker identification. Prevailing opinion is that noncontemporary speech itself poses just as difficult a challenge to the identification process as does the listener's memory decay in earwitness identification. Accordingly, series of aural-perceptual speaker identification projects were carried out on noncontemporary speech: first, two with latencies of 4 and 8 weeks followed by 4 and 32 weeks plus two more with the pairs separated by 6 and 20 years. Mean correct noncontemporary identification initially dropped to 75-80% at week 4 and this general level was sustained for up to six years. It was only after 20 years had elapsed that a significant drop (to 33%) was noted. It can be concluded that a listener's competency in identifying noncontemporary speech samples will show only modest decay over rather substantial periods of time and, hence, this factor should have only a minimal negative effect on the speaker identification process.

  2. The verbal transformation effect and the perceptual organization of speech: influence of formant transitions and F0-contour continuity.

    PubMed

    Stachurski, Marcin; Summers, Robert J; Roberts, Brian

    2015-05-01

    This study explored the role of formant transitions and F0-contour continuity in binding together speech sounds into a coherent stream. Listening to a repeating recorded word produces verbal transformations to different forms; stream segregation contributes to this effect and so it can be used to measure changes in perceptual coherence. In experiment 1, monosyllables with strong formant transitions between the initial consonant and following vowel were monotonized; each monosyllable was paired with a weak-transitions counterpart. Further stimuli were derived by replacing the consonant-vowel transitions with samples from adjacent steady portions. Each stimulus was concatenated into a 3-min-long sequence. Listeners only reported more forms in the transitions-removed condition for strong-transitions words, for which formant-frequency discontinuities were substantial. In experiment 2, the F0 contour of all-voiced monosyllables was shaped to follow a rising or falling pattern, spanning one octave. Consecutive tokens either had the same contour, giving an abrupt F0 change between each token, or alternated, giving a continuous contour. Discontinuous sequences caused more transformations and forms, and shorter times to the first transformation. Overall, these findings support the notion that continuity cues provided by formant transitions and the F0 contour play an important role in maintaining the perceptual coherence of speech.

  3. Adaptive Redundant Speech Transmission over Wireless Multimedia Sensor Networks Based on Estimation of Perceived Speech Quality

    PubMed Central

    Kang, Jin Ah; Kim, Hong Kook

    2011-01-01

    An adaptive redundant speech transmission (ARST) approach to improve the perceived speech quality (PSQ) of speech streaming applications over wireless multimedia sensor networks (WMSNs) is proposed in this paper. The proposed approach estimates the PSQ as well as the packet loss rate (PLR) from the received speech data. Subsequently, it decides whether the transmission of redundant speech data (RSD) is required in order to assist a speech decoder to reconstruct lost speech signals for high PLRs. According to the decision, the proposed ARST approach controls the RSD transmission, then it optimizes the bitrate of speech coding to encode the current speech data (CSD) and RSD bitstream in order to maintain the speech quality under packet loss conditions. The effectiveness of the proposed ARST approach is then demonstrated using the adaptive multirate-narrowband (AMR-NB) speech codec and ITU-T Recommendation P.563 as a scalable speech codec and the PSQ estimation, respectively. It is shown from the experiments that a speech streaming application employing the proposed ARST approach significantly improves speech quality under packet loss conditions in WMSNs. PMID:22164086

  4. Intelligibility of laryngectomees' substitute speech: automatic speech recognition and subjective rating.

    PubMed

    Schuster, Maria; Haderlein, Tino; Nöth, Elmar; Lohscheller, Jörg; Eysholdt, Ulrich; Rosanowski, Frank

    2006-02-01

    Substitute speech after laryngectomy is characterized by restricted aero-acoustic properties in comparison with laryngeal speech and has therefore lower intelligibility. Until now, an objective means to determine and quantify the intelligibility has not existed, although the intelligibility can serve as a global outcome parameter of voice restoration after laryngectomy. An automatic speech recognition system was applied on recordings of a standard text read by 18 German male laryngectomees with tracheoesophageal substitute speech. The system was trained with normal laryngeal speakers and not adapted to severely disturbed voices. Substitute speech was compared to laryngeal speech of a control group. Subjective evaluation of intelligibility was performed by a panel of five experts and compared to automatic speech evaluation. Substitute speech showed lower syllables/s and lower word accuracy than laryngeal speech. Automatic speech recognition for substitute speech yielded word accuracy between 10.0 and 50% (28.7+/-12.1%) with sufficient discrimination. It complied with experts' subjective evaluations of intelligibility. The multi-rater kappa of the experts alone did not differ from the multi-rater kappa of experts and the recognizer. Automatic speech recognition serves as a good means to objectify and quantify global speech outcome of laryngectomees. For clinical use, the speech recognition system will be adapted to disturbed voices and can also be applied in other languages. PMID:16001246

  5. Speech Entrainment Compensates for Broca's Area Damage

    PubMed Central

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-01-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443

  6. Changing Speech Styles: Strategies in Read Speech and Casual and Careful Spontaneous Speech.

    ERIC Educational Resources Information Center

    Eskenazi, Maxine

    A study examined segmental and suprasegmental elements which contribute to an impression of one speaking style as opposed to another. A corpus containing three styles of speech, casual, careful, and read, for the same linguistic content was gathered. Thirteen speakers from Paris, France (aged 24-35) were given a scenario to be acted out over the…

  7. A hardware preprocessor for use in speech recognition: Speech Input Device SID3

    NASA Astrophysics Data System (ADS)

    Renger, R. E.; Manning, D. R.

    1983-05-01

    A device which reduces the amount of data sent to the computer for speech recognition, by extracting from the speech signal the information that conveys the meaning of the speech, all other data being discarded is presented. The design includes signal to noise ratios as low as 10 dB, public telephone frequency bandwidth and unconstrained speech. It produces continuously at its output 64 bits of digital information, which represents the way 16 speech parameters vary. The parameters cover speech quality, voice pitch, resonant frequency, level of resonance and unvoiced spectrum color. The receiving computer must have supporting software containing recognition algorithms adapted to SID3 parameters.

  8. Analysis and synthesis of laughter

    NASA Astrophysics Data System (ADS)

    Sundaram, Shiva; Narayanan, Shrikanth

    2004-10-01

    There is much enthusiasm in the text-to-speech community for synthesis of emotional and natural speech. One idea being proposed is to include emotion dependent paralinguistic cues during synthesis to convey emotions effectively. This requires modeling and synthesis techniques of various cues for different emotions. Motivated by this, a technique to synthesize human laughter is proposed. Laughter is a complex mechanism of expression and has high variability in terms of types and usage in human-human communication. People have their own characteristic way of laughing. Laughter can be seen as a controlled/uncontrolled physiological process of a person resulting from an initial excitation in context. A parametric model based on damped simple harmonic motion to effectively capture these diversities and also maintain the individuals characteristics is developed here. Limited laughter/speech data from actual humans and synthesis ease are the constraints imposed on the accuracy of the model. Analysis techniques are also developed to determine the parameters of the model for a given individual or laughter type. Finally, the effectiveness of the model to capture the individual characteristics and naturalness compared to real human laughter has been analyzed. Through this the factors involved in individual human laughter and their importance can be better understood.

  9. Some articulatory details of emotional speech

    NASA Astrophysics Data System (ADS)

    Lee, Sungbok; Yildirim, Serdar; Bulut, Murtaza; Kazemzadeh, Abe; Narayanan, Shrikanth

    2005-09-01

    Differences in speech articulation among four emotion types, neutral, anger, sadness, and happiness are investigated by analyzing tongue tip, jaw, and lip movement data collected from one male and one female speaker of American English. The data were collected using an electromagnetic articulography (EMA) system while subjects produce simulated emotional speech. Pitch, root-mean-square (rms) energy and the first three formants were estimated for vowel segments. For both speakers, angry speech exhibited the largest rms energy and largest articulatory activity in terms of displacement range and movement speed. Happy speech is characterized by largest pitch variability. It has higher rms energy than neutral speech but articulatory activity is rather comparable to, or less than, neutral speech. That is, happy speech is more prominent in voicing activity than in articulation. Sad speech exhibits longest sentence duration and lower rms energy. However, its articulatory activity is no less than neutral speech. Interestingly, for the male speaker, articulation for vowels in sad speech is consistently more peripheral (i.e., more forwarded displacements) when compared to other emotions. However, this does not hold for female subject. These and other results will be discussed in detail with associated acoustics and perceived emotional qualities. [Work supported by NIH.

  10. Speech entrainment compensates for Broca's area damage.

    PubMed

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-08-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to SE. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during SE versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of SE to improve speech production and may help select patients for SE treatment. PMID:25989443

  11. Individual differneces in degraded speech perception

    NASA Astrophysics Data System (ADS)

    Carbonell, Kathy M.

    One of the lasting concerns in audiology is the unexplained individual differences in speech perception performance even for individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due to either examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has 3 aims; the first, is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions. The second aim is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics both across tasks and across sessions; and finally, to determine whether performance on degraded speech perception tasks are correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing impaired listeners.

  12. Speech and language delay in children.

    PubMed

    McLaughlin, Maura R

    2011-05-15

    Speech and language delay in children is associated with increased difficulty with reading, writing, attention, and socialization. Although physicians should be alert to parental concerns and to whether children are meeting expected developmental milestones, there currently is insufficient evidence to recommend for or against routine use of formal screening instruments in primary care to detect speech and language delay. In children not meeting the expected milestones for speech and language, a comprehensive developmental evaluation is essential, because atypical language development can be a secondary characteristic of other physical and developmental problems that may first manifest as language problems. Types of primary speech and language delay include developmental speech and language delay, expressive language disorder, and receptive language disorder. Secondary speech and language delays are attributable to another condition such as hearing loss, intellectual disability, autism spectrum disorder, physical speech problems, or selective mutism. When speech and language delay is suspected, the primary care physician should discuss this concern with the parents and recommend referral to a speech-language pathologist and an audiologist. There is good evidence that speech-language therapy is helpful, particularly for children with expressive language disorder. PMID:21568252

  13. Sensorimotor influences on speech perception in infancy.

    PubMed

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-01

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.

  14. Loss tolerant speech decoder for telecommunications

    NASA Technical Reports Server (NTRS)

    Prieto, Jr., Jaime L. (Inventor)

    1999-01-01

    A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer. If a speech frame is detected to be lost or in error, then extrapolation modules are executed and replacement SCA parameters are generated and sent as the parameters required by the SCA. In this way, the information transfer to the SCA is transparent, and the SCA processing continues as usual. The listener will not normally notice that a speech frame has been lost because of the smooth transition between the last-received, lost, and next-received speech frames.

  15. Sensorimotor influences on speech perception in infancy.

    PubMed

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-01

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development. PMID:26460030

  16. Speech entrainment compensates for Broca's area damage.

    PubMed

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-08-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to SE. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during SE versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of SE to improve speech production and may help select patients for SE treatment.

  17. Speech Articulator and User Gesture Measurements Using Micropower, Interferometric EM-Sensors

    SciTech Connect

    Holzrichter, J F; Ng, L C

    2001-02-06

    Very low power, GHz frequency, ''radar-like'' sensors can measure a variety of motions produced by a human user of machine interface devices. These data can be obtained ''at a distance'' and can measure ''hidden'' structures. Measurements range from acoustic induced, 10-micron amplitude vibrations of vocal tract tissues, to few centimeter human speech articulator motions, to meter-class motions of the head, hands, or entire body. These EM sensors measure ''fringe motions'' as reflected EM waves are mixed with a local (homodyne) reference wave. These data, when processed using models of the system being measured, provide real time states of interface positions or other targets vs. time. An example is speech articulator positions vs. time in the user's body. This information appears to be useful for a surprisingly wide range of applications ranging from speech coding synthesis and recognition, speaker or object identification, noise cancellation, hand or head motions for cursor direction, and other applications.

  18. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

    PubMed

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

    2015-01-01

    The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available. PMID:25951749

  19. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

    PubMed

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

    2015-01-01

    The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.

  20. Segmenting Words from Natural Speech: Subsegmental Variation in Segmental Cues

    ERIC Educational Resources Information Center

    Rytting, C. Anton; Brew, Chris; Fosler-Lussier, Eric

    2010-01-01

    Most computational models of word segmentation are trained and tested on transcripts of speech, rather than the speech itself, and assume that speech is converted into a sequence of symbols prior to word segmentation. We present a way of representing speech corpora that avoids this assumption, and preserves acoustic variation present in speech. We…

  1. Extensions to the Speech Disorders Classification System (SDCS)

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  2. Modeling Interactions between Speech Production and Perception: Speech Error Detection at Semantic and Phonological Levels and the Inner Speech Loop

    PubMed Central

    Kröger, Bernd J.; Crawford, Eric; Bekolay, Trevor; Eliasmith, Chris

    2016-01-01

    Production and comprehension of speech are closely interwoven. For example, the ability to detect an error in one's own speech, halt speech production, and finally correct the error can be explained by assuming an inner speech loop which continuously compares the word representations induced by production to those induced by perception at various cognitive levels (e.g., conceptual, word, or phonological levels). Because spontaneous speech errors are relatively rare, a picture naming and halt paradigm can be used to evoke them. In this paradigm, picture presentation (target word initiation) is followed by an auditory stop signal (distractor word) for halting speech production. The current study seeks to understand the neural mechanisms governing self-detection of speech errors by developing a biologically inspired neural model of the inner speech loop. The neural model is based on the Neural Engineering Framework (NEF) and consists of a network of about 500,000 spiking neurons. In the first experiment we induce simulated speech errors semantically and phonologically. In the second experiment, we simulate a picture naming and halt task. Target-distractor word pairs were balanced with respect to variation of phonological and semantic similarity. The results of the first experiment show that speech errors are successfully detected by a monitoring component in the inner speech loop. The results of the second experiment show that the model correctly reproduces human behavioral data on the picture naming and halt task. In particular, the halting rate in the production of target words was lower for phonologically similar words than for semantically similar or fully dissimilar distractor words. We thus conclude that the neural architecture proposed here to model the inner speech loop reflects important interactions in production and perception at phonological and semantic levels. PMID:27303287

  3. Headphone localization of speech stimuli

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1991-01-01

    Recently, three dimensional acoustic display systems have been developed that synthesize virtual sound sources over headphones based on filtering by Head-Related Transfer Functions (HRTFs), the direction-dependent spectral changes caused primarily by the outer ears. Here, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with non-individualized HRTFs. About half of the subjects 'pulled' their judgements toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgements; 15 to 46 percent of stimuli were heard inside the head with the shortest estimates near the median plane. The results infer that most listeners can obtain useful azimuth information from speech stimuli filtered by nonindividualized RTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.

  4. Apraxia of speech: an overview.

    PubMed

    Ogar, Jennifer; Slama, Hilary; Dronkers, Nina; Amici, Serena; Gorno-Tempini, Maria Luisa

    2005-12-01

    Apraxia of speech (AOS) is a motor speech disorder that can occur in the absence of aphasia or dysarthria. AOS has been the subject of some controversy since the disorder was first named and described by Darley and his Mayo Clinic colleagues in the 1960s. A recent revival of interest in AOS is due in part to the fact that it is often the first symptom of neurodegenerative diseases, such as primary progressive aphasia and corticobasal degeneration. This article will provide a brief review of terminology associated with AOS, its clinical hallmarks and neuroanatomical correlates. Current models of motor programming will also be addressed as they relate to AOS and finally, typical treatment strategies used in rehabilitating the articulation and prosody deficits associated with AOS will be summarized. PMID:16393756

  5. Language processing for speech understanding

    NASA Astrophysics Data System (ADS)

    Woods, W. A.

    1983-07-01

    This report considers language understanding techniques and control strategies that can be applied to provide higher-level support to aid in the understanding of spoken utterances. The discussion is illustrated with concepts and examples from the BBN speech understanding system, HWIM (Hear What I Mean). The HWIM system was conceived as an assistant to a travel budget manager, a system that would store information about planned and taken trips, travel budgets and their planning. The system was able to respond to commands and answer questions spoken into a microphone, and was able to synthesize spoken responses as output. HWIM was a prototype system used to drive speech understanding research. It used a phonetic-based approach, with no speaker training, a large vocabulary, and a relatively unconstraining English grammar. Discussed here is the control structure of the HWIM and the parsing algorithm used to parse sentences from the middle-out, using an ATN grammar.

  6. Speech parts as Poisson processes.

    PubMed

    Badalamenti, A F

    2001-09-01

    This paper presents evidence that six of the seven parts of speech occur in written text as Poisson processes, simple or recurring. The six major parts are nouns, verbs, adjectives, adverbs, prepositions, and conjunctions, with the interjection occurring too infrequently to support a model. The data consist of more than the first 5000 words of works by four major authors coded to label the parts of speech, as well as periods (sentence terminators). Sentence length is measured via the period and found to be normally distributed with no stochastic model identified for its occurrence. The models for all six speech parts but the noun significantly distinguish some pairs of authors and likewise for the joint use of all words types. Any one author is significantly distinguished from any other by at least one word type and sentence length very significantly distinguishes each from all others. The variety of word type use, measured by Shannon entropy, builds to about 90% of its maximum possible value. The rate constants for nouns are close to the fractions of maximum entropy achieved. This finding together with the stochastic models and the relations among them suggest that the noun may be a primitive organizer of written text.

  7. Effects of gaze and speech rate on receivers' evaluations of persuasive speech.

    PubMed

    Yokoyama, Hitomi; Daibo, Ikuo

    2012-04-01

    This study examined how gaze and speech rate affect perceptions of a speaker. Participants viewed a video recording of one of four persuasive messages delivered by a female speaker. Analysis of speech rate, gaze, and listener's sex revealed that when combined with a small amount of gaze, slow speech rate decreased trustworthiness as compared to a fast speech rate. For women, slow speech rate was thought to be indicative of less expertise as compared to a fast speech rate, again when combined with low gaze. There were no significant interactions, but there were main effects of gaze and speech rate on persuasiveness. High levels of gaze and slow speech rate each enhanced perceptions of the speaker's persuasiveness.

  8. Discriminating between auditory and motor cortical responses to speech and non-speech mouth sounds

    PubMed Central

    Agnew, Z.K.; McGettigan, C.; Scott, S.K.

    2012-01-01

    Several perspectives on speech perception posit a central role for the representation of articulations in speech comprehension, supported by evidence for premotor activation when participants listen to speech. However no experiments have directly tested whether motor responses mirror the profile of selective auditory cortical responses to native speech sounds, or whether motor and auditory areas respond in different ways to sounds. We used fMRI to investigate cortical responses to speech and non-speech mouth (ingressive click) sounds. Speech sounds activated bilateral superior temporal gyri more than other sounds, a profile not seen in motor and premotor cortices. These results suggest that there are qualitative differences in the ways that temporal and motor areas are activated by speech and click sounds: anterior temporal lobe areas are sensitive to the acoustic/phonetic properties while motor responses may show more generalised responses to the acoustic stimuli. PMID:21812557

  9. Primary Progressive Aphasia and Apraxia of Speech

    PubMed Central

    Jung, Youngsin; Duffy, Joseph R.; Josephs, Keith A.

    2014-01-01

    Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: non-fluent/agrammatic, semantic, and logopenic variants of primary progressive aphasia. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. Here, we review clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech. The distinctions among these disorders will be crucial since accurate diagnosis will be important from a prognostic and therapeutic standpoint. PMID:24234355

  10. The Levels of Speech Usage Rating Scale: Comparison of Client Self-Ratings with Speech Pathologist Ratings

    ERIC Educational Resources Information Center

    Gray, Christina; Baylor, Carolyn; Eadie, Tanya; Kendall, Diane; Yorkston, Kathryn

    2012-01-01

    Background: The term "speech usage" refers to what people want or need to do with their speech to fulfil the communication demands in their life roles. Speech-language pathologists (SLPs) need to know about clients' speech usage to plan appropriate interventions to meet their life participation goals. The Levels of Speech Usage is a categorical…

  11. The Neural Bases of Difficult Speech Comprehension and Speech Production: Two Activation Likelihood Estimation (ALE) Meta-Analyses

    ERIC Educational Resources Information Center

    Adank, Patti

    2012-01-01

    The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…

  12. Self-Evaluation and Pre-Speech Planning: A Strategy for Sharing Responsibility for Progress in the Speech Class.

    ERIC Educational Resources Information Center

    Desjardins, Linda A.

    Speech class teachers can implement a pre- and post-speech strategy, using pre-speech and self-evaluation forms, to help students become active in directing their own progress, and acknowledge their own accomplishments. Every speech is tape-recorded in class. Students listen to their speeches later and fill in the self-evaluation form, which asks…

  13. The neural processing of masked speech.

    PubMed

    Scott, Sophie K; McGettigan, Carolyn

    2013-09-01

    Spoken language is rarely heard in silence, and a great deal of interest in psychoacoustics has focused on the ways that the perception of speech is affected by properties of masking noise. In this review we first briefly outline the neuroanatomy of speech perception. We then summarise the neurobiological aspects of the perception of masked speech, and investigate this as a function of masker type, masker level and task. This article is part of a Special Issue entitled "Annual Reviews 2013". PMID:23685149

  14. Investigating Holistic Measures of Speech Prosody

    ERIC Educational Resources Information Center

    Cunningham, Dana Aliel

    2012-01-01

    Speech prosody is a multi-faceted dimension of speech which can be measured and analyzed in a variety of ways. In this study, the speech prosody of Mandarin L1 speakers, English L2 speakers, and English L1 speakers was assessed by trained raters who listened to sound clips of the speakers responding to a graph prompt and reading a short passage.…

  15. Spotlight on Speech Codes 2007: The State of Free Speech on Our Nation's Campuses

    ERIC Educational Resources Information Center

    Foundation for Individual Rights in Education (NJ1), 2007

    2007-01-01

    Last year, the Foundation for Individual Rights in Education (FIRE) conducted its first-ever comprehensive study of restrictions on speech at America's colleges and universities, "Spotlight on Speech Codes 2006: The State of Free Speech on our Nation's Campuses." In light of the essentiality of free expression to a truly liberal education, its…

  16. Cleft Audit Protocol for Speech (CAPS-A): A Comprehensive Training Package for Speech Analysis

    ERIC Educational Resources Information Center

    Sell, D.; John, A.; Harding-Bell, A.; Sweeney, T.; Hegarty, F.; Freeman, J.

    2009-01-01

    Background: The previous literature has largely focused on speech analysis systems and ignored process issues, such as the nature of adequate speech samples, data acquisition, recording and playback. Although there has been recognition of the need for training on tools used in speech analysis associated with cleft palate, little attention has been…

  17. Statistical modeling of infant-directed versus adult-directed speech: Insights from speech recognition

    NASA Astrophysics Data System (ADS)

    Kirchhoff, Katrin; Schimmel, Steven

    2003-10-01

    Studies on infant speech perception have shown that infant-directed speech (motherese) exhibits exaggerated acoustic properties, which are assumed to guide infants in the acquisition of phonemic categories. Training an automatic speech recognizer on such data might similarly lead to improved performance since classes can be expected to be more clearly separated in the training material. This claim was tested by training automatic speech recognizers on adult-directed (AD) versus infant-directed (ID) speech and testing them under identical versus mismatched conditions. 32 mother-infant conversations and 32 mother-adult conversations were used as training and test data. Both sets of conversations included a set of cue words containing unreduced vowels (e.g., sheep, boot, top, etc.), which mothers were encouraged to use repeatedly. Experiments on continuous speech recognition of the entire data set showed that recognizers trained on infant-directed speech did perform significantly better than those trained on adult-directed speech. However, isolated word recognition experiments focusing on the above-mentioned cue words showed that the drop in performance of the ID-trained speech recognizer on AD test speech was significantly smaller than vice versa, suggesting that speech with over-emphasized phonetic contrasts may indeed constitute better training material for speech recognition. [Work supported by CMBL, University of Washington.

  18. Private and Inner Speech and the Regulation of Social Speech Communication

    ERIC Educational Resources Information Center

    San Martin Martinez, Conchi; Boada i Calbet, Humbert; Feigenbaum, Peter

    2011-01-01

    To further investigate the possible regulatory role of private and inner speech in the context of referential social speech communications, a set of clear and systematically applied measures is needed. This study addresses this need by introducing a rigorous method for identifying private speech and certain sharply defined instances of inaudible…

  19. Exploring the Role of Brain Oscillations in Speech Perception in Noise: Intelligibility of Isochronously Retimed Speech.

    PubMed

    Aubanel, Vincent; Davis, Chris; Kim, Jeesun

    2016-01-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximize processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioral experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise. PMID:27630552

  20. Exploring the Role of Brain Oscillations in Speech Perception in Noise: Intelligibility of Isochronously Retimed Speech

    PubMed Central

    Aubanel, Vincent; Davis, Chris; Kim, Jeesun

    2016-01-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximize processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioral experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.

  1. Exploring the Role of Brain Oscillations in Speech Perception in Noise: Intelligibility of Isochronously Retimed Speech

    PubMed Central

    Aubanel, Vincent; Davis, Chris; Kim, Jeesun

    2016-01-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximize processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioral experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise. PMID:27630552

  2. Speech and Language Skills of Parents of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Miscimarra, Lara; Iyengar, Sudha K.; Taylor, H. Gerry

    2007-01-01

    Purpose: This study compared parents with histories of speech sound disorders (SSD) to parents without known histories on measures of speech sound production, phonological processing, language, reading, and spelling. Familial aggregation for speech and language disorders was also examined. Method: The participants were 147 parents of children with…

  3. Construction of a Rated Speech Corpus of L2 Learners' Spontaneous Speech

    ERIC Educational Resources Information Center

    Yoon, Su-Youn; Pierce, Lisa; Huensch, Amanda; Juul, Eric; Perkins, Samantha; Sproat, Richard; Hasegawa-Johnson, Mark

    2009-01-01

    This work reports on the construction of a rated database of spontaneous speech produced by second language (L2) learners of English. Spontaneous speech was collected from 28 L2 speakers representing six language backgrounds and five different proficiency levels. Speech was elicited using formats similar to that of the TOEFL iBT and the Speaking…

  4. Spotlight on Speech Codes 2012: The State of Free Speech on Our Nation's Campuses

    ERIC Educational Resources Information Center

    Foundation for Individual Rights in Education (NJ1), 2012

    2012-01-01

    The U.S. Supreme Court has called America's colleges and universities "vital centers for the Nation's intellectual life," but the reality today is that many of these institutions severely restrict free speech and open debate. Speech codes--policies prohibiting student and faculty speech that would, outside the bounds of campus, be protected by the…

  5. DELAYED SPEECH AND LANGUAGE DEVELOPMENT, PRENTICE-HALL FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

    ERIC Educational Resources Information Center

    WOOD, NANCY E.

    WRITTEN FOR SPEECH PATHOLOGY STUDENTS AND PROFESSIONAL WORKERS, THE BOOK BEGINS BY DEFINING LANGUAGE AND SPEECH AND TRACING THE DEVELOPMENT OF SPEECH AND LANGUAGE FROM THE INFANT THROUGH THE 4-YEAR OLD. CAUSAL FACTORS OF DELAYED DEVELOPMENT ARE GIVEN, INCLUDING CENTRAL NERVOUS SYSTEM IMPAIRMENT AND ASSOCIATED BEHAVIORAL CLUES AND LANGUAGE…

  6. Normal and Time-Compressed Speech

    PubMed Central

    Lemke, Ulrike; Kollmeier, Birger; Holube, Inga

    2016-01-01

    Short-term and long-term learning effects were investigated for the German Oldenburg sentence test (OLSA) using original and time-compressed fast speech in noise. Normal-hearing and hearing-impaired participants completed six lists of the OLSA in five sessions. Two groups of normal-hearing listeners (24 and 12 listeners) and two groups of hearing-impaired listeners (9 listeners each) performed the test with original or time-compressed speech. In general, original speech resulted in better speech recognition thresholds than time-compressed speech. Thresholds decreased with repetition for both speech materials. Confirming earlier results, the largest improvements were observed within the first measurements of the first session, indicating a rapid initial adaptation phase. The improvements were larger for time-compressed than for original speech. The novel results on long-term learning effects when using the OLSA indicate a longer phase of ongoing learning, especially for time-compressed speech, which seems to be limited by a floor effect. In addition, for normal-hearing participants, no complete transfer of learning benefits from time-compressed to original speech was observed. These effects should be borne in mind when inviting listeners repeatedly, for example, in research settings.

  7. Adaptation to spectrally-rotated speech.

    PubMed

    Green, Tim; Rosen, Stuart; Faulkner, Andrew; Paterson, Ruth

    2013-08-01

    Much recent interest surrounds listeners' abilities to adapt to various transformations that distort speech. An extreme example is spectral rotation, in which the spectrum of low-pass filtered speech is inverted around a center frequency (2 kHz here). Spectral shape and its dynamics are completely altered, rendering speech virtually unintelligible initially. However, intonation, rhythm, and contrasts in periodicity and aperiodicity are largely unaffected. Four normal hearing adults underwent 6 h of training with spectrally-rotated speech using Continuous Discourse Tracking. They and an untrained control group completed pre- and post-training speech perception tests, for which talkers differed from the training talker. Significantly improved recognition of spectrally-rotated sentences was observed for trained, but not untrained, participants. However, there were no significant improvements in the identification of medial vowels in /bVd/ syllables or intervocalic consonants. Additional tests were performed with speech materials manipulated so as to isolate the contribution of various speech features. These showed that preserving intonational contrasts did not contribute to the comprehension of spectrally-rotated speech after training, and suggested that improvements involved adaptation to altered spectral shape and dynamics, rather than just learning to focus on speech features relatively unaffected by the transformation.

  8. Segregation of unvoiced speech from nonspeech interference.

    PubMed

    Hu, Guoning; Wang, DeLiang

    2008-08-01

    Monaural speech segregation has proven to be extremely challenging. While efforts in computational auditory scene analysis have led to considerable progress in voiced speech segregation, little attention has been given to unvoiced speech, which lacks harmonic structure and has weaker energy, hence more susceptible to interference. This study proposes a new approach to the problem of segregating unvoiced speech from nonspeech interference. The study first addresses the question of how much speech is unvoiced. The segregation process occurs in two stages: Segmentation and grouping. In segmentation, the proposed model decomposes an input mixture into contiguous time-frequency segments by a multiscale analysis of event onsets and offsets. Grouping of unvoiced segments is based on Bayesian classification of acoustic-phonetic features. The proposed model for unvoiced speech segregation joins an existing model for voiced speech segregation to produce an overall system that can deal with both voiced and unvoiced speech. Systematic evaluation shows that the proposed system extracts a majority of unvoiced speech without including much interference, and it performs substantially better than spectral subtraction. PMID:18681616

  9. Speech coding research at Bell Laboratories

    NASA Astrophysics Data System (ADS)

    Atal, Bishnu S.

    2001-05-01

    The field of speech coding is now over 70 years old. It started from the desire to transmit voice signals over telegraph cables. The availability of digital computers in the mid 1960s made it possible to test complex speech coding algorithms rapidly. The introduction of linear predictive coding (LPC) started a new era in speech coding. The fundamental philosophy of speech coding went through a major shift, resulting in a new generation of low bit rate speech coders, such as multi-pulse and code-excited LPC. The semiconductor revolution produced faster and faster DSP chips and made linear predictive coding practical. Code-excited LPC has become the method of choice for low bit rate speech coding applications and is used in most voice transmission standards for cell phones. Digital speech communication is rapidly evolving from circuit-switched to packet-switched networks to provide integrated transmission of voice, data, and video signals. The new communication environment is also moving the focus of speech coding research from compression to low cost, reliable, and secure transmission of voice signals on digital networks, and provides the motivation for creating a new class of speech coders suitable for future applications.

  10. Acquisition of speech rhythm in first language.

    PubMed

    Polyanskaya, Leona; Ordin, Mikhail

    2015-09-01

    Analysis of English rhythm in speech produced by children and adults revealed that speech rhythm becomes increasingly more stress-timed as language acquisition progresses. Children reach the adult-like target by 11 to 12 years. The employed speech elicitation paradigm ensured that the sentences produced by adults and children at different ages were comparable in terms of lexical content, segmental composition, and phonotactic complexity. Detected differences between child and adult rhythm and between rhythm in child speech at various ages cannot be attributed to acquisition of phonotactic language features or vocabulary, and indicate the development of language-specific phonetic timing in the course of acquisition.

  11. Preschoolers Benefit From Visually Salient Speech Cues

    PubMed Central

    Holt, Rachael Frush

    2015-01-01

    Purpose This study explored visual speech influence in preschoolers using 3 developmentally appropriate tasks that vary in perceptual difficulty and task demands. They also examined developmental differences in the ability to use visually salient speech cues and visual phonological knowledge. Method Twelve adults and 27 typically developing 3- and 4-year-old children completed 3 audiovisual (AV) speech integration tasks: matching, discrimination, and recognition. The authors compared AV benefit for visually salient and less visually salient speech discrimination contrasts and assessed the visual saliency of consonant confusions in auditory-only and AV word recognition. Results Four-year-olds and adults demonstrated visual influence on all measures. Three-year-olds demonstrated visual influence on speech discrimination and recognition measures. All groups demonstrated greater AV benefit for the visually salient discrimination contrasts. AV recognition benefit in 4-year-olds and adults depended on the visual saliency of speech sounds. Conclusions Preschoolers can demonstrate AV speech integration. Their AV benefit results from efficient use of visually salient speech cues. Four-year-olds, but not 3-year-olds, used visual phonological knowledge to take advantage of visually salient speech cues, suggesting possible developmental differences in the mechanisms of AV benefit. PMID:25322336

  12. Speech Enhancement based on Compressive Sensing Algorithm

    NASA Astrophysics Data System (ADS)

    Sulong, Amart; Gunawan, Teddy S.; Khalifa, Othman O.; Chebil, Jalel

    2013-12-01

    There are various methods, in performance of speech enhancement, have been proposed over the years. The accurate method for the speech enhancement design mainly focuses on quality and intelligibility. The method proposed with high performance level. A novel speech enhancement by using compressive sensing (CS) is a new paradigm of acquiring signals, fundamentally different from uniform rate digitization followed by compression, often used for transmission or storage. Using CS can reduce the number of degrees of freedom of a sparse/compressible signal by permitting only certain configurations of the large and zero/small coefficients, and structured sparsity models. Therefore, CS is significantly provides a way of reconstructing a compressed version of the speech in the original signal by taking only a small amount of linear and non-adaptive measurement. The performance of overall algorithms will be evaluated based on the speech quality by optimise using informal listening test and Perceptual Evaluation of Speech Quality (PESQ). Experimental results show that the CS algorithm perform very well in a wide range of speech test and being significantly given good performance for speech enhancement method with better noise suppression ability over conventional approaches without obvious degradation of speech quality.

  13. Speech and Language Disorders in the School Setting

    MedlinePlus

    ... and Swallowing / Development Frequently Asked Questions: Speech and Language Disorders in the School Setting What types of speech and language disorders affect school-age children ? Do speech-language ...

  14. Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

    ERIC Educational Resources Information Center

    Maas, Edwin; Mailend, Marja-Liisa

    2012-01-01

    Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…

  15. Perceptual centres in speech - an acoustic analysis

    NASA Astrophysics Data System (ADS)

    Scott, Sophie Kerttu

    Perceptual centres, or P-centres, represent the perceptual moments of occurrence of acoustic signals - the 'beat' of a sound. P-centres underlie the perception and production of rhythm in perceptually regular speech sequences. P-centres have been modelled both in speech and non speech (music) domains. The three aims of this thesis were toatest out current P-centre models to determine which best accounted for the experimental data bto identify a candidate parameter to map P-centres onto (a local approach) as opposed to the previous global models which rely upon the whole signal to determine the P-centre the final aim was to develop a model of P-centre location which could be applied to speech and non speech signals. The first aim was investigated by a series of experiments in which a) speech from different speakers was investigated to determine whether different models could account for variation between speakers b) whether rendering the amplitude time plot of a speech signal affects the P-centre of the signal c) whether increasing the amplitude at the offset of a speech signal alters P-centres in the production and perception of speech. The second aim was carried out by a) manipulating the rise time of different speech signals to determine whether the P-centre was affected, and whether the type of speech sound ramped affected the P-centre shift b) manipulating the rise time and decay time of a synthetic vowel to determine whether the onset alteration was had more affect on P-centre than the offset manipulation c) and whether the duration of a vowel affected the P-centre, if other attributes (amplitude, spectral contents) were held constant. The third aim - modelling P-centres - was based on these results. The Frequency dependent Amplitude Increase Model of P-centre location (FAIM) was developed using a modelling protocol, the APU GammaTone Filterbank and the speech from different speakers. The P-centres of the stimuli corpus were highly predicted by attributes of

  16. Speech perception as an active cognitive process

    PubMed Central

    Heald, Shannon L. M.; Nusbaum, Howard C.

    2014-01-01

    One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming realtively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing by masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augementation or therapy. PMID

  17. Prediction and constraint in audiovisual speech perception

    PubMed Central

    Peelle, Jonathan E.; Sommers, Mitchell S.

    2015-01-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported

  18. Speech detection in spatial and nonspatial speech maskers.

    PubMed

    Balakrishnan, Uma; Freyman, Richard L

    2008-05-01

    The effect of perceived spatial differences on masking release was examined using a 4AFC speech detection paradigm. Targets were 20 words produced by a female talker. Maskers were recordings of continuous streams of nonsense sentences spoken by two female talkers and mixed into each of two channels (two talker, and the same masker time reversed). Two masker spatial conditions were employed: "RF" with a 4 ms time lead to the loudspeaker 60 degrees horizontally to the right, and "FR" with the time lead to the front (0 degrees ) loudspeaker. The reference nonspatial "F" masker was presented from the front loudspeaker only. Target presentation was always from the front loudspeaker. In Experiment 1, target detection threshold for both natural and time-reversed spatial maskers was 17-20 dB lower than that for the nonspatial masker, suggesting that significant release from informational masking occurs with spatial speech maskers regardless of masker understandability. In Experiment 2, the effectiveness of the FR and RF maskers was evaluated as the right loudspeaker output was attenuated until the two-source maskers were indistinguishable from the F masker, as measured independently in a discrimination task. Results indicated that spatial release from masking can be observed with barely noticeable target-masker spatial differences.

  19. Open Microphone Speech Understanding: Correct Discrimination Of In Domain Speech

    NASA Technical Reports Server (NTRS)

    Hieronymus, James; Aist, Greg; Dowding, John

    2006-01-01

    An ideal spoken dialogue system listens continually and determines which utterances were spoken to it, understands them and responds appropriately while ignoring the rest This paper outlines a simple method for achieving this goal which involves trading a slightly higher false rejection rate of in domain utterances for a higher correct rejection rate of Out of Domain (OOD) utterances. The system recognizes semantic entities specified by a unification grammar which is specialized by Explanation Based Learning (EBL). so that it only uses rules which are seen in the training data. The resulting grammar has probabilities assigned to each construct so that overgeneralizations are not a problem. The resulting system only recognizes utterances which reduce to a valid logical form which has meaning for the system and rejects the rest. A class N-gram grammar has been trained on the same training data. This system gives good recognition performance and offers good Out of Domain discrimination when combined with the semantic analysis. The resulting systems were tested on a Space Station Robot Dialogue Speech Database and a subset of the OGI conversational speech database. Both systems run in real time on a PC laptop and the present performance allows continuous listening with an acceptably low false acceptance rate. This type of open microphone system has been used in the Clarissa procedure reading and navigation spoken dialogue system which is being tested on the International Space Station.

  20. Earlier speech exposure does not accelerate speech acquisition.

    PubMed

    Peña, Marcela; Werker, Janet F; Dehaene-Lambertz, Ghislaine

    2012-08-15

    Critical periods in language acquisition have been discussed primarily with reference to studies of people who are deaf or bilingual. Here, we provide evidence on the opening of sensitivity to the linguistic environment by studying the response to a change of phoneme at a native and nonnative phonetic boundary in full-term and preterm human infants using event-related potentials. Full-term infants show a decline in their discrimination of nonnative phonetic contrasts between 9 and 12 months of age. Because the womb is a high-frequency filter, many phonemes are strongly degraded in utero. Preterm infants thus benefit from earlier and richer exposure to broadcast speech. We find that preterms do not take advantage of this enriched linguistic environment: the decrease in amplitude of the mismatch response to a nonnative change of phoneme at the end of the first year of life was dependent on maturational age and not on the duration of exposure to broadcast speech. The shaping of phonological representations by the environment is thus strongly constrained by brain maturation factors.

  1. Monaural speech intelligibility and detection in maskers with varying amounts of spectro-temporal speech features.

    PubMed

    Schubotz, Wiebke; Brand, Thomas; Kollmeier, Birger; Ewert, Stephan D

    2016-07-01

    Speech intelligibility is strongly affected by the presence of maskers. Depending on the spectro-temporal structure of the masker and its similarity to the target speech, different masking aspects can occur which are typically referred to as energetic, amplitude modulation, and informational masking. In this study speech intelligibility and speech detection was measured in maskers that vary systematically in the time-frequency domain from steady-state noise to a single interfering talker. Male and female target speech was used in combination with maskers based on speech for the same or different gender. Observed data were compared to predictions of the speech intelligibility index, extended speech intelligibility index, multi-resolution speech-based envelope-power-spectrum model, and the short-time objective intelligibility measure. The different models served as analysis tool to help distinguish between the different masking aspects. Comparison shows that overall masking can to a large extent be explained by short-term energetic masking. However, the other masking aspects (amplitude modulation an informational masking) influence speech intelligibility as well. Additionally, it was obvious that all models showed considerable deviations from the data. Therefore, the current study provides a benchmark for further evaluation of speech prediction models. PMID:27475175

  2. Reasons why current speech-enhancement algorithms do not improve speech intelligibility and suggested solutions

    PubMed Central

    Loizou, Philipos C.; Kim, Gibak

    2011-01-01

    Existing speech enhancement algorithms can improve speech quality but not speech intelligibility, and the reasons for that are unclear. In the present paper, we present a theoretical framework that can be used to analyze potential factors that can influence the intelligibility of processed speech. More specifically, this framework focuses on the fine-grain analysis of the distortions introduced by speech enhancement algorithms. It is hypothesized that if these distortions are properly controlled, then large gains in intelligibility can be achieved. To test this hypothesis, intelligibility tests are conducted with human listeners in which we present processed speech with controlled speech distortions. The aim of these tests is to assess the perceptual effect of the various distortions that can be introduced by speech enhancement algorithms on speech intelligibility. Results with three different enhancement algorithms indicated that certain distortions are more detrimental to speech intelligibility degradation than others. When these distortions were properly controlled, however, large gains in intelligibility were obtained by human listeners, even by spectral-subtractive algorithms which are known to degrade speech quality and intelligibility. PMID:21909285

  3. Speech levels in meeting rooms and the probability of speech privacy problems.

    PubMed

    Bradley, J S; Gover, B N

    2010-02-01

    Speech levels were measured in a large number of meetings and meeting rooms to better understand their influence on the speech privacy of closed meeting rooms. The effects of room size and number of occupants on average speech levels, for meetings with and without sound amplification, were investigated. The characteristics of the statistical variations of speech levels were determined in terms of speech levels measured over 10 s intervals at locations inside, but near the periphery of the meeting rooms. A procedure for predicting the probability of speech being audible or intelligible at points outside meeting rooms is proposed. It is based on the statistics of meeting room speech levels, in combination with the sound insulation characteristics of the room and the ambient noise levels at locations outside the room. PMID:20136204

  4. Method and apparatus for obtaining complete speech signals for speech recognition applications

    NASA Technical Reports Server (NTRS)

    Abrash, Victor (Inventor); Cesari, Federico (Inventor); Franco, Horacio (Inventor); George, Christopher (Inventor); Zheng, Jing (Inventor)

    2009-01-01

    The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.

  5. [Electrographic Correlations of Inner Speech].

    PubMed

    Kiroy, V N; Bakhtin, O M; Minyaeva, N R; Lazurenko, D M; Aslanyan, E V; Kiroy, R I

    2015-01-01

    On the purpose to detect in EEG specific patterns associated with any verbal performance the gamma activity were investigated. The technique which allows the subject to initiate the mental pronunciation of words and phrases (inner speech) was created. Wavelet analysis of EEG has been experimentally demonstrated that the preparation and implementation stages are related to the specific spatio-temporal patterns in frequency range 64-68 Hz. Sustainable reproduction and efficient identification of such patterns can solve the fundamentally problem of alphabet control commands formation for Brain Computer Interface and Brain to Braine Interface systems. PMID:26860004

  6. Speech recognition technology: a critique.

    PubMed Central

    Levinson, S E

    1995-01-01

    This paper introduces the session on advanced speech recognition technology. The two papers comprising this session argue that current technology yields a performance that is only an order of magnitude in error rate away from human performance and that incremental improvements will bring us to that desired level. I argue that, to the contrary, present performance is far removed from human performance and a revolution in our thinking is required to achieve the goal. It is further asserted that to bring about the revolution more effort should be expended on basic research and less on trying to prematurely commercialize a deficient technology. PMID:7479808

  7. Speech activity detection using accelerometer.

    PubMed

    Matic, Aleksandar; Osmani, Venet; Mayora, Oscar

    2012-01-01

    The level of social activity is linked to the overall wellbeing and to various disorders, including stress. In this regard, a myriad of automatic solutions for monitoring social interactions have been proposed, usually including audio data analysis. Such approaches often face legal and ethical issues and they may also raise privacy concerns in monitored subjects thus affecting their natural behaviour. In this paper we present an accelerometer-based speech detection which does not require capturing sensitive data while being an easily applicable and a cost-effective solution.

  8. The effects of selective attention and speech acoustics on neural speech-tracking in a multi-talker scene.

    PubMed

    Rimmele, Johanna M; Zion Golumbic, Elana; Schröger, Erich; Poeppel, David

    2015-07-01

    Attending to one speaker in multi-speaker situations is challenging. One neural mechanism proposed to underlie the ability to attend to a particular speaker is phase-locking of low-frequency activity in auditory cortex to speech's temporal envelope ("speech-tracking"), which is more precise for attended speech. However, it is not known what brings about this attentional effect, and specifically if it reflects enhanced processing of the fine structure of attended speech. To investigate this question we compared attentional effects on speech-tracking of natural versus vocoded speech which preserves the temporal envelope but removes the fine structure of speech. Pairs of natural and vocoded speech stimuli were presented concurrently and participants attended to one stimulus and performed a detection task while ignoring the other stimulus. We recorded magnetoencephalography (MEG) and compared attentional effects on the speech-tracking response in auditory cortex. Speech-tracking of natural, but not vocoded, speech was enhanced by attention, whereas neural tracking of ignored speech was similar for natural and vocoded speech. These findings suggest that the more precise speech-tracking of attended natural speech is related to processing its fine structure, possibly reflecting the application of higher-order linguistic processes. In contrast, when speech is unattended its fine structure is not processed to the same degree and thus elicits less precise speech-tracking more similar to vocoded speech.

  9. Teaching Speech to Your Language Delayed Child.

    ERIC Educational Resources Information Center

    Rees, Roger J.; Pryor, Jan, Ed.

    1980-01-01

    Intended for parents, the booklet focuses on the speech and language development of children with language delays. The following topics are among those considered: the parent's role in the initial diagnosis of deafness, intellectual handicap, and neurological difficulties; diagnoses and single causes of difficultiy with speech; what to say to…

  10. Pulmonic Ingressive Speech in Shetland English

    ERIC Educational Resources Information Center

    Sundkvist, Peter

    2012-01-01

    This paper presents a study of pulmonic ingressive speech, a severely understudied phenomenon within varieties of English. While ingressive speech has been reported for several parts of the British Isles, New England, and eastern Canada, thus far Newfoundland appears to be the only locality where researchers have managed to provide substantial…

  11. Only Speech Codes Should Be Censored

    ERIC Educational Resources Information Center

    Pavela, Gary

    2006-01-01

    In this article, the author discusses the enforcement of "hate speech" codes and confirms research that considers why U.S. colleges and universities continue to promulgate student disciplinary rules prohibiting expression that "subordinates" others or is "demeaning, offensive, or hateful." Such continued adherence to speech codes is by now…

  12. Speech Genres in Writing Cognitive Artifacts.

    ERIC Educational Resources Information Center

    Shambaugh, R. Neal

    This paper reports on the analysis of an instructional text on the basis of M. Bakhtin's (1986) notion of speech genres, which is used to theorize the different influences on the writing of an instructional text. Speech genres are used to reveal the multiple voices inherent in any text: the writer's, the reader's, and the text's. The…

  13. Anatomy and Physiology of the Speech Mechanism.

    ERIC Educational Resources Information Center

    Sheets, Boyd V.

    This monograph on the anatomical and physiological aspects of the speech mechanism stresses the importance of a general understanding of the process of verbal communication. Contents include "Positions of the Body,""Basic Concepts Linked with the Speech Mechanism,""The Nervous System,""The Respiratory System--Sound-Power Source,""The…

  14. The Need for a Speech Corpus

    ERIC Educational Resources Information Center

    Campbell, Dermot F.; McDonnell, Ciaran; Meinardi, Marti; Richardson, Bunny

    2007-01-01

    This paper outlines the ongoing construction of a speech corpus for use by applied linguists and advanced EFL/ESL students. In the first part, sections 1-4, the need for improvements in the teaching of listening skills and pronunciation practice for EFL/ESL students is noted. It is argued that the use of authentic native-to-native speech is…

  15. Localization of Sublexical Speech Perception Components

    ERIC Educational Resources Information Center

    Turkeltaub, Peter E.; Coslett, H. Branch

    2010-01-01

    Models of speech perception are in general agreement with respect to the major cortical regions involved, but lack precision with regard to localization and lateralization of processing units. To refine these models we conducted two Activation Likelihood Estimation (ALE) meta-analyses of the neuroimaging literature on sublexical speech perception.…

  16. Speech-Language Pathology: Preparing Early Interventionists

    ERIC Educational Resources Information Center

    Prelock, Patricia A.; Deppe, Janet

    2015-01-01

    The purpose of this article is to explain the role of speech-language pathology in early intervention. The expected credentials of professionals in the field are described, and the current numbers of practitioners serving young children are identified. Several resource documents available from the American Speech-­Language Hearing Association are…

  17. A Foster Home Approach to Speech Therapy

    ERIC Educational Resources Information Center

    Hatten, John T.; Hatten, Pequetti A.

    1971-01-01

    A language development program for a 6-year-old boy with limited language development combined an operant approach in the foster home, where both parents were speech clinicians, and daily 3-hour therapy sessions at a university speech and hearing clinic. (KW)

  18. Speech-Language-Pathology and Audiology Handbook.

    ERIC Educational Resources Information Center

    New York State Education Dept., Albany. Office of the Professions.

    The handbook contains State Education Department rules and regulations that govern speech-language pathology and audiology in New York State. The handbook also describes licensure and first registration as a licensed speech-language pathologist or audiologist. The introduction discusses professional regulation in New York State while the second…

  19. The Lombard Effect on Alaryngeal Speech.

    ERIC Educational Resources Information Center

    Zeine, Lina; Brandt, John F.

    1988-01-01

    The study investigated the Lombard effect (evoking increased speech intensity by applying masking noise to ears of talker) on the speech of esophageal talkers, artificial larynx users, and normal speakers. The noise condition produced the highest intensity increase in the esophageal speakers. (Author/DB)

  20. Performing speech recognition research with hypercard

    NASA Technical Reports Server (NTRS)

    Shepherd, Chip

    1993-01-01

    The purpose of this paper is to describe a HyperCard-based system for performing speech recognition research and to instruct Human Factors professionals on how to use the system to obtain detailed data about the user interface of a prototype speech recognition application.

  1. Tampa Bay International Business Summit Keynote Speech

    NASA Technical Reports Server (NTRS)

    Clary, Christina

    2011-01-01

    A keynote speech outlining the importance of collaboration and diversity in the workplace. The 20-minute speech describes NASA's challenges and accomplishments over the years and what lies ahead. Topics include: diversity and inclusion principles, international cooperation, Kennedy Space Center planning and development, opportunities for cooperation, and NASA's vision for exploration.

  2. Building Searchable Collections of Enterprise Speech Data.

    ERIC Educational Resources Information Center

    Cooper, James W.; Viswanathan, Mahesh; Byron, Donna; Chan, Margaret

    The study has applied speech recognition and text-mining technologies to a set of recorded outbound marketing calls and analyzed the results. Since speaker-independent speech recognition technology results in a significantly lower recognition rate than that found when the recognizer is trained for a particular speaker, a number of post-processing…

  3. Preschoolers Benefit from Visually Salient Speech Cues

    ERIC Educational Resources Information Center

    Lalonde, Kaylah; Holt, Rachael Frush

    2015-01-01

    Purpose: This study explored visual speech influence in preschoolers using 3 developmentally appropriate tasks that vary in perceptual difficulty and task demands. They also examined developmental differences in the ability to use visually salient speech cues and visual phonological knowledge. Method: Twelve adults and 27 typically developing 3-…

  4. Mothers' Speech in Three Social Classes

    ERIC Educational Resources Information Center

    Snow, C. E.; And Others

    1976-01-01

    Functional and linguistic aspects of the speech of Dutch-speaking mothers from three social classes to their two-year-old children were studied to test the hypothesis that simplified speech is crucial to language acquisition. Available from Plenum Publishing Corp., 227 W. 17th St., New York, NY 10011. (Author/RM)

  5. Acoustic characteristics of listener-constrained speech

    NASA Astrophysics Data System (ADS)

    Ashby, Simone; Cummins, Fred

    2003-04-01

    Relatively little is known about the acoustical modifications speakers employ to meet the various constraints-auditory, linguistic and otherwise-of their listeners. Similarly, the manner by which perceived listener constraints interact with speakers' adoption of specialized speech registers is poorly Hypo (H&H) theory offers a framework for examining the relationship between speech production and output-oriented goals for communication, suggesting that under certain circumstances speakers may attempt to minimize phonetic ambiguity by employing a ``hyperarticulated'' speaking style (Lindblom, 1990). It remains unclear, however, what the acoustic correlates of hyperarticulated speech are, and how, if at all, we might expect phonetic properties to change respective to different listener-constrained conditions. This paper is part of a preliminary investigation concerned with comparing the prosodic characteristics of speech produced across a range of listener constraints. Analyses are drawn from a corpus of read hyperarticulated speech data comprising eight adult, female speakers of English. Specialized registers include speech to foreigners, infant-directed speech, speech produced under noisy conditions, and human-machine interaction. The authors gratefully acknowledge financial support of the Irish Higher Education Authority, allocated to Fred Cummins for collaborative work with Media Lab Europe.

  6. Speech neglect: A strange educational blind spot

    NASA Astrophysics Data System (ADS)

    Harris, Katherine Safford

    2005-09-01

    Speaking is universally acknowledged as an important human talent, yet as a topic of educated common knowledge, it is peculiarly neglected. Partly, this is a consequence of the relatively recent growth of research on speech perception, production, and development, but also a function of the way that information is sliced up by undergraduate colleges. Although the basic acoustic mechanism of vowel production was known to Helmholtz, the ability to view speech production as a physiological event is evolving even now with such techniques as fMRI. Intensive research on speech perception emerged only in the early 1930s as Fletcher and the engineers at Bell Telephone Laboratories developed the transmission of speech over telephone lines. The study of speech development was revolutionized by the papers of Eimas and his colleagues on speech perception in infants in the 1970s. Dissemination of knowledge in these fields is the responsibility of no single academic discipline. It forms a center for two departments, Linguistics, and Speech and Hearing, but in the former, there is a heavy emphasis on other aspects of language than speech and, in the latter, a focus on clinical practice. For psychologists, it is a rather minor component of a very diverse assembly of topics. I will focus on these three fields in proposing possible remedies.

  7. Assessing Speech Discrimination in Individual Infants

    ERIC Educational Resources Information Center

    Houston, Derek M.; Horn, David L.; Qi, Rong; Ting, Jonathan Y.; Gao, Sujuan

    2007-01-01

    Assessing speech discrimination skills in individual infants from clinical populations (e.g., infants with hearing impairment) has important diagnostic value. However, most infant speech discrimination paradigms have been designed to test group effects rather than individual differences. Other procedures suffer from high attrition rates. In this…

  8. The Oral Speech Mechanism Screening Examination (OSMSE).

    ERIC Educational Resources Information Center

    St. Louis, Kenneth O.; Ruscello, Dennis M.

    Although speech-language pathologists are expected to be able to administer and interpret oral examinations, there are currently no screening tests available that provide careful administration instructions and data for intra-examiner and inter-examiner reliability. The Oral Speech Mechanism Screening Examination (OSMSE) is designed primarily for…

  9. How Should a Speech Recognizer Work?

    ERIC Educational Resources Information Center

    Scharenborg, Odette; Norris, Dennis; ten Bosch, Louis; McQueen, James M.

    2005-01-01

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that…

  10. Treatment Intensity and Childhood Apraxia of Speech

    ERIC Educational Resources Information Center

    Namasivayam, Aravind K.; Pukonen, Margit; Goshulak, Debra; Hard, Jennifer; Rudzicz, Frank; Rietveld, Toni; Maassen, Ben; Kroll, Robert; van Lieshout, Pascal

    2015-01-01

    Background: Intensive treatment has been repeatedly recommended for the treatment of speech deficits in childhood apraxia of speech (CAS). However, differences in treatment outcomes as a function of treatment intensity have not been systematically studied in this population. Aim: To investigate the effects of treatment intensity on outcome…

  11. Portable Tactile Aids for Speech Perception.

    ERIC Educational Resources Information Center

    Lynch, Michael P.; And Others

    1989-01-01

    Experiments using portable tactile aids in speech perception are reviewed, focusing on training studies, additive benefit studies, and device comparison studies (including the "Tactaid II,""Tactaid V,""Tacticon 1600," and "Tickle Talker"). The potential of tactual information in perception of the overall speech code by hearing-impaired individuals…

  12. Hypnosis and the Reduction of Speech Anxiety.

    ERIC Educational Resources Information Center

    Barker, Larry L.; And Others

    The purposes of this paper are (1) to review the background and nature of hypnosis, (2) to synthesize research on hypnosis related to speech communication, and (3) to delineate and compare two potential techniques for reducing speech anxiety--hypnosis and systematic desensitization. Hypnosis has been defined as a mental state characterised by…

  13. Effects of Syllable Frequency in Speech Production

    ERIC Educational Resources Information Center

    Cholin, Joana; Levelt, Willem J. M.; Schiller, Niels O.

    2006-01-01

    In the speech production model proposed by [Levelt, W. J. M., Roelofs, A., Meyer, A. S. (1999). A theory of lexical access in speech production. "Behavioral and Brain Sciences," 22, pp. 1-75.], syllables play a crucial role at the interface of phonological and phonetic encoding. At this interface, abstract phonological syllables are translated…

  14. CLEFT PALATE. FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

    ERIC Educational Resources Information Center

    RUTHERFORD, DAVID; WESTLAKE, HAROLD

    DESIGNED TO PROVIDE AN ESSENTIAL CORE OF INFORMATION, THIS BOOK TREATS NORMAL AND ABNORMAL DEVELOPMENT, STRUCTURE, AND FUNCTION OF THE LIPS AND PALATE AND THEIR RELATIONSHIPS TO CLEFT LIP AND CLEFT PALATE SPEECH. PROBLEMS OF PERSONAL AND SOCIAL ADJUSTMENT, HEARING, AND SPEECH IN CLEFT LIP OR CLEFT PALATE INDIVIDUALS ARE DISCUSSED. NASAL RESONANCE…

  15. Education in the 80's: Speech Communication.

    ERIC Educational Resources Information Center

    Friedrich, Gustav W., Ed.

    Taken together, the 20 chapters in this book provide many suggestions, predictions, alternatives, innovations, and improvements in the speech communication curriculum that can be either undertaken or accomplished during the 1980s. The first five chapters speculate positively about the future of speech communication instruction in five of its most…

  16. Reliability of Speech Diadochokinetic Test Measurement

    ERIC Educational Resources Information Center

    Gadesmann, Miriam; Miller, Nick

    2008-01-01

    Background: Measures of articulatory diadochokinesis (DDK) are widely used in the assessment of motor speech disorders and they play a role in detecting abnormality, monitoring speech performance changes and classifying syndromes. Although in clinical practice DDK is generally measured perceptually, without support from instrumental methods that…

  17. [Thematic Issue: Career Trends in Speech Communication.

    ERIC Educational Resources Information Center

    Hall, Robert, Ed.

    To analyze historical trends in job opportunities available to speech communications graduates, a content analysis was conducted of Speech Communication Association bulletins over a ten-year period (1967 to 1977). All bulletin listings were analyzed for state, type of institution, rank, areas of specialization, job requirements, and salary range.…

  18. Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

    ERIC Educational Resources Information Center

    Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

    2004-01-01

    This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…

  19. Speech vs. singing: infants choose happier sounds

    PubMed Central

    Corbeil, Marieve; Trehub, Sandra E.; Peretz, Isabelle

    2013-01-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4–13 months of age were exposed to happy-sounding infant-directed speech vs. hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken vs. sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song vs. a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age. PMID:23805119

  20. Repeated Speech Errors: Evidence for Learning

    ERIC Educational Resources Information Center

    Humphreys, Karin R.; Menzies, Heather; Lake, Johanna K.

    2010-01-01

    Three experiments elicited phonological speech errors using the SLIP procedure to investigate whether there is a tendency for speech errors on specific words to reoccur, and whether this effect can be attributed to implicit learning of an incorrect mapping from lemma to phonology for that word. In Experiment 1, when speakers made a phonological…

  1. The Effects of TV on Speech Education

    ERIC Educational Resources Information Center

    Gocen, Gokcen; Okur, Alpaslan

    2013-01-01

    Generally, the speaking aspect is not properly debated when discussing the positive and negative effects of television (TV), especially on children. So, to highlight this point, this study was first initialized by asking the question: "What are the effects of TV on speech?" and secondly, to transform the effects that TV has on speech in a…

  2. Methodological Choices in Rating Speech Samples

    ERIC Educational Resources Information Center

    O'Brien, Mary Grantham

    2016-01-01

    Much pronunciation research critically relies upon listeners' judgments of speech samples, but researchers have rarely examined the impact of methodological choices. In the current study, 30 German native listeners and 42 German L2 learners (L1 English) rated speech samples produced by English-German L2 learners along three continua: accentedness,…

  3. Speech after Mao: Literature and Belonging

    ERIC Educational Resources Information Center

    Hsieh, Victoria Linda

    2012-01-01

    This dissertation aims to understand the apparent failure of speech in post-Mao literature to fulfill its conventional functions of representation and communication. In order to understand this pattern, I begin by looking back on the utility of speech for nation-building in modern China. In addition to literary analysis of key authors and works,…

  4. Speech and Language Delays in Identical Twins.

    ERIC Educational Resources Information Center

    Bentley, Pat

    Following a literature review on speech and language development of twins, case studies are presented of six sets of identical twins screened for entrance into kindergarten. Five sets of the twins and one boy from the sixth set failed to pass the screening test, particularly the speech and language section, and were referred for therapy to correct…

  5. School Principal Speech about Fiscal Mismanagement

    ERIC Educational Resources Information Center

    Hassenpflug, Ann

    2015-01-01

    A review of two recent federal court cases concerning school principals who experienced adverse job actions after they engaged in speech about fiscal misconduct by other employees indicates that the courts found that the principal's speech was made as part of his or her job duties and was not protected by the First Amendment.

  6. Voice Modulations in German Ironic Speech

    ERIC Educational Resources Information Center

    Scharrer, Lisa; Christmann, Ursula; Knoll, Monja

    2011-01-01

    Previous research has shown that in different languages ironic speech is acoustically modulated compared to literal speech, and these modulations are assumed to aid the listener in the comprehension process by acting as cues that mark utterances as ironic. The present study was conducted to identify paraverbal features of German "ironic criticism"…

  7. Pitch-Learning Algorithm For Speech Encoders

    NASA Technical Reports Server (NTRS)

    Bhaskar, B. R. Udaya

    1988-01-01

    Adaptive algorithm detects and corrects errors in sequence of estimates of pitch period of speech. Algorithm operates in conjunction with techniques used to estimate pitch period. Used in such parametric and hybrid speech coders as linear predictive coders and adaptive predictive coders.

  8. The Neural Substrates of Infant Speech Perception

    ERIC Educational Resources Information Center

    Homae, Fumitaka; Watanabe, Hama; Taga, Gentaro

    2014-01-01

    Infants often pay special attention to speech sounds, and they appear to detect key features of these sounds. To investigate the neural foundation of speech perception in infants, we measured cortical activation using near-infrared spectroscopy. We presented the following three types of auditory stimuli while 3-month-old infants watched a silent…

  9. Speech masking and cancelling and voice obscuration

    DOEpatents

    Holzrichter, John F.

    2013-09-10

    A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate or contacting a user's neck or head skin tissue for sensing speech production information.

  10. Analog Acoustic Expression in Speech Communication

    ERIC Educational Resources Information Center

    Shintel, Hadas; Nusbaum, Howard C.; Okrent, Arika

    2006-01-01

    We present the first experimental evidence of a phenomenon in speech communication we call "analog acoustic expression." Speech is generally thought of as conveying information in two distinct ways: discrete linguistic-symbolic units such as words and sentences represent linguistic meaning, and continuous prosodic forms convey information about…

  11. Visual speech gestures modulate efferent auditory system.

    PubMed

    Namasivayam, Aravind Kumar; Wong, Wing Yiu Stephanie; Sharma, Dinaay; van Lieshout, Pascal

    2015-03-01

    Visual and auditory systems interact at both cortical and subcortical levels. Studies suggest a highly context-specific cross-modal modulation of the auditory system by the visual system. The present study builds on this work by sampling data from 17 young healthy adults to test whether visual speech stimuli evoke different responses in the auditory efferent system compared to visual non-speech stimuli. The descending cortical influences on medial olivocochlear (MOC) activity were indirectly assessed by examining the effects of contralateral suppression of transient-evoked otoacoustic emissions (TEOAEs) at 1, 2, 3 and 4 kHz under three conditions: (a) in the absence of any contralateral noise (Baseline), (b) contralateral noise + observing facial speech gestures related to productions of vowels /a/ and /u/ and (c) contralateral noise + observing facial non-speech gestures related to smiling and frowning. The results are based on 7 individuals whose data met strict recording criteria and indicated a significant difference in TEOAE suppression between observing speech gestures relative to the non-speech gestures, but only at the 1 kHz frequency. These results suggest that observing a speech gesture compared to a non-speech gesture may trigger a difference in MOC activity, possibly to enhance peripheral neural encoding. If such findings can be reproduced in future research, sensory perception models and theories positing the downstream convergence of unisensory streams of information in the cortex may need to be revised.

  12. The Modulation Transfer Function for Speech Intelligibility

    PubMed Central

    Elliott, Taffeta M.; Theunissen, Frédéric E.

    2009-01-01

    We systematically determined which spectrotemporal modulations in speech are necessary for comprehension by human listeners. Speech comprehension has been shown to be robust to spectral and temporal degradations, but the specific relevance of particular degradations is arguable due to the complexity of the joint spectral and temporal information in the speech signal. We applied a novel modulation filtering technique to recorded sentences to restrict acoustic information quantitatively and to obtain a joint spectrotemporal modulation transfer function for speech comprehension, the speech MTF. For American English, the speech MTF showed the criticality of low modulation frequencies in both time and frequency. Comprehension was significantly impaired when temporal modulations <12 Hz or spectral modulations <4 cycles/kHz were removed. More specifically, the MTF was bandpass in temporal modulations and low-pass in spectral modulations: temporal modulations from 1 to 7 Hz and spectral modulations <1 cycles/kHz were the most important. We evaluated the importance of spectrotemporal modulations for vocal gender identification and found a different region of interest: removing spectral modulations between 3 and 7 cycles/kHz significantly increases gender misidentifications of female speakers. The determination of the speech MTF furnishes an additional method for producing speech signals with reduced bandwidth but high intelligibility. Such compression could be used for audio applications such as file compression or noise removal and for clinical applications such as signal processing for cochlear implants. PMID:19266016

  13. Teaching Indirect Speech: Deixis Points the Way.

    ERIC Educational Resources Information Center

    Harman, Ian P.

    1990-01-01

    Suggests an alternative approach to the teaching of indirect or reported speech. Deixis is proposed as a means of clarifying the anomalies of reported speech. The problem is assessed from a grammatical and semantic point of view in the reporting of statements (as opposed to the reporting of questions or commands). (GLR)

  14. Hate Speech: A Call to Principles.

    ERIC Educational Resources Information Center

    Klepper, William M.; Bakken, Timothy

    1997-01-01

    Reviews the history of First Amendment rulings as they relate to speech codes and of other regulations directed at the content of speech. A case study, based on an experience at Trenton State College, details the legal constraints, principles, and practices that Student Affairs administrators should be aware of regarding such situations.…

  15. Fighting Words. The Politics of Hateful Speech.

    ERIC Educational Resources Information Center

    Marcus, Laurence R.

    This book explores issues typified by a series of hateful speech events at Kean College (New Jersey) and on other U.S. campuses in the early 1990s, by examining the dichotomies that exist between the First and the Fourteenth Amendments and between civil liberties and civil rights, and by contrasting the values of free speech and academic freedom…

  16. Crossed Apraxia of Speech: A Case Report

    ERIC Educational Resources Information Center

    Balasubramanian, Venu; Max, Ludo

    2004-01-01

    The present study reports on the first case of crossed apraxia of speech (CAS) in a 69-year-old right-handed female (SE). The possibility of occurrence of apraxia of speech (AOS) following right hemisphere lesion is discussed in the context of known occurrences of ideomotor apraxias and acquired neurogenic stuttering in several cases with right…

  17. General-Purpose Monitoring during Speech Production

    ERIC Educational Resources Information Center

    Ries, Stephanie; Janssen, Niels; Dufau, Stephane; Alario, F.-Xavier; Burle, Boris

    2011-01-01

    The concept of "monitoring" refers to our ability to control our actions on-line. Monitoring involved in speech production is often described in psycholinguistic models as an inherent part of the language system. We probed the specificity of speech monitoring in two psycholinguistic experiments where electroencephalographic activities were…

  18. Electrocardiographic anxiety profiles improve speech anxiety.

    PubMed

    Kim, Pyoung Won; Kim, Seung Ae; Jung, Keun-Hwa

    2012-12-01

    The present study was to set out in efforts to determine the effect of electrocardiographic (ECG) feedback on the performance in speech anxiety. Forty-six high school students participated in a speech performance educational program. They were randomly divided into two groups, an experimental group with ECG feedback (N = 21) and a control group (N = 25). Feedback was given with video recording in the control, whereas in the experimental group, an additional ECG feedback was provided. Speech performance was evaluated by the Korean Broadcasting System (KBS) speech ability test, which determines the 10 different speaking categories. ECG was recorded during rest and speech, together with a video recording of the speech performance. Changes in R-R intervals were used to reflect anxiety profiles. Three trials were performed for 3-week program. Results showed that the subjects with ECG feedback revealed a significant improvement in speech performance and anxiety states, which compared to those in the control group. These findings suggest that visualization of the anxiety profile feedback with ECG can be a better cognitive therapeutic strategy in speech anxiety. PMID:22714138

  19. Milton's "Areopagitica" Freedom of Speech on Campus

    ERIC Educational Resources Information Center

    Sullivan, Daniel F.

    2006-01-01

    The author discusses the content in John Milton's "Areopagitica: A Speech for the Liberty of Unlicensed Printing to the Parliament of England" (1985) and provides parallelism to censorship practiced in higher education. Originally published in 1644, "Areopagitica" makes a powerful--and precocious--argument for freedom of speech and against…

  20. Quick Statistics about Voice, Speech, and Language

    MedlinePlus

    ... Statistics and Epidemiology Quick Statistics About Voice, Speech, Language Voice, Speech, Language, and Swallowing Nearly 1 in 12 (7.7 ... condition known as persistent developmental stuttering. 8 , 9 Language 3.3 percent of U.S. children ages 3- ...

  1. The Development of Preschoolers' Private Speech.

    ERIC Educational Resources Information Center

    Pellegrini, A. D.

    The intent of this study was to examine the development of three aspects of preschoolers' private speech: coefficients of egocentricism, the extent to which speech regulates actions, and the syntactic and semantic structures of individual utterances. Forty-one randomly chosen preschoolers (26 females, 15 males) were placed in three age groups (3,…

  2. Scaffolded-Language Intervention: Speech Production Outcomes

    ERIC Educational Resources Information Center

    Bellon-Harn, Monica L.; Credeur-Pampolina, Maggie E.; LeBoeuf, Lexie

    2013-01-01

    This study investigated the effects of a scaffolded-language intervention using cloze procedures, semantically contingent expansions, contrastive word pairs, and direct models on speech abilities in two preschoolers with speech and language impairment speaking African American English. Effects of the lexical and phonological characteristics (i.e.,…

  3. The effects of stimulus variability on the perceptual learning of speech and non-speech stimuli.

    PubMed

    Banai, Karen; Amitay, Sygal

    2015-01-01

    Previous studies suggest fundamental differences between the perceptual learning of speech and non-speech stimuli. One major difference is in the way variability in the training set affects learning and its generalization to untrained stimuli: training-set variability appears to facilitate speech learning, while slowing or altogether extinguishing non-speech auditory learning. We asked whether the reason for this apparent difference is a consequence of the very different methodologies used in speech and non-speech studies. We hypothesized that speech and non-speech training would result in a similar pattern of learning if they were trained using the same training regimen. We used a 2 (random vs. blocked pre- and post-testing) × 2 (random vs. blocked training) × 2 (speech vs. non-speech discrimination task) study design, yielding 8 training groups. A further 2 groups acted as untrained controls, tested with either random or blocked stimuli. The speech task required syllable discrimination along 4 minimal-pair continua (e.g., bee-dee), and the non-speech stimuli required duration discrimination around 4 base durations (e.g., 50 ms). Training and testing required listeners to pick the odd-one-out of three stimuli, two of which were the base duration or phoneme continuum endpoint and the third varied adaptively. Training was administered in 9 sessions of 640 trials each, spread over 4-8 weeks. Significant learning was only observed following speech training, with similar learning rates and full generalization regardless of whether training used random or blocked schedules. No learning was observed for duration discrimination with either training regimen. We therefore conclude that the two stimulus classes respond differently to the same training regimen. A reasonable interpretation of the findings is that speech is perceived categorically, enabling learning in either paradigm, while the different base durations are not well-enough differentiated to allow for

  4. Improving robustness of speech recognition systems

    NASA Astrophysics Data System (ADS)

    Mitra, Vikramjit

    2010-11-01

    Current Automatic Speech Recognition (ASR) systems fail to perform nearly as good as human speech recognition performance due to their lack of robustness against speech variability and noise contamination. The goal of this dissertation is to investigate these critical robustness issues, put forth different ways to address them and finally present an ASR architecture based upon these robustness criteria. Acoustic variations adversely affect the performance of current phone-based ASR systems, in which speech is modeled as 'beads-on-a-string', where the beads are the individual phone units. While phone units are distinctive in cognitive domain, they are varying in the physical domain and their variation occurs due to a combination of factors including speech style, speaking rate etc.; a phenomenon commonly known as 'coarticulation'. Traditional ASR systems address such coarticulatory variations by using contextualized phone-units such as triphones. Articulatory phonology accounts for coarticulatory variations by modeling speech as a constellation of constricting actions known as articulatory gestures. In such a framework, speech variations such as coarticulation and lenition are accounted for by gestural overlap in time and gestural reduction in space. To realize a gesture-based ASR system, articulatory gestures have to be inferred from the acoustic signal. At the initial stage of this research an initial study was performed using synthetically generated speech to obtain a proof-of-concept that articulatory gestures can indeed be recognized from the speech signal. It was observed that having vocal tract constriction trajectories (TVs) as intermediate representation facilitated the gesture recognition task from the speech signal. Presently no natural speech database contains articulatory gesture annotation; hence an automated iterative time-warping architecture is proposed that can annotate any natural speech database with articulatory gestures and TVs. Two natural

  5. Strategies for distant speech recognitionin reverberant environments

    NASA Astrophysics Data System (ADS)

    Delcroix, Marc; Yoshioka, Takuya; Ogawa, Atsunori; Kubo, Yotaro; Fujimoto, Masakiyo; Ito, Nobutaka; Kinoshita, Keisuke; Espi, Miquel; Araki, Shoko; Hori, Takaaki; Nakatani, Tomohiro

    2015-12-01

    Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.

  6. Predicting the intelligibility of vocoded speech

    PubMed Central

    Chen, Fei; Loizou, Philipos C.

    2010-01-01

    Objectives The purpose of this study is to evaluate the performance of a number of speech intelligibility indices in terms of predicting the intelligibility of vocoded speech. Design Noise-corrupted sentences were vocoded in a total of 80 conditions, involving three different SNR levels (-5, 0 and 5 dB) and two types of maskers (steady-state noise and two-talker). Tone-vocoder simulations were used as well as simulations of combined electric-acoustic stimulation (EAS). The vocoded sentences were presented to normal-hearing listeners for identification, and the resulting intelligibility scores were used to assess the correlation of various speech intelligibility measures. These included measures designed to assess speech intelligibility, including the speech-transmission index (STI) and articulation index (AI) based measures, as well as distortions in hearing aids (e.g., coherence-based measures). These measures employed primarily either the temporal-envelope or the spectral-envelope information in the prediction model. The underlying hypothesis in the present study is that measures that assess temporal envelope distortions, such as those based on the speech-transmission index, should correlate highly with the intelligibility of vocoded speech. This is based on the fact that vocoder simulations preserve primarily envelope information, similar to the processing implemented in current cochlear implant speech processors. Similarly, it is hypothesized that measures such as the coherence-based index that assess the distortions present in the spectral envelope could also be used to model the intelligibility of vocoded speech. Results Of all the intelligibility measures considered, the coherence-based and the STI-based measures performed the best. High correlations (r=0.9-0.96) were maintained with the coherence-based measures in all noisy conditions. The highest correlation obtained with the STI-based measure was 0.92, and that was obtained when high modulation rates (100

  7. The Functional Connectome of Speech Control.

    PubMed

    Fuertinger, Stefan; Horwitz, Barry; Simonyan, Kristina

    2015-07-01

    In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research installed the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy from the resting state to motor output of meaningless syllables to complex production of real-life speech as well as compared to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively forged the formation

  8. Sensorimotor influences on speech perception in infancy

    PubMed Central

    Bruderer, Alison G.; Danielson, D. Kyle; Kandhadai, Padmapriya; Werker, Janet F.

    2015-01-01

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception–production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants’ speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants’ tongues. With a looking-time procedure, we found that temporarily restraining infants’ articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral–motor movements influence speech sound discrimination. Moreover, an experimentally induced “impairment” in articulator movement can compromise speech perception performance, raising the question of whether long-term oral–motor impairments may impact perceptual development. PMID:26460030

  9. The Functional Connectome of Speech Control

    PubMed Central

    Fuertinger, Stefan; Horwitz, Barry; Simonyan, Kristina

    2015-01-01

    In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research installed the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy from the resting state to motor output of meaningless syllables to complex production of real-life speech as well as compared to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively forged the formation

  10. Speech prosody in cerebellar ataxia.

    PubMed

    Casper, Maureen A; Raphael, Lawrence J; Harris, Katherine S; Geibel, Jennifer M

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy speakers and six with ataxia. The speaking task was designed to elicit six different prosodic conditions and four contrastive prosodic events. Distinct prosodic patterns were elicited by the examiner for cerebellar patients and healthy speakers. These utterances were digitally recorded and analysed acoustically and statistically. The healthy speakers showed statistically significant differences among all four prosodic contrasts. The normal model described by the prosodic contrasts provided a sensitive index of cerebellar pathology with quantitative acoustic analyses. A significant interaction between subject groups and prosodic conditions revealed a compromised prosody in cerebellar patients. Significant differences were found for durational parameters, F0 and formant frequencies. The cerebellar speakers demonstrated different patterns of syllable lengthening and syllable reduction from that of the healthy speakers. PMID:17613097

  11. Intonation contour in synchronous speech

    NASA Astrophysics Data System (ADS)

    Wang, Bei; Cummins, Fred

    2003-10-01

    Synchronous Speech (Syn-S), obtained by having pairs of speakers read a prepared text together, has been shown to result in interesting properties in the temporal domain, especially in the reduction of inter-speaker variability in supersegmental timing [F. Cummins, ARLO 3, 7-11 (2002)]. Here we investigate the effect of synchronization among speakers on the intonation contour, with a view to informing models of intonation. Six pairs of speakers (all females) read a short text (176 words) both synchronously and solo. Results show that (1) the pitch accent height above a declining baseline is reduced in Syn-S, compared with solo speech, while the pitch accent location is consistent across speakers in both conditions; (2) in contrast to previous findings on duration matching, there is an asymmetry between speakers, with one speaker exerting a stronger influence on the observed intonation contour than the other; (3) agreement on the boundaries of intonational phrases is greater in Syn-S and intonation contours are well matched from the first syllable of the phrase and throughout.

  12. Vestibular hearing and speech processing.

    PubMed

    Emami, Seyede Faranak; Pourbakht, Akram; Sheykholeslami, Kianoush; Kamali, Mohammad; Behnoud, Fatholah; Daneshi, Ahmad

    2012-01-01

    Vestibular hearing in human is evoked as a result of the auditory sensitivity of the saccule to low-frequency high-intensity tone. The objective was to investigate the relationship between vestibular hearing using cervical vestibular-evoked myogenic potentials (cVEMPs) and speech processing via word recognition scores in white noise (WRSs in wn). Intervention comprised of audiologic examinations, cVEMPs, and WRS in wn. All healthy subjects had detectable cVEMPs (safe vestibular hearing). WRSs in wn were obtained for them (66.9 ± 9.3% in the right ears and 67.5 ± 11.8% in the left ears). Dizzy patients in the affected ears, had the cVEMPs abnormalities (insecure vestibular hearing) and decreased the WRS in wn (51.4 ± 3.8% in the right ears and 52.2 ± 3.5% in the left ears). The comparison of the cVEMPs between the subjects revealed significant differences (P < 0.05). Therefore, the vestibular hearing can improve the speech processing in the competing noisy conditions. PMID:23724272

  13. Inconsistency of speech in children with childhood apraxia of speech, phonological disorders, and typical speech

    NASA Astrophysics Data System (ADS)

    Iuzzini, Jenya

    There is a lack of agreement on the features used to differentiate Childhood Apraxia of Speech (CAS) from Phonological Disorders (PD). One criterion which has gained consensus is lexical inconsistency of speech (ASHA, 2007); however, no accepted measure of this feature has been defined. Although lexical assessment provides information about consistency of an item across repeated trials, it may not capture the magnitude of inconsistency within an item. In contrast, segmental analysis provides more extensive information about consistency of phoneme usage across multiple contexts and word-positions. The current research compared segmental and lexical inconsistency metrics in preschool-aged children with PD, CAS, and typical development (TD) to determine how inconsistency varies with age in typical and disordered speakers, and whether CAS and PD were differentiated equally well by both assessment levels. Whereas lexical and segmental analyses may be influenced by listener characteristics or speaker intelligibility, the acoustic signal is less vulnerable to these factors. In addition, the acoustic signal may reveal information which is not evident in the perceptual signal. A second focus of the current research was motivated by Blumstein et al.'s (1980) classic study on voice onset time (VOT) in adults with acquired apraxia of speech (AOS) which demonstrated a motor impairment underlying AOS. In the current study, VOT analyses were conducted to determine the relationship between age and group with the voicing distribution for bilabial and alveolar plosives. Findings revealed that 3-year-olds evidenced significantly higher inconsistency than 5-year-olds; segmental inconsistency approached 0% in 5-year-olds with TD, whereas it persisted in children with PD and CAS suggesting that for child in this age-range, inconsistency is a feature of speech disorder rather than typical development (Holm et al., 2007). Likewise, whereas segmental and lexical inconsistency were

  14. The effects of selective attention and speech acoustics on neural speech-tracking in a multi-talker scene

    PubMed Central

    Rimmele, Johanna M.; Golumbic, Elana Zion; Schröger, Erich; Poeppel, David

    2015-01-01

    Attending to one speaker in multi-speaker situations is challenging. One neural mechanism proposed to underlie the ability to attend to a particular speaker is phase-locking of low-frequency activity in auditory cortex to speech’s temporal envelope (“speech-tracking”), which is more precise for attended speech. However, it is not known what brings about this attentional effect, and specifically if it reflects enhanced processing of the fine structure of attended speech. To investigate this question we compared attentional effects on speech-tracking of natural vs. vocoded speech which preserves the temporal envelope but removes the fine-structure of speech. Pairs of natural and vocoded speech stimuli were presented concurrently and participants attended to one stimulus and performed a detection task while ignoring the other stimulus. We recorded magnetoencephalography (MEG) and compared attentional effects on the speech-tracking response in auditory cortex. Speech-tracking of natural, but not vocoded, speech was enhanced by attention, whereas neural tracking of ignored speech was similar for natural and vocoded speech. These findings suggest that the more precise speech tracking of attended natural speech is related to processing its fine structure, possibly reflecting the application of higher-order linguistic processes. In contrast, when speech is unattended its fine structure is not processed to the same degree and thus elicits less precise speech tracking more similar to vocoded speech. PMID:25650107

  15. An articulatorily constrained, maximum entropy approach to speech recognition and speech coding

    SciTech Connect

    Hogden, J.

    1996-12-31

    Hidden Markov models (HMM`s) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMM`s typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMM`s better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMM`s, but that should more accurately capture the statistical properties of real speech samples--presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMM`s. This will allow him to highlight the similarities and differences between HMM`s and the proposed technique.

  16. Statistical properties of infant-directed versus adult-directed speech: Insights from speech recognition

    NASA Astrophysics Data System (ADS)

    Kirchhoff, Katrin; Schimmel, Steven

    2005-04-01

    Previous studies have shown that infant-directed speech (`motherese') exhibits overemphasized acoustic properties which may facilitate the acquisition of phonetic categories by infant learners. It has been suggested that the use of infant-directed data for training automatic speech recognition systems might also enhance the automatic learning and discrimination of phonetic categories. This study investigates the properties of infant-directed vs. adult-directed speech from the point of view of the statistical pattern recognition paradigm underlying automatic speech recognition. Isolated-word speech recognizers were trained on adult-directed vs. infant-directed data sets and were tested on both matched and mismatched data. Results show that recognizers trained on infant-directed speech did not always exhibit better recognition performance; however, their relative loss in performance on mismatched data was significantly less severe than that of recognizers trained on adult-directed speech and presented with infant-directed test data. An analysis of the statistical distributions of a subset of phonetic classes in both data sets showed that this pattern is caused by larger class overlaps in infant-directed speech. This finding has implications for both automatic speech recognition and theories of infant speech perception. .

  17. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    PubMed

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review.

  18. Modulation of Auditory Responses to Speech vs. Nonspeech Stimuli during Speech Movement Planning

    PubMed Central

    Daliri, Ayoub; Max, Ludo

    2016-01-01

    Previously, we showed that the N100 amplitude in long latency auditory evoked potentials (LLAEPs) elicited by pure tone probe stimuli is modulated when the stimuli are delivered during speech movement planning as compared with no-speaking control conditions. Given that we probed the auditory system only with pure tones, it remained unknown whether the nature and magnitude of this pre-speech auditory modulation depends on the type of auditory stimulus. Thus, here, we asked whether the effect of speech movement planning on auditory processing varies depending on the type of auditory stimulus. In an experiment with nine adult subjects, we recorded LLAEPs that were elicited by either pure tones or speech syllables when these stimuli were presented prior to speech onset in a delayed-response speaking condition vs. a silent reading control condition. Results showed no statistically significant difference in pre-speech modulation of the N100 amplitude (early stages of auditory processing) for the speech stimuli as compared with the nonspeech stimuli. However, the amplitude of the P200 component (later stages of auditory processing) showed a statistically significant pre-speech modulation that was specific to the speech stimuli only. Hence, the overall results from this study indicate that, immediately prior to speech onset, modulation of the auditory system has a general effect on early processing stages but a speech-specific effect on later processing stages. This finding is consistent with the hypothesis that pre-speech auditory modulation may play a role in priming the auditory system for its role in monitoring auditory feedback during speech production. PMID:27242494

  19. [Dependence of 'audio-phonatoric coupling' of speech rate and speech loudness].

    PubMed

    Jäncke, L

    1992-01-01

    The 'audio-phonatoric coupling' (APC) was investigated in two independent experiments. Slightly delayed auditory feedback (delay time 40 ms) of the subjects' own speech was used as experimental method. The first experiment was conducted to examine whether the strength of the APC depends on the speech rate. In this experiment 16 male Subjects (Ss) were required to utter the testword/tatatas/either with stress placing on the first or second syllable at two different speech rates (fast and slow). In 16% of the randomly chosen speech trials, the delayed auditory feedback (DAF; 40 ms delay) was introduced. It could be shown that the stressed phonation was significantly lengthened under the DAF condition. This lengthening was greater when Ss spoke slowly. The unstressed phonations were not influenced by the DAF condition. The second experiment was conducted to examine whether or not speech intensity effects APC. Nine male Ss were required to utter the testword/tatatas/either with stress placing on the first or second syllable using three different speech intensities (30 dB, 50 dB and 70 dB). In 16% of the randomly chosen speech trials DAF condition was introduced. It could be shown that speech intensity does not influence the DAF effect (lengthening of stressed phonation). These findings were taken as evidence that the auditory feedback of the subjects' own speech can be incorporated into speech control during ongoing speech. Obviously, this feedback information is efficient only during the production of stressed syllables, and varies as a function of speech rate. In addition, the significance of stressed syllables for the structuring of speech is discussed. PMID:1295272

  20. Toward a functional analysis of delusional speech and hallucinatory behavior

    PubMed Central

    Layng, T. V. Joe; Andronis, Paul Thomas

    1984-01-01

    An approach to a functional analysis of delusional speech and hallucinatory behavior is described and discussed using concepts found in Goldiamond's (1975a and 1984) nonlinear contingency analysis and Skinner's Verbal Behavior (1957). This synthesis draws upon and concords with research from the animal laboratory, with the extensive experimental literatures on stimulus control and signal detection theory, and with our own clinical experiences. In this formulation, delusional speech and hallucinatory behavior are viewed as successful operants. Accordingly, we argue that such behaviors can be considered adaptive and rational, rather than maladaptive and irrational, when analyzed within a model of consequential governance that includes alternative sets of contingencies. Several clinical examples are offered to illustrate both analytic procedures and the design of systemic treatment programs based upon a behavioral contingency analysis derived from a natural science of behavior. Throughout, we emphasize the consequential governance of these clinically important classes of behavior, in contrast to other approaches which suggest formal similarities to operant verbal behavior but largely ignore the role of consequential contingencies. PMID:22478607