Science.gov

Sample records for concatenative speech synthesis

  1. Models of speech synthesis.

    PubMed Central

    Carlson, R

    1995-01-01

    The term "speech synthesis" has been used for diverse technical approaches. In this paper, some of the approaches used to generate synthetic speech in a text-to-speech system are reviewed, and some of the basic motivations for choosing one method over another are discussed. It is important to keep in mind, however, that speech synthesis models are needed not just for speech generation but to help us understand how speech is created, or even how articulation can explain language structure. General issues such as the synthesis of different voices, accents, and multiple languages are discussed as special challenges facing the speech synthesis community. PMID:7479805

  2. Linguistic aspects of speech synthesis.

    PubMed Central

    Allen, J

    1995-01-01

    The conversion of text to speech is seen as an analysis of the input text to obtain a common underlying linguistic description, followed by a synthesis of the output speech waveform from this fundamental specification. Hence, the comprehensive linguistic structure serving as the substrate for an utterance must be discovered by analysis from the text. The pronunciation of individual words in unrestricted text is determined by morphological analysis or letter-to-sound conversion, followed by specification of the word-level stress contour. In addition, many text character strings, such as titles, numbers, and acronyms, are abbreviations for normal words, which must be derived. To further refine these pronunciations and to discover the prosodic structure of the utterance, word part of speech must be computed, followed by a phrase-level parsing. From this structure the prosodic structure of the utterance can be determined, which is needed in order to specify the durational framework and fundamental frequency contour of the utterance. In discourse contexts, several factors such as the specification of new and old information, contrast, and pronominal reference can be used to further modify the prosodic specification. When the prosodic correlates have been computed and the segmental sequence is assembled, a complete input suitable for speech synthesis has been determined. Lastly, multilingual systems utilizing rule frameworks are mentioned, and future directions are characterized. PMID:7479807

  4. Speech Synthesis Applied to Language Teaching.

    ERIC Educational Resources Information Center

    Sherwood, Bruce

    1981-01-01

    The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…

  5. Speech synthesis with artificial neural networks

    NASA Astrophysics Data System (ADS)

    Weijters, Ton; Thole, Johan

    1992-10-01

    The application of neural nets to speech synthesis is considered. In speech synthesis, the main efforts so far have been to master grapheme-to-phoneme conversion, during which symbols (graphemes) are converted into other symbols (phonemes). Neural networks, however, are especially competitive for tasks in which complex nonlinear transformations are needed and sufficient domain-specific knowledge is not available. The conversion of text into speech parameters appropriate as input for a speech generator seems to be such a task. Results of a pilot study in which an attempt is made to train a neural network for this conversion are presented.

  6. Fifty years of progress in speech synthesis

    NASA Astrophysics Data System (ADS)

    Schroeter, Juergen

    2004-10-01

    A common opinion is that progress in speech synthesis should be easier to discern than in other areas of speech communication: you just have to listen to the speech! Unfortunately, things are more complicated. It can be said, however, that early speech synthesis efforts were primarily concerned with providing intelligible speech, while, more recently, "naturalness" has been the focus. The field had its "electronic" roots in Homer Dudley's 1939 "Voder," and it advanced in the 1950s and 1960s through progress in a number of labs including JSRU in England, Haskins Labs in the U.S., and Fant's Lab in Sweden. In the 1970s and 1980s significant progress came from efforts at Bell Labs (under Jim Flanagan's leadership) and at MIT (where Dennis Klatt created one of the first commercially viable systems). Finally, over the past 15 years, the methods of unit-selection synthesis were devised, primarily at ATR in Japan, and were advanced by work at AT&T Labs, Univ. of Edinburgh, and ATR. Today, TTS systems are able to "convince some of the listeners some of the time" that synthetic speech is as natural as live recordings. Ongoing efforts aim at replacing "some" with "most" for a wide range of real-world applications.

  7. Speech synthesis by glottal excited linear prediction.

    PubMed

    Childers, D G; Hu, H T

    1994-10-01

    This paper describes a linear predictive (LP) speech synthesis procedure that resynthesizes speech using a 6th-order polynomial waveform to model the glottal excitation. The coefficients of the polynomial model form a vector that represents the glottal excitation waveform for one pitch period. A glottal excitation codebook with 32 entries for voiced excitation is designed and trained using two sentences spoken by different speakers. The purpose of using this approach is to demonstrate that quantization of the glottal excitation waveform does not significantly degrade the quality of speech synthesized with a glottal excitation linear predictive (GELP) synthesizer. This implementation of the LP synthesizer is patterned after both a pitch-excited LP speech synthesizer and a code-excited linear predictive (CELP) speech coder. In addition to the glottal excitation codebook, we use a stochastic codebook with 256 entries for unvoiced noise excitation. Analysis techniques are described for constructing both codebooks. The GELP synthesizer, which resynthesizes speech with high quality, provides the speech scientist with a simple speech synthesis procedure that uses established analysis techniques, that is able to reproduce all speech sounds, and yet also has an excitation model waveform that is related to the derivative of the glottal flow and the integral of the residue. It is conjectured that the glottal excitation codebook approach could provide a mechanism for quantitatively comparing the differences in glottal excitation codebooks for male and female speakers, for speakers with vocal disorders, and for speakers with different voice types such as breathy and vocal fry voices. Conceivably, one could also convert the voice of a speaker with one voice type, e.g., breathy, to the voice of a speaker with another voice type, e.g., vocal fry, by synthesizing speech using the vocal tract LP parameters for the speaker with the breathy voice excited by the glottal excitation
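
    A minimal sketch of the GELP signal path described above: a polynomial excitation for one pitch period is passed through an all-pole LP synthesis filter. All numeric values (polynomial coefficients, resonances) are illustrative assumptions, not the authors' trained codebook.

      import numpy as np
      from scipy.signal import lfilter

      def glottal_excitation(poly_coeffs, period_samples):
          """Evaluate a 6th-order polynomial model over one normalized pitch period."""
          t = np.linspace(0.0, 1.0, period_samples, endpoint=False)
          return np.polyval(poly_coeffs, t)

      # Hypothetical codebook entry: 7 coefficients define a 6th-order polynomial.
      entry = [-12.0, 36.0, -38.0, 16.0, -2.0, 0.5, 0.0]
      excitation = glottal_excitation(entry, period_samples=80)  # ~10 ms at 8 kHz

      # Build a stable all-pole LP synthesis filter 1/A(z) from formant-like
      # resonances (illustrative center frequencies and bandwidths, in Hz).
      fs = 8000.0
      a = np.array([1.0])
      for f, bw in [(500, 60), (1500, 90), (2500, 120)]:
          r = np.exp(-np.pi * bw / fs)
          a = np.convolve(a, [1.0, -2 * r * np.cos(2 * np.pi * f / fs), r * r])

      speech_period = lfilter([1.0], a, excitation)  # one resynthesized pitch period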

  8. Vocoders and Speech Perception: Uses of Computer-Based Speech Analysis-Synthesis in Stimulus Generation.

    ERIC Educational Resources Information Center

    Tierney, Joseph; Mack, Molly

    1987-01-01

    Stimuli used in research on the perception of the speech signal have often been obtained from simple filtering and distortion of the speech waveform, sometimes accompanied by noise. However, for more complex stimulus generation, the parameters of speech can be manipulated, after analysis and before synthesis, using various types of algorithms to…

  9. Infants' brain responses to speech suggest analysis by synthesis.

    PubMed

    Kuhl, Patricia K; Ramírez, Rey R; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

    2014-08-01

    Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners' knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults were also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca's area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of "motherese" on early language learning, and (iii) the "social-gating" hypothesis and humans' development of social understanding. PMID:25024207

  10. Auto Spell Suggestion for High Quality Speech Synthesis in Hindi

    NASA Astrophysics Data System (ADS)

    Kabra, Shikha; Agarwal, Ritika

    2014-02-01

    The goal of Text-to-Speech (TTS) synthesis in a particular language is to convert arbitrary input text into intelligible and natural-sounding speech. For a language like Hindi, however, where many words have very close spellings, it is not easy to identify errors in the input text, and incorrect text degrades the quality of the output speech. This paper therefore contributes to the development of high-quality speech synthesis by incorporating a spellchecker that automatically generates spelling suggestions for misspelled words. Involving a spellchecker increases the effectiveness of speech synthesis by providing spelling suggestions for incorrect input text. Furthermore, we provide a comparative study evaluating the resulting effect on the phonetic text of adding the spellchecker to the input text.

  11. Prediction Method of Speech Recognition Performance Based on HMM-based Speech Synthesis Technique

    NASA Astrophysics Data System (ADS)

    Terashima, Ryuta; Yoshimura, Takayoshi; Wakita, Toshihiro; Tokuda, Keiichi; Kitamura, Tadashi

    We describe an efficient method that uses an HMM-based speech synthesis technique as a test pattern generator for evaluating the word recognition rate. With this method, the recognition rate for each word and speaker can be evaluated using synthesized speech. The parameter generation technique can be formulated as an algorithm that determines the speech parameter vector sequence O by maximizing P(O|Q,λ) given the model parameter λ and the state sequence Q, under a dynamic acoustic feature constraint. We conducted recognition experiments to illustrate the validity of the method. Approximately 100 speakers were used to train the speaker-dependent models for the speech synthesis used in these experiments, and the synthetic speech was generated as the test patterns for the target speech recognizer. As a result, the recognition rate of the HMM-based synthesized speech shows a good correlation with the recognition rate of the actual speech. Furthermore, we find that our method can predict the speaker recognition rate with approximately 2% error on average. Therefore, the evaluation of the speaker recognition rate can be performed automatically by using the proposed method.
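
    The constrained maximization mentioned above has a standard closed-form solution in HMM-based synthesis, sketched here under the usual assumption that each observation vector stacks static features and their dynamic (delta) features, so that the full observation sequence is a linear function O = WC of the static-parameter sequence C:

      % Maximizing the output probability over C under O = WC yields a
      % linear system for the smooth static-parameter trajectory:
      \hat{\mathbf{C}} = \arg\max_{\mathbf{C}}\,
        \mathcal{N}\!\left(\mathbf{W}\mathbf{C};\,\boldsymbol{\mu}_Q,\boldsymbol{\Sigma}_Q\right)
      \;\Longrightarrow\;
      \mathbf{W}^{\top}\boldsymbol{\Sigma}_Q^{-1}\mathbf{W}\,\hat{\mathbf{C}}
        = \mathbf{W}^{\top}\boldsymbol{\Sigma}_Q^{-1}\boldsymbol{\mu}_Q

      % mu_Q and Sigma_Q are the concatenated state means and covariances
      % along the state sequence Q; W is the fixed delta-window matrix.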

  12. Voice Quality Modelling for Expressive Speech Synthesis

    PubMed Central

    Socoró, Joan Claudi

    2014-01-01

    This paper presents the perceptual experiments that were carried out in order to validate the methodology of transforming expressive speech styles using voice quality (VoQ) parameters modelling, along with the well-known prosody (F0, duration, and energy), from a neutral style into a number of expressive ones. The main goal was to validate the usefulness of VoQ in the enhancement of expressive synthetic speech in terms of speech quality and style identification. A harmonic plus noise model (HNM) was used to modify VoQ and prosodic parameters that were extracted from an expressive speech corpus. Perception test results indicated the improvement of obtained expressive speech styles using VoQ modelling along with prosodic characteristics. PMID:24587738

  13. Establishing a Methodology for Benchmarking Speech Synthesis for Computer-Assisted Language Learning (CALL)

    ERIC Educational Resources Information Center

    Handley, Zoe; Hamel, Marie-Josee

    2005-01-01

    Despite the new possibilities that speech synthesis brings about, few Computer-Assisted Language Learning (CALL) applications integrating speech synthesis have found their way onto the market. One potential reason is that the suitability and benefits of the use of speech synthesis in CALL have not been proven. One way to do this is through…

  14. Towards personalized speech synthesis for augmentative and alternative communication.

    PubMed

    Mills, Timothy; Bunnell, H Timothy; Patel, Rupal

    2014-09-01

    Text-to-speech options on augmentative and alternative communication (AAC) devices are limited. Often, several individuals in a group setting use the same synthetic voice. This lack of customization may limit technology adoption and social integration. This paper describes our efforts to generate personalized synthesis for users with profoundly limited speech motor control. Existing voice banking and voice conversion techniques rely on recordings of clearly articulated speech from the target talker, which cannot be obtained from this population. Our VocaliD approach extracts prosodic properties from the target talker's source function and applies these features to a surrogate talker's database, generating a synthetic voice with the vocal identity of the target talker and the clarity of the surrogate talker. Promising intelligibility results suggest areas of further development for improved personalization. PMID:25025818

  15. Analysis and synthesis of the three-dimensional movements of the head, face, and hand of a speaker using cued speech

    NASA Astrophysics Data System (ADS)

    Gibert, Guillaume; Bailly, Gérard; Beautemps, Denis; Elisei, Frédéric; Brun, Rémi

    2005-08-01

    In this paper we present efforts for characterizing the three dimensional (3-D) movements of the right hand and the face of a French female speaker during the audiovisual production of cued speech. The 3-D trajectories of 50 hand and 63 facial flesh points during the production of 238 utterances were analyzed. These utterances were carefully designed to cover all possible diphones of the French language. Linear and nonlinear statistical models of the articulations and the postures of the hand and the face have been developed using separate and joint corpora. Automatic recognition of hand and face postures at targets was performed to verify a posteriori that key hand movements and postures imposed by cued speech had been well realized by the subject. Recognition results were further exploited in order to study the phonetic structure of cued speech, notably the phasing relations between hand gestures and sound production. The hand and face gestural scores are studied in reference with the acoustic segmentation. A first implementation of a concatenative audiovisual text-to-cued speech synthesis system is finally described that employs this unique and extensive data on cued speech in action.

  16. Alternative Speech Communication System for Persons with Severe Speech Disorders

    NASA Astrophysics Data System (ADS)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French- and English-speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech, making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenation algorithm, and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. Improvements in the Perceptual Evaluation of Speech Quality (PESQ) value of 5% and of more than 20% are achieved by the speech synthesis systems that deal with SSDs and dysarthria, respectively.

  17. Inverse solution of speech production based on perturbation theory and its application to articulatory speech synthesis

    NASA Astrophysics Data System (ADS)

    Yu, Zhenli

    1998-12-01

    The inverse solution of speech production for formant targets of vowels and vowel-to-vowel transitions is studied. A band-limited Fourier cosine expansion of the vocal-tract area function, or of its logarithm, is used to model the vocal-tract shape. The inverse solution is based on the perturbation theory of speech production combined with a fast calculation of the vocal-tract system. An interpolation method is proposed for dynamic constraints on the unobservable zeros and on the vocal-tract length along the path between the endpoints of a vowel-to-vowel transition. A unique acoustic-to-geometry mapping codebook is used to match the zeros and vocal-tract length at the endpoints. The codebook is designed under geometrical and acoustical constraints. Computer simulation shows that the inverse solution gives reasonable results with respect to the naturalness of the transition behavior of the vocal-tract area function. An articulatory synthesizer with a reflection-type line analog model driven by the vocal-tract area is implemented. A synthesis evaluation of the performance of the inverse solution for vowel-to-vowel transitions as well as for isolated vowels is conducted. Visual inspection of the resulting spectrograms and perceptual listening to the synthetic sounds are satisfactory. Quantitative comparison in the form of formant traces reveals fairly good matching of the formants of the synthetic sounds to the originals. A novel formant-targeted articulatory synthesis, as an application of the inverse solution, is proposed. The entire system consists of an inverse module and a reflection-type line analog model. The synthesizer needs only the first three formant trajectories, the pitch contour, and the amplitude as input parameters. A formant mimic synthesis, in which the input parameters can be artificially specified, and a formant copy synthesis, in which the input parameters are estimated from real speech, are implemented. The formant trace or pitch contour can be separately modified
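
    For concreteness, the band-limited cosine expansion of the (log) area function can be written as below; the symbols (coefficients a_k, truncation order K, tract length L) are illustrative rather than the paper's own notation:

      % Truncated Fourier cosine series for the log vocal-tract area;
      % the K+1 coefficients a_k are the unknowns of the inverse problem.
      \ln A(x) = \frac{a_0}{2}
        + \sum_{k=1}^{K} a_k \cos\!\left(\frac{k\pi x}{L}\right),
      \qquad 0 \le x \le L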

  18. Usage of the HMM-Based Speech Synthesis for Intelligent Arabic Voice

    NASA Astrophysics Data System (ADS)

    Fares, Tamer S.; Khalil, Awad H.; Hegazy, Abd El-Fatah A.

    2008-06-01

    The HMM, as a model suited to time-sequence modeling, is used to estimate speech synthesis parameters. A speech parameter sequence is generated from the HMMs themselves, whose observation vectors consist of a spectral parameter vector and its dynamic feature vectors. The HMMs generate cepstral coefficients and a pitch parameter, which are then fed to a speech synthesis filter named the Mel Log Spectral Approximation (MLSA) filter. This paper explains how this approach can be applied to the Arabic language to produce intelligent Arabic speech synthesis using HMM-based speech synthesis, and examines the influence of using dynamic features and of increasing the number of mixture components on the quality of the synthesized Arabic speech.

  19. Speech synthesis with pitch modification using harmonic plus noise model

    NASA Astrophysics Data System (ADS)

    Lehana, Parveen K.; Pandey, Prem C.

    2003-10-01

    In harmonic plus noise model (HNM) based speech synthesis, the input signal is modeled in two parts: the harmonic part, using the amplitudes and phases of the harmonics of the fundamental, and the noise part, using an all-pole filter excited by random white Gaussian noise. This method requires relatively few parameters and computations, provides good-quality output, and permits pitch and time scaling without explicit estimation of vocal tract parameters. Pitch scaling that synthesizes speech from the interpolated original amplitudes and phases at multiples of the scaled pitch frequency results in an unnatural quality. Our investigation of how to obtain natural-quality output showed that the frequency scale of the amplitudes and phases of the harmonics of the original signal needs to be modified by a speaker-dependent warping function. The function was obtained by studying the relationship between pitch frequency and formant frequencies for the three cardinal vowels occurring naturally with different pitches in a passage with intonation. Listening tests showed that good-quality speech was obtained by linear frequency scaling of the amplitude and phase spectra by the same factor as the pitch scaling.
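
    A minimal HNM synthesis sketch for one stationary frame, following the two-part model described above. The amplitudes, phases, and noise filter are illustrative values rather than analyzed parameters; pitch scaling would re-sample the amplitude and phase envelopes at multiples of the new fundamental.

      import numpy as np
      from scipy.signal import lfilter

      fs = 16000.0
      f0 = 120.0                            # fundamental frequency (Hz)
      t = np.arange(int(0.03 * fs)) / fs    # one 30 ms frame

      # Harmonic part: sum of harmonics of f0 with per-harmonic amplitude/phase.
      amps = np.array([1.0, 0.6, 0.4, 0.25, 0.15])
      phases = np.array([0.0, 0.3, -0.8, 1.1, 0.5])
      harmonic = sum(a * np.cos(2 * np.pi * (k + 1) * f0 * t + p)
                     for k, (a, p) in enumerate(zip(amps, phases)))

      # Noise part: white Gaussian noise shaped by an all-pole filter 1/A(z).
      a_noise = [1.0, -0.9]                 # illustrative first-order all-pole filter
      noise = lfilter([1.0], a_noise, 0.05 * np.random.randn(t.size))

      frame = harmonic + noise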

  20. Interfacing COTS Speech Recognition and Synthesis Software to a Lotus Notes Military Command and Control Database

    NASA Astrophysics Data System (ADS)

    Carr, Oliver

    2002-10-01

    Speech recognition and synthesis technologies have become commercially viable over recent years. Two current market leading products in speech recognition technology are Dragon NaturallySpeaking and IBM ViaVoice. This report describes the development of speech user interfaces incorporating these products with Lotus Notes and Java applications. These interfaces enable data entry using speech recognition and allow warnings and instructions to be issued via speech synthesis. The development of a military vocabulary to improve user interaction is discussed. The report also describes an evaluation in terms of speed of the various speech user interfaces developed using Dragon NaturallySpeaking and IBM ViaVoice with a Lotus Notes Command and Control Support System Log database.

  1. The Effects on Children's Writing of Adding Speech Synthesis to a Word Processor.

    ERIC Educational Resources Information Center

    Borgh, Karin; Dickson, W. Patrick

    A study examined whether computers equipped with speech synthesis devices could facilitate children's writing. It was hypothesized that children using the devices would write longer stories, edit more, and produce higher quality stories than children not receiving feedback from a speech synthesizer. Subjects were 48 children, three girls and three…

  2. Design and performance of an analysis-by-synthesis class of predictive speech coders

    NASA Technical Reports Server (NTRS)

    Rose, Richard C.; Barnwell, Thomas P., III

    1990-01-01

    The performance of a broad class of analysis-by-synthesis linear predictive speech coders is quantified experimentally. The class of coders includes a number of well-known techniques as well as a very large number of speech coders which have not been named or studied. A general formulation for deriving the parametric representation used in all of the coders in the class is presented. A new coder, named the self-excited vocoder, is discussed because of its good performance with low complexity, and because of the insight this coder gives to analysis-by-synthesis coders in general. The results of a study comparing the performances of different members of this class are presented. The study takes the form of a series of formal subjective and objective speech quality tests performed on selected coders. The results of this study lead to some interesting and important observations concerning the controlling parameters for analysis-by-synthesis speech coders.
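
    The common core of this coder class is a closed-loop search: each candidate excitation is passed through the LP synthesis filter, and the entry (with optimal gain) minimizing the error against the target frame is kept. A toy sketch with an illustrative codebook and filter; real coders add perceptual weighting to the error measure.

      import numpy as np
      from scipy.signal import lfilter

      def best_excitation(codebook, a_lpc, target):
          """Return (index, gain) minimizing ||target - g * H(z){c_i}||^2."""
          best = (None, 0.0, np.inf)
          for i, code in enumerate(codebook):
              synth = lfilter([1.0], a_lpc, code)  # pass candidate through 1/A(z)
              g = np.dot(target, synth) / (np.dot(synth, synth) + 1e-12)
              err = np.sum((target - g * synth) ** 2)
              if err < best[2]:
                  best = (i, g, err)
          return best[0], best[1]

      rng = np.random.default_rng(0)
      codebook = rng.standard_normal((256, 40))    # stochastic codebook, 40-sample frames
      a_lpc = [1.0, -0.9, 0.4]                     # illustrative LP polynomial
      target = rng.standard_normal(40)             # residual-domain target frame
      idx, gain = best_excitation(codebook, a_lpc, target)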

  3. Implementation of Three Text to Speech Systems for Kurdish Language

    NASA Astrophysics Data System (ADS)

    Bahrampour, Anvar; Barkhoda, Wafa; Azami, Bahram Zahir

    Nowadays, the concatenative method is used in most modern TTS systems to produce artificial speech. The most important challenge in this method is choosing an appropriate unit for creating the database. This unit must guarantee smooth, high-quality speech, and creating a database for it must be reasonable and inexpensive. For example, the syllable, phoneme, allophone, and diphone are appropriate units for general-purpose systems. In this paper, we implemented three synthesis systems for the Kurdish language, based on the syllable, the allophone, and the diphone, and compared their quality using subjective testing.
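
    A naive illustration of the concatenative method being compared: recorded unit waveforms (here keyed by hypothetical diphone names) are joined with a short linear cross-fade. Real systems additionally smooth pitch and duration at the joins.

      import numpy as np

      def concatenate_units(units, inventory, xfade=64):
          out = inventory[units[0]].copy()
          for name in units[1:]:
              nxt = inventory[name]
              ramp = np.linspace(0.0, 1.0, xfade)
              out[-xfade:] = out[-xfade:] * (1.0 - ramp) + nxt[:xfade] * ramp
              out = np.concatenate([out, nxt[xfade:]])
          return out

      # Toy inventory: in a real system each entry is cut from recorded speech.
      inventory = {d: np.random.randn(800) for d in ["s-a", "a-l", "l-aw"]}
      waveform = concatenate_units(["s-a", "a-l", "l-aw"], inventory)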

  4. English Intonation and Computerized Speech Synthesis. Technical Report No. 287.

    ERIC Educational Resources Information Center

    Levine, Arvin

    This work treats some of the important issues encountered in an attempt to synthesize natural sounding English speech from arbitrary written text. Details of the systems that interact in producing speech are described. The principal systems dealt with are phonology (intonation), phonetics, syntax, semantics, and text-view (discourse). Technical…

  5. Radio Losses for Concatenated Codes

    NASA Astrophysics Data System (ADS)

    Shambayati, S.

    2002-07-01

    The advent of higher powered spacecraft amplifiers and better ground receivers capable of tracking spacecraft carrier signals with narrower loop bandwidths requires better understanding of the carrier tracking loss (radio loss) mechanism of the concatenated codes used for deep-space missions. In this article, we present results of simulations performed for a (7,1/2), Reed-Solomon (255,223), interleaver depth-5 concatenated code in order to shed some light on this issue. Through these simulations, we obtained the performance of this code over an additive white Gaussian noise (AWGN) channel (the baseline performance) in terms of both its frame-error rate (FER) and its bit-error rate at the output of the Reed-Solomon decoder (RS-BER). After obtaining these results, we curve fitted the baseline performance curves for FER and RS-BER and calculated the high-rate radio losses for this code for an FER of 10^(-4) and its corresponding baseline RS-BER of 2.1 x 10^(-6) for a carrier loop signal-to-noise ratio (SNR) of 14.8 dB. This calculation revealed that even though over the AWGN channel the FER value and the RS-BER value correspond to each other (i.e., these values are obtained by the same bit SNR value), the RS-BER value has higher high-rate losses than does the FER value. Furthermore, this calculation contradicted the previous assumption that at high data rates concatenated codes have the same radio losses as their constituent convolutional codes. Our results showed much higher losses for the FER and the RS-BER (by as much as 2 dB) than for the corresponding baseline BER of the convolutional code. Further simulations were performed to investigate the effects of changes in the data rate on the code's radio losses. It was observed that as the data rate increased the radio losses for both the FER and the RS-BER approached their respective calculated high-rate values. Furthermore, these simulations showed that a simple two-parameter function could model the increase in the

  6. Application of speech recognition and synthesis in the general aviation cockpit

    NASA Technical Reports Server (NTRS)

    North, R. A.; Mountford, S. J.; Bergeron, H.

    1984-01-01

    Interactive speech recognition/synthesis technology is assessed as a method for the alleviation of single-pilot IFR flight workloads. Attention was given during this series of evaluations to the conditions typical of general aviation twin-engine aircraft cockpits, covering several commonly encountered IFR flight condition scenarios. The most beneficial speech command tasks are noted to be in the data retrieval domain, which would allow the pilot access to uplinked data, checklists, and performance charts. Data entry tasks also appear to benefit from this technology.

  7. Concatenated coding with two levels of interleaving

    NASA Astrophysics Data System (ADS)

    Lim, Samuel; Newhouse, Michael

    1991-02-01

    A performance evaluation of an electronic counter-countermeasure (ECCM) communication system in a worst-case partial-band noise and partial-band tone jamming scenario is documented. The ECCM communication system is composed of two levels of channel coding (concatenated coding) and two levels of interleaving. An analysis was performed for a concatenated code consisting of either a Reed-Solomon or a convolutional outer code and a convolutional inner code, and the decoded bit error rates for typical binary modulation schemes (BPSK and DPSK) were obtained. The performance of these coded waveforms was compared with convolutionally encoded systems with respect to the E_b/N_j required to achieve an overall bit error rate of 10^(-5). The results demonstrate a significant coding gain achievable from systems which adopt concatenated coding.

  8. Synthesis of Speaker Facial Movement to Match Selected Speech Sequences

    NASA Technical Reports Server (NTRS)

    Scott, K. C.; Kagels, D. S.; Watson, S. H.; Rom, H.; Wright, J. R.; Lee, M.; Hussey, K. J.

    1994-01-01

    A system is described which allows for the synthesis of a video sequence of a realistic-appearing talking human head. A phonic based approach is used to describe facial motion; image processing rather than physical modeling techniques are used to create video frames.

  9. Concatenated codes for fault tolerant quantum computing

    SciTech Connect

    Knill, E.; Laflamme, R.; Zurek, W.

    1995-05-01

    The application of concatenated codes to fault tolerant quantum computing is discussed. We have previously shown that for quantum memories and quantum communication, a state can be transmitted with error ε provided each gate has error at most cε. We show how this can be used with Shor's fault tolerant operations to reduce the accuracy requirements when maintaining states not currently participating in the computation. Viewing Shor's fault tolerant operations as a method for reducing the error of operations, we give a concatenated implementation which promises to propagate the reduction hierarchically. This has the potential of reducing the accuracy requirements in long computations.

  10. HMM-Based Style Control for Expressive Speech Synthesis with Arbitrary Speaker's Voice Using Model Adaptation

    NASA Astrophysics Data System (ADS)

    Nose, Takashi; Tachibana, Makoto; Kobayashi, Takao

    This paper presents methods for controlling the intensity of emotional expressions and speaking styles of an arbitrary speaker's synthetic speech by using a small amount of his/her speech data in HMM-based speech synthesis. Model adaptation approaches are introduced into the style control technique based on the multiple-regression hidden semi-Markov model (MRHSMM). Two different approaches are proposed for training a target speaker's MRHSMMs. The first one is MRHSMM-based model adaptation in which the pretrained MRHSMM is adapted to the target speaker's model. For this purpose, we formulate the MLLR adaptation algorithm for the MRHSMM. The second method utilizes simultaneous adaptation of speaker and style from an average voice model to obtain the target speaker's style-dependent HSMMs which are used for the initialization of the MRHSMM. From the result of subjective evaluation using adaptation data of 50 sentences of each style, we show that the proposed methods outperform the conventional speaker-dependent model training when using the same size of speech data of the target speaker.
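
    In the MRHSMM, style control is possible because each state's mean vector is a multiple-regression function of a low-dimensional style vector; scaling that vector scales the intensity of the expressed style. A sketch in illustrative notation (not the paper's exact symbols):

      % State-output mean as a linear function of a style vector s;
      % H_i is the regression matrix for state i, learned from data.
      \boldsymbol{\mu}_i = \mathbf{H}_i \boldsymbol{\xi}, \qquad
      \boldsymbol{\xi} = \begin{bmatrix} 1 \\ \mathbf{s} \end{bmatrix}
      % Moving s continuously moves the synthetic speech gradually
      % between neutral and the modeled expressive style.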

  11. Soft context clustering for F0 modeling in HMM-based speech synthesis

    NASA Astrophysics Data System (ADS)

    Khorram, Soheil; Sameti, Hossein; King, Simon

    2015-12-01

    This paper proposes the use of a new binary decision tree, which we call a soft decision tree, to improve generalization performance compared to the conventional `hard' decision tree method that is used to cluster context-dependent model parameters in statistical parametric speech synthesis. We apply the method to improve the modeling of fundamental frequency, which is an important factor in synthesizing natural-sounding high-quality speech. Conventionally, hard decision tree-clustered hidden Markov models (HMMs) are used, in which each model parameter is assigned to a single leaf node. However, this `divide-and-conquer' approach leads to data sparsity, with the consequence that it suffers from poor generalization, meaning that it is unable to accurately predict parameters for models of unseen contexts: the hard decision tree is a weak function approximator. To alleviate this, we propose the soft decision tree, which is a binary decision tree with soft decisions at the internal nodes. In this soft clustering method, internal nodes select both their children with certain membership degrees; therefore, each node can be viewed as a fuzzy set with a context-dependent membership function. The soft decision tree improves model generalization and provides a superior function approximator because it is able to assign each context to several overlapped leaves. In order to use such a soft decision tree to predict the parameters of the HMM output probability distribution, we derive the smoothest (maximum entropy) distribution which captures all partial first-order moments and a global second-order moment of the training samples. Employing such a soft decision tree architecture with maximum entropy distributions, a novel speech synthesis system is trained using maximum likelihood (ML) parameter re-estimation and synthesis is achieved via maximum output probability parameter generation. In addition, a soft decision tree construction algorithm optimizing a log-likelihood measure
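
    A minimal sketch of soft decision tree prediction as described above: each internal node routes a context vector to both children with sigmoid membership degrees, and the output is the membership-weighted combination over all leaves. The tree structure and weights below are toy values, and training is omitted.

      import numpy as np

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def soft_tree_predict(node, x, membership=1.0):
          if "leaf" in node:                      # leaf: membership-weighted parameter
              return membership * node["leaf"]
          p = sigmoid(node["w"] @ x + node["b"])  # soft split on context features x
          return (soft_tree_predict(node["left"], x, membership * p) +
                  soft_tree_predict(node["right"], x, membership * (1.0 - p)))

      tree = {"w": np.array([1.5, -0.7]), "b": 0.2,
              "left": {"leaf": 120.0},            # e.g., an F0 mean in Hz
              "right": {"w": np.array([0.3, 0.9]), "b": -0.1,
                        "left": {"leaf": 95.0}, "right": {"leaf": 140.0}}}
      f0_prediction = soft_tree_predict(tree, np.array([0.8, -0.2]))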

  12. Concatenated Coding Using Trellis-Coded Modulation

    NASA Technical Reports Server (NTRS)

    Thompson, Michael W.

    1997-01-01

    In the late seventies and early eighties a technique known as Trellis Coded Modulation (TCM) was developed for providing spectrally efficient error correction coding. Instead of adding redundant information in the form of parity bits, redundancy is added at the modulation stage, thereby increasing bandwidth efficiency. A digital communications system can be designed to use bandwidth-efficient multilevel/phase modulation such as Amplitude Shift Keying (ASK), Phase Shift Keying (PSK), Differential Phase Shift Keying (DPSK) or Quadrature Amplitude Modulation (QAM). Performance gain can be achieved by increasing the number of signals over the corresponding uncoded system to compensate for the redundancy introduced by the code. A considerable amount of research and development has been devoted to developing good TCM codes for severely bandlimited applications. More recently, the use of TCM for satellite and deep space communications applications has received increased attention. This report describes the general approach of using a concatenated coding scheme that features TCM and RS coding. Results have indicated that substantial (6-10 dB) performance gains can be achieved with this approach with comparatively little bandwidth expansion. Since all of the bandwidth expansion is due to the RS code, we see that TCM-based concatenated coding results in roughly 10-50% bandwidth expansion, compared to 70-150% expansion for a similar concatenated scheme that uses a convolutional code. We stress that combined coding and modulation optimization is important for achieving performance gains while maintaining spectral efficiency.
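
    A back-of-envelope consistency check of the bandwidth figures quoted above, assuming an RS(255,223) outer code and, for comparison, a rate-1/2 inner convolutional code. Since TCM puts its redundancy in the signal constellation, only the RS overhead expands bandwidth in the TCM-based scheme; these are rough illustrative numbers, not the report's exact accounting.

      rs_n, rs_k = 255, 223
      tcm_expansion = rs_n / rs_k           # RS overhead only: ~1.14
      conv_expansion = (rs_n / rs_k) * 2    # rate-1/2 inner code doubles it: ~2.29
      print(f"TCM+RS:  {100 * (tcm_expansion - 1):.0f}% bandwidth expansion")
      print(f"conv+RS: {100 * (conv_expansion - 1):.0f}% bandwidth expansion")
      # Output: ~14% and ~129%, inside the 10-50% and 70-150% ranges quoted above.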

  13. Thresholds for Universal Concatenated Quantum Codes.

    PubMed

    Chamberland, Christopher; Jochym-O'Connor, Tomas; Laflamme, Raymond

    2016-07-01

    Quantum error correction and fault tolerance make it possible to perform quantum computations in the presence of imprecision and imperfections of realistic devices. An important question is to find the noise rate at which errors can be arbitrarily suppressed. By concatenating the 7-qubit Steane and 15-qubit Reed-Muller codes, the 105-qubit code enables a universal set of fault-tolerant gates despite not all of them being transversal. Importantly, the cnot gate remains transversal in both codes, and as such has increased error protection relative to the other single qubit logical gates. We show that while the level-1 pseudothreshold for the concatenated scheme is limited by the logical Hadamard gate, the error suppression of the logical cnot gates allows for the asymptotic threshold to increase by orders of magnitude at higher levels. We establish a lower bound of 1.28×10^{-3} for the asymptotic threshold of this code, which is competitive with known concatenated models and does not rely on ancillary magic state preparation for universal computation. PMID:27419549

  15. Alveolate phylogeny inferred using concatenated ribosomal proteins.

    PubMed

    Bachvaroff, Tsvetan R; Handy, Sara M; Place, Allen R; Delwiche, Charles F

    2011-01-01

    Dinoflagellates and apicomplexans are a strongly supported monophyletic group in rDNA phylogenies, although this phylogeny is not without controversy, particularly between the two groups. Here we use concatenated protein-coding genes from expressed sequence tags or genomic data to construct phylogenies including "typical" dinophycean dinoflagellates, a parasitic syndinian dinoflagellate, Amoebophrya sp., and two related species, Oxyrrhis marina, and Perkinsus marinus. Seventeen genes encoding proteins associated with the ribosome were selected for phylogenetic analysis. The dataset was limited for the most part by data availability from the dinoflagellates. Forty-five taxa from four major lineages were used: the heterokont outgroup, ciliates, dinoflagellates, and apicomplexans. Amoebophrya sp. was included in this phylogeny as a sole representative of the enigmatic marine alveolate or syndinian lineage. The atypical dinoflagellate O. marina, usually excluded from rDNA analyses due to long branches, was also included. The resulting phylogenies were well supported in concatenated analyses with only a few unstable or weakly supported branches; most features were consistent when different lineages were pruned from the tree or different genes were concatenated. The least stable branches involved the placement of Cryptosporidium spp. within the Apicomplexa and the relationships between P. marinus, Amoebophrya sp., and O. marina. Both bootstrap and approximately unbiased test results confirmed that P. marinus, Amoebophrya sp., O. marina, and the remaining dinoflagellates form a monophyletic lineage to the exclusion of Apicomplexa. PMID:21518081

  16. The Compensatory Effectiveness of Optical Character Recognition/Speech Synthesis on Reading Comprehension of Postsecondary Students with Learning Disabilities.

    ERIC Educational Resources Information Center

    Higgins, Eleanor L.; Raskind, Marshall H.

    1997-01-01

    Thirty-seven college students with learning disabilities were given a reading comprehension task under the following conditions: (1) using an optical character recognition/speech synthesis system; (2) having the text read aloud by a human reader; or (3) reading silently without assistance. Findings indicated that the greater the disability, the…

  17. A concatenated coding scheme for error control

    NASA Technical Reports Server (NTRS)

    Lin, S.

    1985-01-01

    A concatenated coding scheme for error control in data communications is analyzed. The inner code is used for both error correction and detection, whereas the outer code is used only for error detection. A retransmission is requested if either the inner code decoder fails to make a successful decoding or the outer code decoder detects the presence of errors after the inner code decoding. The probability of undetected error of the proposed scheme is derived. An efficient method for computing this probability is presented. The throughput efficiency of the proposed error control scheme, incorporated with a selective-repeat ARQ retransmission strategy, is analyzed.

  18. (abstract) Synthesis of Speaker Facial Movements to Match Selected Speech Sequences

    NASA Technical Reports Server (NTRS)

    Scott, Kenneth C.

    1994-01-01

    We are developing a system for synthesizing image sequences that simulate the facial motion of a speaker. To perform this synthesis, we are pursuing two major areas of effort. We are developing the necessary computer graphics technology to synthesize a realistic image sequence of a person speaking selected speech sequences, and we are developing a model that expresses the relation between spoken phonemes and face/mouth shape. A subject is videotaped speaking an arbitrary text that contains expressions of the full list of desired database phonemes. The subject is videotaped from the front speaking normally, recording both audio and video detail simultaneously. Using the audio track, we identify the specific video frames on the tape relating to each spoken phoneme. From this range we digitize the video frame which represents the extreme of mouth motion/shape. Thus, we construct a database of images of face/mouth shape related to spoken phonemes. A selected audio speech sequence is recorded which is the basis for synthesizing a matching video sequence; the speaker need not be the same as used for constructing the database. The audio sequence is analyzed to determine the spoken phoneme sequence and the relative timing of the enunciation of those phonemes. Synthesizing an image sequence corresponding to the spoken phoneme sequence is accomplished using a graphics technique known as morphing. Image sequence keyframes necessary for this processing are based on the spoken phoneme sequence and timing. We have been successful in synthesizing the facial motion of a native English speaker for a small set of arbitrary speech segments. Our future work will focus on advancement of the face shape/phoneme model and independent control of facial features.

  19. Estimating speech spectra for copy synthesis by linear prediction and by hand

    PubMed Central

    Remez, Robert E.; Dubowski, Kathryn R.; Davids, Morgana L.; Thomas, Emily F.; Paddu, Nina U.; Grossman, Yael S.; Moskalenko, Marina

    2011-01-01

    Linear prediction is a widely available technique for analyzing acoustic properties of speech, although this method is known to be error-prone. New tests assessed the adequacy of linear prediction estimates by using this method to derive synthesis parameters and testing the intelligibility of the synthetic speech that results. Matched sets of sine-wave sentences were created, one set using uncorrected linear prediction estimates of natural sentences, the other using estimates made by hand. Phoneme restrictions imposed on linguistic properties allowed comparisons between continuous and intermittent voicing, oral or nasal and fricative manner, and unrestricted phonemic variation. Intelligibility tests revealed uniformly good performance with sentences created by hand-estimation and a minimal decrease in intelligibility with estimation by linear prediction due to manner variation with continuous voicing. Poorer performance was observed when linear prediction estimates were used to produce synthetic versions of phonemically unrestricted sentences, but no similar decline was observed with synthetic sentences produced by hand estimation. The results show a substantial intelligibility cost of reliance on uncorrected linear prediction estimates when phonemic variation approaches natural incidence. PMID:21973371
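
    For context, a common way to obtain automatic formant estimates from linear prediction (the kind of uncorrected estimate the study evaluates) is to take the angles of the LP-polynomial roots near the unit circle. The coefficients below are illustrative, not values from the study.

      import numpy as np

      def lpc_formants(a, fs):
          """Map complex roots of A(z) to candidate formant frequencies in Hz."""
          roots = np.roots(a)
          roots = roots[np.imag(roots) > 0]       # keep one of each conjugate pair
          freqs = np.angle(roots) * fs / (2 * np.pi)
          return np.sort(freqs)

      a = [1.0, -2.1, 2.4, -1.6, 0.6]             # illustrative LP coefficients
      print(lpc_formants(a, fs=10000.0))          # formant candidates (Hz)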

  20. Stereotaxy, navigation and the temporal concatenation.

    PubMed

    Apuzzo, M L; Chen, J C

    1999-01-01

    Nautical and cerebral navigation share similar elements of functional need and similar developmental pathways. The need for orientation necessitates the development of appropriate concepts, and such concepts are dependent on technology for practical realization. Occasionally, a concept precedes technology in time and requires a period of delay for appropriate development. A temporal concatenation exists in which, over time, need, concept, and technology combine to provide an endpoint of elegant solution. Nautical navigation has proceeded through periods of dead reckoning and celestial navigation to satellite orientation, with associated refinements of instrumentation and charts for guidance. Cerebral navigation has progressed from craniometric orientation and burr-hole-mounted guidance systems, to simple rectilinear and arc-centered devices based on radiographs, to guidance by complex anatomical and functional maps provided as an amalgam of modern imaging modes. These maps are now augmented by complex frame and frameless systems which allow not only precise orientation but also point and volumetric action. These complex technical modalities were required by, and developed in part from, elements of maritime navigation that have been translated to cerebral navigation in a temporal concatenation. PMID:10853057

  1. An Interactive Concatenated Turbo Coding System

    NASA Technical Reports Server (NTRS)

    Liu, Ye; Tang, Heng; Lin, Shu; Fossorier, Marc

    1999-01-01

    This paper presents a concatenated turbo coding system in which a Reed-Solomon outer code is concatenated with a binary turbo inner code. In the proposed system, the outer code decoder and the inner turbo code decoder interact to achieve both good bit error and frame error performance. The outer code decoder helps the inner turbo code decoder to terminate its decoding iterations, while the inner turbo code decoder provides soft-output information to the outer code decoder to carry out reliability-based soft-decision decoding. In the case that the outer code decoding fails, the outer code decoder instructs the inner code decoder to continue its decoding iterations until the outer code decoding is successful or a preset maximum number of decoding iterations is reached. This interaction between the outer and inner code decoders reduces decoding delay. Also presented in the paper are an effective criterion for stopping the iteration process of the inner code decoder and a new reliability-based decoding algorithm for nonbinary codes.
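
    The interaction described above reduces to a simple control loop: run inner turbo iterations one at a time, attempt outer RS decoding after each, and stop early on success. The sketch below captures only that control flow; the decoder internals are replaced by toy stubs and are not a real decoder API.

      def turbo_iteration(state):
          state["quality"] += 1           # toy stand-in for one inner turbo iteration
          return state

      def rs_decode_ok(state):
          return state["quality"] >= 4    # toy stand-in for outer RS decoding success

      def decode_frame(max_iters=10):
          state = {"quality": 0}
          for it in range(1, max_iters + 1):
              turbo_iteration(state)
              if rs_decode_ok(state):     # outer decoder stops the inner iterations
                  return f"decoded after {it} iterations"
          return "frame erased"           # decoding failure after max_iters

      print(decode_frame())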

  2. Performance of concatenated Reed-Solomon trellis-coded modulation over Rician fading channels

    NASA Technical Reports Server (NTRS)

    Moher, Michael L.; Lodge, John H.

    1990-01-01

    A concatenated coding scheme for providing very reliable data over mobile-satellite channels at power levels similar to those used for vocoded speech is described. The outer code is a shortened Reed-Solomon code which provides error detection as well as error correction capabilities. The inner code is a 1-D 8-state trellis code applied independently to both the in-phase and quadrature channels. To achieve the full error correction potential of this inner code, the code symbols are multiplexed with a pilot sequence which is used to provide dynamic channel estimation and coherent detection. The implementation structure of this scheme is discussed and its performance is estimated.

  3. Effects of prosodic factors on spectral dynamics. II. Synthesis

    NASA Astrophysics Data System (ADS)

    Wouters, Johan; Macon, Michael W.

    2002-01-01

    In Paper I [J. Wouters and M. Macon, J. Acoust. Soc. Am. 111, 417-427 (2002)], the effects of prosodic factors on the spectral rate of change of phoneme transitions were analyzed for a balanced speech corpus. The results showed that the spectral rate of change, defined as the root-mean-square of the first three formant slopes, increased with linguistic prominence, i.e., in stressed syllables, in accented words, in sentence-medial words, and in clearly articulated speech. In the present paper, an initial approach is described to integrate the results of Paper I in a concatenative synthesis framework. The target spectral rate of change of acoustic units is predicted based on the prosodic structure of utterances to be synthesized. Then, the spectral shape of the acoustic units is modified according to the predicted spectral rate of change. Experiments show that the proposed approach provides control over the degree of articulation of acoustic units, and improves the naturalness and intelligibility of concatenated speech in comparison to standard concatenation methods.
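
    Following Paper I's definition, the spectral rate of change can be computed as the root-mean-square of the slopes of the first three formant tracks. A short sketch; the sampled formant tracks (Hz) and frame step are illustrative.

      import numpy as np

      def spectral_rate_of_change(f1, f2, f3, dt):
          """RMS of the first three formant slopes, per frame (Hz/s)."""
          slopes = [np.gradient(f, dt) for f in (f1, f2, f3)]
          return np.sqrt(np.mean(np.square(slopes), axis=0))

      dt = 0.005                                    # 5 ms frame step
      f1 = np.array([450.0, 480.0, 540.0, 600.0])
      f2 = np.array([1400.0, 1500.0, 1650.0, 1800.0])
      f3 = np.array([2500.0, 2520.0, 2560.0, 2600.0])
      print(spectral_rate_of_change(f1, f2, f3, dt))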

  4. Advancements in text-to-speech technology and implications for AAC applications

    NASA Astrophysics Data System (ADS)

    Syrdal, Ann K.

    2003-10-01

    Intelligibility was the initial focus in text-to-speech (TTS) research, since it is clearly a necessary condition for the application of the technology. Sufficiently high intelligibility (approximating human speech) has been achieved in the last decade by the better formant-based and concatenative TTS systems. This led to commercially available TTS systems for highly motivated users, particularly the blind and vocally impaired. Some unnatural qualities of TTS were exploited by these users, such as very fast speaking rates and altered pitch ranges for flagging relevant information. Recently, the focus in TTS research has turned to improving naturalness, so that synthetic speech sounds more human and less robotic. Unit selection approaches to concatenative synthesis have dramatically improved TTS quality, although at the cost of larger and more complex systems. This advancement in naturalness has made TTS technology more acceptable to the general public. The vocally impaired appreciate a more natural voice with which to represent themselves when communicating with others. Unit selection TTS does not achieve such high speaking rates as the earlier TTS systems, however, which is a disadvantage to some AAC device users. An important new research emphasis is to improve and increase the range of emotional expressiveness of TTS.

  5. Speech research directions

    SciTech Connect

    Atal, B.S.; Rabiner, L.R.

    1986-09-01

    This paper presents an overview of the current activities in speech research. The authors discuss the state of the art in speech coding, text-to-speech synthesis, speech recognition, and speaker recognition. In the speech coding area, current algorithms perform well at bit rates down to 9.6 kb/s, and the research is directed at bringing the rate for high-quality speech coding down to 2.4 kb/s. In text-to-speech synthesis, what we currently are able to produce is very intelligible but not yet completely natural. Current research aims at providing higher quality and intelligibility to the synthetic speech that these systems produce. Finally, today's systems for speech and speaker recognition provide excellent performance on limited tasks; i.e., limited vocabulary, modest syntax, small talker populations, constrained inputs, etc.

  6. A concatenated coding scheme for error control

    NASA Technical Reports Server (NTRS)

    Lin, S.

    1985-01-01

    A concatenated coding scheme for error control in data communications is analyzed. The inner code is used for both error correction and detection, whereas the outer code is used only for error detection. A retransmission is requested if the outer code detects the presence of errors after the inner code decoding. The probability of undetected error of the above error control scheme is derived and upper bounded. Two specific examples are analyzed. In the first example, the inner code is a distance-4 shortened Hamming code with generator polynomial (X+1)(X^6+X+1) = X^7+X^6+X^2+1 and the outer code is a distance-4 shortened Hamming code with generator polynomial (X+1)(X^15+X^14+X^13+X^12+X^4+X^3+X^2+X+1) = X^16+X^12+X^5+1, which is the X.25 standard for packet-switched data networks. This example is proposed for error control on NASA telecommand links. In the second example, the inner code is the same as that in the first example but the outer code is a shortened Reed-Solomon code with symbols from GF(2^8) and generator polynomial (X+1)(X+α), where α is a primitive element in GF(2^8).
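
    The outer code here is used purely for error detection, i.e. CRC-style checking by polynomial division over GF(2). A bit-level sketch of systematic encoding with the X.25 generator X^16+X^12+X^5+1 quoted above (a reference form, not the table-driven implementation used in practice):

      def crc_remainder(bits, poly=0b10001000000100001, width=16):
          """Remainder of GF(2) division of bits * X^width by the generator."""
          reg = 0
          for b in bits + [0] * width:        # message followed by zero padding
              reg = (reg << 1) | b
              if reg >> width:                # leading coefficient set:
                  reg ^= poly                 # subtract (XOR) the generator
          return reg & ((1 << width) - 1)

      message = [1, 0, 1, 1, 0, 0, 1, 0]
      parity = crc_remainder(message)         # 16 parity bits appended to the frame
      print(f"parity = {parity:016b}")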

  7. The Neural Basis of Speech Parsing in Children and Adults

    ERIC Educational Resources Information Center

    McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

    2010-01-01

    Word segmentation, detecting word boundaries in continuous speech, is a fundamental aspect of language learning that can occur solely by the computation of statistical and speech cues. Fifty-four children underwent functional magnetic resonance imaging (fMRI) while listening to three streams of concatenated syllables that contained either high…

  8. Performance Bounds on Two Concatenated, Interleaved Codes

    NASA Technical Reports Server (NTRS)

    Moision, Bruce; Dolinar, Samuel

    2010-01-01

    A method has been developed of computing bounds on the performance of a code comprised of two linear binary codes generated by two encoders serially concatenated through an interleaver. Originally intended for use in evaluating the performances of some codes proposed for deep-space communication links, the method can also be used in evaluating the performances of short-block-length codes in other applications. The method applies, more specifically, to a communication system in which the following processes take place: At the transmitter, the original binary information that one seeks to transmit is first processed by an encoder into an outer code (Co) characterized by, among other things, a pair of numbers (n,k), where n (n > k) is the total number of code bits associated with k information bits and n - k bits are used for correcting or at least detecting errors. Next, the outer code is processed through either a block or a convolutional interleaver. In the block interleaver, the words of the outer code are processed in blocks of I words. In the convolutional interleaver, the interleaving operation is performed bit-wise in N rows with delays that are multiples of B bits. The output of the interleaver is processed through a second encoder to obtain an inner code (Ci) characterized by (ni,ki). The output of the inner code is transmitted over an additive white Gaussian noise channel characterized by a symbol signal-to-noise ratio (SNR) Es/No and a bit SNR Eb/No. At the receiver, an inner decoder generates estimates of bits. Depending on whether a block or a convolutional interleaver is used at the transmitter, the sequence of estimated bits is processed through a block or a convolutional de-interleaver, respectively, to obtain estimates of code words. Then the estimates of the code words are processed through an outer decoder, which generates estimates of the original information along with flags indicating which estimates are presumed to be correct and which are found to
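
    A sketch of the convolutional interleaver described above: a commutator cycles through N rows of FIFO delay lines whose lengths are successive multiples of B symbols (row 0 passes straight through). The values of N and B here are illustrative.

      from collections import deque

      def convolutional_interleave(symbols, N=3, B=2):
          # Row i is a FIFO of length i*B, initially filled with zeros.
          rows = [deque([0] * (i * B)) for i in range(N)]
          out = []
          for t, s in enumerate(symbols):
              row = rows[t % N]               # commutator cycles through the rows
              if len(row) == 0:
                  out.append(s)               # row 0: no delay
              else:
                  row.append(s)
                  out.append(row.popleft())   # emit the symbol delayed by i*B slots
          return out

      print(convolutional_interleave(list(range(12))))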

  9. Speech processing using maximum likelihood continuity mapping

    SciTech Connect

    Hogden, John E.

    2000-01-01

    A speech processing method is described that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator positions is described. The method for learning the mapping between static speech sounds and pseudo-articulator positions uses a set of training data composed only of speech sounds. This speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.
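
    As a rough illustration of the continuity idea (not the patented algorithm itself), the sketch below recovers a smooth pseudo-articulator track from noisy frame-wise position estimates by penalizing frame-to-frame jumps; the penalty weight lam is an assumed free parameter.

```python
import numpy as np

# Toy continuity sketch (a stand-in, not the patented method): given noisy
# per-frame estimates y_t of a 1-D pseudo-articulator position, recover a
# smooth track x by minimizing
#     sum_t (x_t - y_t)^2  +  lam * sum_t (x_{t+1} - x_t)^2,
# whose minimizer solves (I + lam * L) x = y with L the path-graph Laplacian.

def smooth_path(y: np.ndarray, lam: float = 5.0) -> np.ndarray:
    T = len(y)
    L = np.zeros((T, T))
    for t in range(T - 1):
        L[t, t] += 1.0
        L[t + 1, t + 1] += 1.0
        L[t, t + 1] -= 1.0
        L[t + 1, t] -= 1.0
    return np.linalg.solve(np.eye(T) + lam * L, y)

y = np.array([0.0, 1.0, 0.2, 1.1, 0.1, 1.0])  # jittery frame-wise estimates
print(smooth_path(y))                          # much smoother trajectory
```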

  10. Speech processing using maximum likelihood continuity mapping

    SciTech Connect

    Hogden, J.E.

    2000-04-18

    Speech processing is obtained that, given a probabilistic mapping between static speech sounds and pseudo-articulator positions, allows sequences of speech sounds to be mapped to smooth sequences of pseudo-articulator positions. In addition, a method for learning a probabilistic mapping between static speech sounds and pseudo-articulator position is described. The method for learning the mapping between static speech sounds and pseudo-articulator position uses a set of training data composed only of speech sounds. The said speech processing can be applied to various speech analysis tasks, including speech recognition, speaker recognition, speech coding, speech synthesis, and voice mimicry.

  11. Automated recognition of helium speech. Phase I: Investigation of microprocessor based analysis/synthesis system

    NASA Astrophysics Data System (ADS)

    Jelinek, H. J.

    1986-01-01

    This is the Final Report of Electronic Design Associates on its Phase I SBIR project. The purpose of this project is to develop a method for correcting helium speech, as experienced in diver-surface communication. The goal of the Phase I study was to design, prototype, and evaluate a real time helium speech corrector system based upon digital signal processing techniques. The general approach was to develop hardware (an IBM PC board) to digitize helium speech and software (a LAMBDA computer based simulation) to translate the speech. As planned in the study proposal, this initial prototype may now be used to assess expected performance from a self contained real time system which uses an identical algorithm. The Final Report details the work carried out to produce the prototype system. Four major project tasks were carried out: (1) a signal processing scheme for converting helium speech to normal sounding speech was generated; (2) the signal processing scheme was simulated on a general purpose (LAMBDA) computer, actual helium speech was supplied to the simulation, and the converted speech was generated; (3) an IBM-PC based 14 bit data input/output board was designed and built; and (4) a bibliography of references on speech processing was generated.

  12. Speech input and output

    NASA Astrophysics Data System (ADS)

    Class, F.; Mangold, H.; Stall, D.; Zelinski, R.

    1981-12-01

    Possibilities for acoustical dialogs with electronic data processing equipment were investigated. Speech recognition is posed as recognizing word groups. An economical, multistage classifier for word string segmentation is presented and its reliability in dealing with continuous speech (problems of temporal normalization and context) is discussed. Speech synthesis is considered in terms of German linguistics and phonetics. Preprocessing algorithms for total synthesis of written texts were developed. A macrolanguage, MUSTER, is used to implement this processing in an acoustic data information system (ADES).

  13. Prosody Production and Perception with Conversational Speech

    ERIC Educational Resources Information Center

    Mo, Yoonsook

    2010-01-01

    Speech utterances are more than the linear concatenation of individual phonemes or words. They are organized by prosodic structures comprising phonological units of different sizes (e.g., syllable, foot, word, and phrase) and the prominence relations among them. As the linguistic structure of spoken languages, prosody serves an important function…

  14. A Wireless Brain-Machine Interface for Real-Time Speech Synthesis

    PubMed Central

    Guenther, Frank H.; Brumberg, Jonathan S.; Wright, E. Joseph; Nieto-Castanon, Alfonso; Tourville, Jason A.; Panko, Mikhail; Law, Robert; Siebert, Steven A.; Bartels, Jess L.; Andreasen, Dinal S.; Ehirim, Princewill; Mao, Hui; Kennedy, Philip R.

    2009-01-01

    Background Brain-machine interfaces (BMIs) involving electrodes implanted into the human cerebral cortex have recently been developed in an attempt to restore function to profoundly paralyzed individuals. Current BMIs for restoring communication can provide important capabilities via a typing process, but unfortunately they are only capable of slow communication rates. In the current study we use a novel approach to speech restoration in which we decode continuous auditory parameters for a real-time speech synthesizer from neuronal activity in motor cortex during attempted speech. Methodology/Principal Findings Neural signals recorded by a Neurotrophic Electrode implanted in a speech-related region of the left precentral gyrus of a human volunteer suffering from locked-in syndrome, characterized by near-total paralysis with spared cognition, were transmitted wirelessly across the scalp and used to drive a speech synthesizer. A Kalman filter-based decoder translated the neural signals generated during attempted speech into continuous parameters for controlling a synthesizer that provided immediate (within 50 ms) auditory feedback of the decoded sound. Accuracy of the volunteer's vowel productions with the synthesizer improved quickly with practice, with a 25% improvement in average hit rate (from 45% to 70%) and 46% decrease in average endpoint error from the first to the last block of a three-vowel task. Conclusions/Significance Our results support the feasibility of neural prostheses that may have the potential to provide near-conversational synthetic speech output for individuals with severely impaired speech motor control. They also provide an initial glimpse into the functional properties of neurons in speech motor cortical areas. PMID:20011034
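
    A minimal sketch of the kind of Kalman-filter decoding loop described above, mapping neural firing-rate observations to a continuous two-dimensional formant state (F1, F2). All matrices, sizes, and noise levels are illustrative assumptions, not the published decoder.

```python
import numpy as np

def kalman_step(x, P, z, A, W, H, Q):
    """One predict/update cycle for state x with covariance P, observation z."""
    x_pred = A @ x                         # predict state (random-walk model)
    P_pred = A @ P @ A.T + W               # predict covariance
    S = H @ P_pred @ H.T + Q               # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)  # correct with the observation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

rng = np.random.default_rng(0)
n_state, n_units = 2, 8                      # (F1, F2) state; 8 recorded units
A = np.eye(n_state)                          # assumed random-walk dynamics
W = 0.01 * np.eye(n_state)
H = rng.standard_normal((n_units, n_state))  # assumed tuning of units to formants
Q = 0.1 * np.eye(n_units)

x, P = np.zeros(n_state), np.eye(n_state)
target = np.array([0.5, 1.5])                # attempted vowel, arbitrary units
for _ in range(50):                          # stream of noisy firing-rate frames
    z = H @ target + 0.3 * rng.standard_normal(n_units)
    x, P = kalman_step(x, P, z, A, W, H, Q)
print(x)                                     # settles near the attempted target
```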

  15. Multilevel Analysis in Analyzing Speech Data

    ERIC Educational Resources Information Center

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  16. Hardware Implementation of Serially Concatenated PPM Decoder

    NASA Technical Reports Server (NTRS)

    Moision, Bruce; Hamkins, Jon; Barsoum, Maged; Cheng, Michael; Nakashima, Michael

    2009-01-01

    A prototype decoder for a serially concatenated pulse position modulation (SCPPM) code has been implemented in a field-programmable gate array (FPGA). At the time of this reporting, this is the first known hardware SCPPM decoder. The SCPPM coding scheme, conceived for free-space optical communications with both deep-space and terrestrial applications in mind, is an improvement of several dB over the conventional Reed-Solomon PPM scheme. The design of the FPGA SCPPM decoder is based on a turbo decoding algorithm that requires relatively low computational complexity while delivering error-rate performance within approximately 1 dB of channel capacity. The SCPPM encoder consists of an outer convolutional encoder, an interleaver, an accumulator, and an inner modulation encoder (more precisely, a mapping of bits to PPM symbols). Each code is describable by a trellis (a finite directed graph). The SCPPM decoder consists of an inner soft-in-soft-out (SISO) module, a de-interleaver, an outer SISO module, and an interleaver connected in a loop (see figure). Each SISO module applies the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm to compute a-posteriori bit log-likelihood ratios (LLRs) from apriori LLRs by traversing the code trellis in forward and backward directions. The SISO modules iteratively refine the LLRs by passing the estimates between one another much like the working of a turbine engine. Extrinsic information (the difference between the a-posteriori and a-priori LLRs) is exchanged rather than the a-posteriori LLRs to minimize undesired feedback. All computations are performed in the logarithmic domain, wherein multiplications are translated into additions, thereby reducing complexity and sensitivity to fixed-point implementation roundoff errors. To lower the required memory for storing channel likelihood data and the amounts of data transfer between the decoder and the receiver, one can discard the majority of channel likelihoods, using only the remainder in
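
    The log-domain trick mentioned above rests on the Jacobian logarithm: sums of probabilities become the "max*" operation, a max plus a small correction term. A minimal sketch:

```python
import math

def max_star(a: float, b: float) -> float:
    """max*(a, b) = ln(e^a + e^b) = max(a, b) + ln(1 + e^-|a - b|)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

# Combining two path metrics: the correction term keeps the result exact,
# while plain max() would give the cheaper "max-log" approximation.
print(max_star(1.0, 0.5), math.log(math.exp(1.0) + math.exp(0.5)))  # identical
```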

  17. Decoding of QOSTBC concatenates RS code using parallel interference cancellation

    NASA Astrophysics Data System (ADS)

    Yan, Zhenghang; Lu, Yilong; Ma, Maode; Yang, Yuhang

    2010-02-01

    Compared with orthogonal space time block code (OSTBC), quasi orthogonal space time block code (QOSTBC) can achieve a high transmission rate with partial diversity. In this paper, we present a QOSTBC concatenated Reed-Solomon (RS) error correction code structure. At the receiver, pairwise detection and error correction are first implemented. The decoded data are regrouped. Parallel interference cancellation (PIC) and dual orthogonal space time block code (OSTBC) maximum likelihood decoding are applied to the regrouped data. The pure concatenated scheme is shown to have a higher diversity order and better error performance in high signal-to-noise ratio (SNR) scenarios than both the QOSTBC and OSTBC schemes. The PIC and dual OSTBC decoding algorithm obtains a further gain of more than 1.3 dB over the pure concatenated scheme at a bit error probability of 10^-6.

  18. Performance of concatenated Reed-Solomon/Viterbi channel coding

    NASA Technical Reports Server (NTRS)

    Divsalar, D.; Yuen, J. H.

    1982-01-01

    The concatenated Reed-Solomon (RS)/Viterbi coding system is reviewed. The performance of the system is analyzed and results are derived with a simple new approach. A functional model for the input RS symbol error probability is presented. Based on this new functional model, we compute the performance of a concatenated system in terms of RS word error probability, output RS symbol error probability, bit error probability due to decoding failure, and bit error probability due to decoding error. Finally, we analyze the effects of a noisy carrier reference and slow fading on system performance.
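
    The RS word error probability follows directly from the input symbol error probability when symbol errors are made independent by ideal interleaving; a short sketch, with the deep-space (255, 223) code used purely as an example:

```python
from math import comb

# An (n, k) Reed-Solomon code corrects up to t symbol errors, so with
# independent symbol error probability p the decoded word fails when more
# than t of the n symbols are wrong.

def rs_word_error_prob(n: int, t: int, p: float) -> float:
    """P(word error) = sum_{i > t} C(n, i) p^i (1 - p)^(n - i)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(t + 1, n + 1))

# Example: the (255, 223) RS code corrects t = 16 symbol errors per word.
print(rs_word_error_prob(255, 16, 0.01))  # ~1e-9 at a 1% symbol error rate
```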

  19. Performance of DBS-Radio using concatenated coding and equalization

    NASA Technical Reports Server (NTRS)

    Gevargiz, J.; Bell, D.; Truong, L.; Vaisnys, A.; Suwitra, K.; Henson, P.

    1995-01-01

    The Direct Broadcast Satellite-Radio (DBS-R) receiver is being developed for operation in a multipath Rayleigh channel. This receiver uses equalization and concatenated coding, in addition to open loop and closed loop architectures for carrier demodulation and symbol synchronization. Performance test results of this receiver are presented in both AWGN and multipath Rayleigh channels. Simulation results show that the performance of the receiver operating in a multipath Rayleigh channel is significantly improved by using equalization. These results show that fractional-symbol equalization offers a performance advantage over full symbol equalization. Also presented is the baseline performance of the DBS-R receiver using concatenated coding and interleaving.

  20. Algorithmization and programming of operator activities for parametric synthesis of speech using the CT-1 system controlled by the MERA 303 minicomputer

    NASA Astrophysics Data System (ADS)

    Ciarkowski, A.

    1983-08-01

    The System of Programmed Synthesizer Service is proposed as a means of automating operator functions performed during parametric speech synthesis with the synthesizing computer system COMPUTALKER CT-1. The characteristics of the proposed system are described, with emphasis on the collaboration between the synthesizer and other internal devices within the computer system. The various functions of the system are shown in the form of corresponding block programs. A complete evaluation of the system will be possible only after long-term synthetic speech experiments are concluded.

  1. Quantum fault-tolerant thresholds for universal concatenated schemes

    NASA Astrophysics Data System (ADS)

    Chamberland, Christopher; Jochym-O'Connor, Tomas; Laflamme, Raymond

    Fault-tolerant quantum computation uses ancillary qubits in order to protect logical data qubits while allowing for the manipulation of the quantum information without severe losses in coherence. While different models for fault-tolerant quantum computation exist, determining the ancillary qubit overhead for competing schemes remains a challenging theoretical problem. In this work, we study the fault-tolerance threshold rates of different models for universal fault-tolerant quantum computation. Namely, we provide different threshold rates for the 105-qubit concatenated coding scheme for universal computation without the need for state distillation. We study two error models, adversarial noise and depolarizing noise, and provide lower bounds for the threshold in each of these error regimes. Establishing the threshold rates for the concatenated coding scheme will allow for a physical quantum resource comparison between our fault-tolerant universal quantum computation model and the traditional model using magic state distillation.

  2. Speech coding

    NASA Astrophysics Data System (ADS)

    Gersho, Allen

    1990-05-01

    Recent advances in algorithms and techniques for speech coding now permit high quality voice reproduction at remarkably low bit rates. The advent of powerful single-chip signal processors has made it cost effective to implement these new and sophisticated speech coding algorithms for many important applications in voice communication and storage. Some of the main ideas underlying the algorithms of major interest today are reviewed. The concept of removing redundancy by linear prediction is reviewed, first in the context of predictive quantization or DPCM. Then linear predictive coding, adaptive predictive coding, and vector quantization are discussed. The concepts of excitation coding via analysis-by-synthesis, vector sum excitation codebooks, and adaptive postfiltering are explained. The main ideas of vector excitation coding (VXC), or code excited linear prediction (CELP), are presented. Finally, low-delay VXC coding and phonetic segmentation for VXC are described.
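
    The linear-prediction step underlying several of the coders surveyed above can be sketched in a few lines: estimate autocorrelations from a frame of speech, then solve the Toeplitz normal equations with the Levinson-Durbin recursion. This is a generic textbook sketch, not code from the article.

```python
import numpy as np

def levinson_durbin(r: np.ndarray, order: int) -> np.ndarray:
    """Solve the Toeplitz normal equations; r[0..order] are autocorrelations."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -(r[i] + a[1:i] @ r[i - 1:0:-1]) / err  # reflection coefficient
        a[1:i + 1] += k * a[i - 1::-1][:i]          # update a_1..a_i
        err *= 1 - k * k                            # shrink prediction error
    return a

# A synthetic "speech" frame: a slow oscillation plus a little noise.
x = np.sin(0.3 * np.arange(200)) + 0.05 * np.random.default_rng(1).standard_normal(200)
order = 4
r = np.array([x[:len(x) - m] @ x[m:] for m in range(order + 1)])
print(levinson_durbin(r, order))   # short-term predictor coefficients
```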

  3. Stability of neuronal pulses composed of concatenated unstable kinks

    NASA Astrophysics Data System (ADS)

    Romeo, Mónica M.; Jones, Christopher K.

    2001-01-01

    We demonstrate that a traveling pulse solution, emerging from the concatenation of two unstable kinks, can be stable. By means of stability analysis and numerical simulations, we show the stability of neuronal pulses (action potentials) with increasing refractory periods, which decompose into two (radiationally) unstable kinks in the limit. These action potentials are solutions of an ultrarefractory version of the FitzHugh-Nagumo system.

  4. Shor-Preskill-type security proof for concatenated Bennett-Brassard 1984 quantum-key-distribution protocol

    SciTech Connect

    Hwang, Won-Young; Matsumoto, Keiji; Imai, Hiroshi; Kim, Jaewan; Lee, Hai-Woong

    2003-02-01

    We discuss a long code problem in the Bennett-Brassard 1984 (BB84) quantum-key-distribution protocol and describe how it can be overcome by concatenation of the protocol. Observing that the concatenated modified Lo-Chau protocol finally reduces to the concatenated BB84 protocol, we give the unconditional security of the concatenated BB84 protocol.

  5. THE COMPREHENSION OF RAPID SPEECH BY THE BLIND, PART III.

    ERIC Educational Resources Information Center

    FOULKE, EMERSON

    A REVIEW OF THE RESEARCH ON THE COMPREHENSION OF RAPID SPEECH BY THE BLIND IDENTIFIES FIVE METHODS OF SPEECH COMPRESSION--SPEECH CHANGING, ELECTROMECHANICAL SAMPLING, COMPUTER SAMPLING, SPEECH SYNTHESIS, AND FREQUENCY DIVIDING WITH THE HARMONIC COMPRESSOR. THE SPEECH CHANGING AND ELECTROMECHANICAL SAMPLING METHODS AND THE NECESSARY APPARATUS HAVE…

  6. Improving concatenated coding communications by employing signal editing techniques

    NASA Astrophysics Data System (ADS)

    Ng, W. H.; Ungar, J. L.

    1993-04-01

    Signal editing is a technique used to locate and erase unreliable data before error correction decoding. Consider a concatenated coding (CC) communication system in which the inner code employs convolutional encoding with Viterbi decoding and the outer code could employ either a convolutional or a Reed-Solomon code. In this study, we show that useful information can be derived from the inner Viterbi decoding process to perform two special operations: to locate and erase unreliable decoded data and to estimate the input channel noise level. As a result, the number of errors input to the outer decoder is reduced and the overall CC system performance is improved.
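
    A toy version of the editing step described above: given the inner decoder's hard decisions and some per-bit reliability measure (taken here, as an assumption, to be |LLR|-like magnitudes), bits below a threshold are erased before outer decoding.

```python
# Erase unreliable inner-decoder outputs so the outer decoder sees erasures
# rather than hard errors. Threshold and reliabilities are illustrative.

ERASED = None

def edit_symbols(bits, reliabilities, threshold=0.5):
    """Replace unreliable hard decisions with erasure marks."""
    return [b if r >= threshold else ERASED
            for b, r in zip(bits, reliabilities)]

decoded = [1, 0, 0, 1, 1, 0]
relias  = [2.1, 0.2, 1.7, 0.4, 3.0, 1.1]
print(edit_symbols(decoded, relias))  # [1, None, 0, None, 1, 0]
```

    Since a Reed-Solomon outer decoder can correct e errors and s erasures whenever 2e + s <= d - 1, each successful erasure costs only half as much of the distance budget as an undetected error, which is where the overall improvement comes from.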

  7. Bounds on Block Error Probability for Multilevel Concatenated Codes

    NASA Technical Reports Server (NTRS)

    Lin, Shu; Moorthy, Hari T.; Stojanovic, Diana

    1996-01-01

    Maximum likelihood decoding of long block codes is not feasible due to large complexity. Some classes of codes are shown to be decomposable into multilevel concatenated codes (MLCC). For these codes, multistage decoding provides a good trade-off between performance and complexity. In this paper, we derive an upper bound on the probability of block error for MLCC. We use this bound to evaluate the difference in performance for different decompositions of some codes. Examples given show that a significant reduction in complexity can be achieved by increasing the number of decoding stages. The resulting performance degradation varies for different decompositions. A guideline is given for finding good m-level decompositions.
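
    Bounds of this general family are typically evaluated as union bounds over the code's weight spectrum; the sketch below computes such a bound for BPSK on an AWGN channel using a made-up spectrum, not one of the decompositions from the paper.

```python
import math

# Union bound for a linear code of rate R with (partial) weight spectrum
# {d: A_d} on an AWGN channel with BPSK signalling:
#   P(block error) <= sum_d A_d * Q(sqrt(2 * d * R * Eb/No)).

def q_func(x: float) -> float:
    return 0.5 * math.erfc(x / math.sqrt(2))

def union_bound(spectrum: dict, rate: float, ebno_db: float) -> float:
    ebno = 10 ** (ebno_db / 10)
    return sum(mult * q_func(math.sqrt(2 * d * rate * ebno))
               for d, mult in spectrum.items())

spectrum = {8: 94, 12: 1581, 16: 8222}   # hypothetical A_d values
print(union_bound(spectrum, rate=0.5, ebno_db=3.0))
```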

  8. Designing robust unitary gates: Application to concatenated composite pulses

    SciTech Connect

    Ichikawa, Tsubasa; Bando, Masamitsu; Kondo, Yasushi; Nakahara, Mikio

    2011-12-15

    We propose a simple formalism to design unitary gates robust against given systematic errors. This formalism generalizes our previous observation [Y. Kondo and M. Bando, J. Phys. Soc. Jpn. 80, 054002 (2011)] that vanishing dynamical phase in some composite gates is essential to suppress pulse-length errors. By employing our formalism, we derive a composite unitary gate which can be seen as a concatenation of two known composite unitary operations. The obtained unitary gate has high fidelity over a wider range of error strengths compared to existing composite gates.
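
    The flavor of such concatenated composite pulses can be checked numerically. The sketch below compares a bare rotation against a BB1-style sequence (after Wimperis) under a common pulse-length error; the sign conventions, the fidelity measure, and the use of BB1 rather than the gate derived in the paper are all assumptions.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)

def rot(theta, phi):
    """Rotation by theta about the in-plane axis at azimuth phi."""
    n = np.cos(phi) * X + np.sin(phi) * Y
    return np.cos(theta / 2) * np.eye(2) - 1j * np.sin(theta / 2) * n

def fidelity(U, V):
    return abs(np.trace(U.conj().T @ V)) / 2

theta, eps = np.pi / 2, 0.05           # target rotation; 5% pulse-length error
target = rot(theta, 0)
plain = rot(theta * (1 + eps), 0)      # bare pulse, with error

phi = np.arccos(-theta / (4 * np.pi))  # BB1 correction phase
seq = [(np.pi, phi), (2 * np.pi, 3 * phi), (np.pi, phi), (theta, 0)]
bb1 = np.eye(2, dtype=complex)
for th, ph in seq:                     # every pulse suffers the same error
    bb1 = rot(th * (1 + eps), ph) @ bb1
print(1 - fidelity(target, plain))     # ~8e-4 infidelity for the bare pulse
print(1 - fidelity(target, bb1))       # orders of magnitude smaller
```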

  9. Advances in speech processing

    NASA Astrophysics Data System (ADS)

    Ince, A. Nejat

    1992-10-01

    The field of speech processing is undergoing a rapid growth in terms of both performance and applications and this is fueled by the advances being made in the areas of microelectronics, computation, and algorithm design. The use of voice for civil and military communications is discussed considering advantages and disadvantages including the effects of environmental factors such as acoustic and electrical noise and interference and propagation. The structure of the existing NATO communications network and the evolving Integrated Services Digital Network (ISDN) concept are briefly reviewed to show how they meet the present and future requirements. The paper then deals with the fundamental subject of speech coding and compression. Recent advances in techniques and algorithms for speech coding now permit high quality voice reproduction at remarkably low bit rates. The subject of speech synthesis is next treated where the principle objective is to produce natural quality synthetic speech from unrestricted text input. Speech recognition where the ultimate objective is to produce a machine which would understand conversational speech with unrestricted vocabulary, from essentially any talker, is discussed. Algorithms for speech recognition can be characterized broadly as pattern recognition approaches and acoustic phonetic approaches. To date, the greatest degree of success in speech recognition has been obtained using pattern recognition paradigms. It is for this reason that the paper is concerned primarily with this technique.

  10. A Statistical Approach to Automatic Speech Summarization

    NASA Astrophysics Data System (ADS)

    Hori, Chiori; Furui, Sadaoki; Malkin, Rob; Yu, Hua; Waibel, Alex

    2003-12-01

    This paper proposes a statistical approach to automatic speech summarization. In our method, a set of words maximizing a summarization score indicating the appropriateness of summarization is extracted from automatically transcribed speech and then concatenated to create a summary. The extraction process is performed using a dynamic programming (DP) technique based on a target compression ratio. In this paper, we demonstrate how an English news broadcast transcribed by a speech recognizer is automatically summarized. We adapted our method, which was originally proposed for Japanese, to English by modifying the model for estimating word concatenation probabilities based on a dependency structure in the original speech given by a stochastic dependency context free grammar (SDCFG). We also propose a method of summarizing multiple utterances using a two-level DP technique. The automatically summarized sentences are evaluated by summarization accuracy based on a comparison with a manual summary of speech that has been correctly transcribed by human subjects. Our experimental results indicate that the method we propose can effectively extract relatively important information and remove redundant and irrelevant information from English news broadcasts.
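
    A toy version of the word-extraction DP conveys the structure of the method: choose a fixed number of words maximizing per-word significance scores plus a pairwise concatenation score for adjacent kept words. The scoring functions below are simple stand-ins for the paper's linguistic, confidence, and SDCFG terms.

```python
# dp[k][j]: best score for a summary of k words ending at word j.
# Transition: dp[k][j] = max over i < j of dp[k-1][i] + concat[i][j] + sig[j].

def summarize(words, sig, concat, m):
    """sig[j]: score of word j; concat[i][j]: score of joining i before j."""
    n = len(words)
    NEG = float("-inf")
    dp = [[NEG] * n for _ in range(m + 1)]
    back = [[None] * n for _ in range(m + 1)]
    for j in range(n):
        dp[1][j] = sig[j]
    for k in range(2, m + 1):
        for j in range(n):
            for i in range(j):
                cand = dp[k - 1][i] + concat[i][j] + sig[j]
                if cand > dp[k][j]:
                    dp[k][j], back[k][j] = cand, i
    j = max(range(n), key=lambda j: dp[m][j])  # best final word
    picked = []
    for k in range(m, 0, -1):
        picked.append(j)
        j = back[k][j]
    return [words[j] for j in reversed(picked)]

words = "the stock market fell sharply on monday analysts said".split()
sig = [0.1, 0.9, 0.9, 0.8, 0.6, 0.1, 0.3, 0.2, 0.1]
concat = [[-0.1 if j == i + 1 else -0.5 for j in range(9)] for i in range(9)]
print(summarize(words, sig, concat, 4))  # ['stock', 'market', 'fell', 'sharply']
```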

  11. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, J.F.; Ng, L.C.

    1998-03-17

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs.

  12. Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

    DOEpatents

    Holzrichter, John F.; Ng, Lawrence C.

    1998-01-01

    The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function for each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.

  13. Concatenation of 'alert' and 'identity' segments in dingoes' alarm calls.

    PubMed

    Déaux, Eloïse C; Allen, Andrew P; Clarke, Jennifer A; Charrier, Isabelle

    2016-01-01

    Multicomponent signals can be formed by the uninterrupted concatenation of multiple call types. One such signal is found in dingoes, Canis familiaris dingo. This stereotyped, multicomponent 'bark-howl' vocalisation is formed by the concatenation of a noisy bark segment and a tonal howl segment. Both segments are structurally similar to bark and howl vocalisations produced independently in other contexts (e.g. intra- and inter-pack communication). Bark-howls are mainly uttered in response to human presence and were hypothesized to serve as alarm calls. We investigated the function of bark-howls and the respective roles of the bark and howl segments. We found that dingoes could discriminate between familiar and unfamiliar howl segments, after having only heard familiar howl vocalisations (i.e. different calls). We propose that howl segments could function as 'identity signals' and allow receivers to modulate their responses according to the caller's characteristics. The bark segment increased receivers' attention levels, providing support for earlier observational claims that barks have an 'alerting' function. Lastly, dingoes were more likely to display vigilance behaviours upon hearing bark-howl vocalisations, lending support to the alarm function hypothesis. Canid vocalisations, such as the dingo bark-howl, may provide a model system to investigate the selective pressures shaping complex communication systems. PMID:27460289

  14. A concatenated coded modulation scheme for error control (addition 2)

    NASA Technical Reports Server (NTRS)

    Lin, Shu

    1988-01-01

    A concatenated coded modulation scheme for error control in data communications is described. The scheme is achieved by concatenating a Reed-Solomon outer code and a bandwidth efficient block inner code for M-ary PSK modulation. Error performance of the scheme is analyzed for an AWGN channel. It is shown that extremely high reliability can be attained by using a simple M-ary PSK modulation inner code and a relatively powerful Reed-Solomon outer code. Furthermore, if an inner code of high effective rate is used, the bandwidth expansion required by the scheme due to coding will be greatly reduced. The proposed scheme is particularly effective for high-speed satellite communications for large file transfer where high reliability is required. This paper also presents a simple method for constructing block codes for M-ary PSK modulation. Some short M-ary PSK codes with good minimum squared Euclidean distance are constructed. These codes have trellis structure and hence can be decoded with a soft-decision Viterbi decoding algorithm. Furthermore, some of these codes are phase invariant under multiples of 45 deg rotation.
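
    The figure of merit in these constructions, the minimum squared Euclidean distance (MSED) of the underlying M-ary PSK constellation, is easy to compute; the sketch below shows how quickly it shrinks as M grows, which is the distance the inner code must win back.

```python
import numpy as np
from itertools import combinations

# MSED of a unit-energy M-ary PSK constellation: 4 * sin(pi / M)^2.

def mpsk_msed(m: int) -> float:
    pts = np.exp(2j * np.pi * np.arange(m) / m)
    return min(abs(a - b) ** 2 for a, b in combinations(pts, 2))

for m in (2, 4, 8):
    print(m, mpsk_msed(m))  # 4.0, 2.0, ~0.586: denser constellations lose distance
```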

  15. A concatenated coded modulation scheme for error control

    NASA Technical Reports Server (NTRS)

    Lin, Shu

    1988-01-01

    A concatenated coded modulation scheme for error control in data communications is presented. The scheme is achieved by concatenating a Reed-Solomon outer code and a bandwidth efficient block inner code for M-ary PSK modulation. Error performance of the scheme is analyzed for an AWGN channel. It is shown that extremely high reliability can be attained by using a simple M-ary PSK modulation inner code and a relatively powerful Reed-Solomon outer code. Furthermore, if an inner code of high effective rate is used, the bandwidth expansion required by the scheme due to coding will be greatly reduced. The proposed scheme is very effective for high speed satellite communications for large file transfer where high reliability is required. A simple method is also presented for constructing codes for M-ary PSK modulation. Some short M-ary PSK codes with good minimum squared Euclidean distance are constructed. These codes have trellis structure and hence can be decoded with a soft decision Viterbi decoding algorithm. Furthermore, some of these codes are phase invariant under multiples of 45 deg rotation.

  16. A concatenated coded modulation scheme for error control

    NASA Technical Reports Server (NTRS)

    Kasami, Tadao; Lin, Shu

    1988-01-01

    A concatenated coded modulation scheme for error control in data communications is presented. The scheme is achieved by concatenating a Reed-Solomon outer code and a bandwidth efficient block inner code for M-ary PSK modulation. Error performance of the scheme is analyzed for an AWGN channel. It is shown that extremely high reliability can be attained by using a simple M-ary PSK modulation inner code and a relatively powerful Reed-Solomon outer code. Furthermore, if an inner code of high effective rate is used, the bandwidth expansion required by the scheme due to coding will be greatly reduced. The proposed scheme is particularly effective for high speed satellite communication for large file transfer where high reliability is required. Also presented is a simple method for constructing block codes for M-ary PSK modulation. Some short M-ary PSK codes with good minimum squared Euclidean distance are constructed. These codes have trellis structure and hence can be decoded with a soft decision Viterbi decoding algorithm.

  17. Research in speech communication.

    PubMed Central

    Flanagan, J

    1995-01-01

    Advances in digital speech processing are now supporting application and deployment of a variety of speech technologies for human/machine communication. In fact, new businesses are rapidly forming about these technologies. But these capabilities are of little use unless society can afford them. Happily, explosive advances in microelectronics over the past two decades have assured affordable access to this sophistication as well as to the underlying computing technology. The research challenges in speech processing remain in the traditionally identified areas of recognition, synthesis, and coding. These three areas have typically been addressed individually, often with significant isolation among the efforts. But they are all facets of the same fundamental issue--how to represent and quantify the information in the speech signal. This implies deeper understanding of the physics of speech production, the constraints that the conventions of language impose, and the mechanism for information processing in the auditory system. In ongoing research, therefore, we seek more accurate models of speech generation, better computational formulations of language, and realistic perceptual guides for speech processing--along with ways to coalesce the fundamental issues of recognition, synthesis, and coding. Successful solution will yield the long-sought dictation machine, high-quality synthesis from text, and the ultimate in low bit-rate transmission of speech. It will also open the door to language-translating telephony, where the synthetic foreign translation can be in the voice of the originating talker. PMID:7479806

  18. Serial turbo trellis coded modulation using a serially concatenated coder

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush (Inventor); Dolinar, Samuel J. (Inventor); Pollara, Fabrizio (Inventor)

    2010-01-01

    Serial concatenated trellis coded modulation (SCTCM) includes an outer coder, an interleaver, a recursive inner coder and a mapping element. The outer coder receives data to be coded and produces outer coded data. The interleaver permutes the outer coded data to produce interleaved data. The recursive inner coder codes the interleaved data to produce inner coded data. The mapping element maps the inner coded data to a symbol. The recursive inner coder has a structure which facilitates iterative decoding of the symbols at a decoder system. The recursive inner coder and the mapping element are selected to maximize the effective free Euclidean distance of a trellis coded modulator formed from the recursive inner coder and the mapping element. The decoder system includes a demodulation unit, an inner SISO (soft-input soft-output) decoder, a deinterleaver, an outer SISO decoder, and an interleaver.

  19. Serial turbo trellis coded modulation using a serially concatenated coder

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush (Inventor); Dolinar, Samuel J. (Inventor); Pollara, Fabrizio (Inventor)

    2011-01-01

    Serial concatenated trellis coded modulation (SCTCM) includes an outer coder, an interleaver, a recursive inner coder and a mapping element. The outer coder receives data to be coded and produces outer coded data. The interleaver permutes the outer coded data to produce interleaved data. The recursive inner coder codes the interleaved data to produce inner coded data. The mapping element maps the inner coded data to a symbol. The recursive inner coder has a structure which facilitates iterative decoding of the symbols at a decoder system. The recursive inner coder and the mapping element are selected to maximize the effective free Euclidean distance of a trellis coded modulator formed from the recursive inner coder and the mapping element. The decoder system includes a demodulation unit, an inner SISO (soft-input soft-output) decoder, a deinterleaver, an outer SISO decoder, and an interleaver.

  20. Entanglement concentration for concatenated Greenberger-Horne-Zeilinger state

    NASA Astrophysics Data System (ADS)

    Qu, Chang-Cheng; Zhou, Lan; Sheng, Yu-Bo

    2015-11-01

    The concatenated Greenberger-Horne-Zeilinger state is a new type of logic-qubit entanglement which has attracted much attention recently. In this paper, we discuss entanglement concentration for such logic-qubit entanglement. We present two groups of entanglement concentration protocols (ECPs) for logic-qubit entanglement. In the first group, the parties do not know the initial coefficients of the partially entangled logic-qubit state. In the second group, the parties know the initial coefficients of the partially entangled logic-qubit state. In our ECPs, the unsuccessful cases can be reused to increase the total success probability in the next step. These ECPs may be useful in future long-distance quantum communication.

  1. Methods of Teaching Speech Recognition

    ERIC Educational Resources Information Center

    Rader, Martha H.; Bailey, Glenn A.

    2010-01-01

    Objective: This article introduces the history and development of speech recognition, addresses its role in the business curriculum, outlines related national and state standards, describes instructional strategies, and discusses the assessment of student achievement in speech recognition classes. Methods: Research methods included a synthesis of…

  2. Multilevel Concatenated Block Modulation Codes for the Frequency Non-selective Rayleigh Fading Channel

    NASA Technical Reports Server (NTRS)

    Lin, Shu; Rhee, Dojun

    1996-01-01

    This paper is concerned with the construction of multilevel concatenated block modulation codes using a multilevel concatenation scheme for the frequency non-selective Rayleigh fading channel. In the construction of a multilevel concatenated modulation code, block modulation codes are used as the inner codes. Various types of codes (block or convolutional, binary or nonbinary) are considered as the outer codes. In particular, we focus on the special case in which Reed-Solomon (RS) codes are used as the outer codes. For this special case, a systematic algebraic technique for constructing q-level concatenated block modulation codes is proposed. Codes have been constructed for certain specific values of q and compared with single-level concatenated block modulation codes using the same inner codes. A multilevel closest coset decoding scheme for these codes is proposed.

  3. Computer-generated speech

    SciTech Connect

    Aimthikul, Y.

    1981-12-01

    This thesis reviews the essential aspects of speech synthesis and distinguishes between the two prevailing techniques: compressed digital speech and phonemic synthesis. It then presents the hardware details of the five speech modules evaluated. FORTRAN programs were written to facilitate message creation and retrieval with four of the modules driven by a PDP-11 minicomputer. The fifth module was driven directly by a computer terminal. The compressed digital speech modules (T.I. 990/306, T.S.I. Series 3D and N.S. Digitalker) each contain a limited vocabulary produced by the manufacturers while both the phonemic synthesizers made by Votrax permit an almost unlimited set of sounds and words. A text-to-phoneme rules program was adapted for the PDP-11 (running under the RSX-11M operating system) to drive the Votrax Speech Pac module. However, the Votrax Type'N Talk unit has its own built-in translator. Comparison of these modules revealed that the compressed digital speech modules were superior in pronouncing words on an individual basis but lacked the inflection capability that permitted the phonemic synthesizers to generate more coherent phrases. These findings were necessarily highly subjective and dependent on the specific words and phrases studied. In addition, the rapid introduction of new modules by manufacturers will necessitate new comparisons. However, the results of this research verified that all of the modules studied do possess reasonable quality of speech that is suitable for man-machine applications. Furthermore, the development tools are now in place to permit the addition of computer speech output in such applications.

  4. Correctable noise of quantum-error-correcting codes under adaptive concatenation

    NASA Astrophysics Data System (ADS)

    Fern, Jesse

    2008-01-01

    We examine the transformation of noise under a quantum-error-correcting code (QECC) concatenated repeatedly with itself, by analyzing the effects of a quantum channel after each level of concatenation using recovery operators that are optimally adapted to use error syndrome information from the previous levels of the code. We use the Shannon entropy of these channels to estimate the thresholds of correctable noise for QECCs and find considerable improvements under this adaptive concatenation. Similar methods could be used to increase quantum-fault-tolerant thresholds.

  5. Speech Technologies. Tech Use Guide: Using Computer Technology.

    ERIC Educational Resources Information Center

    Williams, John M.

    Speech synthesis and speech recognition systems offer access to communication and information for students with communication disabilities, thus eliminating major historical barriers to learning for these students and allowing them to participate in the school environment. This guide describes two ways of producing speech synthesis: (1) by…

  6. Digression and Value Concatenation to Enable Privacy-Preserving Regression

    PubMed Central

    Li, Xiao-Bai; Sarkar, Sumit

    2015-01-01

    Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals’ sensitive data. This problem, which we call a “regression attack,” has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis. PMID:26752802

  7. Hamming and Accumulator Codes Concatenated with MPSK or QAM

    NASA Technical Reports Server (NTRS)

    Divsalar, Dariush; Dolinar, Samuel

    2009-01-01

    In a proposed coding-and-modulation scheme, a high-rate binary data stream would be processed as follows: 1. The input bit stream would be demultiplexed into multiple bit streams. 2. The multiple bit streams would be processed simultaneously into a high-rate outer Hamming code that would comprise multiple short constituent Hamming codes - a distinct constituent Hamming code for each stream. 3. The streams would be interleaved. The interleaver would have a block structure that would facilitate parallelization for high-speed decoding. 4. The interleaved streams would be further processed simultaneously into an inner two-state, rate-1 accumulator code that would comprise multiple constituent accumulator codes - a distinct accumulator code for each stream. 5. The resulting bit streams would be mapped into symbols to be transmitted by use of a higher-order modulation - for example, M-ary phase-shift keying (MPSK) or quadrature amplitude modulation (QAM). The novelty of the scheme lies in the concatenation of the multiple-constituent Hamming and accumulator codes and the corresponding parallel architectures of the encoder and decoder circuitry (see figure) needed to process the multiple bit streams simultaneously. As in the cases of other parallel-processing schemes, one advantage of this scheme is that the overall data rate could be much greater than the data rate of each encoder and decoder stream and, hence, the encoder and decoder could handle data at an overall rate beyond the capability of the individual encoder and decoder circuits.
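
    Steps 1 and 4 are simple to sketch: the accumulator is a running XOR (transfer function 1/(1 + D)), applied independently per stream after round-robin demultiplexing. The stream count below is an arbitrary illustration.

```python
def accumulate(bits):
    """Rate-1 accumulator: out[i] = in[i] XOR out[i-1]."""
    out, state = [], 0
    for b in bits:
        state ^= b
        out.append(state)
    return out

def demux(bits, n_streams):
    """Step 1: split one high-rate stream into n parallel streams."""
    return [bits[i::n_streams] for i in range(n_streams)]

data = [1, 0, 1, 1, 0, 0, 1, 0]
streams = demux(data, 2)                 # two streams processed in parallel
print([accumulate(s) for s in streams])  # [[1, 0, 0, 1], [0, 1, 1, 1]]
```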

  8. Cyanuric acid hydrolase: evolutionary innovation by structural concatenation

    PubMed Central

    Peat, Thomas S; Balotra, Sahil; Wilding, Matthew; French, Nigel G; Briggs, Lyndall J; Panjikar, Santosh; Cowieson, Nathan; Newman, Janet; Scott, Colin

    2013-01-01

    The cyanuric acid hydrolase, AtzD, is the founding member of a newly identified family of ring-opening amidases. We report the first X-ray structure for this family, which is a novel fold (termed the ‘Toblerone’ fold) that likely evolved via the concatenation of monomers of the trimeric YjgF superfamily and the acquisition of a metal binding site. Structures of AtzD with bound substrate (cyanuric acid) and inhibitors (phosphate, barbituric acid and melamine), along with mutagenesis studies, allowed the identification of the active site. The AtzD monomer, active site and substrate all possess threefold rotational symmetry, to the extent that the active site possesses three potential Ser–Lys catalytic dyads. A single catalytic dyad (Ser85–Lys42) is hypothesized, based on biochemical evidence and crystallographic data. A plausible catalytic mechanism based on these observations is also presented. A comparison with a homology model of the related barbiturase, Bar, was used to infer the active-site residues responsible for substrate specificity, and the phylogeny of the 68 AtzD-like enzymes in the database were analysed in light of this structure–function relationship. PMID:23651355

  9. Campbell's monkeys concatenate vocalizations into context-specific call sequences

    PubMed Central

    Ouattara, Karim; Lemasson, Alban; Zuberbühler, Klaus

    2009-01-01

    Primate vocal behavior is often considered irrelevant in modeling human language evolution, mainly because of the caller's limited vocal control and apparent lack of intentional signaling. Here, we present the results of a long-term study on Campbell's monkeys, which has revealed an unrivaled degree of vocal complexity. Adult males produced six different loud call types, which they combined into various sequences in highly context-specific ways. We found stereotyped sequences that were strongly associated with cohesion and travel, falling trees, neighboring groups, nonpredatory animals, unspecific predatory threat, and specific predator classes. Within the responses to predators, we found that crowned eagles triggered four and leopards three different sequences, depending on how the caller learned about their presence. Callers followed a number of principles when concatenating sequences, such as nonrandom transition probabilities of call types, addition of specific calls into an existing sequence to form a different one, or recombination of two sequences to form a third one. We conclude that these primates have overcome some of the constraints of limited vocal control by combinatorial organization. As the different sequences were so tightly linked to specific external events, the Campbell's monkey call system may be the most complex example of ‘proto-syntax’ in animal communication known to date. PMID:20007377

  10. Speech Development

    MedlinePlus

    Cleft Lip and Palate. Bzoch (1997). Cleft Palate Speech Management: A Multidisciplinary Approach. Shprintzen, Bardach (1995). Cleft Palate: ...

  11. Speech Problems

    MedlinePlus

    ... a person's ability to speak clearly. Some Common Speech Disorders Stuttering is a problem that interferes with fluent ... is a language disorder, while stuttering is a speech disorder. A person who stutters has trouble getting out ...

  12. VISIBLE SPEECH.

    ERIC Educational Resources Information Center

    POTTER, RALPH K.; AND OTHERS

    A CORRECTED REPUBLICATION OF THE 1947 EDITION, THE BOOK DESCRIBES A FORM OF VISIBLE SPEECH OBTAINED BY THE RECORDING OF AN ANALYSIS OF SPEECH SOMEWHAT SIMILAR TO THE ANALYSIS PERFORMED BY THE EAR. ORIGINALLY INTENDED TO PRESENT AN EXPERIMENTAL TRAINING PROGRAM IN THE READING OF VISIBLE SPEECH AND EXPANDED TO INCLUDE MATERIAL OF INTEREST TO VARIOUS…

  13. Iterative Decoding of Serially Concatenated Codes with Interleavers and Comparison with Turbo Codes

    NASA Technical Reports Server (NTRS)

    Benedetto, S.; Montorsi, G.; Divsalar, D.; Pollara, F.

    1997-01-01

    A serially concatenated code with interleaver consists of the cascade of an outer encoder, an interleaver permuting the outer codeword bits, and an inner encoder whose input words are the permuted outer codewords.

  14. A novel method for performance improvement of optical CDMA system using alterable concatenated code

    NASA Astrophysics Data System (ADS)

    Qiu, Kun; Zhang, Chongfu

    2007-04-01

    A novel method using an alterable concatenated code for pre-encoding is proposed to reduce the impact of system impairment and multiple access interference (MAI) in an optical code division multiple access (OCDMA) system; comprehensive comparisons between different concatenated code types and forward error correction (FEC) schemes are studied by simulation. In the scheme, we apply concatenated coding to the embedded modulation scheme, optical orthogonal code (OOC) is employed as the address sequence code, and an avalanche photodiode (APD) is selected as the system receiver. The bit error rate (BER) performance is derived taking into account the effects of several noise sources, the dispersion power penalty and the MAI. From both theoretical analysis and numerical results, we show that the proposed system achieves good performance at a BER of 10^-9, with a gain of 6.4 dB obtained by using the concatenated code as the pre-code, and that this scheme permits implementation of a cost-effective OCDMA system.

  15. Overview of speech technology of the 80's

    SciTech Connect

    Crook, S.B.

    1981-01-01

    The author describes the technology innovations necessary to accommodate the market need which is the driving force toward greater perceived computer intelligence. The author discusses aspects of both speech synthesis and speech recognition.

  16. Speech Communication.

    ERIC Educational Resources Information Center

    Anderson, Betty

    The communications approach to teaching speech to high school students views speech as the study of the communication process in order to develop an awareness of and a sensitivity to the variables that affect human interaction. In using this approach the student is encouraged to try out as many types of messages using as many techniques and…

  17. Speech Aids

    NASA Technical Reports Server (NTRS)

    1987-01-01

    Designed to assist deaf and hearing impaired-persons in achieving better speech, Resnick Worldwide Inc.'s device provides a visual means of cuing the deaf as a speech-improvement measure. This is done by electronically processing the subjects' sounds and comparing them with optimum values which are displayed for comparison.

  18. Symbolic Speech

    ERIC Educational Resources Information Center

    Podgor, Ellen S.

    1976-01-01

    The concept of symbolic speech emanates from the 1967 case of United States v. O'Brien. These discussions of flag desecration, grooming and dress codes, nude entertainment, buttons and badges, and musical expression show that the courts place symbolic speech in different strata from verbal communication. (LBH)

  19. Speech coding

    SciTech Connect

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence the end-to-end performance of the digital link becomes essentially independent of the length and operating frequency bands of the link, and from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is often used interchangeably with speech coding is voice coding. This term is more generic in the sense that the

  20. Performance of PSK modulation with serial concatenated turbo codes through a nonideal satellite channel

    NASA Astrophysics Data System (ADS)

    Shoup, Ryan

    2005-08-01

    Turbo codes and Low Density Parity Check (LDPC) codes are well known to provide Bit Error Rate (BER) performance close to the Shannon capacity limit. Bandwidth constrained satellite channels could potentially benefit from higher order PSK modulations. However, employing higher order PSK modulations may not be practical for satellite amplifiers due to the increased power requirements. The excellent performance of serial concatenated turbo codes could be used to keep satellite amplifier power levels relatively close to the Shannon limit. The performance of the system, however, is dependent on the satellite channel, which typically includes phase noise and some degree of nonlinearity in the satellite amplifier. The performance of various waveforms and PSK modulations employing serial concatenated turbo codes is investigated using a model of a non-ideal satellite channel. The hardware complexity of the serial concatenated turbo decoder at the ground receiver is also considered.

  1. High pH reversed-phase chromatography with fraction concatenation for 2D proteomic analysis

    SciTech Connect

    Yang, Feng; Shen, Yufeng; Camp, David G.; Smith, Richard D.

    2012-04-01

    Orthogonal high-resolution separations are critical for attaining improved analytical dynamic ranges of proteome measurements. Concatenated high pH reversed phase liquid chromatography affords better separations than the strong cation exchange conventionally applied for two-dimensional shotgun proteomic analysis. For example, concatenated high pH reversed phase liquid chromatography increased identification coverage for peptides (e.g., by 1.8-fold) and proteins (e.g., by 1.6-fold) in shotgun proteomics analyses of a digested human protein sample. Additional advantages of concatenated high pH RPLC include improved protein sequence coverage, simplified sample processing, and reduced sample losses, making this an attractive first dimension separation strategy for two-dimensional proteomics analyses.
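
    The concatenation pattern itself is simple: fractions that elute far apart in the high-pH dimension are pooled round-robin, so each combined fraction stays spread across the low-pH gradient while the number of second-dimension runs drops. A sketch with illustrative fraction counts, not the study's actual parameters:

```python
# Pool fraction i with i + n_pools, i + 2*n_pools, ... (round-robin), so each
# pool combines widely separated first-dimension fractions.

def concatenate_fractions(n_fractions: int, n_pools: int):
    return [list(range(i, n_fractions, n_pools)) for i in range(n_pools)]

for pool in concatenate_fractions(n_fractions=96, n_pools=12)[:3]:
    print(pool)  # e.g. pool 0 combines fractions 0, 12, 24, ..., 84
```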

  2. Punctured Parallel and Serial Concatenated Convolutional Codes for BPSK/QPSK Channels

    NASA Technical Reports Server (NTRS)

    Acikel, Omer Fatih

    1999-01-01

    As available bandwidth for communication applications becomes scarce, bandwidth-efficient modulation and coding schemes become ever more important. Since their discovery in 1993, turbo codes (parallel concatenated convolutional codes) have been the center of attention in the coding community because of their bit error rate performance near the Shannon limit. Serial concatenated convolutional codes have also been shown to be as powerful as turbo codes. In this dissertation, we introduce algorithms for designing bandwidth-efficient rate r = k/(k + 1), k = 2, 3,..., 16, parallel and rate 3/4, 7/8, and 15/16 serial concatenated convolutional codes via puncturing for BPSK/QPSK (Binary Phase Shift Keying/Quadrature Phase Shift Keying) channels. Both parallel and serial concatenated convolutional codes initially have a steep bit error rate versus signal-to-noise ratio slope (called the "cliff region"). However, this steep slope changes to a moderate slope with increasing signal-to-noise ratio, where the slope is characterized by the weight spectrum of the code. The region after the cliff region is called the "error rate floor", which dominates the behavior of these codes at moderate to high signal-to-noise ratios. Our goal is to design high rate parallel and serial concatenated convolutional codes while minimizing the error rate floor effect. The design algorithm includes an interleaver enhancement procedure and finds the polynomial sets (only for parallel concatenated convolutional codes) and the puncturing schemes that achieve the lowest bit error rate performance around the floor for the code rates of interest.
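
    Puncturing, the rate-conversion device at the heart of the design, can be sketched in a few lines: a periodic pattern marks which outputs of the rate-1/2 mother code are actually transmitted. The period-3 pattern below yields rate 3/4 and is an illustrative choice, not one of the dissertation's optimized schemes.

```python
# A rate-1/2 mother encoder emits a (bit0, bit1) pair per info bit; deleting
# bits according to a periodic puncturing pattern raises the code rate.

def puncture(coded, pattern):
    """coded: list of (bit0, bit1) pairs; pattern: list of (keep0, keep1)."""
    out = []
    for i, (b0, b1) in enumerate(coded):
        keep0, keep1 = pattern[i % len(pattern)]
        if keep0: out.append(b0)
        if keep1: out.append(b1)
    return out

pairs = [(1, 0), (0, 0), (1, 1), (0, 1), (1, 0), (1, 1)]  # 6 info bits -> 12 coded
pattern = [(1, 1), (1, 0), (0, 1)]                        # keep 4 bits per 3 pairs
sent = puncture(pairs, pattern)
print(len(pairs) / len(sent))                             # 6 / 8 = rate 3/4
```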

  3. Programmable concatenation of conductively linked gold nanorods using molecular assembly and femtosecond irradiation

    NASA Astrophysics Data System (ADS)

    Fontana, Jake; Flom, Steve; Naciri, Jawad; Ratna, Banahalli

    The ability to tune the resonant frequency in plasmonic nanostructures is fundamental to developing novel optical properties and ensuing materials. Recent theoretical insights show that the plasmon resonance can be exquisitely controlled through the conductive concatenation of plasmonic nanoparticles. Furthermore these charge transfer systems may mimic complex and hard to build nanostructures. Here we experimentally demonstrate a directed molecular assembly approach to controllably concatenate gold nanorods end to end into discrete linear structures, bridged with gold nanojunctions, using femtosecond laser light. By utilizing high throughput and nanometer resolution this approach offers a pragmatic assembly strategy for charge transfer plasmonic systems.

  4. Cut-off rate calculations for the outer channel in a concatenated cooling system

    NASA Technical Reports Server (NTRS)

    Herro, M. A.; Costello, D. J., Jr.; Hu, L.

    1984-01-01

    Concatenated codes have long been used as a practical means of achieving long block or constraint lengths for combating errors on very noisy channels. The inner and outer encoders are normally separated by an interleaver, so that decoded error bursts coming from the inner decoder are randomized before entering the outer decoder. The effectiveness of this interleaver is examined by calculating the cut-off rate of the outer channel seen by the outer decoder with and without interleaving. Interleaving never hurts the performance of a concatenated code, and when the inner code rate is near the cut-off rate of the inner channel, interleaving significantly improves code performance.
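
    As a rough illustration of the quantity being calculated, the sketch below evaluates the standard cut-off rate formula for a memoryless binary symmetric channel, the idealized outer channel obtained under perfect interleaving (a simplification of the report's burst-channel analysis):

        # Sketch: cut-off rate R0 of a memoryless binary symmetric channel with
        # crossover probability p -- the idealized outer channel produced by
        # perfect interleaving of the inner decoder's error bursts.
        import math

        def cutoff_rate_bsc(p):
            return 1.0 - math.log2(1.0 + 2.0 * math.sqrt(p * (1.0 - p)))

        for p in (0.001, 0.01, 0.05, 0.1):
            print(f"p = {p:<5}  R0 = {cutoff_rate_bsc(p):.3f} bits/use")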

  5. Iterative Decoding of SPC Outer Coded Concatenation Codes with Maximal Ratio Combining

    NASA Astrophysics Data System (ADS)

    Chen, Xiaogang; Yang, Hongwen

    This letter proposes a simple iterative decoding algorithm for concatenated codes whose outer code is a single-parity-check (SPC) code. The erroneous inner codewords are iteratively combined with maximal ratio combining (MRC) and then re-decoded. Compared with the conventional scheme, in which an RS outer code is algebraically decoded to recover erasures, the proposed scheme has better performance due to the MRC processing. The proposed scheme is also less complex, because linear combination is simpler than algebraic decoding and the MRC gain relaxes the requirements on the inner decoder.
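
    A minimal sketch of the combining step, assuming BPSK copies with known per-copy amplitudes and noise variances (toy values, not from the letter):

        # Toy maximal ratio combining: copies of one BPSK symbol, each with its
        # own channel amplitude and noise variance, are weighted by amplitude /
        # variance before summation, which maximizes the combined SNR.
        import numpy as np

        rng = np.random.default_rng(0)
        symbol = 1.0
        amps = np.array([1.0, 0.6, 0.3])       # per-copy channel amplitudes
        sigma2 = np.array([0.5, 0.2, 0.8])     # per-copy noise variances

        copies = amps * symbol + rng.normal(0.0, np.sqrt(sigma2))
        decision = np.sum((amps / sigma2) * copies)   # MRC weights a_i / sigma_i^2
        print("decision variable:", decision, "-> bit", int(decision > 0))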

  6. Research on Speech Perception. Progress Report No. 12.

    ERIC Educational Resources Information Center

    Pisoni, David B.; And Others

    Summarizing research activities in 1986, this is the twelfth annual report of research on speech perception, analysis, synthesis, and recognition conducted in the Speech Research Laboratory of the Department of Psychology at Indiana University. The report contains the following 23 articles: "Comprehension of Digitally Encoded Natural Speech Using…

  7. Research on Speech Perception. Progress Report No. 15.

    ERIC Educational Resources Information Center

    Pisoni, David B.

    Summarizing research activities in 1989, this is the fifteenth annual report of research on speech perception, analysis, synthesis, and recognition conducted in the Speech Research Laboratory of the Department of Psychology at Indiana University. The report contains the following 21 articles: "Perceptual Learning of Nonnative Speech Contrasts:…

  8. Use of Computer Speech Technologies To Enhance Learning.

    ERIC Educational Resources Information Center

    Ferrell, Joe

    1999-01-01

    Discusses the design of an innovative learning system that uses new technologies for the man-machine interface, incorporating a combination of Automatic Speech Recognition (ASR) and Text To Speech (TTS) synthesis. Highlights include using speech technologies to mimic the attributes of the ideal tutor and design features. (AEF)

  9. A low-complexity and high performance concatenated coding scheme for high-speed satellite communications

    NASA Technical Reports Server (NTRS)

    Lin, Shu; Rhee, Dojun; Rajpal, Sandeep

    1993-01-01

    This report presents a low-complexity and high performance concatenated coding scheme for high-speed satellite communications. In this proposed scheme, the NASA Standard Reed-Solomon (RS) code over GF(2^8) is used as the outer code and the second-order Reed-Muller (RM) code of Hamming distance 8 is used as the inner code. The RM inner code has a very simple trellis structure and is decoded with the soft-decision Viterbi decoding algorithm. It is shown that the proposed concatenated coding scheme achieves an error performance which is comparable to that of the NASA TDRS concatenated coding scheme in which the NASA Standard rate-1/2 convolutional code of constraint length 7 and d_free = 10 is used as the inner code. However, the proposed RM inner code has much smaller decoding complexity, less decoding delay, and much higher decoding speed. Consequently, the proposed concatenated coding scheme is suitable for reliable high-speed satellite communications, and it may be considered as an alternate coding scheme for the NASA TDRS system.
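
    The concatenation structure itself can be sketched with toy codes (a single parity check and a 3x repetition code stand in here for the report's RS outer and RM inner codes):

        # Toy sketch of the concatenation structure only -- not the RS/RM codes
        # of the report: outer encode -> inner encode -> channel -> inner
        # decode -> outer check.
        def outer_encode(bits):                # toy outer code: append even parity
            return bits + [sum(bits) % 2]

        def inner_encode(bits):                # toy inner code: 3x repetition
            return [b for bit in bits for b in [bit] * 3]

        def inner_decode(coded):               # majority vote per 3-bit group
            return [int(sum(coded[i:i + 3]) >= 2) for i in range(0, len(coded), 3)]

        def outer_check(bits):                 # parity check flags residual errors
            return sum(bits) % 2 == 0

        data = [1, 0, 1, 1, 0]
        received = inner_encode(outer_encode(data))
        received[4] ^= 1                       # flip one channel bit
        decoded = inner_decode(received)
        print(decoded[:-1], "parity ok:", outer_check(decoded))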

  10. A VLSI Reed-Solomon decoder architecture for concatenate-coded space and spread spectrum communications

    NASA Astrophysics Data System (ADS)

    Liu, K. Y.

    In this paper, a VLSI Reed-Solomon (RS) decoder architecture for concatenate-coded space and spread spectrum communications is presented. The known decoding procedures for RS codes are exploited and modified to obtain a repetitive and recursive decoding technique which is suitable for VLSI implementation and pipeline processing.

  11. Concatenative and Nonconcatenative Plural Formation in L1, L2, and Heritage Speakers of Arabic

    ERIC Educational Resources Information Center

    Albirini, Abdulkafi; Benmamoun, Elabbas

    2014-01-01

    This study compares Arabic L1, L2, and heritage speakers' (HS) knowledge of plural formation, which involves concatenative and nonconcatenative modes of derivation. Ninety participants (divided equally among L1, L2, and heritage speakers) completed two oral tasks: a picture naming task (to measure proficiency) and a plural formation task. The…

  12. Performance of CO-OFDM system with RS-Turbo concatenated code

    NASA Astrophysics Data System (ADS)

    Tong, Zheng-rong; Hu, Gui-bin; Cao, Ye; Zhang, Wei-hua

    2015-11-01

    In this paper, the RS-Turbo concatenated code is applied to a coherent optical orthogonal frequency division multiplexing (CO-OFDM) system. RS(186,166,8) and a Turbo code with code rate of 1/2 are employed for the RS-Turbo concatenated code. Two decoding algorithms, the Max-Log-MAP algorithm and the Log-MAP algorithm, are adopted for Turbo decoding, and the iterative Berlekamp-Massey (BM) algorithm is adopted for RS decoding. The simulation results show that the bit error rate (BER) performance of the CO-OFDM system with the RS-Turbo concatenated code is significantly improved at high optical signal-to-noise ratio (OSNR), and the iteration number is reduced compared with that of the Turbo-coded system. Furthermore, when the Max-Log-MAP algorithm is adopted for Turbo decoding, the transmission distance of the CO-OFDM system with the RS-Turbo concatenated code can reach about 400 km without error, while that of the Turbo-coded system can only reach about 240 km when the BER is below the 10^-4 order of magnitude.

  13. A VLSI Reed-Solomon decoder architecture for concatenate-coded space and spread spectrum communications

    NASA Technical Reports Server (NTRS)

    Liu, K. Y.

    1983-01-01

    In this paper, a VLSI Reed-Solomon (RS) decoder architecture for concatenate-coded space and spread spectrum communications is presented. The known decoding procedures for RS codes are exploited and modified to obtain a repetitive and recursive decoding technique which is suitable for VLSI implementation and pipeline processing.

  14. A low-complexity and high performance concatenated coding scheme for high-speed satellite communications

    NASA Astrophysics Data System (ADS)

    Lin, Shu; Rhee, Dojun; Rajpal, Sandeep

    1993-02-01

    This report presents a low-complexity and high performance concatenated coding scheme for high-speed satellite communications. In this proposed scheme, the NASA Standard Reed-Solomon (RS) code over GF(2^8) is used as the outer code and the second-order Reed-Muller (RM) code of Hamming distance 8 is used as the inner code. The RM inner code has a very simple trellis structure and is decoded with the soft-decision Viterbi decoding algorithm. It is shown that the proposed concatenated coding scheme achieves an error performance which is comparable to that of the NASA TDRS concatenated coding scheme in which the NASA Standard rate-1/2 convolutional code of constraint length 7 and d_free = 10 is used as the inner code. However, the proposed RM inner code has much smaller decoding complexity, less decoding delay, and much higher decoding speed. Consequently, the proposed concatenated coding scheme is suitable for reliable high-speed satellite communications, and it may be considered as an alternate coding scheme for the NASA TDRS system.

  15. Speech processing: An evolving technology

    SciTech Connect

    Crochiere, R.E.; Flanagan, J.L.

    1986-09-01

    As we enter the information age, speech processing is emerging as an important technology for making machines easier and more convenient for humans to use. It is both an old and a new technology - dating back to the invention of the telephone and forward, at least in aspirations, to the capabilities of HAL in 2001. Explosive advances in microelectronics now make it possible to implement economical real-time hardware for sophisticated speech processing - processing that formerly could be demonstrated only in simulations on main-frame computers. As a result, fundamentally new product concepts - as well as new features and functions in existing products - are becoming possible and are being explored in the marketplace. As the introductory piece to this issue, the authors draw a brief perspective on the evolving field of speech processing and assess the technology in the three constituent sectors: speech coding, synthesis, and recognition.

  16. A study of acoustic-to-articulatory inversion of speech by analysis-by-synthesis using chain matrices and the Maeda articulatory model

    PubMed Central

    Panchapagesan, Sankaran; Alwan, Abeer

    2011-01-01

    In this paper, a quantitative study of acoustic-to-articulatory inversion for vowel speech sounds by analysis-by-synthesis using the Maeda articulatory model is performed. For chain matrix calculation of vocal tract (VT) acoustics, the chain matrix derivatives with respect to area function are calculated and used in a quasi-Newton method for optimizing articulatory trajectories. The cost function includes a distance measure between natural and synthesized first three formants, and parameter regularization and continuity terms. Calibration of the Maeda model to two speakers, one male and one female, from the University of Wisconsin x-ray microbeam (XRMB) database, using a cost function, is discussed. Model adaptation includes scaling the overall VT and the pharyngeal region and modifying the outer VT outline using measured palate and pharyngeal traces. The inversion optimization is initialized by a fast search of an articulatory codebook, which was pruned using XRMB data to improve inversion results. Good agreement between estimated midsagittal VT outlines and measured XRMB tongue pellet positions was achieved for several vowels and diphthongs for the male speaker, with average pellet-VT outline distances around 0.15 cm, smooth articulatory trajectories, and less than 1% average error in the first three formants. PMID:21476670

  17. Characteristic Extraction of Speech Signal Using Wavelet

    NASA Astrophysics Data System (ADS)

    Moriai, Shogo; Hanazaki, Izumi

    In the analysis-synthesis coding of speech signals, realization of high quality in low-bit-rate coding depends on the extraction of characteristic parameters in the pre-processing. The precise extraction of the fundamental frequency, one of the parameters of the source information, guarantees the quality of the speech synthesis. Its extraction is difficult, however, because of the influence of consonants, the non-periodicity of vocal-cord vibration, the wide range of the fundamental frequency, and other factors. In this paper, we propose a new fundamental-frequency extraction method for speech signals using the wavelet transform, with a criterion based on the harmonic structure of the signal.
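
    As a stand-in illustration of the extraction task, the sketch below uses a plain autocorrelation F0 estimator; the paper's actual method replaces this with a wavelet transform and a harmonic-structure criterion:

        # Stand-in sketch: basic autocorrelation F0 estimation on a synthetic
        # harmonic signal (the paper's method is wavelet-based, not this).
        import numpy as np

        fs = 8000
        t = np.arange(0, 0.05, 1.0 / fs)
        x = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 240 * t)

        ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # lags >= 0
        lo, hi = fs // 400, fs // 60                        # search 60-400 Hz
        lag = lo + int(np.argmax(ac[lo:hi]))
        print("estimated F0: %.1f Hz" % (fs / lag))         # ~120 Hz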

  18. Noise suppression methods for robust speech processing

    NASA Astrophysics Data System (ADS)

    Boll, S. F.; Kajiya, J.; Youngberg, J.; Petersen, T. L.; Ravindra, H.; Done, W.; Cox, B. V.; Cohen, E.

    1981-04-01

    Robust speech processing in practical operating environments requires effective environmental and processor noise suppression. This report describes the technical findings and accomplishments during the reporting period for the research program funded to develop real-time, compressed speech analysis-synthesis algorithms whose performance is invariant under signal contamination. Fulfillment of this requirement is necessary to ensure reliable secure compressed speech transmission within realistic military command and control environments. Overall contributions resulting from this research program include an understanding of how environmental noise degrades narrow band, coded speech; development of appropriate real-time noise suppression algorithms; and development of speech parameter identification methods that treat signal contamination as a fundamental element of the estimation process. This report describes the research and results in the areas of noise suppression using dual-input adaptive noise cancellation, articulation rate change techniques, and spectral subtraction, and describes an experiment which demonstrated that the spectral subtraction noise suppression algorithm can improve the intelligibility of 2400 bps, LPC-10 coded, helicopter speech by 10.6 points. In addition, summaries are included of prior studies in Constant-Q signal analysis and synthesis, perceptual modelling, speech activity detection, and pole-zero modelling of noisy signals. Three recent studies in speech modelling using the critical band analysis-synthesis transform and using splines are then presented. Finally, a list of major publications generated under this contract is given.
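
    A minimal spectral subtraction sketch, assuming the leading frames of the recording are speech-free so they can supply the noise estimate (a toy signal, not the LPC-10 helicopter data):

        # Minimal spectral subtraction: estimate the noise magnitude spectrum
        # from leading (assumed speech-free) frames, subtract it from every
        # frame, floor the result, and reuse the noisy phase for resynthesis.
        import numpy as np
        from scipy.signal import stft, istft

        fs = 8000
        rng = np.random.default_rng(1)
        tone = np.sin(2 * np.pi * 300 * np.arange(fs) / fs)
        noisy = np.concatenate([np.zeros(2000), tone]) + 0.3 * rng.normal(size=fs + 2000)

        f, t, X = stft(noisy, fs=fs, nperseg=256)
        noise_mag = np.abs(X[:, :10]).mean(axis=1, keepdims=True)   # noise-only frames
        mag = np.maximum(np.abs(X) - noise_mag, 0.05 * noise_mag)   # subtract and floor
        _, enhanced = istft(mag * np.exp(1j * np.angle(X)), fs=fs, nperseg=256)
        print(noisy.shape, enhanced.shape)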

  19. An articulatory silicon vocal tract for speech and hearing prostheses.

    PubMed

    Keng Hoong Wee; Turicchia, L; Sarpeshkar, R

    2011-08-01

    We describe the concept of a bioinspired feedback loop that combines a cochlear processor with an integrated-circuit vocal tract to create what we call a speech-locked loop. We discuss how the speech-locked loop can be applied in hearing prostheses, such as cochlear implants, to help improve speech recognition in noise. We also investigate speech-coding strategies for brain-machine-interface-based speech prostheses and present an articulatory speech-synthesis system by using an integrated-circuit vocal tract that models the human vocal tract. Our articulatory silicon vocal tract makes the transmission of low bit-rate speech-coding parameters feasible over a bandwidth-constrained body sensor network. To the best of our knowledge, this is the first articulatory speech-prosthesis system reported to date. We also present a speech-prosthesis simulator as a means to generate realistic articulatory parameter sequences. PMID:23851948

  20. Free Speech Yearbook: 1972.

    ERIC Educational Resources Information Center

    Tedford, Thomas L., Ed.

    This book is a collection of essays on free speech issues and attitudes, compiled by the Commission on Freedom of Speech of the Speech Communication Association. Four articles focus on freedom of speech in classroom situations as follows: a philosophic view of teaching free speech, effects of a course on free speech on student attitudes,…

  1. Speech analyzer

    NASA Technical Reports Server (NTRS)

    Lokerson, D. C. (Inventor)

    1977-01-01

    A speech signal is analyzed by applying the signal to formant filters which derive first, second and third signals respectively representing the frequency of the speech waveform in the first, second and third formants. A first pulse train having approximately a pulse rate representing the average frequency of the first formant is derived; second and third pulse trains having pulse rates respectively representing zero crossings of the second and third formants are derived. The first formant pulse train is derived by establishing N signal level bands, where N is an integer at least equal to two. Adjacent ones of the signal bands have common boundaries, each of which is a predetermined percentage of the peak level of a complete cycle of the speech waveform.

  2. Speech Research

    NASA Astrophysics Data System (ADS)

    Several articles addressing topics in speech research are presented. The topics include: exploring the functional significance of physiological tremor: a biospectroscopic approach; differences between experienced and inexperienced listeners to deaf speech; a language-oriented view of reading and its disabilities; phonetic factors in letter detection; categorical perception; short-term recall by deaf signers of American Sign Language; a common basis for auditory sensory storage in perception and immediate memory; phonological awareness and verbal short-term memory; initiation versus execution time during manual and oral counting by stutterers; trading relations in the perception of speech by five-year-old children; the role of the strap muscles in pitch lowering; phonetic validation of distinctive features; consonants and syllable boundaries; and vowel information in postvocalic frictions.

  3. Performance analysis of a concatenated erbium-doped fiber amplifier supporting four mode groups

    NASA Astrophysics Data System (ADS)

    Qin, Zujun; Fan, Di; Zhang, Wentao; Xiong, Xianming

    2016-05-01

    An erbium-doped fiber amplifier (EDFA) supporting four mode groups has been theoretically designed by concatenating two sections of erbium-doped fibers (EDFs). Each EDF has a simple erbium doping profile for the purpose of reducing its fabrication complexity. We propose a modified genetic algorithm (GA) to provide detailed investigations on the concatenated amplifier. Both the optimal fiber length and erbium doping radius in each EDF have been found to minimize the gain difference between signal modes. Results show that the parameters of the central-doped EDF have a greater impact on the amplifier performance compared to those of the annular-doped one. We then investigate the influence of small deviations of the erbium fiber length, doping radius, and doping concentration of each EDF from their optimal values upon the amplifier performance, and discuss their design tolerances in obtaining desirable amplification characteristics.

  4. Concatenation and Species Tree Methods Exhibit Statistically Indistinguishable Accuracy under a Range of Simulated Conditions

    PubMed Central

    Tonini, João; Moore, Andrew; Stern, David; Shcheglovitova, Maryia; Ortí, Guillermo

    2015-01-01

    Phylogeneticists have long understood that several biological processes can cause a gene tree to disagree with its species tree. In recent years, molecular phylogeneticists have increasingly foregone traditional supermatrix approaches in favor of species tree methods that account for one such source of error, incomplete lineage sorting (ILS). While gene tree-species tree discordance no doubt poses a significant challenge to phylogenetic inference with molecular data, researchers have only recently begun to systematically evaluate the relative accuracy of traditional and ILS-sensitive methods. Here, we report on simulations demonstrating that concatenation can perform as well or better than methods that attempt to account for sources of error introduced by ILS. Based on these and similar results from other researchers, we argue that concatenation remains a useful component of the phylogeneticist’s toolbox and highlight that phylogeneticists should continue to make explicit comparisons of results produced by contemporaneous and classical methods. PMID:25901289
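
    The concatenation (supermatrix) approach being evaluated amounts to joining per-gene alignments end-to-end for each taxon before a single tree search; a minimal sketch with placeholder sequences:

        # Sketch of supermatrix construction: per-gene alignments are joined
        # end-to-end per taxon, with gaps inserted for taxa missing from a gene
        # (sequences here are placeholders).
        genes = {
            "gene1": {"taxonA": "ATGC", "taxonB": "ATGG", "taxonC": "TTGC"},
            "gene2": {"taxonA": "CCAT", "taxonB": "CCAA"},
        }

        def concatenate(alignments):
            taxa = sorted({t for gene in alignments.values() for t in gene})
            supermatrix = {}
            for taxon in taxa:
                parts = []
                for name in sorted(alignments):
                    gene = alignments[name]
                    length = len(next(iter(gene.values())))
                    parts.append(gene.get(taxon, "-" * length))  # gap-fill
                supermatrix[taxon] = "".join(parts)
            return supermatrix

        for taxon, seq in sorted(concatenate(genes).items()):
            print(taxon, seq)          # taxonC gets "TTGC----"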

  5. Using Concatenated Quantum Codes for Universal Fault-Tolerant Quantum Gates

    NASA Astrophysics Data System (ADS)

    Jochym-O'Connor, Tomas; Laflamme, Raymond

    2014-01-01

    We propose a method for universal fault-tolerant quantum computation using concatenated quantum error correcting codes. The concatenation scheme exploits the transversal properties of two different codes, combining them to provide a means to protect against low-weight arbitrary errors. We give the required properties of the error correcting codes to ensure universal fault tolerance and discuss a particular example using the 7-qubit Steane and 15-qubit Reed-Muller codes. Namely, other than computational basis state preparation as required by the DiVincenzo criteria, our scheme requires no special ancillary state preparation to achieve universality, as opposed to schemes such as magic state distillation. We believe that optimizing the codes used in such a scheme could provide a useful alternative to state distillation schemes that exhibit high overhead costs.

  6. Basal Ganglia Subcircuits Distinctively Encode the Parsing and Concatenation of Action Sequences

    PubMed Central

    Jin, Xin; Tecuapetla, Fatuel; Costa, Rui M

    2014-01-01

    Chunking allows the brain to efficiently organize memories and actions. Although basal ganglia circuits have been implicated in action chunking, little is known about how individual elements are concatenated into a behavioral sequence at the neural level. Using a task where mice learn rapid action sequences, we uncovered neuronal activity encoding entire sequences as single actions in basal ganglia circuits. Besides start/stop activity signaling sequence parsing, we found neurons displaying inhibited or sustained activity throughout the execution of an entire sequence. This sustained activity covaried with the rate of execution of individual sequence elements, consistent with motor concatenation. Direct and indirect pathways of basal ganglia were concomitantly active during sequence initiation, but behaved differently during sequence performance, revealing a more complex functional organization of these circuits than previously postulated. These results have important implications for understanding the functional organization of basal ganglia during the learning and execution of action sequences. PMID:24464039

  7. Probability of undetected error after decoding for a concatenated coding scheme

    NASA Technical Reports Server (NTRS)

    Costello, D. J., Jr.; Lin, S.

    1984-01-01

    A concatenated coding scheme for error control in data communications is analyzed. In this scheme, the inner code is used for both error correction and detection, whereas the outer code is used only for error detection. A retransmission is requested if the outer code detects the presence of errors after the inner code decoding. The probability of undetected error is derived and bounded. A particular example, proposed for the NASA telecommand system, is analyzed.
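
    The quantity being bounded can be illustrated directly: for a linear block code used purely for detection on a binary symmetric channel, the undetected error probability is the weight enumerator evaluated at the crossover probability. A sketch for the (7,4) Hamming code (not the codes of the NASA scheme):

        # Undetected error probability of a linear block code used only for
        # detection on a BSC: P_ue(p) = sum_w A_w * p^w * (1-p)^(n-w).
        A = {3: 7, 4: 7, 7: 1}   # nonzero weight distribution, (7,4) Hamming code
        n = 7

        def p_undetected(p):
            return sum(count * p**w * (1 - p)**(n - w) for w, count in A.items())

        for p in (1e-1, 1e-2, 1e-3):
            print(f"p = {p:.0e}  P_ue = {p_undetected(p):.3e}")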

  8. Speech Intelligibility

    NASA Astrophysics Data System (ADS)

    Brand, Thomas

    Speech intelligibility (SI) is important in various fields of research, engineering, and diagnostics for quantifying very different phenomena: the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, the benefit of using hearing aids, or combinations of these.

  9. Speech Improvement.

    ERIC Educational Resources Information Center

    Gordon, Morton J.

    This book serves as a guide for the native and non-native speaker of English in overcoming various problems in articulation, rhythm, and intonation. It is also useful in group therapy speech programs. Forty-five practice chapters offer drill materials for all the vowels, diphthongs, and consonants of American English plus English stress and…

  10. Phrase-level speech simulation with an airway modulation model of speech production

    PubMed Central

    Story, Brad H.

    2012-01-01

    Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and simulation of words and phrases are demonstrated. PMID:23503742

  11. Coalescence vs. concatenation: Sophisticated analyses vs. first principles applied to rooting the angiosperms.

    PubMed

    Simmons, Mark P; Gatesy, John

    2015-10-01

    It has recently been concluded that phylogenomic data from 310 nuclear genes support the clade of (Amborellales, Nymphaeales) as sister to the remaining angiosperms and that shortcut coalescent phylogenetic methods outperformed concatenation for these data. We falsify both of those conclusions here by demonstrating that discrepant results between the coalescent and concatenation analyses are primarily caused by the coalescent methods applied (MP-EST and STAR) not being robust to the highly divergent and often mis-rooted gene trees that were used. This result reinforces the expectation that low amounts of phylogenetic signal and methodological artifacts in gene-tree reconstruction can be more problematic for shortcut coalescent methods than is the assumption of a single hierarchy for all genes by concatenation methods when these approaches are applied to ancient divergences in empirical studies. We also demonstrate that a third coalescent method, ASTRAL, is more robust to mis-rooted gene trees than MP-EST or STAR, and that both Observed Variability (OV) and Tree Independent Generation of Evolutionary Rates (TIGER), which are two character subsampling procedures, are biased in favor of characters with highly asymmetrical distributions of character states when applied to this dataset. We conclude that enthusiastic application of novel tools is not a substitute for rigorous application of first principles, and that trending methods (e.g., shortcut coalescent methods applied to ancient divergences, tree-independent character subsampling), may be novel sources of previously under-appreciated, systematic errors. PMID:26002829

  12. Coalescent versus concatenation methods and the placement of Amborella as sister to water lilies.

    PubMed

    Xi, Zhenxiang; Liu, Liang; Rest, Joshua S; Davis, Charles C

    2014-11-01

    The molecular era has fundamentally reshaped our knowledge of the evolution and diversification of angiosperms. One outstanding question is the phylogenetic placement of Amborella trichopoda Baill., commonly thought to represent the first lineage of extant angiosperms. Here, we leverage publicly available data and provide a broad coalescent-based species tree estimation of 45 seed plants. By incorporating 310 nuclear genes, our coalescent analyses strongly support a clade containing Amborella plus water lilies (i.e., Nymphaeales) that is sister to all other angiosperms across different nucleotide rate partitions. Our results also show that commonly applied concatenation methods produce strongly supported, but incongruent placements of Amborella: slow-evolving nucleotide sites corroborate results from coalescent analyses, whereas fast-evolving sites place Amborella alone as the first lineage of extant angiosperms. We further explored the performance of coalescent versus concatenation methods using nucleotide sequences simulated on (i) the two alternate placements of Amborella with branch lengths and substitution model parameters estimated from each of the 310 nuclear genes and (ii) three hypothetical species trees that are topologically identical except with respect to the degree of deep coalescence and branch lengths. Our results collectively suggest that the Amborella alone placement inferred using concatenation methods is likely misled by fast-evolving sites. This appears to be exacerbated by the combination of long branches in stem group angiosperms, Amborella, and Nymphaeales with the short internal branch separating Amborella and Nymphaeales. In contrast, coalescent methods appear to be more robust to elevated substitution rates. PMID:25077515

  13. High rate concatenated coding systems using multidimensional bandwidth-efficient trellis inner codes

    NASA Astrophysics Data System (ADS)

    Deng, Robert H.; Costello, Daniel J., Jr.

    1989-10-01

    A concatenated coding system using two-dimensional trellis-coded MPSK inner codes and Reed-Solomon outer codes for application in high-speed satellite communication systems was proposed previously by the authors (1989). The authors extend their results to systems using symbol-oriented, multidimensional, trellis-coded MPSK inner codes. The concatenated coding systems are divided into two classes according to their achievable effective information rates. The first class uses multidimensional trellis-coded 8-PSK inner codes and achieves effective information rates around 1 b/dimension (spectral efficiency 2 b/s/Hz). The second class employs multidimensional trellis-coded 16-PSK inner codes and provides effective information rates around 1.5 b/dimension (spectral efficiency 3 b/s/Hz). Both classes provide significant coding gains over an uncoded reference system with the same effective information rate as the coded system. The results show that the symbol-oriented nature of multidimensional inner codes can provide an improvement of up to 1 dB in the overall performance of a concatenated coding system when these codes replace bit-oriented two-dimensional codes.

  14. Viterbi decoder node synchronization losses in the Reed-Solomon/Viterbi concatenated channel

    NASA Technical Reports Server (NTRS)

    Deutsch, L. J.; Miller, R. L.

    1982-01-01

    The Viterbi decoders currently used by the Deep Space Network (DSN) employ an algorithm for maintaining node synchronization that significantly degrades at bit signal-to-noise ratios (SNRs) of below 2.0 dB. In a recent report by the authors, it was shown that the telemetry receiving system, which uses a convolutionally encoded downlink, will suffer losses of 0.85 dB and 1.25 dB respectively at Voyager 2 Uranus and Neptune encounters. This report extends the results of that study to a concatenated (255,223) Reed-Solomon/(7, 1/2) convolutionally coded channel, by developing a new radio loss model for the concatenated channel. It is shown here that losses due to improper node synchronization of 0.57 dB at Uranus and 1.0 dB at Neptune can be expected if concatenated coding is used along with an array of one 64-meter and three 34-meter antennas.

  15. DSD (Double Soft Decision) concatenated FEC scheme on mobile satellite communication systems

    NASA Astrophysics Data System (ADS)

    Honda, Shunji; Kubota, Shuji; Kato, Shuzo

    1992-10-01

    In order to realize a higher-coding-gain forward error correction scheme in mobile satellite communication systems, a novel concatenated coding scheme employing soft decision decoding for not only the inner codes but also the outer codes (the DSD (Double Soft Decision) concatenated forward error correction scheme) is proposed. Soft-decision outer decoding can improve the bit error probability of inner decoded data, which traditionally cannot be sufficiently corrected in fading channels. In this scheme, likelihood information from an inner Viterbi decoder is used in the decoding of the outer codes. A technique using the path memory circuit status 1/0 ratio as likelihood information is newly proposed, and it is shown that this method is the most reliable even though it requires the simplest hardware among the alternative likelihood information extracting methods. A computer simulation clarifies that the proposed DSD scheme improves Pe performance to one-third that of conventional hard-decision outer decoding. Moreover, to reduce the interleaving delay time of the inner decoded data of the concatenated codes in fading channels, a parallel forward error correction scheme is proposed.

  16. Type of Speech Material Affects Acceptable Noise Level Test Outcome.

    PubMed

    Koch, Xaver; Dingemanse, Gertjan; Goedegebure, André; Janse, Esther

    2016-01-01

    The acceptable noise level (ANL) test, in which individuals indicate what level of noise they are willing to put up with while following speech, has been used to guide hearing aid fitting decisions and has been found to relate to prospective hearing aid use. Unlike objective measures of speech perception ability, ANL outcome is not related to individual hearing loss or age, but rather reflects an individual's inherent acceptance of competing noise while listening to speech. As such, the measure may predict aspects of hearing aid success. Crucially, however, recent studies have questioned its repeatability (test-retest reliability). The first question for this study was whether the inconsistent results regarding the repeatability of the ANL test may be due to differences in speech material types used in previous studies. Second, it is unclear whether meaningfulness and semantic coherence of the speech modify ANL outcome. To investigate these questions, we compared ANLs obtained with three types of materials: the International Speech Test Signal (ISTS), which is non-meaningful and semantically non-coherent by definition, passages consisting of concatenated meaningful standard audiology sentences, and longer fragments taken from conversational speech. We included conversational speech as this type of speech material is most representative of everyday listening. Additionally, we investigated whether ANL outcomes, obtained with these three different speech materials, were associated with self-reported limitations due to hearing problems and listening effort in everyday life, as assessed by a questionnaire. ANL data were collected for 57 relatively good-hearing adult participants with an age range representative for hearing aid users. Results showed that meaningfulness, but not semantic coherence of the speech material affected ANL. Less noise was accepted for the non-meaningful ISTS signal than for the meaningful speech materials. ANL repeatability was comparable across

  17. Type of Speech Material Affects Acceptable Noise Level Test Outcome

    PubMed Central

    Koch, Xaver; Dingemanse, Gertjan; Goedegebure, André; Janse, Esther

    2016-01-01

    The acceptable noise level (ANL) test, in which individuals indicate what level of noise they are willing to put up with while following speech, has been used to guide hearing aid fitting decisions and has been found to relate to prospective hearing aid use. Unlike objective measures of speech perception ability, ANL outcome is not related to individual hearing loss or age, but rather reflects an individual’s inherent acceptance of competing noise while listening to speech. As such, the measure may predict aspects of hearing aid success. Crucially, however, recent studies have questioned its repeatability (test–retest reliability). The first question for this study was whether the inconsistent results regarding the repeatability of the ANL test may be due to differences in speech material types used in previous studies. Second, it is unclear whether meaningfulness and semantic coherence of the speech modify ANL outcome. To investigate these questions, we compared ANLs obtained with three types of materials: the International Speech Test Signal (ISTS), which is non-meaningful and semantically non-coherent by definition, passages consisting of concatenated meaningful standard audiology sentences, and longer fragments taken from conversational speech. We included conversational speech as this type of speech material is most representative of everyday listening. Additionally, we investigated whether ANL outcomes, obtained with these three different speech materials, were associated with self-reported limitations due to hearing problems and listening effort in everyday life, as assessed by a questionnaire. ANL data were collected for 57 relatively good-hearing adult participants with an age range representative for hearing aid users. Results showed that meaningfulness, but not semantic coherence of the speech material affected ANL. Less noise was accepted for the non-meaningful ISTS signal than for the meaningful speech materials. ANL repeatability was comparable

  18. Concatenated coding systems employing a unit-memory convolutional code and a byte-oriented decoding algorithm

    NASA Technical Reports Server (NTRS)

    Lee, L. N.

    1976-01-01

    Concatenated coding systems utilizing a convolutional code as the inner code and a Reed-Solomon code as the outer code are considered. In order to obtain very reliable communications over a very noisy channel with relatively small coding complexity, it is proposed to concatenate a byte oriented unit memory convolutional code with an RS outer code whose symbol size is one byte. It is further proposed to utilize a real time minimal byte error probability decoding algorithm, together with feedback from the outer decoder, in the decoder for the inner convolutional code. The performance of the proposed concatenated coding system is studied, and the improvement over conventional concatenated systems due to each additional feature is isolated.

  19. Concatenated coding systems employing a unit-memory convolutional code and a byte-oriented decoding algorithm

    NASA Technical Reports Server (NTRS)

    Lee, L.-N.

    1977-01-01

    Concatenated coding systems utilizing a convolutional code as the inner code and a Reed-Solomon code as the outer code are considered. In order to obtain very reliable communications over a very noisy channel with relatively modest coding complexity, it is proposed to concatenate a byte-oriented unit-memory convolutional code with an RS outer code whose symbol size is one byte. It is further proposed to utilize a real-time minimal-byte-error probability decoding algorithm, together with feedback from the outer decoder, in the decoder for the inner convolutional code. The performance of the proposed concatenated coding system is studied, and the improvement over conventional concatenated systems due to each additional feature is isolated.

  20. Static and Dynamic Features for Improved HMM based Visual Speech Recognition

    NASA Astrophysics Data System (ADS)

    Rajavel, R.; Sathidevi, P. S.

    Visual speech recognition refers to the identification of utterances through the movements of lips, tongue, teeth, and other facial muscles of the speaker without using the acoustic signal. This work shows the relative benefits of both static and dynamic visual speech features for improved visual speech recognition. Two approaches for visual feature extraction have been considered: (1) an image transform based static feature approach, in which the Discrete Cosine Transform (DCT) is applied to each video frame and 6×6 triangle region coefficients are considered as features; Principal Component Analysis (PCA) is applied over all 60 features corresponding to the video frame to reduce the redundancy, and the resultant 21 coefficients are taken as the static visual features. (2) A motion segmentation based dynamic feature approach, in which the facial movements are segmented from the video file using motion history images (MHI); DCT is applied to the MHI and triangle region coefficients are taken as the dynamic visual features. Two types of experiments were done to identify the utterances: one with concatenated features and another with dimension-reduced features obtained using PCA. The left-right continuous HMMs are used as the visual speech classifier to classify nine MPEG-4 standard viseme consonants. The experimental results show that the concatenated and dimension-reduced features both improve visual speech recognition, with high accuracies of 92.45% and 92.15%, respectively.
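
    A sketch of the static-feature path under simplified assumptions (random placeholder frames, and a triangle size and component count that only roughly mirror the paper's setup), with PCA computed via SVD:

        # Sketch: 2-D DCT of a mouth-region frame, keep a low-frequency
        # triangle of coefficients, then PCA-reduce to 21 components
        # (frame data are random placeholders, not video).
        import numpy as np
        from scipy.fft import dctn

        rng = np.random.default_rng(0)
        frames = rng.normal(size=(100, 64, 64))       # 100 mouth-region frames

        def triangle_features(frame, k=10):           # k*(k+1)/2 = 55 coeffs
            c = dctn(frame, norm="ortho")
            return np.array([c[i, j] for i in range(k) for j in range(k - i)])

        X = np.stack([triangle_features(f) for f in frames])
        Xc = X - X.mean(axis=0)                       # center before PCA
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        reduced = Xc @ Vt[:21].T                      # keep 21 components
        print(reduced.shape)                          # (100, 21)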

  1. Speech communications in noise

    NASA Technical Reports Server (NTRS)

    1984-01-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.

  2. Voice synthesis application

    SciTech Connect

    Lightstone, P.C.; Davidson, W.M.

    1982-01-27

    Selection of a speech synthesis system as an augmentation for a perimeter security device is described. Criteria used in the selection of a system are discussed. The final system is a Speech 1000 speech synthesizer board that has a 2000-word speech lexicon, a one-time charge of $75 for a 32K EPROM of custom words, and extra features such as an alternate command to adjust the desired listening level.

  3. Speech and Communication Disorders

    MedlinePlus

    ... or understand speech. Causes include: hearing disorders and deafness; voice problems, such as dysphonia or those caused by cleft lip or palate; speech problems like stuttering; developmental disabilities; learning disorders; autism spectrum disorder; brain injury; and stroke. Some speech and ...

  4. Speech impairment (adult)

    MedlinePlus

    Language impairment; Impairment of speech; Inability to speak; Aphasia; Dysarthria; Slurred speech; Dysphonia voice disorders ... disorders develop gradually, but anyone can develop a speech and ... suddenly, usually in a trauma. APHASIA Alzheimer disease ...

  5. Speech impairment (adult)

    MedlinePlus

    Language impairment; Impairment of speech; Inability to speak; Aphasia; Dysarthria; Slurred speech; Dysphonia voice disorders ... Common speech and language disorders include: APHASIA Aphasia is ... understand or express spoken or written language. It commonly ...

  6. Cracking the Language Code: Neural Mechanisms Underlying Speech Parsing

    PubMed Central

    McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella

    2013-01-01

    Word segmentation, detecting word boundaries in continuous speech, is a critical aspect of language learning. Previous research in infants and adults demonstrated that a stream of speech can be readily segmented based solely on the statistical and speech cues afforded by the input. Using functional magnetic resonance imaging (fMRI), the neural substrate of word segmentation was examined on-line as participants listened to three streams of concatenated syllables, containing either statistical regularities alone, statistical regularities and speech cues, or no cues. Despite the participants’ inability to explicitly detect differences between the speech streams, neural activity differed significantly across conditions, with left-lateralized signal increases in temporal cortices observed only when participants listened to streams containing statistical regularities, particularly the stream containing speech cues. In a second fMRI study, designed to verify that word segmentation had implicitly taken place, participants listened to trisyllabic combinations that occurred with different frequencies in the streams of speech they just heard (“words,” 45 times; “partwords,” 15 times; “nonwords,” once). Reliably greater activity in left inferior and middle frontal gyri was observed when comparing words with partwords and, to a lesser extent, when comparing partwords with nonwords. Activity in these regions, taken to index the implicit detection of word boundaries, was positively correlated with participants’ rapid auditory processing skills. These findings provide a neural signature of on-line word segmentation in the mature brain and an initial model with which to study developmental changes in the neural architecture involved in processing speech cues during language learning. PMID:16855090
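
    The statistical regularities in question are syllable transitional probabilities; a minimal sketch with a toy artificial lexicon (not the study's stimuli):

        # Toy statistical word segmentation: transitional probability between
        # syllables, TP(x -> y) = count(xy) / count(x); word boundaries fall
        # where TP dips.
        import random
        from collections import Counter

        random.seed(0)
        words = ["tupiro", "golabu", "bidaku", "padoti"]   # toy lexicon
        stream = [random.choice(words) for _ in range(500)]
        syllables = [w[i:i + 2] for w in stream for i in (0, 2, 4)]

        unigrams = Counter(syllables[:-1])
        bigrams = Counter(zip(syllables, syllables[1:]))

        def tp(x, y):
            return bigrams[(x, y)] / unigrams[x]

        print("within word :", tp("tu", "pi"))   # ~1.0
        print("across words:", tp("ro", "go"))   # ~0.25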

  7. Building on residual speech: a portable processing prosthesis for aphasia.

    PubMed

    Linebarger, Marcia C; Romania, John F; Fink, Ruth B; Bartlett, Megan R; Schwartz, Myrna F

    2008-01-01

    This article examines the challenges of developing electronic communication aids for individuals with mild-to-moderate aphasia and introduces a new portable aid designed for this population. People with some residual speech are often reluctant to use communication aids that replace their natural speech with synthesized speech or the recorded utterances of another individual. SentenceShaper (computer software; Psycholinguistic Technologies, Inc; Jenkintown, Pennsylvania; www.sentenceshaper.com), a computerized "processing prosthesis," allows the user to record spoken sentence fragments and hold them in memory long enough to combine them into larger structures. Previous studies have shown that spoken narratives created with SentenceShaper--composed of concatenated, recorded segments in the user's own voice--may show marked superiority to the individual's spontaneous speech and that sustained use may engender treatment effects. However, these findings do not guarantee the program's efficacy to support functional communication or its acceptance by people with aphasia. Here, we examine strengths and weaknesses of SentenceShaper as the basis for a communication aid for individuals with mild-to-moderate aphasia and review factors guiding the design of SentenceShaper To Go, a portable extension to the program. Data from a "proof-of-concept" pilot study with the portable system suggest the viability of providing computer-based support for users' residual speech in composing and delivering spoken messages. PMID:19319763

  8. Design optimization of two concatenated long period waveguide grating devices for an application specific target spectrum.

    PubMed

    Semwal, Girish; Rastogi, Vipul

    2015-04-10

    We propose a global optimization method to optimize the parameters of two concatenated long period waveguide gratings (LPWGs) for generating a desired target spectrum. The design consists of two concatenated LPWGs with different grating periods inscribed in the guiding films of a four-layer planar waveguide with a finite overcladding. We have used the transfer matrix method to compute the modes of the structure and the coupled mode theory to compute the spectrum of the device. The adaptive particle swarm optimization method has been used to optimize the parameters of the LPWGs to generate symmetric as well as asymmetric target spectra. Two concatenated gratings of different lengths and periods have been used to generate the target spectra. To demonstrate the method of optimization we have designed a variety of wavelength filters, including a rectangular band-rejection filter, asymmetric band-rejection filters, band-rejection filters for flattening the amplified spontaneous emission (ASE) spectrum of an erbium doped fiber amplifier (EDFA), and a gain equalization filter for an erbium doped waveguide amplifier (EDWA) in the C-band. Seven parameters of the proposed LPWG structure have been optimized to achieve the desired spectra. We have obtained ASE flattening with ±0.8 dB peak-to-peak ripple in the case of the EDFA and gain flattening with ±0.4 dB peak-to-peak ripple in the case of an EDWA. The study would be useful in the design of wavelength filters for specific applications. PMID:25967297
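
    A generic (non-adaptive) particle swarm optimization sketch; a toy quadratic objective stands in for the transfer-matrix spectrum mismatch actually minimized in the paper:

        # Generic PSO sketch: particles track personal and global bests and
        # update velocities toward both (toy objective, not a grating model).
        import numpy as np

        rng = np.random.default_rng(0)
        target = np.array([2.0, -1.0, 0.5])        # "design parameters" to recover

        def objective(x):
            return np.sum((x - target) ** 2)

        n, dim, w, c1, c2 = 30, 3, 0.7, 1.5, 1.5
        pos = rng.uniform(-5, 5, (n, dim))
        vel = np.zeros((n, dim))
        pbest = pos.copy()
        pbest_val = np.apply_along_axis(objective, 1, pos)
        gbest = pbest[np.argmin(pbest_val)]

        for _ in range(200):
            r1, r2 = rng.random((n, dim)), rng.random((n, dim))
            vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
            pos += vel
            vals = np.apply_along_axis(objective, 1, pos)
            better = vals < pbest_val
            pbest[better] = pos[better]
            pbest_val[better] = vals[better]
            gbest = pbest[np.argmin(pbest_val)]

        print(gbest)   # converges toward `target`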

  9. Sample-based engine noise synthesis using an enhanced pitch-synchronous overlap-and-add method.

    PubMed

    Jagla, Jan; Maillard, Julien; Martin, Nadine

    2012-11-01

    An algorithm for the real time synthesis of internal combustion engine noise is presented. Through the analysis of a recorded engine noise signal of continuously varying engine speed, a dataset of sound samples is extracted allowing the real time synthesis of the noise induced by arbitrary evolutions of engine speed. The sound samples are extracted from a recording spanning the entire engine speed range. Each sample is delimitated such as to contain the sound emitted during one cycle of the engine plus the necessary overlap to ensure smooth transitions during the synthesis. The proposed approach, an extension of the PSOLA method introduced for speech processing, takes advantage of the specific periodicity of engine noise signals to locate the extraction instants of the sound samples. During the synthesis stage, the sound samples corresponding to the target engine speed evolution are concatenated with an overlap and add algorithm. It is shown that this method produces high quality audio restitution with a low computational load. It is therefore well suited for real time applications. PMID:23145595
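
    The core overlap-and-add operation can be sketched as follows, with a synthetic windowed grain standing in for the extracted engine-cycle samples:

        # Minimal overlap-and-add in the spirit of PSOLA: one Hann-windowed
        # grain is placed at the target period and summed, so the repetition
        # rate (the "engine speed") can differ from the source spacing.
        import numpy as np

        fs = 8000
        grain = np.hanning(400) * np.sin(2 * np.pi * 200 * np.arange(400) / fs)

        def ola(grain, period_samples, n_periods):
            out = np.zeros(period_samples * n_periods + len(grain))
            for k in range(n_periods):
                start = k * period_samples
                out[start:start + len(grain)] += grain
            return out

        slow = ola(grain, 200, 40)    # lower repetition rate
        fast = ola(grain, 100, 80)    # doubled repetition rate
        print(len(slow), len(fast))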

  10. Extended electrical tuning of quantum cascade lasers with digital concatenated gratings

    SciTech Connect

    Slivken, S.; Bandyopadhyay, N.; Bai, Y.; Lu, Q. Y.; Razeghi, M.

    2013-12-02

    In this report, the sampled grating distributed feedback laser architecture is modified with digital concatenated gratings to partially compensate for the wavelength dependence of optical gain in a standard high efficiency quantum cascade laser core. This allows equalization of laser threshold over a wide wavelength range and demonstration of wide electrical tuning. With only two control currents, a full tuning range of 500 nm (236 cm{sup −1}) has been demonstrated. Emission is single mode, with a side mode suppression of >20 dB.

  11. Investigation of the Use of Erasures in a Concatenated Coding Scheme

    NASA Technical Reports Server (NTRS)

    Kwatra, S. C.; Marriott, Philip J.

    1997-01-01

    A new method for declaring erasures in a concatenated coding scheme is investigated. This method is used with the rate 1/2 K = 7 convolutional code and the (255, 223) Reed Solomon code. Errors and erasures Reed Solomon decoding is used. The erasure method proposed uses a soft output Viterbi algorithm and information provided by decoded Reed Solomon codewords in a deinterleaving frame. The results show that a gain of 0.3 dB is possible using a minimum amount of decoding trials.

  12. An Interleaver Implementation for the Serially Concatenated Pulse-Position Modulation Decoder

    NASA Technical Reports Server (NTRS)

    Cheng, Michael K.; Moision, Bruce E.; Hamkins, Jon; Nakashima, Michael A.

    2006-01-01

    We describe novel interleaver and deinterleaver architectures that support bandwidth efficient memory access for decoders of turbo-like codes that are used in conjunction with high order modulations. The presentation focuses on a decoder for serially concatenated pulse-position modulation (SCPPM), which is a forward-error-correction code designed by NASA to support laser communications from Mars at more than 50 megabits-per-second (Mbps). For 64-ary PPM, the new architectures effectively triple the fan-in of the interleaver and fan-out of the deinterleaver, enabling parallelization that doubles the overall throughput. The techniques described here can be readily modified for other PPM orders.
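
    For reference, a classic row/column block interleaver and its matching deinterleaver (the SCPPM memory architectures discussed here are considerably more elaborate):

        # Row/column block interleaver: write symbols row-wise into a
        # rows x cols array, read them out column-wise; the deinterleaver
        # inverts the permutation.
        def interleave(seq, rows, cols):
            assert len(seq) == rows * cols
            return [seq[r * cols + c] for c in range(cols) for r in range(rows)]

        def deinterleave(seq, rows, cols):
            return [seq[c * rows + r] for r in range(rows) for c in range(cols)]

        data = list(range(12))
        shuffled = interleave(data, 3, 4)
        print(shuffled)                              # [0, 4, 8, 1, 5, 9, ...]
        print(deinterleave(shuffled, 3, 4) == data)  # True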

  13. Space communication system for compressed data with a concatenated Reed-Solomon-Viterbi coding channel

    NASA Technical Reports Server (NTRS)

    Rice, R. F.; Hilbert, E. E. (Inventor)

    1976-01-01

    A space communication system incorporating a concatenated Reed Solomon Viterbi coding channel is discussed for transmitting compressed and uncompressed data from a spacecraft to a data processing center on Earth. Imaging (and other) data are first compressed into source blocks which are then coded by a Reed Solomon coder and interleaver, followed by a convolutional encoder. The received data is first decoded by a Viterbi decoder, followed by a Reed Solomon decoder and deinterleaver. The output of the latter is then decompressed, based on the compression criteria used in compressing the data in the spacecraft. The decompressed data is processed to reconstruct an approximation of the original data-producing condition or images.

  14. Undetected error probability and throughput analysis of a concatenated coding scheme

    NASA Technical Reports Server (NTRS)

    Costello, D. J.

    1984-01-01

    The performance of a proposed concatenated coding scheme for error control on a NASA telecommand system is analyzed. In this scheme, the inner code is a distance-4 Hamming code used for both error correction and error detection. The outer code is a shortened distance-4 Hamming code used only for error detection. Interleaving is assumed between the inner and outer codes. A retransmission is requested if either the inner or outer code detects the presence of errors. Both the undetected error probability and the throughput of the system are analyzed. Results indicate that high throughputs and extremely low undetected error probabilities are achievable using this scheme.

  15. Performance of concatenated codes using 8-bit and 10-bit Reed-Solomon codes

    NASA Technical Reports Server (NTRS)

    Pollara, F.; Cheung, K.-M.

    1989-01-01

    The performance improvement of concatenated coding systems using 10-bit instead of 8-bit Reed-Solomon codes is measured by simulation. Three inner convolutional codes are considered: (7,1/2), (15,1/4), and (15,1/6). It is shown that approximately 0.2 dB can be gained at a bit error rate of 10^-6. The loss due to nonideal interleaving is also evaluated. Performance comparisons at very low bit error rates may be relevant for systems using data compression.

  16. Speech research

    NASA Astrophysics Data System (ADS)

    1992-06-01

    Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.

  17. Speech recognition and understanding

    SciTech Connect

    Vintsyuk, T.K.

    1983-05-01

    This article discusses the automatic processing of speech signals with the aim of finding a sequence of words (speech recognition) or a concept (speech understanding) being transmitted by the speech signal. The goal of the research is to develop an automatic typewriter that will automatically edit and type text under voice control. A dynamic programming method is proposed in which all possible class signals are stored, after which the presented signal is compared to all the stored signals during the recognition phase. Topics considered include element-by-element recognition of words of speech, learning speech recognition, phoneme-by-phoneme speech recognition, the recognition of connected speech, understanding connected speech, and prospects for designing speech recognition and understanding systems. An application of the composition dynamic programming method for the solution of basic problems in the recognition and understanding of speech is presented.
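
    A minimal sketch of the template-matching idea using dynamic time warping, a standard dynamic programming comparison (toy one-dimensional features, not an implementation of Vintsyuk's system):

        # Dynamic time warping: dynamic programming alignment of a spoken test
        # pattern against each stored reference; the nearest template wins.
        import numpy as np

        def dtw(a, b):
            D = np.full((len(a) + 1, len(b) + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, len(a) + 1):
                for j in range(1, len(b) + 1):
                    cost = abs(a[i - 1] - b[j - 1])
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[len(a), len(b)]

        templates = {"word_one": [1.0, 2.0, 3.0, 2.0], "word_two": [3.0, 1.0, 0.5, 0.0]}
        test = [1.1, 2.0, 2.9, 2.9, 2.1]
        print(min(templates, key=lambda w: dtw(test, templates[w])))   # word_one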

  18. Concatenation of ‘alert’ and ‘identity’ segments in dingoes’ alarm calls

    PubMed Central

    Déaux, Eloïse C.; Allen, Andrew P.; Clarke, Jennifer A.; Charrier, Isabelle

    2016-01-01

    Multicomponent signals can be formed by the uninterrupted concatenation of multiple call types. One such signal is found in dingoes, Canis familiaris dingo. This stereotyped, multicomponent ‘bark-howl’ vocalisation is formed by the concatenation of a noisy bark segment and a tonal howl segment. Both segments are structurally similar to bark and howl vocalisations produced independently in other contexts (e.g. intra- and inter-pack communication). Bark-howls are mainly uttered in response to human presence and were hypothesized to serve as alarm calls. We investigated the function of bark-howls and the respective roles of the bark and howl segments. We found that dingoes could discriminate between familiar and unfamiliar howl segments, after having only heard familiar howl vocalisations (i.e. different calls). We propose that howl segments could function as ‘identity signals’ and allow receivers to modulate their responses according to the caller’s characteristics. The bark segment increased receivers’ attention levels, providing support for earlier observational claims that barks have an ‘alerting’ function. Lastly, dingoes were more likely to display vigilance behaviours upon hearing bark-howl vocalisations, lending support to the alarm function hypothesis. Canid vocalisations, such as the dingo bark-howl, may provide a model system to investigate the selective pressures shaping complex communication systems. PMID:27460289

  19. High rate concatenated coding systems using bandwidth efficient trellis inner codes

    NASA Astrophysics Data System (ADS)

    Deng, Robert H.; Costello, Daniel J., Jr.

    1989-05-01

    High-rate concatenated coding systems with bandwidth-efficient trellis inner codes and Reed-Solomon (RS) outer codes are investigated for application in high-speed satellite communication systems. Two concatenated coding schemes are proposed. In one, the inner code is decoded with soft-decision Viterbi decoding, and the outer RS code performs error-correction-only decoding (decoding without side information). In the other, the inner code is decoded with a modified Viterbi algorithm, which produces reliability information along with the decoded output. In this algorithm, path metrics are used to estimate the entire information sequence, whereas branch metrics are used to provide reliability information on the decoded sequence. This information is used to erase unreliable bits in the decoded output. An errors-and-erasures RS decoder is then used for the outer code. The two schemes have been proposed for high-speed data communication on NASA satellite channels. The rates considered are at least double those used in current NASA systems, and the results indicate that high system reliability can still be achieved.
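
    The bookkeeping behind the errors-and-erasures outer decoder is compact enough to sketch: an RS code with minimum distance d corrects e symbol errors together with s erasures whenever 2e + s < d, so erasing a low-reliability symbol costs half the redundancy of correcting an undetected error. The reliability threshold below is an assumed parameter, not one from the paper.

        def decodable(d_min: int, n_errors: int, n_erasures: int) -> bool:
            """RS errors-and-erasures condition: e errors plus s erasures
            are correctable whenever 2e + s < d_min."""
            return 2 * n_errors + n_erasures < d_min

        def mark_erasures(symbols, reliabilities, threshold=0.3):
            """Erase (None) the symbols whose inner-decoder reliability
            falls below the threshold before outer RS decoding."""
            return [s if r >= threshold else None
                    for s, r in zip(symbols, reliabilities)]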

  20. DSD (Double Soft Decision) concatenated FEC Scheme in mobile satellite communication systems

    NASA Astrophysics Data System (ADS)

    Honda, Shunji; Kubota, Shuji; Kato, Shuzo

    This paper proposes a DSD (Double Soft Decision) concatenated forward error correction (FEC) scheme that employs soft decision decoding not only for inner codes but also for outer codes, to offer higher coding gain for mobile satellite communication systems. Soft decision outer decoding can improve the bit error probability Pe of inner-decoded data, which traditionally cannot be sufficiently corrected in fading channels. In this scheme, likelihood information from an inner Viterbi decoder is used as soft decision data to decode the outer codes. A technique that uses the ratio of 1s to 0s in the path memory circuit status as likelihood information is proposed, and this method is shown to be the most reliable way of extracting likelihood information even though its hardware is the simplest. A computer simulation shows that the proposed DSD scheme improves Pe performance to one-third that of a conventional hard decision outer decoding scheme. Moreover, to reduce the interleaving delay time of inner-decoded data of concatenated codes, especially in fading channels, a parallel forward error correction scheme is proposed.

  1. Careers in Speech Communication.

    ERIC Educational Resources Information Center

    Speech Communication Association, New York, NY.

    Brief discussions in this pamphlet suggest educational and career opportunities in the following fields of speech communication: rhetoric, public address, and communication; theatre, drama, and oral interpretation; radio, television, and film; speech pathology and audiology; speech science, phonetics, and linguistics; and speech education.…

  2. Opportunities in Speech Pathology.

    ERIC Educational Resources Information Center

    Newman, Parley W.

    The importance of speech is discussed and speech pathology is described. Types of communication disorders considered are articulation disorders, aphasia, facial deformity, hearing loss, stuttering, delayed speech, voice disorders, and cerebral palsy; examples of five disorders are given. Speech pathology is investigated from these aspects: the…

  3. Processing of Speech Signals for Physical and Sensory Disabilities

    NASA Astrophysics Data System (ADS)

    Levitt, Harry

    1995-10-01

    Assistive technology involving voice communication is used primarily by people who are deaf, hard of hearing, or who have speech and/or language disabilities. It is also used to a lesser extent by people with visual or motor disabilities. A very wide range of devices has been developed for people with hearing loss. These devices can be categorized not only by the modality of stimulation [i.e., auditory, visual, tactile, or direct electrical stimulation of the auditory nerve (auditory-neural)] but also in terms of the degree of speech processing that is used. At least four such categories can be distinguished: assistive devices (a) that are not designed specifically for speech, (b) that take the average characteristics of speech into account, (c) that process articulatory or phonetic characteristics of speech, and (d) that embody some degree of automatic speech recognition. Assistive devices for people with speech and/or language disabilities typically involve some form of speech synthesis or symbol generation for severe forms of language disability. Speech synthesis is also used in text-to-speech systems for sightless persons. Other applications of assistive technology involving voice communication include voice control of wheelchairs and other devices for people with mobility disabilities.

  4. The effects of receiver tracking phase error on the performance of the concatenated Reed-Solomon/Viterbi channel coding system

    NASA Technical Reports Server (NTRS)

    Liu, K. Y.

    1981-01-01

    Analytical and experimental results are presented on the effects of receiver tracking phase error, caused by weak signal conditions on the uplink, the downlink, or both, on the performance of the concatenated Reed-Solomon (RS)/Viterbi channel coding system. The test results were obtained under an emulated S-band uplink and X-band downlink, two-way space communication channel in the telecommunication development laboratory of JPL, with data rates ranging from 4 kHz to 20 kHz. It is shown that, with ideal interleaving, the concatenated RS/Viterbi coding system is capable of yielding large coding gains at very low bit error probabilities over the Viterbi-decoded convolutional-only coding system. Results on the effects of receiver tracking phase errors on the performance of the concatenated coding system with antenna array combining are included.

  5. Insufficient chunk concatenation may underlie changes in sleep-dependent consolidation of motor sequence learning in older adults.

    PubMed

    Bottary, Ryan; Sonni, Akshata; Wright, David; Spencer, Rebecca M C

    2016-09-01

    Sleep enhances motor sequence learning (MSL) in young adults by concatenating subsequences ("chunks") formed during skill acquisition. To examine whether this process is reduced in aging, we assessed performance changes on the MSL task following overnight sleep or daytime wake in healthy young and older adults. Young adult performance enhancement was correlated with nREM2 sleep, and facilitated by preferential improvement of slowest within-sequence transitions. This effect was markedly reduced in older adults, and accompanied by diminished sigma power density (12-15 Hz) during nREM2 sleep, suggesting that diminished chunk concatenation following sleep may underlie reduced consolidation of MSL in older adults. PMID:27531835

  6. Self-Similar Conformations and Dynamics of Non-Concatenated Entangled Ring Polymers

    NASA Astrophysics Data System (ADS)

    Ge, Ting

    A scaling model of self-similar conformations and dynamics of non-concatenated entangled ring polymers is developed. Topological constraints force these ring polymers into compact conformations with fractal dimension D = 3 that we call fractal loopy globules (FLGs). This result is based on the conjecture that the overlap parameter of loops on all length scales is equal to the Kavassalis-Noolandi number, 10-20. The dynamics of entangled rings is self-similar and proceeds as loops of increasing sizes are rearranged progressively at their respective diffusion times. The topological constraints associated with smaller rearranged loops affect the dynamics of larger loops by increasing the effective friction coefficient, but have no influence on the tubes confining larger loops. Therefore, the tube diameter, defined as the average spacing between relevant topological constraints, increases with time, leading to "tube dilation". Analysis of the primitive paths in molecular dynamics (MD) simulations suggests complete tube dilation, with the tube diameter on the order of the time-dependent characteristic loop size. A characteristic loop at time t is defined as a ring section that has diffused a distance of its size during time t. We derive dynamic scaling exponents in terms of the fractal dimensions of an entangled ring and the underlying primitive path and a parameter characterizing the extent of tube dilation. The results reproduce the predictions of different dynamic models of a single non-concatenated entangled ring. We demonstrate that the traditional generalization of single-ring models to multi-ring dynamics is not self-consistent and develop a FLG model with self-consistent multi-ring dynamics and complete tube dilation. Various dynamic scaling exponents predicted by the self-consistent FLG model are consistent with recent computer simulations and experiments. We also perform MD simulations of nanoparticle (NP) diffusion in melts of non-concatenated entangled ring polymers.

  7. Physical properties of modification of speech signal fragments

    NASA Astrophysics Data System (ADS)

    Gusev, Mikhail N.

    2004-04-01

    The methods used for modifying separate speech signal fragments in the process of speech synthesis from arbitrary text are described in this report. Three groups of sounds differ in the methods used to modify their frequency characteristics, and two groups differ in the methods needed to change their durations. To modify samples of a speaker's voice with these methods, a preliminary mark-up, so-called segmentation, is necessary. Allophones are taken as the variable speech fragments. The modification methods described allow arbitrary speech sequences to be formed over a wide intonation range from a limited number of the speaker's voice patterns.
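
    The report does not give its modification algorithms, so as a generic stand-in, a duration change of a fragment can be sketched with a simple overlap-add loop: analysis frames are re-read at a scaled rate and summed under a Hann window, stretching or compressing the allophone while roughly preserving its spectrum. The frame and hop sizes are assumptions.

        import numpy as np

        def stretch_duration(x: np.ndarray, factor: float,
                             frame: int = 256, hop: int = 128) -> np.ndarray:
            """Naive overlap-add time stretching of a speech fragment x;
            factor > 1 lengthens the fragment, factor < 1 shortens it."""
            window = np.hanning(frame)
            n_out = int(len(x) * factor)
            y = np.zeros(n_out + frame)
            t = 0
            while t < n_out:
                src = min(int(t / factor), len(x) - frame)  # map output to input time
                if src < 0:
                    break  # fragment shorter than one analysis frame
                y[t:t + frame] += x[src:src + frame] * window
                t += hop
            return y[:n_out]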

  8. BoD services in layer 1 VPN with dynamic virtual concatenation group

    NASA Astrophysics Data System (ADS)

    Du, Shu; Peng, Yunfeng; Long, Keping

    2008-11-01

    Bandwidth-on-Demand (BoD) services provide dynamic bandwidth provisioning based on customers' resource requirements, which will be a must for future networks. BoD services become possible with the development of make-before-break, Virtual Concatenation (VCAT), and the Link Capacity Adjustment Scheme (LCAS). In this paper, we introduce BoD services into L1VPN, so that the resources assigned to an L1VPN can be gracefully adjusted at various bandwidth granularities based on customers' requirements. We also propose a dynamic bandwidth adjustment scheme that is a compromise between make-before-break and VCAT&LCAS, based mainly on the latter. The scheme minimizes the number of distinct paths needed to support a connection between a source-destination pair, and uses make-before-break technology for re-optimization.
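
    Sizing a virtual concatenation group is, at its core, integer arithmetic over the chosen granularity, with LCAS adding or removing members hitlessly as demand changes. The VC-4 member rate below is a standard SDH figure used purely for illustration; the paper's scheme works at whatever granularities the L1VPN offers.

        import math

        def vcg_members(requested_mbps: float, member_mbps: float = 150.336) -> int:
            """Members needed in a virtual concatenation group to carry the
            requested bandwidth at a given member granularity (VC-4 here)."""
            return math.ceil(requested_mbps / member_mbps)

        print(vcg_members(400.0))  # -> 3 members for a 400 Mbit/s request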

  9. Structure of Concatenated HAMP Domains Provides a Mechanism for Signal Transduction

    SciTech Connect

    Airola, Michael V.; Watts, Kylie J.; Bilwes, Alexandrine M.; Crane, Brian R.

    2010-08-23

    HAMP domains are widespread prokaryotic signaling modules found as single domains or poly-HAMP chains in both transmembrane and soluble proteins. The crystal structure of a three-unit poly-HAMP chain from the Pseudomonas aeruginosa soluble receptor Aer2 defines a universal parallel four-helix bundle architecture for diverse HAMP domains. Two contiguous domains integrate to form a concatenated di-HAMP structure. The three HAMP domains display two distinct conformations that differ by changes in helical register, crossing angle, and rotation. These conformations are stabilized by different subsets of conserved residues. Known signals delivered to HAMP would be expected to switch the relative stability of the two conformations and the position of a coiled-coil phase stutter at the junction with downstream helices. We propose that the two conformations represent opposing HAMP signaling states and suggest a signaling mechanism whereby HAMP domains interconvert between the two states, which alternate down a poly-HAMP chain.

  10. Efficient entanglement concentration for concatenated Greenberger-Horne-Zeilinger state with the cross-Kerr nonlinearity

    NASA Astrophysics Data System (ADS)

    Pan, Jun; Zhou, Lan; Gu, Shi-Pu; Wang, Xing-Fu; Sheng, Yu-Bo; Wang, Qin

    2016-04-01

    The concatenated Greenberger-Horne-Zeilinger (C-GHZ) state, which encodes physical qubits in a logic qubit, has great application in future quantum communication. We present an efficient entanglement concentration protocol (ECP) for recovering a less-entangled C-GHZ state into the maximally entangled C-GHZ state with the help of cross-Kerr nonlinearities and photon detectors. With the help of the cross-Kerr nonlinearity, the obtained maximally entangled C-GHZ state can be retained for other applications. Moreover, the ECP can be used repeatedly, which greatly increases the success probability. Based on the advantages above, our ECP may be useful in future long-distance quantum communication.

  11. Multidimensional Trellis Coded Phase Modulation Using a Multilevel Concatenation Approach. Part 1; Code Design

    NASA Technical Reports Server (NTRS)

    Rajpal, Sandeep; Rhee, Do Jun; Lin, Shu

    1997-01-01

    The first part of this paper presents a simple and systematic technique for constructing multidimensional M-ary phase shift keying (MPSK) trellis coded modulation (TCM) codes. The construction is based on a multilevel concatenation approach in which binary convolutional codes with good free branch distances are used as the outer codes and block MPSK modulation codes are used as the inner codes (or the signal spaces). Conditions on the phase invariance of these codes are derived and a multistage decoding scheme for these codes is proposed. The proposed technique can be used to construct good codes for both the additive white Gaussian noise (AWGN) and fading channels, as is shown in the second part of this paper.

  12. Temperature insensitive refractive index sensor based on concatenated long period fiber gratings

    NASA Astrophysics Data System (ADS)

    Tripathi, Saurabh M.; Bock, Wojtek J.; Mikulic, Predrag

    2013-10-01

    We propose and demonstrate a temperature-immune biosensor based on two concatenated LPGs incorporating a suitable inter-grating space (IGS). By compensating the thermally induced phase changes in the grating regions with an appropriate length of the IGS, temperature insensitivity has been achieved. Using standard telecommunication-grade single-mode fibers, we show that a length ratio of ~8.2 is sufficient to realize the proposed temperature insensitivity. The resulting sensor shows a refractive index sensitivity of 423.28 nm/RIU, capable of detecting an index variation of 2.36 × 10^-6 RIU in bio-samples. The sensor can also be applied as a temperature-insensitive WDM channel isolation filter in optical communication systems, removing the need for any external thermal insulation packaging.
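
    The quoted detection limit follows directly from the sensitivity once an interrogator wavelength resolution is assumed; a 1 pm resolution (our assumption, not stated in the abstract) reproduces the figure:

        sensitivity_nm_per_riu = 423.28      # from the paper
        wavelength_resolution_nm = 0.001     # assumed 1 pm interrogator limit
        print(wavelength_resolution_nm / sensitivity_nm_per_riu)  # ~2.36e-6 RIU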

  13. Hyperbranched Hybridization Chain Reaction for Triggered Signal Amplification and Concatenated Logic Circuits.

    PubMed

    Bi, Sai; Chen, Min; Jia, Xiaoqiang; Dong, Ying; Wang, Zonghua

    2015-07-01

    A hyper-branched hybridization chain reaction (HB-HCR) is presented herein, which consists of only six species that can metastably coexist until the introduction of an initiator DNA to trigger a cascade of hybridization events, leading to the self-sustained assembly of hyper-branched and nicked double-stranded DNA structures. The system can readily achieve ultrasensitive detection of target DNA. Moreover, the HB-HCR principle is successfully applied to construct three-input concatenated logic circuits with excellent specificity and extended to design a security-mimicking keypad lock system. Significantly, the HB-HCR-based keypad lock can alarm immediately if the "password" is incorrect. Overall, the proposed HB-HCR with high amplification efficiency is simple, homogeneous, fast, robust, and low-cost, and holds great promise in the development of biosensing, in the programmable assembly of DNA architectures, and in molecular logic operations. PMID:26012841

  14. Inter-Calibration and Concatenation of Climate Quality Infrared Cloudy Radiances from Multiple Instruments

    NASA Technical Reports Server (NTRS)

    Behrangi, Ali; Aumann, Hartmut H.

    2013-01-01

    A change in climate is not likely to be captured by any single instrument, since no single instrument can span decades of time. Therefore, to detect signals of global climate change, observations from many instruments on different platforms have to be concatenated. This requires careful and detailed consideration of instrumental differences such as footprint size, diurnal cycle of observations, and relative biases in the spectral brightness temperatures. Furthermore, a common basic assumption is that the data quality is independent of the observed scene and therefore can be determined using clear-scene data. However, as will be demonstrated, this is not necessarily a valid assumption, as the globe is mostly cloudy. In this study we highlight challenges in the inter-calibration and concatenation of infrared radiances from multiple instruments by focusing on the analysis of deep convective or anvil clouds. TRMM/VIRS is a potentially useful instrument for correcting observational differences in local time and footprint size, and thus could be applied retroactively to vintage instruments such as AIRS, IASI, IRIS, AVHRR, and HIRS. As a first step, in this study we investigate and discuss to what extent AIRS and VIRS agree in capturing deep cloudy radiances at the same local time. The analysis also includes comparisons with one year of observations from CrIS. It was found that the instruments show calibration differences of about 1 K under deep cloudy scenes that can vary as a function of land type and local time of observation. Sensitivity to footprint size, view angle, and spectral band-pass differences cannot fully explain the observed differences. The observed discrepancies can be considered a measure of the magnitude of the issues that will arise in the comparison of legacy data with current data.

  15. Intercalibration and concatenation of climate quality infrared cloudy radiances from multiple instruments

    NASA Astrophysics Data System (ADS)

    Behrangi, Ali; Aumann, Hartmut H.

    2013-09-01

    A change in climate is not likely to be captured by any single instrument, since no single instrument can span decades of time. Therefore, to detect signals of global climate change, observations from many instruments on different platforms have to be concatenated. This requires careful and detailed consideration of instrumental differences such as footprint size, diurnal cycle of observations, and relative biases in the spectral brightness temperatures. Furthermore, a common basic assumption is that the data quality is independent of the observed scene and therefore can be determined using clear-scene data. However, as will be demonstrated, this is not necessarily a valid assumption, as the globe is mostly cloudy. In this study we highlight challenges in the inter-calibration and concatenation of infrared radiances from multiple instruments by focusing on the analysis of deep convective or anvil clouds. TRMM/VIRS is a potentially useful instrument for correcting observational differences in local time and footprint size, and thus could be applied retroactively to vintage instruments such as AIRS, IASI, IRIS, AVHRR, and HIRS. As a first step, in this study we investigate and discuss to what extent AIRS and VIRS agree in capturing deep cloudy radiances at the same local time. The analysis also includes comparisons with one year of observations from CrIS. It was found that the instruments show calibration differences of about 1 K under deep cloudy scenes that can vary as a function of land type and local time of observation. Sensitivity to footprint size, view angle, and spectral band-pass differences cannot fully explain the observed differences. The observed discrepancies can be considered a measure of the magnitude of the issues that will arise in the comparison of legacy data with current data.

  16. Concatenated hERG1 Tetramers Reveal Stoichiometry of Altered Channel Gating by RPR-260243

    PubMed Central

    Wu, Wei; Gardner, Alison

    2015-01-01

    Activation of human ether-a-go-go–related gene 1 (hERG1) K+ channels mediates repolarization of action potentials in cardiomyocytes. RPR-260243 [(3R,4R)-4-[3-(6-methoxy-quinolin-4-yl)-3-oxo-propyl]-1-[3-(2,3,5-trifluorophenyl)-prop-2-ynyl]-piperidine-3-carboxylic acid] (RPR) slows deactivation and attenuates inactivation of hERG1 channels. A detailed understanding of the molecular mechanism of hERG1 agonists such as RPR may facilitate the design of more selective and potent compounds for prevention of arrhythmia associated with abnormally prolonged ventricular repolarization. RPR binds to a hydrophobic pocket located between two adjacent hERG1 subunits, and, hence, a homotetrameric channel has four identical RPR binding sites. To investigate the stoichiometry of altered channel gating induced by RPR, we constructed and characterized tetrameric hERG1 concatemers containing a variable number of wild-type subunits and subunits containing a point mutation (L553A) that rendered the channel insensitive to RPR, ostensibly by preventing ligand binding. The slowing of deactivation by RPR was proportional to the number of wild-type subunits incorporated into a concatenated tetrameric channel, and four wild-type subunits were required to achieve maximal slowing of deactivation. In contrast, a single wild-type subunit within a concatenated tetramer was sufficient to achieve half of the maximal RPR-induced shift in the voltage dependence of hERG1 inactivation, and maximal effect was achieved in channels containing three or four wild-type subunits. Together our findings suggest that the allosteric modulation of channel gating involves distinct mechanisms of coupling between drug binding and altered deactivation and inactivation. PMID:25519838

  17. Concatenated hERG1 tetramers reveal stoichiometry of altered channel gating by RPR-260243.

    PubMed

    Wu, Wei; Gardner, Alison; Sanguinetti, Michael C

    2015-01-01

    Activation of human ether-a-go-go-related gene 1 (hERG1) K(+) channels mediates repolarization of action potentials in cardiomyocytes. RPR-260243 [(3R,4R)-4-[3-(6-methoxy-quinolin-4-yl)-3-oxo-propyl]-1-[3-(2,3,5-trifluorophenyl)-prop-2-ynyl]-piperidine-3-carboxylic acid] (RPR) slows deactivation and attenuates inactivation of hERG1 channels. A detailed understanding of the molecular mechanism of hERG1 agonists such as RPR may facilitate the design of more selective and potent compounds for prevention of arrhythmia associated with abnormally prolonged ventricular repolarization. RPR binds to a hydrophobic pocket located between two adjacent hERG1 subunits, and, hence, a homotetrameric channel has four identical RPR binding sites. To investigate the stoichiometry of altered channel gating induced by RPR, we constructed and characterized tetrameric hERG1 concatemers containing a variable number of wild-type subunits and subunits containing a point mutation (L553A) that rendered the channel insensitive to RPR, ostensibly by preventing ligand binding. The slowing of deactivation by RPR was proportional to the number of wild-type subunits incorporated into a concatenated tetrameric channel, and four wild-type subunits were required to achieve maximal slowing of deactivation. In contrast, a single wild-type subunit within a concatenated tetramer was sufficient to achieve half of the maximal RPR-induced shift in the voltage dependence of hERG1 inactivation, and maximal effect was achieved in channels containing three or four wild-type subunits. Together our findings suggest that the allosteric modulation of channel gating involves distinct mechanisms of coupling between drug binding and altered deactivation and inactivation. PMID:25519838

  18. Delayed Speech or Language Development

    MedlinePlus

    ... your child is right on schedule. Normal Speech & Language Development It's important to discuss early speech and ...

  19. Reversed-phase chromatography with multiple fraction concatenation strategy for proteome profiling of human MCF10A cells

    PubMed Central

    Wang, Yuexi; Yang, Feng; Gritsenko, Marina A.; Wang, Yingchun; Clauss, Therese; Liu, Tao; Shen, Yufeng; Monroe, Matthew E.; Lopez-Ferrer, Daniel; Reno, Theresa; Moore, Ronald J.; Klemke, Richard L.; Camp, David G.; Smith, Richard D.

    2011-01-01

    In this study, we evaluated a concatenated low pH (pH 3) and high pH (pH 10) reversed-phase liquid chromatography strategy as a first dimension for two-dimensional liquid chromatography tandem mass spectrometry (“shotgun”) proteomic analysis of a trypsin-digested human MCF10A cell sample. Compared with the more traditional strong cation exchange method, the use of concatenated high pH reversed-phase liquid chromatography as a first-dimension fractionation strategy resulted in 1.8- and 1.6-fold increases in the number of peptide and protein identifications (with two or more unique peptides), respectively. In addition to broader identifications, advantages of the concatenated high pH fractionation approach include improved protein sequence coverage, simplified sample processing, and reduced sample losses. The results demonstrate that the concatenated high pH reversed-phase strategy is an attractive alternative to strong cation exchange for two-dimensional shotgun proteomic analysis.

  20. A novel super-FEC code based on concatenated code for high-speed long-haul optical communication systems

    NASA Astrophysics Data System (ADS)

    Yuan, Jianguo; Ye, Wenwei; Jiang, Ze; Mao, Youju; Wang, Wei

    2007-05-01

    The structures of a novel super forward error correction (Super-FEC) code type based on concatenated codes for high-speed long-haul optical communication systems are studied in this paper. After the characteristics of the concatenated code and the two Super-FEC code types presented in ITU-T G.975.1 are theoretically analyzed, the Reed-Solomon (RS)(255, 239) + Bose-Chaudhuri-Hocquenghem (BCH)(1023, 963) concatenated code is presented. Simulation shows that this novel code type, compared with the RS(255, 239) + convolutional self-orthogonal code (CSOC) (k0/n0 = 6/7, J = 8) code in ITU-T G.975.1, has lower redundancy and better error-correction capability; its net coding gain (NCG) at the third iteration is 0.57 dB more than that of the RS(255, 239) + CSOC code at the third iteration for a bit error rate (BER) of 10^-12. Therefore, the novel code type is better suited to long-haul, larger-capacity, and higher-bit-rate optical communication systems. Furthermore, the design and implementation of the novel concatenated code type are also discussed.

  1. Emotional speech acoustic model for Malay: iterative versus isolated unit training.

    PubMed

    Mustafa, Mumtaz Begum; Ainon, Raja Noor

    2013-10-01

    The ability of a speech synthesis system to synthesize emotional speech enhances the user's experience with such systems and their related applications. However, the development of an emotional speech synthesis system is a daunting task in view of the complexity of human emotional speech. The more recent state-of-the-art speech synthesis systems, such as those based on hidden Markov models, can synthesize emotional speech with acceptable naturalness with the use of a good emotional speech acoustic model. However, building an emotional speech acoustic model requires adequate resources, including segment-phonetic labels of emotional speech, which is a problem for many under-resourced languages, including Malay. This research shows how it is possible to build an emotional speech acoustic model for Malay with minimal resources. To achieve this objective, two initialization methods were considered: iterative training using the deterministic annealing expectation maximization algorithm, and isolated unit training. The seed model for the automatic segmentation is a neutral speech acoustic model, which was transformed to the target emotion using two transformation techniques: model adaptation and context-dependent boundary refinement. Two forms of evaluation were performed: an objective evaluation measuring the prosody error and a listening evaluation measuring the naturalness of the synthesized emotional speech.

  2. A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading.

    PubMed

    Price, Cathy J

    2012-08-15

    The anatomy of language has been investigated with PET or fMRI for more than 20 years. Here I attempt to provide an overview of the brain areas associated with heard speech, speech production and reading. The conclusions of many hundreds of studies were considered, grouped according to the type of processing, and reported in the order that they were published. Many findings have been replicated time and time again leading to some consistent and undisputable conclusions. These are summarised in an anatomical model that indicates the location of the language areas and the most consistent functions that have been assigned to them. The implications for cognitive models of language processing are also considered. In particular, a distinction can be made between processes that are localized to specific structures (e.g. sensory and motor processing) and processes where specialisation arises in the distributed pattern of activation over many different areas that each participate in multiple functions. For example, phonological processing of heard speech is supported by the functional integration of auditory processing and articulation; and orthographic processing is supported by the functional integration of visual processing, articulation and semantics. Future studies will undoubtedly be able to improve the spatial precision with which functional regions can be dissociated but the greatest challenge will be to understand how different brain regions interact with one another in their attempts to comprehend and produce language. PMID:22584224

  3. A review and synthesis of the first 20 years of PET and fMRI studies of heard speech, spoken language and reading

    PubMed Central

    Price, Cathy J.

    2012-01-01

    The anatomy of language has been investigated with PET or fMRI for more than 20 years. Here I attempt to provide an overview of the brain areas associated with heard speech, speech production and reading. The conclusions of many hundreds of studies were considered, grouped according to the type of processing, and reported in the order that they were published. Many findings have been replicated time and time again leading to some consistent and undisputable conclusions. These are summarised in an anatomical model that indicates the location of the language areas and the most consistent functions that have been assigned to them. The implications for cognitive models of language processing are also considered. In particular, a distinction can be made between processes that are localized to specific structures (e.g. sensory and motor processing) and processes where specialisation arises in the distributed pattern of activation over many different areas that each participate in multiple functions. For example, phonological processing of heard speech is supported by the functional integration of auditory processing and articulation; and orthographic processing is supported by the functional integration of visual processing, articulation and semantics. Future studies will undoubtedly be able to improve the spatial precision with which functional regions can be dissociated but the greatest challenge will be to understand how different brain regions interact with one another in their attempts to comprehend and produce language. PMID:22584224

  4. Segmentation and frequency domain ML pitch estimation of speech signals

    NASA Astrophysics Data System (ADS)

    Hanna, Salim A.

    The rate of oscillation of the vocal cords and its inverse value, the pitch period, are important speech features that are useful for speech analysis/synthesis, speech recognition, and speech coding. An automatic approach for the estimation of the pitch period in continuous speech is presented. The proposed approach considers the segmentation of the speech signal into homogeneous regions and the detection of segments that are generated by vocal cord oscillations prior to pitch estimation. The pitch period of voiced segments is estimated in the frequency domain using a maximum likelihood (ML) procedure. The estimated pitch period is chosen to maximize a likelihood function over the range of expected pitch periods. An efficient simplified realization of the generalized likelihood ratio segmentation method is also described.
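
    The abstract does not give the likelihood function, but a common frequency-domain formulation can serve as a toy stand-in for the ML criterion: score each candidate pitch by the spectral energy its harmonic comb collects, and pick the maximizer over the expected pitch range. The signal, sampling rate, and search grid below are assumptions.

        import numpy as np

        def estimate_pitch(x: np.ndarray, fs: float,
                           f_lo: float = 60.0, f_hi: float = 400.0) -> float:
            """Pick the candidate F0 whose first 10 harmonics collect the
            most spectral energy (harmonic summation)."""
            spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
            freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
            best_f0, best_score = f_lo, -np.inf
            for f0 in np.arange(f_lo, f_hi, 1.0):
                harmonics = f0 * np.arange(1, 11)
                harmonics = harmonics[harmonics < freqs[-1]]
                idx = np.searchsorted(freqs, harmonics)
                score = spec[idx].sum()
                if score > best_score:
                    best_f0, best_score = f0, score
            return best_f0

        fs = 8000.0
        t = np.arange(2048) / fs
        voiced = np.sign(np.sin(2 * np.pi * 120.0 * t))  # crude 120 Hz pulse train
        print(estimate_pitch(voiced, fs))  # roughly 120 Hz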

  5. Children's perception of their synthetically corrected speech production.

    PubMed

    Strömbergsson, Sofia; Wengelin, Asa; House, David

    2014-06-01

    We explore children's perception of their own speech - in its online form, in its recorded form, and in synthetically modified forms. Children with phonological disorder (PD) and children with typical speech and language development (TD) performed tasks of evaluating accuracy of the different types of speech stimuli, either immediately after having produced the utterance or after a delay. In addition, they performed a task designed to assess their ability to detect synthetic modification. Both groups showed high performance in tasks involving evaluation of other children's speech, whereas in tasks of evaluating one's own speech, the children with PD were less accurate than their TD peers. The children with PD were less sensitive to misproductions in immediate conjunction with their production of an utterance, and more accurate after a delay. Within-category modification often passed undetected, indicating a satisfactory quality of the generated speech. Potential clinical benefits of using corrective re-synthesis are discussed. PMID:24405224

  6. Acceptance speech.

    PubMed

    Carpenter, M

    1994-01-01

    In Bangladesh, the assistant administrator of USAID gave an acceptance speech at an awards ceremony on the occasion of the 25th anniversary of oral rehydration solution (ORS). The ceremony celebrated the key role of the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B) in the discovery of ORS. Its research activities over the last 25 years have brought ORS to every village in the world, preventing more than a million deaths each year. ORS is the most important medical advance of the 20th century. It is affordable and client-oriented, a true appropriate technology. USAID has provided more than US$ 40 million to ICDDR,B for diarrheal disease and measles research, urban and rural applied family planning and maternal and child health research, and vaccine development. ICDDR,B began as the relatively small Cholera Research Laboratory and has grown into an acclaimed international center for health, family planning, and population research. It leads the world in diarrheal disease research. ICDDR,B is the leading center for applied health research in South Asia. It trains public health specialists from around the world. The government of Bangladesh and the international donor community have actively joined in support of ICDDR,B. The government applies the results of ICDDR,B research to its programs to improve the health and well-being of Bangladeshis. ICDDR,B now also studies acute respiratory diseases and measles. Population and health comprise 1 of USAID's 4 strategic priorities, the others being economic growth, environment, and democracy, USAID promotes people's participation in these 4 areas and in the design and implementation of development projects. USAID is committed to the use and improvement of ORS and to complementary strategies that further reduce diarrhea-related deaths. Continued collaboration with a strong user perspective and integrated services will lead to sustainable development. PMID:12345470

  7. Speech disorders - children

    MedlinePlus

    ... deficiency; Voice disorders; Vocal disorders; Disfluency; Communication disorder - speech disorder ... The following tests can help diagnose speech disorders: Denver ... Peabody Picture Test Revised. A hearing test may also be done.

  8. Speech and Communication Disorders

    MedlinePlus

    ... speech. Causes include: hearing disorders and deafness; voice problems, such as dysphonia or those caused by cleft lip or palate; speech problems like stuttering; developmental disabilities; learning disorders; autism spectrum ...

  9. Speech disorders - children

    MedlinePlus

    ... person has problems creating or forming the speech sounds needed to communicate with others. Three common speech ... are disorders in which a person repeats a sound, word, or phrase. Stuttering may be the most ...

  10. Speech imagery recalibrates speech-perception boundaries.

    PubMed

    Scott, Mark

    2016-07-01

    The perceptual boundaries between speech sounds are malleable and can shift after repeated exposure to contextual information. This shift is known as recalibration. To date, the known inducers of recalibration are lexical (including phonotactic) information, lip-read information and reading. The experiments reported here are a proof-of-effect demonstration that speech imagery can also induce recalibration. PMID:27068050

  11. Speech and Language Delay

    MedlinePlus

    Speech and Language Delay Overview How do I know if my child has speech delay? Every child develops at his or her ... of the same age, the problem may be speech delay. Your doctor may think your child has ...

  12. Talking Speech Input.

    ERIC Educational Resources Information Center

    Berliss-Vincent, Jane; Whitford, Gigi

    2002-01-01

    This article presents both the factors involved in successful speech input use and the potential barriers that may suggest that other access technologies could be more appropriate for a given individual. Speech input options that are available are reviewed and strategies for optimizing use of speech recognition technology are discussed. (Contains…

  13. Speech 7 through 12.

    ERIC Educational Resources Information Center

    Nederland Independent School District, TX.

    GRADES OR AGES: Grades 7 through 12. SUBJECT MATTER: Speech. ORGANIZATION AND PHYSICAL APPEARANCE: Following the foreword, philosophy, and objectives, this guide presents a speech curriculum. The curriculum covers junior high and Speech I, II, III (senior high). Thirteen units of study are presented for junior high; each unit is divided into…

  14. The Tao of Speech.

    ERIC Educational Resources Information Center

    Dance, Frank E. X.

    1981-01-01

    Argues that the study of speech may present the characteristics of a "tao"--a path leading to an increase in humane being. Calls for speech teachers to profess the primacy of speech: "...the source of life of the human mind, the source of the compassion of the human spirit." (PD)

  15. Free Speech Yearbook 1978.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The 17 articles in this collection deal with theoretical and practical freedom of speech issues. The topics include: freedom of speech in Marquette Park, Illinois; Nazis in Skokie, Illinois; freedom of expression in the Confederate States of America; Robert M. LaFollette's arguments for free speech and the rights of Congress; the United States…

  16. A Study to Relate the Theories of John Dewey to College Level Speech-Communication Education. Final Report.

    ERIC Educational Resources Information Center

    Haase, Mary

    This report presents a synthesis of John Dewey's concepts of man-to-man speech-communication located in numerous speeches and in other relevant Dewey writings. The communication theories are formulated in a verbal model and related to college level speech-communication education. Dewey's views of communication as cooperative, communal phenomena…

  17. On the undetected error probability of a concatenated coding scheme for error control

    NASA Technical Reports Server (NTRS)

    Deng, H.; Costello, D. J., Jr.

    1984-01-01

    Consider a concatenated coding scheme for error control on a binary symmetric channel, called the inner channel. The bit error rate (BER) of the channel is correspondingly called the inner BER and is denoted epsilon_i. Two linear block codes, C_f and C_b, are used. The inner code C_f, called the frame code, is an (n, k) systematic binary block code with minimum distance d_f. The frame code is designed to correct t or fewer errors and simultaneously detect gamma (gamma >= t) or fewer errors, where t + gamma + 1 <= d_f. The outer code C_b is either an (n_b, k_b) binary block code with n_b = mk, or an (n_b, k_b) maximum distance separable (MDS) code with symbols from GF(q), where q = 2^b and the code length n_b satisfies b*n_b = mk. The integer m is the number of frames. The outer code is designed for error detection only.
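
    The design constraint on the frame code trades correction against detection: for a fixed minimum distance d_f, every extra error corrected gives up one error detected. A small enumeration (purely illustrative) makes the constraint concrete.

        def correction_detection_pairs(d_f: int):
            """All (t, gamma) designs allowed by t + gamma + 1 <= d_f with
            gamma >= t: correct up to t errors while detecting up to gamma."""
            return [(t, gamma)
                    for t in range(d_f)
                    for gamma in range(t, d_f)
                    if t + gamma + 1 <= d_f]

        print(correction_detection_pairs(8))
        # includes (0, 7), (1, 6), (2, 5), (3, 4): correction traded for detection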

  18. Multi-flow virtual concatenation triggered by path cascading degree in Flexi-Grid optical networks

    NASA Astrophysics Data System (ADS)

    Yang, Hui; Zhang, Jie; Zhao, Yongli; Wang, Shouyu; Gu, Wanyi; Han, Jianrui; Lin, Yi; Lee, Young

    2013-12-01

    Flexi-Grid optical networks can elastically allocate spectrum tailored to various bandwidth requirements. In this flexible architecture, routing and spectrum allocation (RSA) is the key problem: assigning spectral resources to accommodate traffic demands. However, the spectrum continuity and contiguity constraints in Flexi-Grid optical networks may cause network fragmentation and lead to poor spectrum utilization. In this paper, rather than relying on defragmentation methods, we propose multi-flow virtual concatenation (MFVC) for Flexi-Grid optical networks. MFVC can utilize spectral fragments effectively and decrease blocking probability without disturbing already-active services or wasting additional spectrum resources. We also analyze the feasibility of MFVC and present an MFVC-enabled transponder and control model implementation. To estimate the distribution and size of available fragments on a path in advance, a split-multi-flow RSA heuristic algorithm (SMF) is proposed that introduces a triggering mechanism based on path cascading degree (PCD). Additionally, the resource assignment scheme, guard band size, maximum number of split flows, and differential delay constraint are incorporated into MFVC, and extensive simulations demonstrate that the proposed algorithm improves spectral utilization and greatly decreases blocking probability.
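
    The flavor of multi-flow allocation can be conveyed by a greedy split (an illustration only, not the paper's SMF algorithm; PCD triggering, guard bands, and delay constraints are ignored): the demand is served by concatenating the largest free fragments, provided no more than the allowed number of sub-flows is used.

        def split_demand(fragments: list[int], demand: int, max_flows: int):
            """Greedily cover a demand (in spectrum slots) with free
            fragments, using at most max_flows concatenated sub-flows."""
            chosen = []
            for size in sorted(fragments, reverse=True):
                if demand <= 0 or len(chosen) == max_flows:
                    break
                take = min(size, demand)
                chosen.append(take)
                demand -= take
            return chosen if demand <= 0 else None  # None: request blocked

        print(split_demand([3, 8, 2, 5], demand=12, max_flows=3))  # -> [8, 4]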

  19. Auditory cortical deactivation during speech production and following speech perception: an EEG investigation of the temporal dynamics of the auditory alpha rhythm

    PubMed Central

    Jenson, David; Harkrider, Ashley W.; Thornton, David; Bowers, Andrew L.; Saltuklaroglu, Tim

    2015-01-01

    Sensorimotor integration (SMI) across the dorsal stream enables online monitoring of speech. Jenson et al. (2014) used independent component analysis (ICA) and event related spectral perturbation (ERSP) analysis of electroencephalography (EEG) data to describe anterior sensorimotor (e.g., premotor cortex, PMC) activity during speech perception and production. The purpose of the current study was to identify and temporally map neural activity from posterior (i.e., auditory) regions of the dorsal stream in the same tasks. Perception tasks required “active” discrimination of syllable pairs (/ba/ and /da/) in quiet and noisy conditions. Production conditions required overt production of syllable pairs and nouns. ICA performed on concatenated raw 68 channel EEG data from all tasks identified bilateral “auditory” alpha (α) components in 15 of 29 participants localized to pSTG (left) and pMTG (right). ERSP analyses were performed to reveal fluctuations in the spectral power of the α rhythm clusters across time. Production conditions were characterized by significant α event related synchronization (ERS; pFDR < 0.05) concurrent with EMG activity from speech production, consistent with speech-induced auditory inhibition. Discrimination conditions were also characterized by α ERS following stimulus offset. Auditory α ERS in all conditions temporally aligned with PMC activity reported in Jenson et al. (2014). These findings are indicative of speech-induced suppression of auditory regions, possibly via efference copy. The presence of the same pattern following stimulus offset in discrimination conditions suggests that sensorimotor contributions following speech perception reflect covert replay, and that covert replay provides one source of the motor activity previously observed in some speech perception tasks. To our knowledge, this is the first time that inhibition of auditory regions by speech has been observed in real-time with the ICA/ERSP technique. PMID

  20. Generation of concatenated long high-density plasma channels in air by a single femtosecond laser pulse

    NASA Astrophysics Data System (ADS)

    Papeer, J.; Bruch, R.; Dekel, E.; Pollak, O.; Botton, M.; Henis, Z.; Zigler, A.

    2015-09-01

    We experimentally demonstrate stable and reproducible generation of long concatenated high-density plasma channels in air by a single femtosecond laser pulse. Each segment of the plasma channel is created by a plasma filament left in the wake of the same single high-power laser pulse. Our method enables control to within a few millimeters over the position of each segment as well as exact temporal synchronization between segments. The combined plasma channel can extend up to several meters. The plasma density along the entire concatenated plasma channel is measured to be above 10^15 cm^-3. The demonstrated approach can be further extrapolated to a higher number of filament segments, and thus to much longer high-density plasma channels.

  1. The effects of receiver tracking phase error on the performance of the concatenated Reed-Solomon Viterbi channel coding system

    NASA Technical Reports Server (NTRS)

    Liu, K. Y.; Woo, K. T.

    1980-01-01

    In connection with attempts to achieve very low error probabilities, Odenwalder (1970) proposed a concatenated coding system using Viterbi-decoded convolutional codes as the inner code and Reed-Solomon (RS) codes as the outer code. Analytical and experimental results are presented concerning the effects of receiver tracking phase error on the performance of the concatenated RS/Viterbi channel coding system. On the basis of these results, it is concluded that certain problems regarding communication operations on deep-space missions can be alleviated by employing the RS/Viterbi coding system. In one-way communication, employing RS/Viterbi coding will also provide greater data protection than the Viterbi-decoded convolutional-only coding system.

  2. Expanding the occupational health methodology: A concatenated artificial neural network approach to model the burnout process in Chinese nurses.

    PubMed

    Ladstätter, Felix; Garrosa, Eva; Moreno-Jiménez, Bernardo; Ponsoda, Vicente; Reales Aviles, José Manuel; Dai, Junming

    2016-02-01

    Artificial neural networks are sophisticated modelling and prediction tools capable of extracting complex, non-linear relationships between predictor (input) and predicted (output) variables. This study explores this capacity by modelling non-linearities in the hardiness-modulated burnout process with a neural network. Specifically, two multi-layer feed-forward artificial neural networks are concatenated in an attempt to model the composite non-linear burnout process. Sensitivity analysis, a Monte Carlo-based global simulation technique, is then utilised to examine the first-order effects of the predictor variables on the burnout sub-dimensions and consequences. Results show that (1) this concatenated artificial neural network approach is feasible for modelling the burnout process, (2) sensitivity analysis is a productive method for studying the relative importance of predictor variables, and (3) the relationships among variables involved in the development of burnout and its consequences are non-linear to different degrees. PMID:26230967
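
    The concatenation itself amounts to feeding the first network's outputs into the second. A minimal sketch on synthetic data (the variables, shapes, and network sizes are stand-ins, not the study's measures or architecture):

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(1)
        X = rng.normal(size=(300, 5))                           # predictors (stand-ins)
        burnout = np.tanh(X @ rng.normal(size=(5, 3)))          # synthetic sub-dimensions
        outcomes = np.tanh(burnout @ rng.normal(size=(3, 2)))   # synthetic consequences

        # First network: predictors -> burnout sub-dimensions.
        net1 = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
        net1.fit(X, burnout)

        # Concatenated second network: predicted sub-dimensions -> consequences.
        net2 = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
        net2.fit(net1.predict(X), outcomes)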

  3. Data Concatenation, Bayesian Concordance and Coalescent-Based Analyses of the Species Tree for the Rapid Radiation of Triturus Newts

    PubMed Central

    Wielstra, Ben; Arntzen, Jan W.; van der Gaag, Kristiaan J.; Pabijan, Maciej; Babik, Wieslaw

    2014-01-01

    The phylogenetic relationships for rapid species radiations are difficult to disentangle. Here we study one such case, namely the genus Triturus, which is composed of the marbled and crested newts. We analyze data for 38 genetic markers, positioned in 3-prime untranslated regions of protein-coding genes, obtained with 454 sequencing. Our dataset includes twenty Triturus newts and represents all nine species. Bayesian analysis of population structure allocates all individuals to their respective species. The branching patterns obtained by data concatenation, Bayesian concordance analysis and coalescent-based estimations of the species tree differ from one another. The data concatenation based species tree shows high branch support but branching order is considerably affected by allele choice in the case of heterozygotes in the concatenation process. Bayesian concordance analysis expresses the conflict between individual gene trees for part of the Triturus species tree as low concordance factors. The coalescent-based species tree is relatively similar to a previously published species tree based upon morphology and full mtDNA and any conflicting internal branches are not highly supported. Our findings reflect high gene tree discordance due to incomplete lineage sorting (possibly aggravated by hybridization) in combination with low information content of the markers employed (as can be expected for relatively recent species radiations). This case study highlights the complexity of resolving rapid radiations and we acknowledge that to convincingly resolve the Triturus species tree even more genes will have to be consulted. PMID:25337997

  4. Expansion and concatenation of nonmuscle myosin IIA filaments drive cellular contractile system formation during interphase and mitosis

    PubMed Central

    Fenix, Aidan M.; Taneja, Nilay; Buttler, Carmen A.; Lewis, John; Van Engelenburg, Schuyler B.; Ohi, Ryoma; Burnette, Dylan T.

    2016-01-01

    Cell movement and cytokinesis are facilitated by contractile forces generated by the molecular motor, nonmuscle myosin II (NMII). NMII molecules form a filament (NMII-F) through interactions of their C-terminal rod domains, positioning groups of N-terminal motor domains on opposite sides. The NMII motors then bind and pull actin filaments toward the NMII-F, thus driving contraction. Inside of crawling cells, NMIIA-Fs form large macromolecular ensembles (i.e., NMIIA-F stacks), but how this occurs is unknown. Here we show NMIIA-F stacks are formed through two non–mutually exclusive mechanisms: expansion and concatenation. During expansion, NMIIA molecules within the NMIIA-F spread out concurrent with addition of new NMIIA molecules. Concatenation occurs when multiple NMIIA-Fs/NMIIA-F stacks move together and align. We found that NMIIA-F stack formation was regulated by both motor activity and the availability of surrounding actin filaments. Furthermore, our data showed expansion and concatenation also formed the contractile ring in dividing cells. Thus interphase and mitotic cells share similar mechanisms for creating large contractile units, and these are likely to underlie how other myosin II–based contractile systems are assembled. PMID:26960797

  5. High-speed concatenation of frequency ramps using sampled grating distributed Bragg reflector laser diode sources for OCT resolution enhancement

    NASA Astrophysics Data System (ADS)

    George, Brandon; Derickson, Dennis

    2010-02-01

    Wavelength-tunable sampled grating distributed Bragg reflector (SG-DBR) lasers used for telecommunications applications have previously demonstrated the ability to produce linear frequency ramps covering the entire tuning range of the laser at 100 kHz repetition rates. An individual SG-DBR laser has a typical tuning range of 50 nm. The InGaAs/InP material system often used with SG-DBR lasers allows for design variations that cover the 1250 to 1650 nm wavelength range. This paper addresses the possibility of concatenating the outputs of tunable SG-DBR lasers covering adjacent wavelength ranges to enhance the resolution of OCT measurements. This laser concatenation method is demonstrated by combining the 1525 nm to 1575 nm wavelength range of a "C-band" SG-DBR laser with the 1570 nm to 1620 nm wavelength coverage of an "L-band" SG-DBR laser. Measurements show that SG-DBR lasers can be concatenated with a transition switching time of less than 50 ns, with undesired leakage signals attenuated by 50 dB.

  6. Early recognition of speech

    PubMed Central

    Remez, Robert E; Thomas, Emily F

    2013-01-01

    Classic research on the perception of speech sought to identify minimal acoustic correlates of each consonant and vowel. In explaining perception, this view designated momentary components of an acoustic spectrum as cues to the recognition of elementary phonemes. This conceptualization of speech perception is untenable given the findings of phonetic sensitivity to modulation independent of the acoustic and auditory form of the carrier. The empirical key is provided by studies of the perceptual organization of speech, a low-level integrative function that finds and follows the sensory effects of speech amid concurrent events. These projects have shown that the perceptual organization of speech is keyed to modulation; is fast, unlearned, and nonsymbolic; is indifferent to short-term auditory properties; and requires attention. The ineluctably multisensory nature of speech perception also imposes conditions that distinguish language among cognitive systems. WIREs Cogn Sci 2013, 4:213–223. doi: 10.1002/wcs.1213 PMID:23926454

  7. Speech Alarms Pilot Study

    NASA Technical Reports Server (NTRS)

    Sandor, Aniko; Moses, Haifa

    2016-01-01

    Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.

  8. Neural network based speech synthesizer: A preliminary report

    NASA Technical Reports Server (NTRS)

    Villarreal, James A.; Mcintire, Gary

    1987-01-01

    A neural-net-based speech synthesis project is discussed. The novelty is that the reproduced speech was extracted from actual voice recordings. In essence, the neural network learns the timing, pitch fluctuations, connectivity between individual sounds, and speaking habits unique to an individual person. The parallel distributed processing network used for this project is the generalized backward propagation network, which has been modified to also learn sequences of actions or states given in a particular plan.
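
    The report's "generalized backward propagation" variant is not specified, but the plain backpropagation loop underneath such a synthesizer fits in a few lines; the toy task (mapping a three-number phoneme context to one acoustic parameter) and all data are invented for illustration.

        import numpy as np

        rng = np.random.default_rng(0)
        X = rng.uniform(0, 1, (200, 3))                        # toy context codes
        y = (np.sin(X.sum(axis=1) * 3.0) * 0.5 + 0.5).reshape(-1, 1)

        W1 = rng.normal(0, 0.5, (3, 16)); b1 = np.zeros(16)    # hidden layer
        W2 = rng.normal(0, 0.5, (16, 1)); b2 = np.zeros(1)     # output layer

        for _ in range(2000):                  # gradient descent on mean squared error
            h = np.tanh(X @ W1 + b1)
            err = (h @ W2 + b2) - y
            dW2 = h.T @ err / len(X); db2 = err.mean(axis=0)
            dh = (err @ W2.T) * (1 - h**2)     # backpropagate through tanh
            dW1 = X.T @ dh / len(X); db1 = dh.mean(axis=0)
            for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
                p -= 0.5 * g                   # in-place update keeps references valid

        print(float(np.mean(((np.tanh(X @ W1 + b1) @ W2 + b2) - y) ** 2)))  # final MSE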

  9. Military applications of automatic speech recognition and future requirements

    NASA Technical Reports Server (NTRS)

    Beek, Bruno; Cupples, Edward J.

    1977-01-01

    An updated summary of the state-of-the-art of automatic speech recognition and its relevance to military applications is provided. A number of potential systems for military applications are under development. These include: (1) digital narrowband communication systems; (2) automatic speech verification; (3) on-line cartographic processing unit; (4) word recognition for militarized tactical data system; and (5) voice recognition and synthesis for aircraft cockpit.

  10. NINJA-OPS: Fast Accurate Marker Gene Alignment Using Concatenated Ribosomes

    PubMed Central

    Al-Ghalith, Gabriel A.; Montassier, Emmanuel; Ward, Henry N.; Knights, Dan

    2016-01-01

    The explosion of bioinformatics technologies in the form of next generation sequencing (NGS) has facilitated a massive influx of genomics data in the form of short reads. Short read mapping is therefore a fundamental component of next generation sequencing pipelines which routinely match these short reads against reference genomes for contig assembly. However, such techniques have seldom been applied to microbial marker gene sequencing studies, which have mostly relied on novel heuristic approaches. We propose NINJA Is Not Just Another OTU-Picking Solution (NINJA-OPS, or NINJA for short), a fast and highly accurate novel method enabling reference-based marker gene matching (picking Operational Taxonomic Units, or OTUs). NINJA takes advantage of the Burrows-Wheeler (BW) alignment using an artificial reference chromosome composed of concatenated reference sequences, the “concatesome,” as the BW input. Other features include automatic support for paired-end reads with arbitrary insert sizes. NINJA is also free and open source and implements several pre-filtering methods that elicit substantial speedup when coupled with existing tools. We applied NINJA to several published microbiome studies, obtaining accuracy similar to or better than previous reference-based OTU-picking methods while achieving an order of magnitude or more speedup and using a fraction of the memory footprint. NINJA is a complete pipeline that takes a FASTA-formatted input file and outputs a QIIME-formatted taxonomy-annotated BIOM file for an entire MiSeq run of human gut microbiome 16S genes in under 10 minutes on a dual-core laptop. PMID:26820746
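
    The "concatesome" idea is compact enough to sketch: join the reference sequences into one artificial chromosome, remember each start offset, and map aligner hits back to the source sequence. This is a toy illustration only (helper names are hypothetical; the actual tool wraps a Burrows-Wheeler aligner and adds filtering and paired-end handling):

        import bisect

        def build_concatesome(refs, pad="N" * 20):
            # Concatenate reference sequences into one artificial chromosome,
            # recording start offsets; the N padding keeps an alignment from
            # straddling two neighboring references.
            starts, names, parts, pos = [], [], [], 0
            for name, seq in refs.items():
                starts.append(pos)
                names.append(name)
                parts.append(seq + pad)
                pos += len(seq) + len(pad)
            return "".join(parts), starts, names

        def hit_to_reference(starts, names, hit_pos):
            # Translate a concatesome coordinate back to its source reference
            return names[bisect.bisect_right(starts, hit_pos) - 1]

        concatesome, starts, names = build_concatesome(
            {"otu1": "ACGT" * 100, "otu2": "GGCC" * 100})
        print(hit_to_reference(starts, names, 450))  # -> otu2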

  11. Distributed processing for speech understanding

    SciTech Connect

    Bronson, E.C.; Siegel, L.

    1983-01-01

    Continuous speech understanding is a highly complex artificial intelligence task requiring extensive computation. This complexity precludes real-time speech understanding on a conventional serial computer. Distributed processing techniques can be applied to the speech understanding task to improve processing speed. In this paper, the speech understanding task and several speech understanding systems are described. Parallel processing techniques are presented and a distributed processing architecture for speech understanding is outlined. 35 references.

  12. Speech-Language Therapy (For Parents)

    MedlinePlus

    ... with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders: A speech disorder refers ...

  13. Time-expanded speech and speech recognition in older adults.

    PubMed

    Vaughan, Nancy E; Furukawa, Izumi; Balasingam, Nirmala; Mortz, Margaret; Fausti, Stephen A

    2002-01-01

    Speech understanding deficits are common in older adults. In addition to hearing sensitivity, changes in certain cognitive functions may affect speech recognition. One such change that may impact the ability to follow a rapidly changing speech signal is processing speed. When speakers slow the rate of their speech naturally in order to speak clearly, speech recognition is improved. The acoustic characteristics of naturally slowed speech are of interest in developing time-expansion algorithms to improve speech recognition for older listeners. In this study, we tested younger normally hearing, older normally hearing, and older hearing-impaired listeners on time-expanded speech using increased duration and increased intensity of unvoiced consonants. Although all groups performed best on unprocessed speech, performance with processed speech was better with the consonant gain feature without time expansion in the noise condition and better at the slowest time-expanded rate in the quiet condition. The effects of signal processing on speech recognition are discussed. PMID:17642020

  14. Free Speech Yearbook 1980.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    The 11 articles in this collection deal with theoretical and practical freedom of speech issues. The topics covered are (1) the United States Supreme Court and communication theory; (2) truth, knowledge, and a democratic respect for diversity; (3) denial of freedom of speech in Jock Yablonski's campaign for the presidency of the United Mine…

  15. Improving Alaryngeal Speech Intelligibility.

    ERIC Educational Resources Information Center

    Christensen, John M.; Dwyer, Patricia E.

    1990-01-01

    Laryngectomized patients using esophageal speech or an electronic artificial larynx have difficulty producing correct voicing contrasts between homorganic consonants. This paper describes a therapy technique that emphasizes "pushing harder" on voiceless consonants to improve alaryngeal speech intelligibility and proposes focusing on the production…

  16. Free Speech. No. 38.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schorr Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tom Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds Voted For Schorr Inquiry" by Richard Lyons, "Erosion of the…

  17. Tracking Speech Sound Acquisition

    ERIC Educational Resources Information Center

    Powell, Thomas W.

    2011-01-01

    This article describes a procedure to aid in the clinical appraisal of child speech. The approach, based on the work by Dinnsen, Chin, Elbert, and Powell (1990; Some constraints on functionally disordered phonologies: Phonetic inventories and phonotactics. "Journal of Speech and Hearing Research", 33, 28-37), uses a railway idiom to track gains in…

  18. Chief Seattle's Speech Revisited

    ERIC Educational Resources Information Center

    Krupat, Arnold

    2011-01-01

    Indian orators have been saying good-bye for more than three hundred years. John Eliot's "Dying Speeches of Several Indians" (1685), as David Murray notes, inaugurates a long textual history in which "Indians... are most useful dying," or, as in a number of speeches, bidding the world farewell as they embrace an undesired but apparently inevitable…

  19. Illustrated Speech Anatomy.

    ERIC Educational Resources Information Center

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  20. Migrations in Speech Recognition.

    ERIC Educational Resources Information Center

    Kolinsky, Regine; Morais, Jose

    1996-01-01

    Describes a new paradigm that may be appropriate for uncovering speech perceptual codes. Illusory words are detected by blending two dichotic stimuli. The paradigm's design allows for comparison of different speech units by the manipulation of the distribution of information between two inputs. (23 references) (Author/CK)

  1. Private Speech in Ballet

    ERIC Educational Resources Information Center

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  2. Teaching Freedom of Speech.

    ERIC Educational Resources Information Center

    McGaffey, Ruth

    1983-01-01

    The speech communication department at the University of Wisconsin, Madison, provides a rigorous and legally oriented course in freedom of speech. The objectives of the course are to help students gain insight into the historical and philosophical foundations of the First Amendment, the legal/judicial processes concerning the First Amendment, and…

  3. Free Speech Yearbook 1976.

    ERIC Educational Resources Information Center

    Phifer, Gregg, Ed.

    The articles collected in this annual address several aspects of First Amendment Law. The following titles are included: "Freedom of Speech As an Academic Discipline" (Franklyn S. Haiman), "Free Speech and Foreign-Policy Decision Making" (Douglas N. Freeman), "The Supreme Court and the First Amendment: 1975-1976" (William A. Linsley), "'Arnett v.…

  4. Speech processing standards

    NASA Astrophysics Data System (ADS)

    Ince, A. Nejat

    1990-05-01

    Speech processing standards are given for 64, 32, and 16 kb/s and lower rate speech and, more generally, speech-band signals, which are or will be promulgated by CCITT and NATO. The International Telegraph and Telephone Consultative Committee (CCITT) is the international body which deals, among other things, with speech processing within the context of ISDN. Within NATO there are also bodies promulgating standards which make interoperability possible without complex and expensive interfaces. Some of the applications for low-bit-rate voice, and the related work undertaken by the CCITT Study Groups responsible for developing standards in terms of encoding algorithms and codec design objectives as well as standards on the assessment of speech quality, are highlighted.

  5. Automatic speech recognition

    NASA Astrophysics Data System (ADS)

    Espy-Wilson, Carol

    2005-04-01

    Great strides have been made in the development of automatic speech recognition (ASR) technology over the past thirty years. Most of this effort has been centered around the extension and improvement of Hidden Markov Model (HMM) approaches to ASR. Current commercially-available and industry systems based on HMMs can perform well for certain situational tasks that restrict variability, such as phone dialing or limited voice commands. However, the holy grail of ASR systems is performance comparable to humans; in other words, the ability to automatically transcribe unrestricted conversational speech spoken by an infinite number of speakers under varying acoustic environments. This goal is far from being reached. Key to the success of ASR is effective modeling of variability in the speech signal. This tutorial will review the basics of ASR and the various ways in which our current knowledge of speech production, speech perception and prosody can be exploited to improve robustness at every level of the system.

  6. Can you McGurk yourself? Self-face and self-voice in audiovisual speech.

    PubMed

    Aruffo, Christopher; Shore, David I

    2012-02-01

    We are constantly exposed to our own face and voice, and we identify our own faces and voices as familiar. However, the influence of self-identity upon self-speech perception is still uncertain. Speech perception is a synthesis of both auditory and visual inputs; although we hear our own voice when we speak, we rarely see the dynamic movements of our own face. If visual speech and identity are processed independently, no processing advantage would obtain in viewing one's own highly familiar face. In the present experiment, the relative contributions of facial and vocal inputs to speech perception were evaluated with an audiovisual illusion. Our results indicate that auditory self-speech conveys a processing advantage, whereas visual self-speech does not. The data thereby support a model of visual speech as dynamic movement processed separately from speaker recognition. PMID:22033983

  7. Commercial applications of speech interface technology: an industry at the threshold.

    PubMed

    Oberteuffer, J A

    1995-10-24

    Speech interface technology, which includes automatic speech recognition, synthetic speech, and natural language processing, is beginning to have a significant impact on business and personal computer use. Today, powerful and inexpensive microprocessors and improved algorithms are driving commercial applications in computer command, consumer, data entry, speech-to-text, telephone, and voice verification. Robust speaker-independent recognition systems for command and navigation in personal computers are now available; telephone-based transaction and database inquiry systems using both speech synthesis and recognition are coming into use. Large-vocabulary speech interface systems for document creation and read-aloud proofing are expanding beyond niche markets. Today's applications represent a small preview of a rich future for speech interface technology that will eventually replace keyboards with microphones and loudspeakers to give easy accessibility to increasingly intelligent machines. PMID:7479717

  8. FPGA implementation of concatenated non-binary QC-LDPC codes for high-speed optical transport.

    PubMed

    Zou, Ding; Djordjevic, Ivan B

    2015-06-01

    In this paper, we propose a soft-decision-based FEC scheme that is the concatenation of a non-binary LDPC code and a hard-decision FEC code. The proposed NB-LDPC + RS scheme with an overhead of 27.06% provides a superior NCG of 11.9 dB at a post-FEC BER of 10^-15. As a result, the proposed NB-LDPC codes represent a strong soft-decision FEC candidate for beyond-100 Gb/s optical transmission systems. PMID:26072810
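
    For readers unfamiliar with NCG bookkeeping: it is the Q-factor improvement implied by the pre- and post-FEC error rates, minus the rate penalty of the overhead. A sketch of that arithmetic (the 4e-2 pre-FEC threshold below is an assumed, illustrative value, not a figure from the paper):

        from math import log10, sqrt
        from scipy.special import erfcinv

        def q_factor_db(ber):
            # Q-factor (in dB) corresponding to a BER on a binary AWGN channel
            return 20 * log10(sqrt(2) * erfcinv(2 * ber))

        def ncg_db(post_fec_ber, pre_fec_threshold, overhead):
            # Net coding gain: Q improvement minus the code-rate penalty
            rate = 1.0 / (1.0 + overhead)
            return q_factor_db(post_fec_ber) - q_factor_db(pre_fec_threshold) + 10 * log10(rate)

        # 27.06% overhead, post-FEC target of 1e-15
        print(ncg_db(1e-15, 4e-2, 0.2706))  # -> roughly 12 dB with these inputs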

  9. Detection of terahertz radiation by tightly concatenated InGaAs field-effect transistors integrated on a single chip

    SciTech Connect

    Popov, V. V.; Yermolaev, D. M.; Shapoval, S. Yu.; Maremyanin, K. V.; Gavrilenko, V. I.; Zemlyakov, V. E.; Bespalov, V. A.; Yegorkin, V. I.; Maleev, N. A.; Ustinov, V. M.

    2014-04-21

    A tightly concatenated chain of InGaAs field-effect transistors with an asymmetric T-gate in each transistor demonstrates strong terahertz photovoltaic response without using supplementary antenna elements. We obtain responsivities above 1000 V/W and up to 2000 V/W for unbiased and drain-biased transistors in the chain, respectively, with a noise equivalent power below 10^-11 W/Hz^1/2 in the unbiased mode of detector operation.

  10. Research on Speech Perception. Progress Report No. 9, January 1983-December 1983.

    ERIC Educational Resources Information Center

    Pisoni, David B.; And Others

    Summarizing research activities from January 1983 to December 1983, this is the ninth annual report of research on speech perception, analysis and synthesis conducted in the Speech Research Laboratory of the Department of Psychology at Indiana University. The report includes extended manuscripts, short reports, and progress reports. The report…

  11. Voice and Speech after Laryngectomy

    ERIC Educational Resources Information Center

    Stajner-Katusic, Smiljka; Horga, Damir; Musura, Maja; Globlek, Dubravka

    2006-01-01

    The aim of the investigation is to compare voice and speech quality in alaryngeal patients using esophageal speech (ESOP, eight subjects), electroacoustical speech aid (EACA, six subjects) and tracheoesophageal voice prosthesis (TEVP, three subjects). The subjects reading a short story were recorded in the sound-proof booth and the speech samples…

  12. Sperry Univac speech communications technology

    NASA Technical Reports Server (NTRS)

    Medress, Mark F.

    1977-01-01

    Technology and systems for effective verbal communication with computers were developed. A continuous speech recognition system for verbal input, a word spotting system to locate key words in conversational speech, prosodic tools to aid speech analysis, and a prerecorded voice response system for speech output are described.

  13. Speech Pathology Assistant. Trainee Manual.

    ERIC Educational Resources Information Center

    National Association for Hearing and Speech Action, Silver Spring, MD.

    Part of an instructional set which includes an instructor's guide, this trainee manual is designed to provide speech pathology students with some basic and essential knowledge about the communication process. The manual contains nine modules: (1) speech pathology assistant, (2) the bases of speech (structure and function of the speech mechanism,…

  14. Speech Correction in the Schools.

    ERIC Educational Resources Information Center

    Eisenson, Jon; Ogilvie, Mardel

    An introduction to the problems and therapeutic needs of school age children whose speech requires remedial attention, the text is intended for both the classroom teacher and the speech correctionist. General considerations include classification and incidence of speech defects, speech correction services, the teacher as a speaker, the mechanism…

  15. Speech Delay: Its Treatment by Speech Play.

    ERIC Educational Resources Information Center

    Craft, Michael

    Directed to parents, the text discusses normal and delayed speech development and considers the causes of delay. Suggestions are given for helping deaf, emotionally disturbed, brain damaged, and physically handicapped children. Additional suggestions are provided for parents of twins, of stutterers, and of mongoloid or multiply handicapped…

  16. Portable Speech Synthesizer

    NASA Technical Reports Server (NTRS)

    Leibfritz, Gilbert H.; Larson, Howard K.

    1987-01-01

    Compact speech synthesizer is useful traveling companion for the speech-handicapped. User simply enters statement on keyboard, and synthesizer converts statement into spoken words. Battery-powered and housed in briefcase, easily carried on trips. Unit used on telephones and in face-to-face communication. Synthesizer consists of microcomputer with memory-expansion module, speech-synthesizer circuit, batteries, recharger, dc-to-dc converter, and telephone amplifier. Components, commercially available, fit neatly in 17- by 13- by 5-in. briefcase. Weighs about 20 lb (9 kg) and operates and recharges from ac receptacle.

  17. The Effect of SpeechEasy on Stuttering Frequency, Speech Rate, and Speech Naturalness

    ERIC Educational Resources Information Center

    Armson, Joy; Kiefte, Michael

    2008-01-01

    The effects of SpeechEasy on stuttering frequency, stuttering severity self-ratings, speech rate, and speech naturalness for 31 adults who stutter were examined. Speech measures were compared for samples obtained with and without the device in place in a dispensing setting. Mean stuttering frequencies were reduced by 79% and 61% for the device…

  18. Reversed-Phase Chromatography with Multiple Fraction Concatenation Strategy for Proteome Profiling of Human MCF10A Cells

    SciTech Connect

    Wang, Yuexi; Yang, Feng; Gritsenko, Marina A.; Wang, Yingchun; Clauss, Therese RW; Liu, Tao; Shen, Yufeng; Monroe, Matthew E.; Lopez-Ferrer, Daniel; Reno, Theresa; Moore, Ronald J.; Klemke, Richard L.; Camp, David G.; Smith, Richard D.

    2011-05-01

    Two-dimensional liquid chromatography (2D LC) is commonly used for shotgun proteomics to improve the analysis dynamic range. Reversed-phase liquid chromatography (RPLC) has been routinely employed as the second-dimension separation prior to the mass spectrometric analysis. Construction of a 2D separation with RP-RP raises a concern about separation orthogonality. In this study, we applied a novel concatenation strategy to improve the orthogonality of 2D RP-RP formed by low pH (i.e., pH 3) and high pH (i.e., pH 10) RPLC. We confidently identified 3753 proteins (18570 unique peptides) and 5907 proteins (37633 unique peptides) from low pH RPLC-RP and high pH RPLC-RP, respectively, for a trypsin-digested human MCF10A cell sample. Compared with SCX-RP, the high pH-low pH RP-RP approach resulted in 1.8-fold and 1.6-fold increases in the number of peptide and protein identifications, respectively. In addition to the broader identifications, the high pH-low pH RP-RP approach has advantages including improved protein sequence coverage, simplified sample processing, and reduced sample loss. These results demonstrate that the concatenated high pH-low pH RP-RP strategy is an attractive alternative to SCX for 2D LC shotgun proteomic analysis.
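
    The concatenation itself is just a pooling pattern: instead of combining adjacent first-dimension fractions, every n-th fraction is pooled, so each concatenated fraction samples the whole high pH gradient and stays orthogonal to the low pH dimension. A sketch of the pattern (fraction and pool counts are illustrative, not necessarily those used here):

        def concatenate_fractions(n_fractions, n_pools):
            # Pool fraction i with i + n_pools, i + 2*n_pools, ... so every pool
            # samples early, middle, and late portions of the first-dimension run.
            return [list(range(i, n_fractions, n_pools)) for i in range(n_pools)]

        print(concatenate_fractions(96, 12)[0])  # [0, 12, 24, 36, 48, 60, 72, 84]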

  19. A strategy for absolute proteome quantification with mass spectrometry by hierarchical use of peptide-concatenated standards.

    PubMed

    Kito, Keiji; Okada, Mitsuhiro; Ishibashi, Yuko; Okada, Satoshi; Ito, Takashi

    2016-05-01

    The accurate and precise absolute abundance of proteins can be determined using mass spectrometry by spiking the sample with stable isotope-labeled standards. In this study, we developed a strategy of hierarchical use of peptide-concatenated standards (PCSs) to quantify more proteins over a wider dynamic range. Multiple primary PCSs were used for quantification of many target proteins. Unique "ID-tag peptides" were introduced into individual primary PCSs, allowing us to monitor the exact amounts of individual PCSs using a "secondary PCS" in which all "ID-tag peptides" were concatenated. Furthermore, we varied the copy number of the "ID-tag peptide" in each PCS according to a range of expression levels of target proteins. This strategy accomplished absolute quantification over a wider range than that of the measured ratios. The quantified abundance of budding yeast proteins showed a high reproducibility for replicate analyses and similar copy numbers per cell for ribosomal proteins, demonstrating the accuracy and precision of this strategy. A comparison with the absolute abundance of transcripts clearly indicated different post-transcriptional regulation of expression for specific functional groups. Thus, the approach presented here is a faithful method for the absolute quantification of proteomes and provides insights into biological mechanisms, including the regulation of expressed protein abundance. PMID:27030420
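
    The hierarchy reduces to two ratio measurements per target: the secondary PCS fixes the absolute amount of each primary PCS through its ID-tag peptide (corrected for the tag's copy number), and the target protein is then read against the primary PCS. A sketch of that arithmetic, with illustrative names and numbers that are not from the paper:

        def pcs_amount_fmol(secondary_fmol, idtag_ratio, idtag_copies):
            # Absolute amount of one primary PCS, from the ratio of its ID-tag
            # peptide to the secondary PCS, corrected for the tag's copy number.
            return secondary_fmol * idtag_ratio / idtag_copies

        def target_amount_fmol(pcs_fmol, sample_to_pcs_ratio):
            # Absolute amount of a target protein read against its primary PCS.
            return pcs_fmol * sample_to_pcs_ratio

        pcs = pcs_amount_fmol(10.0, 3.0, 5)  # 10 fmol secondary spike, 5 tag copies
        print(target_amount_fmol(pcs, 0.8))  # -> 4.8 fmol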

  20. An experimental study of the concatenated Reed-Solomon/Viterbi channel coding system performance and its impact on space communications

    NASA Technical Reports Server (NTRS)

    Liu, K. Y.; Lee, J. J.

    1981-01-01

    The need for efficient space communication at very low bit error probabilities led to the specification and implementation of a concatenated coding system using an interleaved Reed-Solomon code as the outer code and a Viterbi-decoded convolutional code as the inner code. Experimental results of this channel coding system are presented under an emulated S-band uplink and X-band downlink two-way space communication channel, where both uplink and downlink have strong carrier power. This work was performed under the NASA End-to-End Data Systems program at JPL. Test results verify that at a bit error probability of 10^-6 or less, this concatenated coding system does provide a coding gain of 2.5 dB or more over the Viterbi-decoded convolutional-only coding system. These tests also show that a desirable interleaving depth for the Reed-Solomon outer code is 8 or more. The impact of this "virtually" error-free space communication link on the transmission of images is discussed and examples of simulation results are given.
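
    The interleaving-depth finding is easier to picture with the standard block interleaver: outer RS codewords are written row-wise and transmitted column-wise, so a burst of inner-decoder errors is spread across up to "depth" codewords. A toy sketch with byte symbols and deliberately small sizes (real deep-space links interleave RS(255,223) symbols):

        def interleave(symbols, depth, width):
            # Write one RS codeword per row (depth rows), transmit column-wise
            rows = [symbols[i * width:(i + 1) * width] for i in range(depth)]
            return bytes(rows[r][c] for c in range(width) for r in range(depth))

        def deinterleave(symbols, depth, width):
            return interleave(symbols, width, depth)  # the inverse swaps the roles

        data = bytes(range(40))                  # 8 toy codewords of 5 symbols each
        tx = interleave(data, depth=8, width=5)
        assert deinterleave(tx, depth=8, width=5) == data
        # A burst of up to 8 consecutive channel errors in tx now touches each
        # codeword at most once, staying within the RS correction capability.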

  1. Fluid Dynamics of Human Phonation and Speech

    NASA Astrophysics Data System (ADS)

    Mittal, Rajat; Erath, Byron D.; Plesniak, Michael W.

    2013-01-01

    This article presents a review of the fluid dynamics, flow-structure interactions, and acoustics associated with human phonation and speech. Our voice is produced through the process of phonation in the larynx, and an improved understanding of the underlying physics of this process is essential to advancing the treatment of voice disorders. Insights into the physics of phonation and speech can also contribute to improved vocal training and the development of new speech compression and synthesis schemes. This article introduces the key biomechanical features of the laryngeal physiology, reviews the basic principles of voice production, and summarizes the progress made over the past half-century in understanding the flow physics of phonation and speech. Laryngeal pathologies, which significantly enhance the complexity of phonatory dynamics, are discussed. After a thorough examination of the state of the art in computational modeling and experimental investigations of phonatory biomechanics, we present a synopsis of the pacing issues in this arena and an outlook for research in this fascinating subject.

  2. Speech perception as categorization

    PubMed Central

    Holt, Lori L.; Lotto, Andrew J.

    2010-01-01

    Speech perception (SP) most commonly refers to the perceptual mapping from the highly variable acoustic speech signal to a linguistic representation, whether it be phonemes, diphones, syllables, or words. This is an example of categorization, in that potentially discriminable speech sounds are assigned to functionally equivalent classes. In this tutorial, we present some of the main challenges to our understanding of the categorization of speech sounds and the conceptualization of SP that has resulted from these challenges. We focus here on issues and experiments that define open research questions relevant to phoneme categorization, arguing that SP is best understood as perceptual categorization, a position that places SP in direct contact with research from other areas of perception and cognition. PMID:20601702

  3. Auditory speech preprocessors

    SciTech Connect

    Zweig, G.

    1989-01-01

    A nonlinear transmission line model of the cochlea (Zweig 1988) is proposed as the basis for a novel speech preprocessor. Sounds of different intensities, such as voiced and unvoiced speech, are preprocessed in radically different ways. The Q's of the preprocessor's nonlinear filters vary with input amplitude, higher Q's (longer integration times) corresponding to quieter sounds. Like the cochlea, the preprocessor acts as a "subthreshold laser" that traps and amplifies low-level signals, thereby aiding in their detection and analysis. 17 refs.

  4. High pH reversed-phase chromatography with fraction concatenation as an alternative to strong-cation exchange chromatography for two-dimensional proteomic analysis

    PubMed Central

    Yang, Feng; Shen, Yufeng; Camp, David G.; Smith, Richard D.

    2012-01-01

    Orthogonal high-resolution separations are critical for attaining improved analytical dynamic range and protein coverage in proteomic measurements. High pH reversed-phase liquid chromatography (RPLC) followed by fraction concatenation affords better peptide analysis than conventional strong-cation exchange (SCX) chromatography for two-dimensional proteomic analysis. For example, concatenated high pH RPLC increased identifications for peptides (1.8-fold) and proteins (1.6-fold) in shotgun proteomics analyses of a digested human protein sample. Additional advantages of high pH RPLC with fraction concatenation include improved protein sequence coverage, simplified sample processing, and reduced sample losses, making this an attractive alternative to SCX chromatography in conjunction with second-dimension low pH RPLC for two-dimensional proteomics analyses. PMID:22462785

  5. Human speech articulator measurements using low power, 2GHz Homodyne sensors

    SciTech Connect

    Barnes, T; Burnett, G C; Holzrichter, J F

    1999-06-29

    Very low power, short-range microwave "radar-like" sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate systems (and also in inanimate acoustic systems), microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications, such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications.

  6. Robust Speech Rate Estimation for Spontaneous Speech

    PubMed Central

    Wang, Dagen; Narayanan, Shrikanth S.

    2010-01-01

    In this paper, we propose a direct method for speech rate estimation from acoustic features without requiring any automatic speech transcription. We compare various spectral and temporal signal analysis and smoothing strategies to better characterize the underlying syllable structure to derive speech rate. The proposed algorithm extends the methods of spectral subband correlation by including temporal correlation and the use of prominent spectral subbands for improving the signal correlation essential for syllable detection. Furthermore, to address some of the practical robustness issues in previously proposed methods, we introduce some novel components into the algorithm, such as the use of pitch confidence for filtering spurious syllable envelope peaks, a magnifying window for tackling neighboring syllable smearing, and relative peak measure thresholds for pseudo-peak rejection. We also describe an automated approach for learning algorithm parameters from data, and find the optimal settings through Monte Carlo simulations and parameter sensitivity analysis. Final experimental evaluations are conducted based on a portion of the Switchboard corpus for which manual phonetic segmentation information and published results for direct comparison are available. The results show a correlation coefficient of 0.745 with respect to the ground truth based on manual segmentation. This result is about a 17% improvement over the current best single estimator and an 11% improvement over the multiestimator evaluated on the same Switchboard database. PMID:20428476
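
    The core of the approach, stripped of the subband correlation, pitch-confidence filtering, and magnifying-window refinements described above, is envelope smoothing followed by prominent-peak counting. A bare-bones baseline sketch, with illustrative parameter values (not the paper's algorithm):

        import numpy as np
        from scipy.signal import find_peaks

        def speech_rate_sps(x, fs):
            # Crude syllables-per-second estimate: smoothed energy envelope plus
            # prominent-peak counting.
            frame = int(0.01 * fs)                        # 10 ms frames
            n = len(x) // frame
            energy = (x[:n * frame].reshape(n, frame) ** 2).mean(axis=1)
            win = np.hanning(15)                          # ~150 ms smoothing
            env = np.convolve(energy, win / win.sum(), mode="same")
            peaks, _ = find_peaks(env, prominence=0.1 * env.max(),
                                  distance=10)            # >=100 ms between peaks
            return len(peaks) / (n * 0.01)

        # usage: rate = speech_rate_sps(signal, 16000)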

  7. The Application of The Wavelet Transform to The Low Bit Rate Speech Coding System

    NASA Astrophysics Data System (ADS)

    Mriai, Shogo; Hanazaki, Izumi

    In the analysis-synthesis coding of speech signals, realization of high quality at low bit rates depends on the extraction of characteristic parameters in the pre-processing. The precise extraction of the fundamental frequency, one of the parameters of the source information, guarantees quality in speech synthesis. But its extraction is difficult because of the influence of consonants, the non-periodicity of vocal cord vibration, the wide range of the fundamental frequency, etc. In this paper, we propose a new fundamental frequency extraction method with a criterion based on harmonic structure, together with a low bit rate speech coding system using the wavelet transform.
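
    One standard way to realize a criterion based on harmonic structure is harmonic summation: score each candidate fundamental by the spectral magnitude collected at its first few harmonics and keep the best candidate. A minimal sketch of that scoring idea only (not the authors' full extractor or the wavelet coder):

        import numpy as np

        def f0_by_harmonic_sum(frame, fs, f0_min=60.0, f0_max=400.0, n_harm=5):
            # Score each candidate F0 by the spectral magnitude at its first
            # n_harm harmonics and keep the best (a harmonic-structure criterion).
            spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
            freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
            def score(f0):
                bins = [np.argmin(np.abs(freqs - k * f0)) for k in range(1, n_harm + 1)]
                return spec[bins].sum()
            return max(np.arange(f0_min, f0_max, 1.0), key=score)

        # usage: f0 = f0_by_harmonic_sum(voiced_frame, fs=8000)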

  8. The implementation of a system for synthesizing articulate Polish speech from written text input from a PC type computer keyboard

    NASA Astrophysics Data System (ADS)

    Imiolczyk, Janusz; Nowak, Ignacy; Demenko, Grazyna

    1993-12-01

    This article describes a system for automatic synthesis of Polish speech developed over the last three years at the Speech Analysis and Synthesis Laboratory. The system's acoustic signal generator is a customized integrated circuit controlled from a PC AT computer which runs on original software consisting of editor, phonetic transcription, and digital parameter synthesis modules. The speech is synthesized in real time and is very comprehensible, which has opened up prospects for specific applications for the system. This study was conducted under IBPT Order No 412.

  9. Speech Alarms Pilot Study

    NASA Technical Reports Server (NTRS)

    Sandor, A.; Moses, H. R.

    2016-01-01

    Currently on the International Space Station (ISS) and other space vehicles, Caution & Warning (C&W) alerts are represented with various auditory tones that correspond to the type of event. This system relies on the crew's ability to remember what each tone represents in a high-stress, high-workload environment when responding to the alert. Furthermore, crews receive training a year or more in advance of the mission, which makes remembering the semantic meaning of the alerts more difficult. The current system works for missions conducted close to Earth, where ground operators can assist as needed. On long-duration missions, however, crews will need to handle off-nominal events autonomously. There is evidence that speech alarms may be easier and faster to recognize, especially during an off-nominal event. The Information Presentation Directed Research Project (FY07-FY09) funded by the Human Research Program included several studies investigating C&W alerts. The studies evaluated tone alerts currently in use with NASA flight deck displays along with candidate speech alerts. A follow-on study used four types of speech alerts to investigate how quickly various types of auditory alerts with and without a speech component (either at the beginning or at the end of the tone) can be identified. Even though crew were familiar with the tone alert from training or direct mission experience, alerts starting with a speech component were identified faster than alerts starting with a tone. The current study replicated the results from the previous study in a more rigorous experimental design to determine if the candidate speech alarms are ready for transition to operations or if more research is needed. Four types of alarms (caution, warning, fire, and depressurization) were presented to participants in both tone and speech formats in laboratory settings and later in the Human Exploration Research Analog (HERA). In the laboratory study, the alerts were presented by software and participants were

  10. Annealed lattice animal model and Flory theory for the melt of non-concatenated rings: towards the physics of crumpling

    NASA Astrophysics Data System (ADS)

    Grosberg, Alexander Y.

    A Flory theory is constructed for a long polymer ring in a melt of unknotted and non-concatenated rings. The theory assumes that the ring forms an effective annealed branched object and computes its primitive path. It is shown that the primitive path follows self-avoiding statistics and is characterized by the corresponding Flory exponent of a polymer with excluded volume. Based on that, it is shown that rings in the melt are compact objects with overall size proportional to their length raised to the 1/3 power. Furthermore, the contact probability exponent is estimated, albeit by a poorly controlled approximation, with the result consistent with both numerical and experimental data.

  11. TCR and Lat are expressed on separate protein islands on T cell membranes and concatenate during activation

    PubMed Central

    Lillemeier, Björn F; Mörtelmaier, Manuel A; Forstner, Martin B; Huppa, Johannes B; Groves, Jay T; Davis, Mark M

    2010-01-01

    The organization and dynamics of receptors and other molecules in the plasma membrane are not well understood. Here we analyzed the spatio-temporal dynamics of T cell antigen receptor (TCR) complexes and linker for activation of T cells (Lat), a key adaptor molecule in the TCR signaling pathway, in T cell membranes using high-speed photoactivated localization microscopy, dual-color fluorescence cross-correlation spectroscopy and transmission electron microscopy. In quiescent T cells, both molecules existed in separate membrane domains (protein islands), and these domains concatenated after T cell activation. These concatemers were identical to signaling microclusters, a prominent hallmark of T cell activation. This separation versus physical juxtapositioning of receptor domains and domains containing downstream signaling molecules in quiescent versus activated T cells may be a general feature of plasma membrane–associated signal transduction. PMID:20010844

  12. Differential Diagnosis of Severe Speech Disorders Using Speech Gestures

    ERIC Educational Resources Information Center

    Bahr, Ruth Huntley

    2005-01-01

    The differentiation of childhood apraxia of speech from severe phonological disorder is a common clinical problem. This article reports on an attempt to describe speech errors in children with childhood apraxia of speech on the basis of gesture use and acoustic analyses of articulatory gestures. The focus was on the movement of articulators and…

  13. Performance of Single and Concatenated Sets of Mitochondrial Genes at Inferring Metazoan Relationships Relative to Full Mitogenome Data

    PubMed Central

    Havird, Justin C.; Santos, Scott R.

    2014-01-01

    Mitochondrial (mt) genes are some of the most popular and widely-utilized genetic loci in phylogenetic studies of metazoan taxa. However, their linked nature has raised questions on whether using the entire mitogenome for phylogenetics is overkill (at best) or pseudoreplication (at worst). Moreover, no studies have addressed the comparative phylogenetic utility of mitochondrial genes across individual lineages within the entire Metazoa. To comment on the phylogenetic utility of individual mt genes as well as concatenated subsets of genes, we analyzed mitogenomic data from 1865 metazoan taxa in 372 separate lineages spanning genera to subphyla. Specifically, phylogenies inferred from these datasets were statistically compared to ones generated from all 13 mt protein-coding (PC) genes (i.e., the “supergene” set) to determine which single genes performed “best” at, and the minimum number of genes required to, recover the “supergene” topology. Surprisingly, the popular marker COX1 performed poorest, while ND5, ND4, and ND2 were most likely to reproduce the “supergene” topology. Averaged across all lineages, the longest ∼2 mt PC genes were sufficient to recreate the “supergene” topology, although this average increased to ∼5 genes for datasets with 40 or more taxa. Furthermore, concatenation of the three “best” performing mt PC genes outperformed that of the three longest mt PC genes (i.e., ND5, COX1, and ND4). Taken together, while not all mt PC genes are equally interchangeable in phylogenetic studies of the metazoans, some subset can serve as a proxy for the 13 mt PC genes. However, the exact number and identity of these genes is specific to the lineage in question and cannot be applied indiscriminately across the Metazoa. PMID:24454717
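
    Assembling the concatenated ("supergene") matrix from per-gene alignments is mechanical: taxa missing a gene are padded with gaps and the alignments are joined end to end, with partition boundaries recorded for partitioned analyses. A sketch, assuming the input is a per-gene dict of taxon-to-aligned-sequence mappings:

        def concatenate_alignments(gene_alignments):
            # Join per-gene alignments end to end; taxa missing a gene are padded
            # with gaps. Returns the supermatrix and per-gene partition bounds.
            taxa = sorted({t for aln in gene_alignments.values() for t in aln})
            matrix = {t: [] for t in taxa}
            partitions, start = {}, 0
            for gene, aln in gene_alignments.items():
                length = len(next(iter(aln.values())))
                for t in taxa:
                    matrix[t].append(aln.get(t, "-" * length))
                partitions[gene] = (start, start + length)
                start += length
            return {t: "".join(s) for t, s in matrix.items()}, partitions

        supermatrix, parts = concatenate_alignments({
            "ND5": {"taxonA": "ATG-", "taxonB": "ATGA"},
            "COX1": {"taxonA": "GGC"},
        })
        print(supermatrix["taxonB"], parts)  # ATGA--- {'ND5': (0, 4), 'COX1': (4, 7)}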

  14. On the Resource Efficiency of Virtual Concatenation in SDH/SONET Mesh Transport Networks Bearing Protected Scheduled Connections

    NASA Astrophysics Data System (ADS)

    Kuri, Josué; Gagnaire, Maurice; Puech, Nicolas

    2005-10-01

    Virtual concatenation (VCAT) is a Synchronous Digital Hierarchy (SDH)/Synchronous Optical Network (SONET) network functionality recently standardized by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T). VCAT provides the flexibility required to efficiently allocate network resources to Ethernet, Fibre Channel (FC), Enterprise System Connection (ESCON), and other important data traffic signals. In this article, we assess the resource gain provided by VCAT with respect to contiguous concatenation (CCAT) in SDH/SONET mesh transport networks bearing protected scheduled connection demands (SCDs). As explained later, an SCD is a connection demand for which the set-up and tear-down dates are known in advance. We define mathematical models to quantify the add/drop and transmission resources required to instantiate a set of protected SCDs in VCAT- and CCAT-capable networks. Quantification of transmission resources requires a routing and slot assignment (RSA) problem to be solved. We formulate the RSA problem in VCAT- and CCAT-capable networks as two different combinatorial optimization problems: RSA in VCAT-capable networks (RSAv) and RSA in CCAT-capable networks (RSAc), respectively. Protection of the SCDs is considered in the formulations using a shared backup path protection (SBPP) technique. We propose a simulated annealing (SA)-based meta-heuristic algorithm to compute approximate solutions to these problems (i.e., solutions whose cost approximates that of the optimal ones). The gain in transmission resources and the cost structure of add/drop resources that make VCAT-capable networks more economical are analyzed for different traffic scenarios.
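
    The resource argument for VCAT comes down to container arithmetic: VCAT can size a VC-4-Nv group to the client rate for any integer N, while CCAT must round up to the next standardized contiguous container (VC-4-4c, VC-4-16c, ...). A sketch with approximate payload rates (illustrative figures, not the article's models):

        import math

        VC4_MBPS = 149.76  # approximate VC-4 payload rate

        def vcat_members(client_mbps):
            # VCAT: a VC-4-Nv group can use any integer N
            return math.ceil(client_mbps / VC4_MBPS)

        def ccat_containers(client_mbps):
            # CCAT: contiguous VC-4-Nc exists only for N = 1, 4, 16, 64
            n = 1
            while n * VC4_MBPS < client_mbps:
                n *= 4
            return n

        client = 1000.0  # Gigabit Ethernet
        nv, nc = vcat_members(client), ccat_containers(client)
        print(f"VC-4-{nv}v: {client / (nv * VC4_MBPS):.0%} filled")  # ~95%
        print(f"VC-4-{nc}c: {client / (nc * VC4_MBPS):.0%} filled")  # ~42%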

  15. Hearing or speech impairment - resources

    MedlinePlus

    ... resources for information on hearing impairment or speech impairment: Alexander Graham Bell Association for the Deaf and Hard of Hearing -- www.agbell.org American Speech-Language-Hearing Association -- www.asha.org/public Center for ...

  16. Why Go to Speech Therapy?

    MedlinePlus

    ... types of therapy work best when you can go on an intensive schedule (i.e., every day ...

  17. Development of a speech autocuer

    NASA Technical Reports Server (NTRS)

    Bedles, R. L.; Kizakvich, P. N.; Lawson, D. T.; Mccartney, M. L.

    1980-01-01

    A wearable, visually based prosthesis for the deaf based upon the proven method for removing lipreading ambiguity known as cued speech was fabricated and tested. Both software and hardware developments are described, including a microcomputer, display, and speech preprocessor.

  18. Hearing or speech impairment - resources

    MedlinePlus

    Resources - hearing or speech impairment ... The following organizations are good resources for information on hearing impairment or speech impairment: Alexander Graham Bell Association for the Deaf and Hard of Hearing -- www.agbell. ...

  19. Speech spectrogram expert

    SciTech Connect

    Johannsen, J.; Macallister, J.; Michalek, T.; Ross, S.

    1983-01-01

    Various authors have pointed out that humans can become quite adept at deriving phonetic transcriptions from speech spectrograms (as good as 90 percent accuracy at the phoneme level). The authors describe an expert system which attempts to simulate this performance. The speech spectrogram expert (spex) is actually a society made up of three experts: a 2-dimensional vision expert, an acoustic-phonetic expert, and a phonetics expert. The visual reasoning expert finds important visual features of the spectrogram. The acoustic-phonetic expert reasons about how visual features relate to phonemes, and about how phonemes change visually in different contexts. The phonetics expert reasons about allowable phoneme sequences and transformations, and deduces an English spelling for phoneme strings. The speech spectrogram expert is highly interactive, allowing users to investigate hypotheses and edit rules. 10 references.

  20. "Zero Tolerance" for Free Speech.

    ERIC Educational Resources Information Center

    Hils, Lynda

    2001-01-01

    Argues that school policies of "zero tolerance" of threatening speech may violate a student's First Amendment right to freedom of expression if speech is less than a "true threat." Suggests a two-step analysis to determine if student speech is a "true threat." (PKP)

  1. Signed Soliloquy: Visible Private Speech

    ERIC Educational Resources Information Center

    Zimmermann, Kathrin; Brugger, Peter

    2013-01-01

    Talking to oneself can be silent (inner speech) or vocalized for others to hear (private speech, or soliloquy). We investigated these two types of self-communication in 28 deaf signers and 28 hearing adults. With a questionnaire specifically developed for this study, we established the visible analog of vocalized private speech in deaf signers.…

  2. Abortion and compelled physician speech.

    PubMed

    Orentlicher, David

    2015-01-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading. PMID:25846035

  3. Study of acoustic correlates associated with emotional speech

    NASA Astrophysics Data System (ADS)

    Yildirim, Serdar; Lee, Sungbok; Lee, Chul Min; Bulut, Murtaza; Busso, Carlos; Kazemzadeh, Ebrahim; Narayanan, Shrikanth

    2004-10-01

    This study investigates the acoustic characteristics of four different emotions expressed in speech. The aim is to obtain detailed acoustic knowledge of how a speech signal is modulated by changes from neutral to a certain emotional state. Such knowledge is necessary for automatic emotion recognition and classification and for emotional speech synthesis. Speech data obtained from two semi-professional actresses are analyzed and compared. Each subject produces 211 sentences with four different emotions: neutral, sad, angry, and happy. We analyze changes in temporal and acoustic parameters such as magnitude and variability of segmental duration, fundamental frequency, and the first three formant frequencies as a function of emotion. Acoustic differences among the emotions are also explored with mutual information computation, multidimensional scaling, and acoustic likelihood comparison with normal speech. Results indicate that speech associated with anger and happiness is characterized by longer duration, shorter interword silence, and higher pitch and rms energy with wider ranges. Sadness is distinguished from other emotions by lower rms energy and longer interword silence. Interestingly, the difference in formant pattern between [happiness/anger] and [neutral/sadness] is better reflected in back vowels such as /a/ (as in "father") than in front vowels. Detailed results on intra- and interspeaker variability will be reported.
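
    Most of the parameters compared above are straightforward to compute once an utterance is segmented. A toy sketch of the duration, energy, and silence measurements (pitch and formant tracking would need a dedicated estimator and are omitted; thresholds are illustrative):

        import numpy as np

        def segment_features(x, fs, silence_db=-40.0):
            # Duration, mean rms energy, and fraction of silent frames for one
            # utterance; a toy version of the segmental parameters compared above.
            frame = int(0.02 * fs)                       # 20 ms frames
            n = len(x) // frame
            rms = np.sqrt((x[:n * frame].reshape(n, frame) ** 2).mean(axis=1))
            db = 20 * np.log10(rms / (rms.max() + 1e-12) + 1e-12)
            silent = db < silence_db
            voiced = rms[~silent]
            return {
                "duration_s": len(x) / fs,
                "mean_rms": float(voiced.mean()) if voiced.size else 0.0,
                "silence_fraction": float(silent.mean()),
            }

        # usage: compare segment_features(utt, 16000) across emotion categories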

  4. Subjective analysis of a HMM-based visual speech synthesizer

    NASA Astrophysics Data System (ADS)

    Williams, Jay J.; Katsaggelos, Aggelos K.; Garstecki, Dean C.

    2001-06-01

    Emerging broadband communication systems promise a future of multimedia telephony. The addition of visual information during telephone conversations, for example, would be most beneficial to people with impaired hearing who rely on speech reading, and could be built on the existing narrowband communication systems already used for the speech signal. A Hidden Markov Model (HMM)-based visual speech synthesizer is designed to improve speech understanding. The key elements in the application of HMMs to this problem are: a) the decomposition of the overall modeling task into key stages; and b) the judicious determination of the components of the observation vector for each stage. The main contribution of this paper is the development of a novel correlation HMM model that is able to integrate independently trained acoustic and visual HMMs for speech-to-visual synthesis. This model allows increased flexibility in choosing model topologies for the acoustic and visual HMMs. It also reduces the amount of required training data compared to early integration modeling techniques. Results from objective and subjective analyses show that an HMM correlating model can significantly decrease audio-visual synchronization errors and increase speech understanding.

  5. Microphones for speech and speech recognition

    NASA Astrophysics Data System (ADS)

    West, James E.

    2004-10-01

    Automatic speech recognition (ASR) requires about a 15- to 20-dB signal-to-noise ratio (S/N) for high accuracy even for small vocabulary systems. This S/N is generally achievable using a telephone handset in normal office or home environments. In the early 1990s AT&T and the regional telephone companies began using speaker-independent ASR to replace several operator services. The variable distortion in the carbon microphone was not transparent and resulted in reduced ASR accuracy. The linear electret condenser microphone, common in most modern telephones, improved handset performance both in sound quality and ASR accuracy. Hands-free ASR in quiet conditions is a bit more complex because of the increased distance between the microphone and the speech source. Cardioid directional microphones offer some improvement in noisy locations when the noise and desired signals are spatially separated, but this is not the general case and the resulting S/N is not adequate for seamless machine translation. Higher-order directional microphones, when properly oriented with respect to the talker and noise, have shown good improvement over omni-directional microphones. Some ASR results measured in simulated car noise will be presented.

  6. Speech transmission index from running speech: A neural network approach

    NASA Astrophysics Data System (ADS)

    Li, F. F.; Cox, T. J.

    2003-04-01

    Speech transmission index (STI) is an important objective parameter concerning speech intelligibility for sound transmission channels. It is normally measured with specific test signals to ensure high accuracy and good repeatability. Measurement with running speech was previously proposed, but accuracy is compromised and hence applications limited. A new approach that uses artificial neural networks to accurately extract the STI from received running speech is developed in this paper. Neural networks are trained on a large set of transmitted speech examples with prior knowledge of the transmission channels' STIs. The networks perform complicated nonlinear function mappings and spectral feature memorization to enable accurate objective parameter extraction from transmitted speech. Validations via simulations demonstrate the feasibility of this new method on a one-net-one-speech extract basis. In this case, accuracy is comparable with normal measurement methods. This provides an alternative to standard measurement techniques, and it is intended that the neural network method can facilitate occupied room acoustic measurements.
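
    In implementation terms, the mapping from received running speech to an STI estimate is a supervised regression from spectral features to a scalar. A placeholder sketch with scikit-learn (the feature set, network size, and random training data below stand in for the paper's actual design):

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 40))        # placeholder spectral feature vectors
        y = rng.uniform(0.2, 0.95, size=500)  # placeholder STI training labels

        # One network trained on speech transmitted through channels of known STI
        net = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
        net.fit(X, y)
        sti_estimate = net.predict(X[:1])     # STI inferred from received speech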

  7. Methods for real-time speech processing on Unix

    SciTech Connect

    Romberger, A.

    1982-01-01

    The author discusses computer programming done at the University of California, Berkeley, in support of research work in the area of speech analysis and synthesis. The purpose of this programming is to set up a system for doing real-time speech sampling using the Unix operating system. Two alternative approaches to real time work on Unix are discussed. The first approach is to do the real-time input/output on a secondary (satellite) machine that is not running Unix. The second approach is to do the real-time input/output on the main machine with the aid of special hardware.

  8. Microprocessor for speech recognition

    SciTech Connect

    Ishizuka, H.; Watari, M.; Sakoe, H.; Chiba, S.; Iwata, T.; Matsuki, T.; Kawakami, Y.

    1983-01-01

    A new single-chip microprocessor for speech recognition has been developed utilizing a multi-processor architecture and pipelined structure. Using a DP-matching algorithm, the processor recognizes up to 340 isolated words or 40 connected words in real time. 6 references.

  9. Hearing speech in music.

    PubMed

    Ekström, Seth-Reino; Borg, Erik

    2011-01-01

    The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P<.01). Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01) and SPN (P<.05). Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01), but there were smaller differences between masking conditions (P<.01). It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings. PMID:21768731

  10. On Curbing Racial Speech.

    ERIC Educational Resources Information Center

    Gale, Mary Ellen

    1991-01-01

    An alternative interpretation of the First Amendment guarantee of free speech suggests that universities may prohibit and punish direct verbal assaults on specific individuals if the speaker intends to do harm and if a reasonable person would recognize the potential for serious interference with the victim's educational rights. (MSE)

  11. Mandarin Visual Speech Information

    ERIC Educational Resources Information Center

    Chen, Trevor H.

    2010-01-01

    While the auditory-only aspects of Mandarin speech are heavily-researched and well-known in the field, this dissertation addresses its lesser-known aspects: The visual and audio-visual perception of Mandarin segmental information and lexical-tone information. Chapter II of this dissertation focuses on the audiovisual perception of Mandarin…

  12. Packet speech systems technology

    NASA Astrophysics Data System (ADS)

    Weinstein, C. J.; Blankenship, P. E.

    1982-09-01

    The long-range objectives of the Packet Speech Systems Technology Program are to develop and demonstrate techniques for efficient digital speech communications on networks suitable for both voice and data, and to investigate and develop techniques for integrated voice and data communication in packetized networks, including wideband common-user satellite links. Specific areas of concern are: the concentration of statistically fluctuating volumes of voice traffic, the adaptation of communication strategies to varying conditions of network links and traffic volume, and the interconnection of wideband satellite networks to terrestrial systems. Previous efforts in this area have led to new vocoder structures for improved narrowband voice performance and multiple-rate transmission, and to demonstrations of conversational speech and conferencing on the ARPANET and the Atlantic Packet Satellite Network. The current program has two major thrusts: the development and refinement of practical low-cost, robust, narrowband, and variable-rate speech algorithms and voice terminal structures; and the establishment of an experimental wideband satellite network to serve as a unique facility for the realistic investigation of voice/data networking strategies.

  13. Perceptual Learning in Speech

    ERIC Educational Resources Information Center

    Norris, Dennis; McQueen, James M.; Cutler, Anne

    2003-01-01

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g.,…

  14. Free Speech Yearbook 1979.

    ERIC Educational Resources Information Center

    Kane, Peter E., Ed.

    The seven articles in this collection deal with theoretical and practical freedom of speech issues. Topics covered are: the United States Supreme Court, motion picture censorship, and the color line; judicial decision making; the established scientific community's suppression of the ideas of Immanuel Velikovsky; the problems of avant-garde jazz,…

  15. 1984 Newbery Acceptance Speech.

    ERIC Educational Resources Information Center

    Cleary, Beverly

    1984-01-01

    This acceptance speech for an award honoring "Dear Mr. Henshaw," a book about feelings of a lonely child of divorce intended for eight-, nine-, and ten-year-olds, highlights children's letters to author. Changes in society that affect children, the inception of "Dear Mr. Henshaw," and children's reactions to books are highlighted. (EJS)

  16. Black History Speech

    ERIC Educational Resources Information Center

    Noldon, Carl

    2007-01-01

    The author argues in this speech that one cannot expect students in the school system to know and understand the genius of Black history if the curriculum is Eurocentric, which is a residue of racism. He states that his comments are designed for the enlightenment of those who suffer from a school system that "hypocritically manipulates Black…

  17. Speech to schoolchildren

    NASA Astrophysics Data System (ADS)

    Angell, C. Austen

    2013-02-01

    Prof. C. A. Angell from Arizona State University read the following short and simple speech, saying the sentences in italics in the best Japanese he could manage (after earnest coaching from a Japanese colleague). The rest was translated on the bus ride, and then spoken, as I spoke, by Ms. Yukako Endo, to whom the author is very grateful.

  18. Free Speech Yearbook 1973.

    ERIC Educational Resources Information Center

    Barbour, Alton, Ed.

    The first article in this collection examines civil disobedience and the protections offered by the First Amendment. The second article discusses a study on antagonistic expressions in a free society. The third essay deals with attitudes toward free speech and treatment of the United States flag. There are two articles on media; the first examines…

  19. Speech and Hearing Therapy.

    ERIC Educational Resources Information Center

    Sakata, Reiko; Sakata, Robert

    1978-01-01

    In the public school, the speech and hearing therapist attempts to foster child growth and development through the provision of services basic to awareness of self and others, management of personal and social interactions, and development of strategies for coping with the handicap. (MM)

  20. Speech and Language Impairments

    MedlinePlus

    ... SLP) who can help you identify strategies for teaching and supporting this student, ways to adapt the ...

  1. Free Speech Yearbook, 1974.

    ERIC Educational Resources Information Center

    Barbour, Alton, Ed.

    A collection of essays on free speech and communication is contained in this book. The essays include "From Fairness to Access and Back Again: Some Dimensions of Free Expression in Broadcasting"; "Local Option on the First Amendment?"; "A Look at the Fire Symbol Before and After May 4, 1970"; "Freedom to Teach, to Learn, and to Speak: Rhetorical…

  2. Expectations and speech intelligibility.

    PubMed

    Babel, Molly; Russell, Jamie

    2015-05-01

    Socio-indexical cues and paralinguistic information are often beneficial to speech processing as this information assists listeners in parsing the speech stream. The association of particular populations with certain speech styles can, however, give socio-indexical cues a processing cost. In this study, native speakers of Canadian English who identify as Chinese Canadian and White Canadian read sentences that were presented to listeners in noise. Half of the sentences were presented with a visual prime in the form of a photo of the speaker and half were presented in control trials with fixation crosses. Sentences produced by Chinese Canadians showed an intelligibility cost in the face-prime condition, whereas sentences produced by White Canadians did not. In an accentedness rating task, listeners rated White Canadians as less accented in the face-prime trials, but Chinese Canadians showed no such change in perceived accentedness. These results suggest a misalignment between an expected and an observed speech signal for the face-prime trials, which indicates that social information about a speaker can trigger linguistic associations that come with processing benefits and costs. PMID:25994710

  3. The cortical representation of the speech envelope is earlier for audiovisual speech than audio speech.

    PubMed

    Crosse, Michael J; Lalor, Edmund C

    2014-04-01

    Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by >10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information. PMID:24401714
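
    The forward-mapping approach described above lends itself to a compact illustration. Below is a minimal Python sketch, assuming `env` is a speech-envelope vector and `eeg` a single EEG channel at the same sampling rate; the ridge regularizer, lag window, and all names are illustrative assumptions, not details from the paper.

```python
import numpy as np

def lagged_design(env, lags):
    """Design matrix whose columns are time-shifted copies of the
    stimulus envelope, one column per lag (in samples)."""
    X = np.zeros((len(env), len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = env[:len(env) - lag]
        else:
            X[:lag, j] = env[-lag:]
    return X

def fit_trf(env, eeg, fs, tmin=-0.1, tmax=0.4, alpha=100.0):
    """Estimate a forward model (temporal response function) mapping
    the speech envelope to EEG via ridge regression."""
    lags = np.arange(int(tmin * fs), int(tmax * fs) + 1)
    X = lagged_design(env, lags)
    # Ridge solution: w = (X'X + alpha*I)^-1 X'y
    w = np.linalg.solve(X.T @ X + alpha * np.eye(len(lags)), X.T @ eeg)
    return lags / fs, w  # latencies (s) and TRF weights

# Illustrative call with random data standing in for real recordings.
fs = 128
env = np.abs(np.random.randn(fs * 60))
eeg = np.convolve(env, np.hanning(20), mode="same") + np.random.randn(len(env))
latencies, trf = fit_trf(env, eeg, fs)
```

    The latency of the dominant TRF peak is the kind of quantity whose audiovisual-versus-audio shift the study reports.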

  4. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2002-01-01

    Low power EM waves are used to detect motions of vocal tract tissues of the human speech system before, during, and after voiced speech. A voiced excitation function is derived. The excitation function provides speech production information to enhance speech characterization and to enable noise removal from human speech.

  5. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    ERIC Educational Resources Information Center

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  6. Seeking a reading machine for the blind and discovering the speech code.

    PubMed

    Shankweiler, Donald; Fowler, Carol A

    2015-02-01

    A machine that can read printed material to the blind became a priority at the end of World War II with the appointment of a U.S. Government committee to instigate research on sensory aids to improve the lot of blinded veterans. The committee chose Haskins Laboratories to lead a multisite research program. Initially, Haskins researchers overestimated the capacities of users to learn an acoustic code based on the letters of a text, resulting in unsuitable designs. Progress was slow because the researchers clung to a mistaken view that speech is a sound alphabet and because of persisting gaps in man-machine technology. The tortuous route to a practical reading machine transformed the scientific understanding of speech perception and reading at Haskins Labs and elsewhere, leading to novel lines of basic research and new technologies. Research at Haskins Laboratories made valuable contributions in clarifying the physical basis of speech. Researchers recognized that coarticulatory overlap eliminated the possibility of alphabet-like discrete acoustic segments in speech. This work advanced the study of speech perception and contributed to our understanding of the relation of speech perception to production. Basic findings on speech enabled the development of speech synthesis, part science and part technology, essential for development of a reading machine, which has found many applications. Findings on the nature of speech further stimulated a new understanding of word recognition in reading across languages and scripts and contributed to our understanding of reading development and reading disabilities. PMID:25528275

  7. Selected military applications of automatic speech recognition technology

    NASA Astrophysics Data System (ADS)

    Woodard, J. P.; Cupples, E. J.

    1983-12-01

    Voice input for command and control, message sorting by voice, and advanced low-bit-rate voice communications systems are discussed in relation to automatic speech recognition (ASR) technology applications. Several research efforts in the areas of ASR and speech synthesis technology, which forms the basis for voice input/output systems, are described for military airborne and ground-based applications. The success of the speech enhancement unit in noise reduction for both human and machine listeners is documented and shown to be indispensable for the ASR used in message sorting, and for many command-and-control applications in harsh environments. Two experimental low-bit-rate systems, one phonetically based, the other based on vector, or block, quantization, are compared. ASR technology is shown to have the potential to increase the effectiveness of man-machine communications in a variety of military applications, such as equipment maintenance and computers. Further research is necessary to solve some basic problems.

  8. TEACHER'S GUIDE TO HIGH SCHOOL SPEECH.

    ERIC Educational Resources Information Center

    JENKINSON, EDWARD B., ED.

    THIS GUIDE TO HIGH SCHOOL SPEECH FOCUSES ON SPEECH AS ORAL COMPOSITION, STRESSING THE IMPORTANCE OF CLEAR THINKING AND COMMUNICATION. THE PROPOSED 1-SEMESTER BASIC COURSE IN SPEECH ATTEMPTS TO IMPROVE THE STUDENT'S ABILITY TO COMPOSE AND DELIVER SPEECHES, TO THINK AND LISTEN CRITICALLY, AND TO UNDERSTAND THE SOCIAL FUNCTION OF SPEECH. IN ADDITION…

  9. Hate Speech: Power in the Marketplace.

    ERIC Educational Resources Information Center

    Harrison, Jack B.

    1994-01-01

    A discussion of hate speech and freedom of speech on college campuses examines the difference between hate speech from normal, objectionable interpersonal comments and looks at Supreme Court decisions on the limits of student free speech. Two cases specifically concerning regulation of hate speech on campus are considered: Chaplinsky v. New…

  10. Elicitation of the Acoustic Change Complex to Long-Duration Speech Stimuli in Four-Month-Old Infants.

    PubMed

    Chen, Ke Heng; Small, Susan A

    2015-01-01

    The acoustic change complex (ACC) is an auditory-evoked potential elicited to changes within an ongoing stimulus that indicates discrimination at the level of the auditory cortex. Only a few studies to date have attempted to record ACCs in young infants. The purpose of the present study was to investigate the elicitation of ACCs to long-duration speech stimuli in English-learning 4-month-old infants. ACCs were elicited to consonant contrasts made up of two concatenated speech tokens. The stimuli included native dental-dental /dada/ and dental-labial /daba/ contrasts and a nonnative Hindi dental-retroflex /daDa/ contrast. Each consonant-vowel speech token was 410 ms in duration. Slow cortical responses were recorded to the onset of the stimulus and to the acoustic change from /da/ to either /ba/ or /Da/ within the stimulus, with significantly prolonged latencies compared with adults. ACCs were reliably elicited for all stimulus conditions with more robust morphology compared with our previous findings using stimuli that were shorter in duration. The P1 amplitudes elicited to the acoustic change in /daba/ and /daDa/ were significantly larger compared to /dada/, supporting the conclusion that the brain discriminated between the speech tokens. These findings provide further evidence for the use of ACCs as an index of discrimination ability. PMID:26798343
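
    On the stimulus side, constructing a two-token ACC stimulus of the kind described is essentially concatenation with duration control. The Python sketch below is a minimal illustration under that reading; the zero-padding of tokens to exactly 410 ms and the function name are my assumptions, not reported details.

```python
import numpy as np

def make_acc_stimulus(token1, token2, fs, token_dur=0.410):
    """Concatenate two consonant-vowel tokens into one continuous
    stimulus; the token boundary is the 'acoustic change'."""
    n = int(token_dur * fs)
    def fit(tok):
        tok = np.asarray(tok, dtype=float)[:n]
        return np.pad(tok, (0, n - len(tok)))  # trim/pad to 410 ms
    return np.concatenate([fit(token1), fit(token2)])

# e.g. daba = make_acc_stimulus(da_token, ba_token, fs=44100)
```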

  11. Elicitation of the Acoustic Change Complex to Long-Duration Speech Stimuli in Four-Month-Old Infants

    PubMed Central

    Chen, Ke Heng; Small, Susan A.

    2015-01-01

    The acoustic change complex (ACC) is an auditory-evoked potential elicited to changes within an ongoing stimulus that indicates discrimination at the level of the auditory cortex. Only a few studies to date have attempted to record ACCs in young infants. The purpose of the present study was to investigate the elicitation of ACCs to long-duration speech stimuli in English-learning 4-month-old infants. ACCs were elicited to consonant contrasts made up of two concatenated speech tokens. The stimuli included native dental-dental /dada/ and dental-labial /daba/ contrasts and a nonnative Hindi dental-retroflex /daDa/ contrast. Each consonant-vowel speech token was 410 ms in duration. Slow cortical responses were recorded to the onset of the stimulus and to the acoustic change from /da/ to either /ba/ or /Da/ within the stimulus, with significantly prolonged latencies compared with adults. ACCs were reliably elicited for all stimulus conditions with more robust morphology compared with our previous findings using stimuli that were shorter in duration. The P1 amplitudes elicited to the acoustic change in /daba/ and /daDa/ were significantly larger compared to /dada/, supporting the conclusion that the brain discriminated between the speech tokens. These findings provide further evidence for the use of ACCs as an index of discrimination ability. PMID:26798343

  12. Headphone localization of speech

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1993-01-01

    Three-dimensional acoustic display systems have recently been developed that synthesize virtual sound sources over headphones based on filtering by head-related transfer functions (HRTFs), the direction-dependent spectral changes caused primarily by the pinnae. In this study, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with nonindividualized HRTFs. About half of the subjects 'pulled' their judgments toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgments; 15 to 46 percent of stimuli were heard inside the head, with the shortest estimates near the median plane. The results suggest that most listeners can obtain useful azimuth information from speech stimuli filtered by nonindividualized HRTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.
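
    Rendering such a virtual source reduces to convolving the monaural signal with a pair of head-related impulse responses (HRIRs, the time-domain counterpart of HRTFs). A minimal Python sketch follows; the HRIR arrays are assumed to come from a measured, nonindividualized set, and all names are illustrative.

```python
import numpy as np
from scipy.signal import fftconvolve

def spatialize(speech, hrir_left, hrir_right):
    """Render a monaural signal at a virtual direction by convolving
    it with left/right head-related impulse responses (same length)."""
    left = fftconvolve(speech, hrir_left)
    right = fftconvolve(speech, hrir_right)
    out = np.stack([left, right], axis=1)
    return out / np.max(np.abs(out))  # normalize to avoid clipping

# e.g. binaural = spatialize(speech, hrir_l_30deg, hrir_r_30deg)
```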

  13. Neurophysiology of speech differences in childhood apraxia of speech.

    PubMed

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes. PMID:25090016

  14. [Improving speech comprehension using a new cochlear implant speech processor].

    PubMed

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  15. Neurophysiology of Speech Differences in Childhood Apraxia of Speech

    PubMed Central

    Preston, Jonathan L.; Molfese, Peter J.; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes. PMID:25090016

  16. The feasibility of miniaturizing the versatile portable speech prosthesis: A market survey of commercial products

    NASA Technical Reports Server (NTRS)

    Walklet, T.

    1981-01-01

    The feasibility of a miniature versatile portable speech prosthesis (VPSP) was analyzed, and information on its potential users and on other similar devices was collected. The VPSP is a device that incorporates speech synthesis technology. The objective was to provide sufficient information to decide whether there is valuable technology to contribute to the miniaturization of the VPSP. The needs of potential users are identified, and the development status of technologies similar or related to those used in the VPSP is evaluated. The VPSP, a computer-based speech synthesis system, fits on a wheelchair. The purpose was to produce a device that provides communication assistance in educational, vocational, and social situations to speech-impaired individuals. It is expected that the VPSP can be a valuable aid for persons who are also motor impaired, which explains the placement of the system on a wheelchair.

  17. Applications for Subvocal Speech

    NASA Technical Reports Server (NTRS)

    Jorgensen, Charles; Betts, Bradley

    2007-01-01

    A research and development effort now underway is directed toward the use of subvocal speech for communication in settings in which (1) acoustic noise could interfere excessively with ordinary vocal communication and/or (2) acoustic silence or secrecy of communication is required. By "subvocal speech" is meant sub-audible electromyographic (EMG) signals, associated with speech, that are acquired from the surface of the larynx and lingual areas of the throat. Topics addressed in this effort include recognition of the sub-vocal EMG signals that represent specific original words or phrases; transformation (including encoding and/or enciphering) of the signals into forms that are less vulnerable to distortion, degradation, and/or interception; and reconstruction of the original words or phrases at the receiving end of a communication link. Potential applications include ordinary verbal communications among hazardous- material-cleanup workers in protective suits, workers in noisy environments, divers, and firefighters, and secret communications among law-enforcement officers and military personnel in combat and other confrontational situations.

  18. Speech rhythm: a metaphor?

    PubMed Central

    Nolan, Francis; Jeon, Hae-Sung

    2014-01-01

    Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep ‘prominence gradient’, i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a ‘stress-timed’ language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow ‘syntagmatic contrast’ between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence of alternation. We conclude that speech is not incontestably rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms. PMID:25385774

  19. [Speech changes in dementia].

    PubMed

    Benke, T; Andree, B; Hittmair, M; Gerstenbrand, F

    1990-06-01

    This review analyzes the spectrum of language deficits commonly encountered in dementia. A specific communication profile is found in dementia of the "cortical" type, such as Alzheimer's disease. With advancing disease, lexical, comprehension, and pragmatic functions deteriorate, whereas syntax and phonology tend to be preserved. This pattern bears some resemblance to aphasia types like transcortical and Wernicke's aphasia; however, a much broader range of communicative functions is impaired in Alzheimer's disease than in aphasia. Differentiation of dementia and aphasia, especially in elderly patients, requires careful neuropsychological assessment of language, memory, and other psychological functions. "Subcortical" dementia commonly presents with dysarthria as the leading symptom, and linguistic impairment is rarely of crucial importance until late stages. Thus, the interetiologic dissociation of language and speech impairment can be used for dementia differentiation. Aphasia batteries are not sufficient to capture the range of language deficits in demented patients. Testing the communication impairment in dementia requires specific tasks for spontaneous speech, naming, comprehension, reading, writing, repetition, and motor speech functions. Tasks for verbal learning and metalinguistic abilities should also be performed. Language deficits are frequent initial symptoms of dementia, thus language assessment may be of diagnostic relevance. Many data support the concept that the communication deficit in dementia results from a particular impairment of semantic memory. PMID:1695887

  20. Speech rhythm: a metaphor?

    PubMed

    Nolan, Francis; Jeon, Hae-Sung

    2014-12-19

    Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep 'prominence gradient', i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a 'stress-timed' language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow 'syntagmatic contrast' between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence of alternation. We conclude that speech is not incontestably rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms. PMID:25385774

  1. Somatosensory function in speech perception

    PubMed Central

    Ito, Takayuki; Tiede, Mark; Ostry, David J.

    2009-01-01

    Somatosensory signals from the facial skin and muscles of the vocal tract provide a rich source of sensory input in speech production. We show here that the somatosensory system is also involved in the perception of speech. We use a robotic device to create patterns of facial skin deformation that would normally accompany speech production. We find that when we stretch the facial skin while people listen to words, it alters the sounds they hear. The systematic perceptual variation we observe in conjunction with speech-like patterns of skin stretch indicates that somatosensory inputs affect the neural processing of speech sounds and shows the involvement of the somatosensory system in the perceptual processing of speech. PMID:19164569

  2. Evaluation of NASA speech encoder

    NASA Technical Reports Server (NTRS)

    1976-01-01

    Techniques developed by NASA for spaceflight instrumentation were used in the design of a quantizer for speech decoding. Computer simulation of the actions of the quantizer was tested with synthesized and real speech signals. Results were evaluated by a phonetician. Topics discussed include the relationship between the number of quantizer levels and the required sampling rate; reconstruction of signals; digital filtering; and speech recording, sampling, storage, and processing results.
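
    The report's quantizer design is not spelled out here, but the trade-off it mentions between level count and sampling rate can be illustrated with a generic uniform quantizer, sketched in Python below; all parameters are assumptions for illustration.

```python
import numpy as np

def quantize(signal, n_levels, full_scale=1.0):
    """Uniform mid-rise quantizer: map each sample to the nearest of
    n_levels reconstruction levels spanning [-full_scale, +full_scale]."""
    step = 2 * full_scale / n_levels
    idx = np.clip(np.floor(signal / step), -n_levels // 2, n_levels // 2 - 1)
    return (idx + 0.5) * step  # reconstruct at bin centres

# Eight-level quantization of a synthetic vowel-like waveform.
t = np.linspace(0, 0.02, 320)
x = 0.6 * np.sin(2 * np.pi * 200 * t) + 0.2 * np.sin(2 * np.pi * 800 * t)
y = quantize(x, n_levels=8)
```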

  3. Random Addition Concatenation Analysis: A Novel Approach to the Exploration of Phylogenomic Signal Reveals Strong Agreement between Core and Shell Genomic Partitions in the Cyanobacteria

    PubMed Central

    Narechania, Apurva; Baker, Richard H.; Sit, Ryan; Kolokotronis, Sergios-Orestis; DeSalle, Rob; Planet, Paul J.

    2012-01-01

    Recent whole-genome approaches to microbial phylogeny have emphasized partitioning genes into functional classes, often focusing on differences between a stable core of genes and a variable shell. To rigorously address the effects of partitioning and combining genes in genome-level analyses, we developed a novel technique called Random Addition Concatenation Analysis (RADICAL). RADICAL operates by sequentially concatenating randomly chosen gene partitions, starting with a single-gene partition and ending with the entire genomic data set. A phylogenetic tree is built for every successive addition, and the entire process is repeated, creating multiple random concatenation paths. The result is a library of trees representing a large variety of differently sized random gene partitions. This library can then be mined to identify unique topologies, assess overall agreement, and measure support for different trees. To evaluate RADICAL, we used 682 orthologous genes across 13 cyanobacterial genomes. Despite previous assertions of substantial differences between a core and a shell set of genes for this data set, RADICAL reveals the two partitions contain congruent phylogenetic signal. Substantial disagreement within the data set is limited to a few nodes and genes involved in metabolism, a functional group that is distributed evenly between the core and the shell partitions. We highlight numerous examples where RADICAL reveals aspects of phylogenetic behavior not evident by examining individual gene trees or a “total evidence” tree. Our method also demonstrates that most emergent phylogenetic signal appears early in the concatenation process. The software is freely available at http://desalle.amnh.org. PMID:22094860
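
    The concatenation-path loop at the heart of RADICAL is easy to state. The Python sketch below captures only that logic; `build_tree` stands in for whatever tree-inference routine is plugged in, and the topology mining and support measures of the released software are omitted.

```python
import random

def radical_paths(partitions, build_tree, n_paths=100):
    """Random Addition Concatenation: repeatedly shuffle the gene
    partitions, concatenate them one at a time, and infer a tree at
    every step, yielding a library of trees over many random paths."""
    library = []
    for _ in range(n_paths):
        order = random.sample(partitions, len(partitions))
        concatenated = []
        for partition in order:
            concatenated.append(partition)   # grow the data matrix
            library.append(build_tree(concatenated))
    return library
```

    Mining the returned library for unique topologies and per-node agreement is where the published analyses take place.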

  4. Somatosensory basis of speech production.

    PubMed

    Tremblay, Stéphanie; Shiller, Douglas M; Ostry, David J

    2003-06-19

    The hypothesis that speech goals are defined acoustically and maintained by auditory feedback is a central idea in speech production research. An alternative proposal is that speech production is organized in terms of control signals that subserve movements and associated vocal-tract configurations. Indeed, the capacity for intelligible speech by deaf speakers suggests that somatosensory inputs related to movement play a role in speech production-but studies that might have documented a somatosensory component have been equivocal. For example, mechanical perturbations that have altered somatosensory feedback have simultaneously altered acoustics. Hence, any adaptation observed under these conditions may have been a consequence of acoustic change. Here we show that somatosensory information on its own is fundamental to the achievement of speech movements. This demonstration involves a dissociation of somatosensory and auditory feedback during speech production. Over time, subjects correct for the effects of a complex mechanical load that alters jaw movements (and hence somatosensory feedback), but which has no measurable or perceptible effect on acoustic output. The findings indicate that the positions of speech articulators and associated somatosensory inputs constitute a goal of speech movements that is wholly separate from the sounds produced. PMID:12815431

  5. A Cool Approach to Probing Speech Cortex

    PubMed Central

    Flinker, Adeen; Knight, Robert T.

    2016-01-01

    In this issue of Neuron, Long et al. (2016) employ a novel technique of intraoperative cortical cooling in humans during speech production. They demonstrate that cooling Broca’s area interferes with speech timing but not speech quality. PMID:26985719

  6. A Cool Approach to Probing Speech Cortex.

    PubMed

    Flinker, Adeen; Knight, Robert T

    2016-03-16

    In this issue of Neuron, Long et al. (2016) employ a novel technique of intraoperative cortical cooling in humans during speech production. They demonstrate that cooling Broca's area interferes with speech timing but not speech quality. PMID:26985719

  7. Speech Recognition: How Do We Teach It?

    ERIC Educational Resources Information Center

    Barksdale, Karl

    2002-01-01

    States that growing use of speech recognition software has made voice writing an essential computer skill. Describes how to present the topic, develop basic speech recognition skills, and teach speech recognition outlining, writing, proofreading, and editing. (Contains 14 references.) (SK)

  8. Speech systems research at Texas Instruments

    NASA Technical Reports Server (NTRS)

    Doddington, George R.

    1977-01-01

    An assessment of automatic speech processing technology is presented. Fundamental problems in the development and the deployment of automatic speech processing systems are defined and a technology forecast for speech systems is presented.

  9. Huntington's Disease: Speech, Language and Swallowing

    MedlinePlus

    ... the course of the disease. What do speech-language pathologists do when working with people with Huntington's ...

  10. Activities to Encourage Speech and Language Development

    MedlinePlus

    Birth to 2 Years: Encourage your baby ... or light) of the packages.

  11. What Is Language? What Is Speech?

    MedlinePlus

    Kelly's 4-year-old son, Tommy, has speech and language problems. Friends and family have a hard time ...

  12. General American Speech and Phonic Symbols.

    ERIC Educational Resources Information Center

    Calvert, Donald R.

    1982-01-01

    General American Symbols, speech and phonic symbols adapted from the Northampton symbols, are presented as a simplified system for teaching reading and speech to deaf children. Ways to use symbols for indicating features of speech production are suggested. (Author)

  13. Hemispheric Asymmetries in Speech Perception: Sense, Nonsense and Modulations

    PubMed Central

    Rosen, Stuart; Wise, Richard J. S.; Chadha, Shabneet; Conway, Eleanor-Jayne; Scott, Sophie K.

    2011-01-01

    Background: The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialization for specific acoustic features in speech, particularly regarding ‘rapid temporal processing’. Methodology: A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and in whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech, but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds. Conclusions: Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features. PMID:21980349
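
    The stimulus family described, two noise-excited spectral prominences whose frequency and amplitude are independently static or dynamic, can be roughly approximated with a pair of time-varying two-pole resonators driven by noise. The Python sketch below is an approximation of that idea, not the authors' analysis/synthesis technique; the trajectories are invented stand-ins for extracted formant tracks.

```python
import numpy as np

def resonator(noise, f_track, amp_track, fs, bw=100.0):
    """Drive a two-pole resonator with noise while its centre frequency
    and output amplitude follow the given sample-by-sample tracks."""
    r = np.exp(-np.pi * bw / fs)  # pole radius sets the bandwidth
    y = np.zeros_like(noise)
    for n in range(2, len(noise)):
        theta = 2 * np.pi * f_track[n] / fs
        y[n] = 2 * r * np.cos(theta) * y[n - 1] - r * r * y[n - 2] + noise[n]
    return amp_track * y

fs = 16000
t = np.linspace(0, 1.0, fs)
# Invented trajectories standing in for F1/F2 tracks of a sentence.
f1 = 500 + 200 * np.sin(2 * np.pi * 3 * t)
f2 = 1500 + 400 * np.sin(2 * np.pi * 2 * t)
a1 = 0.5 + 0.5 * np.sin(2 * np.pi * 4 * t) ** 2
stim = (resonator(np.random.randn(fs), f1, a1, fs)
        + resonator(np.random.randn(fs), f2, np.ones(fs), fs))
stim /= np.max(np.abs(stim))
```

    Holding a track constant reproduces the static conditions; pairing the frequency tracks of one sentence with the amplitude tracks of another mirrors the mismatched condition.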

  14. Analysis of a digital technique for frequency transposition of speech

    NASA Astrophysics Data System (ADS)

    Digirolamo, V.

    1985-09-01

    Frequency transposition is the process of raising or lowering the frequency content (pitch) of an audio signal. The hearing-impaired community has the greatest interest in the applications of frequency transposition. Though several analog and digital frequency-transposing hearing aid systems have been built and tested, this thesis investigates a possible digital processing alternative. Pole shifting, in the z-domain, of an autoregressive (all-pole) model of speech was shown to be a viable technique for changing frequency content. Since linear predictive coding (LPC) techniques are used to code, analyze, and synthesize speech, with the resulting LPC coefficients related to the coefficients of an equivalent autoregressive model, a linear relationship between LPC coefficients and frequency transposition is explored. This theoretical relationship is first established using a pure sine wave and then is extended to processing speech. The resulting speech synthesis experiments failed to substantiate the conjectures of this thesis. However, future research avenues are suggested that may lead toward a viable approach to transposing speech.
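
    The pole-shifting idea can be sketched per analysis frame: fit an all-pole (LPC) model, rotate the pole angles by the transposition factor, and re-excite the shifted model with the prediction residual. The Python sketch below makes several simplifying assumptions (autocorrelation-method LPC via the normal equations, real poles left in place, rotated angles clipped below Nyquist) and is an illustration of the general technique rather than the thesis's exact procedure.

```python
import numpy as np
from scipy.signal import lfilter

def transpose_frame(frame, order=12, shift=1.2):
    """Shift the spectral envelope of one windowed speech frame by
    rotating the pole angles of its all-pole (LPC) model."""
    # Autocorrelation method: solve the normal equations for AR coefficients.
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    lpc = np.concatenate([[1.0], -a])  # A(z) = 1 - sum_k a_k z^-k

    # Rotate each complex pole's angle by the transposition factor.
    new_poles = []
    for p in np.roots(lpc):
        if abs(p.imag) < 1e-8:
            new_poles.append(p)  # leave real poles untouched
        else:
            ang = np.clip(np.angle(p) * shift, -0.99 * np.pi, 0.99 * np.pi)
            new_poles.append(abs(p) * np.exp(1j * ang))
    new_lpc = np.real(np.poly(new_poles))

    # Inverse-filter to get the residual, then excite the shifted model.
    residual = lfilter(lpc, [1.0], frame)
    return lfilter([1.0], new_lpc, residual)

# e.g. out = transpose_frame(np.hanning(400) * frame_samples, shift=1.5)
```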

  15. Feature extraction and models for speech: An overview

    NASA Astrophysics Data System (ADS)

    Schroeder, Manfred

    2002-11-01

    Modeling of speech has a long history, beginning with Count von Kempelen's 1770 mechanical speaking machine. Even then human vowel production was seen as resulting from a source (the vocal cords) driving a physically separate resonator (the vocal tract). Homer Dudley's 1928 frequency-channel vocoder and many of its descendants are based on the same successful source-filter paradigm. For linguistic studies as well as practical applications in speech recognition, compression, and synthesis (see M. R. Schroeder, Computer Speech), the extant models require the (often difficult) extraction of numerous parameters such as the fundamental and formant frequencies and various linguistic distinctive features. Some of these difficulties were obviated by the introduction of linear predictive coding (LPC) in 1967, in which the filter part is an all-pole filter, reflecting the fact that for non-nasalized vowels the vocal tract is well approximated by an all-pole transfer function. In the now ubiquitous code-excited linear prediction (CELP), the source part is replaced by a code book which (together with a perceptual error criterion) permits speech compression to very low bit rates at high speech quality for the Internet and cell phones.

  16. A Technique for Estimating Intensity of Emotional Expressions and Speaking Styles in Speech Based on Multiple-Regression HSMM

    NASA Astrophysics Data System (ADS)

    Nose, Takashi; Kobayashi, Takao

    In this paper, we propose a technique for estimating the degree or intensity of emotional expressions and speaking styles appearing in speech. The key idea is based on a style control technique for speech synthesis using a multiple regression hidden semi-Markov model (MRHSMM), and the proposed technique can be viewed as the inverse of the style control. In the proposed technique, the acoustic features of spectrum, power, fundamental frequency, and duration are simultaneously modeled using the MRHSMM. We derive an algorithm for estimating explanatory variables of the MRHSMM, each of which represents the degree or intensity of emotional expressions and speaking styles appearing in acoustic features of speech, based on a maximum likelihood criterion. We show experimental results to demonstrate the ability of the proposed technique using two types of speech data, simulated emotional speech and spontaneous speech with different speaking styles. It is found that the estimated values have correlation with human perception.
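
    Stripped of the HSMM machinery (state alignment, durations, occupancy weighting), the core estimation step inverts the style-control regression: with per-state means modeled as mu_i = H_i s + b_i and covariances Sigma_i, the maximum-likelihood style vector s has a closed generalized-least-squares form. The Python sketch below shows only that step, under those stated simplifications; every name is illustrative.

```python
import numpy as np

def estimate_style(H, b, Sigma, obs):
    """ML/GLS estimate of the style vector s for a multiple-regression
    Gaussian model: mu_i = H[i] @ s + b[i], covariance Sigma[i], with
    obs[i] the mean observation assigned to state i.
    Solves: s = (sum H'S^-1 H)^-1 (sum H'S^-1 (o - b))."""
    dim = H[0].shape[1]
    A = np.zeros((dim, dim))
    c = np.zeros(dim)
    for Hi, bi, Si, oi in zip(H, b, Sigma, obs):
        Si_inv = np.linalg.inv(Si)
        A += Hi.T @ Si_inv @ Hi
        c += Hi.T @ Si_inv @ (oi - bi)
    return np.linalg.solve(A, c)
```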

  17. Enhancing Peer Feedback and Speech Preparation: The Speech Video Activity

    ERIC Educational Resources Information Center

    Opt, Susan

    2012-01-01

    In the typical public speaking course, instructors or assistants videotape or digitally record at least one of the term's speeches in class or lab to offer students additional presentation feedback. Students often watch and self-critique their speeches on their own. Peers often give only written feedback on classroom presentations or completed…

  18. Of Speech and Time: Temporal Speech Patterns in Interpersonal Contexts.

    ERIC Educational Resources Information Center

    Sieqman, Aron W., Ed.; Feldstein, Stanley, Ed.

    The temporal patterning of speech, primarily within the context of interpersonal exchanges, is traced in this cross-section of research exploring the major directions such studies have taken. Eighteen authors contributed selections to support the thesis that time as a dimension of speech reflects many of the important processes that occur during…

  19. Speech-in-Speech Recognition: A Training Study

    ERIC Educational Resources Information Center

    Van Engen, Kristin J.

    2012-01-01

    This study aims to identify aspects of speech-in-noise recognition that are susceptible to training, focusing on whether listeners can learn to adapt to target talkers ("tune in") and learn to better cope with various maskers ("tune out") after short-term training. Listeners received training on English sentence recognition in speech-shaped noise…

  20. Auditory detection of non-speech and speech stimuli in noise: Native speech advantage.

    PubMed

    Huo, Shuting; Tao, Sha; Wang, Wenjing; Li, Mingshuang; Dong, Qi; Liu, Chang

    2016-05-01

    Detection thresholds of Chinese vowels, Korean vowels, and a complex tone, with harmonic and noise carriers were measured in noise for Mandarin Chinese-native listeners. The harmonic index was calculated as the difference between detection thresholds of the stimuli with harmonic carriers and those with noise carriers. The harmonic index for Chinese vowels was significantly greater than that for Korean vowels and the complex tone. Moreover, native speech sounds were rated significantly more native-like than non-native speech and non-speech sounds. The results indicate that native speech has an advantage over other sounds in simple auditory tasks like sound detection. PMID:27250202

  1. Statistical assessment of speech system performance

    NASA Technical Reports Server (NTRS)

    Moshier, Stephen L.

    1977-01-01

    Methods for the normalization of performance test results of speech recognition systems are presented. Technological accomplishments in speech recognition systems, as well as planned research activities, are described.

  2. Overview of requirements and networks for voice communications and speech processing

    NASA Astrophysics Data System (ADS)

    Ince, A. Nejat

    1990-05-01

    The use of voice for military and civil communications is discussed. The military operational requirements are outlined in relation to air operations, including the effects of propagational factors and electronic warfare. Structures of the existing NATO communications network and the evolving Integrated Services Digital Network (ISDN) are reviewed to show how they meet the requirements. It is concluded that speech coding at low bit rates is a growing need for transmitting speech messages with a high level of security and reliability over low data-rate channels and for memory-efficient systems for voice storage, voice response, and voice mail. Furthermore, it is pointed out that low-bit-rate voice coding can ease the transition to shared channels for voice and data and can readily adapt voice messages for packet switching. The speech processing techniques and systems are then outlined as an introduction to the lectures of this series in terms of: the character of the speech signal, its generation and perception; speech coding, which is mainly concerned with man-to-man voice communication; speech synthesis, which deals with machine-to-man communication; speech recognition, which is related to man-to-machine communication; and quality assessment of speech systems and standards.

  3. Interpersonal Orientation and Speech Behavior.

    ERIC Educational Resources Information Center

    Street, Richard L., Jr.; Murphy, Thomas L.

    1987-01-01

    Indicates that (1) males with low interpersonal orientation (IO) were least vocally active and expressive and least consistent in their speech performances, and (2) high IO males and low IO females tended to demonstrate greater speech convergence than either low IO males or high IO females. (JD)

  4. American Studies through Folk Speech.

    ERIC Educational Resources Information Center

    Pedersen, E. Martin

    1993-01-01

    American slang reflects diversity, imagination, self-confidence, and optimism of the American people. Its vitality is due in part to the guarantee of free speech and lack of a national academy of language or of any official attempt to purify American speech, in part to Americans' historic geographic mobility. Such "folksay" includes riddles and…

  5. Speech Restoration: An Interactive Process

    ERIC Educational Resources Information Center

    Grataloup, Claire; Hoen, Michael; Veuillet, Evelyne; Collet, Lionel; Pellegrino, Francois; Meunier, Fanny

    2009-01-01

    Purpose: This study investigates the ability to understand degraded speech signals and explores the correlation between this capacity and the functional characteristics of the peripheral auditory system. Method: The authors evaluated the capability of 50 normal-hearing native French speakers to restore time-reversed speech. The task required them…

  6. SILENT SPEECH DURING SILENT READING.

    ERIC Educational Resources Information Center

    MCGUIGAN, FRANK J.

    EFFORTS WERE MADE IN THIS STUDY TO (1) RELATE THE AMOUNT OF SILENT SPEECH DURING SILENT READING TO LEVEL OF READING PROFICIENCY, INTELLIGENCE, AGE, AND GRADE PLACEMENT OF SUBJECTS, AND (2) DETERMINE WHETHER THE AMOUNT OF SILENT SPEECH DURING SILENT READING IS AFFECTED BY THE LEVEL OF DIFFICULTY OF PROSE READ AND BY THE READING OF A FOREIGN…

  7. Speech Prosody in Cerebellar Ataxia

    ERIC Educational Resources Information Center

    Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

  8. Taking a Stand for Speech.

    ERIC Educational Resources Information Center

    Moore, Wayne D.

    1995-01-01

    Asserts that freedom of speech issues were among the first major confrontations in U.S. constitutional law. Maintains that lessons from the controversies surrounding the Sedition Act of 1798 have continuing practical relevance. Describes and discusses the significance of freedom of speech to the U.S. political system. (CFR)

  9. Speech Training for Inmate Rehabilitation.

    ERIC Educational Resources Information Center

    Parkinson, Michael G.; Dobkins, David H.

    1982-01-01

    Using a computerized content analysis, the authors demonstrate changes in speech behaviors of prison inmates. They conclude that two to four hours of public speaking training can have only limited effect on students who live in a culture in which "prison speech" is the expected and rewarded form of behavior. (PD)

  10. SPEECH--MAN'S NATURAL COMMUNICATION.

    ERIC Educational Resources Information Center

    DUDLEY, HOMER; AND OTHERS

    SESSION 63 OF THE 1967 INSTITUTE OF ELECTRICAL AND ELECTRONIC ENGINEERS INTERNATIONAL CONVENTION BROUGHT TOGETHER SEVEN DISTINGUISHED MEN WORKING IN FIELDS RELEVANT TO LANGUAGE. THEIR TOPICS INCLUDED ORIGIN AND EVOLUTION OF SPEECH AND LANGUAGE, LANGUAGE AND CULTURE, MAN'S PHYSIOLOGICAL MECHANISMS FOR SPEECH, LINGUISTICS, AND TECHNOLOGY AND…

  11. Techniques for automatic speech recognition

    NASA Astrophysics Data System (ADS)

    Moore, R. K.

    1983-05-01

    A brief insight into some of the algorithms that lie behind current automatic speech recognition systems is provided. Early phonetically based approaches were not particularly successful, due mainly to a lack of appreciation of the problems involved. These problems are summarized, and various recognition techniques are reviewed in the context of the solutions that they provide. It is pointed out that the majority of currently available speech recognition equipment employs a "whole-word" pattern matching approach which, although relatively simple, has proved particularly successful in its ability to recognize speech. The concept of time normalization plays a central role in this type of recognition process, and a family of such algorithms is described in detail. The technique of dynamic time warping is not only capable of providing good performance for isolated word recognition, but can also be extended to the recognition of connected speech (thereby removing one of the most severe limitations of early speech recognition equipment).
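
    The time-normalizing match referred to above is the dynamic time warping (DTW) algorithm, shown below in its textbook Python form with symmetric steps and a Euclidean local distance; it illustrates the general technique rather than any specific equipment's implementation.

```python
import numpy as np

def dtw(ref, test):
    """Dynamic time warping: align two feature sequences
    (frames x coefficients) and return the normalized path cost."""
    n, m = len(ref), len(test)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(ref[i - 1] - test[j - 1])  # local distance
            D[i, j] = d + min(D[i - 1, j],      # stretch reference
                              D[i, j - 1],      # stretch test
                              D[i - 1, j - 1])  # match
    return D[n, m] / (n + m)

# Whole-word recognition: choose the template with the lowest cost.
# e.g. word = min(templates, key=lambda w: dtw(templates[w], features))
```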

  12. Interactions between distal speech rate, linguistic knowledge, and speech environment.

    PubMed

    Morrill, Tuuli; Baese-Berk, Melissa; Heffner, Christopher; Dilley, Laura

    2015-10-01

    During lexical access, listeners use both signal-based and knowledge-based cues, and information from the linguistic context can affect the perception of acoustic speech information. Recent findings suggest that the various cues used in lexical access are implemented with flexibility and may be affected by information from the larger speech context. We conducted 2 experiments to examine effects of a signal-based cue (distal speech rate) and a knowledge-based cue (linguistic structure) on lexical perception. In Experiment 1, we manipulated distal speech rate in utterances where an acoustically ambiguous critical word was either obligatory for the utterance to be syntactically well formed (e.g., Conner knew that bread and butter (are) both in the pantry) or optional (e.g., Don must see the harbor (or) boats). In Experiment 2, we examined identical target utterances as in Experiment 1 but changed the distribution of linguistic structures in the fillers. The results of the 2 experiments demonstrate that speech rate and linguistic knowledge about critical word obligatoriness can both influence speech perception. In addition, it is possible to alter the strength of a signal-based cue by changing information in the speech environment. These results provide support for models of word segmentation that include flexible weighting of signal-based and knowledge-based cues. PMID:25794478

  13. Hate Speech or Free Speech: Can Broad Campus Speech Regulations Survive Current Judicial Reasoning?

    ERIC Educational Resources Information Center

    Heiser, Gregory M.; Rossow, Lawrence F.

    1993-01-01

    Federal courts have found speech regulations overbroad in suits against the University of Michigan and the University of Wisconsin System. Attempts to assess the theoretical justification and probable fate of broad speech regulations that have not been explicitly rejected by the courts. Concludes that strong arguments for broader regulation will…

  14. Binary concatenated coding system

    NASA Technical Reports Server (NTRS)

    Monford, L. G., Jr.

    1973-01-01

    Coding, using 3-bit binary words, is applicable to any measurement having an integer scale up to 100. A system using 6-bit data words can be expanded to read from 1 to 10,000, and 9-bit data words can increase the range to 1,000,000. The code may be "read" directly by observation after memorizing a simple listing of 9's and 10's.

  15. Hate Speech/Free Speech: Using Feminist Perspectives To Foster On-Campus Dialogue.

    ERIC Educational Resources Information Center

    Cornwell, Nancy; Orbe, Mark P.; Warren, Kiesha

    1999-01-01

    Explores the complex issues inherent in the tension between hate speech and free speech, focusing on the phenomenon of hate speech on college campuses. Describes the challenges to hate speech made by critical race theorists and explains how a feminist critique can reorient the parameters of hate speech. (SLD)

  16. ON THE NATURE OF SPEECH SCIENCE.

    ERIC Educational Resources Information Center

    PETERSON, GORDON E.

    IN THIS ARTICLE THE NATURE OF THE DISCIPLINE OF SPEECH SCIENCE IS CONSIDERED AND THE VARIOUS BASIC AND APPLIED AREAS OF THE DISCIPLINE ARE DISCUSSED. THE BASIC AREAS ENCOMPASS THE VARIOUS PROCESSES OF THE PHYSIOLOGY OF SPEECH PRODUCTION, THE ACOUSTICAL CHARACTERISTICS OF SPEECH, INCLUDING THE SPEECH WAVE TYPES AND THE INFORMATION-BEARING ACOUSTIC…

  17. Freedom of Speech Newsletter, February 1976.

    ERIC Educational Resources Information Center

    Allen, Winfred G., Jr., Ed.

    The "Freedom of Speech Newsletter" is the communication medium, published four times each academic year, of the Freedom of Speech Interest Group, Western Speech Communication Association. Articles included in this issue are "What Is Academic Freedom For?" by Ralph Ross, "A Sociology of Free Speech" by Ray Heidt, "A Queer Interpretation fo the…

  18. Multifractal nature of unvoiced speech signals

    SciTech Connect

    Adeyemi, O.A.; Hartt, K.; Boudreaux-Bartels, G.F.

    1996-06-01

    A refinement is made in the nonlinear dynamic modeling of speech signals. Previous research successfully characterized speech signals as chaotic. Here, we analyze fricative speech signals using multifractal measures to determine various fractal regimes present in their chaotic attractors. Results support the hypothesis that speech signals have multifractal measures. © 1996 American Institute of Physics.

  19. Infant Perception of Atypical Speech Signals

    ERIC Educational Resources Information Center

    Vouloumanos, Athena; Gelfand, Hanna M.

    2013-01-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

  20. Is Birdsong More Like Speech or Music?

    PubMed

    Shannon, Robert V

    2016-04-01

    Music and speech share many acoustic cues, but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate, is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. PMID:26944220

  1. Phonetic Recalibration Only Occurs in Speech Mode

    ERIC Educational Resources Information Center

    Vroomen, Jean; Baart, Martijn

    2009-01-01

    Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…

  2. Preschool Children's Awareness of Private Speech

    ERIC Educational Resources Information Center

    Manfra, Louis; Winsler, Adam

    2006-01-01

    The present study explored: (a) preschool children's awareness of their own talking and private speech (speech directed to the self); (b) differences in age, speech use, language ability, and mentalizing abilities between children with awareness and those without; and (c) children's beliefs and attitudes about private speech. Fifty-one children…

  3. Automated Speech Rate Measurement in Dysarthria

    ERIC Educational Resources Information Center

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-01-01

    Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…

  4. Speech Patterns and Racial Wage Inequality

    ERIC Educational Resources Information Center

    Grogger, Jeffrey

    2011-01-01

    Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the…

  5. Metrical perception of trisyllabic speech rhythms.

    PubMed

    Benadon, Fernando

    2014-01-01

    The perception of duration-based syllabic rhythm was examined within a metrical framework. Participants assessed the duration patterns of four-syllable phrases set within the stress structure XxxX (an Abercrombian trisyllabic foot). Using on-screen sliders, participants created percussive sequences that imitated speech rhythms and analogous non-speech monotone rhythms. There was a tendency to equalize the interval durations for speech stimuli but not for non-speech. Despite the perceptual regularization of syllable durations, different speech phrases were conceived in various rhythmic configurations, pointing to a diversity of perceived meters in speech. In addition, imitations of speech stimuli showed more variability than those of non-speech. Rhythmically skilled listeners exhibited lower variability and were more consistent with vowel-centric estimates when assessing speech stimuli. These findings enable new connections between meter- and duration-based models of speech rhythm perception. PMID:23417710

  6. Pronunciation models for conversational speech

    NASA Astrophysics Data System (ADS)

    Johnson, Keith

    2005-09-01

    Using a pronunciation dictionary of clear speech citation forms, a segment deletion rate of nearly 12% is found in a corpus of conversational speech. The number of apparent segment deletions can be reduced by constructing a pronunciation dictionary that records one or more of the actual pronunciations found in conversational speech; however, the resulting empirical pronunciation dictionary often fails to include the citation pronunciation form. Issues involved in selecting pronunciations for a dictionary for linguistic, psycholinguistic, and ASR research will be discussed. One conclusion is that Ladefoged may have been the wiser for avoiding the business of producing pronunciation dictionaries. [Supported by NIDCD Grant No. R01 DC04330-03.]
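
    A minimal sketch of how such a segment deletion rate can be computed: align each citation-form phone string with its observed conversational pronunciation and count the citation segments left unmatched. Everything here is illustrative rather than the study's actual procedure; the phone strings and the use of a plain Levenshtein alignment are assumptions.

    ```python
    # Toy estimate of a segment deletion rate: align citation-form phones with
    # observed pronunciations and count deleted citation segments. The ~12%
    # figure in the abstract comes from a real corpus, not from this sketch.

    def count_deletions(citation, observed):
        """Levenshtein alignment; returns citation segments with no counterpart."""
        m, n = len(citation), len(observed)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            dp[i][0] = i
        for j in range(n + 1):
            dp[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                sub = dp[i - 1][j - 1] + (citation[i - 1] != observed[j - 1])
                dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
        i, j, deletions = m, n, 0          # backtrace, counting deletions
        while i > 0:
            if j > 0 and dp[i][j] == dp[i - 1][j - 1] + (citation[i - 1] != observed[j - 1]):
                i, j = i - 1, j - 1        # match or substitution
            elif dp[i][j] == dp[i - 1][j] + 1:
                deletions, i = deletions + 1, i - 1
            else:
                j -= 1                     # insertion in the observed form
        return deletions

    # Hypothetical citation vs. conversational forms of "probably".
    pairs = [("p r aa b ax b l iy".split(), "p r aa b l iy".split())]
    deleted = sum(count_deletions(c, o) for c, o in pairs)
    total = sum(len(c) for c, _ in pairs)
    print(f"segment deletion rate: {deleted / total:.1%}")
    ```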

  7. Speech recovery device

    DOEpatents

    Frankle, Christen M.

    2004-04-20

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  8. Speech recovery device

    SciTech Connect

    Frankle, Christen M.

    2000-10-19

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  9. Silog: Speech Input Logon

    NASA Astrophysics Data System (ADS)

    Grau, Sergio; Allen, Tony; Sherkat, Nasser

    Silog is a biometric authentication system that extends the conventional PC logon process using voice verification. Users enter their ID and password using a conventional Windows logon procedure, but then the biometric authentication stage makes a Voice over IP (VoIP) call to a VoiceXML (VXML) server. User interaction with this speech-enabled component then allows the user's voice characteristics to be extracted as part of a simple user/system spoken dialogue. If the captured voice characteristics match those of a previously registered voice profile, then network access is granted. If no match is possible, then a potential unauthorised system access has been detected and the logon process is aborted.

  10. Multilocus phylogeny of the lichen-forming fungal genus Melanohalea (Parmeliaceae, Ascomycota): insights on diversity, distributions, and a comparison of species tree and concatenated topologies.

    PubMed

    Leavitt, Steven D; Esslinger, Theodore L; Spribille, Toby; Divakar, Pradeep K; Thorsten Lumbsch, H

    2013-01-01

    Accurate species circumscriptions are central for many biological disciplines and have critical implications for ecological and conservation studies. An increasing body of evidence suggests that in some cases traditional morphology-based taxonomy has underestimated diversity in lichen-forming fungi. Therefore, genetic data play an increasing role in recognizing distinct lineages of lichenized fungi that would otherwise be improbable to recognize using classical phenotypic characters. Melanohalea (Parmeliaceae, Ascomycota) is one of the most widespread and common lichen-forming genera in the Northern Hemisphere. In this study, we assess traditional phenotype-based species boundaries, identify previously unrecognized species-level lineages, and discuss biogeographic patterns in Melanohalea. We sampled 487 individuals worldwide, representing 18 of the 22 described Melanohalea species, and generated DNA sequence data from mitochondrial, nuclear ribosomal, and protein-coding markers. Diversity previously hidden within traditional species was identified using a genealogical concordance approach. We inferred relationships among sampled species-level lineages within Melanohalea using both concatenated phylogenetic methods and a coalescent-based multilocus species tree approach. Although lineages identified from genetic data are largely congruent with traditional taxonomy, we found strong evidence supporting the presence of previously unrecognized species in six of the 18 sampled taxa. Strong nodal support and overall congruence among independent loci suggest long-term reproductive isolation among most species-level lineages. While some Melanohalea taxa are truly widespread, a limited number of clades appear to have much more restricted distributional ranges. In most instances the concatenated gene tree and multilocus species tree approaches provided similar estimates of relationships. However, nodal support was generally higher in the phylogeny estimated from

  11. Applications of Hilbert Spectral Analysis for Speech and Sound Signals

    NASA Technical Reports Server (NTRS)

    Huang, Norden E.

    2003-01-01

    A new method for analyzing nonlinear and nonstationary data has been developed, and its natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition method, with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same number of zero-crossings and extrema, and also having symmetric envelopes defined by the local maxima and minima, respectively. The IMF also admits a well-behaved Hilbert transform. This decomposition method is adaptive and, therefore, highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identifications of embedded structures. This method can be used to process all acoustic signals. Specifically, it can process speech signals for speech synthesis, speaker identification and verification, speech recognition, and sound signal enhancement and filtering. Additionally, the acoustical signals from machinery are essentially the way machines talk to us: whether carried through the air or as vibration on the machine itself, they can tell us the operating condition of the machine. Thus, we can use the acoustic signal to diagnose machine problems.
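
    The Hilbert step of the method is easy to sketch. Assuming an IMF has already been extracted by the sifting process (a toy chirp stands in for one here, and the EMD loop itself is omitted), the analytic signal yields instantaneous amplitude and frequency as functions of time:

    ```python
    # Instantaneous frequency of a single IMF via the Hilbert transform.
    # The chirp is a stand-in for an IMF; EMD sifting is omitted for brevity.

    import numpy as np
    from scipy.signal import chirp, hilbert

    fs = 8000                                      # sample rate in Hz (arbitrary)
    t = np.arange(0, 1.0, 1.0 / fs)
    imf = chirp(t, f0=100.0, t1=1.0, f1=400.0)     # toy mono-component signal

    analytic = hilbert(imf)                        # x(t) + i * H[x](t)
    amplitude = np.abs(analytic)                   # instantaneous amplitude
    phase = np.unwrap(np.angle(analytic))          # instantaneous phase
    inst_freq = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency in Hz

    print(f"recovered sweep: {inst_freq[200]:.0f} Hz -> {inst_freq[-200]:.0f} Hz")
    ```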

  12. Speech in the Marxist State.

    ERIC Educational Resources Information Center

    McGuire, Michael; Berger, Lothar

    1979-01-01

    Describes the field of speech communication in East Germany with emphasis on the influence of the ideology of Marxism upon its nature and status in academic settings. Contrasts the East German system with the American. (JMF)

  13. Perceptual Learning of Interrupted Speech

    PubMed Central

    Benard, Michel Ruben; Başkent, Deniz

    2013-01-01

    The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps intelligibility of degraded speech in daily life. Two hypotheses were investigated using perceptual learning of interrupted speech. If different cognitive processes played a role in restoring interrupted speech with and without filler noise, the two forms of speech would be learned at different rates and with different perceived mental effort. If the restoration benefit were an artificial outcome of using the ecologically invalid stimulus of speech with silent gaps, this benefit would diminish with training. Two groups of normal-hearing listeners were trained, one with interrupted sentences with the filler noise and the other without. Feedback was provided with the auditory playback of the unprocessed and processed sentences, as well as the visual display of the sentence text. Training increased overall performance significantly; however, the restoration benefit did not diminish. The increase in intelligibility and the decrease in perceived mental effort were relatively similar between the groups, implying similar cognitive mechanisms for the restoration of the two types of interruptions. Training effects were generalizable, as both groups also improved with the form of speech they had not been trained on, and were retainable. Due to the null results and the relatively small number of participants (10 per group), further research is needed to draw conclusions more confidently. Nevertheless, training with interrupted speech seems to be effective, stimulating participants to use top-down restoration more actively and efficiently. This finding further implies the potential of this training approach as a rehabilitative tool for hearing-impaired and elderly populations. PMID:23469266

  14. Statistical Analysis of Spectral Properties and Prosodic Parameters of Emotional Speech

    NASA Astrophysics Data System (ADS)

    Přibil, J.; Přibilová, A.

    2009-01-01

    The paper addresses the reflection of microintonation and spectral properties in male and female acted emotional speech. The microintonation component of speech melody is analyzed with regard to its spectral and statistical parameters. According to psychological research on emotional speech, different emotions are accompanied by different spectral noise. We control its amount by spectral flatness, according to which high-frequency noise is mixed into voiced frames during cepstral speech synthesis. Our experiments are aimed at statistical analysis of cepstral coefficient values and ranges of spectral flatness in three emotions (joy, sadness, anger) and a neutral state for comparison. Calculated histograms of the spectral flatness distribution are visually compared and modelled by a Gamma probability distribution. Histograms of the cepstral coefficient distribution are evaluated and compared using skewness and kurtosis. The statistical results show good correlation between male and female voices for all emotional states portrayed by several Czech and Slovak professional actors.
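
    A hedged sketch of two measurements named above: per-frame spectral flatness (the geometric mean of the power spectrum divided by its arithmetic mean) and a Gamma fit to its distribution. The frame size and the white-noise stand-in for voiced speech are illustrative assumptions, not the study's data or settings.

    ```python
    # Per-frame spectral flatness and a Gamma fit to its histogram.

    import numpy as np
    from scipy import stats

    def spectral_flatness(frame, eps=1e-12):
        """Geometric mean over arithmetic mean of the power spectrum (0..1)."""
        power = np.abs(np.fft.rfft(frame)) ** 2 + eps
        return np.exp(np.mean(np.log(power))) / np.mean(power)

    rng = np.random.default_rng(0)
    signal = rng.standard_normal(16000)     # substitute real voiced frames here
    frames = signal[: len(signal) // 256 * 256].reshape(-1, 256)
    flatness = np.array([spectral_flatness(f) for f in frames])

    # Model the flatness distribution with a Gamma density, as in the paper;
    # skewness and kurtosis summarize the histogram shape.
    shape, loc, scale = stats.gamma.fit(flatness, floc=0.0)
    print(f"Gamma fit: shape={shape:.2f}, scale={scale:.4f}")
    print(f"skewness={stats.skew(flatness):.2f}, kurtosis={stats.kurtosis(flatness):.2f}")
    ```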

  15. A maximum likelihood approach to estimating articulator positions from speech acoustics

    SciTech Connect

    Hogden, J.

    1996-09-23

    This proposal presents an algorithm called maximum likelihood continuity mapping (MALCOM), which recovers the positions of the tongue, jaw, lips, and other speech articulators from measurements of the sound-pressure waveform of speech. MALCOM differs from other techniques for recovering articulator positions from speech in three critical respects: it does not require training on measured or modeled articulator positions, it does not rely on any particular model of sound propagation through the vocal tract, and it recovers a mapping from acoustics to articulator positions that is linearly, not topographically, related to the actual mapping from acoustics to articulation. The approach categorizes short-time windows of speech into a finite number of sound types, and assumes the probability of using any articulator position to produce a given sound type can be described by a parameterized probability density function. MALCOM then uses maximum likelihood estimation techniques to: (1) find the most likely smooth articulator path given a speech sample and a set of distribution functions (one distribution function for each sound type), and (2) change the parameters of the distribution functions to better account for the data. Using this technique improves the accuracy of articulator position estimates compared to continuity mapping -- the only other technique that learns the relationship between acoustics and articulation solely from acoustics. The technique has potential application to computer speech recognition, speech synthesis and coding, teaching the hearing impaired to speak, improving foreign language instruction, and teaching dyslexics to read. 34 refs., 7 figs.
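
    A heavily simplified sketch of step (1), under strong assumptions: one articulator dimension, a fixed Gaussian per sound type, and a quadratic smoothness penalty, so the most likely smooth path is the solution of a linear system. The labels, means, and smoothness weight below are invented for illustration; the full algorithm also re-estimates the distributions (step 2).

    ```python
    # Most likely smooth articulator path given per-sound-type Gaussians:
    # minimize  sum_t (x_t - mu_ct)^2 / (2 var_ct) + s * sum_t (x_{t+1} - x_t)^2,
    # a quadratic in x whose minimizer solves the linear system A x = b.

    import numpy as np

    def most_likely_path(labels, means, variances, s=5.0):
        T = len(labels)
        w = 1.0 / (2.0 * variances[labels])   # per-frame data-term weights
        A = np.diag(2.0 * w)
        b = 2.0 * w * means[labels]
        for t in range(T - 1):                # add the smoothness term
            A[t, t] += 2.0 * s
            A[t + 1, t + 1] += 2.0 * s
            A[t, t + 1] -= 2.0 * s
            A[t + 1, t] -= 2.0 * s
        return np.linalg.solve(A, b)

    labels = np.array([0, 0, 1, 1, 2, 2, 1, 0])   # sound type per frame (invented)
    means = np.array([0.0, 1.0, 2.0])             # articulator mean per sound type
    variances = np.array([0.1, 0.1, 0.1])
    print(most_likely_path(labels, means, variances).round(2))
    ```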

  16. New Ideas for Speech Recognition and Related Technologies

    SciTech Connect

    Holzrichter, J F

    2002-06-17

    The ideas relating to the use of organ motion sensors for the purposes of speech recognition were first described by the author in spring 1994. During the past year, a series of productive collaborations between the author, Tom McEwan, and Larry Ng ensued and has led to demonstrations, new sensor ideas, and algorithmic descriptions of a large number of speech recognition concepts. This document summarizes the basic concepts of recognizing speech once organ motions have been obtained. Micro-power radars and their uses for the measurement of body organ motions, such as those of the heart and lungs, have been demonstrated by Tom McEwan over the past two years. McEwan and I conducted a series of experiments, using these instruments, on vocal organ motions beginning in late spring, during which we observed motions of the vocal folds (i.e., cords), tongue, jaw, and related organs that are very useful for speech recognition and other purposes. These will be reviewed in a separate paper. Since late summer 1994, Lawrence Ng and I have worked to make many of the initial recognition ideas more rigorous and to investigate the applications of these new ideas to new speech recognition algorithms, to speech coding, and to speech synthesis. I introduce some of those ideas in section IV of this document, and we describe them more completely in the document following this one, UCRL-UR-120311. For the design and operation of micro-power radars and their application to body organ motions, the reader may contact Tom McEwan directly. The capability for using EM sensors (i.e., radar units) to measure body organ motions and positions has been available for decades. Impediments to their use appear to have been size, excessive power, lack of resolution, and lack of understanding of the value of organ motion measurements, especially as applied to speech related technologies. However, with the invention of very low power, portable systems, as demonstrated by McEwan at LLNL, researchers have begun

  17. Effect of Speaking Rate on Recognition of Synthetic and Natural Speech by Normal-Hearing and Cochlear Implant Listeners

    PubMed Central

    Ji, Caili; Galvin, John J.; Xu, Anting; Fu, Qian-Jie

    2012-01-01

    Objective: Most studies have evaluated cochlear implant (CI) performance using “clear” speech materials, which are highly intelligible and well-articulated. CI users may encounter much greater variability in speech patterns in the “real world,” including synthetic speech. In this study, we measured normal-hearing (NH) and CI listeners’ sentence recognition with multiple talkers and speaking rates, and with naturally produced and synthetic speech. Design: NH and CI subjects were asked to recognize naturally produced or synthetic sentences, presented at a slow, normal, or fast speaking rate. Natural speech was produced by one male and one female talker; synthetic speech was generated to simulate a male and a female talker. For natural speech, the speaking rate was time-scaled while preserving voice pitch and formant frequency information. For synthetic speech, the speaking rate was adjusted within the speech synthesis engine. NH subjects were tested while listening to unprocessed speech or to an 8-channel acoustic CI simulation. CI subjects were tested while listening with their clinical processors and the recommended microphone sensitivity and volume settings. Results: The NH group performed significantly better than the CI simulation group, and the CI simulation group performed significantly better than the CI group. For all subject groups, sentence recognition was significantly better with natural than with synthetic speech. The performance deficit with synthetic speech was relatively small for NH subjects listening to unprocessed speech. However, the performance deficit with synthetic speech was much greater for CI subjects and for CI simulation subjects. There was a significant effect of talker gender, with slightly better performance with the female talker for CI subjects and slightly better performance with the male talker for the CI simulations. For all subject groups, sentence recognition was significantly poorer only at the fast rate. CI performance was

  18. Neural pathways for visual speech perception

    PubMed Central

    Bernstein, Lynne E.; Liebenthal, Einat

    2014-01-01

    This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA. PMID:25520611

  19. Speech recognition by computer. October 1981-1982 (a bibliography with abstracts)

    SciTech Connect

    Not Available

    1983-02-01

    The cited reports present investigations on the recognition, synthesis, and processing of speech by computer. The research includes the acoustical, phonological, and linguistic processes necessary in the conversion of the various waveforms by computers. (This updated bibliography contains 33 citations, all of which are new entries to the previous edition.)

  20. Speech recognition by computer. 1964-September 1981 (a bibliography with abstracts)

    SciTech Connect

    Not Available

    1983-02-01

    The cited reports present investigations on the recognition, synthesis, and processing of speech by computer. The research includes the acoustical, phonological, and linguistic processes necessary in the conversion of the various waveforms by computers. (This updated bibliography contains 294 citations, none of which are new entries to the previous edition.)

  1. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    PubMed

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared: the speech transmission index (STI) and the speech intelligibility index (SII). The simplification of the STI, the room acoustics speech transmission index (RASTI), was also considered. These quantities are all based on determining an apparent speech-to-noise ratio in selected frequency bands and summing the band values using a specific weighting. For comparison, data were needed on the possible differences among these methods resulting from the calculation scheme and also from the measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse. PMID:16521772
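
    All three indices share the weighted band-SNR core described above. The sketch below shows that shared computation in schematic form: clip each band's apparent SNR to the ±15 dB usable range, map it to [0, 1], and sum with importance weights. The SNR values and weights are made up, not those prescribed by the STI, RASTI, or SII standards.

    ```python
    # Schematic core of STI/RASTI/SII-style indices: a weighted sum of
    # normalized per-band apparent speech-to-noise ratios.

    def intelligibility_index(band_snr_db, weights):
        """band_snr_db: apparent SNR per frequency band; weights sum to 1."""
        score = 0.0
        for snr, w in zip(band_snr_db, weights):
            clipped = max(-15.0, min(15.0, snr))   # usable SNR range, +/-15 dB
            score += w * (clipped + 15.0) / 30.0   # normalize to [0, 1] and weight
        return score

    snr = [12.0, 9.0, 3.0, -2.0, -8.0]   # hypothetical measured band SNRs (dB)
    w = [0.15, 0.25, 0.30, 0.20, 0.10]   # hypothetical band-importance weights
    print(f"index = {intelligibility_index(snr, w):.2f}")   # 1 good, 0 poor
    ```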

  2. Sparse representation in speech signal processing

    NASA Astrophysics Data System (ADS)

    Lee, Te-Won; Jang, Gil-Jin; Kwon, Oh-Wook

    2003-11-01

    We review the sparse representation principle for processing speech signals. A transformation for encoding the speech signals is learned such that the resulting coefficients are as independent as possible. We use independent component analysis with an exponential prior to learn a statistical representation for speech signals. This representation leads to extremely sparse priors that can be used for encoding speech signals for a variety of purposes. We review applications of this method for speech feature extraction, automatic speech recognition and speaker identification. Furthermore, this method is also suited for tackling the difficult problem of separating two sounds given only a single microphone.
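
    A hedged sketch of the learning step summarized above, using FastICA from scikit-learn as an assumed stand-in for the authors' estimator: a linear transform of short waveform frames is learned so that the coefficients are as independent (and, for speech, as sparse) as possible. Random noise substitutes for real audio, so the learned basis is only a placeholder.

    ```python
    # Learn an ICA basis for short speech frames; coefficients of real speech
    # under such a basis are sparse (highly kurtotic). Requires scikit-learn
    # >= 1.1 for the whiten="unit-variance" spelling.

    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)
    audio = rng.standard_normal(32000)      # substitute a real speech waveform
    frame_len = 64
    frames = audio[: len(audio) // frame_len * frame_len].reshape(-1, frame_len)

    ica = FastICA(n_components=frame_len, whiten="unit-variance",
                  max_iter=1000, random_state=0)
    codes = ica.fit_transform(frames)       # per-frame coefficients
    basis = ica.mixing_                     # learned basis functions (columns)

    # Sparsity check: excess kurtosis per component (near 0 for this noise
    # input, strongly positive for real speech).
    k = ((codes - codes.mean(0)) ** 4).mean(0) / codes.var(0) ** 2 - 3.0
    print(f"mean excess kurtosis: {k.mean():.2f}")
    ```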

  3. Speech prosody in cerebellar ataxia

    NASA Astrophysics Data System (ADS)

    Casper, Maureen

    The present study sought an acoustic signature for the speech disturbance recognized in cerebellar degeneration. Magnetic resonance imaging was used for a radiological rating of cerebellar involvement in six cerebellar ataxic dysarthric speakers. Acoustic measures of [pap] syllables in contrastive prosodic conditions, in normal versus brain-damaged patients, were used to further our understanding both of the speech degeneration that accompanies cerebellar pathology and of speech motor control and movement in general. Pair-wise comparisons of the prosodic conditions within the normal group showed statistically significant differences for four prosodic contrasts. For three of the four contrasts analyzed, the normal speakers showed both longer durations and higher formant and fundamental frequency values in the more prominent first condition of the contrast. The acoustic measures of the normal prosodic contrast values were then used as a model to measure the degree of speech deterioration for individual cerebellar subjects. This estimate of speech deterioration, determined from individual differences between cerebellar and normal subjects' acoustic values on the four prosodic contrasts, was used in correlation analyses with the MRI ratings. Moderate correlations between speech deterioration and cerebellar atrophy were found in the measures of syllable duration and f0. A strong negative correlation was found for F1. Moreover, the normal model presented by these acoustic data allows for a description of the flexibility of task-oriented behavior in normal speech motor control. These data challenge spatio-temporal theory, which explains movement as an artifact of time wherein longer durations predict more extreme movements, and give further evidence for gestural internal dynamics of movement in which time emerges from articulatory events rather than dictating those events. This model provides a sensitive index of cerebellar pathology with quantitative acoustic

  4. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2004-03-23

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  5. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-02-14

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  6. System and method for characterizing voiced excitations of speech and acoustic signals, removing acoustic noise from speech, and synthesizing speech

    DOEpatents

    Burnett, Greg C.; Holzrichter, John F.; Ng, Lawrence C.

    2006-08-08

    The present invention is a system and method for characterizing human (or animate) speech voiced excitation functions and acoustic signals, for removing unwanted acoustic noise which often occurs when a speaker uses a microphone in common environments, and for synthesizing personalized or modified human (or other animate) speech upon command from a controller. A low power EM sensor is used to detect the motions of windpipe tissues in the glottal region of the human speech system before, during, and after voiced speech is produced by a user. From these tissue motion measurements, a voiced excitation function can be derived. Further, the excitation function provides speech production information to enhance noise removal from human speech and it enables accurate transfer functions of speech to be obtained. Previously stored excitation and transfer functions can be used for synthesizing personalized or modified human speech. Configurations of EM sensor and acoustic microphone systems are described to enhance noise cancellation and to enable multiple articulator measurements.

  7. Production and perception of clear speech

    NASA Astrophysics Data System (ADS)

    Bradlow, Ann R.

    2003-04-01

    When a talker believes that the listener is likely to have speech perception difficulties due to a hearing loss, background noise, or a different native language, she or he will typically adopt a clear speaking style. Previous research has established that, with a simple set of instructions to the talker, "clear speech" can be produced by most talkers under laboratory recording conditions. Furthermore, there is reliable evidence that adult listeners with either impaired or normal hearing typically find clear speech more intelligible than conversational speech. Since clear speech production involves listener-oriented articulatory adjustments, a careful examination of the acoustic-phonetic and perceptual consequences of the conversational-to-clear speech transformation can serve as an effective window into talker- and listener-related forces in speech communication. Furthermore, clear speech research has considerable potential for the development of speech enhancement techniques. After reviewing previous and current work on the acoustic properties of clear versus conversational speech, this talk will present recent data from a cross-linguistic study of vowel production in clear speech and a cross-population study of clear speech perception. Findings from these studies contribute to an evolving view of clear speech production and perception as reflecting both universal, auditory and language-specific, phonological contrast enhancement features.

  8. Contextual variability during speech-in-speech recognition

    PubMed Central

    Brouwer, Susanne; Bradlow, Ann R.

    2014-01-01

    This study examined the influence of background language variation on speech recognition. English listeners performed an English sentence recognition task in either “pure” background conditions in which all trials had either English or Dutch background babble or in mixed background conditions in which the background language varied across trials (i.e., a mix of English and Dutch or one of these background languages mixed with quiet trials). This design allowed the authors to compare performance on identical trials across pure and mixed conditions. The data reveal that speech-in-speech recognition is sensitive to contextual variation in terms of the target-background language (mis)match depending on the relative ease/difficulty of the test trials in relation to the surrounding trials. PMID:24993234

  9. Nonsensory factors in speech perception

    NASA Astrophysics Data System (ADS)

    Holt, Rachael F.; Carney, Arlene E.

    2001-05-01

    The nature of developmental differences was examined in a speech discrimination task, the change/no-change procedure, in which a varying number of speech stimuli are presented during a trial. Standard stimuli are followed by comparison stimuli that are identical to or acoustically different from the standard. Fourteen adults and 30 4- and 5-year-old children were tested with three speech contrast pairs at a variety of signal-to-noise ratios using various numbers of standard and comparison stimulus presentations. Adult speech discrimination performance followed the predictions of the multiple looks hypothesis [N. F. Viemeister and G. H. Wakefield, J. Acoust. Soc. Am. 90, 858-865 (1991)]: there was an increase in d′ by a factor of 1.4 for a doubling in the number of standard and comparison stimulus presentations near d′ values of 1.0. For children, increasing the number of standard stimuli improved discrimination performance, whereas increasing the number of comparisons did not. The multiple looks hypothesis did not explain the children's data. They are explained more parsimoniously by the developmental weighting shift [Nittrouer et al., J. Acoust. Soc. Am. 101, 2253-2266 (1993)], which proposes that children attend to different aspects of speech stimuli from adults. [Work supported by NIDCD and ASHF.]
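
    The factor of 1.4 is the multiple looks prediction itself: combining N independent looks optimally makes d′ grow with the square root of N, so doubling the number of presentations multiplies d′ by √2 ≈ 1.4.

    ```latex
    % Multiple-looks prediction (optimal combination of N independent looks):
    \[
      d'_N = \sqrt{N}\, d'_1
      \qquad\Longrightarrow\qquad
      \frac{d'_{2N}}{d'_N} = \sqrt{2} \approx 1.4 ,
    \]
    % matching the factor-of-1.4 increase reported for the adult listeners.
    ```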

  10. MENDING THE CHILD'S SPEECH. THE INSTRUCTOR HANDBOOK SERIES, NUMBER 325.

    ERIC Educational Resources Information Center

    GOLDBERG, EDITH B.

    THIS GUIDE FOR THE ELEMENTARY SCHOOL CLASSROOM TEACHER DISCUSSES HER ROLE IN A PROGRAM OF SPEECH THERAPY OR SPEECH IMPROVEMENT, WHETHER IN COOPERATION WITH A SPEECH THERAPIST OR ALONE. GOOD SPEECH AND DEFECTIVE SPEECH ARE DEFINED, AND ACTIVITIES TO ENCOURAGE SPEECH IN THE CLASSROOM ARE LISTED. SPECIFIC DIAGNOSTIC TECHNIQUES AND THERAPEUTIC…

  11. The Effect of Speech Rate on Stuttering Frequency, Phonated Intervals, Speech Effort, and Speech Naturalness during Chorus Reading

    ERIC Educational Resources Information Center

    Davidow, Jason H.; Ingham, Roger J.

    2013-01-01

    Purpose: This study examined the effect of speech rate on phonated intervals (PIs), in order to test whether a reduction in the frequency of short PIs is an important part of the fluency-inducing mechanism of chorus reading. The influence of speech rate on stuttering frequency, speaker-judged speech effort, and listener-judged naturalness was also…

  12. A causal test of the motor theory of speech perception: A case of impaired speech production and spared speech perception

    PubMed Central

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E.; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z.

    2015-01-01

    In the last decade, the debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. However, the exact role of the motor system in auditory speech processing remains elusive. Here we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. The patient’s spontaneous speech was marked by frequent phonological/articulatory errors, and those errors were caused, at least in part, by motor-level impairments with speech production. We found that the patient showed a normal phonemic categorical boundary when discriminating two nonwords that form a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the nonword stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labeling impairment. These data suggest that the identification (i.e., labeling) of nonword speech sounds may involve the speech motor system, but that the perception of speech sounds (i.e., discrimination) does not require the motor system. This means that motor processes are not causally involved in perception of the speech signal, and suggests that the motor system may be used when other cues (e.g., meaning, context) are not available. PMID:25951749

  13. Determining the threshold for usable speech within co-channel speech with the SPHINX automated speech recognition system

    NASA Astrophysics Data System (ADS)

    Hicks, William T.; Yantorno, Robert E.

    2004-10-01

    Much research has been and continues to be done in the area of separating the original utterances of two speakers from co-channel speech. This is very important in the area of automated speech recognition (ASR), where the current state of technology is not nearly as accurate as human listeners when the speech is co-channel. It is desired to determine for what types of speech (voiced, unvoiced, and silence) and at what target-to-interference ratio (TIR) two speakers can speak at the same time without reducing the speech intelligibility of the target speaker (referred to as usable speech). Knowing which segments of co-channel speech are usable in ASR can be used to improve the reconstruction of single-speaker speech. Tests were performed using the SPHINX ASR software and the TIDIGITS database. It was found that interfering voiced speech with a TIR of 6 dB or greater (on a per-frame basis) did not significantly reduce the intelligibility of the target speaker in co-channel speech. It was further found that interfering unvoiced speech with a TIR of 18 dB or greater (on a per-frame basis) did not significantly reduce the intelligibility of the target speaker in co-channel speech.
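
    The reported thresholds imply a simple frame-level decision rule for flagging usable co-channel frames, sketched below. The 6 dB and 18 dB values come from the abstract; the frame data and the function name are hypothetical.

    ```python
    # Frame-level "usable speech" rule implied by the reported TIR thresholds.

    VOICED_TIR_DB = 6.0     # threshold when the interfering speech is voiced
    UNVOICED_TIR_DB = 18.0  # threshold when the interfering speech is unvoiced

    def usable(frame_tir_db, interferer_voiced):
        """True if target intelligibility is not significantly reduced."""
        threshold = VOICED_TIR_DB if interferer_voiced else UNVOICED_TIR_DB
        return frame_tir_db >= threshold

    for tir, voiced in [(8.0, True), (4.0, True), (20.0, False), (12.0, False)]:
        kind = "voiced" if voiced else "unvoiced"
        print(f"TIR {tir:5.1f} dB, {kind:8s} interferer -> usable: {usable(tir, voiced)}")
    ```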

  14. Perceived Liveliness and Speech Comprehensibility in Aphasia: The Effects of Direct Speech in Auditory Narratives

    ERIC Educational Resources Information Center

    Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

    2014-01-01

    Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in "healthy" communication direct speech constructions contribute to the liveliness, and indirectly to the comprehensibility, of speech.…

  15. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    ERIC Educational Resources Information Center

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  16. Speech Perception and Short-Term Memory Deficits in Persistent Developmental Speech Disorder

    ERIC Educational Resources Information Center

    Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

    2006-01-01

    Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech…

  17. The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech

    ERIC Educational Resources Information Center

    Wayne, Rachel V.; Johnsrude, Ingrid S.

    2012-01-01

    Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…

  18. Speech Rate Acceptance Ranges as a Function of Evaluative Domain, Listener Speech Rate, and Communication Context.

    ERIC Educational Resources Information Center

    Street, Richard L., Jr.; Brady, Robert M.

    1982-01-01

    Speech rate appears to be an important communicative dimension upon which people evaluate the speech of others. Findings of this study indicate that speech rates at moderate through fast levels generated more favorable impressions of competence and social attractiveness than did slow speech. (PD)

  19. President Kennedy's Speech at Rice University

    NASA Technical Reports Server (NTRS)

    1988-01-01

    This video tape presents unedited film footage of President John F. Kennedy's speech at Rice University, Houston, Texas, September 12, 1962. The speech expresses the commitment of the United States to landing an astronaut on the Moon.

  20. Speech perception and production in severe environments

    NASA Astrophysics Data System (ADS)

    Pisoni, David B.

    1990-09-01

    The goal was to acquire new knowledge about speech perception and production in severe environments such as high masking noise, increased cognitive load or sustained attentional demands. Changes were examined in speech production under these adverse conditions through acoustic analysis techniques. One set of studies focused on the effects of noise on speech production. The experiments in this group were designed to generate a database of speech obtained in noise and in quiet. A second set of experiments was designed to examine the effects of cognitive load on the acoustic-phonetic properties of speech. Talkers were required to carry out a demanding perceptual motor task while they read lists of test words. A final set of experiments explored the effects of vocal fatigue on the acoustic-phonetic properties of speech. Both cognitive load and vocal fatigue are present in many applications where speech recognition technology is used, yet their influence on speech production is poorly understood.

  1. On-Line Measurement of Aphasic Speech.

    ERIC Educational Resources Information Center

    Packman, Ann; Ingham, Roger J.

    1978-01-01

    The spontaneous speech of five aphasic Ss (47-70 years old) was rated on-line by four clinicians to test the reliability of seven response categories (devised for the concurrent evaluation of aphasic speech). (Author/PHR)

  2. Speech Recognition: Its Place in Business Education.

    ERIC Educational Resources Information Center

    Szul, Linda F.; Bouder, Michele

    2003-01-01

    Suggests uses of speech recognition devices in the classroom for students with disabilities. Compares speech recognition software packages and provides guidelines for selection and teaching. (Contains 14 references.) (SK)

  3. Speech and Language Problems in Children

    MedlinePlus

    Children vary in their development of speech and language skills. Health professionals have milestones for what's normal. ... it may be due to a speech or language disorder. Language disorders can mean that the child ...

  4. Auditory-visual speech perception and synchrony detection for speech and nonspeech signals

    PubMed Central

    Conrey, Brianna; Pisoni, David B.

    2012-01-01

    Previous research has identified a “synchrony window” of several hundred milliseconds over which auditory-visual (AV) asynchronies are not reliably perceived. Individual variability in the size of this AV synchrony window has been linked with variability in AV speech perception measures, but it was not clear whether AV speech perception measures are related to synchrony detection for speech only or for both speech and nonspeech signals. An experiment was conducted to investigate the relationship between measures of AV speech perception and AV synchrony detection for speech and nonspeech signals. Variability in AV synchrony detection for both speech and nonspeech signals was found to be related to variability in measures of auditory-only (A-only) and AV speech perception, suggesting that temporal processing for both speech and nonspeech signals must be taken into account in explaining variability in A-only and multisensory speech perception. PMID:16838548

  5. A Chimpanzee Recognizes Synthetic Speech With Significantly Reduced Acoustic Cues to Phonetic Content

    PubMed Central

    Heimbauer, Lisa A.; Beran, Michael J.; Owren, Michael J.

    2011-01-01

    Summary: A long-standing debate concerns whether humans are specialized for speech perception [1–7], which some researchers argue is demonstrated by the ability to understand synthetic speech with significantly reduced acoustic cues to phonetic content [2–4,7]. We tested a chimpanzee (Pan troglodytes) that recognizes 128 spoken words [8,9], asking whether she could understand such speech. Three experiments presented 48 individual words, with the animal selecting a corresponding visuo-graphic symbol from among four alternatives. Experiment 1 tested spectrally reduced, noise-vocoded (NV) synthesis, originally developed to simulate input received by human cochlear-implant users [10]. Experiment 2 tested “impossibly unspeechlike” [3] sine-wave (SW) synthesis, which reduces speech to just three moving tones [11]. Although receiving only intermittent and non-contingent reward, the chimpanzee performed well above chance level, including when hearing synthetic versions for the first time. Recognition of SW words was least accurate, but improved in Experiment 3 when natural words in the same session were rewarded. The chimpanzee was more accurate with NV than SW versions, as were 32 human participants hearing these items. The chimpanzee's ability to spontaneously recognize acoustically reduced synthetic words suggests that experience rather than specialization is critical for speech-perception capabilities that some have suggested are uniquely human [12–14]. PMID:21723125

  6. The Mutual Intelligibility of L2 Speech

    ERIC Educational Resources Information Center

    Munro, Murray J.; Derwing, Tracey M.; Morton, Susan L.

    2006-01-01

    When understanding or evaluating foreign-accented speech, listeners are affected not only by properties of the speech itself but by their own linguistic backgrounds and their experience with different speech varieties. Given the latter influence, it is not known to what degree a diverse group of listeners might share a response to second language…

  7. Acoustics of Clear Speech: Effect of Instruction

    ERIC Educational Resources Information Center

    Lam, Jennifer; Tjaden, Kris; Wilding, Greg

    2012-01-01

    Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…

  8. Audiovisual Asynchrony Detection in Human Speech

    ERIC Educational Resources Information Center

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  9. Characteristics of Speech Motor Development in Children.

    ERIC Educational Resources Information Center

    Ostry, David J.; And Others

    1984-01-01

    Pulsed ultrasound was used to study tongue movements in the speech of children from 3 to 11 years of age. Speech data attained were characteristic of systems that can be described by second-order differential equations. Relationships observed in these systems may indicate that speech control involves tonic and phasic muscle inputs. (Author/RH)

  10. Normal Aspects of Speech, Hearing, and Language.

    ERIC Educational Resources Information Center

    Minifie, Fred. D., Ed.; And Others

    This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…

  11. Speech sounds alter facial skin sensation

    PubMed Central

    Ito, Takayuki

    2012-01-01

    Interactions between auditory and somatosensory information are relevant to the neural processing of speech, since speech processing, and certainly speech production, involves both auditory information and inputs that arise from the muscles and tissues of the vocal tract. We previously demonstrated that somatosensory inputs associated with facial skin deformation alter the perceptual processing of speech sounds. We show here that the reverse is also true: speech sounds alter the perception of facial somatosensory inputs. As a somatosensory task, we used a robotic device to create patterns of facial skin deformation that would normally accompany speech production. We found that the perception of the facial skin deformation was altered by speech sounds in a manner that reflects the way in which auditory and somatosensory effects are linked in speech production. The modulation of orofacial somatosensory processing by auditory inputs was specific to speech and likewise to facial skin deformation. Somatosensory judgments were not affected when the skin deformation was delivered to the forearm or palm or when the facial skin deformation accompanied nonspeech sounds. The perceptual modulation that we observed in conjunction with speech sounds shows that speech sounds specifically affect neural processing in the facial somatosensory system and suggests the involvement of the somatosensory system in both the production and perceptual processing of speech. PMID:22013241

  12. Freedom of Speech as an Academic Discipline.

    ERIC Educational Resources Information Center

    Haiman, Franklyn S.

    Since its formation, the Speech Communication Association's Committee on Freedom of Speech has played a critical leadership role in course offerings, research efforts, and regional activities in freedom of speech. Areas in which research has been done and in which further research should be carried out include: historical-critical research, in…

  13. Cognitive Functions in Childhood Apraxia of Speech

    ERIC Educational Resources Information Center

    Nijland, Lian; Terband, Hayo; Maassen, Ben

    2015-01-01

    Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…

  14. The Dynamic Nature of Speech Perception

    ERIC Educational Resources Information Center

    McQueen, James M.; Norris, Dennis; Cutler, Anne

    2006-01-01

    The speech perception system must be flexible in responding to the variability in speech sounds caused by differences among speakers and by language change over the lifespan of the listener. Indeed, listeners use lexical knowledge to retune perception of novel speech (Norris, McQueen, & Cutler, 2003). In that study, Dutch listeners made lexical…

  15. Communicating by Language: The Speech Process.

    ERIC Educational Resources Information Center

    House, Arthur S., Ed.

    This document reports on a conference focused on speech problems. The main objective of these discussions was to facilitate a deeper understanding of human communication through interaction of conference participants with colleagues in other disciplines. Topics discussed included speech production, feedback, speech perception, and development of…

  16. SPEECH DURATIONS OF ASTRONAUT AND GROUND COMMUNICATOR.

    PubMed

    MATARAZZO, J D; WIENS, A N; SASLOW, G; DUNHAM, R M; VOAS, R B

    1964-01-10

    Laboratory studies suggest that an interviewer can influence the speech duration of an interviewee by modifications in his own speech duration. What appears to be a related association between the speech duration of communicators on the ground and an astronaut in orbital flight was found. PMID:14075727

  17. Campus Speech Codes Said to Violate Rights

    ERIC Educational Resources Information Center

    Lipka, Sara

    2007-01-01

    Most college and university speech codes would not survive a legal challenge, according to a report released in December by the Foundation for Individual Rights in Education, a watchdog group for free speech on campuses. The report labeled many speech codes as overly broad or vague, and cited examples such as Furman University's prohibition of…

  18. Hate Speech on Campus: A Practical Approach.

    ERIC Educational Resources Information Center

    Hogan, Patrick

    1997-01-01

    Looks at arguments concerning hate speech and speech codes on college campuses, arguing that speech codes are likely to be of limited value in achieving civil rights objectives, and that there are alternatives less harmful to civil liberties and more successful in promoting civil rights. Identifies specific goals, and considers how restriction of…

  19. Liberalism, Speech Codes, and Related Problems.

    ERIC Educational Resources Information Center

    Sunstein, Cass R.

    1993-01-01

    It is argued that universities are pervasively and necessarily engaged in regulation of speech, which complicates many existing claims about hate speech codes on campus. The ultimate test is whether the restriction on speech is a legitimate part of the institution's mission, commitment to liberal education. (MSE)

  20. DEVELOPMENT AND DISORDERS OF SPEECH IN CHILDHOOD.

    ERIC Educational Resources Information Center

    KARLIN, ISAAC W.; AND OTHERS

    THE GROWTH, DEVELOPMENT, AND ABNORMALITIES OF SPEECH IN CHILDHOOD ARE DESCRIBED IN THIS TEXT DESIGNED FOR PEDIATRICIANS, PSYCHOLOGISTS, EDUCATORS, MEDICAL STUDENTS, THERAPISTS, PATHOLOGISTS, AND PARENTS. THE NORMAL DEVELOPMENT OF SPEECH AND LANGUAGE IS DISCUSSED, INCLUDING THEORIES ON THE ORIGIN OF SPEECH IN MAN AND FACTORS INFLUENCING THE NORMAL…

  1. Syllable Structure in Dysfunctional Portuguese Children's Speech

    ERIC Educational Resources Information Center

    Candeias, Sara; Perdigao, Fernando

    2010-01-01

    The goal of this work is to investigate whether children with speech dysfunctions (SD) show a deficit in planning some Portuguese syllable structures (PSS) in continuous speech production. Knowledge of which aspects of speech production are affected by SD is necessary for efficient improvement in the therapy techniques. The case-study is focused…

  2. Vygotskian Inner Speech and the Reading Process

    ERIC Educational Resources Information Center

    Ehrich, J. F.

    2006-01-01

    There is a paucity of Vygotskian influenced inner speech research in relation to the reading process. Those few studies which have examined Vygotskian inner speech from a reading perspective tend to support the notion that inner speech is an important covert function that is crucial to the reading process and to reading acquisition in general.…

  3. Interventions for Speech Sound Disorders in Children

    ERIC Educational Resources Information Center

    Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

    2010-01-01

    With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

  4. The Varieties of Speech to Young Children

    ERIC Educational Resources Information Center

    Huttenlocher, Janellen; Vasilyeva, Marina; Waterfall, Heidi R.; Vevea, Jack L.; Hedges, Larry V.

    2007-01-01

    This article examines caregiver speech to young children. The authors obtained several measures of the speech used to children during early language development (14-30 months). For all measures, they found substantial variation across individuals and subgroups. Speech patterns vary with caregiver education, and the differences are maintained over…

  5. Speech Perception in Individuals with Auditory Neuropathy

    ERIC Educational Resources Information Center

    Zeng, Fan-Gang; Liu, Sheng

    2006-01-01

    Purpose: Speech perception in participants with auditory neuropathy (AN) was systematically studied to answer the following 2 questions: Does noise present a particular problem for people with AN? Can clear speech and cochlear implants alleviate this problem? Method: The researchers evaluated the advantage in intelligibility of clear speech over…

  6. Speech and Hearing Science, Anatomy and Physiology.

    ERIC Educational Resources Information Center

    Zemlin, Willard R.

    Written for those interested in speech pathology and audiology, the text presents the anatomical, physiological, and neurological bases for speech and hearing. Anatomical nomenclature used in the speech and hearing sciences is introduced and the breathing mechanism is defined and discussed in terms of the respiratory passage, the framework and…

  7. Hate Speech and the First Amendment.

    ERIC Educational Resources Information Center

    Rainey, Susan J.; Kinsler, Waren S.; Kannarr, Tina L.; Reaves, Asa E.

    This document is comprised of California state statutes, federal legislation, and court litigation pertaining to hate speech and the First Amendment. The document provides an overview of California education code sections relating to the regulation of speech; basic principles of the First Amendment; government efforts to regulate hate speech,…

  8. Auditory models for speech analysis

    NASA Astrophysics Data System (ADS)

    Maybury, Mark T.

    This paper reviews the psychophysical basis for auditory models and discusses their application to automatic speech recognition. First an overview of the human auditory system is presented, followed by a review of current knowledge gleaned from neurological and psychoacoustic experimentation. Next, a general framework describes established peripheral auditory models which are based on well-understood properties of the peripheral auditory system. This is followed by a discussion of current enhancements to these models to include nonlinearities and synchrony information as well as other higher auditory functions. Finally, the initial performance of auditory models in the task of speech recognition is examined and additional applications are mentioned.

  9. Research in continuous speech recognition

    NASA Astrophysics Data System (ADS)

    Schwartz, R. M.; Chow, Y. L.; Makhoul, J.

    1983-12-01

    This annual report describes the work performed during the past year in an ongoing effort to design and implement a system that performs phonetic recognition of continuous speech. The general approach used is to develop a Hidden Markov Model (HMM) of speech parameter movements, which can be used to distinguish among the different phonemes. The resulting phoneme models incorporate the contextual effects of neighboring phonemes. One main aspect of this research is to incorporate both spectral parameters and acoustic-phonetic features into the HMM formalism.
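
    A minimal sketch of the HMM machinery named above: a toy three-state left-to-right phoneme model scored against a discrete observation sequence with the scaled forward algorithm. Every probability below is an illustrative made-up value, and a small symbol alphabet stands in for the report's spectral parameters.

    ```python
    import numpy as np

    A = np.array([[0.6, 0.4, 0.0],   # state-transition probabilities
                  [0.0, 0.7, 0.3],   # (left-to-right: no backward jumps)
                  [0.0, 0.0, 1.0]])
    B = np.array([[0.7, 0.2, 0.1],   # emission probabilities P(symbol | state)
                  [0.1, 0.6, 0.3],
                  [0.2, 0.2, 0.6]])
    pi = np.array([1.0, 0.0, 0.0])   # the model always starts in state 0

    def forward_log_likelihood(obs):
        """Return log P(obs | model) via the scaled forward recursion."""
        alpha = pi * B[:, obs[0]]
        c = alpha.sum()              # rescale each step to avoid underflow
        log_lik = np.log(c)
        alpha = alpha / c
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
            c = alpha.sum()
            log_lik += np.log(c)
            alpha = alpha / c
        return log_lik

    print(forward_log_likelihood([0, 0, 1, 2, 2]))
    ```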

  10. Speech entrainment enables patients with Broca's aphasia to produce fluent speech.

    PubMed

    Fridriksson, Julius; Hubbard, H Isabel; Hudspeth, Sarah Grace; Holland, Audrey L; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris

    2012-12-01

    A distinguishing feature of Broca's aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect 'speech entrainment' and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca's aphasia. In Experiment 1, 13 patients with Broca's aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca's area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production. Behavioural and

  11. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    ERIC Educational Resources Information Center

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  12. Perception of Speech Reflects Optimal Use of Probabilistic Speech Cues

    ERIC Educational Resources Information Center

    Clayards, Meghan; Tanenhaus, Michael K.; Aslin, Richard N.; Jacobs, Robert A.

    2008-01-01

    Listeners are exquisitely sensitive to fine-grained acoustic detail within phonetic categories for sounds and words. Here we show that this sensitivity is optimal given the probabilistic nature of speech cues. We manipulated the probability distribution of one probabilistic cue, voice onset time (VOT), which differentiates word initial labial…

  13. Speech Perception in Children with Speech Output Disorders

    ERIC Educational Resources Information Center

    Nijland, Lian

    2009-01-01

    Research in the field of speech production pathology is dominated by describing deficits in output. However, perceptual problems might underlie, precede, or interact with production disorders. The present study hypothesizes that the level of the production disorders is linked to level of perception disorders, thus lower-order production problems…

  14. Speech Priming: Evidence for Rate Persistence in Unscripted Speech

    ERIC Educational Resources Information Center

    Jungers, Melissa K.; Hupp, Julie M.

    2009-01-01

    Previous research has shown evidence for priming of rate in scripted speech. Two experiments examined the persistence of rate in production of unscripted picture descriptions. In Experiment 1, speakers heard and repeated priming sentences presented at a fast or slow rate and in a passive or active form. Speakers then described a new picture. The…

  15. Comparing the single-word intelligibility of two speech synthesizers for small computers

    SciTech Connect

    Cochran, P.S.

    1986-01-01

    Previous research on the intelligibility of synthesized speech has placed emphasis on the segmental intelligibility (rather than word or sentence intelligibility) of expensive and sophisticated synthesis systems. There is a need for more information about the intelligibility of low-to-moderately priced speech synthesizers because they are the most likely to be widely purchased for clinical and educational use. The purpose of this study was to compare the word intelligibility of two such synthesizers for small computers, the Votrax Personal Speech System (PSS) and the Echo GP (General Purpose). A multiple-choice word identification task was used in a two-part study in which 48 young adults served as listeners. Groups of subjects in Part I completed one trial listening to taped natural speech followed by one trial with each synthesizer. Subjects in Part II listened to the taped human speech followed by two trials with the same synthesizer. Under the quiet listening conditions used for this study, taped human speech was 30% more intelligible than the Votrax PSS, and 53% more intelligible than the Echo GP.

  16. Phylogeny of the cycads based on multiple single-copy nuclear genes: congruence of concatenated parsimony, likelihood and species tree inference methods

    PubMed Central

    Salas-Leiva, Dayana E.; Meerow, Alan W.; Calonje, Michael; Griffith, M. Patrick; Francisco-Ortega, Javier; Nakamura, Kyoko; Stevenson, Dennis W.; Lewis, Carl E.; Namoff, Sandra

    2013-01-01

    Background and aims Despite a recent new classification, a stable phylogeny for the cycads has been elusive, particularly regarding resolution of Bowenia, Stangeria and Dioon. In this study, five single-copy nuclear genes (SCNGs) are applied to the phylogeny of the order Cycadales. The specific aim is to evaluate several gene tree–species tree reconciliation approaches for developing an accurate phylogeny of the order, to contrast them with concatenated parsimony analysis and to resolve the erstwhile problematic phylogenetic position of these three genera. Methods DNA sequences of five SCNGs were obtained for 20 cycad species representing all ten genera of Cycadales. These were analysed with parsimony, maximum likelihood (ML) and three Bayesian methods of gene tree–species tree reconciliation, using Cycas as the outgroup. A calibrated date estimation was developed with Bayesian methods, and biogeographic analysis was also conducted. Key Results Concatenated parsimony, ML and three species tree inference methods resolve exactly the same tree topology with high support at most nodes. Dioon and Bowenia are the first and second branches of Cycadales after Cycas, respectively, followed by an encephalartoid clade (Macrozamia–Lepidozamia–Encephalartos), which is sister to a zamioid clade, of which Ceratozamia is the first branch, and in which Stangeria is sister to Microcycas and Zamia. Conclusions A single, well-supported phylogenetic hypothesis of the generic relationships of the Cycadales is presented. However, massive extinction events inferred from the fossil record that eliminated broader ancestral distributions within Zamiaceae compromise accurate optimization of ancestral biogeographical areas for that hypothesis. While major lineages of Cycadales are ancient, crown ages of all modern genera are no older than 12 million years, supporting a recent hypothesis of mostly Miocene radiations. This phylogeny can contribute to an accurate infrafamilial
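
    A minimal sketch of the supermatrix construction underlying concatenated parsimony/ML analyses like the one above: per-gene alignments are joined taxon by taxon, padding taxa missing from a gene. Gene names, taxa, and sequences are illustrative placeholders, not data from the study.

    ```python
    # Each gene maps taxon -> aligned sequence (toy values).
    genes = {
        "SCNG1": {"Cycas": "ATGGCA", "Dioon": "ATGGCT", "Zamia": "ATGACA"},
        "SCNG2": {"Cycas": "TTGA",   "Dioon": "TTGA",   "Zamia": "CTGA"},
    }

    taxa = sorted({t for aln in genes.values() for t in aln})

    def concatenate(genes, taxa, missing="-"):
        """Join per-gene alignments into one supermatrix, padding absent taxa."""
        supermatrix = {t: "" for t in taxa}
        for gene, aln in sorted(genes.items()):
            length = len(next(iter(aln.values())))   # alignment width of this gene
            for t in taxa:
                supermatrix[t] += aln.get(t, missing * length)
        return supermatrix

    for taxon, seq in concatenate(genes, taxa).items():
        print(f"{taxon:8s} {seq}")
    ```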

  17. Vocal quality factors: analysis, synthesis, and perception.

    PubMed

    Childers, D G; Lee, C K

    1991-11-01

    The purpose of this study was to examine several factors of vocal quality that might be affected by changes in vocal fold vibratory patterns. Four voice types were examined: modal, vocal fry, falsetto, and breathy. Three categories of analysis techniques were developed to extract source-related features from speech and electroglottographic (EGG) signals. Four factors were found to be important for characterizing the glottal excitations for the four voice types: the glottal pulse width, the glottal pulse skewness, the abruptness of glottal closure, and the turbulent noise component. The significance of these factors for voice synthesis was studied and a new voice source model that accounted for certain physiological aspects of vocal fold motion was developed and tested using speech synthesis. Perceptual listening tests were conducted to evaluate the auditory effects of the source model parameters upon synthesized speech. The effects of the spectral slope of the source excitation, the shape of the glottal excitation pulse, and the characteristics of the turbulent noise source were considered. Applications for these research results include synthesis of natural sounding speech, synthesis and modeling of vocal disorders, and the development of speaker independent (or adaptive) speech recognition systems. PMID:1837797
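
    A minimal sketch of a parametric glottal pulse governed by two of the factors identified above, pulse width and pulse skewness; the raised-cosine shape and parameter values are illustrative stand-ins, not the authors' source model.

    ```python
    import numpy as np

    def glottal_pulse(period=160, width=0.6, skew=0.7):
        """One pitch period of `period` samples: an open phase of
        width*period samples whose peak sits at fraction `skew` of the
        open phase, followed by a zero-amplitude closed phase."""
        n_open = int(width * period)
        n_peak = max(1, int(skew * n_open))
        rise = 0.5 * (1 - np.cos(np.pi * np.arange(n_peak) / n_peak))
        fall = 0.5 * (1 + np.cos(np.pi * np.arange(n_open - n_peak) / (n_open - n_peak)))
        return np.concatenate([rise, fall, np.zeros(period - n_open)])

    pulse = glottal_pulse()
    print(pulse.shape, float(pulse.max()))
    ```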

  18. Audiovisual Speech Synchrony Measure: Application to Biometrics

    NASA Astrophysics Data System (ADS)

    Bredin, Hervé; Chollet, Gérard

    2007-12-01

    Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measure of correspondence between audio and visual speech. Finally, the use of a synchrony measure for biometric identity verification based on talking faces is evaluated on the BANCA database.
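
    A minimal sketch of one common correspondence measure in this literature: correlate an acoustic energy track with a mouth-opening track over a range of temporal offsets and keep the best-scoring lag. Both tracks below are synthetic toys, not BANCA features.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    audio_energy = rng.standard_normal(200)
    # Visual track: a delayed, noisy copy of the audio track (toy data).
    mouth_height = np.roll(audio_energy, 3) + 0.3 * rng.standard_normal(200)

    def synchrony(a, v, max_lag=10):
        """Best absolute Pearson correlation over integer lags."""
        scores = {lag: np.corrcoef(a, np.roll(v, -lag))[0, 1]
                  for lag in range(-max_lag, max_lag + 1)}
        lag = max(scores, key=lambda k: abs(scores[k]))
        return lag, scores[lag]

    print(synchrony(audio_energy, mouth_height))   # expect lag of about 3
    ```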

  19. Pulse Vector-Excitation Speech Encoder

    NASA Technical Reports Server (NTRS)

    Davidson, Grant; Gersho, Allen

    1989-01-01

    Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high quality of reconstructed speech, but with less computation than required by comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually-based error criterion to synthesize natural-sounding speech.
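
    A minimal sketch of the analysis-by-synthesis search shared by CELP-family coders such as the PVXC: each candidate excitation vector is passed through the synthesis filter, and the one minimizing the error against the target frame wins. The first-order filter and random codebook are toy stand-ins, not the actual PVXC design.

    ```python
    import numpy as np
    from scipy.signal import lfilter

    rng = np.random.default_rng(0)
    codebook = rng.standard_normal((64, 40))      # 64 candidate excitation vectors
    lpc = np.array([1.0, -0.9])                   # toy 1st-order vocal-tract model
    target = lfilter([1.0], lpc, codebook[17])    # pretend frame of input speech

    def best_index(target, codebook, lpc):
        """Return the codebook index minimizing the synthesis error."""
        errs = [np.sum((target - lfilter([1.0], lpc, c)) ** 2) for c in codebook]
        return int(np.argmin(errs))

    print(best_index(target, codebook, lpc))      # -> 17
    ```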

  20. Feasibility of Technology Enabled Speech Disorder Screening.

    PubMed

    Duenser, Andreas; Ward, Lauren; Stefani, Alessandro; Smith, Daniel; Freyne, Jill; Morgan, Angela; Dodd, Barbara

    2016-01-01

    One in twenty Australian children suffers from a speech disorder. Early detection of such problems can significantly improve literacy and academic outcomes for these children, reduce health and educational burden and ongoing social costs. Here we present the development of a prototype and feasibility tests of a screening and decision support tool to assess speech disorders in young children. The prototype incorporates speech signal processing, machine learning and expert knowledge to automatically classify phonemes of normal and disordered speech. We discuss these results and our future work towards the development of a mobile tool to facilitate broad, early speech disorder screening by non-experts. PMID:27440284

  1. Prosodic Contrasts in Ironic Speech

    ERIC Educational Resources Information Center

    Bryant, Gregory A.

    2010-01-01

    Prosodic features in spontaneous speech help disambiguate implied meaning not explicit in linguistic surface structure, but little research has examined how these signals manifest themselves in real conversations. Spontaneously produced verbal irony utterances generated between familiar speakers in conversational dyads were acoustically analyzed…

  2. Sociolinguistic Factors in Speech Identification.

    ERIC Educational Resources Information Center

    Shuy, Roger W.; And Others

    The first of two experiments conducted in Detroit investigated the relationship between class and ethnic membership and identification of class and ethnicity; the role age and sex of respondent play in accuracy of speaker identification; and attitudes toward various socioethnic speech patterns. The second study was concerned with the attitudes of…

  3. Free Speech Advocates at Berkeley.

    ERIC Educational Resources Information Center

    Watts, William A.; Whittaker, David

    1966-01-01

    This study compares highly committed members of the Free Speech Movement (FSM) at Berkeley with the student population at large on 3 sociopsychological foci: general biographical data, religious orientation, and rigidity-flexibility. Questionnaires were administered to 172 FSM members selected by chance from the 10 to 1200 who entered and "sat-in"…

  4. Speech and Language Developmental Milestones

    MedlinePlus

    ... What are the milestones for speech and language development? The first signs of communication occur when an infant learns that a cry will bring food, comfort, and companionship. Newborns also begin to recognize important sounds in their environment, such as the voice of their mother or ...

  5. Embedding speech into virtual realities

    NASA Astrophysics Data System (ADS)

    Bohn, Christian-Arved; Krueger, Wolfgang

    1993-05-01

    In this work a speaker-independent speech recognition system is presented, which is suitable for implementation in Virtual Reality applications. The use of an artificial neural network in connection with a special compression of the acoustic input leads to a system which is robust, fast, easy to use, and needs no additional hardware besides common VR equipment.

  6. Models for Teaching Speech Communication.

    ERIC Educational Resources Information Center

    Deethardt, John F., II

    Intended for use by educators of preservice speech communications teachers, this description of a methods course is geared towards high school and college level pedagogy. The philosophy of the guide rejects the typical textbook style, in which generalizations are given to students as unqualified positive statements rather than made objects of…

  7. Speech Research. Interim Scientific Report.

    ERIC Educational Resources Information Center

    Cooper, Franklin S.

    The status and progress of several studies dealing with the nature of speech, instrumentation for its investigation, and instrumentation for practical applications is reported on. The period of January 1 through June 30, 1969 is covered. Extended reports and manuscripts cover the following topics: programing for the Glace-Holmes synthesizer,…

  8. Embedding speech into virtual realities

    NASA Technical Reports Server (NTRS)

    Bohn, Christian-Arved; Krueger, Wolfgang

    1993-01-01

    In this work a speaker-independent speech recognition system is presented, which is suitable for implementation in Virtual Reality applications. The use of an artificial neural network in connection with a special compression of the acoustic input leads to a system which is robust, fast, easy to use, and needs no additional hardware besides common VR equipment.

  9. The Ontogenesis of Speech Acts

    ERIC Educational Resources Information Center

    Bruner, Jerome S.

    1975-01-01

    A speech act approach to the transition from pre-linguistic to linguistic communication is adopted in order to consider language in relation to behavior and to allow for an emphasis on the use, rather than the form, of language. A pilot study of mothers and infants is discussed. (Author/RM)

  10. Inner Speech Impairments in Autism

    ERIC Educational Resources Information Center

    Whitehouse, Andrew J. O.; Maybery, Murray T.; Durkin, Kevin

    2006-01-01

    Background: Three experiments investigated the role of inner speech deficit in cognitive performances of children with autism. Methods: Experiment 1 compared children with autism with ability-matched controls on a verbal recall task presenting pictures and words. Experiment 2 used pictures for which the typical names were either single syllable or…

  11. Vector adaptive predictive coder for speech and audio

    NASA Technical Reports Server (NTRS)

    Chen, Juin-Hwey (Inventor); Gersho, Allen (Inventor)

    1990-01-01

    A real-time vector adaptive predictive coder which approximates each vector of K speech samples by using each of M fixed vectors in a first codebook to excite a time-varying synthesis filter and picking the vector that minimizes distortion. Predictive analysis for each frame determines parameters used for computing from vectors in the first codebook zero-state response vectors that are stored at the same address (index) in a second codebook. Encoding of input speech vectors s_n is then carried out using the second codebook. When the vector that minimizes distortion is found, its index is transmitted to a decoder which has a codebook identical to the first codebook of the encoder. There the index is used to read out a vector that is used to synthesize an output speech vector s_n. The parameters used in the encoder are quantized, for example by using a table, and the indices are transmitted to the decoder where they are decoded to specify transfer characteristics of filters used in producing the vector s_n from the receiver codebook vector selected by the vector index transmitted.
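
    A minimal sketch of the two-codebook arrangement described in the patent: zero-state responses of the fixed codevectors are precomputed through the current frame's synthesis filter into a second codebook, and encoding then searches that codebook. The codebook size and filter are toy values, not the patented design.

    ```python
    import numpy as np
    from scipy.signal import lfilter

    rng = np.random.default_rng(3)
    codebook1 = rng.standard_normal((32, 20))            # fixed excitation vectors

    def make_codebook2(lpc):
        """Second codebook: zero-state response of each fixed codevector."""
        return np.array([lfilter([1.0], lpc, c) for c in codebook1])

    def encode(frame, codebook2):
        """Return the index of the response closest to the input frame."""
        return int(np.argmin(np.sum((codebook2 - frame) ** 2, axis=1)))

    lpc = np.array([1.0, -0.8])                          # toy per-frame predictor
    codebook2 = make_codebook2(lpc)
    print(encode(codebook2[9], codebook2))               # -> 9
    ```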

  12. Speech-property-based FEC for Internet telephony applications

    NASA Astrophysics Data System (ADS)

    Sanneck, Henning A.; Le, Nguyen T. L.

    1999-12-01

    In this paper we first analyze the concealment performance of the G.729 decoder. We find that the loss of unvoiced frames can be concealed well. Also, the loss of voiced frames is concealed well once the decoder has obtained sufficient information on them. However, the decoder fails to conceal the loss of voiced frames at an unvoiced/voiced transition because it extrapolates internal state (filter coefficients and excitation) for an unvoiced sound. Moreover, once the decoder has failed to build the appropriate linear prediction synthesis filter, it takes a long time for it to resynchronize with the encoder. Using this result, we then develop a new FEC scheme to support frame-based codecs, which adjusts the amount of added redundancy adaptively to the properties of the speech signal. Objective quality measures (ITU P.861A and EMBSD) show that our speech-property-based FEC scheme achieves almost the same speech quality as current FEC schemes while approximately halving the amount of redundant data necessary to adequately protect the voice flow.
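
    A minimal sketch of the adaptive-redundancy idea: spend FEC bits where G.729-style concealment fails (unvoiced-to-voiced transitions) and little elsewhere. The frame labels and redundancy levels are illustrative placeholders, not the scheme's actual decision rule.

    ```python
    def redundancy_plan(frames):
        """frames: list of 'v' (voiced) / 'u' (unvoiced) labels per frame.
        Returns a redundancy level per frame: 2 at u->v transitions,
        1 for other voiced frames, 0 for unvoiced frames."""
        plan = []
        prev = 'u'
        for f in frames:
            if f == 'v' and prev == 'u':
                plan.append(2)   # onset: decoder state is wrong, protect hard
            elif f == 'v':
                plan.append(1)   # steady voiced: extrapolation works, protect lightly
            else:
                plan.append(0)   # unvoiced: concealment is cheap, no redundancy
            prev = f
        return plan

    print(redundancy_plan(list("uuvvvuuvv")))   # [0, 0, 2, 1, 1, 0, 0, 2, 1]
    ```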

  13. Phrase-programmable digital speech system

    SciTech Connect

    Raymond, W.J.; Morgan, R.L.; Miller, R.L.

    1987-01-27

    This patent describes a phrase speaking computer system having a programmable digital computer and a speech processor, the speech processor comprising: a voice synthesizer; a read/write speech data segment memory; a read/write command memory; control processor means including processor control programs and logic connecting to the memories and to the voice synthesizer. It is arranged to scan the command memory and to respond to command data entries stored therein by transferring corresponding speech data segments from the speech data segment memory to the voice synthesizer; data conveyance means, connecting the computer to the command memory and the speech data segment memory, for transferring the command data entries supplied by the computer into the command memory and for transferring the speech data segments supplied by the computer into the speech data segment memory; and an enable signal line connecting the computer to the speech processor and arranged to initiate the operation of the processor control programs and logic when the enable signal line is enabled by the computer; the programmable computer including speech control programs controlling the operation of the computer, including data conveyance command sequences that cause the computer to supply command data entries to the data conveyance means and speech processor enabling command sequences that cause the computer to energize the enable signal line.

  14. Speech recognition with amplitude and frequency modulations

    NASA Astrophysics Data System (ADS)

    Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

    2005-02-01

    Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance. Keywords: auditory analysis, cochlear implant, neural code, phase, scene analysis.
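
    A minimal sketch of deriving slowly varying AM and FM from one band of a signal via the analytic (Hilbert) representation, in the spirit of the decomposition described above; the test tone, cutoff frequency, and filter order are illustrative choices.

    ```python
    import numpy as np
    from scipy.signal import hilbert, butter, filtfilt

    fs = 16000
    t = np.arange(0, 0.5, 1 / fs)
    # Toy band signal: a 500 Hz carrier with 4 Hz amplitude modulation.
    x = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)

    analytic = hilbert(x)
    am = np.abs(analytic)                        # envelope (AM)
    phase = np.unwrap(np.angle(analytic))
    fm = np.diff(phase) * fs / (2 * np.pi)       # instantaneous frequency (FM)

    # Keep only the slow modulations (< 50 Hz), i.e. "slowly varying" AM/FM.
    b, a = butter(4, 50 / (fs / 2))
    am_slow = filtfilt(b, a, am)
    fm_slow = filtfilt(b, a, fm)
    print(am_slow[:3], fm_slow[:3])
    ```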

  15. Perception of Speech Sounds in School-Aged Children with Speech Sound Disorders.

    PubMed

    Preston, Jonathan L; Irwin, Julia R; Turcios, Jacqueline

    2015-11-01

    Children with speech sound disorders may perceive speech differently than children with typical speech development. The nature of these speech differences is reviewed with an emphasis on assessing phoneme-specific perception for speech sounds that are produced in error. Category goodness judgment, or the ability to judge accurate and inaccurate tokens of speech sounds, plays an important role in phonological development. The software Speech Assessment and Interactive Learning System, which has been effectively used to assess preschoolers' ability to perform goodness judgments, is explored for school-aged children with residual speech errors (RSEs). However, data suggest that this particular task may not be sensitive to perceptual differences in school-aged children. The need for the development of clinical tools for assessment of speech perception in school-aged children with RSE is highlighted, and clinical suggestions are provided. PMID:26458198

  16. The verbal transformation effect and the perceptual organization of speech: influence of formant transitions and F0-contour continuity.

    PubMed

    Stachurski, Marcin; Summers, Robert J; Roberts, Brian

    2015-05-01

    This study explored the role of formant transitions and F0-contour continuity in binding together speech sounds into a coherent stream. Listening to a repeating recorded word produces verbal transformations to different forms; stream segregation contributes to this effect and so it can be used to measure changes in perceptual coherence. In experiment 1, monosyllables with strong formant transitions between the initial consonant and following vowel were monotonized; each monosyllable was paired with a weak-transitions counterpart. Further stimuli were derived by replacing the consonant-vowel transitions with samples from adjacent steady portions. Each stimulus was concatenated into a 3-min-long sequence. Listeners only reported more forms in the transitions-removed condition for strong-transitions words, for which formant-frequency discontinuities were substantial. In experiment 2, the F0 contour of all-voiced monosyllables was shaped to follow a rising or falling pattern, spanning one octave. Consecutive tokens either had the same contour, giving an abrupt F0 change between each token, or alternated, giving a continuous contour. Discontinuous sequences caused more transformations and forms, and shorter times to the first transformation. Overall, these findings support the notion that continuity cues provided by formant transitions and the F0 contour play an important role in maintaining the perceptual coherence of speech. PMID:25620314

  17. Speech Entrainment Compensates for Broca's Area Damage

    PubMed Central

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-01-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443

  18. Temporal characteristics of speech: the effect of age and speech style.

    PubMed

    Bóna, Judit

    2014-08-01

    Aging affects temporal characteristics of speech. It is still a question how these changes occur in different speech styles which require various cognitive skills. In this paper speech rate, articulation rate, and pauses of 20 young and 20 old speakers are analyzed in four speech styles: spontaneous narrative, narrative recalls, a three-participant conversation, and reading aloud. Results show that age has a significant effect only on speech rate, articulation rate, and frequency of pauses. Speech style has a higher effect on temporal parameters than speakers' age. PMID:25096134

  19. A hardware preprocessor for use in speech recognition: Speech Input Device SID3

    NASA Astrophysics Data System (ADS)

    Renger, R. E.; Manning, D. R.

    1983-05-01

    A device is presented which reduces the amount of data sent to the computer for speech recognition by extracting from the speech signal the information that conveys the meaning of the speech; all other data are discarded. The design accommodates signal-to-noise ratios as low as 10 dB, public-telephone frequency bandwidth, and unconstrained speech. The device continuously produces 64 bits of digital information at its output, representing the way 16 speech parameters vary. The parameters cover speech quality, voice pitch, resonant frequency, level of resonance and unvoiced spectrum color. The receiving computer must have supporting software containing recognition algorithms adapted to SID3 parameters.

  20. Speech Enhancement Using Microphone Arrays.

    NASA Astrophysics Data System (ADS)

    Adugna, Eneyew

    Arrays of sensors have been employed effectively in communication systems for the directional transmission and reception of electromagnetic waves. Among the numerous benefits, this helps improve the signal-to-interference ratio (SIR) of the signal at the receiver. Arrays have since been used in related areas that employ propagating waves for the transmission of information. Several investigators have successfully adopted array principles to acoustics, sonar, seismic, and medical imaging. In speech applications the microphone is used as the sensor for acoustic data acquisition. The performance of subsequent speech processing algorithms--such as speech recognition or speaker recognition--relies heavily on the level of interference within the transduced or recorded speech signal. The normal practice is to use a single, hand-held or head-mounted, microphone. Under most environmental conditions, i.e., environments where other acoustic sources are also active, the speech signal from a single microphone is a superposition of acoustic signals present in the environment. Such cases represent a lower SIR value. To alleviate this problem an array of microphones--linear array, planar array, and 3-dimensional arrays--have been suggested and implemented. This work focuses on microphone arrays in room environments where reverberation is the main source of interference. The acoustic wave incident on the array from a point source is sampled and recorded by a linear array of sensors along with reflected waves. Array signal processing algorithms are developed and used to remove reverberations from the signal received by the array. Signals from other positions are considered as interference. Unlike most studies that deal with plane waves, we base our algorithm on spherical waves originating at a source point. This is especially true for room environments. The algorithm consists of two stages--a first stage to locate the source and a second stage to focus on the source. The first part

  1. Speech and language delay in children.

    PubMed

    McLaughlin, Maura R

    2011-05-15

    Speech and language delay in children is associated with increased difficulty with reading, writing, attention, and socialization. Although physicians should be alert to parental concerns and to whether children are meeting expected developmental milestones, there currently is insufficient evidence to recommend for or against routine use of formal screening instruments in primary care to detect speech and language delay. In children not meeting the expected milestones for speech and language, a comprehensive developmental evaluation is essential, because atypical language development can be a secondary characteristic of other physical and developmental problems that may first manifest as language problems. Types of primary speech and language delay include developmental speech and language delay, expressive language disorder, and receptive language disorder. Secondary speech and language delays are attributable to another condition such as hearing loss, intellectual disability, autism spectrum disorder, physical speech problems, or selective mutism. When speech and language delay is suspected, the primary care physician should discuss this concern with the parents and recommend referral to a speech-language pathologist and an audiologist. There is good evidence that speech-language therapy is helpful, particularly for children with expressive language disorder. PMID:21568252

  2. Loss tolerant speech decoder for telecommunications

    NASA Technical Reports Server (NTRS)

    Prieto, Jr., Jaime L. (Inventor)

    1999-01-01

    A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer. If a speech frame is detected to be lost or in error, then extrapolation modules are executed and replacement SCA parameters are generated and sent as the parameters required by the SCA. In this way, the information transfer to the SCA is transparent, and the SCA processing continues as usual. The listener will not normally notice that a speech frame has been lost because of the smooth transition between the last-received, lost, and next-received speech frames.
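
    A minimal sketch of the concealment flow described above: decoded coder parameters accumulate in a past-history buffer, and a one-step predictor fills in a lost frame. A fixed-weight linear predictor stands in here for the patent's backpropagation-trained FIR feed-forward neural network.

    ```python
    import numpy as np

    HISTORY = 3                       # past frames used for one-step prediction
    W = np.array([0.6, 0.3, 0.1])     # toy weights (the patent learns these by backprop)

    history = []                      # most recent frame last

    def on_frame(params):
        """Call for every successfully decoded frame's parameter vector."""
        history.append(np.asarray(params, float))
        del history[:-HISTORY]        # keep only the last HISTORY frames

    def conceal():
        """Predict parameters for a lost frame from the history buffer."""
        past = np.stack(history[::-1])            # newest frame first
        w = W[:len(past)]
        return w @ past / w.sum()

    on_frame([1.0, 0.2]); on_frame([1.1, 0.25]); on_frame([1.2, 0.3])
    print(conceal())                  # extrapolated replacement parameters
    ```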

  3. Speech entrainment compensates for Broca's area damage.

    PubMed

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-08-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to SE. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during SE versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of SE to improve speech production and may help select patients for SE treatment. PMID:25989443

  4. Some articulatory details of emotional speech

    NASA Astrophysics Data System (ADS)

    Lee, Sungbok; Yildirim, Serdar; Bulut, Murtaza; Kazemzadeh, Abe; Narayanan, Shrikanth

    2005-09-01

    Differences in speech articulation among four emotion types, neutral, anger, sadness, and happiness, are investigated by analyzing tongue tip, jaw, and lip movement data collected from one male and one female speaker of American English. The data were collected using an electromagnetic articulography (EMA) system while subjects produced simulated emotional speech. Pitch, root-mean-square (rms) energy and the first three formants were estimated for vowel segments. For both speakers, angry speech exhibited the largest rms energy and largest articulatory activity in terms of displacement range and movement speed. Happy speech is characterized by the largest pitch variability. It has higher rms energy than neutral speech, but articulatory activity is rather comparable to, or less than, that of neutral speech. That is, happy speech is more prominent in voicing activity than in articulation. Sad speech exhibits the longest sentence duration and lower rms energy. However, its articulatory activity is no less than that of neutral speech. Interestingly, for the male speaker, articulation for vowels in sad speech is consistently more peripheral (i.e., more forwarded displacements) when compared to other emotions. However, this does not hold for the female subject. These and other results will be discussed in detail with associated acoustics and perceived emotional qualities. [Work supported by NIH.]

  5. Individual differences in degraded speech perception

    NASA Astrophysics Data System (ADS)

    Carbonell, Kathy M.

    One of the lasting concerns in audiology is the unexplained individual differences in speech perception performance even for individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal-hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due either to examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has three aims: the first is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions. The second aim is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics both across tasks and across sessions; and finally, to determine whether performance on degraded speech perception tasks is correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing-impaired listeners.

  6. Sensorimotor influences on speech perception in infancy.

    PubMed

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-01

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development. PMID:26460030

  7. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

    PubMed

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

    2015-01-01

    The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available. PMID:25951749

  8. Speech Articulator and User Gesture Measurements Using Micropower, Interferometric EM-Sensors

    SciTech Connect

    Holzrichter, J F; Ng, L C

    2001-02-06

    Very low power, GHz frequency, "radar-like" sensors can measure a variety of motions produced by a human user of machine interface devices. These data can be obtained "at a distance" and can measure "hidden" structures. Measurements range from acoustic induced, 10-micron amplitude vibrations of vocal tract tissues, to few centimeter human speech articulator motions, to meter-class motions of the head, hands, or entire body. These EM sensors measure "fringe motions" as reflected EM waves are mixed with a local (homodyne) reference wave. These data, when processed using models of the system being measured, provide real time states of interface positions or other targets vs. time. An example is speech articulator positions vs. time in the user's body. This information appears to be useful for a surprisingly wide range of applications ranging from speech coding, synthesis and recognition, speaker or object identification, noise cancellation, hand or head motions for cursor direction, and other applications.

  9. Extensions to the Speech Disorders Classification System (SDCS)

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  10. Segmenting Words from Natural Speech: Subsegmental Variation in Segmental Cues

    ERIC Educational Resources Information Center

    Rytting, C. Anton; Brew, Chris; Fosler-Lussier, Eric

    2010-01-01

    Most computational models of word segmentation are trained and tested on transcripts of speech, rather than the speech itself, and assume that speech is converted into a sequence of symbols prior to word segmentation. We present a way of representing speech corpora that avoids this assumption, and preserves acoustic variation present in speech. We…

  11. Analysis and synthesis of laughter

    NASA Astrophysics Data System (ADS)

    Sundaram, Shiva; Narayanan, Shrikanth

    2004-10-01

    There is much enthusiasm in the text-to-speech community for synthesis of emotional and natural speech. One idea being proposed is to include emotion-dependent paralinguistic cues during synthesis to convey emotions effectively. This requires modeling and synthesis techniques of various cues for different emotions. Motivated by this, a technique to synthesize human laughter is proposed. Laughter is a complex mechanism of expression and has high variability in terms of types and usage in human-human communication. People have their own characteristic way of laughing. Laughter can be seen as a controlled/uncontrolled physiological process of a person resulting from an initial excitation in context. A parametric model based on damped simple harmonic motion is developed here to effectively capture these diversities while maintaining an individual's characteristics. The accuracy of the model is constrained by the limited laughter/speech data available from actual humans and by the need for ease of synthesis. Analysis techniques are also developed to determine the parameters of the model for a given individual or laughter type. Finally, the effectiveness of the model in capturing individual characteristics and naturalness, compared to real human laughter, has been analyzed. Through this the factors involved in individual human laughter and their importance can be better understood.
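
    A minimal sketch of a damped-simple-harmonic-motion envelope driving a train of laughter "syllables", in the spirit of the parametric model proposed above; all constants are illustrative, not parameters fitted to any speaker.

    ```python
    import numpy as np

    fs = 16000

    def laugh(n_syllables=4, rate=5.0, f0=220.0, damping=1.5):
        """Damped-SHM call amplitude: a(t) = exp(-d*t) * |sin(2*pi*rate*t)|,
        so successive "ha" syllables decay as the excitation dies out."""
        t = np.arange(0, n_syllables / rate, 1 / fs)
        envelope = np.exp(-damping * t) * np.abs(np.sin(2 * np.pi * rate * t))
        voicing = np.sin(2 * np.pi * f0 * t)   # crude glottal stand-in
        return envelope * voicing

    y = laugh()
    print(len(y), float(abs(y).max()))
    ```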

  12. Modeling Interactions between Speech Production and Perception: Speech Error Detection at Semantic and Phonological Levels and the Inner Speech Loop

    PubMed Central

    Kröger, Bernd J.; Crawford, Eric; Bekolay, Trevor; Eliasmith, Chris

    2016-01-01

    Production and comprehension of speech are closely interwoven. For example, the ability to detect an error in one's own speech, halt speech production, and finally correct the error can be explained by assuming an inner speech loop which continuously compares the word representations induced by production to those induced by perception at various cognitive levels (e.g., conceptual, word, or phonological levels). Because spontaneous speech errors are relatively rare, a picture naming and halt paradigm can be used to evoke them. In this paradigm, picture presentation (target word initiation) is followed by an auditory stop signal (distractor word) for halting speech production. The current study seeks to understand the neural mechanisms governing self-detection of speech errors by developing a biologically inspired neural model of the inner speech loop. The neural model is based on the Neural Engineering Framework (NEF) and consists of a network of about 500,000 spiking neurons. In the first experiment we induce simulated speech errors semantically and phonologically. In the second experiment, we simulate a picture naming and halt task. Target-distractor word pairs were balanced with respect to variation of phonological and semantic similarity. The results of the first experiment show that speech errors are successfully detected by a monitoring component in the inner speech loop. The results of the second experiment show that the model correctly reproduces human behavioral data on the picture naming and halt task. In particular, the halting rate in the production of target words was lower for phonologically similar words than for semantically similar or fully dissimilar distractor words. We thus conclude that the neural architecture proposed here to model the inner speech loop reflects important interactions in production and perception at phonological and semantic levels. PMID:27303287

  13. Modeling Interactions between Speech Production and Perception: Speech Error Detection at Semantic and Phonological Levels and the Inner Speech Loop.

    PubMed

    Kröger, Bernd J; Crawford, Eric; Bekolay, Trevor; Eliasmith, Chris

    2016-01-01

    Production and comprehension of speech are closely interwoven. For example, the ability to detect an error in one's own speech, halt speech production, and finally correct the error can be explained by assuming an inner speech loop which continuously compares the word representations induced by production to those induced by perception at various cognitive levels (e.g., conceptual, word, or phonological levels). Because spontaneous speech errors are relatively rare, a picture naming and halt paradigm can be used to evoke them. In this paradigm, picture presentation (target word initiation) is followed by an auditory stop signal (distractor word) for halting speech production. The current study seeks to understand the neural mechanisms governing self-detection of speech errors by developing a biologically inspired neural model of the inner speech loop. The neural model is based on the Neural Engineering Framework (NEF) and consists of a network of about 500,000 spiking neurons. In the first experiment we induce simulated speech errors semantically and phonologically. In the second experiment, we simulate a picture naming and halt task. Target-distractor word pairs were balanced with respect to variation of phonological and semantic similarity. The results of the first experiment show that speech errors are successfully detected by a monitoring component in the inner speech loop. The results of the second experiment show that the model correctly reproduces human behavioral data on the picture naming and halt task. In particular, the halting rate in the production of target words was lower for phonologically similar words than for semantically similar or fully dissimilar distractor words. We thus conclude that the neural architecture proposed here to model the inner speech loop reflects important interactions in production and perception at phonological and semantic levels. PMID:27303287

  14. Headphone localization of speech stimuli

    NASA Technical Reports Server (NTRS)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1991-01-01

    Recently, three dimensional acoustic display systems have been developed that synthesize virtual sound sources over headphones based on filtering by Head-Related Transfer Functions (HRTFs), the direction-dependent spectral changes caused primarily by the outer ears. Here, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with non-individualized HRTFs. About half of the subjects 'pulled' their judgements toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgements; 15 to 46 percent of stimuli were heard inside the head, with the shortest estimates near the median plane. The results suggest that most listeners can obtain useful azimuth information from speech stimuli filtered by non-individualized HRTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.
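
    A minimal sketch of the headphone-spatialization operation the study relies on: convolving a mono signal with a left/right pair of head-related impulse responses. The pure delay-and-gain HRIRs below are crude stand-ins for measured HRTFs.

    ```python
    import numpy as np
    from scipy.signal import fftconvolve

    fs = 44100
    mono = np.random.default_rng(1).standard_normal(fs)   # 1 s of noise "speech"

    def toy_hrir(delay_samples, gain, length=64):
        """Impulse response that only delays and attenuates (toy HRIR)."""
        h = np.zeros(length)
        h[delay_samples] = gain
        return h

    # Source off to the right: the left ear gets a later, quieter copy.
    h_left, h_right = toy_hrir(30, 0.5), toy_hrir(5, 1.0)
    binaural = np.stack([fftconvolve(mono, h_left), fftconvolve(mono, h_right)])
    print(binaural.shape)   # (2, N) stereo stream for headphone playback
    ```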

  15. Training speech pathologists through microtherapy.

    PubMed

    Irwin, R B

    1981-03-01

    Two microtraining methods were evaluated for training speech pathologists in the acquisition of skills utilized in treating misarticulations. Fifteen subjects in an introductory class in speech pathology were randomly placed in two groups (modeling, video replay, and counseling versus video replay and counseling). The training included reading a manual about the skills and a sequence of three teach sessions. The control group did not view the video model. According to the results, the model group made a greater gain score (M = 8.38) than the nonmodel group (M = 3.88). Significant gains were made for both experimental groups between teach sessions one and two, but no significant gains were made between the second and third teach sessions. PMID:7019270

  16. Apraxia of speech: an overview.

    PubMed

    Ogar, Jennifer; Slama, Hilary; Dronkers, Nina; Amici, Serena; Gorno-Tempini, Maria Luisa

    2005-12-01

    Apraxia of speech (AOS) is a motor speech disorder that can occur in the absence of aphasia or dysarthria. AOS has been the subject of some controversy since the disorder was first named and described by Darley and his Mayo Clinic colleagues in the 1960s. A recent revival of interest in AOS is due in part to the fact that it is often the first symptom of neurodegenerative diseases, such as primary progressive aphasia and corticobasal degeneration. This article will provide a brief review of terminology associated with AOS, its clinical hallmarks and neuroanatomical correlates. Current models of motor programming will also be addressed as they relate to AOS and finally, typical treatment strategies used in rehabilitating the articulation and prosody deficits associated with AOS will be summarized. PMID:16393756

  17. Language processing for speech understanding

    NASA Astrophysics Data System (ADS)

    Woods, W. A.

    1983-07-01

    This report considers language understanding techniques and control strategies that can be applied to provide higher-level support to aid in the understanding of spoken utterances. The discussion is illustrated with concepts and examples from the BBN speech understanding system, HWIM (Hear What I Mean). The HWIM system was conceived as an assistant to a travel budget manager, a system that would store information about planned and taken trips, travel budgets and their planning. The system was able to respond to commands and answer questions spoken into a microphone, and was able to synthesize spoken responses as output. HWIM was a prototype system used to drive speech understanding research. It used a phonetic-based approach, with no speaker training, a large vocabulary, and a relatively unconstraining English grammar. Discussed here is the control structure of the HWIM and the parsing algorithm used to parse sentences from the middle-out, using an ATN grammar.

  18. Prediction and imitation in speech

    PubMed Central

    Gambi, Chiara; Pickering, Martin J.

    2013-01-01

    It has been suggested that intra- and inter-speaker variability in speech are correlated. Interlocutors have been shown to converge on various phonetic dimensions. In addition, speakers imitate the phonetic properties of voices they are exposed to in shadowing, repetition, and even passive listening tasks. We review three theoretical accounts of speech imitation and convergence phenomena: (i) the Episodic Theory (ET) of speech perception and production (Goldinger, 1998); (ii) the Motor Theory (MT) of speech perception (Liberman and Whalen, 2000; Galantucci et al., 2006); (iii) Communication Accommodation Theory (CAT; Giles and Coupland, 1991; Giles et al., 1991). We argue that no account is able to explain all the available evidence. In particular, there is a need to integrate low-level, mechanistic accounts (like ET and MT), and higher-level accounts (like CAT). We propose that this is possible within the framework of an integrated theory of production and comprehension (Pickering and Garrod, 2013). Similarly to both ET and MT, this theory assumes parity between production and perception. Uniquely, however, it posits that listeners simulate speakers' utterances by computing forward-model predictions at many different levels, which are then compared to the incoming phonetic input. In our account phonetic imitation can be achieved via the same mechanism that is responsible for sensorimotor adaptation; i.e., the correction of prediction errors. In addition, the model assumes that the degree to which sensory prediction errors lead to motor adjustments is context-dependent. The notion of context subsumes both the preceding linguistic input and non-linguistic attributes of the situation (e.g., the speaker's and listener's social identities, their conversational roles, the listener's intention to imitate). PMID:23801971

  19. The Levels of Speech Usage Rating Scale: Comparison of Client Self-Ratings with Speech Pathologist Ratings

    ERIC Educational Resources Information Center

    Gray, Christina; Baylor, Carolyn; Eadie, Tanya; Kendall, Diane; Yorkston, Kathryn

    2012-01-01

    Background: The term "speech usage" refers to what people want or need to do with their speech to fulfil the communication demands in their life roles. Speech-language pathologists (SLPs) need to know about clients' speech usage to plan appropriate interventions to meet their life participation goals. The Levels of Speech Usage is a categorical…

  20. Primary Progressive Aphasia and Apraxia of Speech

    PubMed Central

    Jung, Youngsin; Duffy, Joseph R.; Josephs, Keith A.

    2014-01-01

    Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: non-fluent/agrammatic, semantic, and logopenic variants of primary progressive aphasia. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. Here, we review clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech. The distinctions among these disorders will be crucial since accurate diagnosis will be important from a prognostic and therapeutic standpoint. PMID:24234355

  1. Giving Speech a Hand: Gesture Modulates Activity in Auditory Cortex During Speech Perception

    PubMed Central

    Hubbard, Amy L.; Wilson, Stephen M.; Callan, Daniel E.; Dapretto, Mirella

    2008-01-01

    Viewing hand gestures during face-to-face communication affects speech perception and comprehension. Despite the visible role played by gesture in social interactions, relatively little is known about how the brain integrates hand gestures with co-occurring speech. Here we used functional magnetic resonance imaging (fMRI) and an ecologically valid paradigm to investigate how beat gesture – a fundamental type of hand gesture that marks speech prosody – might impact speech perception at the neural level. Subjects underwent fMRI while listening to spontaneously-produced speech accompanied by beat gesture, nonsense hand movement, or a still body; as additional control conditions, subjects also viewed beat gesture, nonsense hand movement, or a still body all presented without speech. Validating behavioral evidence that gesture affects speech perception, bilateral nonprimary auditory cortex showed greater activity when speech was accompanied by beat gesture than when speech was presented alone. Further, the left superior temporal gyrus/sulcus showed stronger activity when speech was accompanied by beat gesture than when speech was accompanied by nonsense hand movement. Finally, the right planum temporale was identified as a putative multisensory integration site for beat gesture and speech (i.e., here activity in response to speech accompanied by beat gesture was greater than the summed responses to speech alone and beat gesture alone), indicating that this area may be pivotally involved in synthesizing the rhythmic aspects of both speech and gesture. Taken together, these findings suggest a common neural substrate for processing speech and gesture, likely reflecting their joint communicative role in social interactions. PMID:18412134

  2. Integrated speech enhancement for functional MRI environment.

    PubMed

    Pathak, Nishank; Milani, Ali A; Panahi, Issa; Briggs, Richard

    2009-01-01

    This paper presents an integrated speech enhancement (SE) method for the noisy MRI environment. We show that the performance of the SE system improves considerably when the speech signal, dominated by MRI acoustic noise at very low SNR, is enhanced in two successive stages: two-channel SE methods followed by a single-channel post-processing SE algorithm. Actual noisy MRI speech data are used in our experiments, demonstrating the improved performance of the proposed SE method. PMID:19964964
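
    The abstract names the stages but not a specific post-processing algorithm, so as a generic stand-in for a single-channel SE stage, here is a minimal spectral-subtraction sketch (frame sizes and the spectral floor are illustrative):

        import numpy as np

        def spectral_subtract(noisy, frame=512, hop=256, noise_frames=10):
            """Estimate the noise magnitude spectrum from the first few frames
            (assumed noise-only), subtract it from each frame's magnitude with
            a spectral floor, and resynthesize by hann overlap-add."""
            win = np.hanning(frame)
            n_seg = 1 + (len(noisy) - frame) // hop
            spec = np.array([np.fft.rfft(win * noisy[i * hop:i * hop + frame])
                             for i in range(n_seg)])
            noise_mag = np.abs(spec[:noise_frames]).mean(axis=0)
            mag = np.maximum(np.abs(spec) - noise_mag, 0.05 * np.abs(spec))
            clean = mag * np.exp(1j * np.angle(spec))  # keep the noisy phase
            out = np.zeros(len(noisy))
            for i, s in enumerate(clean):
                out[i * hop:i * hop + frame] += np.fft.irfft(s, frame)
            return out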

  3. Investigating Holistic Measures of Speech Prosody

    ERIC Educational Resources Information Center

    Cunningham, Dana Aliel

    2012-01-01

    Speech prosody is a multi-faceted dimension of speech which can be measured and analyzed in a variety of ways. In this study, the speech prosody of Mandarin L1 speakers, English L2 speakers, and English L1 speakers was assessed by trained raters who listened to sound clips of the speakers responding to a graph prompt and reading a short passage.…

  4. Construction of a Rated Speech Corpus of L2 Learners' Spontaneous Speech

    ERIC Educational Resources Information Center

    Yoon, Su-Youn; Pierce, Lisa; Huensch, Amanda; Juul, Eric; Perkins, Samantha; Sproat, Richard; Hasegawa-Johnson, Mark

    2009-01-01

    This work reports on the construction of a rated database of spontaneous speech produced by second language (L2) learners of English. Spontaneous speech was collected from 28 L2 speakers representing six language backgrounds and five different proficiency levels. Speech was elicited using formats similar to that of the TOEFL iBT and the Speaking…

  5. Speech and Language Skills of Parents of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Lewis, Barbara A.; Freebairn, Lisa A.; Hansen, Amy J.; Miscimarra, Lara; Iyengar, Sudha K.; Taylor, H. Gerry

    2007-01-01

    Purpose: This study compared parents with histories of speech sound disorders (SSD) to parents without known histories on measures of speech sound production, phonological processing, language, reading, and spelling. Familial aggregation for speech and language disorders was also examined. Method: The participants were 147 parents of children with…

  6. Exploring the Role of Brain Oscillations in Speech Perception in Noise: Intelligibility of Isochronously Retimed Speech

    PubMed Central

    Aubanel, Vincent; Davis, Chris; Kim, Jeesun

    2016-01-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximize processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioral experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.
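
    The retiming manipulation can be pictured as mapping anchor points onto an evenly spaced grid and warping the waveform between them. A rough numpy sketch under that assumption (the actual stimuli would have been built with a pitch-preserving method such as PSOLA; plain resampling is shown only to make the anchor mapping explicit):

        import numpy as np

        def retime_isochronous(signal, sr, anchors):
            """Time-warp `signal` so anchor times land on an isochronous grid.
            Assumes `anchors` (seconds) are strictly increasing and lie
            strictly inside the signal's duration."""
            anchors = np.asarray(anchors, dtype=float)
            dur = len(signal) / sr
            grid = np.linspace(anchors[0], anchors[-1], len(anchors))  # even targets
            out_t = np.arange(int(dur * sr)) / sr                      # output times
            # For each output instant, find the input instant to read from.
            in_t = np.interp(out_t, np.r_[0.0, grid, dur], np.r_[0.0, anchors, dur])
            return np.interp(in_t, np.arange(len(signal)) / sr, signal)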

  7. Private and Inner Speech and the Regulation of Social Speech Communication

    ERIC Educational Resources Information Center

    San Martin Martinez, Conchi; Boada i Calbet, Humbert; Feigenbaum, Peter

    2011-01-01

    To further investigate the possible regulatory role of private and inner speech in the context of referential social speech communications, a set of clear and systematically applied measures is needed. This study addresses this need by introducing a rigorous method for identifying private speech and certain sharply defined instances of inaudible…

  8. A MANUAL ON SPEECH THERAPY FOR PARENTS' USE WITH CHILDREN WHO HAVE MINOR SPEECH PROBLEMS.

    ERIC Educational Resources Information Center

    OGG, HELEN LOREE

    A MANUAL, TO PROVIDE PARENTS WITH AN UNDERSTANDING OF THE WORK OF THE SPEECH TEACHER AND WITH METHODS TO CORRECT THE POOR SPEECH HABITS OF THEIR CHILDREN IS PRESENTED. AREAS INCLUDE THE ORGANS OF SPEECH, WHERE THEY SHOULD BE PLACED TO MAKE EACH SOUND, AND HOW THEY SHOULD OR SHOULD NOT MOVE. EASY DIRECTIONS ARE GIVEN FOR PRODUCING THE MOST…

  9. Spotlight on Speech Codes 2012: The State of Free Speech on Our Nation's Campuses

    ERIC Educational Resources Information Center

    Foundation for Individual Rights in Education (NJ1), 2012

    2012-01-01

    The U.S. Supreme Court has called America's colleges and universities "vital centers for the Nation's intellectual life," but the reality today is that many of these institutions severely restrict free speech and open debate. Speech codes--policies prohibiting student and faculty speech that would, outside the bounds of campus, be protected by the…

  10. Spotlight on Speech Codes 2007: The State of Free Speech on Our Nation's Campuses

    ERIC Educational Resources Information Center

    Foundation for Individual Rights in Education (NJ1), 2007

    2007-01-01

    Last year, the Foundation for Individual Rights in Education (FIRE) conducted its first-ever comprehensive study of restrictions on speech at America's colleges and universities, "Spotlight on Speech Codes 2006: The State of Free Speech on our Nation's Campuses." In light of the essentiality of free expression to a truly liberal education, its…

  11. The Practical Philosophy of Communication Ethics and Free Speech as the Foundation for Speech Communication.

    ERIC Educational Resources Information Center

    Arnett, Ronald C.

    1990-01-01

    Argues that communication ethics and free speech are the foundation for understanding the field of speech communication and its proper positioning in the larger array of academic disciplines. Argues that speech communication as a discipline can be traced back to a "practical philosophical" foundation detailed by Aristotle. (KEH)

  12. Cleft Audit Protocol for Speech (CAPS-A): A Comprehensive Training Package for Speech Analysis

    ERIC Educational Resources Information Center

    Sell, D.; John, A.; Harding-Bell, A.; Sweeney, T.; Hegarty, F.; Freeman, J.

    2009-01-01

    Background: The previous literature has largely focused on speech analysis systems and ignored process issues, such as the nature of adequate speech samples, data acquisition, recording and playback. Although there has been recognition of the need for training on tools used in speech analysis associated with cleft palate, little attention has been…

  13. DELAYED SPEECH AND LANGUAGE DEVELOPMENT, PRENTICE-HALL FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

    ERIC Educational Resources Information Center

    WOOD, NANCY E.

    WRITTEN FOR SPEECH PATHOLOGY STUDENTS AND PROFESSIONAL WORKERS, THE BOOK BEGINS BY DEFINING LANGUAGE AND SPEECH AND TRACING THE DEVELOPMENT OF SPEECH AND LANGUAGE FROM THE INFANT THROUGH THE 4-YEAR OLD. CAUSAL FACTORS OF DELAYED DEVELOPMENT ARE GIVEN, INCLUDING CENTRAL NERVOUS SYSTEM IMPAIRMENT AND ASSOCIATED BEHAVIORAL CLUES AND LANGUAGE…

  14. Speech rate effects on the processing of conversational speech across the adult life span.

    PubMed

    Koch, Xaver; Janse, Esther

    2016-04-01

    This study investigates the effect of speech rate on spoken word recognition across the adult life span. Contrary to previous studies, conversational materials with a natural variation in speech rate were used rather than lab-recorded stimuli that are subsequently artificially time-compressed. It was investigated whether older adults' speech recognition is more adversely affected by increased speech rate compared to younger and middle-aged adults, and which individual listener characteristics (e.g., hearing, fluid cognitive processing ability) predict the size of the speech rate effect on recognition performance. In an eye-tracking experiment, participants indicated with a mouse-click which visually presented words they recognized in a conversational fragment. Click response times, gaze, and pupil size data were analyzed. As expected, click response times and gaze behavior were affected by speech rate, indicating that word recognition is more difficult if speech rate is faster. Contrary to earlier findings, increased speech rate affected the age groups to the same extent. Fluid cognitive processing ability predicted general recognition performance, but did not modulate the speech rate effect. These findings emphasize that earlier results of age by speech rate interactions mainly obtained with artificially speeded materials may not generalize to speech rate variation as encountered in conversational speech. PMID:27106310

  15. Speech coding research at Bell Laboratories

    NASA Astrophysics Data System (ADS)

    Atal, Bishnu S.

    2001-05-01

    The field of speech coding is now over 70 years old. It started from the desire to transmit voice signals over telegraph cables. The availability of digital computers in the mid 1960s made it possible to test complex speech coding algorithms rapidly. The introduction of linear predictive coding (LPC) started a new era in speech coding. The fundamental philosophy of speech coding went through a major shift, resulting in a new generation of low bit rate speech coders, such as multi-pulse and code-excited LPC. The semiconductor revolution produced faster and faster DSP chips and made linear predictive coding practical. Code-excited LPC has become the method of choice for low bit rate speech coding applications and is used in most voice transmission standards for cell phones. Digital speech communication is rapidly evolving from circuit-switched to packet-switched networks to provide integrated transmission of voice, data, and video signals. The new communication environment is also moving the focus of speech coding research from compression to low cost, reliable, and secure transmission of voice signals on digital networks, and provides the motivation for creating a new class of speech coders suitable for future applications.
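
    At the heart of LPC is the idea that each speech sample is predicted as a weighted sum of the previous p samples, with weights chosen to minimize the prediction error. A minimal autocorrelation-method sketch using the Levinson-Durbin recursion (illustrative, not any particular Bell Labs implementation):

        import numpy as np

        def lpc(frame, order):
            """Return coefficients of A(z) = 1 + a1 z^-1 + ... + ap z^-p
            for a (windowed) speech frame, via Levinson-Durbin."""
            # Autocorrelation lags r[0..order].
            r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
            a = np.zeros(order + 1)
            a[0] = 1.0
            err = r[0]
            for i in range(1, order + 1):
                # Reflection coefficient for this order.
                k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
                a[1:i] = a[1:i] + k * a[i - 1:0:-1]
                a[i] = k
                err *= (1.0 - k * k)  # residual prediction-error energy
            return a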

  16. Speech Enhancement based on Compressive Sensing Algorithm

    NASA Astrophysics Data System (ADS)

    Sulong, Amart; Gunawan, Teddy S.; Khalifa, Othman O.; Chebil, Jalel

    2013-12-01

    Various methods for speech enhancement have been proposed over the years, with accurate designs focusing mainly on quality and intelligibility. This paper proposes a novel speech enhancement method based on compressive sensing (CS), a paradigm for acquiring signals that differs fundamentally from uniform-rate digitization followed by compression, the approach often used for transmission or storage. CS reduces the number of degrees of freedom of a sparse or compressible signal by permitting only certain configurations of large and zero/small coefficients, and by using structured sparsity models. CS therefore provides a way of reconstructing a compressed version of the speech in the original signal from only a small number of linear, non-adaptive measurements. The overall algorithm is evaluated for speech quality using informal listening tests and the Perceptual Evaluation of Speech Quality (PESQ) measure. Experimental results show that the CS algorithm performs well over a wide range of speech tests, giving good noise suppression relative to conventional approaches without obvious degradation of speech quality.
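
    As an illustration of the reconstruction step CS relies on (the abstract does not specify the solver), here is a minimal iterative soft-thresholding (ISTA) sketch for recovering a sparse vector from a small number of linear measurements; the sizes and random measurement matrix are placeholders:

        import numpy as np

        def ista(y, Phi, lam=0.1, n_iter=200):
            """Recover sparse x from y = Phi @ x by minimizing
            ||y - Phi x||^2 / 2 + lam * ||x||_1 with iterative soft thresholding."""
            L = np.linalg.norm(Phi, 2) ** 2      # Lipschitz constant of the gradient
            x = np.zeros(Phi.shape[1])
            for _ in range(n_iter):
                grad = Phi.T @ (Phi @ x - y)
                z = x - grad / L
                x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
            return x

        # Toy usage: an 8-sparse length-256 signal observed with 64 random
        # measurements (placeholder dimensions).
        rng = np.random.default_rng(0)
        x_true = np.zeros(256)
        x_true[rng.choice(256, 8, replace=False)] = rng.normal(size=8)
        Phi = rng.normal(size=(64, 256)) / np.sqrt(64)
        x_hat = ista(Phi @ x_true, Phi, lam=0.01)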

  17. Speech and Language Disorders in the School Setting

    MedlinePlus

    Frequently Asked Questions: Speech and Language Disorders in the School Setting. What types of speech and language disorders affect school-age children? Do speech-language ...

  18. Development of a concatenated Reed-Solomon/Viterbi FEC combined modem and its field test via 14/11 GHz satellite

    NASA Astrophysics Data System (ADS)

    Fujino, T.; Moritani, Y.; Miyake, M.; Murakami, K.; Shibuya, A.

    The development of a concatenated FEC (forward error correcting) codec and its associated PSK (phase shift keying) modem aimed at low-bit-rate satellite communications is discussed, and their performance in a field test in the 14/11-GHz band using an INTELSAT-V satellite is assessed. The FEC codec consists of a convolutional encoder with a Viterbi decoder for inner coding and a Reed-Solomon encoder with a Euclid decoder for outer coding. A coding gain of 7.5 dB was achieved at a bit error rate (BER) of 10^-6 relative to the uncoded performance, and virtually error-free transmission was achieved at a carrier-to-noise (C/N) ratio of 0 dB. Tests transmitting a voice signal and G3- and G4-facsimile signals revealed that the quality of both received facsimile signals depended strongly on event error probabilities rather than on BERs. It is concluded that the developed FEC combined modem is useful for reliable data transmission in low C/N environments.
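
    To illustrate the inner code in such a concatenated scheme, here is a minimal rate-1/2, constraint-length-3 convolutional encoder using the textbook (7,5) octal generators (not necessarily those of the codec described above); on a real link, Viterbi decoding of the inner code and Reed-Solomon coding on the outside would wrap around this:

        def conv_encode(bits, g1=0b111, g2=0b101, k=3):
            """Rate-1/2 convolutional encoder; state holds the last k-1 input bits."""
            state = 0
            out = []
            for b in bits:
                reg = (b << (k - 1)) | state              # current bit + k-1 past bits
                out.append(bin(reg & g1).count("1") % 2)  # parity vs generator 1
                out.append(bin(reg & g2).count("1") % 2)  # parity vs generator 2
                state = reg >> 1
            return out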

  19. Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

    ERIC Educational Resources Information Center

    Maas, Edwin; Mailend, Marja-Liisa

    2012-01-01

    Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…

  20. Perceptual centres in speech - an acoustic analysis

    NASA Astrophysics Data System (ADS)

    Scott, Sophie Kerttu

    Perceptual centres, or P-centres, represent the perceptual moments of occurrence of acoustic signals - the 'beat' of a sound. P-centres underlie the perception and production of rhythm in perceptually regular speech sequences, and have been modelled in both speech and non-speech (music) domains. The three aims of this thesis were (a) to test current P-centre models to determine which best accounted for the experimental data; (b) to identify a candidate parameter onto which P-centres could be mapped (a local approach), as opposed to previous global models that rely on the whole signal to determine the P-centre; and (c) to develop a model of P-centre location that could be applied to both speech and non-speech signals. The first aim was investigated in a series of experiments examining (a) speech from different speakers, to determine whether the models could account for between-speaker variation; (b) whether altering the amplitude-time plot of a speech signal affects its P-centre; and (c) whether increasing the amplitude at the offset of a speech signal alters P-centres in the production and perception of speech. The second aim was pursued by (a) manipulating the rise time of different speech signals to determine whether the P-centre was affected, and whether the type of speech sound ramped affected the P-centre shift; (b) manipulating the rise time and decay time of a synthetic vowel to determine whether onset alterations had more effect on the P-centre than offset manipulations; and (c) testing whether vowel duration affected the P-centre when other attributes (amplitude, spectral content) were held constant. The third aim - modelling P-centres - was based on these results. The Frequency-dependent Amplitude Increase Model of P-centre location (FAIM) was developed using a modelling protocol, the APU GammaTone Filterbank, and speech from different speakers. The P-centres of the stimulus corpus were highly predicted by attributes of

  1. Speech perception as an active cognitive process

    PubMed Central

    Heald, Shannon L. M.; Nusbaum, Howard C.

    2014-01-01

    One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process, which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing, such as masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception, including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or therapy. PMID

  2. Prediction and constraint in audiovisual speech perception.

    PubMed

    Peelle, Jonathan E; Sommers, Mitchell S

    2015-07-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration

  3. Prediction and constraint in audiovisual speech perception

    PubMed Central

    Peelle, Jonathan E.; Sommers, Mitchell S.

    2015-01-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported

  4. Monaural speech intelligibility and detection in maskers with varying amounts of spectro-temporal speech features.

    PubMed

    Schubotz, Wiebke; Brand, Thomas; Kollmeier, Birger; Ewert, Stephan D

    2016-07-01

    Speech intelligibility is strongly affected by the presence of maskers. Depending on the spectro-temporal structure of the masker and its similarity to the target speech, different masking aspects can occur, typically referred to as energetic, amplitude modulation, and informational masking. In this study, speech intelligibility and speech detection were measured in maskers that vary systematically in the time-frequency domain from steady-state noise to a single interfering talker. Male and female target speech was used in combination with maskers based on speech from the same or a different gender. Observed data were compared to predictions of the speech intelligibility index, the extended speech intelligibility index, the multi-resolution speech-based envelope-power-spectrum model, and the short-time objective intelligibility measure. The different models served as analysis tools to help distinguish between the masking aspects. The comparison shows that overall masking can to a large extent be explained by short-term energetic masking; however, the other masking aspects (amplitude modulation and informational masking) influence speech intelligibility as well. Additionally, all models showed considerable deviations from the data, so the current study provides a benchmark for further evaluation of speech prediction models. PMID:27475175
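
    The simplest of the predictors compared above, the speech intelligibility index, is essentially a band-importance-weighted average of clipped band SNRs. A rough schematic of that idea, assuming uniform weights and a generic [-15, +15] dB audibility range (the real SII standard specifies band-importance functions and several corrections omitted here):

        import numpy as np

        def sii_like(speech_band_levels_db, noise_band_levels_db, weights=None):
            """Crude SII-style index: weighted mean band audibility in [0, 1]."""
            snr = np.asarray(speech_band_levels_db) - np.asarray(noise_band_levels_db)
            audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)  # -15 dB -> 0, +15 dB -> 1
            w = np.ones_like(audibility) if weights is None else np.asarray(weights)
            return float(np.sum(w * audibility) / np.sum(w))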

  5. Speech discrimination after early exposure to pulsed-noise or speech

    PubMed Central

    Ranasinghe, Kamalini G.; Carraway, Ryan S.; Borland, Michael S.; Moreno, Nicole A.; Hanacik, Elizabeth A.; Miller, Robert S.; Kilgard, Michael P

    2012-01-01

    Early experience of structured inputs and complex sound features generate lasting changes in tonotopy and receptive field properties of primary auditory cortex (A1). In this study we tested whether these changes are severe enough to alter neural representations and behavioral discrimination of speech. We exposed two groups of rat pups during the critical period of auditory development to pulsed noise or speech. Both groups of rats were trained to discriminate speech sounds when they were young adults, and anesthetized neural responses were recorded from A1. The representation of speech in A1 and behavioral discrimination of speech remained robust to altered spectral and temporal characteristics of A1 neurons after pulsed-noise exposure. Exposure to passive speech during early development provided no added advantage in speech sound processing. Speech training increased A1 neuronal firing rate for speech stimuli in naïve rats, but did not increase responses in rats that experienced early exposure to pulsed noise or speech. Our results suggest that speech sound processing is resistant to changes in simple neural response properties caused by manipulating early acoustic environment. PMID:22575207

  6. Open Microphone Speech Understanding: Correct Discrimination Of In Domain Speech

    NASA Technical Reports Server (NTRS)

    Hieronymus, James; Aist, Greg; Dowding, John

    2006-01-01

    An ideal spoken dialogue system listens continually and determines which utterances were spoken to it, understands them, and responds appropriately while ignoring the rest. This paper outlines a simple method for achieving this goal, which involves trading a slightly higher false rejection rate of in-domain utterances for a higher correct rejection rate of Out of Domain (OOD) utterances. The system recognizes semantic entities specified by a unification grammar which is specialized by Explanation Based Learning (EBL), so that it only uses rules seen in the training data. The resulting grammar has probabilities assigned to each construct so that overgeneralizations are not a problem, and the system only recognizes utterances which reduce to a valid logical form that has meaning for the system, rejecting the rest. A class N-gram grammar has been trained on the same training data; it gives good recognition performance and offers good Out of Domain discrimination when combined with the semantic analysis. The resulting systems were tested on a Space Station Robot Dialogue Speech Database and a subset of the OGI conversational speech database. Both systems run in real time on a PC laptop, and the present performance allows continuous listening with an acceptably low false acceptance rate. This type of open microphone system has been used in the Clarissa procedure reading and navigation spoken dialogue system which is being tested on the International Space Station.

  7. Speech Perception and Working Memory in Children with Residual Speech Errors: A Case Study Analysis.

    PubMed

    Cabbage, Kathryn L; Farquharson, Kelly; Hogan, Tiffany P

    2015-11-01

    Some children with residual deficits in speech production also display characteristics of dyslexia; however, the causes of these disorders--in isolation or comorbidly--remain unknown. Presently, the role of phonological representations is an important construct for considering how the underlying system of phonology functions. In particular, two related skills--speech perception and phonological working memory--may provide insight into the nature of phonological representations. This study provides an exploratory investigation into the profiles of three 9-year-old children: one with residual speech errors, one with residual speech errors and dyslexia, and one who demonstrated typical, age-appropriate speech sound production and reading skills. We provide an in-depth examination of their relative abilities in the areas of speech perception, phonological working memory, vocabulary, and word reading. Based on these preliminary explorations, we suggest implications for the assessment and treatment of children with residual speech errors and/or dyslexia. PMID:26458199

  8. Method and apparatus for obtaining complete speech signals for speech recognition applications

    NASA Technical Reports Server (NTRS)

    Abrash, Victor (Inventor); Cesari, Federico (Inventor); Franco, Horacio (Inventor); George, Christopher (Inventor); Zheng, Jing (Inventor)

    2009-01-01

    The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
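
    The core of the method is a ring buffer that is always recording, so frames from just before a push-to-talk command can still be recovered. A minimal sketch of that idea (class and parameter names are hypothetical, not taken from the patent):

        from collections import deque

        class FrameRing:
            """Continuously record audio frames; on a user command, hand back
            the most recent `pre` frames so speech that started before the
            command is not lost (a sketch of the idea, not the patented
            implementation)."""
            def __init__(self, capacity_frames):
                self.buf = deque(maxlen=capacity_frames)  # old frames drop automatically

            def push(self, frame):
                self.buf.append(frame)

            def snapshot(self, pre):
                return list(self.buf)[-pre:]  # frames preceding the command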

  9. Speech levels in meeting rooms and the probability of speech privacy problems.

    PubMed

    Bradley, J S; Gover, B N

    2010-02-01

    Speech levels were measured in a large number of meetings and meeting rooms to better understand their influence on the speech privacy of closed meeting rooms. The effects of room size and number of occupants on average speech levels, for meetings with and without sound amplification, were investigated. The characteristics of the statistical variations of speech levels were determined in terms of speech levels measured over 10 s intervals at locations inside, but near the periphery of the meeting rooms. A procedure for predicting the probability of speech being audible or intelligible at points outside meeting rooms is proposed. It is based on the statistics of meeting room speech levels, in combination with the sound insulation characteristics of the room and the ambient noise levels at locations outside the room. PMID:20136204
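
    The proposed procedure amounts to an exceedance-probability calculation. Assuming, for illustration, that 10 s speech levels at the room boundary are approximately normally distributed, the probability that speech transmitted through the boundary rises above the outside ambient level can be sketched as follows (the margin parameter is a placeholder for an audibility or intelligibility criterion):

        from scipy.stats import norm

        def p_speech_audible(mu_speech_db, sigma_speech_db, insulation_db,
                             ambient_db, audibility_margin_db=0.0):
            """Probability that received speech level exceeds the ambient noise
            under a normal model of 10 s meeting-room speech levels."""
            received_mean = mu_speech_db - insulation_db   # level after the wall
            threshold = ambient_db + audibility_margin_db  # what it must exceed
            return float(norm.sf(threshold, loc=received_mean, scale=sigma_speech_db))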

  10. Speech recognition technology: a critique.

    PubMed Central

    Levinson, S E

    1995-01-01

    This paper introduces the session on advanced speech recognition technology. The two papers comprising this session argue that current technology yields a performance that is only an order of magnitude in error rate away from human performance and that incremental improvements will bring us to that desired level. I argue that, to the contrary, present performance is far removed from human performance and a revolution in our thinking is required to achieve the goal. It is further asserted that to bring about the revolution more effort should be expended on basic research and less on trying to prematurely commercialize a deficient technology. PMID:7479808

  11. Jam-resistant speech encoding

    NASA Astrophysics Data System (ADS)

    Poole, M. A.; Rifkin, R.

    1983-06-01

    This report describes techniques that provide increased jam resistance for digitized speech. Methods for increasing the jam resistance of pulse code modulated data are analyzed and evaluated in listener tests. Special emphasis is placed on new voice encoding approaches that take advantage of a spread spectrum system with a variable (or multiple)-data-rate/variable (or multiple)-AJ capability. Methods for matching a source to a channel in a jamming environment are investigated. Several techniques that provide about a 4 dB increase in jam resistance have been identified.

  12. [Electrographic Correlations of Inner Speech].

    PubMed

    Kiroy, V N; Bakhtin, O M; Minyaeva, N R; Lazurenko, D M; Aslanyan, E V; Kiroy, R I

    2015-01-01

    To detect EEG patterns specific to verbal performance, gamma activity was investigated. A technique was created that allows the subject to initiate the mental pronunciation of words and phrases (inner speech). Wavelet analysis of the EEG demonstrated experimentally that the preparation and execution stages are related to specific spatio-temporal patterns in the 64-68 Hz frequency range. Reliable reproduction and efficient identification of such patterns could solve the fundamental problem of forming an alphabet of control commands for brain-computer interface and brain-to-brain interface systems. PMID:26860004
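
    A narrow-band gamma analysis of the kind described can be approximated by convolving one EEG channel with a complex Morlet wavelet; a single-channel sketch (the center frequency and cycle count are illustrative, and the 64-68 Hz band would be a small loop over center frequencies):

        import numpy as np

        def gamma_power(eeg, fs, f0=66.0, n_cycles=7):
            """Time-resolved power at f0 Hz via a complex Morlet wavelet."""
            sigma_t = n_cycles / (2 * np.pi * f0)              # wavelet temporal width
            t = np.arange(-4 * sigma_t, 4 * sigma_t, 1 / fs)
            wavelet = np.exp(2j * np.pi * f0 * t) * np.exp(-t**2 / (2 * sigma_t**2))
            wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))   # energy normalization
            analytic = np.convolve(eeg, wavelet, mode="same")
            return np.abs(analytic) ** 2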

  13. General-Purpose Monitoring during Speech Production

    ERIC Educational Resources Information Center

    Ries, Stephanie; Janssen, Niels; Dufau, Stephane; Alario, F.-Xavier; Burle, Boris

    2011-01-01

    The concept of "monitoring" refers to our ability to control our actions on-line. Monitoring involved in speech production is often described in psycholinguistic models as an inherent part of the language system. We probed the specificity of speech monitoring in two psycholinguistic experiments where electroencephalographic activities were…

  14. The Oral Speech Mechanism Screening Examination (OSMSE).

    ERIC Educational Resources Information Center

    St. Louis, Kenneth O.; Ruscello, Dennis M.

    Although speech-language pathologists are expected to be able to administer and interpret oral examinations, there are currently no screening tests available that provide careful administration instructions and data for intra-examiner and inter-examiner reliability. The Oral Speech Mechanism Screening Examination (OSMSE) is designed primarily for…

  15. School Principal Speech about Fiscal Mismanagement

    ERIC Educational Resources Information Center

    Hassenpflug, Ann

    2015-01-01

    A review of two recent federal court cases concerning school principals who experienced adverse job actions after they engaged in speech about fiscal misconduct by other employees indicates that the courts found that the principal's speech was made as part of his or her job duties and was not protected by the First Amendment.

  16. Anatomy and Physiology of the Speech Mechanism.

    ERIC Educational Resources Information Center

    Sheets, Boyd V.

    This monograph on the anatomical and physiological aspects of the speech mechanism stresses the importance of a general understanding of the process of verbal communication. Contents include "Positions of the Body,""Basic Concepts Linked with the Speech Mechanism,""The Nervous System,""The Respiratory System--Sound-Power Source,""The…

  17. Philosophy of Research in Motor Speech Disorders

    ERIC Educational Resources Information Center

    Weismer, Gary

    2006-01-01

    The primary objective of this position paper is to assess the theoretical and empirical support that exists for the Mayo Clinic view of motor speech disorders in general, and for oromotor, nonverbal tasks as a window to speech production processes in particular. Literature both in support of and against the Mayo clinic view and the associated use…

  18. Speech neglect: A strange educational blind spot

    NASA Astrophysics Data System (ADS)

    Harris, Katherine Safford

    2005-09-01

    Speaking is universally acknowledged as an important human talent, yet as a topic of educated common knowledge, it is peculiarly neglected. Partly, this is a consequence of the relatively recent growth of research on speech perception, production, and development, but also a function of the way that information is sliced up by undergraduate colleges. Although the basic acoustic mechanism of vowel production was known to Helmholtz, the ability to view speech production as a physiological event is evolving even now with such techniques as fMRI. Intensive research on speech perception emerged only in the early 1930s as Fletcher and the engineers at Bell Telephone Laboratories developed the transmission of speech over telephone lines. The study of speech development was revolutionized by the papers of Eimas and his colleagues on speech perception in infants in the 1970s. Dissemination of knowledge in these fields is the responsibility of no single academic discipline. It forms a center for two departments, Linguistics, and Speech and Hearing, but in the former, there is a heavy emphasis on other aspects of language than speech and, in the latter, a focus on clinical practice. For psychologists, it is a rather minor component of a very diverse assembly of topics. I will focus on these three fields in proposing possible remedies.

  19. The Lombard Effect on Alaryngeal Speech.

    ERIC Educational Resources Information Center

    Zeine, Lina; Brandt, John F.

    1988-01-01

    The study investigated the Lombard effect (evoking increased speech intensity by applying masking noise to ears of talker) on the speech of esophageal talkers, artificial larynx users, and normal speakers. The noise condition produced the highest intensity increase in the esophageal speakers. (Author/DB)

  20. Tampa Bay International Business Summit Keynote Speech

    NASA Technical Reports Server (NTRS)

    Clary, Christina

    2011-01-01

    A keynote speech outlining the importance of collaboration and diversity in the workplace. The 20-minute speech describes NASA's challenges and accomplishments over the years and what lies ahead. Topics include: diversity and inclusion principles, international cooperation, Kennedy Space Center planning and development, opportunities for cooperation, and NASA's vision for exploration.