Sample records for speech error patterns

  1. Preschool speech error patterns predict articulation and phonological awareness outcomes in children with histories of speech sound disorders.

    PubMed

    Preston, Jonathan L; Hull, Margaret; Edwards, Mary Louise

    2013-05-01

    To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up at age 8;3. The frequency of occurrence of preschool distortion errors, typical substitution and syllable structure errors, and atypical substitution and syllable structure errors was used to predict later speech sound production, PA, and literacy outcomes. Group averages revealed below-average school-age articulation scores and low-average PA but age-appropriate reading and spelling. Preschool speech error patterns were related to school-age outcomes. Children for whom >10% of their speech sound errors were atypical had lower PA and literacy scores at school age than children who produced <10% atypical errors. Preschoolers who produced more distortion errors were likely to have lower school-age articulation scores than preschoolers who produced fewer distortion errors. Different preschool speech error patterns predict different school-age clinical outcomes. Many atypical speech sound errors in preschoolers may be indicative of weak phonological representations, leading to long-term PA weaknesses. Preschoolers' distortions may be resistant to change over time, leading to persisting speech sound production problems.

  2. Preschool speech error patterns predict articulation and phonological awareness outcomes in children with histories of speech sound disorders

    PubMed Central

    Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise

    2012-01-01

    Purpose To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost four years later. Method Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 and followed up at 8;3. The frequency of occurrence of preschool distortion errors, typical substitution and syllable structure errors, and atypical substitution and syllable structure errors was used to predict later speech sound production, PA, and literacy outcomes. Results Group averages revealed below-average school-age articulation scores and low-average PA, but age-appropriate reading and spelling. Preschool speech error patterns were related to school-age outcomes. Children for whom more than 10% of their speech sound errors were atypical had lower PA and literacy scores at school age than children who produced fewer than 10% atypical errors. Preschoolers who produced more distortion errors were likely to have lower school-age articulation scores. Conclusions Different preschool speech error patterns predict different school-age clinical outcomes. Many atypical speech sound errors in preschool may be indicative of weak phonological representations, leading to long-term PA weaknesses. Preschool distortions may be resistant to change over time, leading to persisting speech sound production problems. PMID:23184137

  3. The role of consolidation in learning context-dependent phonotactic patterns in speech and digital sequence production.

    PubMed

    Anderson, Nathaniel D; Dell, Gary S

    2018-04-03

    Speakers implicitly learn novel phonotactic patterns by producing strings of syllables. The learning is revealed in their speech errors. First-order patterns, such as "/f/ must be a syllable onset," can be distinguished from contingent, or second-order, patterns, such as "/f/ must be an onset if the vowel is /a/, but a coda if the vowel is /o/." A meta-analysis of 19 experiments clearly demonstrated that first-order patterns strongly affect speech errors within a single experimental session, but second-order vowel-contingent patterns only affect errors on the second day of testing, suggesting the need for a consolidation period. Two experiments tested an analogue to these studies involving sequences of button pushes, with fingers as "consonants" and thumbs as "vowels." The button-push errors revealed two of the key speech-error findings: first-order patterns are learned quickly, but second-order thumb-contingent patterns are only strongly revealed in the errors on the second day of testing. The influence of computational complexity on the implicit learning of phonotactic patterns in speech production may be a general feature of sequence production.

  4. Patterns of Post-Stroke Brain Damage that Predict Speech Production Errors in Apraxia of Speech and Aphasia Dissociate

    PubMed Central

    Basilakos, Alexandra; Rorden, Chris; Bonilha, Leonardo; Moser, Dana; Fridriksson, Julius

    2015-01-01

    Background and Purpose Acquired apraxia of speech (AOS) is a motor speech disorder caused by brain damage. AOS often co-occurs with aphasia, a language disorder in which patients may also demonstrate speech production errors. The overlap of speech production deficits in both disorders has raised questions regarding whether AOS emerges from a unique pattern of brain damage or as a sub-element of the aphasic syndrome. The purpose of this study was to determine whether speech production errors in AOS and aphasia are associated with distinctive patterns of brain injury. Methods Forty-three patients with history of a single left-hemisphere stroke underwent comprehensive speech and language testing. The Apraxia of Speech Rating Scale was used to rate speech errors specific to AOS versus speech errors that can also be associated with AOS and/or aphasia. Localized brain damage was identified using structural MRI, and voxel-based lesion-impairment mapping was used to evaluate the relationship between speech errors specific to AOS, those that can occur in AOS and/or aphasia, and brain damage. Results The pattern of brain damage in AOS was most strongly associated with damage to cortical motor regions, with additional involvement of somatosensory areas. Speech production deficits that could be attributed to AOS and/or aphasia were associated with damage to the temporal lobe and the inferior pre-central frontal regions. Conclusion AOS likely occurs in conjunction with aphasia due to the proximity of the brain areas supporting speech and language, but the neurobiological substrate for each disorder differs. PMID:25908457

  5. Patterns of poststroke brain damage that predict speech production errors in apraxia of speech and aphasia dissociate.

    PubMed

    Basilakos, Alexandra; Rorden, Chris; Bonilha, Leonardo; Moser, Dana; Fridriksson, Julius

    2015-06-01

    Acquired apraxia of speech (AOS) is a motor speech disorder caused by brain damage. AOS often co-occurs with aphasia, a language disorder in which patients may also demonstrate speech production errors. The overlap of speech production deficits in both disorders has raised questions about whether AOS emerges from a unique pattern of brain damage or as a subelement of the aphasic syndrome. The purpose of this study was to determine whether speech production errors in AOS and aphasia are associated with distinctive patterns of brain injury. Forty-three patients with history of a single left-hemisphere stroke underwent comprehensive speech and language testing. The AOS Rating Scale was used to rate speech errors specific to AOS versus speech errors that can also be associated with both AOS and aphasia. Localized brain damage was identified using structural magnetic resonance imaging, and voxel-based lesion-impairment mapping was used to evaluate the relationship between speech errors specific to AOS, those that can occur in AOS or aphasia, and brain damage. The pattern of brain damage in AOS was most strongly associated with damage to cortical motor regions, with additional involvement of somatosensory areas. Speech production deficits that could be attributed to AOS or aphasia were associated with damage to the temporal lobe and the inferior precentral frontal regions. AOS likely occurs in conjunction with aphasia because of the proximity of the brain areas supporting speech and language, but the neurobiological substrate for each disorder differs.

  6. Perceptual Bias in Speech Error Data Collection: Insights from Spanish Speech Errors

    ERIC Educational Resources Information Center

    Perez, Elvira; Santiago, Julio; Palma, Alfonso; O'Seaghdha, Padraig G.

    2007-01-01

    This paper studies the reliability and validity of naturalistic speech errors as a tool for language production research. Possible biases when collecting naturalistic speech errors are identified and specific predictions derived. These patterns are then contrasted with published reports from Germanic languages (English, German and Dutch) and one…

  7. Preschool Speech Error Patterns Predict Articulation and Phonological Awareness Outcomes in Children with Histories of Speech Sound Disorders

    ERIC Educational Resources Information Center

    Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise

    2013-01-01

    Purpose: To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Method: Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up…

  8. Speech abilities in preschool children with speech sound disorder with and without co-occurring language impairment.

    PubMed

    Macrae, Toby; Tyler, Ann A

    2014-10-01

    The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups on standard scores from different tests of articulation/phonology, percent consonants correct, and the number of omission, substitution, distortion, typical, and atypical error patterns used in the production of different word lists that had similar levels of phonetic and structural complexity. In comparison with children with SSD only, children with SSD and LI used similar numbers but different types of errors, including more omission patterns (p < .001, d = 1.55) and fewer distortion patterns (p = .022, d = 1.03). There were no significant differences in substitution, typical, and atypical error pattern use. Frequent omission error pattern use may reflect a more compromised linguistic system characterized by absent phonological representations for target sounds (see Shriberg et al., 2005). Research is required to examine the diagnostic potential of early frequent omission error pattern use in predicting later diagnoses of co-occurring SSD and LI and/or reading problems.
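
    Aside: the group comparison above rests on independent-samples t tests reported with Cohen's d effect sizes. A minimal sketch of the pooled-standard-deviation effect size computation, using invented omission counts rather than the study's data:

    ```python
    import math

    def cohens_d(group1, group2):
        """Cohen's d for two independent samples, using the pooled SD."""
        n1, n2 = len(group1), len(group2)
        m1 = sum(group1) / n1
        m2 = sum(group2) / n2
        v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
        v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
        pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
        return (m1 - m2) / pooled_sd

    # Hypothetical omission-pattern counts, NOT the study's data
    ssd_li = [12, 15, 9, 14, 11]
    ssd_only = [4, 6, 3, 7, 5]
    print(cohens_d(ssd_li, ssd_only))  # a large effect, well above d = 1
    ```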

  9. Investigating Perceptual Biases, Data Reliability, and Data Discovery in a Methodology for Collecting Speech Errors From Audio Recordings.

    PubMed

    Alderete, John; Davies, Monica

    2018-04-01

    This work describes a methodology of collecting speech errors from audio recordings and investigates how some of its assumptions affect data quality and composition. Speech errors of all types (sound, lexical, syntactic, etc.) were collected by eight data collectors from audio recordings of unscripted English speech. Analysis of these errors showed that: (i) different listeners find different errors in the same audio recordings, but (ii) the frequencies of error patterns are similar across listeners; (iii) errors collected "online" using on-the-spot observational techniques are more likely to be affected by perceptual biases than "offline" errors collected from audio recordings; and (iv) datasets built from audio recordings can be explored and extended in a number of ways that traditional corpus studies cannot be.

  10. English speech sound development in preschool-aged children from bilingual English-Spanish environments.

    PubMed

    Gildersleeve-Neumann, Christina E; Kester, Ellen S; Davis, Barbara L; Peña, Elizabeth D

    2008-07-01

    English speech acquisition by typically developing 3- to 4-year-old children from monolingual English backgrounds was compared to English speech acquisition by typically developing 3- to 4-year-old children from bilingual English-Spanish backgrounds. We predicted that exposure to Spanish would not affect the English phonetic inventory but would increase error frequency and type in bilingual children. Single-word speech samples were collected from 33 children. Phonetically transcribed samples for the 3 groups (monolingual English children, English-Spanish bilingual children who were predominantly exposed to English, and English-Spanish bilingual children with relatively equal exposure to English and Spanish) were compared at 2 time points and for change over time for phonetic inventory, phoneme accuracy, and error pattern frequencies. Children demonstrated similar phonetic inventories. Some bilingual children produced Spanish phonemes in their English and produced few consonant cluster sequences. Bilingual children with relatively equal exposure to English and Spanish averaged more errors than did bilingual children who were predominantly exposed to English. Both bilingual groups showed higher error rates than English-only children overall, particularly for syllable-level error patterns. All language groups decreased in some error patterns, although the ones that decreased were not always the same across language groups. Some group differences in error patterns and accuracy were significant. Vowel error rates did not differ by language group. Exposure to English and Spanish may result in a higher English error rate in typically developing bilinguals, including the application of Spanish phonological properties to English. Slightly higher error rates are likely typical for bilingual preschool-aged children. Change over time at these time points was similar for all 3 groups, suggesting that all will reach an adult-like system in English with exposure and practice.

  11. Effects of hemisphere speech dominance and seizure focus on patterns of behavioral response errors for three types of stimuli.

    PubMed

    Rausch, R; MacDonald, K

    1997-03-01

    We used a protocol consisting of a continuous presentation of stimuli with associated response requests during an intracarotid sodium amobarbital procedure (IAP) to study the effects of hemisphere injected (speech dominant vs. nondominant) and seizure focus (left temporal lobe vs. right temporal lobe) on the pattern of behavioral response errors for three types of visual stimuli (pictures of common objects, words, and abstract forms). Injection of the left speech dominant hemisphere compared to the right nondominant hemisphere increased overall errors and affected the pattern of behavioral errors. The presence of a seizure focus in the contralateral hemisphere increased overall errors, particularly for the right temporal lobe seizure patients, but did not affect the pattern of behavioral errors. Left hemisphere injections disrupted both naming and reading responses at a rate similar to that of matching-to-sample performance. Also, a short-term memory deficit was observed with all three stimuli. Long-term memory testing following the left hemisphere injection indicated that only for pictures of common objects were there fewer errors during the early postinjection period than for the later long-term memory testing. Therefore, despite the inability to respond to picture stimuli, picture items, but not words or forms, could be sufficiently encoded for later recall. In contrast, right hemisphere injections resulted in few errors, with a pattern suggesting a mild general cognitive decrease. A selective weakness in learning unfamiliar forms was found. Our findings indicate that different patterns of behavioral deficits occur following the left vs. right hemisphere injections, with selective patterns specific to stimulus type.

  12. Children's Identification of Consonants in a Speech-Shaped Noise or a Two-Talker Masker

    ERIC Educational Resources Information Center

    Leibold, Lori J.; Buss, Emily

    2013-01-01

    Purpose: To evaluate child-adult differences for consonant identification in a noise or a 2-talker masker. Error patterns were compared across age and masker type to test the hypothesis that errors with the noise masker reflect limitations in the peripheral encoding of speech, whereas errors with the 2-talker masker reflect target-masker…

  13. Prediction Errors but Not Sharpened Signals Simulate Multivoxel fMRI Patterns during Speech Perception

    PubMed Central

    Davis, Matthew H.

    2016-01-01

    Successful perception depends on combining sensory input with prior knowledge. However, the underlying mechanism by which these two sources of information are combined is unknown. In speech perception, as in other domains, two functionally distinct coding schemes have been proposed for how expectations influence representation of sensory evidence. Traditional models suggest that expected features of the speech input are enhanced or sharpened via interactive activation (Sharpened Signals). Conversely, Predictive Coding suggests that expected features are suppressed so that unexpected features of the speech input (Prediction Errors) are processed further. The present work is aimed at distinguishing between these two accounts of how prior knowledge influences speech perception. By combining behavioural, univariate, and multivariate fMRI measures of how sensory detail and prior expectations influence speech perception with computational modelling, we provide evidence in favour of Prediction Error computations. Increased sensory detail and informative expectations have additive behavioural and univariate neural effects because they both improve the accuracy of word report and reduce the BOLD signal in lateral temporal lobe regions. However, sensory detail and informative expectations have interacting effects on speech representations shown by multivariate fMRI in the posterior superior temporal sulcus. When prior knowledge was absent, increased sensory detail enhanced the amount of speech information measured in superior temporal multivoxel patterns, but with informative expectations, increased sensory detail reduced the amount of measured information. Computational simulations of Sharpened Signals and Prediction Errors during speech perception could both explain these behavioural and univariate fMRI observations. However, the multivariate fMRI observations were uniquely simulated by a Prediction Error and not a Sharpened Signal model. The interaction between prior expectation and sensory detail provides evidence for a Predictive Coding account of speech perception. Our work establishes methods that can be used to distinguish representations of Prediction Error and Sharpened Signals in other perceptual domains. PMID:27846209
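
    As a rough illustration of the two coding schemes contrasted in this abstract (a toy sketch with invented numbers, not the authors' model): sharpening multiplies sensory evidence by the prior expectation, while predictive coding subtracts the expectation, so an expected feature stands out in a sharpened code but nearly cancels in a prediction-error code.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy "speech input": activation over 5 candidate features, plus sensory noise
    signal = np.array([0.2, 0.9, 0.1, 0.3, 0.2])   # feature 2 is present
    prior = np.array([0.2, 0.8, 0.2, 0.2, 0.2])    # informative expectation
    evidence = signal + rng.normal(0, 0.05, 5)

    sharpened = evidence * prior / (evidence * prior).sum()  # expected feature enhanced
    prediction_error = evidence - prior                      # expected feature suppressed

    print("sharpened:", np.round(sharpened, 2))
    print("prediction error:", np.round(prediction_error, 2))
    # With an informative prior, the expected feature dominates the sharpened
    # code but nearly cancels in the prediction-error code -- the kind of
    # contrast the multivoxel analyses above were designed to tease apart.
    ```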

  14. Multi-voxel Patterns Reveal Functionally Differentiated Networks Underlying Auditory Feedback Processing of Speech

    PubMed Central

    Zheng, Zane Z.; Vicente-Grabovetsky, Alejandro; MacDonald, Ewen N.; Munhall, Kevin G.; Cusack, Rhodri; Johnsrude, Ingrid S.

    2013-01-01

    The everyday act of speaking involves the complex processes of speech motor control. An important component of control is monitoring, detection and processing of errors when auditory feedback does not correspond to the intended motor gesture. Here we show, using fMRI and converging operations within a multi-voxel pattern analysis framework, that this sensorimotor process is supported by functionally differentiated brain networks. During scanning, a real-time speech-tracking system was employed to deliver two acoustically different types of distorted auditory feedback or unaltered feedback while human participants were vocalizing monosyllabic words, and to present the same auditory stimuli while participants were passively listening. Whole-brain analysis of neural-pattern similarity revealed three functional networks that were differentially sensitive to distorted auditory feedback during vocalization, compared to during passive listening. One network of regions appears to encode an ‘error signal’ irrespective of acoustic features of the error: this network, including right angular gyrus, right supplementary motor area, and bilateral cerebellum, yielded consistent neural patterns across acoustically different, distorted feedback types, only during articulation (not during passive listening). In contrast, a fronto-temporal network appears sensitive to the speech features of auditory stimuli during passive listening; this preference for speech features was diminished when the same stimuli were presented as auditory concomitants of vocalization. A third network, showing a distinct functional pattern from the other two, appears to capture aspects of both neural response profiles. Taken together, our findings suggest that auditory feedback processing during speech motor control may rely on multiple, interactive, functionally differentiated neural systems. PMID:23467350

  15. [Investigating phonological planning processes in speech production through a speech-error induction technique].

    PubMed

    Nakayama, Masataka; Saito, Satoru

    2015-08-01

    The present study investigated principles of phonological planning, a common serial ordering mechanism for speech production and phonological short-term memory. Nakayama and Saito (2014) have investigated the principles by using a speech-error induction technique, in which participants were exposed to an auditory distractor word immediately before an utterance of a target word. They demonstrated within-word adjacent mora exchanges and serial position effects on error rates. These findings support, respectively, the temporal distance and the edge principles at a within-word level. As this previous study induced errors using word distractors created by exchanging adjacent morae in the target words, it is possible that the speech errors are expressions of lexical intrusions reflecting interactive activation of phonological and lexical/semantic representations. To eliminate this possibility, the present study used nonword distractors that had no lexical or semantic representations. This approach successfully replicated the error patterns identified in the abovementioned study, further confirming that the temporal distance and edge principles are organizing precepts in phonological planning.

  16. Speech outcomes in Cantonese patients after glossectomy.

    PubMed

    Wong, Ripley Kit; Poon, Esther Sok-Man; Woo, Cynthia Yuen-Man; Chan, Sabina Ching-Shun; Wong, Elsa Siu-Ping; Chu, Ada Wai-Sze

    2007-08-01

    We sought to determine the major factors affecting speech production of Cantonese-speaking glossectomized patients. Error patterns were analyzed. Forty-one Cantonese-speaking subjects who had undergone glossectomy ≥ 6 months previously were recruited. Speech production evaluation included (1) phonetic error analysis in nonsense syllables; (2) speech intelligibility in sentences evaluated by naive listeners; (3) overall speech intelligibility in conversation evaluated by experienced speech therapists. Patients receiving adjuvant radiotherapy had significantly poorer segmental and connected speech production. Total or subtotal glossectomy also resulted in poor speech outcomes. Patients having free flap reconstruction showed the best speech outcomes. Patients without lymph node metastasis had significantly better speech scores when compared with patients with lymph node metastasis. Initial consonant production had the worst scores, while vowel production was the least affected. Speech outcomes of Cantonese-speaking glossectomized patients depended on the severity of the disease. Initial consonants had the greatest effect on speech intelligibility.

  17. Dynamic action units slip in speech production errors

    PubMed Central

    Goldstein, Louis; Pouplier, Marianne; Chen, Larissa; Saltzman, Elliot; Byrd, Dani

    2008-01-01

    In the past, the nature of the compositional units proposed for spoken language has largely diverged from the types of control units pursued in the domains of other skilled motor tasks. A classic source of evidence as to the units structuring speech has been patterns observed in speech errors – “slips of the tongue”. The present study reports, for the first time, on kinematic data from tongue and lip movements during speech errors elicited in the laboratory using a repetition task. Our data are consistent with the hypothesis that speech production results from the assembly of dynamically defined action units – gestures – in a linguistically structured environment. The experimental results support both the presence of gestural units and the dynamical properties of these units and their coordination. This study of speech articulation shows that it is possible to develop a principled account of spoken language within a more general theory of action. PMID:16822494

  18. Ingressive Speech Errors: A Service Evaluation of Speech-Sound Therapy in a Child Aged 4;6

    ERIC Educational Resources Information Center

    Hrastelj, Laura; Knight, Rachael-Anne

    2017-01-01

    Background: A pattern of ingressive substitutions for word-final sibilants can be identified in a small number of cases in child speech disorder, with growing evidence suggesting it is a phonological difficulty, despite the unusual surface form. Phonological difficulty implies a problem with the cognitive process of organizing speech into sound…

  19. Intensive Treatment with Ultrasound Visual Feedback for Speech Sound Errors in Childhood Apraxia

    PubMed Central

    Preston, Jonathan L.; Leece, Megan C.; Maas, Edwin

    2016-01-01

    Ultrasound imaging is an adjunct to traditional speech therapy that has been shown to be beneficial in the remediation of speech sound errors. Ultrasound biofeedback can be utilized during therapy to provide clients with additional knowledge about their tongue shapes when attempting to produce sounds that are in error. The additional feedback may assist children with childhood apraxia of speech (CAS) in stabilizing motor patterns, thereby facilitating more consistent and accurate productions of sounds and syllables. However, due to its specialized nature, ultrasound visual feedback is a technology that is not widely available to clients. Short-term intensive treatment programs are one option that can be utilized to expand access to ultrasound biofeedback. Schema-based motor learning theory suggests that short-term intensive treatment programs (massed practice) may assist children in acquiring more accurate motor patterns. In this case series, three participants ages 10–14 years diagnosed with CAS attended 16 h of speech therapy over a 2-week period to address residual speech sound errors. Two participants had distortions on rhotic sounds, while the third participant demonstrated lateralization of sibilant sounds. During therapy, cues were provided to assist participants in obtaining a tongue shape that facilitated a correct production of the sound in error. Additional practice without ultrasound was also included. Results suggested that all participants showed signs of acquisition of sounds in error. Generalization and retention results were mixed. One participant showed generalization and retention of sounds that were treated; one showed generalization but limited retention; and the third showed no evidence of generalization or retention. Individual characteristics that may facilitate generalization are discussed. Short-term intensive treatment programs using ultrasound biofeedback may result in the acquisition of more accurate motor patterns and improved articulation of sounds previously in error, with varying levels of generalization and retention. PMID:27625603

  20. Toward diagnostic and phenotype markers for genetically transmitted speech delay.

    PubMed

    Shriberg, Lawrence D; Lewis, Barbara A; Tomblin, J Bruce; McSweeny, Jane L; Karlsson, Heather B; Scheer, Alison R

    2005-08-01

    Converging evidence supports the hypothesis that the most common subtype of childhood speech sound disorder (SSD) of currently unknown origin is genetically transmitted. We report the first findings toward a set of diagnostic markers to differentiate this proposed etiological subtype (provisionally termed speech delay-genetic) from other proposed subtypes of SSD of unknown origin. Conversational speech samples from 72 preschool children with speech delay of unknown origin from 3 research centers were selected from an audio archive. Participants differed on the number of biological, nuclear family members (0 or 2+) classified as positive for current and/or prior speech-language disorder. Although participants in the 2 groups were found to have similar speech competence, as indexed by their Percentage of Consonants Correct scores, their speech error patterns differed significantly in 3 ways. Compared with children who may have reduced genetic load for speech delay (no affected nuclear family members), children with possibly higher genetic load (2+ affected members) had (a) a significantly higher proportion of relative omission errors on the Late-8 consonants; (b) a significantly lower proportion of relative distortion errors on these consonants, particularly on the sibilant fricatives /s/, /z/, and //; and (c) a significantly lower proportion of backed /s/ distortions, as assessed by both perceptual and acoustic methods. Machine learning routines identified a 3-part classification rule that included differential weightings of these variables. The classification rule had diagnostic accuracy value of 0.83 (95% confidence limits = 0.74-0.92), with positive and negative likelihood ratios of 9.6 (95% confidence limits = 3.1-29.9) and 0.40 (95% confidence limits = 0.24-0.68), respectively. The diagnostic accuracy findings are viewed as promising. The error pattern for this proposed subtype of SSD is viewed as consistent with the cognitive-linguistic processing deficits that have been reported for genetically transmitted verbal disorders.
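
    For readers unfamiliar with the reported statistics: positive and negative likelihood ratios follow directly from a classification rule's sensitivity and specificity (LR+ = sensitivity / (1 − specificity); LR− = (1 − sensitivity) / specificity). A minimal sketch; the sensitivity and specificity values below are hypothetical, chosen only to land near the reported ratios, and are not taken from the study:

    ```python
    def likelihood_ratios(sensitivity, specificity):
        """LR+ = sens / (1 - spec); LR- = (1 - sens) / spec."""
        lr_pos = sensitivity / (1 - specificity)
        lr_neg = (1 - sensitivity) / specificity
        return lr_pos, lr_neg

    # Hypothetical values, not the study's data
    lr_pos, lr_neg = likelihood_ratios(sensitivity=0.64, specificity=0.93)
    print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")  # ~9.1 and ~0.39
    ```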

  1. Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study

    PubMed Central

    Preston, Jonathan L.; Seki, Ayumi

    2012-01-01

    Purpose The purposes are to (1) describe the assessment of residual speech sound disorders (SSD) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations, and (2) describe how assessment of domains such as speech motor control and phonological awareness can provide a more complete understanding of SSDs in bilinguals. Method A review of Japanese phonology is provided to offer a context for understanding the transfer of Japanese to English productions. A case study of an 11-year-old is presented, demonstrating parallel speech assessments in English and Japanese. Speech motor and phonological awareness tasks were conducted in both languages. Results Several patterns were observed in the participant’s English that could be plausibly explained by the influence of Japanese phonology. However, errors indicating a residual SSD were observed in both Japanese and English. A speech motor assessment suggested possible speech motor control problems, and phonological awareness was judged to be within the typical range of performance in both languages. Conclusion Understanding the phonological characteristics of L1 can help clinicians recognize speech patterns in L2 associated with transfer. Once these differences are understood, patterns associated with a residual SSD can be identified. Supplementing a relational speech analysis with measures of speech motor control and phonological awareness can provide a more comprehensive understanding of a client’s strengths and needs. PMID:21386046

  2. Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System

    NASA Astrophysics Data System (ADS)

    Wang, Hongcui; Kawahara, Tatsuya

    CALL (Computer Assisted Language Learning) systems using ASR (Automatic Speech Recognition) for second language learning have received increasing interest recently. However, it remains a challenge to achieve high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers. Conventionally, possible error patterns, based on linguistic knowledge, are added to the lexicon and language model, or the ASR grammar network. However, this approach quickly runs into a trade-off between coverage of errors and increased perplexity. To solve the problem, we propose a method based on a decision tree to learn effective prediction of errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that the proposed method can effectively generate an ASR grammar network, given a target sentence, that achieves both better coverage of errors and smaller perplexity, resulting in significant improvement in ASR accuracy.
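
    A minimal sketch of the decision-tree idea described above, using scikit-learn. The feature names, data, and threshold are invented for illustration, and the sketch simplifies the paper's method to a binary "will this item be mispronounced?" prediction used to decide when to expand the grammar:

    ```python
    from sklearn.tree import DecisionTreeClassifier

    # Hypothetical training rows: [phoneme_difficulty, word_frequency_rank,
    # learner_proficiency]; label = 1 if the learner mispronounced the item.
    X = [
        [3, 120, 1], [1, 10, 3], [4, 300, 1], [2, 45, 2],
        [5, 500, 1], [1, 5, 3], [4, 250, 2], [2, 60, 3],
    ]
    y = [1, 0, 1, 0, 1, 0, 1, 0]

    tree = DecisionTreeClassifier(max_depth=3).fit(X, y)

    # For a new target item, add its error variants to the ASR grammar network
    # only when the predicted error probability is high enough, limiting the
    # perplexity cost of blanket error coverage.
    p_error = tree.predict_proba([[4, 200, 1]])[0][1]
    if p_error > 0.5:
        print("expand grammar with predicted error variants for this item")
    ```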

  3. Error biases in inner and overt speech: evidence from tongue twisters.

    PubMed

    Corley, Martin; Brocklehurst, Paul H; Moat, H Susannah

    2011-01-01

    To compare the properties of inner and overt speech, Oppenheim and Dell (2008) counted participants' self-reported speech errors when reciting tongue twisters either overtly or silently and found a bias toward substituting phonemes that resulted in words in both conditions, but a bias toward substituting similar phonemes only when speech was overt. Here, we report 3 experiments revisiting their conclusion that inner speech remains underspecified at the subphonemic level, which they simulated within an activation-feedback framework. In 2 experiments, participants recited tongue twisters that could result in the errorful substitutions of similar or dissimilar phonemes to form real words or nonwords. Both experiments included an auditory masking condition, to gauge the possible impact of loss of auditory feedback on the accuracy of self-reporting of speech errors. In Experiment 1, the stimuli were composed entirely from real words, whereas, in Experiment 2, half the tokens used were nonwords. Although masking did not have any effects, participants were more likely to report substitutions of similar phonemes in both experiments, in inner as well as overt speech. This pattern of results was confirmed in a 3rd experiment using the real-word materials from Oppenheim and Dell (in press). In addition to these findings, a lexical bias effect found in Experiments 1 and 3 disappeared in Experiment 2. Our findings support a view in which plans for inner speech are indeed specified at the feature level, even when there is no intention to articulate words overtly, and in which editing of the plan for errors is implicated.

  4. Interlanguage Variation: A Point Missed?

    ERIC Educational Resources Information Center

    Tice, Bradley Scott

    A study investigated patterns in phonological errors occurring in the speaker's second language in both formal and informal speaking situations. Subjects were three adult learners of English as a second language, including a native Spanish-speaker and two Asians. Their speech was recorded during diagnostic testing (formal speech) and in everyday…

  5. Describing Phonological Paraphasias in Three Variants of Primary Progressive Aphasia.

    PubMed

    Dalton, Sarah Grace Hudspeth; Shultz, Christine; Henry, Maya L; Hillis, Argye E; Richardson, Jessica D

    2018-03-01

    The purpose of this study was to describe the linguistic environment of phonological paraphasias in 3 variants of primary progressive aphasia (semantic, logopenic, and nonfluent) and to describe the profiles of paraphasia production for each of these variants. Discourse samples of 26 individuals diagnosed with primary progressive aphasia were investigated for phonological paraphasias using the criteria established for the Philadelphia Naming Test (Moss Rehabilitation Research Institute, 2013). Phonological paraphasias were coded for paraphasia type, part of speech of the target word, target word frequency, type of segment in error, word position of consonant errors, type of error, and degree of change in consonant errors. Eighteen individuals across the 3 variants produced phonological paraphasias. Most paraphasias were nonword, followed by formal, and then mixed, with errors primarily occurring on nouns and verbs, with relatively few on function words. Most errors were substitutions, followed by addition and deletion errors, and few sequencing errors. Errors were evenly distributed across vowels, consonant singletons, and clusters, with more errors occurring in initial and medial positions of words than in the final position of words. Most consonant errors consisted of only a single-feature change, with few 2- or 3-feature changes. Importantly, paraphasia productions by variant differed from these aggregate results, with unique production patterns for each variant. These results suggest that a system where paraphasias are coded as present versus absent may be insufficient to adequately distinguish between the 3 subtypes of PPA. The 3 variants demonstrate patterns that may be used to improve phenotyping and diagnostic sensitivity. These results should be integrated with recent findings on phonological processing and speech rate. Future research should attempt to replicate these results in a larger sample of participants with longer speech samples and varied elicitation tasks. https://doi.org/10.23641/asha.5558107.

  6. Evaluation of Core Vocabulary Intervention for Treatment of Inconsistent Phonological Disorder: Three Treatment Case Studies

    ERIC Educational Resources Information Center

    McIntosh, Beth; Dodd, Barbara

    2009-01-01

    Children with unintelligible speech differ in severity, underlying deficit, type of surface error patterns and response to treatment. Detailed treatment case studies, evaluating specific intervention protocols for particular diagnostic groups, can identify best practice for children with speech disorder. Three treatment case studies evaluated the…

  7. Speech variability effects on recognition accuracy associated with concurrent task performance by pilots

    NASA Technical Reports Server (NTRS)

    Simpson, C. A.

    1985-01-01

    In the present study, pairs of pilots responded to aircraft warning classification tasks using an isolated word, speaker-dependent speech recognition system; induced stress was manipulated by means of different scoring procedures for the classification task and by the inclusion of a competitive manual control task. Both speech patterns and recognition accuracy were analyzed, and recognition errors were recorded by type for an isolated word speaker-dependent system and by an offline technique for a connected word speaker-dependent system. While errors increased with task loading for the isolated word system, task loading had no such effect for the connected word system.

  8. Factors affecting the perception of Korean-accented American English

    NASA Astrophysics Data System (ADS)

    Cho, Kwansun; Harris, John G.; Shrivastav, Rahul

    2005-09-01

    This experiment examines the relative contribution of two factors, intonation and articulation errors, to the perception of foreign accent in Korean-accented American English. Ten native speakers of Korean and ten native speakers of American English were asked to read ten English sentences. These sentences were then modified using high-quality speech resynthesis techniques [STRAIGHT; Kawahara et al., Speech Commun. 27, 187-207 (1999)] to generate four sets of stimuli. In the first two sets of stimuli, the intonation patterns of the Korean speakers and American speakers were switched with one another. The articulatory errors for each speaker were not modified. In the final two sets, the sentences from the Korean and American speakers were resynthesized without any modifications. Fifteen listeners were asked to rate all the stimuli for the degree of foreign accent. Preliminary results show that, for native speakers of American English, articulation errors may play a greater role in the perception of foreign accent than errors in intonation patterns. [Work supported by KAIM.]

  9. Working Papers in Experimental Speech-Language Pathology and Audiology. Volume VII, 1979.

    ERIC Educational Resources Information Center

    City Univ. of New York, Flushing, NY. Queens Coll.

    Seven papers review research in speech-language pathology and audiology. K. Polzer et al. describe an investigation of sign language therapy for the severely language impaired. S. Dworetsky and L. Clark analyze the phonemic and nonphonemic error patterns in five nonverbal and five verbal oral apraxic adults. The performance of three language…

  10. Studies in automatic speech recognition and its application in aerospace

    NASA Astrophysics Data System (ADS)

    Taylor, Michael Robinson

    Human communication is characterized in terms of the spectral and temporal dimensions of speech waveforms. Electronic speech recognition strategies based on Dynamic Time Warping and Markov Model algorithms are described and typical digit recognition error rates are tabulated. The application of Direct Voice Input (DVI) as an interface between man and machine is explored within the context of civil and military aerospace programmes. Sources of physical and emotional stress affecting speech production within military high performance aircraft are identified. Experimental results are reported which quantify fundamental frequency and coarse temporal dimensions of male speech as a function of the vibration, linear acceleration and noise levels typical of aerospace environments; preliminary indications of acoustic phonetic variability reported by other researchers are summarized. Connected whole-word pattern recognition error rates are presented for digits spoken under controlled Gz sinusoidal whole-body vibration. Correlations are made between significant increases in recognition error rate and resonance of the abdomen-thorax and head subsystems of the body. The phenomenon of vibrato style speech produced under low frequency whole-body Gz vibration is also examined. Interactive DVI system architectures and avionic data bus integration concepts are outlined together with design procedures for the efficient development of pilot-vehicle command and control protocols.
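
    Of the two recognition strategies named in this abstract, Dynamic Time Warping is the simpler to sketch: it aligns an utterance's feature sequence to a stored template while absorbing local timing differences (the kind of temporal variability that vibration and stress introduce). A minimal one-dimensional sketch assuming scalar features per frame; this is an illustration, not the thesis's implementation:

    ```python
    def dtw_distance(a, b):
        """Dynamic Time Warping distance between two 1-D feature sequences."""
        n, m = len(a), len(b)
        INF = float("inf")
        d = [[INF] * (m + 1) for _ in range(n + 1)]
        d[0][0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = abs(a[i - 1] - b[j - 1])
                d[i][j] = cost + min(d[i - 1][j],      # insertion
                                     d[i][j - 1],      # deletion
                                     d[i - 1][j - 1])  # match
        return d[n][m]

    # A digit template and a time-stretched utterance still align closely
    template = [1, 3, 5, 5, 3, 1]
    utterance = [1, 1, 3, 5, 5, 5, 3, 1]   # slower delivery of the same shape
    print(dtw_distance(template, utterance))  # 0.0: warping absorbs the stretch
    ```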

  11. Preliteracy Speech Sound Production Skill and Linguistic Characteristics of Grade 3 Spellings: A Study Using the Templin Archive

    PubMed Central

    Masterson, Julie J.; Preston, Jonathan L.

    2015-01-01

    Purpose This archival investigation examined the relationship between preliteracy speech sound production skill (SSPS) and spelling in Grade 3 using a dataset in which children's receptive vocabulary was generally within normal limits, speech therapy was not provided until Grade 2, and phonological awareness instruction was discouraged at the time data were collected. Method Participants (N = 250), selected from the Templin Archive (Templin, 2004), varied on prekindergarten SSPS. Participants' real word spellings in Grade 3 were evaluated using a metric of linguistic knowledge, the Computerized Spelling Sensitivity System (Masterson & Apel, 2013). Relationships between kindergarten speech error types and later spellings also were explored. Results Prekindergarten children in the lowest SSPS subgroup (7th percentile) scored poorest among articulatory subgroups on both individual spelling elements (phonetic elements, junctures, and affixes) and acceptable spelling (using relatively more omissions and illegal spelling patterns). Within the 7th percentile subgroup, there were no statistical spelling differences between those with mostly atypical speech sound errors and those with mostly typical speech sound errors. Conclusions Findings were consistent with predictions from dual route models of spelling that SSPS is one of many variables associated with spelling skill and that children with impaired SSPS are at risk for spelling difficulty. PMID:26380965

  12. [Velopharyngeal closure pattern and speech performance among submucous cleft palate patients].

    PubMed

    Heng, Yin; Chunli, Guo; Bing, Shi; Yang, Li; Jingtao, Li

    2017-06-01

    To characterize the velopharyngeal closure patterns and speech performance among submucous cleft palate patients. Patients with submucous cleft palate visiting the Department of Cleft Lip and Palate Surgery, West China Hospital of Stomatology, Sichuan University between 2008 and 2016 were reviewed. Outcomes of subjective speech evaluation, including velopharyngeal function and consonant articulation, and of objective nasopharyngeal endoscopy, including the mobility of the soft palate and pharyngeal walls, were retrospectively analyzed. A total of 353 cases were retrieved in this study, among which 138 (39.09%) demonstrated velopharyngeal competence, 176 (49.86%) velopharyngeal incompetence, and 39 (11.05%) marginal velopharyngeal incompetence. A total of 268 cases were subjected to nasopharyngeal endoscopy examination, of whom 167 (62.31%) demonstrated a circular closure pattern, 89 (33.21%) a coronal pattern, and 12 (4.48%) a sagittal pattern. Passavant's ridge was present in 45.51% (76/167) of patients with circular closure and 13.48% (12/89) of patients with coronal closure. Among the 353 patients included in this study, 137 (38.81%) presented normal articulation, 124 (35.13%) consonant elimination, 51 (14.45%) compensatory articulation, 36 (10.20%) consonant weakening, 25 (7.08%) consonant replacement, and 36 (10.20%) multiple articulation errors. Circular closure was the most prevalent velopharyngeal closure pattern among patients with submucous cleft palate, and high-pressure consonant deletion was the most common articulation abnormality. Articulation errors occurred more frequently among patients with a low velopharyngeal closure rate.

  13. The Effect of Auditory Information on Patterns of Intrusions and Reductions

    ERIC Educational Resources Information Center

    Slis, Anneke; van Lieshout, Pascal

    2016-01-01

    Purpose: The study investigates whether auditory information affects the nature of intrusion and reduction errors in reiterated speech. These errors are hypothesized to arise as a consequence of autonomous mechanisms to stabilize movement coordination. The specific question addressed is whether this process is affected by auditory information so…

  14. SDI Software Technology Program Plan Version 1.5

    DTIC Science & Technology

    1987-06-01

    computer generation of auditory communication of meaningful speech. Most speech synthesizers are based on mathematical models of the human vocal tract, but... oral/auditory and multimodal communications. Although such state-of-the-art interaction technology has not fully matured, user experience has... superior pattern matching capabilities and the subliminal intuitive deduction capability. The error performance of humans can be helped by careful...

  15. Error Patterns in Young German Children's "Wh"-Questions

    ERIC Educational Resources Information Center

    Schmerse, Daniel; Lieven, Elena; Tomasello, Michael

    2013-01-01

    In this article we report two studies: a detailed longitudinal analysis of errors in "wh"-questions from six German-learning children (age 2;0-3;0) and an analysis of the prosodic characteristics of "wh"-questions in German child-directed speech. The results of the first study demonstrate that German-learning children…

  16. Bilingual language intrusions and other speech errors in Alzheimer's disease.

    PubMed

    Gollan, Tamar H; Stasenko, Alena; Li, Chuchu; Salmon, David P

    2017-11-01

    The current study investigated how Alzheimer's disease (AD) affects production of speech errors in reading aloud. Twelve Spanish-English bilinguals with AD and 19 matched controls read aloud 8 paragraphs in four conditions: (a) English-only, (b) Spanish-only, (c) English-mixed (mostly English with 6 Spanish words), and (d) Spanish-mixed (mostly Spanish with 6 English words). Reading elicited language intrusions (e.g., saying la instead of the), and several types of within-language errors (e.g., saying their instead of the). Patients produced more intrusions (and self-corrected less often) than controls, particularly when reading non-dominant language paragraphs with switches into the dominant language. Patients also produced more within-language errors than controls, but differences between groups for these were not consistently larger with dominant versus non-dominant language targets. These results illustrate the potential utility of speech errors for diagnosis of AD, suggest a variety of linguistic and executive control impairments in AD, and reveal multiple cognitive mechanisms needed to mix languages fluently. The observed pattern of deficits, and the unique sensitivity of intrusions to AD in bilinguals, suggests an intact ability to select a default language with contextual support and to rapidly translate and switch languages in production of connected speech, but an impaired ability to monitor language membership while regulating inhibitory control.

  17. Factors that enhance English-speaking speech-language pathologists' transcription of Cantonese-speaking children's consonants.

    PubMed

    Lockart, Rebekah; McLeod, Sharynne

    2013-08-01

    The aim was to investigate speech-language pathology students' ability to identify errors and transcribe typical and atypical speech in Cantonese, a nonnative language. Thirty-three English-speaking speech-language pathology students completed 3 tasks in an experimental within-subjects design. Task 1 (baseline) involved transcribing English words. In Task 2, students transcribed 25 words spoken by a Cantonese adult. On average, 59.1% of consonants were transcribed correctly (72.9% when Cantonese-English transfer patterns were allowed). Accuracy was higher on shared English and Cantonese syllable-initial consonants /m,n,f,s,h,j,w,l/ and on syllable-final consonants. In Task 3, students identified consonant errors and transcribed 100 words spoken by Cantonese-speaking children under 4 additive conditions: (1) baseline, (2) +adult model, (3) +information about Cantonese phonology, and (4) all variables (2 and 3 were counterbalanced). There was a significant improvement in the students' identification and transcription scores for conditions 2, 3, and 4, with a moderate effect size. Increased skill was not based on listeners' proficiency in speaking another language, perceived transcription skill, musicality, or confidence with multilingual clients. Speech-language pathology students, with no exposure to or specific training in Cantonese, have some skills to identify errors and transcribe Cantonese. Provision of a Cantonese adult model and information about Cantonese phonology increased students' accuracy in transcribing Cantonese speech.

  18. Is talking to an automated teller machine natural and fun?

    PubMed

    Chan, F Y; Khalid, H M

    Usability and affective issues of using automatic speech recognition technology to interact with an automated teller machine (ATM) are investigated in two experiments. The first uncovered dialogue patterns of ATM users for the purpose of designing the user interface for a simulated speech ATM system. Applying the Wizard-of-Oz methodology, multiple mapping, and word spotting techniques, the speech-driven ATM accommodates bilingual users of Bahasa Melayu and English. The second experiment evaluates the usability of a hybrid speech ATM, comparing it with a simulated manual ATM. The aim is to investigate how natural and fun talking to a speech ATM can be for these first-time users. Subjects performed withdrawal and balance enquiry tasks. An ANOVA was performed on the usability and affective data. The results showed significant differences between systems in the ability to complete the tasks as well as in transaction errors. Performance was measured by the time taken by subjects to complete the task and the number of speech recognition errors that occurred. On the basis of user emotions, it can be said that the hybrid speech system enabled pleasurable interaction. Despite the limitations of speech recognition technology, users appear ready to talk to the ATM when it becomes available for public use.

  19. Feature Migration in Time: Reflection of Selective Attention on Speech Errors

    PubMed Central

    Nozari, Nazbanou; Dell, Gary S.

    2012-01-01

    This paper describes an initial study of the effect of focused attention on phonological speech errors. In three experiments, participants recited four-word tongue-twisters, and focused attention on one (or none) of the words. The attended word was singled out differently in each experiment; participants were under instructions to either avoid errors on the attended word, to stress it, or to say it silently. The experiments showed that all methods of attending to a word decreased errors on that word, while increasing errors on the surrounding words. However, this error increase did not result from a relative increase in phonemic migrations originating from the attended word. This pattern is inconsistent with conceptualizing attention either as higher activation of the attended word or greater inhibition of the unattended words throughout the production of the sequence. Instead, it is consistent with a model which presumes that attention exerts its effect at the time of production of the attended word, without lingering effects on the past or the future. PMID:22268910

  20. Mimicking aphasic semantic errors in normal speech production: evidence from a novel experimental paradigm.

    PubMed

    Hodgson, Catherine; Lambon Ralph, Matthew A

    2008-01-01

    Semantic errors are commonly found in semantic dementia (SD) and some forms of stroke aphasia and provide insights into semantic processing and speech production. Low error rates are found in standard picture naming tasks in normal controls. In order to increase error rates and thus provide an experimental model of aphasic performance, this study utilised a novel method: tempo picture naming. Experiment 1 showed that, compared to standard deadline naming tasks, participants made more errors on the tempo picture naming tasks. Further, RTs were longer and more errors were produced to living items than non-living items, a pattern seen in both semantic dementia and semantically impaired stroke aphasic patients. Experiment 2 showed that providing the initial phoneme as a cue enhanced performance, whereas providing an incorrect phonemic cue further reduced performance. These results support the contention that the tempo picture naming paradigm reduces the time allowed for controlled semantic processing, causing increased error rates. This experimental procedure would, therefore, appear to mimic the performance of aphasic patients with multi-modal semantic impairment that results from poor semantic control rather than the degradation of semantic representations observed in semantic dementia [Jefferies, E. A., & Lambon Ralph, M. A. (2006). Semantic impairment in stroke aphasia vs. semantic dementia: A case-series comparison. Brain, 129, 2132-2147]. Further implications for theories of semantic cognition and models of speech processing are discussed.

  1. Communication variations and aircrew performance

    NASA Technical Reports Server (NTRS)

    Kanki, Barbara G.; Folk, Valerie G.; Irwin, Cheryl M.

    1991-01-01

    The relationship between communication variations and aircrew performance (high-error vs low-error performances) was investigated by analyzing the coded verbal transcripts derived from the videotape records of 18 two-person air transport crews who participated in a high-fidelity, full-mission flight simulation. The flight scenario included a task which involved abnormal operations and required the coordinated efforts of all crew members. It was found that the best-performing crews were characterized by nearly identical patterns of communication, whereas the midrange and poorer performing crews showed a great deal of heterogeneity in their speech patterns. Although some specific speech sequences can be interpreted as being more or less facilitative to the crew-coordination process, predictability appears to be the key ingredient for enhancing crew performance. Crews communicating in highly standard (hence predictable) ways were better able to coordinate their task, whereas crews characterized by multiple, nonstandard communication profiles were less effective in their performance.

  2. Lexical and phonological variability in preschool children with speech sound disorder.

    PubMed

    Macrae, Toby; Tyler, Ann A; Lewis, Kerry E

    2014-02-01

    The authors of this study examined relationships between measures of word and speech error variability and between these and other speech and language measures in preschool children with speech sound disorder (SSD). In this correlational study, 18 preschool children with SSD, age-appropriate receptive vocabulary, and normal oral motor functioning and hearing were assessed across 2 sessions. Experimental measures included word and speech error variability, receptive vocabulary, nonword repetition (NWR), and expressive language. Pearson product–moment correlation coefficients were calculated among the experimental measures. The correlation between word and speech error variability was slight and nonsignificant. The correlation between word variability and receptive vocabulary was moderate and negative, although nonsignificant. High word variability was associated with small receptive vocabularies. The correlations between speech error variability and NWR and between speech error variability and the mean length of children's utterances were moderate and negative, although both were nonsignificant. High speech error variability was associated with poor NWR and language scores. High word variability may reflect unstable lexical representations, whereas high speech error variability may reflect indistinct phonological representations. Preschool children with SSD who show abnormally high levels of different types of speech variability may require slightly different approaches to intervention.

  3. Analysis of communication in the standard versus automated aircraft

    NASA Technical Reports Server (NTRS)

    Veinott, Elizabeth S.; Irwin, Cheryl M.

    1993-01-01

    Past research has shown crew communication patterns to be associated with overall crew performance, recent flight experience together, low- and high-error crew performance, and personality variables. However, differences in communication patterns as a function of aircraft type and level of aircraft automation have not been fully addressed. Crew communications from ten MD-88 and twelve DC-9 crews were obtained during a full-mission simulation. In addition to large differences in the overall amount of communication during the normal and abnormal phases of flight (DC-9 crews generating less speech than MD-88 crews), differences in specific speech categories were also found. Log-linear analyses also generated speaker-response patterns related to each aircraft type, although future analyses of these patterns will need to account for variations due to crew performance.

  4. The Frame Constraint on Experimentally Elicited Speech Errors in Japanese.

    PubMed

    Saito, Akie; Inoue, Tomoyoshi

    2017-06-01

    The so-called syllable position effect in speech errors has been interpreted as reflecting constraints posed by the frame structure of a given language, which operates separately from linguistic content during speech production. The effect refers to the phenomenon that when a speech error occurs, replaced and replacing sounds tend to be in the same position within a syllable or word. Most of the evidence for the effect comes from analyses of naturally occurring speech errors in Indo-European languages, and there are few studies examining the effect in experimentally elicited speech errors and in other languages. This study examined whether experimentally elicited sound errors in Japanese exhibit the syllable position effect. In Japanese, the sub-syllabic unit known as "mora" is considered to be a basic sound unit in production. Results showed that the syllable position effect occurred in mora errors, suggesting that the frame constrains the ordering of sounds during speech production.

  5. Coarticulatory evidence in stuttered disfluencies

    NASA Astrophysics Data System (ADS)

    Arbisi-Kelm, Timothy

    2005-09-01

    While the disfluencies produced in stuttered speech surface at a significantly higher rate than those found in normal speech, it is less clear from the previous stuttering literature how exactly these disfluency patterns might differ in kind [Wingate (1988)]. One tendency found in normal speech is for disfluencies to remove acoustic evidence of coarticulation patterns [Shriberg (1999)]. This appears attributable to lexical search errors which prevent a speaker from accessing a word's phonological form; that is, coarticulation between words will fail to occur when segmental material from the following word is not retrieved. Since stuttering is a disorder which displays evidence of phonological but not lexical impairment, it was predicted that stuttered disfluencies would differ from normal errors in that the former would reveal acoustic evidence of word transitions. Eight speakers (four stutterers and four control subjects) participated in a narrative-production task, spontaneously describing a picture book. Preliminary results suggest that while both stutterers and controls produced similar rates of disfluencies occurring without coarticulatory evidence, only the stutterers regularly produced disfluencies reflecting this transitional evidence. These results support the argument that disfluencies proper to stuttering result from a phonological deficit, while normal disfluencies are generally lexically based.

  6. How should children with speech sound disorders be classified? A review and critical evaluation of current classification systems.

    PubMed

    Waring, R; Knight, R

    2013-01-01

    Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of theoretically differing classification systems have been proposed based on either an aetiological (medical) approach, a descriptive-linguistic approach or a processing approach. To describe and review the supporting evidence, and to provide a critical evaluation of the current childhood SSD classification systems. Descriptions of the major specific approaches to classification are reviewed and research papers supporting the reliability and validity of the systems are evaluated. Three specific paediatric SSD classification systems are identified as potentially useful in classifying children with SSD into homogeneous subgroups: the aetiologic-based Speech Disorders Classification System, the descriptive-linguistic Differential Diagnosis system, and the processing-based Psycholinguistic Framework. The Differential Diagnosis system has a growing body of empirical support from clinical population studies, across-language error pattern studies and treatment efficacy studies. The Speech Disorders Classification System is currently a research tool with eight proposed subgroups. The Psycholinguistic Framework is a potential bridge to linking cause and surface level speech errors. There is a need for a universally agreed-upon classification system that is useful to clinicians and researchers. The resulting classification system needs to be robust, reliable and valid. A universal classification system would allow for improved tailoring of treatments to subgroups of SSD, which may, in turn, lead to improved treatment efficacy. © 2012 Royal College of Speech and Language Therapists.

  7. How much is a word? Predicting ease of articulation planning from apraxic speech error patterns.

    PubMed

    Ziegler, Wolfram; Aichert, Ingrid

    2015-08-01

    According to intuitive concepts, 'ease of articulation' is influenced by factors like word length or the presence of consonant clusters in an utterance. Imaging studies of speech motor control use these factors to systematically tax the speech motor system. Evidence from apraxia of speech, a disorder supposed to result from speech motor planning impairment after lesions to speech motor centers in the left hemisphere, supports the relevance of these and other factors in disordered speech planning and the genesis of apraxic speech errors. Yet, there is no unified account of the structural properties rendering a word easy or difficult to pronounce. To model the motor planning demands of word articulation by a nonlinear regression model trained to predict the likelihood of accurate word production in apraxia of speech. We used a tree-structure model in which vocal tract gestures are embedded in hierarchically nested prosodic domains to derive a recursive set of terms for the computation of the likelihood of accurate word production. The model was trained with accuracy data from a set of 136 words averaged over 66 samples from apraxic speakers. In a second step, the model coefficients were used to predict a test dataset of accuracy values for 96 new words, averaged over 120 samples produced by a different group of apraxic speakers. Accurate modeling of the first dataset was achieved in the training study (R²adj = .71). In the cross-validation, the test dataset was predicted with high accuracy as well (R²adj = .67). The model shape, as reflected by the coefficient estimates, was consistent with current phonetic theories and with clinical evidence. In accordance with phonetic and psycholinguistic work, a strong influence of word stress on articulation errors was found. The proposed model provides a unified and transparent account of the motor planning requirements of word articulation. Copyright © 2015 Elsevier Ltd. All rights reserved.
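
    The recursive prosodic term set is not reproducible from the abstract alone, so the sketch below only illustrates the general workflow under stated assumptions: flat structural predictors stand in for the nested gestural/prosodic terms, a logistic link models the likelihood of accurate word production, and fit is summarized by the adjusted R² quoted above.

      import numpy as np
      from scipy.optimize import curve_fit

      # Hypothetical flat predictors per word (NOT the paper's recursive
      # prosodic terms): segment count, consonant-cluster count, initial stress.
      X = np.array([[3, 0, 1], [5, 1, 1], [7, 2, 0], [4, 1, 0],
                    [6, 2, 1], [8, 3, 0], [2, 0, 1], [9, 3, 0]], dtype=float)
      # Made-up proportions of accurate productions per word.
      y = np.array([0.92, 0.75, 0.40, 0.66, 0.55, 0.28, 0.95, 0.22])

      def likelihood(X, b0, b1, b2, b3):
          # Nonlinear (logistic) regression for the likelihood of accurate
          # word production, standing in for the paper's tree-structure model.
          z = b0 + b1 * X[:, 0] + b2 * X[:, 1] + b3 * X[:, 2]
          return 1.0 / (1.0 + np.exp(-z))

      params, _ = curve_fit(likelihood, X, y, p0=np.zeros(4))
      pred = likelihood(X, *params)

      # Adjusted R^2, the fit statistic quoted in the abstract.
      n, p = len(y), 3
      r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
      r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
      print(f"R2_adj = {r2_adj:.2f}")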

  8. Electropalatography in home training of retracted articulation in a Swedish child with cleft palate: effect on articulation pattern and speech.

    PubMed

    Lohmander, Anette; Henriksson, Cecilia; Havstam, Christina

    2010-12-01

    The aim was to evaluate the effectiveness of electropalatography (EPG) in home training of persistent articulation errors in an 11-year-old Swedish girl born with isolated cleft palate. The /t/ and /s/ sounds were trained in a single-subject design across behaviours during an eight-month period using a portable training unit (PTU). Both EPG analysis and perceptual analysis showed an improvement in the production of /t/ and /s/ in words and sentences after therapy. Analysis of tongue-contact patterns showed that the participant had more normal articulatory patterns for /t/ and /s/ after just 2 months (approximately 8 hours) of training. No statistically significant transfer by means of intelligibility in connected speech was found. The present results show that EPG home training can be a sufficient method for treating persistent speech disorders associated with cleft palate. Methods for transfer from function (articulation) to activity (intelligibility) need to be explored.

  9. Is Comprehension Necessary for Error Detection? A Conflict-Based Account of Monitoring in Speech Production

    ERIC Educational Resources Information Center

    Nozari, Nazbanou; Dell, Gary S.; Schwartz, Myrna F.

    2011-01-01

    Despite the existence of speech errors, verbal communication is successful because speakers can detect (and correct) their errors. The standard theory of speech-error detection, the perceptual-loop account, posits that the comprehension system monitors production output for errors. Such a comprehension-based monitor, however, cannot explain the…

  10. Speech Synthesis Using Perceptually Motivated Features

    DTIC Science & Technology

    2012-01-23

    with others a few years prior (with the concurrence of the project's program manager, Willard Larkin). ... "The Perceptual Flow of Phonetic Information" and ... "The Perceptual Flow of Phonetic Processing": consonant confusion matrices are analyzed for patterns of phonetic-feature decoding errors conditioned ... (decoding) is also observed. From these conditional probability patterns, it is proposed that they reflect a temporal flow of perceptual processing

  11. Conflict monitoring in speech processing: An fMRI study of error detection in speech production and perception.

    PubMed

    Gauvin, Hanna S; De Baene, Wouter; Brass, Marcel; Hartsuiker, Robert J

    2016-02-01

    To minimize the number of errors in speech, and thereby facilitate communication, speech is monitored before articulation. It is, however, unclear at which level during speech production monitoring takes place, and what mechanisms are used to detect and correct errors. The present study investigated whether internal verbal monitoring takes place through the speech perception system, as proposed by perception-based theories of speech monitoring, or whether mechanisms independent of perception are applied, as proposed by production-based theories of speech monitoring. With the use of fMRI during a tongue-twister task, we observed that error detection in internal speech during noise-masked overt speech production and error detection in speech perception both recruit the same neural network, which includes the pre-supplementary motor area (pre-SMA), dorsal anterior cingulate cortex (dACC), anterior insula (AI), and inferior frontal gyrus (IFG). Although production and perception recruit similar areas, as proposed by perception-based accounts, we did not find activation in superior temporal areas (which are typically associated with speech perception) during internal speech monitoring in speech production, as hypothesized by these accounts. On the contrary, the results are highly compatible with a domain-general approach to speech monitoring, by which internal speech monitoring takes place through detection of conflict between response options, which is subsequently resolved by a domain-general executive center (e.g., the ACC). Copyright © 2015 Elsevier Inc. All rights reserved.

  12. Characterizing Articulation in Apraxic Speech Using Real-Time Magnetic Resonance Imaging.

    PubMed

    Hagedorn, Christina; Proctor, Michael; Goldstein, Louis; Wilson, Stephen M; Miller, Bruce; Gorno-Tempini, Maria Luisa; Narayanan, Shrikanth S

    2017-04-14

    Real-time magnetic resonance imaging (MRI) and accompanying analytical methods are shown to capture and quantify salient aspects of apraxic speech, substantiating and expanding upon evidence provided by clinical observation and acoustic and kinematic data. An analysis of apraxic speech errors within a dynamic systems framework is provided, and the nature of the pathomechanisms of apraxic speech is discussed. One adult male speaker with apraxia of speech was imaged using real-time MRI while producing spontaneous speech, repeated naming tasks, and self-paced repetition of word pairs designed to elicit speech errors. Articulatory data were analyzed, and speech errors were detected using time series reflecting articulatory activity in regions of interest. Real-time MRI captured two types of apraxic gestural intrusion errors in a word pair repetition task. Gestural intrusion errors in nonrepetitive speech, multiple silent initiation gestures at the onset of speech, and covert (unphonated) articulation of entire monosyllabic words were also captured. Real-time MRI and accompanying analytical methods capture and quantify many features of apraxic speech that have been previously observed using other modalities while offering high spatial resolution. This patient's apraxia of speech affected the ability to select only the appropriate vocal tract gestures for a target utterance, suppressing others, and to coordinate them in time.

  13. Foot Structure in Japanese Speech Errors: Normal vs. Pathological

    ERIC Educational Resources Information Center

    Miyakoda, Haruko

    2008-01-01

    Although many studies of speech errors have been presented in the literature, most have focused on errors occurring at either the segmental or feature level. Few, if any, studies have dealt with the prosodic structure of errors. This paper aims to fill this gap by taking up the issue of prosodic structure in Japanese speech errors, with a focus on…

  14. The Frame Constraint on Experimentally Elicited Speech Errors in Japanese

    ERIC Educational Resources Information Center

    Saito, Akie; Inoue, Tomoyoshi

    2017-01-01

    The so-called syllable position effect in speech errors has been interpreted as reflecting constraints posed by the frame structure of a given language, which is separately operating from linguistic content during speech production. The effect refers to the phenomenon that when a speech error occurs, replaced and replacing sounds tend to be in the…

  15. Hierarchical singleton-type recurrent neural fuzzy networks for noisy speech recognition.

    PubMed

    Juang, Chia-Feng; Chiou, Chyi-Tian; Lai, Chun-Lung

    2007-05-01

    This paper proposes noisy speech recognition using hierarchical singleton-type recurrent neural fuzzy networks (HSRNFNs). The proposed HSRNFN is a hierarchical connection of two singleton-type recurrent neural fuzzy networks (SRNFNs), where one is used for noise filtering and the other for recognition. The SRNFN is constructed by recurrent fuzzy if-then rules with fuzzy singletons in the consequents, and its recurrent properties make it suitable for processing speech patterns with temporal characteristics. For n-word recognition, n SRNFNs are created, one per word; each SRNFN receives the current frame's features and predicts the next frame of the word it models. The prediction error of each SRNFN is used as the recognition criterion. For filtering, a single SRNFN is created, and each SRNFN recognizer is connected to this same SRNFN filter, which filters noisy speech patterns in the feature domain before they are fed to the recognizer. Experiments with Mandarin word recognition under different types of noise are performed. Other recognizers, including multilayer perceptrons (MLPs), time-delay neural networks (TDNNs), and hidden Markov models (HMMs), are also tested and compared. These experiments and comparisons demonstrate good results with the HSRNFN for noisy speech recognition tasks.
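
    As a rough illustration of the decision rule described above, the sketch below replaces each SRNFN with a generic one-step linear predictor; the fuzzy-rule machinery, the recurrent structure, and the noise-filtering SRNFN are all omitted, and the data are random stand-ins. It is a sketch of the prediction-error criterion only, not of the published system.

      import numpy as np

      class WordModel:
          # Stand-in for one SRNFN word model: a least-squares one-step
          # predictor of the next feature frame. (The actual SRNFN uses
          # recurrent fuzzy if-then rules with singleton consequents.)
          def __init__(self, frames):  # frames: (T, D) feature sequence
              self.W, *_ = np.linalg.lstsq(frames[:-1], frames[1:], rcond=None)

          def prediction_error(self, frames):
              return float(np.mean((frames[1:] - frames[:-1] @ self.W) ** 2))

      def recognize(utterance, models):
          # Decision rule from the abstract: each word model predicts the
          # next frame; the word whose model has the lowest error wins.
          return min(models, key=lambda w: models[w].prediction_error(utterance))

      rng = np.random.default_rng(0)  # toy 8-dimensional feature streams
      train = {"word_a": rng.normal(size=(50, 8)),
               "word_b": rng.normal(size=(50, 8))}
      models = {w: WordModel(f) for w, f in train.items()}
      print(recognize(train["word_a"], models))  # ideally "word_a"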

  16. The effect of talker and intonation variability on speech perception in noise in children with dyslexia

    PubMed Central

    Hazan, Valerie; Messaoud-Galusi, Souhila; Rosen, Stuart

    2013-01-01

    Purpose To determine whether children with dyslexia (DYS) are more affected than age-matched average readers (AR) by talker and intonation variability when perceiving speech in noise. Method Thirty-four DYS and 25 AR children were tested on their perception of consonants in naturally-produced consonant-vowel (CV) tokens in multi-talker babble. Twelve CVs were presented for identification in four conditions varying in the degree of talker and intonation variability. Consonant place (/bi/-/di/) and voicing (/bi/-/pi/) discrimination was investigated with the same conditions. Results DYS children made slightly more identification errors than AR children but only for conditions with variable intonation. Errors were more frequent for a subset of consonants, generally weakly-encoded for AR children, for tokens with intonation patterns (steady and rise-fall) that occur infrequently in connected discourse. In discrimination tasks, which have a greater memory and cognitive load, DYS children scored lower than AR children across all conditions. Conclusions Unusual intonation patterns had a disproportionate (but small) effect on consonant intelligibility in noise for DYS children but adding talker variability did not. DYS children do not appear to have a general problem in perceiving speech in degraded conditions, which makes it unlikely that they lack robust phonological representations. PMID:22761322

  17. The effect of talker and intonation variability on speech perception in noise in children with dyslexia.

    PubMed

    Hazan, Valerie; Messaoud-Galusi, Souhila; Rosen, Stuart

    2013-02-01

    In this study, the authors aimed to determine whether children with dyslexia (hereafter referred to as "DYS children") are more affected than children with average reading ability (hereafter referred to as "AR children") by talker and intonation variability when perceiving speech in noise. Thirty-four DYS and 25 AR children were tested on their perception of consonants in naturally produced CV tokens in multitalker babble. Twelve CVs were presented for identification in four conditions varying in the degree of talker and intonation variability. Consonant place (/bi/-/di/) and voicing (/bi/-/pi/) discrimination were investigated with the same conditions. DYS children made slightly more identification errors than AR children but only for conditions with variable intonation. Errors were more frequent for a subset of consonants, generally weakly encoded for AR children, for tokens with intonation patterns (steady and rise-fall) that occur infrequently in connected discourse. In discrimination tasks, which have a greater memory and cognitive load, DYS children scored lower than AR children across all conditions. Unusual intonation patterns had a disproportionate (but small) effect on consonant intelligibility in noise for DYS children, but adding talker variability did not. DYS children do not appear to have a general problem in perceiving speech in degraded conditions, which makes it unlikely that they lack robust phonological representations.

  18. Speech research: A report on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications

    NASA Astrophysics Data System (ADS)

    Liberman, A. M.

    1980-06-01

    This report (1 April - 30 June) is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. Manuscripts cover the following topics: The perceptual equivalence of two acoustic cues for a speech contrast is specific to phonetic perception; Duplex perception of acoustic patterns as speech and nonspeech; Evidence for phonetic processing of cues to place of articulation: Perceived manner affects perceived place; Some articulatory correlates of perceptual isochrony; Effects of utterance continuity on phonetic judgments; Laryngeal adjustments in stuttering: A glottographic observation using a modified reaction paradigm; Missing -ing in reading: Letter detection errors on word endings; Speaking rate, syllable stress, and vowel identity; Sonority and syllabicity: Acoustic correlates of perception; Influence of vocalic context on perception of the [ʃ]-[s] distinction.

  19. Speed and Accuracy of Rapid Speech Output by Adolescents with Residual Speech Sound Errors Including Rhotics

    ERIC Educational Resources Information Center

    Preston, Jonathan L.; Edwards, Mary Louise

    2009-01-01

    Children with residual speech sound errors are often underserved clinically, yet there has been a lack of recent research elucidating the specific deficits in this population. Adolescents aged 10-14 with residual speech sound errors (RE) that included rhotics were compared to normally speaking peers on tasks assessing speed and accuracy of speech…

  20. The Role of Supralexical Prosodic Units in Speech Production: Evidence from the Distribution of Speech Errors

    ERIC Educational Resources Information Center

    Choe, Wook Kyung

    2013-01-01

    The current dissertation represents one of the first systematic studies of the distribution of speech errors within supralexical prosodic units. Four experiments were conducted to gain insight into the specific role of these units in speech planning and production. The first experiment focused on errors in adult English. These were found to be…

  1. Recognizing Whispered Speech Produced by an Individual with Surgically Reconstructed Larynx Using Articulatory Movement Data

    PubMed Central

    Cao, Beiming; Kim, Myungjong; Mau, Ted; Wang, Jun

    2017-01-01

    Individuals with an impaired larynx (vocal folds) have problems controlling glottal vibration and produce whispered speech with extreme hoarseness. Standard automatic speech recognition using only acoustic cues is typically ineffective for whispered speech because the corresponding spectral characteristics are distorted. Articulatory cues such as tongue and lip motion may help in recognizing whispered speech, since articulatory motion patterns are generally not affected. In this paper, we investigated whispered speech recognition for patients with reconstructed larynx using articulatory movement data. A data set with both acoustic and articulatory motion data was collected from a patient with a surgically reconstructed larynx using an electromagnetic articulograph. Two speech recognition systems, Gaussian mixture model-hidden Markov model (GMM-HMM) and deep neural network-HMM (DNN-HMM), were used in the experiments. Experimental results showed that adding either tongue or lip motion data to acoustic features such as mel-frequency cepstral coefficients (MFCCs) significantly reduced the phone error rates of both speech recognition systems. Adding both tongue and lip data achieved the best performance. PMID:29423453
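
    The fusion step that the abstract credits with the best performance (acoustic plus tongue plus lip features) amounts to frame-level concatenation of the streams before the GMM-HMM or DNN-HMM back end. A minimal sketch, with stream names, dimensions, and alignment strategy as illustrative assumptions rather than details from the paper:

      import numpy as np

      def fuse_features(mfcc, tongue, lips):
          # Frame-level fusion of acoustic and articulatory streams; all
          # streams are assumed to share one frame rate (an assumption).
          T = min(len(mfcc), len(tongue), len(lips))  # align stream lengths
          return np.hstack([mfcc[:T], tongue[:T], lips[:T]])

      # Toy streams: 100 frames of 13-dim MFCCs, 6 tongue-sensor coordinates,
      # and 4 lip-motion measures per frame (shapes are illustrative).
      rng = np.random.default_rng(1)
      mfcc = rng.normal(size=(100, 13))
      tongue = rng.normal(size=(100, 6))
      lips = rng.normal(size=(100, 4))

      features = fuse_features(mfcc, tongue, lips)
      print(features.shape)  # (100, 23) -> input to a GMM-HMM or DNN-HMM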

  2. Evaluation of the importance of time-frequency contributions to speech intelligibility in noise

    PubMed Central

    Yu, Chengzhu; Wójcicki, Kamil K.; Loizou, Philipos C.; Hansen, John H. L.; Johnson, Michael T.

    2014-01-01

    Recent studies on binary masking techniques make the assumption that each time-frequency (T-F) unit contributes an equal amount to the overall intelligibility of speech. The present study demonstrated that the importance of each T-F unit to speech intelligibility varies in accordance with speech content. Specifically, T-F units are categorized into two classes, speech-present T-F units and speech-absent T-F units. Results indicate that the importance of each speech-present T-F unit to speech intelligibility is highly related to the loudness of its target component, while the importance of each speech-absent T-F unit varies according to the loudness of its masker component. Two types of mask errors are also considered: miss and false alarm errors. Consistent with previous work, false alarm errors are shown to be more harmful to speech intelligibility than miss errors when the mixture signal-to-noise ratio (SNR) is below 0 dB. However, the relative importance between the two types of error is conditioned on the SNR level of the input speech signal. Based on these observations, a mask-based objective measure, the loudness-weighted hit-false, is proposed for predicting speech intelligibility. The proposed objective measure shows significantly higher correlation with intelligibility compared to two existing mask-based objective measures. PMID:24815280
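
    The exact form of the proposed loudness-weighted hit-false measure is not given here, so the following sketch only illustrates the general idea under stated assumptions: hits and false alarms are tallied over time-frequency units and weighted by a per-unit loudness proxy before being differenced.

      import numpy as np

      def lw_hit_false(ideal_mask, estimated_mask, loudness):
          # Sketch of a loudness-weighted hit minus false-alarm score.
          # ideal_mask / estimated_mask: binary (F, T) T-F masks.
          # loudness: per-unit weights (F, T), a stand-in for the target or
          # masker loudness terms used in the paper (weighting is assumed).
          hits = (ideal_mask == 1) & (estimated_mask == 1)
          false_alarms = (ideal_mask == 0) & (estimated_mask == 1)
          hit_term = loudness[hits].sum() / max(loudness[ideal_mask == 1].sum(), 1e-12)
          fa_term = loudness[false_alarms].sum() / max(loudness[ideal_mask == 0].sum(), 1e-12)
          return hit_term - fa_term  # higher should track intelligibility

      rng = np.random.default_rng(2)
      ideal = (rng.random((64, 100)) > 0.5).astype(int)
      est = ideal.copy()
      flip = rng.random(ideal.shape) < 0.1   # corrupt 10% of mask units
      est[flip] = 1 - est[flip]
      weights = rng.random(ideal.shape)      # loudness proxy per T-F unit
      print(round(lw_hit_false(ideal, est, weights), 3))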

  3. Factors Influencing Consonant Acquisition in Brazilian Portuguese-Speaking Children

    ERIC Educational Resources Information Center

    Ceron, Marizete Ilha; Gubiani, Marileda Barichello; de Oliveira, Camila Rosa; Keske-Soares, Márcia

    2017-01-01

    Purpose: We sought to provide valid and reliable data on the acquisition of consonant sounds in speakers of Brazilian Portuguese. Method: The sample comprised 733 typically developing monolingual speakers of Brazilian Portuguese (ages 3;0-8;11 [years;months]). The presence of surface speech error patterns, the revised percentage consonants…

  4. Australian children with cleft palate achieve age-appropriate speech by 5 years of age.

    PubMed

    Chacon, Antonia; Parkin, Melissa; Broome, Kate; Purcell, Alison

    2017-12-01

    Children with cleft palate demonstrate atypical speech sound development, which can influence their intelligibility, literacy and learning. There is limited documentation regarding how speech sound errors change over time in cleft palate speech and the effect that these errors have upon mono- versus polysyllabic word production. The objective of this study was to examine the phonetic and phonological speech skills of children with cleft palate at ages 3 and 5. A cross-sectional observational design was used. Eligible participants were aged 3 or 5 years with a repaired cleft palate. The Diagnostic Evaluation of Articulation and Phonology (DEAP) Articulation subtest and a non-standardised list of mono- and polysyllabic words were administered once for each child. The Profile of Phonology (PROPH) was used to analyse each child's speech. N = 51 children with cleft palate participated in the study. Three-year-old children with cleft palate produced significantly more speech errors than their typically-developing peers, but no difference was apparent at 5 years. The 5-year-olds demonstrated greater phonetic and phonological accuracy than the 3-year-old children. Polysyllabic words were more affected by errors than monosyllables in the 3-year-old group only. Children with cleft palate are prone to phonetic and phonological speech errors in their preschool years. By 5 years, most of these children's speech approximates that of typically developing peers. At 3 years, word shape has an influence upon phonological speech accuracy. Speech pathology intervention is indicated to support the intelligibility of these children from their earliest stages of development. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Heft Lemisphere: Exchanges Predominate in Segmental Speech Errors

    ERIC Educational Resources Information Center

    Nooteboom, Sieb G.; Quene, Hugo

    2013-01-01

    In most collections of segmental speech errors, exchanges are less frequent than anticipations and perseverations. However, it has been suggested that in inner speech exchanges might be more frequent than either anticipations or perseverations, because many half-way repaired errors (Yew...uhh...New York) are classified as repaired anticipations,…

  6. Korean speech sound development in children from bilingual Japanese-Korean environments

    PubMed Central

    Kim, Jeoung Suk; Lee, Jun Ho; Choi, Yoon Mi; Kim, Hyun Gi; Kim, Sung Hwan; Lee, Min Kyung

    2010-01-01

    Purpose This study investigates Korean speech sound development, including articulatory error patterns, among the Japanese-Korean children whose mothers are Japanese immigrants to Korea. Methods The subjects were 28 Japanese-Korean children with normal development born to Japanese women immigrants who lived in Jeonbuk province, Korea. They were assessed through Computerized Speech Lab 4500. The control group consisted of 15 Korean children who lived in the same area. Results The values of the voice onset time of consonants /ph/, /t/, /th/, and /k*/ among the children were prolonged. The children replaced the lenis sounds with aspirated or fortis sounds rather than replacing the fortis sounds with lenis or aspirated sounds, which are typical among Japanese immigrants. The children showed numerous articulatory errors for /c/ and /l/ sounds (similar to Koreans) rather than errors on /p/ sounds, which are more frequent among Japanese immigrants. The vowel formants of the children showed a significantly prolonged vowel /o/ as compared to that of Korean children (P<0.05). The Japanese immigrants and their children showed a similar substitution /n/ for /ɧ/ [Japanese immigrants (62.5%) vs Japanese-Korean children (14.3%)], which is rarely seen among Koreans. Conclusion The findings suggest that Korean speech sound development among Japanese-Korean children is influenced not only by the Korean language environment but also by their maternal language. Therefore, appropriate language education programs may be warranted not only for immigrant women but also for their children. PMID:21189968

  7. Supporting Dictation Speech Recognition Error Correction: The Impact of External Information

    ERIC Educational Resources Information Center

    Shi, Yongmei; Zhou, Lina

    2011-01-01

    Although speech recognition technology has made remarkable progress, its wide adoption is still restricted by notable effort made and frustration experienced by users while correcting speech recognition errors. One of the promising ways to improve error correction is by providing user support. Although support mechanisms have been proposed for…

  8. Repeated Speech Errors: Evidence for Learning

    ERIC Educational Resources Information Center

    Humphreys, Karin R.; Menzies, Heather; Lake, Johanna K.

    2010-01-01

    Three experiments elicited phonological speech errors using the SLIP procedure to investigate whether there is a tendency for speech errors on specific words to reoccur, and whether this effect can be attributed to implicit learning of an incorrect mapping from lemma to phonology for that word. In Experiment 1, when speakers made a phonological…

  9. Cascading Influences on the Production of Speech: Evidence from Articulation

    ERIC Educational Resources Information Center

    McMillan, Corey T.; Corley, Martin

    2010-01-01

    Recent investigations have supported the suggestion that phonological speech errors may reflect the simultaneous activation of more than one phonemic representation. This presents a challenge for speech error evidence which is based on the assumption of well-formedness, because we may continue to perceive well-formed errors, even when they are not…

  10. Adult Speakers' Tongue-Palate Contact Patterns for Bilabial Stops within Complex Clusters

    ERIC Educational Resources Information Center

    Zharkova, Natalia; Schaeffler, Sonja; Gibbon, Fiona E.

    2009-01-01

    Previous studies using Electropalatography (EPG) have shown that individuals with speech disorders sometimes produce articulation errors that affect bilabial targets, but currently there is limited normative data available. In this study, EPG and acoustic data were recorded during complex word-final /sps/ clusters spoken by 20 normal adults. A total…

  11. Inhibitory control and the speech patterns of second language users.

    PubMed

    Korko, Malgorzata; Williams, Simon A

    2017-02-01

    Inhibitory control (IC), an ability to suppress irrelevant and/or conflicting information, has been found to underlie performance on a variety of cognitive tasks, including bilingual language processing. This study examines the relationship between IC and the speech patterns of second language (L2) users from the perspective of individual differences. While the majority of studies have supported the role of IC in bilingual language processing using single-word production paradigms, this work looks at inhibitory processes in the context of extended speech, with a particular emphasis on disfluencies. We hypothesized that the speech of individuals with poorer IC would be characterized by reduced fluency. A series of regression analyses, in which we controlled for age and L2 proficiency, revealed that IC (in terms of accuracy on the Stroop task) could reliably predict the occurrence of reformulations and the frequency and duration of silent pauses in L2 speech. No statistically significant relationship was found between IC and other L2 spoken output measures, such as repetitions, filled pauses, and performance errors. Conclusions focus on IC as one out of a number of cognitive functions in the service of spoken language production. A more qualitative approach towards the question of whether L2 speakers rely on IC is advocated. © 2016 The British Psychological Society.

  12. A portable smoking pattern recorder.

    PubMed

    Creighton, D E; Noble, M J; Whewell, R T

    1979-01-01

    An instrument has been developed which can be used to record the smoking patterns of human smokers in almost any location. The smoker is required to smoke the cigarette through an orifice plate cigarette holder connected to the recorder. The smoking pattern data are recorded onto a standard audio cassette as pressure and flow signals together with timing impulses and speech. The instrument is battery powered and can be built into a small brief case. The four channels of data are decoded on a separate instrument, which uses the timing signals to synchronise a data logger, thus making the whole system independent of tape speed errors. The speech channel is used to identify the smoker, cigarette, location, etc. Comparisons have been made of the performance of the portable recorder and a laboratory smoking analyser and data logger. It was found that data decoded from the portable recorder are generally within 1% of the values recorded directly on the laboratory instrument.

  13. Gesture production and comprehension in children with specific language impairment.

    PubMed

    Botting, Nicola; Riches, Nicholas; Gaynor, Marguerite; Morgan, Gary

    2010-03-01

    Children with specific language impairment (SLI) have difficulties with spoken language. However, some recent research suggests that these impairments reflect underlying cognitive limitations. Studying gesture may inform us clinically and theoretically about the nature of the association between language and cognition. A total of 20 children with SLI and 19 typically developing (TD) peers were assessed on a novel measure of gesture production. Children were also assessed for sentence comprehension errors in a speech-gesture integration task. Children with SLI performed equally to peers on gesture production but performed less well when comprehending integrated speech and gesture. Error patterns revealed a significant group interaction: children with SLI made more gesture-based errors, whilst TD children made semantically based ones. Children with SLI accessed and produced lexically encoded gestures despite having impaired spoken vocabulary and this group also showed stronger associations between gesture and language than TD children. When SLI comprehension breaks down, gesture may be relied on over speech, whilst TD children have a preference for spoken cues. The findings suggest that for children with SLI, gesture scaffolds are still more related to language development than for TD peers who have out-grown earlier reliance on gestures. Future clinical implications may include standardized assessment of symbolic gesture and classroom based gesture support for clinical groups.

  14. Error Consistency in Acquired Apraxia of Speech with Aphasia: Effects of the Analysis Unit

    ERIC Educational Resources Information Center

    Haley, Katarina L.; Cunningham, Kevin T.; Eaton, Catherine Torrington; Jacks, Adam

    2018-01-01

    Purpose: Diagnostic recommendations for acquired apraxia of speech (AOS) have been contradictory concerning whether speech sound errors are consistent or variable. Studies have reported divergent findings that, on face value, could argue either for or against error consistency as a diagnostic criterion. The purpose of this study was to explain…

  15. Error Biases in Inner and Overt Speech: Evidence from Tongue Twisters

    ERIC Educational Resources Information Center

    Corley, Martin; Brocklehurst, Paul H.; Moat, H. Susannah

    2011-01-01

    To compare the properties of inner and overt speech, Oppenheim and Dell (2008) counted participants' self-reported speech errors when reciting tongue twisters either overtly or silently and found a bias toward substituting phonemes that resulted in words in both conditions, but a bias toward substituting similar phonemes only when speech was…

  16. Word production inconsistency of Singaporean-English-speaking adolescents with Down Syndrome.

    PubMed

    Wong, Betty; Brebner, Chris; McCormack, Paul; Butcher, Andy

    2015-01-01

    The nature of speech disorders in individuals with Down Syndrome (DS) remains controversial despite various explanations put forth in the literature to account for the observed speech profiles. A high level of word production inconsistency in children with DS has led researchers to query whether the inconsistency continues into adolescence, and if the inconsistency stems from inconsistent phonological disorder (IPD) or childhood apraxia of speech (CAS). Of the studies that have been published, most suggest that the speech profile of individuals with DS is delayed, while a few recent studies suggest a combination of delayed and disordered patterns. However, no studies have explored the nature of word production inconsistency in this population, and the relationship between word production inconsistency, receptive vocabulary and severity of speech disorder. To investigate in a pilot study the extent of word production inconsistency in adolescents with DS and to examine the correlations between word production inconsistency, measures of receptive vocabulary, severity of speech disorder and oromotor skills in adolescents with DS. The participants were 32 native speakers of Singaporean-English adolescents, comprising 16 participants with DS and 16 typically developing (TD) participants. The participants completed a battery of standardized speech and language assessments, including The Diagnostic Evaluation of Articulation and Phonology (DEAP) assessment. Results from each test were correlated to determine relationships. Qualitative analyses were also carried out on all the data collected. In this study, seven out of 16 participants with DS scored above 40% on word production inconsistency, a diagnostic criterion for IPD. In addition, all participants with DS performed poorly on the oromotor assessment of DEAP. The overall speech profile observed did not exactly correspond with the cluster symptoms observed in children with IPD or CAS. Word production inconsistency is a noticeable feature in the speech of individuals with DS. In addition, the speech profiles of individuals with DS consist of atypical and unusual errors alongside developmental errors. Significant correlations were found between the measures investigated, suggesting that speech disorder in DS is multifactorial. The results from this study will help to improve differential diagnosis of speech disorders and individualized treatment plans in the population with DS. © 2015 Royal College of Speech and Language Therapists.

  17. Application of concepts from cross-recurrence analysis in speech production: an overview and comparison with other nonlinear methods.

    PubMed

    Lancia, Leonardo; Fuchs, Susanne; Tiede, Mark

    2014-06-01

    The aim of this article was to introduce an important tool, cross-recurrence analysis, to speech production applications by showing how it can be adapted to evaluate the similarity of multivariate patterns of articulatory motion. The method differs from classical applications of cross-recurrence analysis because no phase space reconstruction is conducted, and a cleaning algorithm removes the artifacts from the recurrence plot. The main features of the proposed approach are robustness to nonstationarity and efficient separation of amplitude variability from temporal variability. The authors tested these claims by applying their method to synthetic stimuli whose variability had been carefully controlled. The proposed method was also demonstrated in a practical application: It was used to investigate the role of biomechanical constraints in articulatory reorganization as a consequence of speeded repetition of CVCV utterances containing a labial and a coronal consonant. Overall, the proposed approach provided more reliable results than other methods, particularly in the presence of high variability. The proposed method is a useful and appropriate tool for quantifying similarity and dissimilarity in patterns of speech articulator movement, especially in such research areas as speech errors and pathologies, where unpredictable divergent behavior is expected.
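
    The core of a cross-recurrence analysis, computing a binary plot of which time points of one multivariate trajectory fall within a radius of the time points of another, can be sketched as follows. Consistent with the abstract, no phase-space reconstruction is performed, though the paper's artifact-cleaning algorithm is not reproduced here.

      import numpy as np

      def cross_recurrence_plot(x, y, radius):
          # Binary cross-recurrence plot for trajectories x: (Tx, D) and
          # y: (Ty, D). Entry (i, j) is 1 when x[i] and y[j] lie within
          # `radius` of each other (Euclidean distance).
          d = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
          return (d <= radius).astype(int)

      # Toy example: two noisy repetitions of the same articulatory gesture.
      t = np.linspace(0, 2 * np.pi, 120)
      traj_a = np.column_stack([np.sin(t), np.cos(t)])
      traj_b = traj_a + np.random.default_rng(3).normal(scale=0.05,
                                                        size=traj_a.shape)

      crp = cross_recurrence_plot(traj_a, traj_b, radius=0.2)
      print(crp.shape, crp.mean().round(3))  # diagonal band = similar timing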

  18. Syntactic error modeling and scoring normalization in speech recognition

    NASA Technical Reports Server (NTRS)

    Olorenshaw, Lex

    1991-01-01

    The objective was to develop a speech recognition system able to detect speech that is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. Research was performed in the following areas: (1) syntactic error modeling; (2) score normalization; and (3) phoneme error modeling. The study of the types of errors that a reader makes will provide the basis for creating tests that approximate real-world use of the system. NASA-Johnson will develop this technology into a 'Literacy Tutor' to bring innovative concepts to the task of teaching adults to read.

  19. Differentiating primary progressive aphasias in a brief sample of connected speech

    PubMed Central

    Evans, Emily; O'Shea, Jessica; Powers, John; Boller, Ashley; Weinberg, Danielle; Haley, Jenna; McMillan, Corey; Irwin, David J.; Rascovsky, Katya; Grossman, Murray

    2013-01-01

    Objective: A brief speech expression protocol that can be administered and scored without special training would aid in the differential diagnosis of the 3 principal forms of primary progressive aphasia (PPA): nonfluent/agrammatic PPA, logopenic variant PPA, and semantic variant PPA. Methods: We used a picture-description task to elicit a short speech sample, and we evaluated impairments in speech-sound production, speech rate, lexical retrieval, and grammaticality. We compared the results with those obtained by a longer, previously validated protocol and further validated performance with multimodal imaging to assess the neuroanatomical basis of the deficits. Results: We found different patterns of impaired grammar in each PPA variant, and additional language production features were impaired in each: nonfluent/agrammatic PPA was characterized by speech-sound errors; logopenic variant PPA by dysfluencies (false starts and hesitations); and semantic variant PPA by poor retrieval of nouns. Strong correlations were found between this brief speech sample and a lengthier narrative speech sample. A composite measure of grammaticality and other measures of speech production were correlated with distinct regions of gray matter atrophy and reduced white matter fractional anisotropy in each PPA variant. Conclusions: These findings provide evidence that large-scale networks are required for fluent, grammatical expression; that these networks can be selectively disrupted in PPA syndromes; and that quantitative analysis of a brief speech sample can reveal the corresponding distinct speech characteristics. PMID:23794681

  20. Speech Errors across the Lifespan

    ERIC Educational Resources Information Center

    Vousden, Janet I.; Maylor, Elizabeth A.

    2006-01-01

    Dell, Burger, and Svec (1997) proposed that the proportion of speech errors classified as anticipations (e.g., "moot and mouth") can be predicted solely from the overall error rate, such that the greater the error rate, the lower the anticipatory proportion (AP) of errors. We report a study examining whether this effect applies to changes in error…

  1. A mathematical model of medial consonant identification by cochlear implant users.

    PubMed

    Svirsky, Mario A; Sagi, Elad; Meyer, Ted A; Kaiser, Adam R; Teoh, Su Wooi

    2011-04-01

    The multidimensional phoneme identification model is applied to consonant confusion matrices obtained from 28 postlingually deafened cochlear implant users. This model predicts consonant matrices based on these subjects' ability to discriminate a set of postulated spectral, temporal, and amplitude speech cues as presented to them by their device. The model produced confusion matrices that matched many aspects of individual subjects' consonant matrices, including information transfer for the voicing, manner, and place features, despite individual differences in age at implantation, implant experience, device and stimulation strategy used, as well as overall consonant identification level. The model was able to match the general pattern of errors between consonants, but not the full complexity of all consonant errors made by each individual. The present study represents an important first step in developing a model that can be used to test specific hypotheses about the mechanisms cochlear implant users employ to understand speech.

  2. A mathematical model of medial consonant identification by cochlear implant users

    PubMed Central

    Svirsky, Mario A.; Sagi, Elad; Meyer, Ted A.; Kaiser, Adam R.; Teoh, Su Wooi

    2011-01-01

    The multidimensional phoneme identification model is applied to consonant confusion matrices obtained from 28 postlingually deafened cochlear implant users. This model predicts consonant matrices based on these subjects’ ability to discriminate a set of postulated spectral, temporal, and amplitude speech cues as presented to them by their device. The model produced confusion matrices that matched many aspects of individual subjects’ consonant matrices, including information transfer for the voicing, manner, and place features, despite individual differences in age at implantation, implant experience, device and stimulation strategy used, as well as overall consonant identification level. The model was able to match the general pattern of errors between consonants, but not the full complexity of all consonant errors made by each individual. The present study represents an important first step in developing a model that can be used to test specific hypotheses about the mechanisms cochlear implant users employ to understand speech. PMID:21476674
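
    Both records above score the model partly by information transfer for the voicing, manner, and place features. A feature-level information transfer in the Miller and Nicely tradition can be computed from a confusion matrix as sketched below; the toy matrix and feature grouping are illustrative, not the study's data.

      import numpy as np

      def information_transfer(confusions):
          # Relative information transfer from a confusion matrix
          # (rows = stimuli, columns = responses): mutual information
          # normalized by stimulus entropy, so 1.0 = perfect transmission.
          p = confusions / confusions.sum()
          ps, pr = p.sum(axis=1), p.sum(axis=0)
          mask = p > 0
          mi = np.sum(p[mask] * np.log2(p[mask] / np.outer(ps, pr)[mask]))
          h_stim = -np.sum(ps[ps > 0] * np.log2(ps[ps > 0]))
          return mi / h_stim

      def feature_matrix(confusions, groups):
          # Collapse a phoneme confusion matrix onto a feature (e.g.,
          # voicing) by pooling the phonemes sharing each feature value.
          k = len(groups)
          out = np.zeros((k, k))
          for i, gi in enumerate(groups):
              for j, gj in enumerate(groups):
                  out[i, j] = confusions[np.ix_(gi, gj)].sum()
          return out

      # Toy 4-consonant matrix; phonemes 0-1 voiceless, 2-3 voiced.
      C = np.array([[30, 8, 1, 1], [7, 31, 1, 1],
                    [1, 1, 28, 10], [2, 1, 9, 28]])
      voicing = feature_matrix(C, [[0, 1], [2, 3]])
      print(round(information_transfer(voicing), 3))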

  3. The Influence of Psycholinguistic Variables on Articulatory Errors in Naming in Progressive Motor Speech Degeneration

    ERIC Educational Resources Information Center

    Code, Chris; Tree, Jeremy; Ball, Martin

    2011-01-01

    We describe an analysis of speech errors on a confrontation naming task in a man with progressive speech degeneration of 10-year duration from Pick's disease. C.S. had a progressive non-fluent aphasia together with a motor speech impairment and early assessment indicated some naming impairments. There was also an absence of significant…

  4. Does the cost function matter in Bayes decision rule?

    PubMed

    Schlüter, Ralf; Nussbaum-Thom, Markus; Ney, Hermann

    2012-02-01

    In many tasks in pattern recognition, such as automatic speech recognition (ASR), optical character recognition (OCR), part-of-speech (POS) tagging, and other string recognition tasks, we are faced with a well-known inconsistency: The Bayes decision rule is usually used to minimize string (symbol sequence) error, whereas, in practice, we want to minimize symbol (word, character, tag, etc.) error. When comparing different recognition systems, we do indeed use symbol error rate as an evaluation measure. The topic of this work is to analyze the relation between string (i.e., 0-1) and symbol (i.e., metric, integer-valued) error cost functions in the Bayes decision rule, for which fundamental analytic results are derived. Simple conditions are derived under which the Bayes decision rule with an integer-valued metric cost function and with 0-1 cost gives the same decisions or leads to classes with limited cost. The corresponding conditions can be tested with complexity linear in the number of classes. The results obtained make no assumptions about the structure of the underlying distributions or the classification problem. Nevertheless, the general analytic results are analyzed via simulations of string recognition problems with a Levenshtein (edit) distance cost function. The results support earlier findings that considerable improvements are to be expected when initial error rates are high.
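
    The inconsistency analyzed above can be made concrete with a toy posterior over candidate word strings: the 0-1 cost selects the maximum a posteriori string, while an integer-valued Levenshtein cost selects the string with minimum expected edit distance, and the two can disagree. A minimal sketch (candidates and probabilities are made up):

      def levenshtein(a, b):
          # Edit distance between two word sequences (the metric cost).
          m, n = len(a), len(b)
          d = [[0] * (n + 1) for _ in range(m + 1)]
          for i in range(m + 1): d[i][0] = i
          for j in range(n + 1): d[0][j] = j
          for i in range(1, m + 1):
              for j in range(1, n + 1):
                  d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                                d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
          return d[m][n]

      # Made-up posterior over candidate word strings.
      posterior = {
          ("recognize", "speech"): 0.4,
          ("wreck", "a", "nice", "beach"): 0.3,
          ("wreck", "a", "nice", "peach"): 0.3,
      }

      # 0-1 cost: minimize string error -> maximum a posteriori string.
      map_rule = max(posterior, key=posterior.get)

      # Metric cost: minimize expected edit distance over the posterior.
      bayes_risk = {h: sum(p * levenshtein(h, r) for r, p in posterior.items())
                    for h in posterior}
      min_risk = min(bayes_risk, key=bayes_risk.get)

      print("0-1 cost picks: ", " ".join(map_rule))   # the MAP string
      print("edit cost picks:", " ".join(min_risk))   # a different string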

  5. Design of a robust baseband LPC coder for speech transmission over 9.6 kbit/s noisy channels

    NASA Astrophysics Data System (ADS)

    Viswanathan, V. R.; Russell, W. H.; Higgins, A. L.

    1982-04-01

    This paper describes the design of a baseband Linear Predictive Coder (LPC) which transmits speech over 9.6 kbit/s synchronous channels with random bit errors of up to 1%. We present the results of an investigation of a number of aspects of the baseband LPC coder aimed at maximizing the quality of the transmitted speech. Important among these aspects are: bandwidth of the baseband, coding of the baseband residual, high-frequency regeneration, and error protection of important transmission parameters. The paper discusses these and other issues, presents the results of speech-quality tests conducted during the various stages of optimization, and describes the details of the optimized speech coder. This optimized speech coding algorithm has been implemented as a real-time full-duplex system on an array processor. Informal listening tests of the real-time coder have shown that the coder produces good speech quality in the absence of channel bit errors and introduces only a slight degradation in quality for channel bit error rates of up to 1%.
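
    The analysis step at the heart of any LPC coder, estimating prediction coefficients for each frame, can be sketched with the standard autocorrelation method and Levinson-Durbin recursion; the frame length, order, and signal below are illustrative choices, not parameters from this design.

      import numpy as np

      def lpc(frame, order):
          # LPC via the autocorrelation method and Levinson-Durbin.
          # Returns coefficients a[1..order] with s[n] ~= sum_k a_k s[n-k],
          # plus the residual prediction error power.
          N = len(frame)
          r = np.correlate(frame, frame, mode="full")[N - 1:N + order]
          a = np.zeros(order)
          err = r[0]
          for i in range(order):
              acc = r[i + 1] - np.dot(a[:i], r[i:0:-1])
              k = acc / err                 # reflection coefficient
              prev = a[:i].copy()
              a[:i] = prev - k * prev[::-1] # update lower-order coefficients
              a[i] = k
              err *= (1.0 - k * k)          # shrink residual error power
          return a, err

      # Toy frame: a decaying 500 Hz resonance sampled at 8 kHz.
      t = np.arange(240) / 8000.0
      frame = np.exp(-40 * t) * np.sin(2 * np.pi * 500 * t)
      coeffs, res_power = lpc(frame, order=10)
      print(coeffs.round(3), round(float(res_power), 6))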

  6. Syntactic and semantic errors in radiology reports associated with speech recognition software.

    PubMed

    Ringler, Michael D; Goss, Brian C; Bartholmai, Brian J

    2017-03-01

    Speech recognition software can increase the frequency of errors in radiology reports, which may affect patient care. We retrieved 213,977 speech recognition software-generated reports from 147 different radiologists and proofread them for errors. Errors were classified as "material" if they were believed to alter interpretation of the report. "Immaterial" errors were subclassified as intrusion/omission or spelling errors. The proportion and type of errors were compared among individual radiologists, imaging subspecialties, and time periods. In all, 20,759 reports (9.7%) contained errors, of which 3992 (1.9%) were material errors. Among immaterial errors, spelling errors were more common than intrusion/omission errors (p < .001). The proportion of errors and the fraction of material errors varied significantly among radiologists and between imaging subspecialties (p < .001). Errors were more common in cross-sectional reports, reports reinterpreting results of outside examinations, and procedural studies (all p < .001). The error rate decreased over time (p < .001), which suggests that a quality control program with regular feedback may reduce errors.

  7. Analysis of error type and frequency in apraxia of speech among Portuguese speakers.

    PubMed

    Cera, Maysa Luchesi; Minett, Thaís Soares Cianciarullo; Ortiz, Karin Zazo

    2010-01-01

    Most studies characterizing errors in the speech of patients with apraxia involve the English language. To analyze the types and frequency of errors produced by patients with apraxia of speech whose mother tongue was Brazilian Portuguese. 20 adults with apraxia of speech caused by stroke were assessed. The types of error committed by patients were analyzed both quantitatively and qualitatively, and their frequencies compared. We observed the presence of substitution, omission, trial-and-error, repetition, self-correction, anticipation, addition, reiteration and metathesis errors, in descending order of frequency. Omission errors were among the most commonly occurring, whereas addition errors were infrequent. These findings differed from those reported for English-speaking patients, probably owing to differences in the methodologies used for classifying error types, the inclusion of speakers with apraxia secondary to aphasia, and differences between the structure of Portuguese and English in terms of syllable-onset complexity and its effect on motor control. The frequency of omission and addition errors observed differed from the frequencies reported for speakers of English.

  8. Multilayer perceptron, fuzzy sets, and classification

    NASA Technical Reports Server (NTRS)

    Pal, Sankar K.; Mitra, Sushmita

    1992-01-01

    A fuzzy neural network model based on the multilayer perceptron, using the back-propagation algorithm, and capable of fuzzy classification of patterns is described. The input vector consists of membership values to linguistic properties while the output vector is defined in terms of fuzzy class membership values. This allows efficient modeling of fuzzy or uncertain patterns with appropriate weights being assigned to the backpropagated errors depending upon the membership values at the corresponding outputs. During training, the learning rate is gradually decreased in discrete steps until the network converges to a minimum error solution. The effectiveness of the algorithm is demonstrated on a speech recognition problem. The results are compared with those of the conventional MLP, the Bayes classifier, and the other related models.
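
    A minimal sketch of the scheme described above: inputs are membership values to linguistic properties, targets are fuzzy class memberships, and the backpropagated error at each output is weighted by the desired membership there. The network size, data, and exact weighting are stand-ins, not the paper's π-function formulation.

      import numpy as np

      rng = np.random.default_rng(4)
      sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

      # Inputs: memberships of a feature to {low, medium, high} properties
      # (stand-ins). Targets: fuzzy memberships to two classes (toy labels).
      X = rng.random((200, 3))
      Y = np.column_stack([0.9 * X[:, 0] + 0.05, 0.95 - 0.9 * X[:, 0]])

      W1 = rng.normal(scale=0.5, size=(3, 8)); b1 = np.zeros(8)
      W2 = rng.normal(scale=0.5, size=(8, 2)); b2 = np.zeros(2)

      for epoch in range(2000):
          H = sigmoid(X @ W1 + b1)           # hidden layer
          O = sigmoid(H @ W2 + b2)           # fuzzy class memberships
          # Membership-weighted error: outputs with higher desired
          # membership contribute more to the backpropagated gradient.
          dO = (O - Y) * Y * O * (1 - O)
          dH = (dO @ W2.T) * H * (1 - H)
          W2 -= 0.1 * H.T @ dO / len(X); b2 -= 0.1 * dO.mean(axis=0)
          W1 -= 0.1 * X.T @ dH / len(X); b1 -= 0.1 * dH.mean(axis=0)

      print(np.abs(O - Y).mean().round(3))   # mean absolute membership error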

  9. Acoustic evidence for phonologically mismatched speech errors.

    PubMed

    Gormley, Andrea

    2015-04-01

    Speech errors are generally said to accommodate to their new phonological context. This accommodation has been validated by several transcription studies. The transcription methodology is not the best choice for detecting errors at this level, however, as this type of error can be difficult to perceive. This paper presents an acoustic analysis of speech errors that uncovers non-accommodated or mismatch errors. A mismatch error is a sub-phonemic error that results in an incorrect surface phonology. Such errors could arise during the processing of phonological rules, or they could be made at the motor level of implementation. The results of this work have important implications for both experimental and theoretical research. For experimentalists, it validates the tools used for error induction and the acoustic determination of errors free of perceptual bias. For theorists, this methodology can be used to test the nature of the processes proposed in language production.

  10. Inducing Speech Errors in Dysarthria Using Tongue Twisters

    ERIC Educational Resources Information Center

    Kember, Heather; Connaghan, Kathryn; Patel, Rupal

    2017-01-01

    Although tongue twisters have been widely used to study speech production in healthy speakers, few studies have employed this methodology for individuals with speech impairment. The present study compared tongue twister errors produced by adults with dysarthria and age-matched healthy controls. Eight speakers (four female, four male; mean age =…

  11. Acoustic Evidence for Phonologically Mismatched Speech Errors

    ERIC Educational Resources Information Center

    Gormley, Andrea

    2015-01-01

    Speech errors are generally said to accommodate to their new phonological context. This accommodation has been validated by several transcription studies. The transcription methodology is not the best choice for detecting errors at this level, however, as this type of error can be difficult to perceive. This paper presents an acoustic analysis of…

  12. Syntactic error modeling and scoring normalization in speech recognition: Error modeling and scoring normalization in the speech recognition task for adult literacy training

    NASA Technical Reports Server (NTRS)

    Olorenshaw, Lex; Trawick, David

    1991-01-01

    The purpose was to develop a speech recognition system able to detect speech that is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. This provides better mechanisms for using speech recognition in a literacy-tutor application. Using a combination of scoring-normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was established. In continuous speech, the system correctly accepted more than 80% of words while correctly rejecting more than 80% of incorrectly pronounced words.
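
    A minimal Python sketch of the accept/reject decision implied above: normalize each word's alignment score by its duration and compare it to a single threshold. The function name, score values, and threshold are invented for illustration, not taken from the system described.

      def accept_word(log_likelihood, n_frames, threshold=-2.5):
          """Accept a word if its per-frame (duration-normalized) alignment
          score clears the threshold; reject it as mispronounced otherwise."""
          return (log_likelihood / n_frames) >= threshold

      # Example: a well-matched word vs. a poorly matched (mispronounced) one.
      print(accept_word(-180.0, 90))   # -2.0 per frame -> True  (accept)
      print(accept_word(-350.0, 90))   # -3.9 per frame -> False (reject)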

  13. Phonologic errors as a clinical marker of the logopenic variant of PPA.

    PubMed

    Leyton, Cristian E; Ballard, Kirrie J; Piguet, Olivier; Hodges, John R

    2014-05-06

    To disentangle the clinical heterogeneity of nonsemantic variants of primary progressive aphasia (PPA) and to identify a coherent linguistic-anatomical marker for the logopenic variant of PPA (lv-PPA). Key speech and language features of 14 cases of lv-PPA and 18 cases of nonfluent/agrammatic variant of PPA were systematically evaluated and scored by an independent rater blinded to diagnosis. Every case underwent a structural MRI and a Pittsburgh compound B (PiB)-PET scan, a putative biomarker of Alzheimer disease. Key speech and language features that showed association with the PiB-PET status were entered into a hierarchical cluster analysis. The linguistic features and patterns of cortical thinning in each resultant cluster were analyzed. The cluster analysis revealed 3 coherent clinical groups, each of which was linked to a specific PiB-PET status. The first cluster was linked to high PiB retention and characterized by phonologic errors and cortical thinning focused on the left superior temporal gyrus. The second and third clusters were characterized by grammatical production errors and motor speech disorders, respectively, and were associated with low PiB brain retention. A fourth cluster, however, demonstrated nonspecific language deficits and unpredictable PiB-PET status. These findings suggest that despite the clinical and pathologic heterogeneity of nonsemantic variants, discrete clinical syndromes can be distinguished and linked to a specific likelihood of PiB-PET status. Phonologic errors seem to be highly predictive of high amyloid burden in PPA and can provide a specific clinical marker for lv-PPA.

  14. Intelligibility assessment in developmental phonological disorders: accuracy of caregiver gloss.

    PubMed

    Kwiatkowski, J; Shriberg, L D

    1992-10-01

    Fifteen caregivers each glossed a simultaneously videotaped and audiotaped sample of their child with speech delay engaged in conversation with a clinician. One of the authors generated a reference gloss for each sample, aided by (a) prior knowledge of the child's speech-language status and error patterns, (b) glosses from the child's clinician and the child's caregiver, (c) unlimited replays of the taped sample, and (d) the information gained from completing a narrow phonetic transcription of the sample. Caregivers glossed an average of 78% of the utterances and 81% of the words. A comparison of their glosses to the reference glosses suggested that they accurately understood an average of 58% of the utterances and 73% of the words. Discussion considers the implications of such findings for methodological and theoretical issues underlying children's moment-to-moment intelligibility breakdowns during speech-language processing.

  15. Structure and Processing in Tunisian Arabic: Speech Error Data

    ERIC Educational Resources Information Center

    Hamrouni, Nadia

    2010-01-01

    This dissertation presents experimental research on speech errors in Tunisian Arabic. The nonconcatenative morphology of Arabic shows interesting interactions of phrasal and lexical constraints with morphological structure during language production. The central empirical questions revolve around properties of "exchange errors". These…

  16. Is comprehension necessary for error detection? A conflict-based account of monitoring in speech production

    PubMed Central

    Nozari, Nazbanou; Dell, Gary S.; Schwartz, Myrna F.

    2011-01-01

    Despite the existence of speech errors, verbal communication is successful because speakers can detect (and correct) their errors. The standard theory of speech-error detection, the perceptual-loop account, posits that the comprehension system monitors production output for errors. Such a comprehension-based monitor, however, cannot explain the double dissociation between comprehension and error-detection ability observed in aphasic patients. We propose a new theory of speech-error detection which is instead based on the production process itself. The theory borrows from studies of forced-choice-response tasks the notion that error detection is accomplished by monitoring response conflict via a frontal brain structure, such as the anterior cingulate cortex. We adapt this idea to the two-step model of word production, and test the model-derived predictions on a sample of aphasic patients. Our results show a strong correlation between patients’ error-detection ability and the model’s characterization of their production skills, and no significant correlation between error detection and comprehension measures, thus supporting a production-based monitor generally, and the implemented conflict-based monitor in particular. The successful application of the conflict-based theory to error detection in linguistic as well as non-linguistic domains points to a domain-general monitoring system. PMID:21652015

  17. Modelling Errors in Automatic Speech Recognition for Dysarthric Speakers

    NASA Astrophysics Data System (ADS)

    Caballero Morales, Santiago Omar; Cox, Stephen J.

    2009-12-01

    Dysarthria is a motor speech disorder characterized by weakness, paralysis, or poor coordination of the muscles responsible for speech. Although automatic speech recognition (ASR) systems have been developed for disordered speech, factors such as low intelligibility and limited phonemic repertoire decrease speech recognition accuracy, making conventional speaker adaptation algorithms perform poorly on dysarthric speakers. In this work, rather than adapting the acoustic models, we model the errors made by the speaker and attempt to correct them. For this task, two techniques have been developed: (1) a set of "metamodels" that incorporate a model of the speaker's phonetic confusion matrix into the ASR process; (2) a cascade of weighted finite-state transducers at the confusion matrix, word, and language levels. Both techniques attempt to correct the errors made at the phonetic level and make use of a language model to find the best estimate of the correct word sequence. Our experiments show that both techniques outperform standard adaptation techniques.
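
    A toy Python sketch of the first idea (scoring candidate words through a phoneme confusion matrix plus a language-model prior) follows. The two-word lexicon, probabilities, and same-length phone alignment are invented simplifications; the actual systems operate on full confusion-matrix "metamodels" or WFST cascades rather than on aligned strings.

      # Recover the intended word from an errorful phone decoding by combining
      # a speaker-specific confusion model with a language-model prior.
      LEXICON = {"bat": ["b", "ae", "t"], "pat": ["p", "ae", "t"]}
      PRIOR = {"bat": 0.9, "pat": 0.1}           # context favors "bat"
      # P(observed phone | intended phone): this speaker often devoices /b/.
      CONFUSION = {("b", "b"): 0.6, ("b", "p"): 0.4,
                   ("p", "p"): 0.9, ("p", "b"): 0.1,
                   ("ae", "ae"): 1.0, ("t", "t"): 1.0}

      def best_word(observed):
          scores = {}
          for word, pron in LEXICON.items():
              p = PRIOR[word]
              for intended, seen in zip(pron, observed):
                  p *= CONFUSION.get((intended, seen), 1e-6)
              scores[word] = p
          return max(scores, key=scores.get)

      # The decoder heard "p ae t", but the error model recovers "bat".
      print(best_word(["p", "ae", "t"]))         # -> bat (0.36 vs. 0.09)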

  18. Reduced Performance During a Sentence Repetition Task by Continuous Theta-Burst Magnetic Stimulation of the Pre-supplementary Motor Area.

    PubMed

    Dietrich, Susanne; Hertrich, Ingo; Müller-Dahlhaus, Florian; Ackermann, Hermann; Belardinelli, Paolo; Desideri, Debora; Seibold, Verena C; Ziemann, Ulf

    2018-01-01

    The pre-supplementary motor area (pre-SMA) is engaged in speech comprehension under difficult circumstances such as poor acoustic signal quality or time-critical conditions. Previous studies found that left pre-SMA is activated when subjects listen to accelerated speech. Here, the functional role of pre-SMA was tested for accelerated speech comprehension by inducing a transient "virtual lesion" using continuous theta-burst stimulation (cTBS). Participants were tested (1) prior to (pre-baseline), (2) 10 min after (test condition for the cTBS effect), and (3) 60 min after stimulation (post-baseline) using a sentence repetition task (formant-synthesized at rates of 8, 10, 12, 14, and 16 syllables/s). Speech comprehension was quantified by the percentage of correctly reproduced speech material. For high speech rates, subjects showed decreased performance after cTBS of pre-SMA. Regarding the error pattern, the number of incorrect words without any semantic or phonological similarity to the target context increased, while related words decreased. Thus, the transient impairment of pre-SMA seems to affect its inhibitory function that normally eliminates erroneous speech material prior to speaking or, in case of perception, prior to encoding into a semantically/pragmatically meaningful message.

  19. Reduced Performance During a Sentence Repetition Task by Continuous Theta-Burst Magnetic Stimulation of the Pre-supplementary Motor Area

    PubMed Central

    Dietrich, Susanne; Hertrich, Ingo; Müller-Dahlhaus, Florian; Ackermann, Hermann; Belardinelli, Paolo; Desideri, Debora; Seibold, Verena C.; Ziemann, Ulf

    2018-01-01

    The pre-supplementary motor area (pre-SMA) is engaged in speech comprehension under difficult circumstances such as poor acoustic signal quality or time-critical conditions. Previous studies found that left pre-SMA is activated when subjects listen to accelerated speech. Here, the functional role of pre-SMA was tested for accelerated speech comprehension by inducing a transient “virtual lesion” using continuous theta-burst stimulation (cTBS). Participants were tested (1) prior to (pre-baseline), (2) 10 min after (test condition for the cTBS effect), and (3) 60 min after stimulation (post-baseline) using a sentence repetition task (formant-synthesized at rates of 8, 10, 12, 14, and 16 syllables/s). Speech comprehension was quantified by the percentage of correctly reproduced speech material. For high speech rates, subjects showed decreased performance after cTBS of pre-SMA. Regarding the error pattern, the number of incorrect words without any semantic or phonological similarity to the target context increased, while related words decreased. Thus, the transient impairment of pre-SMA seems to affect its inhibitory function that normally eliminates erroneous speech material prior to speaking or, in case of perception, prior to encoding into a semantically/pragmatically meaningful message. PMID:29896086

  20. A posteriori error estimates in voice source recovery

    NASA Astrophysics Data System (ADS)

    Leonov, A. S.; Sorokin, V. N.

    2017-12-01

    The inverse problem of voice source pulse recovery from a segment of a speech signal is considered. A special mathematical model relating these quantities is used for the solution. A variational method is proposed for solving the inverse problem of voice source recovery for a new parametric class of sources, piecewise-linear sources (PWL-sources), together with a technique for a posteriori numerical error estimation for the obtained solutions. A computer study of the adequacy of the adopted speech production model with PWL-sources is performed by solving the inverse problem for various types of voice signals, along with a corresponding study of the a posteriori error estimates. Numerical experiments on speech signals show satisfactory properties of the proposed a posteriori error estimates, which represent upper bounds on the possible errors in solving the inverse problem. The estimated most probable error in determining the source-pulse shapes is about 7-8% for the investigated speech material. A posteriori error estimates of this kind can also serve as a quality criterion for the recovered voice source pulses in speaker-recognition applications.

  1. Speech Characteristics and Intelligibility in Adults with Mild and Moderate Intellectual Disabilities

    PubMed Central

    Coppens-Hofman, Marjolein C.; Terband, Hayo; Snik, Ad F.M.; Maassen, Ben A.M.

    2017-01-01

    Purpose Adults with intellectual disabilities (ID) often show reduced speech intelligibility, which affects their social interaction skills. This study aims to establish the main predictors of this reduced intelligibility in order to ultimately optimise management. Method Spontaneous speech and picture naming tasks were recorded in 36 adults with mild or moderate ID. Twenty-five naïve listeners rated the intelligibility of the spontaneous speech samples. Performance on the picture-naming task was analysed by means of a phonological error analysis based on expert transcriptions. Results The transcription analyses showed that the phonemic and syllabic inventories of the speakers were complete. However, multiple errors at the phonemic and syllabic level were found. The frequencies of specific types of errors were related to intelligibility and quality ratings. Conclusions The development of the phonemic and syllabic repertoire appears to be completed in adults with mild-to-moderate ID. The charted speech difficulties can be interpreted to indicate speech motor control and planning difficulties. These findings may aid the development of diagnostic tests and speech therapies aimed at improving speech intelligibility in this specific group. PMID:28118637

  2. Down's syndrome and the acquisition of phonology by Cantonese-speaking children.

    PubMed

    So, L K; Dodd, B J

    1994-10-01

    The phonological abilities of two groups of 4-9-year-old intellectually impaired Cantonese-speaking children are described. Children with Down's syndrome did not differ from matched non-Down's syndrome controls in terms of a lexical comprehension measure, the size of their phoneme repertoires, the range of sounds affected by articulatory imprecision, or the number of consonants, vowels or tones produced in error. However, the types of errors made by the Down's syndrome children were different from those made by the control subjects. Cantonese-speaking children with Down's syndrome, as compared with controls, made a greater number of inconsistent errors, were more likely to produce non-developmental errors and were better in imitation than in spontaneous production. Despite extensive differences between the phonological structures of Cantonese and English, children with Down's syndrome acquiring these languages show the same characteristic pattern of speech errors. One unexpected finding was that the control group of non-Down's syndrome children failed to present with delayed phonological development typically reported for their English-speaking counterparts. The argument made is that cross-linguistic studies of intellectually impaired children's language acquisition provide evidence concerning language-specific characteristics of impairment, as opposed to those characteristics that, remaining constant across languages, are an integral part of the disorder. The results reported here support the hypothesis that the speech disorder typically associated with Down's syndrome arises from impaired phonological planning, i.e. a cognitive linguistic deficit.

  3. Mimicking Aphasic Semantic Errors in Normal Speech Production: Evidence from a Novel Experimental Paradigm

    ERIC Educational Resources Information Center

    Hodgson, Catherine; Lambon Ralph, Matthew A.

    2008-01-01

    Semantic errors are commonly found in semantic dementia (SD) and some forms of stroke aphasia and provide insights into semantic processing and speech production. Low error rates are found in standard picture naming tasks in normal controls. In order to increase error rates and thus provide an experimental model of aphasic performance, this study…

  4. Phonological Awareness and Types of Sound Errors in Preschoolers with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Preston, Jonathan; Edwards, Mary Louise

    2010-01-01

    Purpose: Some children with speech sound disorders (SSD) have difficulty with literacy-related skills, particularly phonological awareness (PA). This study investigates the PA skills of preschoolers with SSD by using a regression model to evaluate the degree to which PA can be concurrently predicted by types of speech sound errors. Method:…

  5. Leveraging Automatic Speech Recognition Errors to Detect Challenging Speech Segments in TED Talks

    ERIC Educational Resources Information Center

    Mirzaei, Maryam Sadat; Meshgi, Kourosh; Kawahara, Tatsuya

    2016-01-01

    This study investigates the use of Automatic Speech Recognition (ASR) systems to epitomize second language (L2) listeners' problems in perception of TED talks. ASR-generated transcripts of videos often involve recognition errors, which may indicate difficult segments for L2 listeners. This paper aims to discover the root-causes of the ASR errors…

  6. The Structure of Segmental Errors in the Speech of Deaf Children.

    ERIC Educational Resources Information Center

    Levitt, H.; And Others

    1980-01-01

    A quantitative description of the segmental errors occurring in the speech of deaf children is developed. Journal availability: Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY 10017. (Author)

  7. Control of Task Sequences: What is the Role of Language?

    PubMed Central

    Mayr, Ulrich; Kleffner, Killian; Kikumoto, Atsushi; Redford, Melissa A.

    2015-01-01

    It is almost a truism that language aids serial-order control through self-cuing of upcoming sequential elements. We measured speech onset latencies as subjects performed hierarchically organized task sequences while "thinking aloud" each task label. Surprisingly, speech onset latencies and response times (RTs) were highly synchronized, a pattern that is not consistent with the hypothesis that speaking aids proactive retrieval of upcoming sequential elements during serial-order control. We also found that when instructed to do so, participants were able to speak task labels prior to presentation of response-relevant stimuli and that this substantially reduced RT signatures of retrieval—however at the cost of more sequencing errors. Thus, while proactive retrieval is possible in principle, in natural situations it seems to be prevented through a strong, "gestalt-like" tendency to synchronize speech and action. We suggest that this tendency may support context updating rather than proactive control. PMID:24274386

  8. Measurement of trained speech patterns in stuttering: interjudge and intrajudge agreement of experts by means of modified time-interval analysis.

    PubMed

    Alpermann, Anke; Huber, Walter; Natke, Ulrich; Willmes, Klaus

    2010-09-01

    Improved fluency after stuttering therapy is usually measured by the percentage of stuttered syllables. However, outcome studies rarely evaluate the use of trained speech patterns that speakers use to manage stuttering. This study investigated whether the modified time interval analysis can distinguish between trained speech patterns, fluent speech, and stuttered speech. Seventeen German experts on stuttering judged a speech sample on two occasions. Speakers of the sample were stuttering adults, who were not undergoing therapy, as well as participants in a fluency shaping and a stuttering modification therapy. Results showed satisfactory inter-judge and intra-judge agreement above 80%. Intervals with trained speech patterns were identified as consistently as stuttered and fluent intervals. We discuss limitations of the study, as well as implications of our findings for the development of training for identification of trained speech patterns and future outcome studies. The reader will be able to (a) explain different methods to measure the use of trained speech patterns, (b) evaluate whether German experts are able to discriminate intervals with trained speech patterns reliably from fluent and stuttered intervals and (c) describe how the measurement of trained speech patterns can contribute to outcome studies.

  9. Functional Brain Activation Differences in School-Age Children with Speech Sound Errors: Speech and Print Processing

    ERIC Educational Resources Information Center

    Preston, Jonathan L.; Felsenfeld, Susan; Frost, Stephen J.; Mencl, W. Einar; Fulbright, Robert K.; Grigorenko, Elena L.; Landi, Nicole; Seki, Ayumi; Pugh, Kenneth R.

    2012-01-01

    Purpose: To examine neural response to spoken and printed language in children with speech sound errors (SSE). Method: Functional magnetic resonance imaging was used to compare processing of auditorily and visually presented words and pseudowords in 17 children with SSE, ages 8;6[years;months] through 10;10, with 17 matched controls. Results: When…

  10. Speech Errors in Progressive Non-Fluent Aphasia

    ERIC Educational Resources Information Center

    Ash, Sharon; McMillan, Corey; Gunawardena, Delani; Avants, Brian; Morgan, Brianna; Khan, Alea; Moore, Peachie; Gee, James; Grossman, Murray

    2010-01-01

    The nature and frequency of speech production errors in neurodegenerative disease have not previously been precisely quantified. In the present study, 16 patients with a progressive form of non-fluent aphasia (PNFA) were asked to tell a story from a wordless children's picture book. Errors in production were classified as either phonemic,…

  11. Overreliance on auditory feedback may lead to sound/syllable repetitions: simulations of stuttering and fluency-inducing conditions with a neural model of speech production

    PubMed Central

    Civier, Oren; Tasko, Stephen M.; Guenther, Frank H.

    2010-01-01

    This paper investigates the hypothesis that stuttering may result in part from impaired readout of feedforward control of speech, which forces persons who stutter (PWS) to produce speech with a motor strategy that is weighted too much toward auditory feedback control. Over-reliance on feedback control leads to production errors which, if they grow large enough, can cause the motor system to “reset” and repeat the current syllable. This hypothesis is investigated using computer simulations of a “neurally impaired” version of the DIVA model, a neural network model of speech acquisition and production. The model’s outputs are compared to published acoustic data from PWS’ fluent speech, and to combined acoustic and articulatory movement data collected from the dysfluent speech of one PWS. The simulations mimic the errors observed in the PWS subject’s speech, as well as the repairs of these errors. Additional simulations were able to account for enhancements of fluency gained by slowed/prolonged speech and masking noise. Together these results support the hypothesis that many dysfluencies in stuttering are due to a bias away from feedforward control and toward feedback control. PMID:20831971

  12. Speech-language pathology program for reading comprehension and orthography: effects on the spelling of dyslexic individuals.

    PubMed

    Nogueira, Débora Manzano; Cárnio, Maria Silvia

    2018-01-01

    Purpose Prepare a Speech-language Pathology Program for Reading Comprehension and Orthography and verify its effects on the reading comprehension and spelling of students with Developmental Dyslexia. Methods The study sample was composed of eleven individuals (eight males) diagnosed with Developmental Dyslexia, aged 9-11 years. All participants underwent a Speech-language Pathology Program in Reading Comprehension and Orthography comprising 16 individual weekly sessions. In each session, tasks of reading comprehension of texts and orthography were developed. At the beginning and end of the Program, the participants were given a specific assessment (pre- and post-test). Results The individuals presented difficulty in reading comprehension, but the Cloze technique proved to be a useful remediation tool, and significant improvement in their performance was observed in the post-test evaluation. The dyslexic individuals showed poor performance for their educational level in the spelling assessment. At the end of the program their performance had improved, but it remained below expected levels, showing the same error pattern at pre- and post-test, with errors in both natural and arbitrary spelling. Conclusion The proposed Speech-language Pathology Program for Reading Comprehension and Orthography produced positive effects on the reading comprehension, spelling, and motivation toward reading and writing of the participants. The study's novel contribution is to propose joint stimulation of reading and writing by means of a program that is easy to apply and analyze with individuals with Developmental Dyslexia.

  13. Automatic concept extraction from spoken medical reports.

    PubMed

    Happe, André; Pouliquen, Bruno; Burgun, Anita; Cuggia, Marc; Le Beux, Pierre

    2003-07-01

    The objective of this project is to investigate methods whereby a combination of speech recognition and automated indexing methods can substitute for current transcription and indexing practices. We based our study on existing speech recognition software programs and on NOMINDEX, a tool that extracts MeSH concepts from medical text in natural language and that is mainly based on a French medical lexicon and on the UMLS. For each document, the process consists of three steps: (1) dictation and digital audio recording, (2) speech recognition, and (3) automatic indexing. The evaluation consisted of a comparison between the set of concepts extracted by NOMINDEX after the speech recognition phase and the set of keywords manually extracted from the initial document. The method was evaluated on a set of 28 patient discharge summaries extracted from the MENELAS corpus in French, corresponding to in-patients admitted for coronarography. The overall precision was 73% and the overall recall was 90%. Indexing errors were mainly due to word-sense ambiguity and abbreviations. A specific issue was the fact that the standard French translation of MeSH terms lacks diacritics. A preliminary evaluation of speech recognition tools showed that the rate of accurate recognition was higher than 98%. Only 3% of the indexing errors were generated by inadequate speech recognition. We discuss several areas to focus on to improve this prototype. However, the very low rate of indexing errors due to speech recognition errors highlights the potential benefits of combining speech recognition techniques and automatic indexing.
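
    The reported figures follow the standard set-overlap definitions of precision and recall. A small Python sketch, with invented concept names rather than the study's data:

      # Compare automatically extracted concepts against manual keywords.
      def precision_recall(extracted, reference):
          extracted, reference = set(extracted), set(reference)
          tp = len(extracted & reference)          # true positives
          return tp / len(extracted), tp / len(reference)

      auto = {"coronarography", "myocardial infarction", "hypertension",
              "aspirin"}
      manual = {"coronarography", "myocardial infarction", "hypertension",
                "diabetes mellitus"}
      p, r = precision_recall(auto, manual)
      print(f"precision = {p:.0%}, recall = {r:.0%}")  # 75%, 75% on this toy case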

  14. Robust Frequency Invariant Beamforming with Low Sidelobe for Speech Enhancement

    NASA Astrophysics Data System (ADS)

    Zhu, Yiting; Pan, Xiang

    2018-01-01

    Frequency-invariant beamformers (FIBs) are widely used in speech enhancement and source localization. There are two traditional optimization methods for FIB design. The first is convex optimization, which is simple, but the frequency-invariant characteristic of the resulting beam pattern is poor over a frequency band of five octaves. The least squares (LS) approach using a spatial response variation (SRV) constraint is another optimization method. Although it provides a good frequency-invariant property, it usually cannot be used in speech enhancement because it lacks a weight-norm constraint, which is related to the robustness of a beamformer. In this paper, a robust wideband beamforming method with a constant beamwidth is proposed. The frequency-invariant beam pattern is achieved by solving an optimization problem with an SRV constraint covering the speech frequency band. By controlling the sidelobe level, the frequency-invariant beamformer (FIB) can prevent distortion from interference arriving from undesired directions. The approach is implemented in the time domain by placing a tapped delay line (TDL), i.e., a finite impulse response (FIR) filter, at the output of each sensor, which is more convenient than the Frost processor. By invoking the weight-norm constraint, the robustness of the beamformer against random errors is further improved. Experimental results show that the proposed method has a constant beamwidth and almost the same white noise gain as the traditional delay-and-sum (DAS) beamformer.
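
    For context, a minimal NumPy sketch of why frequency invariance is hard: the beam pattern of a plain delay-and-sum beamformer on a uniform linear array narrows as frequency rises, which is exactly what FIB designs counteract. The array geometry and test frequencies below are arbitrary choices, not the paper's configuration.

      import numpy as np

      c, N, d = 343.0, 8, 0.04            # sound speed (m/s), sensors, spacing (m)
      theta = np.linspace(-np.pi / 2, np.pi / 2, 361)

      def das_pattern(f, steer=0.0):
          """Delay-and-sum beam pattern (dB) over arrival angle at frequency f."""
          n = np.arange(N)
          a = np.exp(-2j * np.pi * f * d * n[None, :] * np.sin(theta)[:, None] / c)
          w = np.exp(-2j * np.pi * f * d * n * np.sin(steer) / c) / N
          return 20 * np.log10(np.abs(a @ w.conj()) + 1e-12)

      for f in (500.0, 2000.0, 4000.0):   # spans part of the speech band
          bp = das_pattern(f)
          main = theta[bp >= -3.0]        # -3 dB region around broadside
          print(f"{f:6.0f} Hz: -3 dB beamwidth ~ "
                f"{np.degrees(main.max() - main.min()):.0f} deg")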

  15. Apraxia of Speech and Phonological Errors in the Diagnosis of Nonfluent/Agrammatic and Logopenic Variants of Primary Progressive Aphasia

    ERIC Educational Resources Information Center

    Croot, Karen; Ballard, Kirrie; Leyton, Cristian E.; Hodges, John R.

    2012-01-01

    Purpose: The International Consensus Criteria for the diagnosis of primary progressive aphasia (PPA; Gorno-Tempini et al., 2011) propose apraxia of speech (AOS) as 1 of 2 core features of nonfluent/agrammatic PPA and propose phonological errors or absence of motor speech disorder as features of logopenic PPA. We investigated the sensitivity and…

  16. Feature Migration in Time: Reflection of Selective Attention on Speech Errors

    ERIC Educational Resources Information Center

    Nozari, Nazbanou; Dell, Gary S.

    2012-01-01

    This article describes an initial study of the effect of focused attention on phonological speech errors. In 3 experiments, participants recited 4-word tongue twisters and focused attention on 1 (or none) of the words. The attended word was singled out differently in each experiment; participants were under instructions to avoid errors on the…

  17. Motor-Based Treatment with and without Ultrasound Feedback for Residual Speech-Sound Errors

    ERIC Educational Resources Information Center

    Preston, Jonathan L.; Leece, Megan C.; Maas, Edwin

    2017-01-01

    Background: There is a need to develop effective interventions and to compare the efficacy of different interventions for children with residual speech-sound errors (RSSEs). Rhotics (the r-family of sounds) are frequently in error in American English-speaking children with RSSEs and are commonly targeted in treatment. One treatment approach involves…

  18. Speech Patterns and Racial Wage Inequality

    ERIC Educational Resources Information Center

    Grogger, Jeffrey

    2011-01-01

    Speech patterns differ substantially between whites and many African Americans. I collect and analyze speech data to understand the role that speech may play in explaining racial wage differences. Among blacks, speech patterns are highly correlated with measures of skill such as schooling and AFQT scores. They are also highly correlated with the…

  19. Using electropalatography (EPG) to diagnose and treat articulation disorders associated with mild cerebral palsy: a case study.

    PubMed

    Gibbon, Fiona E; Wood, Sara E

    2003-01-01

    Some children with mild cerebral palsy have articulation disorders that are resistant to conventional speech therapy techniques. This preliminary study investigated the use of electropalatography (EPG) to diagnose and treat a long-standing articulation disorder that had not responded to conventional speech therapy techniques in an 8-year-old boy (D) with a congenital left hemiplegia. The targets for EPG therapy were speech errors affecting velar targets /k, g, ŋ/, which were consistently fronted to alveolar placement [t, d, n]. After 15 sessions of EPG therapy over a 4-month period, D's ability to produce velars improved significantly. The EPG data revealed two features of diagnostic importance. The first was an unusually asymmetrical pattern of tongue-palate contact and the second was unusually long stop closure durations. These features are interpreted as a subtle form of impaired speech motor control that could be related to a mild residual neurological deficit. The results suggest that EPG is of potential benefit for diagnosing and treating articulation disorders in individuals with mild cerebral palsy.

  20. A Networking of Community-Based Speech Therapy: Borabue District, Maha Sarakham.

    PubMed

    Pumnum, Tawitree; Kum-ud, Weawta; Prathanee, Benjamas

    2015-08-01

    Most children with cleft lip and palate have articulation problems because of compensatory articulation disorders arising from velopharyngeal insufficiency. Theoretically, children should receive speech therapy from a speech and language pathologist (SLP) 1-2 sessions per week. In developing countries, particularly Thailand, most children cannot access standard speech services because of the limited availability of services and SLPs. Networking of a Community-Based Speech Model might be an appropriate way to solve this problem. To study the effectiveness of a networking of the Khon Kaen University (KKU) Community-Based Speech Model with Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham, in decreasing the number of articulation errors in children with CLP. Participants were six children with cleft lip and palate (CLP) who lived in Borabue and the surrounding district, Maha Sarakham, and had medical records at Srinagarind Hospital. They were assessed for pre- and post-treatment articulation errors and provided with speech therapy by an SLP via on-service teaching for a speech assistant (SA). The children with CLP then received speech correction (SC) from the SA based on assignment, and caregivers practiced a home program for a year. The networking of Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham significantly reduced the number of post-treatment articulation errors for 3 children with CLP. Factors affecting the results in the treatment of the other children were as follows: delayed speech and language development, hypernasality, and the consistency of SC at the local hospital and at home. A networking of the KKU Community-Based Speech Model with Non Thong Tambon Health Promotion Hospital, Borabue, Maha Sarakham was a good way to enhance speech therapy in Thailand and other developing countries that have limited speech services or a lack of professionals.

  1. Speech impairment in Down syndrome: a review.

    PubMed

    Kent, Ray D; Vorperian, Houri K

    2013-02-01

    This review summarizes research on disorders of speech production in Down syndrome (DS) for the purposes of informing clinical services and guiding future research. Review of the literature was based on searches using MEDLINE, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency, and intelligibility. The following conclusions pertain to four major areas of review: voice, speech sounds, fluency and prosody, and intelligibility. The first major area is voice. Although a number of studies have reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. The second major area is speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. The third major area is fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10%-45%, compared with about 1% in the general population. Research also points to significant disturbances in prosody. The fourth major area is intelligibility. Studies consistently show marked limitations in this area, but only recently has the research gone beyond simple rating scales.

  2. Incidence of speech recognition errors in the emergency department.

    PubMed

    Goss, Foster R; Zhou, Li; Weiner, Scott G

    2016-09-01

    Physician use of computerized speech recognition (SR) technology has risen in recent years due to its ease of use and efficiency at the point of care. However, error rates between 10 and 23% have been observed, raising concern about the number of errors being entered into the permanent medical record, their impact on quality of care, and the medical liability that may arise. Our aim was to determine the incidence and types of SR errors introduced by this technology in the emergency department (ED). The setting was a Level 1 emergency department with 42,000 visits/year in a tertiary academic teaching hospital. A random sample of 100 notes dictated by attending emergency physicians (EPs) using SR software was collected from the ED electronic health record between January and June 2012. Two board-certified EPs annotated the notes and conducted error analysis independently. An existing classification schema was adopted to classify errors into eight error types. Critical errors deemed to potentially impact patient care were identified. There were 128 errors in total, or 1.3 errors per note, and 14.8% (n=19) of errors were judged to be critical. 71% of notes contained errors, and 15% contained one or more critical errors. Annunciation errors were the most common at 53.9% (n=69), followed by deletions at 18.0% (n=23) and added words at 11.7% (n=15). Nonsense errors, homonyms, and spelling errors were present in 10.9% (n=14), 4.7% (n=6), and 0.8% (n=1) of notes, respectively. There were no suffix or dictionary errors. Inter-annotator agreement was 97.8%. This is the first study to classify speech recognition errors in dictated emergency department notes. Speech recognition errors occur commonly, with annunciation errors being the most frequent. Error rates were comparable if not lower than those in previous studies. 15% of errors were deemed critical, potentially leading to miscommunication that could affect patient care. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
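
    A small Python sketch of the error accounting used above (errors per note, fraction critical, counts by type); the annotations below are invented, not the study's data.

      from collections import Counter

      # Each note carries a list of (error_type, is_critical) annotations.
      notes = [
          [("annunciation", False), ("deletion", True)],
          [],
          [("added words", False)],
          [("annunciation", True), ("annunciation", False)],
      ]

      errors = [e for note in notes for e in note]
      by_type = Counter(etype for etype, _ in errors)
      critical = sum(1 for _, crit in errors if crit)

      print(f"{len(errors) / len(notes):.1f} errors per note")
      print(f"{critical / len(errors):.0%} of errors critical")
      print(by_type.most_common())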

  3. Automatic analysis of slips of the tongue: Insights into the cognitive architecture of speech production.

    PubMed

    Goldrick, Matthew; Keshet, Joseph; Gustafson, Erin; Heller, Jordana; Needle, Jeremy

    2016-04-01

    Traces of the cognitive mechanisms underlying speaking can be found within subtle variations in how we pronounce sounds. While speech errors have traditionally been seen as categorical substitutions of one sound for another, acoustic/articulatory analyses show they partially reflect the intended sound. When "pig" is mispronounced as "big," the resulting /b/ sound differs from correct productions of "big," moving toward the intended "pig," revealing the role of graded sound representations in speech production. Investigating the origins of such phenomena requires detailed estimation of speech sound distributions; this has been hampered by reliance on subjective, labor-intensive manual annotation. Computational methods can address these issues by providing objective, automatic measurements. We develop a novel high-precision computational approach, based on a set of machine learning algorithms, for measurement of elicited speech. The algorithms are trained on existing manually labeled data to detect and locate linguistically relevant acoustic properties with high accuracy. Our approach is robust, is designed to handle mis-productions, and overall matches the performance of expert coders. It allows us to analyze a very large dataset of speech errors (containing far more errors than the total in the existing literature), illuminating properties of speech sound distributions previously impossible to reliably observe. We argue that this provides novel evidence that two sources both contribute to deviations in speech errors: planning processes specifying the targets of articulation and articulatory processes specifying the motor movements that execute this plan. These findings illustrate how a much richer picture of speech provides an opportunity to gain novel insights into language processing. Copyright © 2016 Elsevier B.V. All rights reserved.

  4. Speech therapy for errors secondary to cleft palate and velopharyngeal dysfunction.

    PubMed

    Kummer, Ann W

    2011-05-01

    Individuals with a history of cleft lip/palate or velopharyngeal dysfunction may demonstrate any combination of speech sound errors, hypernasality, and nasal emission. Speech sound distortion can also occur due to other structural anomalies, including malocclusion. Whenever there are structural anomalies, speech can be affected by obligatory distortions or compensatory errors. Obligatory distortions (including hypernasality due to velopharyngeal insufficiency) are caused by abnormal structure and not by abnormal function. Therefore, surgery or other forms of physical management are needed for correction. In contrast, speech therapy is indicated for compensatory articulation productions where articulation placement is changed in response to the abnormal structure. Speech therapy is much more effective if it is done after normalization of the structure. When speech therapy is appropriate, the techniques involve methods to change articulation placement using standard articulation therapy principles. Oral-motor exercises, including the use of blowing and sucking, are never indicated to improve velopharyngeal function. The purpose of this article is to provide information regarding when speech therapy is appropriate for individuals with a history of cleft palate or other structural anomalies and when physical management is needed. In addition, some specific therapy techniques are offered for the elimination of common compensatory articulation productions. © Thieme Medical Publishers.

  5. Comparing different models of the development of verb inflection in early child Spanish.

    PubMed

    Aguado-Orea, Javier; Pine, Julian M

    2015-01-01

    How children acquire knowledge of verb inflection is a long-standing question in language acquisition research. In the present study, we test the predictions of some current constructivist and generativist accounts of the development of verb inflection by focusing on data from two Spanish-speaking children between the ages of 2;0 and 2;6. The constructivist claim that children's early knowledge of verb inflection is only partially productive is tested by comparing the average number of different inflections per verb in matched samples of child and adult speech. The generativist claim that children's early use of verb inflection is essentially error-free is tested by investigating the rate at which the children made subject-verb agreement errors in different parts of the present tense paradigm. Our results show: 1) that, although even adults' use of verb inflection in Spanish tends to look somewhat lexically restricted, both children's use of verb inflection was significantly less flexible than that of their caregivers, and 2) that, although the rate at which the two children produced subject-verb agreement errors in their speech was very low, this overall error rate hid a consistent pattern of error in which error rates were substantially higher in low frequency than in high frequency contexts, and substantially higher for low frequency than for high frequency verbs. These results undermine the claim that children's use of verb inflection is fully productive from the earliest observable stages, and are consistent with the constructivist claim that knowledge of verb inflection develops only gradually.

  6. Public Speaking Apprehension, Decision-Making Errors in the Selection of Speech Introduction Strategies and Adherence to Strategy.

    ERIC Educational Resources Information Center

    Beatty, Michael J.

    1988-01-01

    Examines the choice-making processes of students engaged in the selection of speech introduction strategies. Finds that the frequency of students making decision-making errors was a positive function of public speaking apprehension. (MS)

  7. Evaluation of speech errors in Putonghua speakers with cleft palate: a critical review of methodology issues.

    PubMed

    Jiang, Chenghui; Whitehill, Tara L

    2014-04-01

    Speech errors associated with cleft palate are well established for English and several other Indo-European languages. Few articles describing the speech of Putonghua (standard Mandarin Chinese) speakers with cleft palate have been published in English language journals. Although methodological guidelines have been published for the perceptual speech evaluation of individuals with cleft palate, there has been no critical review of methodological issues in studies of Putonghua speakers with cleft palate. A literature search was conducted to identify relevant studies published over the past 30 years in Chinese language journals. Only studies incorporating perceptual analysis of speech were included. Thirty-seven articles which met inclusion criteria were analyzed and coded on a number of methodological variables. Reliability was established by having all variables recoded for all studies. This critical review identified many methodological issues. These design flaws make it difficult to draw reliable conclusions about characteristic speech errors in this group of speakers. Specific recommendations are made to improve the reliability and validity of future studies, as well to facilitate cross-center comparisons.

  8. The relationship between articulatory control and improved phonemic accuracy in childhood apraxia of speech: A longitudinal case study

    PubMed Central

    Grigos, Maria I.; Kolenda, Nicole

    2010-01-01

    Jaw movement patterns were examined longitudinally in a 3-year-old male with childhood apraxia of speech (CAS) and compared with a typically developing control group. The child with CAS was followed for 8 months, until he began accurately and consistently producing the bilabial phonemes /p/, /b/, and /m/. A movement tracking system was used to study jaw duration, displacement, velocity, and stability. A transcription analysis determined the percentage of phoneme errors and consistency. Results showed phoneme-specific changes which included increases in jaw velocity and stability over time, as well as decreases in duration. Kinematic parameters became more similar to patterns seen in the controls during final sessions where tokens were produced most accurately and consistently. Closing velocity and stability, however, were the only measures to fall within a 95% confidence interval established for the controls across all three target phonemes. These findings suggest that motor processes may differ between children with CAS and their typically developing peers. PMID:20030551

  9. The influence of visual and auditory information on the perception of speech and non-speech oral movements in patients with left hemisphere lesions.

    PubMed

    Schmid, Gabriele; Thielmann, Anke; Ziegler, Wolfram

    2009-03-01

    Patients with lesions of the left hemisphere often suffer from oral-facial apraxia, apraxia of speech, and aphasia. In these patients, visual features often play a critical role in speech and language therapy, when pictured lip shapes or the therapist's visible mouth movements are used to facilitate speech production and articulation. This demands audiovisual processing both in speech and language treatment and in the diagnosis of oral-facial apraxia. The purpose of this study was to investigate differences in audiovisual perception of speech as compared to non-speech oral gestures. Bimodal and unimodal speech and non-speech items were used and additionally discordant stimuli constructed, which were presented for imitation. This study examined a group of healthy volunteers and a group of patients with lesions of the left hemisphere. Patients made substantially more errors than controls, but the factors influencing imitation accuracy were more or less the same in both groups. Error analyses in both groups suggested different types of representations for speech as compared to the non-speech domain, with speech having a stronger weight on the auditory modality and non-speech processing on the visual modality. Additionally, this study was able to show that the McGurk effect is not limited to speech.

  10. Overreliance on auditory feedback may lead to sound/syllable repetitions: simulations of stuttering and fluency-inducing conditions with a neural model of speech production.

    PubMed

    Civier, Oren; Tasko, Stephen M; Guenther, Frank H

    2010-09-01

    This paper investigates the hypothesis that stuttering may result in part from impaired readout of feedforward control of speech, which forces persons who stutter (PWS) to produce speech with a motor strategy that is weighted too much toward auditory feedback control. Over-reliance on feedback control leads to production errors which, if they grow large enough, can cause the motor system to "reset" and repeat the current syllable. This hypothesis is investigated using computer simulations of a "neurally impaired" version of the DIVA model, a neural network model of speech acquisition and production. The model's outputs are compared to published acoustic data from PWS' fluent speech, and to combined acoustic and articulatory movement data collected from the dysfluent speech of one PWS. The simulations mimic the errors observed in the PWS subject's speech, as well as the repairs of these errors. Additional simulations were able to account for enhancements of fluency gained by slowed/prolonged speech and masking noise. Together these results support the hypothesis that many dysfluencies in stuttering are due to a bias away from feedforward control and toward feedback control. The reader will be able to (a) describe the contribution of auditory feedback control and feedforward control to normal and stuttered speech production, (b) summarize the neural modeling approach to speech production and its application to stuttering, and (c) explain how the DIVA model accounts for enhancements of fluency gained by slowed/prolonged speech and masking noise.

  11. Interaction and Representational Integration: Evidence from Speech Errors

    ERIC Educational Resources Information Center

    Goldrick, Matthew; Baker, H. Ross; Murphy, Amanda; Baese-Berk, Melissa

    2011-01-01

    We examine the mechanisms that support interaction between lexical, phonological and phonetic processes during language production. Studies of the phonetics of speech errors have provided evidence that partially activated lexical and phonological representations influence phonetic processing. We examine how these interactive effects are modulated…

  12. Speech Impairment in Down Syndrome: A Review

    PubMed Central

    Kent, Ray D.; Vorperian, Houri K.

    2012-01-01

    Purpose This review summarizes research on disorders of speech production in Down Syndrome (DS) for the purposes of informing clinical services and guiding future research. Method Review of the literature was based on searches using Medline, Google Scholar, Psychinfo, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency and intelligibility. Conclusions The following conclusions pertain to four major areas of review: (a) Voice. Although a number of studies have been reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. (b) Speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. (c) Fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10 to 45%, compared to about 1% in the general population. Research also points to significant disturbances in prosody. (d) Intelligibility. Studies consistently show marked limitations in this area but it is only recently that research goes beyond simple rating scales. PMID:23275397

  13. Electropalatographic and perceptual analysis of the speech of Cantonese children with cleft palate.

    PubMed

    Whitehill, T; Stokes, S; Hardcastle, B; Gibbon, F

    1995-01-01

    This study used electropalatographic and perceptual analysis to investigate the speech of two Cantonese children with repaired cleft palate. Some features of their speech, as identified from the perceptual analysis, have been previously reported as being typical of children with cleft palate. For example, fricatives and affricates were vulnerable to disruption, and obstruent sounds were judged by listeners to have posterior placement. However, some apparently language-specific characteristics were identified in the Cantonese-speaking children. First there was a relatively high incidence of initial consonant deletion, and for one subject /s/ and /f/ targets were produced as bilabial fricatives. EPG error patterns for target lingual obstruents were largely similar to those reported to occur in English- and Japanese-speaking children. In particular, broader and more posterior tongue-palate contact was observed, and intrasubject variability was noted. There was also evidence of simultaneous labial/velar and alveolar/velar constriction for labial and velar targets respectively. The clinical implications of the findings are discussed.

  14. Simulating Children's Retrieval Errors in Picture-Naming: A Test of Foygel and Dell's (2000) Semantic/Phonological Model of Speech Production

    ERIC Educational Resources Information Center

    Budd, Mary-Jane; Hanley, J. Richard; Griffiths, Yvonne

    2011-01-01

    This study investigated whether Foygel and Dell's (2000) interactive two-step model of speech production could simulate the number and type of errors made in picture-naming by 68 children of elementary-school age. Results showed that the model provided a satisfactory simulation of the mean error profile of children aged five, six, seven, eight and…

  15. The nature of articulation errors in Egyptian Arabic-speaking children with velopharyngeal insufficiency due to cleft palate.

    PubMed

    Abou-Elsaad, Tamer; Baz, Hemmat; Afsah, Omayma; Mansy, Alzahraa

    2015-09-01

    Even with early surgical repair, the majority of children with cleft palate demonstrate articulation errors and have typical cleft palate speech. The aim was to determine the nature of articulation errors of Arabic consonants in Egyptian Arabic-speaking children with velopharyngeal insufficiency (VPI). Thirty Egyptian Arabic-speaking children with VPI due to cleft palate (whether primary repaired or secondary repaired) were studied. Auditory perceptual assessment (APA) of the children's speech was conducted. Nasopharyngoscopy was done to assess velopharyngeal port (VPP) movements while the child was repeating speech tasks. The Mansoura Arabic Articulation Test (MAAT) was performed to analyze the consonant articulation of these children. The most frequent type of articulatory error observed was substitution, more specifically backing. Pharyngealization of anterior fricatives was the most frequent substitution, especially for the /s/ sound. The most frequent substituting sounds were /ʔ/, followed by /k/ and /n/. Significant correlations were found between the degree of open nasality, VPP closure, and the articulation errors. On the other hand, the sounds (/ʔ/, /ħ/, /ʕ/, /n/, /w/, /j/) were normally articulated in all of the studied group. Determining the articulation errors in children with VPI could guide therapists in designing appropriate speech therapy programs for these cases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  16. Loss tolerant speech decoder for telecommunications

    NASA Technical Reports Server (NTRS)

    Prieto, Jr., Jaime L. (Inventor)

    1999-01-01

    A method and device for extrapolating past signal-history data for insertion into missing data segments in order to conceal digital speech frame errors. The extrapolation method uses past-signal history that is stored in a buffer. The method is implemented with a device that utilizes a finite-impulse response (FIR) multi-layer feed-forward artificial neural network that is trained by back-propagation for one-step extrapolation of speech compression algorithm (SCA) parameters. Once a speech connection has been established, the speech compression algorithm device begins sending encoded speech frames. As the speech frames are received, they are decoded and converted back into speech signal voltages. During the normal decoding process, pre-processing of the required SCA parameters will occur and the results stored in the past-history buffer. If a speech frame is detected to be lost or in error, then extrapolation modules are executed and replacement SCA parameters are generated and sent as the parameters required by the SCA. In this way, the information transfer to the SCA is transparent, and the SCA processing continues as usual. The listener will not normally notice that a speech frame has been lost because of the smooth transition between the last-received, lost, and next-received speech frames.
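
    The extrapolation stage lends itself to a compact illustration. Below is a minimal sketch of the idea, not the patented implementation: a small feed-forward network predicts the next frame's SCA parameters from a buffer of past frames. The class name, layer sizes, and placeholder weights are all assumptions; real weights would come from offline back-propagation training as the patent describes.

        import numpy as np

        class FrameExtrapolator:
            """One-step extrapolation of codec parameters from past frames.

            Minimal sketch: a feed-forward network maps the last `history`
            frames of SCA parameters to a prediction of the next frame.
            Random placeholder weights stand in for weights that would be
            trained offline by back-propagation.
            """

            def __init__(self, history, n_params, hidden=16, seed=0):
                rng = np.random.default_rng(seed)
                self.history = history
                self.w1 = rng.normal(0.0, 0.1, (history * n_params, hidden))
                self.w2 = rng.normal(0.0, 0.1, (hidden, n_params))
                self.buffer = []

            def observe(self, params):
                # Store a correctly received frame in the past-history buffer.
                self.buffer.append(np.asarray(params, dtype=float))
                self.buffer = self.buffer[-self.history:]

            def extrapolate(self):
                # Predict a lost frame; requires a full history buffer.
                x = np.concatenate(self.buffer)   # flatten buffered frames
                h = np.tanh(x @ self.w1)          # hidden layer
                return h @ self.w2                # linear output layer

    In use, the decoder would call observe() for each correctly received frame and extrapolate() when a frame is flagged lost, feeding the prediction back through observe() so that consecutive losses can be bridged.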

  17. Effect of gap detection threshold on consistency of speech in children with speech sound disorder.

    PubMed

    Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz

    2017-02-01

    The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups: typical speech, consistent speech disorder (CSD), and inconsistent speech disorder (ISD). The phonetic gap detection threshold test was used for this study; this validated test comprises six syllables with inter-stimulus intervals between 20 and 300 ms. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and between the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech is a representation of inconsistency in auditory perception, caused by a high gap detection threshold.

  18. Associations between tongue movement pattern consistency and formant movement pattern consistency in response to speech behavioral modifications

    PubMed Central

    Mefferd, Antje S.

    2016-01-01

    The degree of speech movement pattern consistency can provide information about speech motor control. Although tongue motor control is particularly important because of the tongue's primary contribution to the speech acoustic signal, capturing tongue movements during speech remains difficult and costly. This study sought to determine if formant movements could be used to estimate tongue movement pattern consistency indirectly. Two age groups (seven young adults and seven older adults) and six speech conditions (typical, slow, loud, clear, fast, bite block speech) were selected to elicit an age- and task-dependent performance range in tongue movement pattern consistency. Kinematic and acoustic spatiotemporal indexes (STI) were calculated based on sentence-length tongue movement and formant movement signals, respectively. Kinematic and acoustic STI values showed strong associations across talkers and moderate to strong associations for each talker across speech tasks, although in cases where task-related tongue motor performance changes were relatively small, the acoustic STI values were poorly associated with kinematic STI values. These findings suggest that, depending on the sensitivity needs, formant movement pattern consistency could be used in lieu of direct kinematic analysis to indirectly examine speech motor control. PMID:27908069
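
    The spatiotemporal index (STI) used in this study has a standard recipe: repeated productions of the same sentence are amplitude-normalized, linearly time-normalized, and the standard deviations across repetitions are summed over the normalized time base. A minimal sketch, assuming 1-D trajectories (e.g., a tongue-marker displacement or a formant track) and the conventional 50-point normalization rather than this study's exact settings:

        import numpy as np

        def spatiotemporal_index(trials, n_points=50):
            """Sum of point-wise across-trial SDs after normalization.

            trials: list of 1-D arrays, one trajectory per repetition of
            the same sentence. Each is z-scored (amplitude normalization)
            and linearly interpolated to n_points (time normalization);
            the STI is the sum of the standard deviations across trials.
            """
            t_new = np.linspace(0.0, 1.0, n_points)
            normalized = []
            for trial in trials:
                x = np.asarray(trial, dtype=float)
                z = (x - x.mean()) / x.std()      # assumes non-constant signal
                t_old = np.linspace(0.0, 1.0, len(z))
                normalized.append(np.interp(t_new, t_old, z))
            return float(np.vstack(normalized).std(axis=0).sum())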

  19. Automatic measurement and representation of prosodic features

    NASA Astrophysics Data System (ADS)

    Ying, Goangshiuan Shawn

    Effective measurement and representation of prosodic features of the acoustic signal for use in automatic speech recognition and understanding systems is the goal of this work. Prosodic features (stress, duration, and intonation) are variations of the acoustic signal whose domains extend beyond the boundaries of each individual phonetic segment. Listeners perceive prosodic features through a complex combination of acoustic correlates such as intensity, duration, and fundamental frequency (F0). We have developed new tools to measure F0 and intensity features. We apply a probabilistic global error correction routine to an Average Magnitude Difference Function (AMDF) pitch detector. A new short-term frequency-domain Teager energy algorithm is used to measure the energy of a speech signal. We have conducted a series of experiments performing lexical stress detection on words in continuous English speech from two speech corpora. We have experimented with two different approaches, a segment-based approach and a rhythm unit-based approach, in lexical stress detection. The first approach uses pattern recognition with energy- and duration-based measurements as features to build Bayesian classifiers to detect the stress level of a vowel segment. In the second approach we define a rhythm unit and use only the F0-based measurement and a scoring system to determine the stressed segment in the rhythm unit. A duration-based segmentation routine was developed to break polysyllabic words into rhythm units. The long-term goal of this work is to develop a system that can effectively detect the stress pattern for each word in continuous speech utterances. Stress information will be integrated as a constraint for pruning the word hypotheses in a word recognition system based on hidden Markov models.
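
    Two of the measurement tools named above have compact textbook forms, sketched below: a basic AMDF pitch estimator (without the probabilistic global error correction this work adds) and the discrete time-domain Teager energy operator. Note that the dissertation uses a short-term frequency-domain Teager variant, which is not reproduced here; parameter names and default ranges are assumptions.

        import numpy as np

        def amdf_pitch(frame, fs, f_lo=50.0, f_hi=400.0):
            """Raw pitch-period estimate from the AMDF of one frame.

            Voiced speech yields deep AMDF valleys at multiples of the
            pitch period; the deepest valley in a plausible lag range is
            the raw estimate. No global error correction is applied here.
            """
            x = np.asarray(frame, dtype=float)
            n = len(x)
            min_lag = int(fs / f_hi)
            max_lag = min(int(fs / f_lo), n - 1)
            lags = np.arange(min_lag, max_lag + 1)
            d = [np.mean(np.abs(x[:n - k] - x[k:])) for k in lags]
            return int(lags[np.argmin(d)])        # pitch period in samples

        def teager_energy(x):
            # Discrete time-domain operator: psi[n] = x[n]^2 - x[n-1]*x[n+1].
            x = np.asarray(x, dtype=float)
            return x[1:-1] ** 2 - x[:-2] * x[2:]

    For voiced frames the AMDF nulls are sharp, so the raw estimate is usually correct; the probabilistic correction stage then repairs the occasional halving or doubling errors that survive.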

  20. Developing a Weighted Measure of Speech Sound Accuracy

    ERIC Educational Resources Information Center

    Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.

    2011-01-01

    Purpose: To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound…

  1. Error Variability and the Differentiation between Apraxia of Speech and Aphasia with Phonemic Paraphasia

    ERIC Educational Resources Information Center

    Haley, Katarina L.; Jacks, Adam; Cunningham, Kevin T.

    2013-01-01

    Purpose: This study was conducted to evaluate the clinical utility of error variability for differentiating between apraxia of speech (AOS) and aphasia with phonemic paraphasia. Method: Participants were 32 individuals with aphasia after left cerebral injury. Diagnostic groups were formed on the basis of operationalized measures of recognized…

  2. Are vowel errors influenced by consonantal context in the speech of persons with aphasia?

    NASA Astrophysics Data System (ADS)

    Gelfer, Carole E.; Bell-Berti, Fredericka; Boyle, Mary

    2004-05-01

    The literature suggests that vowels and consonants may be affected differently in the speech of persons with conduction aphasia (CA) or nonfluent aphasia with apraxia of speech (AOS). Persons with CA have shown similar error rates across vowels and consonants, while those with AOS have shown more errors for consonants than vowels. These data have been interpreted to suggest that consonants have greater gestural complexity than vowels. However, recent research [M. Boyle et al., Proc. International Cong. Phon. Sci., 3265-3268 (2003)] does not support this interpretation: persons with AOS and CA both had a high proportion of vowel errors, and vowel errors almost always occurred in the context of consonantal errors. To examine the notion that vowels are inherently less complex than consonants and are differentially affected in different types of aphasia, vowel production in different consonantal contexts for speakers with AOS or CA was examined. The target utterances, produced in carrier phrases, were bVC and bV syllables, allowing us to examine whether vowel production is influenced by consonantal context. Listener judgments were obtained for each token, and error productions were grouped according to the intended utterance and error type. Acoustical measurements were made from spectrographic displays.

  3. Direct cortical stimulation of inferior frontal cortex disrupts both speech and music production in highly trained musicians.

    PubMed

    Leonard, Matthew K; Desai, Maansi; Hungate, Dylan; Cai, Ruofan; Singhal, Nilika S; Knowlton, Robert C; Chang, Edward F

    2018-05-22

    Music and speech are human-specific behaviours that share numerous properties, including the fine motor skills required to produce them. Given these similarities, previous work has suggested that music and speech may at least partially share neural substrates. To date, much of this work has focused on perception, and has not investigated the neural basis of production, particularly in trained musicians. Here, we report two rare cases of musicians undergoing neurosurgical procedures, where it was possible to directly stimulate the left hemisphere cortex during speech and piano/guitar music production tasks. We found that stimulation to left inferior frontal cortex, including pars opercularis and ventral pre-central gyrus, caused slowing and arrest for both speech and music, and note sequence errors for music. Stimulation to posterior superior temporal cortex only caused production errors during speech. These results demonstrate partially dissociable networks underlying speech and music production, with a shared substrate in frontal regions.

  4. The effect of speaking rate on serial-order sound-level errors in normal healthy controls and persons with aphasia.

    PubMed

    Fossett, Tepanta R D; McNeil, Malcolm R; Pratt, Sheila R; Tompkins, Connie A; Shuster, Linda I

    Although many speech errors can be generated at either a linguistic or motoric level of production, phonetically well-formed sound-level serial-order errors are generally assumed to result from disruption of phonologic encoding (PE) processes. An influential model of PE (Dell, 1986; Dell, Burger & Svec, 1997) predicts that speaking rate should affect the relative proportion of these serial-order sound errors (anticipations, perseverations, exchanges). These predictions have been extended to, and have special relevance for, persons with aphasia (PWA) because of the increased frequency with which speech errors occur and because their localization within the functional linguistic architecture may help in diagnosis and treatment. Supporting evidence regarding the effect of speaking rate on phonological encoding has been provided by studies using young normal language (NL) speakers and computer simulations. Limited data exist for older NL users and no group data exist for PWA. This study tested the phonologic encoding properties of Dell's model of speech production (Dell, 1986; Dell et al., 1997), which predicts that increasing speaking rate affects the relative proportion of serial-order sound errors (i.e., anticipations, perseverations, and exchanges). The effects of speech rate on the anticipation/exchange (AE) and anticipation/perseveration (AP) error ratios and on vocal reaction time (VRT) were examined in 16 normal healthy controls (NHC) and 16 PWA without concomitant motor speech disorders. The participants were recorded performing a phonologically challenging (tongue twister) speech production task at their typical and two faster speaking rates. A significant effect of increased rate was obtained for the AP but not the AE ratio. Significant effects of group and rate were obtained for VRT. Although the significant effect of rate for the AP ratio provided evidence that changes in speaking rate did affect PE, the results failed to support the model-derived predictions regarding the direction of change for error type proportions. The current findings argued for an alternative concept of the role of activation and decay in influencing types of serial-order sound errors. Rather than a slow activation decay rate (Dell, 1986), the results of the current study were more compatible with an alternative explanation of rapid activation decay or slow build-up of residual activation.

  5. A bilateral cortical network responds to pitch perturbations in speech feedback

    PubMed Central

    Kort, Naomi S.; Nagarajan, Srikantan S.; Houde, John F.

    2014-01-01

    Auditory feedback is used to monitor and correct for errors in speech production, and one of the clearest demonstrations of this is the pitch perturbation reflex. During ongoing phonation, speakers respond rapidly to shifts of the pitch of their auditory feedback, altering their pitch production to oppose the direction of the applied pitch shift. In this study, we examine the timing of activity within a network of brain regions thought to be involved in mediating this behavior. To isolate auditory feedback processing relevant for motor control of speech, we used magnetoencephalography (MEG) to compare neural responses to speech onset and to transient (400 ms) pitch feedback perturbations during speaking with responses to identical acoustic stimuli during passive listening. We found overlapping, but distinct bilateral cortical networks involved in monitoring speech onset and feedback alterations in ongoing speech. Responses to speech onset during speaking were suppressed in bilateral auditory and left ventral supramarginal gyrus/posterior superior temporal sulcus (vSMG/pSTS). In contrast, during pitch perturbations, activity was enhanced in bilateral vSMG/pSTS, bilateral premotor cortex, right primary auditory cortex, and left higher order auditory cortex. We also found speaking-induced delays in responses to both unaltered and altered speech in bilateral primary and secondary auditory regions, the left vSMG/pSTS and right premotor cortex. The network dynamics reveal the cortical processing involved in both detecting the speech error and updating the motor plan to create the new pitch output. These results implicate vSMG/pSTS as critical in both monitoring auditory feedback and initiating rapid compensation to feedback errors. PMID:24076223

  6. Children with Comorbid Speech Sound Disorder and Specific Language Impairment Are at Increased Risk for Attention-Deficit/Hyperactivity Disorder

    ERIC Educational Resources Information Center

    McGrath, Lauren M.; Hutaff-Lee, Christa; Scott, Ashley; Boada, Richard; Shriberg, Lawrence D.; Pennington, Bruce F.

    2008-01-01

    This study focuses on the comorbidity between attention-deficit/hyperactivity disorder (ADHD) symptoms and speech sound disorder (SSD). SSD is a developmental disorder characterized by speech production errors that impact intelligibility. Previous research addressing this comorbidity has typically used heterogeneous groups of speech-language…

  7. Factors that Enhance English-Speaking Speech-Language Pathologists' Transcription of Cantonese-Speaking Children's Consonants

    ERIC Educational Resources Information Center

    Lockart, Rebekah; McLeod, Sharynne

    2013-01-01

    Purpose: To investigate speech-language pathology students' ability to identify errors and transcribe typical and atypical speech in Cantonese, a nonnative language. Method: Thirty-three English-speaking speech-language pathology students completed 3 tasks in an experimental within-subjects design. Results: Task 1 (baseline) involved transcribing…

  8. Childhood apraxia of speech: A survey of praxis and typical speech characteristics.

    PubMed

    Malmenholt, Ann; Lohmander, Anette; McAllister, Anita

    2017-07-01

    The purpose of this study was to investigate current knowledge of the diagnosis of childhood apraxia of speech (CAS) in Sweden and to compare speech characteristics and symptoms to those of earlier survey findings in mainly English speakers. In a web-based questionnaire, 178 Swedish speech-language pathologists (SLPs) anonymously answered questions about their perception of typical speech characteristics for CAS. They graded their own assessment skills and estimated clinical occurrence. The seven top speech characteristics reported as typical for children with CAS were: inconsistent speech production (85%), sequencing difficulties (71%), oro-motor deficits (63%), vowel errors (62%), voicing errors (61%), consonant cluster deletions (54%), and prosodic disturbance (53%). Motor-programming deficits, described as a lack of automatization of speech movements, were perceived by 82%. All listed characteristics were consistent with the American Speech-Language-Hearing Association (ASHA) consensus-based features, Strand's 10-point checklist, and the diagnostic model proposed by Ozanne. The mode for clinical occurrence was 5%. The number of suspected cases of CAS in the clinical caseload was approximately one new patient per year per SLP. The results support and add to findings from studies of CAS in English-speaking children, with similar speech characteristics regarded as typical. Possibly, these findings could contribute to cross-linguistic consensus on CAS characteristics.

  9. A causal test of the motor theory of speech perception: A case of impaired speech production and spared speech perception

    PubMed Central

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E.; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z.

    2015-01-01

    In the last decade, the debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. However, the exact role of the motor system in auditory speech processing remains elusive. Here we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. The patient’s spontaneous speech was marked by frequent phonological/articulatory errors, and those errors were caused, at least in part, by motor-level impairments with speech production. We found that the patient showed a normal phonemic categorical boundary when discriminating two nonwords that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the nonword stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labeling impairment. These data suggest that the identification (i.e. labeling) of nonword speech sounds may involve the speech motor system, but that the perception of speech sounds (i.e., discrimination) does not require the motor system. This means that motor processes are not causally involved in perception of the speech signal, and suggest that the motor system may be used when other cues (e.g., meaning, context) are not available. PMID:25951749

  10. The Effects of Simulated Stuttering and Prolonged Speech on the Neural Activation Patterns of Stuttering and Nonstuttering Adults

    ERIC Educational Resources Information Center

    De Nil, Luc F.; Beal, Deryk S.; Lafaille, Sophie J.; Kroll, Robert M.; Crawley, Adrian P.; Gracco, Vincent L.

    2008-01-01

    Functional magnetic resonance imaging was used to investigate the neural correlates of passive listening, habitual speech and two modified speech patterns (simulated stuttering and prolonged speech) in stuttering and nonstuttering adults. Within-group comparisons revealed increased right hemisphere biased activation of speech-related regions…

  11. Are phonological influences on lexical (mis)selection the result of a monitoring bias?

    PubMed Central

    Ratinckx, Elie; Ferreira, Victor S.; Hartsuiker, Robert J.

    2009-01-01

    A monitoring bias account is often used to explain speech error patterns that seem to be the result of an interactive language production system, like phonological influences on lexical selection errors. A biased monitor is suggested to detect and covertly correct certain errors more often than others. For instance, this account predicts that errors which are phonologically similar to intended words are harder to detect than ones that are phonologically dissimilar. To test this, we tried to elicit phonological errors under the same conditions that show other kinds of lexical selection errors. In five experiments, we presented participants with high cloze probability sentence fragments followed by a picture that was either semantically related, a homophone of a semantically related word, or phonologically related to the (implicit) last word of the sentence. All experiments elicited semantic completions or homophones of semantic completions, but none elicited phonological completions. This finding is hard to reconcile with a monitoring bias account and is better explained with an interactive production system. Additionally, this finding constrains the amount of bottom-up information flow in interactive models. PMID:18942035

  12. Kinematic Analysis of Speech Sound Sequencing Errors Induced by Delayed Auditory Feedback.

    PubMed

    Cler, Gabriel J; Lee, Jackson C; Mittelman, Talia; Stepp, Cara E; Bohland, Jason W

    2017-06-22

    Delayed auditory feedback (DAF) causes speakers to become disfluent and make phonological errors. Methods for assessing the kinematics of speech errors are lacking, with most DAF studies relying on auditory perceptual analyses, which may be problematic, as errors judged to be categorical may actually represent blends of sounds or articulatory errors. Eight typical speakers produced nonsense syllable sequences under normal auditory feedback and DAF (200 ms). Lip and tongue kinematics were captured with electromagnetic articulography. Time-locked acoustic recordings were transcribed, and the kinematics of utterances with and without perceived errors were analyzed with existing and novel quantitative methods. New multivariate measures showed that for 5 participants, kinematic variability for productions perceived to be error free was significantly increased under delay; these results were validated by using the spatiotemporal index measure. Analysis of error trials revealed both typical productions of a nontarget syllable and productions with articulatory kinematics that incorporated aspects of both the target and the perceived utterance. This study is among the first to characterize articulatory changes under DAF and provides evidence for different classes of speech errors, which may not be perceptually salient. New methods were developed that may aid visualization and analysis of large kinematic data sets. Supplemental material: https://doi.org/10.23641/asha.5103067.

  13. Kinematic Analysis of Speech Sound Sequencing Errors Induced by Delayed Auditory Feedback

    PubMed Central

    Lee, Jackson C.; Mittelman, Talia; Stepp, Cara E.; Bohland, Jason W.

    2017-01-01

    Purpose Delayed auditory feedback (DAF) causes speakers to become disfluent and make phonological errors. Methods for assessing the kinematics of speech errors are lacking, with most DAF studies relying on auditory perceptual analyses, which may be problematic, as errors judged to be categorical may actually represent blends of sounds or articulatory errors. Method Eight typical speakers produced nonsense syllable sequences under normal auditory feedback and DAF (200 ms). Lip and tongue kinematics were captured with electromagnetic articulography. Time-locked acoustic recordings were transcribed, and the kinematics of utterances with and without perceived errors were analyzed with existing and novel quantitative methods. Results New multivariate measures showed that for 5 participants, kinematic variability for productions perceived to be error free was significantly increased under delay; these results were validated by using the spatiotemporal index measure. Analysis of error trials revealed both typical productions of a nontarget syllable and productions with articulatory kinematics that incorporated aspects of both the target and the perceived utterance. Conclusions This study is among the first to characterize articulatory changes under DAF and provides evidence for different classes of speech errors, which may not be perceptually salient. New methods were developed that may aid visualization and analysis of large kinematic data sets. Supplemental Material https://doi.org/10.23641/asha.5103067 PMID:28655038

  14. The Different Time Course of Phonotactic Constraint Learning in Children and Adults: Evidence from Speech Errors

    ERIC Educational Resources Information Center

    Smalle, Eleonore H. M.; Muylle, Merel; Szmalec, Arnaud; Duyck, Wouter

    2017-01-01

    Speech errors typically respect the speaker's implicit knowledge of language-wide phonotactics (e.g., /ŋ/ cannot be a syllable onset in the English language). Previous work demonstrated that adults can learn novel experimentally induced phonotactic constraints by producing syllable strings in which the allowable position of a phoneme depends on…

  15. Improved Speech Coding Based on Open-Loop Parameter Estimation

    NASA Technical Reports Server (NTRS)

    Juang, Jer-Nan; Chen, Ya-Chin; Longman, Richard W.

    2000-01-01

    A nonlinear optimization algorithm for linear predictive speech coding was developed earlier that not only optimizes the linear model coefficients for the open-loop predictor, but does the optimization including the effects of quantization of the transmitted residual. It also simultaneously optimizes the quantization levels used for each speech segment. In this paper, we present an improved method for initialization of this nonlinear algorithm and demonstrate substantial improvements in performance. In addition, the new procedure produces monotonically improving speech quality with increasing numbers of bits used in the transmitted error residual. Examples of speech encoding and decoding are given for 8 speech segments, and signal-to-noise levels as high as 47 dB are produced. As in typical linear predictive coding, the optimization is done on the open-loop speech analysis model. Here we demonstrate that minimizing the error of the closed-loop speech reconstruction, instead of the simpler open-loop optimization, is likely to produce negligible improvement in speech quality. The examples suggest that the algorithm here is close to giving the best performance obtainable from a linear model, for the chosen order with the chosen number of bits for the codebook.
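
    For context, the core open-loop step in linear predictive analysis is estimating predictor coefficients that minimize the residual energy of each speech segment. Below is a minimal sketch using the standard autocorrelation method with Levinson-Durbin recursion; the paper's actual contribution, jointly optimizing coefficients and quantization levels under quantization effects, is not reproduced here.

        import numpy as np

        def lpc_autocorrelation(frame, order):
            """LPC coefficients for one segment (autocorrelation method).

            Levinson-Durbin recursion: the returned polynomial a
            (a[0] == 1) minimizes the open-loop prediction error
            e[n] = x[n] + sum_k a[k] * x[n - k] in the least-squares
            sense. Assumes a windowed, non-silent segment (r[0] > 0).
            """
            x = np.asarray(frame, dtype=float)
            n = len(x)
            r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
            a = np.zeros(order + 1)
            a[0] = 1.0
            err = r[0]
            for i in range(1, order + 1):
                k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
                a[1:i] += k * a[i - 1:0:-1]   # update earlier coefficients
                a[i] = k                      # new reflection coefficient
                err *= (1.0 - k * k)          # residual energy shrinks
            return a, err

    For an order-10 predictor on 8 kHz speech, a typical call is lpc_autocorrelation(segment, 10) on a windowed 20-30 ms segment.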

  16. Pitch-Learning Algorithm For Speech Encoders

    NASA Technical Reports Server (NTRS)

    Bhaskar, B. R. Udaya

    1988-01-01

    Adaptive algorithm detects and corrects errors in sequence of estimates of pitch period of speech. Algorithm operates in conjunction with techniques used to estimate pitch period. Used in such parametric and hybrid speech coders as linear predictive coders and adaptive predictive coders.
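
    The abstract gives no implementation detail, so the following is only a generic sketch of the kind of rule such pitch trackers apply, with all thresholds and the smoothing rule assumed: estimates near double or half a running reference are treated as octave errors and rescaled, and other gross outliers are replaced by the reference.

        def correct_pitch_track(periods, tol=0.2):
            """Detect and correct gross errors in pitch-period estimates.

            Generic sketch only: values near double or half a slowly
            adapting reference are treated as octave errors and rescaled;
            other gross outliers are replaced by the reference. The first
            estimate is assumed reliable; thresholds are illustrative.
            """
            ref = float(periods[0])
            corrected = []
            for p in periods:
                for scale in (1.0, 0.5, 2.0):     # try octave corrections
                    if abs(p * scale - ref) <= tol * ref:
                        p = p * scale
                        break
                else:
                    p = ref                       # isolated gross error
                corrected.append(p)
                ref = 0.8 * ref + 0.2 * p         # adapt the reference
            return corrected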

  17. The Impact of Dysphonic Voices on Healthy Listeners: Listener Reaction Times, Speech Intelligibility, and Listener Comprehension.

    PubMed

    Evitts, Paul M; Starmer, Heather; Teets, Kristine; Montgomery, Christen; Calhoun, Lauren; Schulze, Allison; MacKenzie, Jenna; Adams, Lauren

    2016-11-01

    There is currently minimal information on the impact of dysphonia secondary to phonotrauma on listeners. Considering the high incidence of voice disorders among professional voice users, it is important to understand the impact of a dysphonic voice on their audiences. Ninety-one healthy listeners (39 men, 52 women; mean age = 23.62 years) were presented with speech stimuli from 5 healthy speakers and 5 speakers diagnosed with dysphonia secondary to phonotrauma. Dependent variables included processing speed (reaction time [RT] ratio), speech intelligibility, and listener comprehension. Voice quality ratings were also obtained for all speakers by 3 expert listeners. Statistical results showed significant differences in RT ratio and in the number of speech intelligibility errors between healthy and dysphonic voices. There was not a significant difference in listener comprehension errors. Multiple regression analyses showed that voice quality ratings from the Consensus Assessment Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were able to predict RT ratio and speech intelligibility but not listener comprehension. Results of the study suggest that although listeners require more time to process and have more intelligibility errors when presented with speech stimuli from speakers with dysphonia secondary to phonotrauma, listener comprehension may not be affected.

  18. Pulse Vector-Excitation Speech Encoder

    NASA Technical Reports Server (NTRS)

    Davidson, Grant; Gersho, Allen

    1989-01-01

    Proposed pulse vector-excitation speech encoder (PVXC) encodes analog speech signals into digital representation for transmission or storage at rates below 5 kilobits per second. Produces high-quality reconstructed speech with less computation than required by comparable speech-encoding systems. Has some characteristics of multipulse linear predictive coding (MPLPC) and of code-excited linear prediction (CELP). System uses mathematical model of vocal tract in conjunction with set of excitation vectors and perceptually based error criterion to synthesize natural-sounding speech.
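
    The analysis-by-synthesis selection shared by PVXC, MPLPC, and CELP can be sketched generically. The function below is an assumption-laden simplification: synthesize() stands in for the vocal-tract (LPC) model, plain squared error replaces the perceptually based criterion, and the pulse structure of the excitation is abstracted away.

        import numpy as np

        def select_excitation(target, codebook, synthesize):
            """Pick the excitation whose synthesized output best fits target.

            Generic analysis-by-synthesis loop: synthesize() applies the
            vocal-tract model to a candidate excitation vector; a
            perceptual weighting filter would normally shape the error.
            """
            best_index, best_gain, best_err = -1, 0.0, np.inf
            for i, c in enumerate(codebook):
                y = synthesize(c)
                gain = np.dot(target, y) / np.dot(y, y)   # optimal gain
                err = np.sum((target - gain * y) ** 2)
                if err < best_err:
                    best_index, best_gain, best_err = i, gain, err
            return best_index, best_gain

    The encoder then transmits the winning index and quantized gain; the decoder regenerates the same excitation from its copy of the codebook.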

  19. Neuropsychological analysis of a typewriting disturbance following cerebral damage.

    PubMed

    Boyle, M; Canter, G J

    1987-01-01

    Following a left CVA, a skilled professional typist sustained a disturbance of typing disproportionate to her handwriting disturbance. Typing errors were predominantly of the sequencing type, with spatial errors much less frequent, suggesting that the impairment was based on a relatively early (premotor) stage of processing. Depriving the subject of visual feedback during handwriting greatly increased her error rate. Similarly, interfering with auditory feedback during speech substantially reduced her self-correction of speech errors. These findings suggested that impaired ability to utilize somesthetic information--probably caused by the subject's parietal lobe lesion--may have been the basis of the typing disorder.

  20. "Non-Vocalization": A Phonological Error Process in the Speech of Severely and Profoundly Hearing Impaired Adults, from the Point of View of the Theory of Phonology as Human Behaviour

    ERIC Educational Resources Information Center

    Halpern, Orly; Tobin, Yishai

    2008-01-01

    "Non-vocalization" (N-V) is a newly described phonological error process in hearing impaired speakers. In N-V the hearing impaired person actually articulates the phoneme but without producing a voice. The result is an error process looking as if it is produced but sounding as if it is omitted. N-V was discovered by video recording the speech of…

  1. The Case for Subphonemic Attenuation in Inner Speech: Comment on Corley, Brocklehurst, and Moat (2011)

    ERIC Educational Resources Information Center

    Oppenheim, Gary M.

    2012-01-01

    Corley, Brocklehurst, and Moat (2011) recently demonstrated a phonemic similarity effect for phonological errors in inner speech, claiming that it contradicted Oppenheim and Dell's (2008) characterization of inner speech as lacking subphonemic detail (e.g., features). However, finding "an effect" in both inner and overt speech is not the same as…

  2. Ultrasound Images of the Tongue: A Tutorial for Assessment and Remediation of Speech Sound Errors.

    PubMed

    Preston, Jonathan L; McAllister Byun, Tara; Boyce, Suzanne E; Hamilton, Sarah; Tiede, Mark; Phillips, Emily; Rivera-Campos, Ahmed; Whalen, Douglas H

    2017-01-03

    Diagnostic ultrasound imaging has been a common tool in medical practice for several decades. It provides a safe and effective method for imaging structures internal to the body. There has been a recent increase in the use of ultrasound technology to visualize the shape and movements of the tongue during speech, both in typical speakers and in clinical populations. Ultrasound imaging of speech has greatly expanded our understanding of how sounds articulated with the tongue (lingual sounds) are produced. Such information can be particularly valuable for speech-language pathologists. Among other advantages, ultrasound images can be used during speech therapy to provide (1) illustrative models of typical (i.e., "correct") tongue configurations for speech sounds, and (2) a source of insight into the articulatory nature of deviant productions. The images can also be used as an additional source of feedback for clinical populations learning to distinguish their better productions from their incorrect productions, en route to establishing more effective articulatory habits. Ultrasound feedback is increasingly used by scientists and clinicians as the expertise of the users increases and the expense of the equipment declines. In this tutorial, procedures are presented for collecting ultrasound images of the tongue in a clinical context. We illustrate these procedures in an extended example featuring one common error sound, American English /r/. Images of correct and distorted /r/ are used to demonstrate (1) how to interpret ultrasound images, (2) how to assess tongue shape during production of speech sounds, (3) how to categorize tongue shape errors, and (4) how to provide visual feedback to elicit a more appropriate and functional tongue shape. We present a sample protocol for using real-time ultrasound images of the tongue for visual feedback to remediate speech sound errors. Additionally, example data are shown to illustrate outcomes with the procedure.

  3. When one person's mistake is another's standard usage: the effect of foreign accent on syntactic processing.

    PubMed

    Hanulíková, Adriana; van Alphen, Petra M; van Goch, Merel M; Weber, Andrea

    2012-04-01

    How do native listeners process grammatical errors that are frequent in non-native speech? We investigated whether the neural correlates of syntactic processing are modulated by speaker identity. ERPs to gender agreement errors in sentences spoken by a native speaker were compared with the same errors spoken by a non-native speaker. In line with previous research, gender violations in native speech resulted in a P600 effect (larger P600 for violations in comparison with correct sentences), but when the same violations were produced by the non-native speaker with a foreign accent, no P600 effect was observed. Control sentences with semantic violations elicited comparable N400 effects for both the native and the non-native speaker, confirming no general integration problem in foreign-accented speech. The results demonstrate that the P600 is modulated by speaker identity, extending our knowledge about the role of speaker's characteristics on neural correlates of speech processing.

  4. Investigating the Retention and Time-Course of Phonotactic Constraint Learning From Production Experience

    PubMed Central

    Warker, Jill A.

    2013-01-01

    Adults can rapidly learn artificial phonotactic constraints, such as "/f/ only occurs at the beginning of syllables", by producing syllables that contain those constraints. This implicit learning is then reflected in their speech errors. However, second-order constraints in which the placement of a phoneme depends on another characteristic of the syllable (e.g., if the vowel is /æ/, /f/ occurs at the beginning of syllables and /s/ occurs at the end of syllables, but if the vowel is /ɪ/, the reverse is true) require a longer learning period. Two experiments question the transience of second-order learning and whether consolidation plays a role in learning phonological dependencies. Using speech errors as a measure of learning, Experiment 1 investigated the durability of learning, and Experiment 2 investigated the time-course of learning. Experiment 1 found that learning is still present in speech errors a week later. Experiment 2 looked at whether more time in the form of a consolidation period or more experience in the form of more trials was necessary for learning to be revealed in speech errors. Both consolidation and more trials led to learning; however, consolidation provided a more substantial benefit. PMID:22686839

  5. Understanding native Russian listeners' errors on an English word recognition test: model-based analysis of phoneme confusion.

    PubMed

    Shi, Lu-Feng; Morozova, Natalia

    2012-08-01

    Word recognition is a basic component in a comprehensive hearing evaluation, but data are lacking for listeners speaking two languages. This study obtained such data for Russian natives in the US and analysed the data using the perceptual assimilation model (PAM) and speech learning model (SLM). Listeners were randomly presented 200 NU-6 words in quiet. Listeners responded verbally and in writing. Performance was scored on words and phonemes (word-initial consonants, vowels, and word-final consonants). Seven normal-hearing, adult monolingual English natives (NM), 16 English-dominant (ED), and 15 Russian-dominant (RD) Russian natives participated. ED and RD listeners differed significantly in their language background. Consistent with the SLM, NM outperformed ED listeners and ED outperformed RD listeners, whether responses were scored on words or phonemes. NM and ED listeners shared similar phoneme error patterns, whereas RD listeners' errors had unique patterns that could be largely understood via the PAM. RD listeners had particular difficulty differentiating vowel contrasts /i-ɪ/, /æ-ɛ/, and /ɑ-ʌ/, word-initial consonant contrasts /p-h/ and /b-f/, and word-final contrasts /f-v/. Both first-language phonology and second-language learning history affect word and phoneme recognition. Current findings may help clinicians differentiate word recognition errors due to language background from hearing pathologies.

  6. The speech perception skills of children with and without speech sound disorder.

    PubMed

    Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie

    To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes: /k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger scale study.

  7. The phonological abilities of Cantonese-speaking children with hearing loss.

    PubMed

    Dodd, B J; So, L K

    1994-06-01

    Little is known about the acquisition of phonology by children with hearing loss who learn languages other than English. In this study, the phonological abilities of 12 Cantonese-speaking children (ages 4;2 to 6;11) with prelingual hearing impairment are described. All but 3 children had almost complete syllable-initial consonant repertoires; all but 2 had complete syllable-final consonant and vowel repertoires; and only 1 child failed to produce all nine tones. Children's perception of single words was assessed using sets of words that included tone, consonant, and semantic distractors. Although the performance of the subjects was not age appropriate, they nevertheless most often chose the target, with most errors observed for the tone distractor. The phonological rules used included those that characterize the speech of younger hearing children acquiring Cantonese (e.g., cluster reduction, stopping, and deaspiration). However, most children also used at least one unusual phonological rule (e.g., frication, addition, initial consonant deletion, and/or backing). These rules are common in the speech of Cantonese-speaking children diagnosed as phonologically disordered. The influence of the ambient language on children's patterns of phonological errors is discussed.

  8. Ultrasound visual feedback treatment and practice variability for residual speech sound errors

    PubMed Central

    Preston, Jonathan L.; McCabe, Patricia; Rivera-Campos, Ahmed; Whittle, Jessica L.; Landry, Erik; Maas, Edwin

    2014-01-01

    Purpose The goals were to (1) test the efficacy of a motor-learning based treatment that includes ultrasound visual feedback for individuals with residual speech sound errors, and (2) explore whether the addition of prosodic cueing facilitates speech sound learning. Method A multiple baseline single subject design was used, replicated across 8 participants. For each participant, one sound context was treated with ultrasound plus prosodic cueing for 7 sessions, and another sound context was treated with ultrasound but without prosodic cueing for 7 sessions. Sessions included ultrasound visual feedback as well as non-ultrasound treatment. Word-level probes assessing untreated words were used to evaluate retention and generalization. Results For most participants, increases in accuracy of target sound contexts at the word level were observed with the treatment program regardless of whether prosodic cueing was included. Generalization between onset singletons and clusters was observed, as well as generalization to sentence-level accuracy. There was evidence of retention during post-treatment probes, including at a two-month follow-up. Conclusions A motor-based treatment program that includes ultrasound visual feedback can facilitate learning of speech sounds in individuals with residual speech sound errors. PMID:25087938

  9. Speech Abilities in Preschool Children with Speech Sound Disorder with and without Co-Occurring Language Impairment

    ERIC Educational Resources Information Center

    Macrae, Toby; Tyler, Ann A.

    2014-01-01

    Purpose: The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. Method: In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different…

  10. Incorporating Auditory Models in Speech/Audio Applications

    NASA Astrophysics Data System (ADS)

    Krishnamoorthi, Harish

    2011-12-01

    Following the success in incorporating perceptual models in audio coding algorithms, their application in other speech/audio processing systems is expanding. In general, all perceptual speech/audio processing algorithms involve minimization of an objective function that directly/indirectly incorporates properties of human perception. This dissertation primarily investigates the problems associated with directly embedding an auditory model in the objective function formulation and proposes possible solutions to overcome high complexity issues for use in real-time speech/audio algorithms. Specific problems addressed in this dissertation include: 1) the development of approximate but computationally efficient auditory model implementations that are consistent with the principles of psychoacoustics, 2) the development of a mapping scheme that allows synthesizing a time/frequency domain representation from its equivalent auditory model output. The first problem is aimed at addressing the high computational complexity involved in solving perceptual objective functions that require repeated application of auditory model for evaluation of different candidate solutions. In this dissertation, a frequency pruning and a detector pruning algorithm is developed that efficiently implements the various auditory model stages. The performance of the pruned model is compared to that of the original auditory model for different types of test signals in the SQAM database. Experimental results indicate only a 4-7% relative error in loudness while attaining up to 80-90% reduction in computational complexity. Similarly, a hybrid algorithm is developed specifically for use with sinusoidal signals and employs the proposed auditory pattern combining technique together with a look-up table to store representative auditory patterns. The second problem obtains an estimate of the auditory representation that minimizes a perceptual objective function and transforms the auditory pattern back to its equivalent time/frequency representation. This avoids the repeated application of auditory model stages to test different candidate time/frequency vectors in minimizing perceptual objective functions. In this dissertation, a constrained mapping scheme is developed by linearizing certain auditory model stages that ensures obtaining a time/frequency mapping corresponding to the estimated auditory representation. This paradigm was successfully incorporated in a perceptual speech enhancement algorithm and a sinusoidal component selection task.

  11. Acoustic and Perceptual Effects of Dysarthria in Greek with a Focus on Lexical Stress

    NASA Astrophysics Data System (ADS)

    Papakyritsis, Ioannis

    The field of motor speech disorders in Greek is substantially underresearched. Additionally, acoustic studies on lexical stress in dysarthria are generally very rare (Kim et al. 2010). This dissertation examined the acoustic and perceptual effects of Greek dysarthria focusing on lexical stress. Additional possibly deviant speech characteristics were acoustically analyzed. Data from three dysarthric participants and matched controls was analyzed using a case study design. The analysis of lexical stress was based on data drawn from a single word repetition task that included pairs of disyllabic words differentiated by stress location. This data was acoustically analyzed in terms of the use of the acoustic cues for Greek stress. The ability of the dysarthric participants to signal stress in single words was further assessed in a stress identification task carried out by 14 naive Greek listeners. Overall, the acoustic and perceptual data indicated that, although all three dysarthric speakers presented with some difficulty in the patterning of stressed and unstressed syllables, each had different underlying problems that gave rise to quite distinct patterns of deviant speech characteristics. The atypical use of lexical stress cues in Anna's data obscured the prominence relations of stressed and unstressed syllables to the extent that the position of lexical stress was usually not perceptually transparent. Chris and Maria on the other hand, did not have marked difficulties signaling lexical stress location, although listeners were not 100% successful in the stress identification task. For the most part, Chris' atypical phonation patterns and Maria's very slow rate of speech did not interfere with lexical stress signaling. The acoustic analysis of the lexical stress cues was generally in agreement with the participants' performance in the stress identification task. Interestingly, in all three dysarthric participants, but more so in Anna, targets stressed on the 1st syllable were more impervious to error judgments of lexical stress location than targets stressed on the 2nd syllable, although the acoustic metrics did not always suggest a more appropriate use of lexical stress cues in 1st syllable position. The findings contribute to our limited knowledge of the speech characteristics of dysarthria across different languages.

  12. Assessing Auditory Discrimination Skill of Malay Children Using Computer-based Method.

    PubMed

    Ting, H; Yunus, J; Mohd Nordin, M Z

    2005-01-01

    The purpose of this paper is to investigate the auditory discrimination skill of Malay children using a computer-based method. Currently, most auditory discrimination assessments are conducted manually by speech-language pathologists. These conventional tests are general tests of sound discrimination, which do not reflect the client's specific speech sound errors. Thus, we propose a computer-based Malay auditory discrimination test to automate the whole assessment process as well as to customize the test according to the specific speech error sounds of the client. The ability to discriminate voiced and unvoiced Malay speech sounds was studied in Malay children aged between 7 and 10 years. The study showed no major difficulty for the children in discriminating the Malay speech sounds except differentiating the /g/-/k/ sounds. On average, the 7-year-old children failed to discriminate the /g/-/k/ sounds.

  13. Effect of Parkinson's disease on the production of structured and unstructured speaking tasks: Respiratory physiologic and linguistic considerations

    PubMed Central

    Huber, Jessica E.; Darling, Meghan

    2012-01-01

    Purpose The purpose of the present study was to examine the effects of cognitive-linguistic deficits and respiratory physiologic changes on respiratory support for speech in PD, using two speech tasks, reading and extemporaneous speech. Methods Five women with PD, 9 men with PD, and 14 age- and sex-matched control participants read a passage and spoke extemporaneously on a topic of their choice at comfortable loudness. Sound pressure level, syllables per breath group, speech rate, and lung volume parameters were measured. Number of formulation errors, disfluencies, and filled pauses were counted. Results Individuals with PD produced shorter utterances as compared to control participants. The relationships between utterance length and lung volume initiation and inspiratory duration were weaker in individuals with PD than for control participants, particularly for the extemporaneous speech task. These results suggest less consistent planning for utterance length by individuals with PD in extemporaneous speech. Individuals with PD produced more formulation errors in both tasks and significantly fewer filled pauses in extemporaneous speech. Conclusions Both respiratory physiologic and cognitive-linguistic issues affected speech production by individuals with PD. Overall, individuals with PD had difficulty planning or coordinating language formulation and respiratory support, in particular during extemporaneous speech. PMID:20844256

  14. Masking of errors in transmission of VAPC-coded speech

    NASA Technical Reports Server (NTRS)

    Cox, Neil B.; Froese, Edwin L.

    1990-01-01

    A subjective evaluation is provided of the bit error sensitivity of the message elements of a Vector Adaptive Predictive (VAPC) speech coder, along with an indication of the amenability of these elements to a popular error masking strategy (cross frame hold over). As expected, a wide range of bit error sensitivity was observed. The most sensitive message components were the short term spectral information and the most significant bits of the pitch and gain indices. The cross frame hold over strategy was found to be useful for pitch and gain information, but it was not beneficial for the spectral information unless severe corruption had occurred.
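
    Cross-frame hold-over itself is simple to express in code. Below is a minimal sketch with a hypothetical frame layout, not the actual VAPC bitstream; consistent with the study's finding, only the pitch and gain indices are held over from the last good frame, while the error-sensitive spectral information is left untouched.

        from dataclasses import dataclass, replace

        @dataclass
        class VapcFrame:            # hypothetical field layout
            spectrum: tuple         # short-term spectral information
            pitch: int              # pitch index
            gain: float             # gain index
            corrupted: bool         # flagged by channel error detection

        def hold_over(frames):
            """Reuse last good pitch/gain in corrupted frames (hold-over)."""
            last_good, out = None, []
            for f in frames:
                if f.corrupted and last_good is not None:
                    f = replace(f, pitch=last_good.pitch,
                                gain=last_good.gain)
                else:
                    last_good = f
                out.append(f)
            return out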

  15. Stability and Patterning of Speech Movement Sequences in Children and Adults.

    ERIC Educational Resources Information Center

    Smith, Anne; Goffman, Lisa

    1998-01-01

    A study of 16 children (ages 4 and 7 years) and 8 young adults used an "Optotrak" system to study patterning and stability of speech movements in developing speech motor systems. Results indicate that nonlinear and nonuniform changes occur in components of the speech motor system during development. (Author/CR)

  16. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    ERIC Educational Resources Information Center

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  17. Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study

    ERIC Educational Resources Information Center

    Preston, Jonathan L.; Seki, Ayumi

    2011-01-01

    Purpose: To describe (a) the assessment of residual speech sound disorders (SSDs) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations and (b) how assessment of domains such as speech motor control and phonological awareness can provide a more complete…

  18. Linkage of Speech Sound Disorder to Reading Disability Loci

    ERIC Educational Resources Information Center

    Smith, Shelley D.; Pennington, Bruce F.; Boada, Richard; Shriberg, Lawrence D.

    2005-01-01

    Background: Speech sound disorder (SSD) is a common childhood disorder characterized by developmentally inappropriate errors in speech production that greatly reduce intelligibility. SSD has been found to be associated with later reading disability (RD), and there is also evidence for both a cognitive and etiological overlap between the two…

  19. Characterizing Articulation in Apraxic Speech Using Real-Time Magnetic Resonance Imaging

    ERIC Educational Resources Information Center

    Hagedorn, Christina; Proctor, Michael; Goldstein, Louis; Wilson, Stephen M.; Miller, Bruce; Gorno-Tempini, Maria Luisa; Narayanan, Shrikanth S.

    2017-01-01

    Purpose: Real-time magnetic resonance imaging (MRI) and accompanying analytical methods are shown to capture and quantify salient aspects of apraxic speech, substantiating and expanding upon evidence provided by clinical observation and acoustic and kinematic data. Analysis of apraxic speech errors within a dynamic systems framework is provided…

  20. A model of serial order problems in fluent, stuttered and agrammatic speech.

    PubMed

    Howell, Peter

    2007-10-01

    Many models of speech production have attempted to explain dysfluent speech. Most models assume that the disruptions that occur when speech is dysfluent arise because the speakers make errors while planning an utterance. In this contribution, a model of the serial order of speech is described that does not make this assumption. It involves the coordination or 'interlocking' of linguistic planning and execution stages at the language-speech interface. The model is examined to determine whether it can distinguish two forms of dysfluent speech (stuttered and agrammatic speech) that are characterized by iteration and omission of whole words and parts of words.

  1. When does speech sound disorder matter for literacy? The role of disordered speech errors, co-occurring language impairment and family risk of dyslexia.

    PubMed

    Hayiou-Thomas, Marianna E; Carroll, Julia M; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J

    2017-02-01

    This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were assessed at the start of formal reading instruction (age 5½), using measures of phoneme awareness, word-level reading and spelling; and 3 years later (age 8), using measures of word-level reading, spelling and reading comprehension. The presence of early SSD conferred a small but significant risk of poor phonemic skills and spelling at the age of 5½ and of poor word reading at the age of 8. Furthermore, within the group with SSD, the persistence of speech difficulties to the point of school entry was associated with poorer emergent literacy skills, and children with 'disordered' speech errors had poorer word reading skills than children whose speech errors indicated 'delay'. In contrast, the initial severity of SSD was not a significant predictor of reading development. Beyond the domain of speech, the presence of a co-occurring language impairment was strongly predictive of literacy skills and having a family risk of dyslexia predicted additional variance in literacy at both time-points. Early SSD alone has only modest effects on literacy development but when additional risk factors are present, these can have serious negative consequences, consistent with the view that multiple risks accumulate to predict reading disorders. © 2016 The Authors. Journal of Child Psychology and Psychiatry published by John Wiley & Sons Ltd on behalf of Association for Child and Adolescent Mental Health.

  2. The varieties of speech to young children.

    PubMed

    Huttenlocher, Janellen; Vasilyeva, Marina; Waterfall, Heidi R; Vevea, Jack L; Hedges, Larry V

    2007-09-01

    This article examines caregiver speech to young children. The authors obtained several measures of the speech used to children during early language development (14-30 months). For all measures, they found substantial variation across individuals and subgroups. Speech patterns vary with caregiver education, and the differences are maintained over time. While there are distinct levels of complexity for different caregivers, there is a common pattern of increase across age within the range that characterizes each educational group. Thus, caregiver speech exhibits both long-standing patterns of linguistic behavior and adjustment for the interlocutor. This information about the variability of speech by individual caregivers provides a framework for systematic study of the role of input in language acquisition. PsycINFO Database Record (c) 2007 APA, all rights reserved

  3. Measurement of Trained Speech Patterns in Stuttering: Interjudge and Intrajudge Agreement of Experts by Means of Modified Time-Interval Analysis

    ERIC Educational Resources Information Center

    Alpermann, Anke; Huber, Walter; Natke, Ulrich; Willmes, Klaus

    2010-01-01

    Improved fluency after stuttering therapy is usually measured by the percentage of stuttered syllables. However, outcome studies rarely evaluate the use of trained speech patterns that speakers use to manage stuttering. This study investigated whether the modified time interval analysis can distinguish between trained speech patterns, fluent…

  4. Ictal speech and language dysfunction in adult epilepsy: Clinical study of 95 seizures.

    PubMed

    Dussaule, C; Cauquil, C; Flamand-Roze, C; Gagnepain, J-P; Bouilleret, V; Denier, C; Masnou, P

    2017-04-01

    To analyze the semiological characteristics of the language and speech disorders arising during epileptic seizures, and to describe the patterns of language and speech disorders that can predict laterality of the epileptic focus. This study retrospectively analyzed 95 consecutive videos of seizures with language and/or speech disorders in 44 patients admitted for diagnostic video-EEG monitoring. Laterality of the epileptic focus was defined according to electro-clinical correlation studies and structural and functional neuroimaging findings. Language and speech disorders were analyzed by a neurologist and a speech therapist blinded to these data. Language and/or speech disorders were subdivided into eight dynamic patterns: pure anterior aphasia; anterior aphasia and vocal; anterior aphasia and "arthria"; pure posterior aphasia; posterior aphasia and vocal; pure vocal; vocal and arthria; and pure arthria. The epileptic focus was in the left hemisphere in more than 4/5 of seizures presenting with pure anterior aphasia or pure posterior aphasia patterns, while discharges originated in the right hemisphere in almost 2/3 of seizures presenting with a pure vocal pattern. No laterality value was found for the other patterns. Classification of the language and speech disorders arising during epileptic seizures into dynamic patterns may be useful for the optimal analysis of anatomo-electro-clinical correlations. In addition, our research has led to the development of standardized tests for analyses of language and speech disorders arising during seizures that can be conducted during video-EEG sessions. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  5. Status Report on Speech Research. A Report on the Status and Progress of Studies on the Nature of Speech, Instrumentation for Its Investigation, and Practical Applications.

    DTIC Science & Technology

    1985-10-01

    [Fragmentary OCR excerpts from the report; only isolated passages are recoverable, including a reference to Anderson, V. A. (1942), Training the speaking voice (New York: Oxford University Press), and material from Liberman & Mattingly, "The Motor Theory of Speech Perception Revised," discussing speech perception in contrast to other perceptual processes and the claim of a process of learned equivalence.]

  6. Associations among measures of sequential processing in motor and linguistics tasks in adults with and without a family history of childhood apraxia of speech: a replication study.

    PubMed

    Button, Le; Peter, Beate; Stoel-Gammon, Carol; Raskind, Wendy H

    2013-03-01

    The purpose of this study was to address the hypothesis that childhood apraxia of speech (CAS) is influenced by an underlying deficit in sequential processing that is also expressed in other modalities. In a sample of 21 adults from five multigenerational families, 11 with histories of various familial speech sound disorders, 3 biologically related adults from a family with familial CAS showed motor sequencing deficits in an alternating motor speech task. Compared with the other adults, these three participants showed deficits in tasks requiring high loads of sequential processing, including nonword imitation, nonword reading and spelling. Qualitative error analyses in real word and nonword imitations revealed group differences in phoneme sequencing errors. Motor sequencing ability was correlated with phoneme sequencing errors during real word and nonword imitation, reading and spelling. Correlations were characterized by extremely high scores in one family and extremely low scores in another. Results are consistent with a central deficit in sequential processing in CAS of familial origin.

  7. Associations among measures of sequential processing in motor and linguistics tasks in adults with and without a family history of childhood apraxia of speech: A replication study

    PubMed Central

    BUTTON, LE; PETER, BEATE; STOEL-GAMMON, CAROL; RASKIND, WENDY H.

    2013-01-01

    The purpose of this study was to address the hypothesis that childhood apraxia of speech (CAS) is influenced by an underlying deficit in sequential processing that is also expressed in other modalities. In a sample of 21 adults from five multigenerational families, 11 with histories of various familial speech sound disorders, 3 biologically related adults from a family with familial CAS showed motor sequencing deficits in an alternating motor speech task. Compared with the other adults, these three participants showed deficits in tasks requiring high loads of sequential processing, including nonword imitation, nonword reading and spelling. Qualitative error analyses in real word and nonword imitations revealed group differences in phoneme sequencing errors. Motor sequencing ability was correlated with phoneme sequencing errors during real word and nonword imitation, reading and spelling. Correlations were characterized by extremely high scores in one family and extremely low scores in another. Results are consistent with a central deficit in sequential processing in CAS of familial origin. PMID:23339292

  8. Cognitive Factors and Residual Speech Errors: Basic Science, Translational Research, and Some Clinical Frameworks.

    PubMed

    Eaton, Catherine Torrington

    2015-11-01

    This article explores the theoretical and empirical relationships between cognitive factors and residual speech errors (RSEs). Definitions of relevant cognitive domains are provided, as well as examples of formal and informal tasks that may be appropriate in assessment. Although studies to date have been limited in number and scope, basic research suggests that cognitive flexibility, short- and long-term memory, and self-monitoring may be areas of weakness in this population. Preliminary evidence has not supported a relationship between inhibitory control, attention, and RSEs; however, further studies that control variables such as language ability and temperament are warranted. Previous translational research has examined the effects of self-monitoring training on residual speech errors. Although results have been mixed, some findings suggest that children with RSEs may benefit from the inclusion of this training. The article closes with a discussion of clinical frameworks that target cognitive skills, including self-monitoring and attention, as a means of facilitating speech sound change. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

  9. When Does Speech Sound Disorder Matter for Literacy? The Role of Disordered Speech Errors, Co-Occurring Language Impairment and Family Risk of Dyslexia

    ERIC Educational Resources Information Center

    Hayiou-Thomas, Marianna E.; Carroll, Julia M.; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J.

    2017-01-01

    Background: This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Method: Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were…

  10. Brain-to-text: decoding spoken phrases from phone representations in the brain.

    PubMed

    Herff, Christian; Heger, Dominic; de Pesters, Adriana; Telaar, Dominic; Brunner, Peter; Schalk, Gerwin; Schultz, Tanja

    2015-01-01

    It has long been speculated whether communication between humans and machines based on natural speech related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech.
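    For context, the word and phone error rates quoted above are standard ASR metrics: the Levenshtein (edit) distance between the decoded and reference token sequences, normalized by reference length. A minimal sketch of that computation (illustrative only; not the authors' implementation):

      # Length-normalized edit distance: WER over word tokens, PER over phones.
      def error_rate(reference, hypothesis):
          n, m = len(reference), len(hypothesis)
          # d[i][j] = edit distance between reference[:i] and hypothesis[:j]
          d = [[0] * (m + 1) for _ in range(n + 1)]
          for i in range(n + 1):
              d[i][0] = i                      # deletions
          for j in range(m + 1):
              d[0][j] = j                      # insertions
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  sub = d[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1])
                  d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
          return d[n][m] / max(n, 1)

      # One substitution plus one insertion against a 3-word reference: ~0.67
      print(error_rate("the cat sat".split(), "the bat sat down".split()))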

  11. Brain-to-text: decoding spoken phrases from phone representations in the brain

    PubMed Central

    Herff, Christian; Heger, Dominic; de Pesters, Adriana; Telaar, Dominic; Brunner, Peter; Schalk, Gerwin; Schultz, Tanja

    2015-01-01

    It has long been speculated whether communication between humans and machines based on natural speech related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system can achieve word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step toward human-machine communication based on imagined speech. PMID:26124702

  12. Neural Representations Used by Brain Regions Underlying Speech Production

    ERIC Educational Resources Information Center

    Segawa, Jennifer Anne

    2013-01-01

    Speech utterances are phoneme sequences but may not always be represented as such in the brain. For instance, electropalatography evidence indicates that as speaking rate increases, gestures within syllables are manipulated separately but those within consonant clusters act as one motor unit. Moreover, speech error data suggest that a syllable's…

  13. The Relationship between Speech Impairment, Phonological Awareness and Early Literacy Development

    ERIC Educational Resources Information Center

    Harris, Judy; Botting, Nicola; Myers, Lucy; Dodd, Barbara

    2011-01-01

    Although children with speech impairment are at increased risk for impaired literacy, many learn to read and spell without difficulty. Around half the children with speech impairment have delayed acquisition, making errors typical of a normally developing younger child (e.g. reducing consonant clusters so that "spoon" is pronounced as…

  14. The Comorbidity between Attention-Deficit/Hyperactivity Disorder (ADHD) in Children and Arabic Speech Sound Disorder

    ERIC Educational Resources Information Center

    Hariri, Ruaa Osama

    2016-01-01

    Children with Attention-Deficit/Hyperactivity Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas, including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies pertaining to children with ADHD symptoms who…

  15. Intervention for Children with Severe Speech Disorder: A Comparison of Two Approaches

    ERIC Educational Resources Information Center

    Crosbie, Sharon; Holm, Alison; Dodd, Barbara

    2005-01-01

    Background: Children with speech disorder are a heterogeneous group (e.g. in terms of severity, types of errors and underlying causal factors). Much research has ignored this heterogeneity, giving rise to contradictory intervention study findings. This situation provides clinical motivation to identify the deficits in the speech-processing chain…

  16. The influence of phonological context on the sound errors of a speaker with Wernicke's aphasia.

    PubMed

    Goldmann, R E; Schwartz, M F; Wilshire, C E

    2001-09-01

    A corpus of phonological errors produced in narrative speech by a Wernicke's aphasic speaker (R.W.B.) was tested for context effects using two new methods for establishing chance baselines. A reliable anticipatory effect was found using the second method, which estimated chance from the distance between phoneme repeats in the speech sample containing the errors. Relative to this baseline, error-source distances were shorter than expected for anticipations, but not perseverations. R.W.B.'s anticipation/perseveration ratio measured intermediate between a nonaphasic error corpus and that of a more severe aphasic speaker (both reported in Schwartz et al., 1994), supporting the view that the anticipatory bias correlates with severity. Finally, R.W.B.'s anticipations favored word-initial segments, although errors and sources did not consistently share word or syllable position. Copyright 2001 Academic Press.

  17. Real-time continuous visual biofeedback in the treatment of speech breathing disorders following childhood traumatic brain injury: report of one case.

    PubMed

    Murdoch, B E; Pitt, G; Theodoros, D G; Ward, E C

    1999-01-01

    The efficacy of traditional and physiological biofeedback methods for modifying abnormal speech breathing patterns was investigated in a child with persistent dysarthria following severe traumatic brain injury (TBI). An A-B-A-B single-subject experimental research design was utilized to provide the subject with two exclusive periods of therapy for speech breathing, based on traditional therapy techniques and physiological biofeedback methods, respectively. Traditional therapy techniques included establishing optimal posture for speech breathing, explanation of the movement of the respiratory muscles, and a hierarchy of non-speech and speech tasks focusing on establishing an appropriate level of sub-glottal air pressure, and improving the subject's control of inhalation and exhalation. The biofeedback phase of therapy utilized variable inductance plethysmography (or Respitrace) to provide real-time, continuous visual biofeedback of ribcage circumference during breathing. As in traditional therapy, a hierarchy of non-speech and speech tasks was devised to improve the subject's control of his respiratory pattern. Throughout the project, the subject's respiratory support for speech was assessed both instrumentally and perceptually. Instrumental assessment included kinematic and spirometric measures, and perceptual assessment included the Frenchay Dysarthria Assessment, Assessment of Intelligibility of Dysarthric Speech, and analysis of a speech sample. The results of the study demonstrated that real-time continuous visual biofeedback techniques for modifying speech breathing patterns were not only effective, but superior to the traditional therapy techniques for modifying abnormal speech breathing patterns in a child with persistent dysarthria following severe TBI. These results show that physiological biofeedback techniques are potentially useful clinical tools for the remediation of speech breathing impairment in the paediatric dysarthric population.

  18. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    PubMed Central

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2010-01-01

    In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, respectively, were modestly and substantially more prevalent in participants with ASD than reported population estimates. Double dissociations in speech, prosody, and voice impairments in ASD were interpreted as consistent with a speech attunement framework, rather than with the motor speech impairments that define CAS. Key words: apraxia, dyspraxia, motor speech disorder, speech sound disorder. PMID:20972615

  19. Changes in Voice Onset Time and Motor Speech Skills in Children following Motor Speech Therapy: Evidence from /pa/ productions

    PubMed Central

    Yu, Vickie Y.; Kadis, Darren S.; Oh, Anna; Goshulak, Debra; Namasivayam, Aravind; Pukonen, Margit; Kroll, Robert; De Nil, Luc F.; Pang, Elizabeth W.

    2016-01-01

    This study evaluated changes in motor speech control and inter-gestural coordination for children with speech sound disorders (SSD) subsequent to PROMPT (Prompts for Restructuring Oral Muscular Phonetic Targets) intervention. We measured the distribution patterns of voice onset time (VOT) for a voiceless stop (/p/) to examine the changes in inter-gestural coordination. Two standardized tests were used (VMPAC, GFTA-2) to assess the changes in motor speech skills and articulation. Data showed positive changes in patterns of VOT with a lower pattern of variability. All children showed significantly higher scores for VMPAC, but only some children showed higher scores for GFTA-2. Results suggest that the proprioceptive feedback provided through PROMPT had a positive influence on motor speech control and inter-gestural coordination in voicing behavior. This set of VOT data for children with SSD adds to our understanding of the speech characteristics underlying motor speech control. Directions for future studies are discussed. PMID:24446799

  20. Speech Correction for Children with Cleft Lip and Palate by Networking of Community-Based Care.

    PubMed

    Hanchanlert, Yotsak; Pramakhatay, Worawat; Pradubwong, Suteera; Prathanee, Benjamas

    2015-08-01

    Prevalence of cleft lip and palate (CLP) is high in Northeast Thailand. Most children with CLP face many problems beyond surgery, particularly compensatory articulation disorders (CAD), while speech services and the number of speech and language pathologists (SLPs) are limited. To determine the effectiveness of networking of the Khon Kaen University (KKU) Community-Based Speech Therapy Model: Kosumphisai Hospital, Kosumphisai District and Maha Sarakham Hospital, Mueang District, Maha Sarakham Province for reduction of the number of articulation errors for children with CLP. Eleven children with CLP were recruited in three 1-year projects of the KKU Community-Based Speech Therapy Model. Articulation tests were formally administered by qualified SLPs for baseline and post-treatment outcomes. Training of speech assistants (SAs) was conducted by SLPs. Assigned speech correction (SC) was performed by SAs at home and at local hospitals. Caregivers also gave SC at home 3-4 days a week. Networking of the Community-Based Speech Therapy Model significantly reduced the number of articulation errors for children with CLP at both the word and sentence levels (mean difference = 6.91, 95% confidence interval = 4.15-9.67; mean difference = 5.36, 95% confidence interval = 2.99-7.73, respectively). Networking by Kosumphisai and Maha Sarakham within the KKU Community-Based Speech Therapy Model was a valid and efficient method for providing speech services for children with cleft palate and could be extended to other areas in Thailand and to other developing countries that have similar contexts.

  1. Impact of speech presentation level on cognitive task performance: implications for auditory display design.

    PubMed

    Baldwin, Carryl L; Struckman-Johnson, David

    2002-01-15

    Speech displays and verbal response technologies are increasingly being used in complex, high workload environments that require the simultaneous performance of visual and manual tasks. Examples of such environments include the flight decks of modern aircraft, advanced transport telematics systems providing in-vehicle route guidance and navigational information, and mobile communication equipment in emergency and public safety vehicles. Previous research has established an optimum range for speech intelligibility. However, the potential for variations in presentation levels within this range to affect attentional resources and cognitive processing of speech material has not been examined previously. Results of the current experimental investigation demonstrate that as presentation level increases within this 'optimum' range, participants in high workload situations make fewer sentence-processing errors and generally respond faster. Processing errors were more sensitive to changes in presentation level than were measures of reaction time. Implications of these findings are discussed in terms of their application for the design of speech communications displays in complex multi-task environments.

  2. Speech Entrainment Compensates for Broca's Area Damage

    PubMed Central

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-01-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed that damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443

  3. Halting in Single Word Production: A Test of the Perceptual Loop Theory of Speech Monitoring

    ERIC Educational Resources Information Center

    Slevc, L. Robert; Ferreira, Victor S.

    2006-01-01

    The "perceptual loop theory" of speech monitoring (Levelt, 1983) claims that inner and overt speech are monitored by the comprehension system, which detects errors by comparing the comprehension of formulated utterances to originally intended utterances. To test the perceptual loop monitor, speakers named pictures and sometimes attempted to halt…

  4. Phonological Awareness, Reading Accuracy and Spelling Ability of Children with Inconsistent Phonological Disorder

    ERIC Educational Resources Information Center

    Holm, Alison; Farrier, Faith; Dodd, Barbara

    2008-01-01

    Background: Although children with speech disorder are at increased risk of literacy impairments, many learn to read and spell without difficulty. They are also a heterogeneous population in terms of the number and type of speech errors and their identified speech processing deficits. One problem lies in determining which preschool children with…

  5. Verbal Self-Monitoring in the Second Language

    ERIC Educational Resources Information Center

    Broos, Wouter P. J.; Duyck, Wouter; Hartsuiker, Robert J.

    2016-01-01

    Speakers monitor their own speech for errors. To do so, they may rely on perception of their own speech (external monitoring) but also on an internal speech representation (internal monitoring). While there are detailed accounts of monitoring in first language (L1) processing, it is not clear if and how monitoring is different in a second language…

  6. Speech serial control in healthy speakers and speakers with hypokinetic or ataxic dysarthria: effects of sequence length and practice

    PubMed Central

    Reilly, Kevin J.; Spencer, Kristie A.

    2013-01-01

    The current study investigated the processes responsible for selection of sounds and syllables during production of speech sequences in 10 adults with hypokinetic dysarthria from Parkinson’s disease, five adults with ataxic dysarthria, and 14 healthy control speakers. Speech production data from a choice reaction time task were analyzed to evaluate the effects of sequence length and practice on speech sound sequencing. Speakers produced sequences that were between one and five syllables in length over five experimental runs of 60 trials each. In contrast to the healthy speakers, speakers with hypokinetic dysarthria demonstrated exaggerated sequence length effects for both inter-syllable intervals (ISIs) and speech error rates. Conversely, speakers with ataxic dysarthria failed to demonstrate a sequence length effect on ISIs and were also the only group that did not exhibit practice-related changes in ISIs and speech error rates over the five experimental runs. The exaggerated sequence length effects in the hypokinetic speakers with Parkinson’s disease are consistent with an impairment of action selection during speech sequence production. The absence of length effects in the speakers with ataxic dysarthria is consistent with previous findings that indicate a limited capacity to buffer speech sequences in advance of their execution. In addition, the lack of practice effects in these speakers suggests that learning-related improvements in the production rate and accuracy of speech sequences involve processing by structures of the cerebellum. Together, the current findings inform models of serial control for speech in healthy speakers and support the notion that sequencing deficits contribute to speech symptoms in speakers with hypokinetic or ataxic dysarthria. In addition, these findings indicate that speech sequencing is differentially impaired in hypokinetic and ataxic dysarthria. PMID:24137121

  7. Autistic traits and attention to speech: Evidence from typically developing individuals.

    PubMed

    Korhonen, Vesa; Werner, Stefan

    2017-04-01

    Individuals with autism spectrum disorder have a preference for attending to non-speech stimuli over speech stimuli. We are interested in whether non-speech preference is only a feature of diagnosed individuals, and whether we can test implicit preference experimentally. In typically developed individuals, serial recall is disrupted more by speech stimuli than by non-speech stimuli. Since the behaviour of individuals with autistic traits resembles that of individuals with autism, we used serial recall to test whether autistic traits influence task performance during irrelevant speech sounds. The errors made on the serial recall task during speech or non-speech sounds were counted as a measure of speech or non-speech preference, relative to a no-sound condition. We replicated the serial order effect and found speech to be more disruptive than non-speech sounds, but were unable to find any association between autism quotient scores and the non-speech preference measure. Our results may indicate a learnt behavioural response to speech sounds.

  8. Sound Source Localization and Speech Understanding in Complex Listening Environments by Single-sided Deaf Listeners After Cochlear Implantation.

    PubMed

    Zeitler, Daniel M; Dorman, Michael F; Natale, Sarah J; Loiselle, Louise; Yost, William A; Gifford, Rene H

    2015-09-01

    To assess improvements in sound source localization and speech understanding in complex listening environments after unilateral cochlear implantation for single-sided deafness (SSD). Nonrandomized, open, prospective case series. Tertiary referral center. Nine subjects with a unilateral cochlear implant (CI) for SSD (SSD-CI) were tested. Reference groups for the task of sound source localization included young (n = 45) and older (n = 12) normal-hearing (NH) subjects and 27 bilateral CI (BCI) subjects. Unilateral cochlear implantation. Sound source localization was tested with 13 loudspeakers in a 180-degree arc in front of the subject. Speech understanding was tested with the subject seated in an 8-loudspeaker sound system arrayed in a 360-degree pattern. Directionally appropriate noise, originally recorded in a restaurant, was played from each loudspeaker. Speech understanding in noise was tested using the AzBio sentence test and sound source localization was quantified using root mean square error. All CI subjects showed poorer-than-normal sound source localization. SSD-CI subjects showed a bimodal distribution of scores: six subjects had scores near the mean of those obtained by BCI subjects, whereas three had scores just outside the 95th percentile of NH listeners. Speech understanding improved significantly in the restaurant environment when the signal was presented to the side of the CI. Cochlear implantation for SSD can offer improved speech understanding in complex listening environments and improved sound source localization in both children and adults. On tasks of sound source localization, SSD-CI patients typically perform as well as BCI patients and, in some cases, achieve scores at the upper boundary of normal performance.
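    The root mean square (RMS) error used above to quantify localization is the square root of the mean squared deviation between target and reported azimuths across trials. A hedged sketch (the 13-speaker, 180-degree arc mirrors the setup described above; the trial data are fabricated for illustration):

      import numpy as np

      speakers = np.linspace(-90, 90, 13)      # loudspeaker azimuths, degrees
      targets = speakers[[0, 3, 6, 9, 12]]     # speaker that played each trial
      responses = speakers[[1, 3, 7, 9, 11]]   # speaker the listener reported

      # RMS localization error aggregated over trials
      rms_error = np.sqrt(np.mean((responses - targets) ** 2))
      print(f"RMS localization error: {rms_error:.1f} degrees")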

  9. Deficits in sequential processing manifest in motor and linguistic tasks in a multigenerational family with childhood apraxia of speech

    PubMed Central

    PETER, BEATE; BUTTON, LE; STOEL-GAMMON, CAROL; CHAPMAN, KATHY; RASKIND, WENDY H.

    2013-01-01

    The purpose of this study was to evaluate a global deficit in sequential processing as a candidate endophenotype in a family with familial childhood apraxia of speech (CAS). Of 10 adults and 13 children in a three-generational family with speech sound disorder (SSD) consistent with CAS, 3 adults and 6 children had past or present SSD diagnoses. Two preschoolers with unremediated CAS showed a high number of sequencing errors during single-word production. Performance on tasks with high sequential processing loads differentiated between the affected and unaffected family members, whereas there were no group differences in tasks with low processing loads. Adults with a history of SSD produced more sequencing errors during nonword and multisyllabic real word imitation, compared to those without such a history. Results are consistent with a global deficit in sequential processing that influences speech development as well as cognitive and linguistic processing. PMID:23339324

  10. Thickened Liquids: Practice Patterns of Speech-Language Pathologists

    ERIC Educational Resources Information Center

    Garcia, Jane Mertz; Chambers, Edgar, IV; Molander, Michelle

    2005-01-01

    This study surveyed the practice patterns of speech-language pathologists in their use of thickened liquids for patients with swallowing difficulties. A 25-item Internet survey about thickened liquids was posted via an e-mail list to members of the American Speech-Language-Hearing Association Division 13, Swallowing and Swallowing Disorders…

  11. How Should Children with Speech Sound Disorders be Classified? A Review and Critical Evaluation of Current Classification Systems

    ERIC Educational Resources Information Center

    Waring, R.; Knight, R.

    2013-01-01

    Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…

  12. Development of a good-quality speech coder for transmission over noisy channels at 2.4 kb/s

    NASA Astrophysics Data System (ADS)

    Viswanathan, V. R.; Berouti, M.; Higgins, A.; Russell, W.

    1982-03-01

    This report describes the development, study, and experimental results of a 2.4 kb/s speech coder called the harmonic deviations (HDV) vocoder, which transmits good-quality speech over noisy channels with bit-error rates of up to 1%. The HDV coder is based on the linear predictive coding (LPC) vocoder, and it transmits additional information over and above the data transmitted by the LPC vocoder, in the form of deviations between the speech spectrum and the LPC all-pole model spectrum at a selected set of frequencies. At the receiver, the spectral deviations are used to generate the excitation signal for the all-pole synthesis filter. The report describes and compares several methods for extracting the spectral deviations from the speech signal and for encoding them. To limit the bit-rate of the HDV coder to 2.4 kb/s, the report discusses several methods, including orthogonal transformation and minimum-mean-square-error scalar quantization of log area ratios, two-stage vector-scalar quantization, and variable frame rate transmission. The report also presents the results of speech-quality optimization of the HDV coder at 2.4 kb/s.
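    The deviations the HDV coder transmits are differences between a frame's short-time spectrum and its LPC all-pole model spectrum at selected frequencies. A rough sketch of how such deviations could be measured for one frame (the synthetic frame, model order, and frequency picks are placeholder assumptions, not values from the report):

      import numpy as np

      def lpc(frame, order):
          """LPC polynomial A(z) via the autocorrelation method (Levinson-Durbin)."""
          r = np.correlate(frame, frame, "full")[len(frame) - 1:][:order + 1]
          a = np.zeros(order + 1)
          a[0] = 1.0
          err = r[0]
          for i in range(1, order + 1):
              k = -(r[i] + a[1:i] @ r[1:i][::-1]) / err
              a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1][:i]
              err *= 1.0 - k * k
          return a, err

      fs = 8000
      t = np.arange(256) / fs                  # synthetic voiced-like frame
      frame = (np.sin(2*np.pi*200*t) + 0.5*np.sin(2*np.pi*850*t)) * np.hamming(256)

      a, gain = lpc(frame, order=10)
      spec = np.abs(np.fft.rfft(frame, 512))   # short-time speech spectrum
      env = np.sqrt(gain) / np.abs(np.fft.rfft(a, 512))  # all-pole envelope (up to scaling)
      bins = [13, 26, 54]                      # selected frequencies (placeholders)
      deviations_db = 20 * np.log10(spec[bins] / env[bins])  # what HDV would encode
      print(deviations_db)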

  13. Cohesive and coherent connected speech deficits in mild stroke.

    PubMed

    Barker, Megan S; Young, Breanne; Robinson, Gail A

    2017-05-01

    Spoken language production theories and lesion studies highlight several important prelinguistic conceptual preparation processes involved in the production of cohesive and coherent connected speech. Cohesion and coherence broadly connect sentences with preceding ideas and the overall topic. Broader cognitive mechanisms may mediate these processes. This study aims to investigate (1) whether stroke patients without aphasia exhibit impairments in cohesion and coherence in connected speech, and (2) the role of attention and executive functions in the production of connected speech. Eighteen stroke patients (8 right hemisphere stroke [RHS]; 6 left [LHS]) and 21 healthy controls completed two self-generated narrative tasks to elicit connected speech. A multi-level analysis of within and between-sentence processing ability was conducted. Cohesion and coherence impairments were found in the stroke group, particularly RHS patients, relative to controls. In the whole stroke group, better performance on the Hayling Test of executive function, which taps verbal initiation/suppression, was related to fewer propositional repetitions and global coherence errors. Better performance on attention tasks was related to fewer propositional repetitions, and decreased global coherence errors. In the RHS group, aspects of cohesive and coherent speech were associated with better performance on attention tasks. Better Hayling Test scores were related to more cohesive and coherent speech in RHS patients, and more coherent speech in LHS patients. Thus, we documented connected speech deficits in a heterogeneous stroke group without prominent aphasia. Our results suggest that broader cognitive processes may play a role in producing connected speech at the early conceptual preparation stage. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Phonological and Motor Errors in Individuals with Acquired Sound Production Impairment

    ERIC Educational Resources Information Center

    Buchwald, Adam; Miozzo, Michele

    2012-01-01

    Purpose: This study aimed to compare sound production errors arising due to phonological processing impairment with errors arising due to motor speech impairment. Method: Two speakers with similar clinical profiles who produced similar consonant cluster simplification errors were examined using a repetition task. We compared both overall accuracy…

  15. Effects of Error Experience When Learning to Simulate Hypernasality

    ERIC Educational Resources Information Center

    Wong, Andus W.-K.; Tse, Andy C.-Y.; Ma, Estella P.-M.; Whitehill, Tara L.; Masters, Rich S. W.

    2013-01-01

    Purpose: The purpose of this study was to evaluate the effects of error experience on the acquisition of hypernasal speech. Method: Twenty-eight healthy participants were asked to simulate hypernasality in either an "errorless learning" condition (in which the possibility for errors was limited) or an "errorful learning"…

  16. Cortical activity patterns predict robust speech discrimination ability in noise

    PubMed Central

    Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.

    2012-01-01

    The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331
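    The classifier described above assigns a single-trial spatiotemporal activity pattern to the known sound whose trial-averaged pattern it most resembles. A minimal nearest-template sketch with fabricated data (the shapes and the correlation-based similarity are assumptions; for rigor the test trial should be excluded from its own template average):

      import numpy as np

      rng = np.random.default_rng(0)
      n_sounds, n_trials, n_neurons, n_bins = 4, 20, 30, 50

      # Fabricated responses: a characteristic pattern per sound, plus noise.
      base = rng.normal(size=(n_sounds, n_neurons, n_bins))
      trials = base[:, None] + 0.8 * rng.normal(
          size=(n_sounds, n_trials, n_neurons, n_bins))
      templates = trials.mean(axis=1)      # average pattern per known sound

      def classify(trial):
          """Index of the template most correlated with the single trial."""
          sims = [np.corrcoef(trial.ravel(), tm.ravel())[0, 1] for tm in templates]
          return int(np.argmax(sims))

      correct = sum(classify(trials[s, k]) == s
                    for s in range(n_sounds) for k in range(n_trials))
      print(f"accuracy: {correct / (n_sounds * n_trials):.2f}")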

  17. Identification and Remediation of Phonological and Motor Errors in Acquired Sound Production Impairment

    PubMed Central

    Gagnon, Bernadine; Miozzo, Michele

    2017-01-01

    Purpose: This study aimed to test whether an approach to distinguishing errors arising in phonological processing from those arising in motor planning also predicts the extent to which repetition-based training can lead to improved production of difficult sound sequences. Method: Four individuals with acquired speech production impairment who produced consonant cluster errors involving deletion were examined using a repetition task. We compared the acoustic details of productions with deletion errors in target consonant clusters to singleton consonants. Changes in accuracy over the course of the study were also compared. Results: Two individuals produced deletion errors consistent with a phonological locus of the errors, and 2 individuals produced errors consistent with a motoric locus of the errors. The 2 individuals who made phonologically driven errors showed no change in performance on a repetition training task, whereas the 2 individuals with motoric errors improved in their production of both trained and untrained items. Conclusions: The results extend previous findings about a metric for identifying the source of sound production errors in individuals with both apraxia of speech and aphasia. In particular, this work may provide a tool for identifying predominant error types in individuals with complex deficits. PMID:28655044

  18. White Matter Integrity and Treatment-Based Change in Speech Performance in Minimally Verbal Children with Autism Spectrum Disorder.

    PubMed

    Chenausky, Karen; Kernbach, Julius; Norton, Andrea; Schlaug, Gottfried

    2017-01-01

    We investigated the relationship between imaging variables for two language/speech-motor tracts and speech fluency variables in 10 minimally verbal (MV) children with autism. Specifically, we tested whether measures of white matter integrity (fractional anisotropy [FA] of the arcuate fasciculus [AF] and frontal aslant tract [FAT]) were related to change in percent syllable-initial consonants correct, percent items responded to, and percent syllable-insertion errors (from best baseline to post 25 treatment sessions). Twenty-three MV children with autism spectrum disorder (ASD) received Auditory-Motor Mapping Training (AMMT), an intonation-based treatment to improve fluency in spoken output, and we report on seven who received a matched control treatment. Ten of the AMMT participants were able to undergo a magnetic resonance imaging study at baseline; their performance on baseline speech production measures is compared to that of the other two groups. No baseline differences were found between groups. A canonical correlation analysis (CCA) relating FA values for left- and right-hemisphere AF and FAT to speech production measures showed that FA of the left AF and right FAT were the largest contributors to the synthetic independent imaging-related variable. Change in percent syllable-initial consonants correct and percent syllable-insertion errors were the largest contributors to the synthetic dependent fluency-related variable. Regression analyses showed that FA values in left AF significantly predicted change in percent syllable-initial consonants correct, no FA variables significantly predicted change in percent items responded to, and FA of right FAT significantly predicted change in percent syllable-insertion errors. Results are consistent with previously identified roles for the AF in mediating bidirectional mapping between articulation and acoustics, and the FAT in its relationship to speech initiation and fluency. They further suggest a division of labor between the hemispheres, implicating the left hemisphere in accuracy of speech production and the right hemisphere in fluency in this population. Changes in response rate are interpreted as stemming from factors other than the integrity of these two fiber tracts. This study is the first to document the existence of a subgroup of MV children who experience increases in syllable-insertion errors as their speech develops in response to therapy.

  19. Speech pattern improvement following gingivectomy of excess palatal tissue.

    PubMed

    Holtzclaw, Dan; Toscano, Nicholas

    2008-10-01

    Speech disruption secondary to excessive gingival tissue has received scant attention in periodontal literature. Although a few articles have addressed the causes of this condition, documentation and scientific explanation of treatment outcomes are virtually non-existent. This case report describes speech pattern improvements secondary to periodontal surgery and provides a concise review of linguistic and phonetic literature pertinent to the case. A 21-year-old white female with a history of gingival abscesses secondary to excessive palatal tissue presented for treatment. Bilateral gingivectomies of palatal tissues were performed with inverse bevel incisions extending distally from teeth #5 and #12 to the maxillary tuberosities, and large wedges of epithelium/connective tissue were excised. Within the first month of the surgery, the patient noted "changes in the manner in which her tongue contacted the roof of her mouth" and "changes in her speech." Further anecdotal investigation revealed the patient's enunciation of sounds such as "s," "sh," and "k" was greatly improved following the gingivectomy procedure. Palatometric research clearly demonstrates that the tongue has intimate contact with the lateral aspects of the posterior palate during speech. Gingival excess in this and other palatal locations has the potential to alter linguopalatal contact patterns and disrupt normal speech patterns. Surgical correction of this condition via excisional procedures may improve linguopalatal contact patterns which, in turn, may lead to improved patient speech.

  20. Power Spectral Density Error Analysis of Spectral Subtraction Type of Speech Enhancement Methods

    NASA Astrophysics Data System (ADS)

    Händel, Peter

    2006-12-01

    A theoretical framework for analysis of speech enhancement algorithms is introduced for performance assessment of spectral subtraction type of methods. The quality of the enhanced speech is related to physical quantities of the speech and noise (such as stationarity time and spectral flatness), as well as to design variables of the noise suppressor. The derived theoretical results are compared with the outcome of subjective listening tests as well as successful design strategies, performed by independent research groups.
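    Spectral subtraction, the family of methods analyzed above, estimates the noise magnitude spectrum from noise-only frames and subtracts it from every frame of the noisy signal before resynthesis with the noisy phase. A minimal magnitude-domain sketch (frame sizes, the leading-noise assumption, and the spectral floor are illustrative choices, not values from the paper):

      import numpy as np

      def spectral_subtract(noisy, frame_len=512, hop=256,
                            noise_frames=10, floor=0.01):
          """Basic magnitude spectral subtraction with overlap-add resynthesis."""
          win = np.hanning(frame_len)
          starts = range(0, len(noisy) - frame_len, hop)
          spectra = [np.fft.rfft(noisy[s:s + frame_len] * win) for s in starts]
          # Noise estimate: average magnitude of the leading noise-only frames.
          noise_mag = np.mean(np.abs(spectra[:noise_frames]), axis=0)
          out = np.zeros(len(noisy))
          for s, spec in zip(starts, spectra):
              # Subtract the noise magnitude, keep a floor, reuse the noisy phase.
              mag = np.maximum(np.abs(spec) - noise_mag, floor * np.abs(spec))
              frame = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), frame_len)
              out[s:s + frame_len] += frame * win      # overlap-add
          return out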

  1. Relationship Among Signal Fidelity, Hearing Loss, and Working Memory for Digital Noise Suppression.

    PubMed

    Arehart, Kathryn; Souza, Pamela; Kates, James; Lunner, Thomas; Pedersen, Michael Syskind

    2015-01-01

    This study considered speech modified by additive babble combined with noise-suppression processing. The purpose was to determine the relative importance of the signal modifications, individual peripheral hearing loss, and individual cognitive capacity on speech intelligibility and speech quality. The participant group consisted of 31 individuals with moderate high-frequency hearing loss ranging in age from 51 to 89 years (mean = 69.6 years). Speech intelligibility and speech quality were measured using low-context sentences presented in babble at several signal-to-noise ratios. Speech stimuli were processed with a binary mask noise-suppression strategy with systematic manipulations of two parameters (error rate and attenuation values). The cumulative effects of signal modification produced by babble and signal processing were quantified using an envelope-distortion metric. Working memory capacity was assessed with a reading span test. Analysis of variance was used to determine the effects of signal processing parameters on perceptual scores. Hierarchical linear modeling was used to determine the role of degree of hearing loss and working memory capacity in individual listener response to the processed noisy speech. The model also considered improvements in envelope fidelity caused by the binary mask and the degradations to envelope caused by error and noise. The participants showed significant benefits in terms of intelligibility scores and quality ratings for noisy speech processed by the ideal binary mask noise-suppression strategy. This benefit was observed across a range of signal-to-noise ratios and persisted when up to a 30% error rate was introduced into the processing. Average intelligibility scores and average quality ratings were well predicted by an objective metric of envelope fidelity. Degree of hearing loss and working memory capacity were significant factors in explaining individual listener's intelligibility scores for binary mask processing applied to speech in babble. Degree of hearing loss and working memory capacity did not predict listeners' quality ratings. The results indicate that envelope fidelity is a primary factor in determining the combined effects of noise and binary mask processing for intelligibility and quality of speech presented in babble noise. Degree of hearing loss and working memory capacity are significant factors in explaining variability in listeners' speech intelligibility scores but not in quality ratings.
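    The binary mask strategy evaluated above keeps time-frequency cells where the target dominates the babble and attenuates the rest; the study's two manipulated parameters, error rate and attenuation, can be simulated directly on the mask. A hedged sketch (the SNR criterion and all default values are assumptions):

      import numpy as np

      def binary_mask(speech_tf, babble_tf, criterion_db=0.0,
                      attenuation_db=-20.0, error_rate=0.0, rng=None):
          """Apply an ideal binary mask to the mixture, with optional mask errors."""
          rng = rng or np.random.default_rng()
          local_snr = 20 * np.log10(np.abs(speech_tf)
                                    / np.maximum(np.abs(babble_tf), 1e-12))
          mask = local_snr > criterion_db              # True = speech-dominated cell
          flips = rng.random(mask.shape) < error_rate  # simulate the error-rate parameter
          mask = np.where(flips, ~mask, mask)
          gain = np.where(mask, 1.0, 10 ** (attenuation_db / 20))
          return (speech_tf + babble_tf) * gain        # processed noisy mixture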

  2. The Motor Core of Speech: A Comparison of Serial Organization Patterns in Infants and Languages.

    ERIC Educational Resources Information Center

    MacNeilage, Peter F.; Davis, Barbara L.; Kinney, Ashlynn; Matyear, Christine L.

    2000-01-01

    Presents evidence for four major design features of serial organization of speech arising from comparison of babbling and early speech with patterns in ten languages. Maintains that no explanation for the design features is available from Universal Grammar; except for intercyclical consonant repetition development, perceptual-motor learning seems…

  3. Impaired extraction of speech rhythm from temporal modulation patterns in speech in developmental dyslexia

    PubMed Central

    Leong, Victoria; Goswami, Usha

    2014-01-01

    Dyslexia is associated with impaired neural representation of the sound structure of words (phonology). The “phonological deficit” in dyslexia may arise in part from impaired speech rhythm perception, thought to depend on neural oscillatory phase-locking to slow amplitude modulation (AM) patterns in the speech envelope. Speech contains AM patterns at multiple temporal rates, and these different AM rates are associated with phonological units of different grain sizes, e.g., related to stress, syllables or phonemes. Here, we assess the ability of adults with dyslexia to use speech AMs to identify rhythm patterns (RPs). We study 3 important temporal rates: “Stress” (~2 Hz), “Syllable” (~4 Hz) and “Sub-beat” (reduced syllables, ~14 Hz). 21 dyslexics and 21 controls listened to nursery rhyme sentences that had been tone-vocoded using either single AM rates from the speech envelope (Stress only, Syllable only, Sub-beat only) or pairs of AM rates (Stress + Syllable, Syllable + Sub-beat). They were asked to use the acoustic rhythm of the stimulus to identify the original nursery rhyme sentence. The data showed that dyslexics were significantly poorer at detecting rhythm compared to controls when they had to utilize multi-rate temporal information from pairs of AMs (Stress + Syllable or Syllable + Sub-beat). These data suggest that dyslexia is associated with a reduced ability to utilize AMs <20 Hz for rhythm recognition. This perceptual deficit in utilizing AM patterns in speech could be underpinned by less efficient neuronal phase alignment and cross-frequency neuronal oscillatory synchronization in dyslexia. Dyslexics' perceptual difficulties in capturing the full spectro-temporal complexity of speech over multiple timescales could contribute to the development of impaired phonological representations for words, the cognitive hallmark of dyslexia across languages. PMID:24605099
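    The stress-, syllable- and sub-beat-rate AMs described above can be recovered by band-limiting the speech amplitude envelope around each rate. A rough sketch (band edges loosely follow the rates quoted above; the Hilbert-envelope front end, downsampling rate and filter order are assumptions):

      import numpy as np
      from scipy.signal import butter, filtfilt, hilbert, resample_poly

      def am_bands(signal, fs, env_fs=100):
          """Band-limited amplitude-modulation patterns of a speech waveform."""
          envelope = np.abs(hilbert(signal))              # broadband envelope
          envelope = resample_poly(envelope, env_fs, fs)  # downsample to 100 Hz
          bands = {"stress": (1, 3), "syllable": (3, 7), "sub-beat": (10, 18)}
          out = {}
          for name, (lo, hi) in bands.items():
              b, a = butter(2, [lo / (env_fs / 2), hi / (env_fs / 2)], btype="band")
              out[name] = filtfilt(b, a, envelope)        # AM at this temporal rate
          return out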

  4. [The endpoint detection of cough signal in continuous speech].

    PubMed

    Yang, Guoqing; Mo, Hongqiang; Li, Wen; Lian, Lianfang; Zheng, Zeguang

    2010-06-01

    Endpoint detection of cough signals in continuous speech was investigated in order to improve the efficiency and accuracy of manual and computer-based automatic cough recognition. First, the short-time zero crossing rate (ZCR) was used to identify suspicious coughs, and a short-time energy threshold was derived from the acoustic characteristics of coughing. Then, short-time energy was combined with short-time ZCR to implement endpoint detection of coughs in continuous speech. To evaluate the method, the true number of coughs in each recording was first identified by two experienced doctors using a graphical user interface (GUI); the recordings were then analyzed by an automatic endpoint detection program under Matlab 7.0. Comparison of the two results showed an undetected-cough error rate of 2.18%, with 98.13% of noise, silence and speech removed. The approach to setting the short-time energy threshold is robust, and the endpoint detection program removes most speech and noise while maintaining a low error rate.
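    Both features the method relies on are simple frame-level statistics: short-time energy and short-time zero crossing rate. A simplified sketch of the detection idea (frame sizes and thresholds are assumptions, not the paper's values):

      import numpy as np

      def short_time_features(x, frame_len=400, hop=200):  # 25/12.5 ms at 16 kHz
          """Per-frame short-time energy and zero crossing rate."""
          starts = range(0, len(x) - frame_len, hop)
          energy = np.array([np.sum(x[s:s + frame_len] ** 2) for s in starts])
          zcr = np.array([np.mean(np.abs(np.diff(np.sign(x[s:s + frame_len]))) > 0)
                          for s in starts])
          return energy, zcr

      def suspicious_cough_frames(x, energy_thresh, zcr_thresh):
          """Flag high-energy frames whose ZCR is cough-like (simplified stand-in)."""
          energy, zcr = short_time_features(x)
          return (energy > energy_thresh) & (zcr > zcr_thresh)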

  5. Relative Salience of Speech Rhythm and Speech Rate on Perceived Foreign Accent in a Second Language.

    PubMed

    Polyanskaya, Leona; Ordin, Mikhail; Busa, Maria Grazia

    2017-09-01

    We investigated the independent contribution of speech rate and speech rhythm to perceived foreign accent. To address this issue we used a resynthesis technique that allows neutralizing segmental and tonal idiosyncrasies between identical sentences produced by French learners of English at different proficiency levels and maintaining the idiosyncrasies pertaining to prosodic timing patterns. We created stimuli that (1) preserved the idiosyncrasies in speech rhythm while controlling for the differences in speech rate between the utterances; (2) preserved the idiosyncrasies in speech rate while controlling for the differences in speech rhythm between the utterances; and (3) preserved the idiosyncrasies both in speech rate and speech rhythm. All the stimuli were created in intoned (with imposed intonational contour) and flat (with monotonized, constant F0) conditions. The original and the resynthesized sentences were rated by native speakers of English for degree of foreign accent. We found that both speech rate and speech rhythm influence the degree of perceived foreign accent, but the effect of speech rhythm is larger than that of speech rate. We also found that intonation enhances the perception of fine differences in rhythmic patterns but reduces the perceptual salience of fine differences in speech rate.

  6. Differential effects of speech prostheses in glossectomized patients.

    PubMed

    Leonard, R J; Gillis, R

    1990-12-01

    Five patients representing different categories of glossal resection were fitted with prostheses specifically designed to improve speech. Speech recordings made for subjects with and without their prostheses were subjected to a variety of analyses. Prosthetic influence on listeners' judgments of severity level/intelligibility, number of consonants in error, and on the acoustic measure F2 range of vowels was evaluated. Findings indicated that all subjects demonstrated improvement on the speech measures. However, the extent of improvement on each measure varied across speakers and resection categories. Implications of the findings for prosthetic speech rehabilitation in this population are discussed.

  7. Objective speech quality evaluation of real-time speech coders

    NASA Astrophysics Data System (ADS)

    Viswanathan, V. R.; Russell, W. H.; Huggins, A. W. F.

    1984-02-01

    This report describes the work performed in two areas: subjective testing of a real-time 16 kbit/s adaptive predictive coder (APC) and objective speech quality evaluation of real-time coders. The speech intelligibility of the APC coder was tested using the Diagnostic Rhyme Test (DRT), and the speech quality was tested using the Diagnostic Acceptability Measure (DAM) test, under eight operating conditions involving channel error, acoustic background noise, and tandem link with two other coders. The test results showed that the DRT and DAM scores of the APC coder equalled or exceeded the corresponding test scores of the 32 kbit/s CVSD coder. In the area of objective speech quality evaluation, the report describes the development, testing, and validation of a procedure for automatically computing several objective speech quality measures, given only the tape-recordings of the input speech and the corresponding output speech of a real-time speech coder.
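
    The report's specific objective measures are not listed here; segmental SNR is one widely used objective speech-quality measure of this kind. A minimal sketch, assuming the coder's input and output recordings are already time-aligned:

```python
# Sketch: segmental SNR between a reference (coder input) and a
# degraded signal (coder output), averaged over short frames.
import numpy as np

def segmental_snr(ref, deg, fs, frame_ms=20, floor=(-10, 35)):
    """Frame-averaged SNR; assumes the signals are time-aligned."""
    n = int(fs * frame_ms / 1000)
    snrs = []
    for i in range(0, min(len(ref), len(deg)) - n, n):
        r = ref[i:i + n]
        e = r - deg[i:i + n]
        snr = 10 * np.log10((np.sum(r ** 2) + 1e-12) / (np.sum(e ** 2) + 1e-12))
        snrs.append(np.clip(snr, *floor))    # clamp, per common practice
    return float(np.mean(snrs))

fs = 8000
ref = np.random.randn(fs)
deg = ref + 0.05 * np.random.randn(fs)       # mildly distorted copy
print(round(segmental_snr(ref, deg, fs), 1), "dB")
```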

  8. The effects of gated speech on the fluency of speakers who stutter

    PubMed Central

    Howell, Peter

    2007-01-01

    It is known that the speech of people who stutter improves when the speaker’s own vocalization is changed while the participant is speaking. One explanation of these effects is the disruptive rhythm hypothesis (DRH). DRH maintains that the manipulated sound only needs to disturb timing to affect speech control. The experiment investigated whether speech that was gated on and off (interrupted) affected the speech control of speakers who stutter. Eight children who stutter read a passage when they heard their voice normally and when the speech was gated. Fluency was enhanced (fewer errors were made and time to read a set passage was reduced) when speech was interrupted in this way. The results support the DRH. PMID:17726328

  9. The effects of gated speech on the fluency of speakers who stutter.

    PubMed

    Howell, Peter

    2007-01-01

    It is known that the speech of people who stutter improves when the speaker's own vocalization is changed while the participant is speaking. One explanation of these effects is the disruptive rhythm hypothesis (DRH). The DRH maintains that the manipulated sound only needs to disturb timing to affect speech control. The experiment investigated whether speech that was gated on and off (interrupted) affected the speech control of speakers who stutter. Eight children who stutter read a passage when they heard their voice normally and when the speech was gated. Fluency was enhanced (fewer errors were made and time to read a set passage was reduced) when speech was interrupted in this way. The results support the DRH. Copyright 2007 S. Karger AG, Basel.

  10. Does Working Memory Enhance or Interfere with Speech Fluency in Adults Who Do and Do Not Stutter? Evidence from a Dual-Task Paradigm

    ERIC Educational Resources Information Center

    Eichorn, Naomi; Marton, Klara; Schwartz, Richard G.; Melara, Robert D.; Pirutinsky, Steven

    2016-01-01

    Purpose: The present study examined whether engaging working memory in a secondary task benefits speech fluency. Effects of dual-task conditions on speech fluency, rate, and errors were examined with respect to predictions derived from three related theoretical accounts of disfluencies. Method: Nineteen adults who stutter and twenty adults who do…

  11. Variation in the pattern of omissions and substitutions of grammatical morphemes in the spontaneous speech of so-called agrammatic patients.

    PubMed

    Miceli, G; Silveri, M C; Romani, C; Caramazza, A

    1989-04-01

    We describe the patterns of omissions (and substitutions) of freestanding grammatical morphemes and the patterns of substitutions of bound grammatical morphemes in 20 so-called agrammatic patients. Extreme variation was observed in the patterns of omissions and substitutions of grammatical morphemes, both in terms of the distribution of errors for different grammatical morphemes and in terms of the distribution of omissions versus substitutions. Results are discussed in the context of current debates concerning the possibility of a theoretically motivated distinction between the clinical categories of agrammatism and paragrammatism and, more generally, concerning the theoretical usefulness of any clinical category. The conclusion is reached that the observed heterogeneity in the production of grammatical morphemes among putatively agrammatic patients renders the clinical category of agrammatism, and by extension all other clinical categories from the classical classification scheme (e.g., Broca's aphasia, Wernicke's aphasia, and so forth) to more recent classificatory attempts (e.g., surface dyslexia, deep dysgraphia, and so forth), theoretically useless.

  12. Gesture helps learners learn, but not merely by guiding their visual attention.

    PubMed

    Wakefield, Elizabeth; Novack, Miriam A; Congdon, Eliza L; Franconeri, Steven; Goldin-Meadow, Susan

    2018-04-16

    Teaching a new concept through gestures (hand movements that accompany speech) facilitates learning above and beyond instruction through speech alone (e.g., Singer & Goldin-Meadow). However, the mechanisms underlying this phenomenon are still under investigation. Here, we use eye tracking to explore one often proposed mechanism: gesture's ability to direct visual attention. Behaviorally, we replicate previous findings: Children perform significantly better on a posttest after learning through Speech+Gesture instruction than through Speech Alone instruction. Using eye tracking measures, we show that children who watch a math lesson with gesture do allocate their visual attention differently from children who watch a math lesson without gesture: they look more to the problem being explained, less to the instructor, and are more likely to synchronize their visual attention with information presented in the instructor's speech (i.e., follow along with speech) than children who watch the no-gesture lesson. The striking finding is that, even though these looking patterns positively predict learning outcomes, the patterns do not mediate the effects of training condition (Speech Alone vs. Speech+Gesture) on posttest success. We find instead a complex relation between gesture and visual attention in which gesture moderates the impact of visual looking patterns on learning: following along with speech predicts learning for children in the Speech+Gesture condition, but not for children in the Speech Alone condition. Gesture's beneficial effects on learning thus come not merely from its ability to guide visual attention, but also from its ability to synchronize with speech and affect what learners glean from that speech. © 2018 John Wiley & Sons Ltd.

  13. Discrimination of static and dynamic spectral patterns by children and young adults in relationship to speech perception in noise.

    PubMed

    Rayes, Hanin; Sheft, Stanley; Shafiro, Valeriy

    2014-01-01

    Past work has shown a relationship between the ability to discriminate spectral patterns and measures of speech intelligibility. The purpose of this study was to investigate the ability of both children and young adults to discriminate static and dynamic spectral patterns, comparing performance between the two groups and evaluating within-group results in terms of their relationship to speech-in-noise perception. Data were collected from normal-hearing children (age range: 5.4 - 12.8 yrs) and young adults (mean age: 22.8 yrs) on two spectral discrimination tasks and speech-in-noise perception. The first discrimination task, involving static spectral profiles, measured the ability to detect a change in the phase of a low-density sinusoidal spectral ripple of wideband noise. Using dynamic spectral patterns, the second task determined the signal-to-noise ratio needed to discriminate the temporal pattern of frequency fluctuation imposed by stochastic low-rate frequency modulation (FM). Children performed significantly poorer than young adults on both discrimination tasks. For children, a significant correlation between speech-in-noise perception and spectral-pattern discrimination was obtained only with the dynamic patterns of the FM condition, with partial correlation suggesting that factors related to the children's age mediated the relationship.
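
    The static task above uses spectrally rippled wideband noise in which only the ripple phase distinguishes the standard from the target. A sketch of how such a stimulus can be generated; the ripple density, depth, and reference frequency are illustrative assumptions:

```python
# Sketch: wideband noise whose log-spectrum is sinusoidally rippled;
# the discrimination task compares ripples differing only in phase.
import numpy as np

def ripple_noise(fs=22050, dur=0.5, density=1.0, phase=0.0, depth_db=20):
    n = int(fs * dur)
    spec = np.fft.rfft(np.random.randn(n))
    f = np.fft.rfftfreq(n, 1 / fs)
    octaves = np.log2(np.maximum(f, 1.0) / 100.0)   # ripple on a log-f axis
    gain_db = (depth_db / 2) * np.sin(2 * np.pi * density * octaves + phase)
    return np.fft.irfft(spec * 10 ** (gain_db / 20), n)

standard = ripple_noise(phase=0.0)
target = ripple_noise(phase=np.pi)    # phase-shifted ripple to detect
print(standard.shape, target.shape)
```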

  14. Plasticity in the adult human auditory brainstem following short-term linguistic training

    PubMed Central

    Song, Judy H.; Skoe, Erika; Wong, Patrick C. M.; Kraus, Nina

    2009-01-01

    Peripheral and central structures along the auditory pathway contribute to speech processing and learning. However, because speech requires the use of functionally and acoustically complex sounds, which imposes high sensory and cognitive demands, long-term exposure to and experience with these sounds is often attributed to the neocortex, with little emphasis placed on subcortical structures. The present study examines changes in the auditory brainstem, specifically the frequency following response (FFR), as native English-speaking adults learn to incorporate foreign speech sounds (lexical pitch patterns) in word identification. The FFR presumably originates from the auditory midbrain, and can be elicited pre-attentively. We measured FFRs to the trained pitch patterns before and after training. Measures of pitch-tracking were then derived from the FFR signals. We found increased accuracy in pitch-tracking after training, including a decrease in the number of pitch-tracking errors and a refinement in the energy devoted to encoding pitch. Most interestingly, this change in pitch-tracking accuracy occurred only for the most acoustically complex pitch contour (dipping contour), which is also the least familiar to our English-speaking subjects. These results not only demonstrate the contribution of the brainstem in language learning and its plasticity in adulthood, but they also demonstrate the specificity of this contribution (i.e., changes in encoding occur only for specific, least familiar stimuli, not all stimuli). Our findings complement existing data showing cortical changes after second language learning, and are consistent with models suggesting that brainstem changes resulting from perceptual learning are most apparent when acuity in encoding is most needed. PMID:18370594
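
    One common way to derive pitch-tracking measures like those above is to estimate F0 frame by frame from the response via autocorrelation and count frames that deviate from the stimulus contour. A minimal sketch on a synthetic rising contour; the frame sizes and the 10 Hz error criterion are assumptions, not the study's parameters:

```python
# Sketch: autocorrelation F0 estimation and a simple tracking-error count.
import numpy as np

def f0_autocorr(frame, fs, fmin=80, fmax=400):
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + np.argmax(ac[lo:hi])
    return fs / lag

fs = 2000                                    # FFR-band sampling rate
t = np.arange(int(fs * 0.25)) / fs
ref_f0 = 100 + 40 * t / t[-1]                # rising contour, 100 -> 140 Hz
signal = np.sin(2 * np.pi * np.cumsum(ref_f0) / fs)

frame_len = int(0.04 * fs)                   # 40 ms analysis frames
errors = 0
for i in range(0, len(signal) - frame_len, frame_len):
    est = f0_autocorr(signal[i:i + frame_len], fs)
    if abs(est - ref_f0[i + frame_len // 2]) > 10:   # >10 Hz = tracking error
        errors += 1
print("pitch-tracking errors:", errors)
```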

  15. Automatic Analysis of Pronunciations for Children with Speech Sound Disorders.

    PubMed

    Dudy, Shiran; Bedrick, Steven; Asgari, Meysam; Kain, Alexander

    2018-07-01

    Computer-Assisted Pronunciation Training (CAPT) systems aim to help a child learn the correct pronunciations of words. However, while there are many online commercial CAPT apps, there is no consensus among Speech Language Therapists (SLPs) or non-professionals about which CAPT systems, if any, work well. The prevailing assumption is that practicing with such programs is less reliable and thus does not provide the feedback necessary to allow children to improve their performance. The most common method for assessing pronunciation performance is the Goodness of Pronunciation (GOP) technique. Our paper proposes two new GOP techniques. We have found that pronunciation models that use explicit knowledge about error pronunciation patterns can lead to more accurate classification of whether a phoneme was correctly pronounced or not. We evaluate the proposed pronunciation assessment methods against a baseline state-of-the-art GOP approach, and show that the proposed techniques lead to classification performance that is more similar to that of a human expert.
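
    For reference, the baseline GOP score mentioned above is conventionally a duration-normalized log-likelihood ratio between the canonical phone and the best competing phone over the aligned segment. A minimal sketch, assuming per-frame phone log-likelihoods from some acoustic model are already available:

```python
# Sketch: Goodness of Pronunciation (GOP) as a normalized
# log-likelihood ratio over one aligned phone segment.
import numpy as np

def gop(frame_loglik, target_phone):
    """frame_loglik: (n_frames, n_phones) log-likelihoods for one
    aligned segment; target_phone: index of the canonical phone."""
    ll_target = frame_loglik[:, target_phone].sum()
    ll_best = frame_loglik.max(axis=1).sum()   # free phone-loop proxy
    return (ll_target - ll_best) / len(frame_loglik)

seg = np.log(np.random.dirichlet(np.ones(40), size=25))  # 25 frames, 40 phones
score = gop(seg, target_phone=7)
# near 0 = likely well pronounced; large negative = likely mispronounced
print("GOP:", round(float(score), 3))
```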

  16. Making sense of progressive non-fluent aphasia: an analysis of conversational speech

    PubMed Central

    Woollams, Anna M.; Hodges, John R.; Patterson, Karalyn

    2009-01-01

    The speech of patients with progressive non-fluent aphasia (PNFA) has often been described clinically, but these descriptions lack support from quantitative data. The clinical classification of the progressive aphasic syndromes is also debated. This study selected 15 patients with progressive aphasia on broad criteria, excluding only those with clear semantic dementia. It aimed to provide a detailed quantitative description of their conversational speech, along with cognitive testing and visual rating of structural brain imaging, and to examine which, if any, features were consistently present throughout the group, as well as looking for sub-syndromic associations between these features. A consistent increase in grammatical and speech sound errors and a simplification of spoken syntax relative to age-matched controls were observed, though telegraphic speech was rare; slow speech was common but not universal. Almost all patients showed impairments in picture naming, syntactic comprehension and executive function. The degree to which speech was affected was independent of the severity of the other cognitive deficits. A partial dissociation was also observed between slow speech with simplified grammar on the one hand, and grammatical and speech sound errors on the other. Overlap between these sets of impairments was, however, the rule rather than the exception, producing continuous variation within a single consistent syndrome. The distribution of atrophy was remarkably variable, with frontal, temporal and medial temporal areas affected, either symmetrically or asymmetrically. The study suggests that PNFA is a coherent, well-defined syndrome and that varieties such as logopaenic progressive aphasia and progressive apraxia of speech may be seen as points in a space of continuous variation within progressive non-fluent aphasia. PMID:19696033

  17. Optimal pattern synthesis for speech recognition based on principal component analysis

    NASA Astrophysics Data System (ADS)

    Korsun, O. N.; Poliyev, A. V.

    2018-02-01

    The algorithm for building an optimal pattern for the purpose of automatic speech recognition, which increases the probability of correct recognition, is developed and presented in this work. Formation of the optimal pattern is based on the decomposition of an initial pattern into principal components, which makes it possible to reduce the dimensionality of the multi-parameter optimization problem. At the next step the training samples are introduced and the optimal estimates for the principal-component decomposition coefficients are obtained by a numeric parameter optimization algorithm. Finally, we consider experimental results that show the improvement in speech recognition introduced by the proposed optimization algorithm.
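
    A schematic of the reduced-dimension optimization described above: learn principal components from training patterns, then optimize only the leading decomposition coefficients against a recognition objective. The quadratic objective here is a stand-in for a real recognizer's score; all names and sizes are illustrative:

```python
# Sketch: optimize a pattern in a PCA-reduced coefficient space.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
train = rng.standard_normal((200, 64))          # training patterns
mean = train.mean(axis=0)
_, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
basis = Vt[:8]                                  # keep 8 principal components

init_pattern = rng.standard_normal(64)
target = rng.standard_normal(64)                # stand-in "ideal" pattern

def neg_score(coeffs):
    pattern = mean + coeffs @ basis             # reconstruct from PCs
    return np.sum((pattern - target) ** 2)      # recognizer-score proxy

c0 = basis @ (init_pattern - mean)              # project initial pattern
res = minimize(neg_score, c0, method="Nelder-Mead")
optimal_pattern = mean + res.x @ basis
print("objective:", round(float(res.fun), 2))
```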

  18. Speech entrainment compensates for Broca's area damage.

    PubMed

    Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

    2015-08-01

    Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response to SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to SE. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during SE versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed that damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of SE to improve speech production and may help select patients for SE treatment. Copyright © 2015 Elsevier Ltd. All rights reserved.
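
    In its simplest form, the VLSM step above is a mass-univariate test: at each voxel, patients with and without a lesion there are compared on the behavioral measure. A minimal sketch on synthetic data (no permutation testing or multiple-comparison correction, which a real analysis would need):

```python
# Sketch: voxel-wise lesion-symptom mapping via per-voxel t-tests.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)
n_patients, n_voxels = 44, 500
lesion = rng.random((n_patients, n_voxels)) < 0.3   # binary lesion maps
score = rng.standard_normal(n_patients)             # fluency improvement

t_map = np.full(n_voxels, np.nan)
for v in range(n_voxels):
    hit, spared = score[lesion[:, v]], score[~lesion[:, v]]
    if len(hit) >= 5 and len(spared) >= 5:          # minimum group size
        t_map[v] = ttest_ind(hit, spared).statistic
print("most negative voxel t:", round(float(np.nanmin(t_map)), 2))
```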

  19. Analysis of Factors Affecting System Performance in the ASpIRE Challenge

    DTIC Science & Technology

    2015-12-13

    performance in the ASpIRE (Automatic Speech recognition In Reverberant Environments) challenge. In particular, overall word error rate (WER) of the solver systems is analyzed as a function of room, distance between talker and microphone, and microphone type. We also analyze speech activity detection… This analysis will inform the design of future challenges and provide insight into the efficacy of current solutions addressing noisy reverberant speech…

  20. Psychophysics of Complex Auditory and Speech Stimuli

    DTIC Science & Technology

    1993-10-31

    unexpected, and does not seem to have a direct counterpart in the extensive research on pitch perception. Experiment 2 was designed to quantify our… project is to use different procedures to provide converging evidence on the nature of perceptual spaces for speech categories. Completed research… prior speech research on classification procedures may have led to errors. Thus, the opposite (falling F2 & F3) transitions lead to somewhat ambiguous…

  1. Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

    NASA Astrophysics Data System (ADS)

    Middag, Catherine; Martens, Jean-Pierre; Van Nuffelen, Gwen; De Bodt, Marc

    2009-12-01

    It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort in the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.

  2. Chest Wall Motion during Speech Production in Patients with Advanced Ankylosing Spondylitis

    ERIC Educational Resources Information Center

    Kalliakosta, Georgia; Mandros, Charalampos; Tzelepis, George E.

    2007-01-01

    Purpose: To test the hypothesis that ankylosing spondylitis (AS) alters the pattern of chest wall motion during speech production. Method: The pattern of chest wall motion during speech was measured with respiratory inductive plethysmography in 6 participants with advanced AS (5 men, 1 woman, age 45 plus or minus 8 years, Schober test 1.45 plus or…

  3. Effects of fixed labial orthodontic appliances on speech sound production.

    PubMed

    Paley, Jonathan S; Cisneros, George J; Nicolay, Olivier F; LeBlanc, Etoile M

    2016-05-01

    To explore the impact of fixed labial orthodontic appliances on speech sound production. Speech evaluations were performed on 23 patients with fixed labial appliances. Evaluations were performed immediately prior to appliance insertion, immediately following insertion, and 1 and 2 months post insertion. Baseline dental/skeletal variables were correlated with the ability to accommodate the presence of the appliances. Appliance effects were variable: 44% of the subjects were unaffected, 39% were temporarily affected but adapted within 2 months, and 17% of patients showed persistent sound errors at 2 months. Resolution of acquired sound errors was noted by 8 months post-appliance removal. Maladaptation to appliances was correlated with severity of malocclusion as determined by the Grainger's Treatment Priority Index. Sibilant sounds, most notably /s/, were affected most often. (1) Insertion of fixed labial appliances has an effect on speech sound production. (2) Sibilant and stop sounds are affected, with /s/ being affected most often. (3) Accommodation to fixed appliances depends on the severity of malocclusion.

  4. Voice Onset Time in Consonant Cluster Errors: Can Phonetic Accommodation Differentiate Cognitive from Motor Errors?

    ERIC Educational Resources Information Center

    Pouplier, Marianne; Marin, Stefania; Waltl, Susanne

    2014-01-01

    Purpose: Phonetic accommodation in speech errors has traditionally been used to identify the processing level at which an error has occurred. Recent studies have challenged the view that noncanonical productions may solely be due to phonetic, not phonological, processing irregularities, as previously assumed. The authors of the present study…

  5. [Speech fluency developmental profile in Brazilian Portuguese speakers].

    PubMed

    Martins, Vanessa de Oliveira; Andrade, Claudia Regina Furquim de

    2008-01-01

    Speech fluency varies from one individual to the next, fluent or stuttering, depending on several factors. Studies investigating the influence of age on fluency patterns exist; however, these differences were examined in isolated age groups, and no studies of fluency variation across the life span were found. The aim was to describe the speech fluency developmental profile. Speech samples of 594 fluent participants of both genders, with ages between 2:0 and 99:11 years, all speakers of Brazilian Portuguese, were analyzed. Participants were grouped as follows: preschool children, school-age children, early adolescents, late adolescents, adults, and elderly adults. Speech samples were analyzed according to the Speech Fluency Profile variables and compared regarding typology of speech disruptions (typical and less typical), speech rate (words and syllables per minute), and frequency of speech disruptions (percentage of speech discontinuity). Although isolated variations were identified, overall there was no significant difference between the age groups for the speech disruption indexes (typical and less typical speech disruptions and percentage of speech discontinuity). Significant differences were observed between the groups in speech rate. The development of the neurolinguistic system for speech fluency, in terms of speech disruptions, seems to stabilize during the first years of life, presenting no alterations across the life span. Speech rate indexes vary across age groups, indicating patterns of acquisition, development, stabilization, and degeneration.

  6. A multimodal spectral approach to characterize rhythm in natural speech.

    PubMed

    Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta

    2016-01-01

    Human utterances demonstrate temporal patterning, also referred to as rhythm. While simple oromotor behaviors (e.g., chewing) feature a salient periodical structure, conversational speech displays a time-varying quasi-rhythmic pattern. Quantification of periodicity in speech is challenging. Unimodal spectral approaches have highlighted rhythmic aspects of speech. However, speech is a complex multimodal phenomenon that arises from the interplay of articulatory, respiratory, and vocal systems. The present study addressed the question of whether a multimodal spectral approach, in the form of coherence analysis between electromyographic (EMG) and acoustic signals, would allow one to characterize rhythm in natural speech more efficiently than a unimodal analysis. The main experimental task consisted of speech production at three speaking rates; a simple oromotor task served as control. The EMG-acoustic coherence emerged as a sensitive means of tracking speech rhythm, whereas spectral analysis of either EMG or acoustic amplitude envelope alone was less informative. Coherence metrics seem to distinguish and highlight rhythmic structure in natural speech.
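
    The multimodal measure described above is, in essence, magnitude-squared coherence between an EMG envelope and the acoustic amplitude envelope. A minimal sketch using scipy's Welch-based estimator on synthetic envelope signals that share a 4 Hz rhythm; the sampling rate and window length are illustrative:

```python
# Sketch: EMG-acoustic coherence peaking at a shared speech rhythm rate.
import numpy as np
from scipy.signal import coherence

fs = 100                                    # envelope sampling rate
t = np.arange(fs * 60) / fs
drive = np.sin(2 * np.pi * 4 * t)           # shared 4 Hz speech rhythm
emg_env = drive + 0.8 * np.random.randn(len(t))
acoustic_env = drive + 0.8 * np.random.randn(len(t))

f, cxy = coherence(emg_env, acoustic_env, fs=fs, nperseg=512)
band = (f > 1) & (f < 10)
print("peak coherence at", round(float(f[band][np.argmax(cxy[band])]), 2), "Hz")
```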

  7. Systematic Studies of Modified Vocalization: The Effect of Speech Rate on Speech Production Measures during Metronome-Paced Speech in Persons Who Stutter

    ERIC Educational Resources Information Center

    Davidow, Jason H.

    2014-01-01

    Background: Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control…

  8. The use of fundamental frequency for lexical segmentation in listeners with cochlear implants.

    PubMed

    Spitzer, Stephanie; Liss, Julie; Spahr, Tony; Dorman, Michael; Lansford, Kaitlin

    2009-06-01

    Fundamental frequency (F0) variation is one of a number of acoustic cues normal hearing listeners use for guiding lexical segmentation of degraded speech. This study examined whether F0 contour facilitates lexical segmentation by listeners fitted with cochlear implants (CIs). Lexical boundary error patterns elicited under unaltered and flattened F0 conditions were compared across three groups: listeners with conventional CI, listeners with CI and preserved low-frequency acoustic hearing, and normal hearing listeners subjected to CI simulations. Results indicate that all groups attended to syllabic stress cues to guide lexical segmentation, and that F0 contours facilitated performance for listeners with low-frequency hearing.

  9. Implementation fidelity of a computer-assisted intervention for children with speech sound disorders.

    PubMed

    McCormack, Jane; Baker, Elise; Masso, Sarah; Crowe, Kathryn; McLeod, Sharynne; Wren, Yvonne; Roulstone, Sue

    2017-06-01

    Implementation fidelity refers to the degree to which an intervention or programme adheres to its original design. This paper examines implementation fidelity in the Sound Start Study, a clustered randomised controlled trial of computer-assisted support for children with speech sound disorders (SSD). Sixty-three children with SSD in 19 early childhood centres received computer-assisted support (Phoneme Factory Sound Sorter [PFSS] - Australian version). Educators facilitated the delivery of PFSS targeting phonological error patterns identified by a speech-language pathologist. Implementation data were gathered via (1) the computer software, which recorded when and how much intervention was completed over 9 weeks; (2) educators' records of practice sessions; and (3) scoring of fidelity (intervention procedure, competence and quality of delivery) from videos of intervention sessions. Less than one-third of children received the prescribed number of days of intervention, while approximately one-half participated in the prescribed number of intervention plays. Computer data differed from educators' data for total number of days and plays in which children participated; the degree of match was lower as data became more specific. Fidelity to intervention procedures, competency and quality of delivery was high. Implementation fidelity may impact intervention outcomes and so needs to be measured in intervention research; however, the way in which it is measured may impact on data.

  10. Explaining Errors in Children's Questions

    ERIC Educational Resources Information Center

    Rowland, Caroline F.

    2007-01-01

    The ability to explain the occurrence of errors in children's speech is an essential component of successful theories of language acquisition. The present study tested some generativist and constructivist predictions about error on the questions produced by ten English-learning children between 2 and 5 years of age. The analyses demonstrated that,…

  11. Disorders of Articulation. Prentice-Hall Foundations of Speech Pathology Series.

    ERIC Educational Resources Information Center

    Carrell, James A.

    Designed for students of speech pathology and audiology and for practicing clinicians, the text considers the nature of the articulation process, criteria for diagnosis, and classification and etiology of disorders. Also discussed are phonetic characteristics, including phonemic errors and configurational and contextual defects; and functional…

  12. Categorical speech processing in Broca's area: an fMRI study using multivariate pattern-based analysis.

    PubMed

    Lee, Yune-Sang; Turkeltaub, Peter; Granger, Richard; Raizada, Rajeev D S

    2012-03-14

    Although much effort has been directed toward understanding the neural basis of speech processing, the neural processes involved in the categorical perception of speech have been relatively less studied, and many questions remain open. In this functional magnetic resonance imaging (fMRI) study, we probed the cortical regions mediating categorical speech perception using an advanced brain-mapping technique, whole-brain multivariate pattern-based analysis (MVPA). Normal healthy human subjects (native English speakers) were scanned while they listened to 10 consonant-vowel syllables along the /ba/-/da/ continuum. Outside of the scanner, individuals' own category boundaries were measured to divide the fMRI data into /ba/ and /da/ conditions per subject. The whole-brain MVPA revealed that Broca's area and the left pre-supplementary motor area evoked distinct neural activity patterns between the two perceptual categories (/ba/ vs /da/). Broca's area was also found when the same analysis was applied to another dataset (Raizada and Poldrack, 2007), which previously yielded the supramarginal gyrus using a univariate adaptation-fMRI paradigm. The consistent MVPA findings from two independent datasets strongly indicate that Broca's area participates in categorical speech perception, with a possible role of translating speech signals into articulatory codes. The difference in results between univariate and multivariate pattern-based analyses of the same data suggests that processes in different cortical areas along the dorsal speech perception stream are distributed on different spatial scales.

  13. Rhythmic patterning in Malaysian and Singapore English.

    PubMed

    Tan, Rachel Siew Kuang; Low, Ee-Ling

    2014-06-01

    Previous work on the rhythm of Malaysian English has been based on impressionistic observations. This paper utilizes acoustic analysis to measure the rhythmic patterns of Malaysian English. Recordings of the read speech and spontaneous speech of 10 Malaysian English speakers were analyzed and compared with recordings of an equivalent sample of Singaporean English speakers. Analysis was done using two rhythmic indexes, the PVI and VarcoV. It was found that although the rhythm of the Singaporean speakers' read speech was syllable-based, as described by previous studies, the rhythm of the Malaysian speakers was even more syllable-based. Analysis of the syllables in specific utterances showed that Malaysian speakers did not reduce vowels as much as Singaporean speakers did. Results for the spontaneous speech confirmed the findings for the read speech; that is, the same rhythmic patterning, which normally triggers vowel reduction, was found.
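
    The two rhythm indexes used above have standard definitions: the normalized PVI averages the duration difference between adjacent vocalic intervals relative to their pairwise mean, and VarcoV is the coefficient of variation of vocalic interval durations. A sketch with hypothetical interval durations:

```python
# Sketch: normalized PVI and VarcoV from vocalic interval durations.
import numpy as np

def npvi(durs):
    d = np.asarray(durs, dtype=float)
    pair_means = (d[:-1] + d[1:]) / 2
    return 100 * np.mean(np.abs(np.diff(d)) / pair_means)

def varco_v(durs):
    d = np.asarray(durs, dtype=float)
    return 100 * d.std(ddof=1) / d.mean()

vowel_durs = [110, 62, 140, 55, 95, 70]     # hypothetical intervals, in ms
print("nPVI:", round(npvi(vowel_durs), 1),
      "VarcoV:", round(varco_v(vowel_durs), 1))
```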

  14. A preliminary comparison of speech recognition functionality in dental practice management systems.

    PubMed

    Irwin, Jeannie Y; Schleyer, Titus

    2008-11-06

    In this study, we examined speech recognition functionality in four leading dental practice management systems. Twenty dental students used voice to chart a simulated patient with 18 findings in each system. Results show it can take over a minute to chart one finding and that users frequently have to repeat commands. Limited functionality, poor usability and a high error rate appear to retard adoption of speech recognition in dentistry.

  15. The dance of communication: retaining family membership despite severe non-speech dementia.

    PubMed

    Walmsley, Bruce D; McCormack, Lynne

    2014-09-01

    There is minimal research investigating non-speech communication as a result of living with severe dementia. This phenomenological study explores retained awareness expressed through non-speech patterns of communication in a family member living with severe dementia. Further, it describes reciprocal efforts used by all family members to engage in alternative patterns of communication. Family interactions were filmed to observe speech and non-speech relational communication. Participants were four family groups each with a family member living with non-speech communication as a result of severe dementia. Overall there were 16 participants. Data were analysed using thematic analysis. One superordinate theme, Dance of Communication, describes the interactive patterns that were observed during family communication. Two subordinate themes emerged: (a) in-step; characterised by communication that indicated harmony, spontaneity and reciprocity, and; (b) out-of-step; characterised by communication that indicated disharmony, syncopation, and vulnerability. This study highlights that retained awareness can exist at levels previously unrecognised in those living with limited or absent speech as a result of severe dementia. A recommendation for the development of a communication program for caregivers of individuals living with dementia is presented. © The Author(s) 2013 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.

  16. Automatic speech recognition and training for severely dysarthric users of assistive technology: the STARDUST project.

    PubMed

    Parker, Mark; Cunningham, Stuart; Enderby, Pam; Hawley, Mark; Green, Phil

    2006-01-01

    The STARDUST project developed robust computer speech recognizers for use by eight people with severe dysarthria and concomitant physical disability to access assistive technologies. Independent computer speech recognizers trained with normal speech are of limited functional use to those with severe dysarthria due to limited and inconsistent proximity to "normal" articulatory patterns. Severe dysarthric output may also be characterized by a small set of distinguishable phonetic tokens, making the acoustic differentiation of target words difficult. Speaker-dependent computer speech recognition using Hidden Markov Models was achieved by the identification of robust phonetic elements within the individual speaker output patterns. A new system of speech training using computer-generated visual and auditory feedback reduced the inconsistent production of key phonetic tokens over time.
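
    A schematic of the speaker-dependent, small-vocabulary approach described above: one Gaussian HMM per target word, trained on the speaker's own tokens, with recognition by maximum log-likelihood. This assumes the third-party hmmlearn package, and the random feature vectors are stand-ins for real acoustic frames:

```python
# Sketch: per-word Gaussian HMMs for speaker-dependent recognition.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(2)

def tokens(offset, n=12, frames=30, dim=13):
    """Fake MFCC-like tokens for one word, shifted by `offset`."""
    return [rng.standard_normal((frames, dim)) + offset for _ in range(n)]

vocab = {"yes": tokens(0.0), "no": tokens(1.5)}
models = {}
for word, toks in vocab.items():
    X = np.vstack(toks)
    lengths = [len(t) for t in toks]
    m = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
    models[word] = m.fit(X, lengths)

test = rng.standard_normal((30, 13)) + 1.5      # an unseen "no" token
print(max(models, key=lambda w: models[w].score(test)))
```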

  17. Articulation in schoolchildren and adults with neurofibromatosis type 1.

    PubMed

    Cosyns, Marjan; Mortier, Geert; Janssens, Sandra; Bogaert, Famke; D'Hondt, Stephanie; Van Borsel, John

    2012-01-01

    Several authors have mentioned the occurrence of articulation problems in the neurofibromatosis type 1 (NF1) population. However, few studies have undertaken a detailed analysis of the articulation skills of NF1 patients, especially in schoolchildren and adults. Therefore, the aim of the present study was to examine in depth the articulation skills of NF1 schoolchildren and adults, both phonetically and phonologically. Speech samples were collected from 43 Flemish NF1 patients (14 children and 29 adults), ranging in age between 7 and 53 years, using a standardized speech test in which all Flemish single speech sounds and most clusters occur in all their permissible syllable positions. Analyses concentrated on consonants only and included a phonetic inventory, a phonetic, and a phonological analysis. It was shown that phonetic inventories were incomplete in 16.28% (7/43) of participants, in whom totally correct realizations of the sibilants /ʃ/ and/or /ʒ/ were missing. Phonetic analysis revealed that distortions were the predominant phonetic error type. Sigmatismus stridens, multiple ad- or interdentality, and, in children, rhotacismus non vibrans were frequently observed. From a phonological perspective, the most common error types were substitution and syllable structure errors. In particular, devoicing, cluster simplification, and, in children, deletion of the final consonant of words were observed. Further, it was demonstrated that significantly more men than women presented with an incomplete phonetic inventory, and that girls tended to display more articulation errors than boys. Additionally, children exhibited significantly more articulation errors than adults, suggesting that although the articulation skills of NF1 patients evolve positively with age, articulation problems do not resolve completely from childhood to adulthood. As such, the articulation errors made by NF1 adults may be regarded as residual articulation disorders. It can be concluded that the speech of NF1 patients is characterized by mild articulation disorders at an age when this is no longer expected. Readers will be able to describe neurofibromatosis type 1 (NF1) and explain the articulation errors displayed by schoolchildren and adults with this genetic syndrome. © 2011 Elsevier Inc. All rights reserved.

  18. Speech perception as an active cognitive process

    PubMed Central

    Heald, Shannon L. M.; Nusbaum, Howard C.

    2014-01-01

    One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process, which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing by masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or therapy. PMID:24672438

  19. Rapid Statistical Learning Supporting Word Extraction From Continuous Speech.

    PubMed

    Batterink, Laura J

    2017-07-01

    The identification of words in continuous speech, known as speech segmentation, is a critical early step in language acquisition. This process is partially supported by statistical learning, the ability to extract patterns from the environment. Given that speech segmentation represents a potential bottleneck for language acquisition, patterns in speech may be extracted very rapidly, without extensive exposure. This hypothesis was examined by exposing participants to continuous speech streams composed of novel repeating nonsense words. Learning was measured on-line using a reaction time task. After merely one exposure to an embedded novel word, learners demonstrated significant learning effects, as revealed by faster responses to predictable than to unpredictable syllables. These results demonstrate that learners gained sensitivity to the statistical structure of unfamiliar speech on a very rapid timescale. This ability may play an essential role in early stages of language acquisition, allowing learners to rapidly identify word candidates and "break in" to an unfamiliar language.

  20. Native Reactions to Non-Native Speech: A Review of Empirical Research.

    ERIC Educational Resources Information Center

    Eisenstein, Miriam

    1983-01-01

    Recent research on native speakers' reactions to nonnative speech that views listeners, speakers, and language from a variety of perspectives using both objective and subjective research paradigms is reviewed. Studies of error gravity, relative intelligibility of language samples, the role of accent, speakers' characteristics, and context in which…

  1. Speech Production in Hearing-Impaired Children.

    ERIC Educational Resources Information Center

    Gold, Toni

    1980-01-01

    Investigations in recent years have indicated that only about 20% of the speech output of the deaf is understood by the "person on the street." This lack of intelligibility has been associated with some frequently occurring segmental and suprasegmental errors. Journal Availability: Elsevier North Holland, Inc., 52 Vanderbilt Avenue, New York, NY…

  2. La parole, vue et prise par les etudiants (Speech as Seen and Understood by Student).

    ERIC Educational Resources Information Center

    Gajo, Laurent, Ed.; Jeanneret, Fabrice, Ed.

    1998-01-01

    Articles on speech and second language learning include: "Les sequences de correction en classe de langue seconde: evitement du 'non' explicite" ("Error Correction Sequences in Second Language Class: Avoidance of the Explicit 'No'") (Anne-Lise de Bosset); "Analyse hierarchique et fonctionnelle du discours: conversations…

  3. Central Pattern Generation and the Motor Infrastructure for Suck, Respiration, and Speech

    ERIC Educational Resources Information Center

    Barlow, Steven M.; Estep, Meredith

    2006-01-01

    The objective of the current report is to review experimental findings on centrally patterned movements and sensory and descending modulation of central pattern generators (CPGs) in a variety of animal and human models. Special emphasis is directed toward speech production muscle systems, including the chest wall and orofacial complex during…

  4. Differences in early speech patterns between Parkinson variant of multiple system atrophy and Parkinson's disease.

    PubMed

    Huh, Young Eun; Park, Jongkyu; Suh, Mee Kyung; Lee, Sang Eun; Kim, Jumin; Jeong, Yuri; Kim, Hee-Tae; Cho, Jin Whan

    2015-08-01

    In Parkinson variant of multiple system atrophy (MSA-P), patterns of early speech impairment and their distinguishing features from Parkinson's disease (PD) require further exploration. Here, we compared speech data among patients with early-stage MSA-P, PD, and healthy subjects using quantitative acoustic and perceptual analyses. Variables were analyzed for men and women in view of gender-specific features of speech. Acoustic analysis revealed that male patients with MSA-P exhibited more profound speech abnormalities than those with PD, regarding increased voice pitch, prolonged pause time, and reduced speech rate. This might be due to widespread pathology of MSA-P in nigrostriatal or extra-striatal structures related to speech production. Although several perceptual measures were mildly impaired in MSA-P and PD patients, none of these parameters showed a significant difference between patient groups. Detailed speech analysis using acoustic measures may help distinguish between MSA-P and PD early in the disease process. Copyright © 2015 Elsevier Inc. All rights reserved.
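
    A rough sketch of how two of the timing measures compared above (pause time and speech rate) can be computed from an energy-based silence segmentation; the silence threshold, frame size, and syllable count are illustrative assumptions:

```python
# Sketch: pause time and speech rate from frame-level RMS energy.
import numpy as np

def pause_and_rate(x, fs, n_syllables, frame_ms=25, silence_db=-35):
    n = int(fs * frame_ms / 1000)
    frames = x[: len(x) // n * n].reshape(-1, n)
    rms_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    silent = rms_db < (rms_db.max() + silence_db)   # relative threshold
    total_s = len(frames) * frame_ms / 1000
    pause_s = silent.sum() * frame_ms / 1000
    speech_rate = n_syllables / total_s             # syllables per second
    return pause_s, speech_rate

fs = 16000
x = np.random.randn(fs * 4) * np.repeat([1, 0.01, 1, 0.01], fs)  # speech/pause
print(pause_and_rate(x, fs, n_syllables=14))
```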

  5. Neural pathways for visual speech perception

    PubMed Central

    Bernstein, Lynne E.; Liebenthal, Einat

    2014-01-01

    This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA. PMID:25520611

  6. What does it take to stress a word? Digital manipulation of stress markers in ataxic dysarthria.

    PubMed

    Lowit, Anja; Ijitona, Tolulope; Kuschmann, Anja; Corson, Stephen; Soraghan, John

    2018-05-18

    Stress production is important for effective communication, but this skill is frequently impaired in people with motor speech disorders. The literature reports successful treatment of these deficits in this population, thus highlighting the therapeutic potential of this area. However, no specific guidance is currently available to clinicians about whether any of the stress markers are more effective than others, to what degree they have to be manipulated, and whether strategies need to differ according to the underlying symptoms. In order to provide detailed information on how stress production problems can be addressed, the study investigated (1) the minimum amount of change in a single stress marker necessary to achieve significant improvement in stress target identification; and (2) whether stress can be signalled more effectively with a combination of stress markers. Data were sourced from a sentence stress task performed by 10 speakers with ataxic dysarthria and 10 healthy matched control participants. Fifteen utterances perceived as having incorrect stress patterns (no stress, all words stressed or inappropriate word stressed) were selected and digitally manipulated in a stepwise fashion based on typical speaker performance. Manipulations were performed on F0, intensity and duration, either in isolation or in combination with each other. In addition, pitch contours were modified for some utterances. A total of 50 naïve listeners scored which word they perceived as being stressed. Results showed that increases in duration and intensity at levels smaller than produced by the control participants resulted in significant improvements in listener accuracy. The effectiveness of F0 increases depended on the underlying error pattern. Overall intensity showed the most stable effects. Modifications of the pitch contour also resulted in significant improvements, but not to the same degree as amplification. Integration of two or more stress markers did not result in better results than manipulation of individual stress markers, unless they were combined with pitch contour modifications. The results highlight the potential for improvement of stress production in speakers with motor speech disorders. The fact that individual parameter manipulation is as effective as combining them will facilitate the therapeutic process considerably, as will the result that amplification at lower levels than seen in typical speakers is sufficient. The difference in results across utterance sets highlights the need to investigate the underlying error pattern in order to select the most effective compensatory strategy for clients. © 2018 Royal College of Speech and Language Therapists.
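
    A schematic of the stepwise manipulation described above, applied to a single target word: intensity is scaled directly, and duration is stretched by naive resampling. The study used proper pitch-synchronous resynthesis of F0, intensity, and duration; this stand-in only illustrates the stepwise-parameter idea, and the step sizes are arbitrary:

```python
# Sketch: scale the intensity and duration of one word in an utterance.
import numpy as np
from scipy.signal import resample

def stress_word(x, fs, start_s, end_s, gain_db=0.0, stretch=1.0):
    i, j = int(start_s * fs), int(end_s * fs)
    word = x[i:j] * 10 ** (gain_db / 20)          # intensity step
    if stretch != 1.0:                            # duration step
        word = resample(word, int(len(word) * stretch))
    return np.concatenate([x[:i], word, x[j:]])

fs = 16000
utt = np.random.randn(fs * 2)
# raise the word at 0.8-1.1 s by 3 dB and lengthen it by 15%
stressed = stress_word(utt, fs, 0.8, 1.1, gain_db=3.0, stretch=1.15)
print(len(stressed) - len(utt), "samples added")
```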

  7. Dog-directed speech: why do we use it and do dogs pay attention to it?

    PubMed Central

    Ben-Aderet, Tobey; Gallego-Abenza, Mario

    2017-01-01

    Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed speech with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. PMID:28077769

  8. Dog-directed speech: why do we use it and do dogs pay attention to it?

    PubMed

    Ben-Aderet, Tobey; Gallego-Abenza, Mario; Reby, David; Mathevon, Nicolas

    2017-01-11

    Pet-directed speech is strikingly similar to infant-directed speech, a peculiar speaking pattern with higher pitch and slower tempo known to engage infants' attention and promote language learning. Here, we report the first investigation of potential factors modulating the use of dog-directed speech, as well as its immediate impact on dogs' behaviour. We recorded adult participants speaking in front of pictures of puppies, adult and old dogs, and analysed the quality of their speech. We then performed playback experiments to assess dogs' reaction to dog-directed speech compared with normal speech. We found that human speakers used dog-directed speech with dogs of all ages and that the acoustic structure of dog-directed speech was mostly independent of dog age, except for sound pitch which was relatively higher when communicating with puppies. Playback demonstrated that, in the absence of other non-auditory cues, puppies were highly reactive to dog-directed speech, and that the pitch was a key factor modulating their behaviour, suggesting that this specific speech register has a functional value in young dogs. Conversely, older dogs did not react differentially to dog-directed speech compared with normal speech. The fact that speakers continue to use dog-directed speech with older dogs therefore suggests that this speech pattern may mainly be a spontaneous attempt to facilitate interactions with non-verbal listeners. © 2017 The Author(s).

  9. [Vocal recognition in dental and oral radiology].

    PubMed

    La Fianza, A; Giorgetti, S; Marelli, P; Campani, R

    1993-10-01

    Speech reporting benefits from units that can recognize sentences in any natural language in real time. The use of this method in the everyday practice of radiology departments shows its possible fields of application. We used the speech recognition method to report orthopantomographic exams in order to evaluate the advantages the method offers for the management and quality of reporting exams that are difficult to fit into other closed computerized reporting systems. Both speech recognition and the conventional reporting method (tape recording and typewriting) were used to report 760 orthopantomographs. The average time needed to make the report, the legibility (Flesch) index, as adapted for the Italian language, and a clinical index (the subjective opinion of 4 odontostomatologists) were evaluated for each exam with both techniques. Moreover, errors in speech reporting (crude, human, and overall errors) were also evaluated. The advantages of speech reporting consisted in the shorter time needed for the report to become available (2.24 vs 2.99 minutes) (p < 0.0005), in the improved Flesch index (30.62 vs 28.9), and in the clinical index. The data obtained from speech reporting in odontostomatologic radiology were useful not only in reducing the mean reporting time of orthopantomographic exams but also in improving report quality by reducing both grammar and transmission mistakes. However, the basic condition for such results is the speaker's skill in dictating a good report.

  10. DETECTION AND IDENTIFICATION OF SPEECH SOUNDS USING CORTICAL ACTIVITY PATTERNS

    PubMed Central

    Centanni, T.M.; Sloan, A.M.; Reed, A.C.; Engineer, C.T.; Rennaker, R.; Kilgard, M.P.

    2014-01-01

    We have developed a classifier capable of locating and identifying speech sounds using activity from rat auditory cortex with an accuracy equivalent to behavioral performance without the need to specify the onset time of the speech sounds. This classifier can identify speech sounds from a large speech set within 40 ms of stimulus presentation. To compare the temporal limits of the classifier to behavior, we developed a novel task that requires rats to identify individual consonant sounds from a stream of distracter consonants. The classifier successfully predicted the ability of rats to accurately identify speech sounds for syllable presentation rates up to 10 syllables per second (up to 17.9 ± 1.5 bits/sec), which is comparable to human performance. Our results demonstrate that the spatiotemporal patterns generated in primary auditory cortex can be used to quickly and accurately identify consonant sounds from a continuous speech stream without prior knowledge of the stimulus onset times. Improved understanding of the neural mechanisms that support robust speech processing in difficult listening conditions could improve the identification and treatment of a variety of speech processing disorders. PMID:24286757
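
    A minimal sketch of the kind of onset-free template classifier described above: slide a window across the multichannel cortical response and report any stored consonant template whose distance falls below a threshold. The template format, the Euclidean distance measure, and the threshold are illustrative assumptions, not the published classifier.

    ```python
    # Schematic sliding-window template classifier for neural activity,
    # requiring no knowledge of stimulus onset times. Illustrative only.
    import numpy as np

    def classify_stream(activity, templates, threshold):
        """activity: (channels, time) array of cortical responses;
        templates: dict mapping consonant name -> (channels, win) array."""
        win = next(iter(templates.values())).shape[1]
        events = []
        for t in range(activity.shape[1] - win + 1):
            segment = activity[:, t:t + win]
            name, dist = min(((k, np.linalg.norm(segment - v))
                              for k, v in templates.items()),
                             key=lambda kv: kv[1])
            if dist < threshold:          # close enough to a known sound
                events.append((t, name))  # (window start, identified sound)
        return events
    ```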

  11. Speech rhythm analysis with decomposition of the amplitude envelope: characterizing rhythmic patterns within and across languages.

    PubMed

    Tilsen, Sam; Arvaniti, Amalia

    2013-07-01

    This study presents a method for analyzing speech rhythm using empirical mode decomposition of the speech amplitude envelope, which allows for extraction and quantification of syllabic- and supra-syllabic time-scale components of the envelope. The method of empirical mode decomposition of a vocalic energy amplitude envelope is illustrated in detail, and several types of rhythm metrics derived from this method are presented. Spontaneous speech extracted from the Buckeye Corpus is used to assess the effect of utterance length on metrics, and it is shown how metrics representing variability in the supra-syllabic time-scale components of the envelope can be used to identify stretches of speech with targeted rhythmic characteristics. Furthermore, the envelope-based metrics are used to characterize cross-linguistic differences in speech rhythm in the UC San Diego Speech Lab corpus of English, German, Greek, Italian, Korean, and Spanish speech elicited in read sentences, read passages, and spontaneous speech. The envelope-based metrics exhibit significant effects of language and elicitation method that argue for a nuanced view of cross-linguistic rhythm patterns.
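
    A rough sketch of the pipeline described above, assuming SciPy for the amplitude envelope and the PyEMD package for empirical mode decomposition; the smoothing cutoff and the inter-peak variability metric are illustrative stand-ins, not the paper's exact rhythm metrics.

    ```python
    # Sketch: amplitude envelope -> empirical mode decomposition ->
    # a toy variability metric per time-scale component.
    import numpy as np
    from scipy.signal import hilbert, butter, filtfilt
    from PyEMD import EMD  # pip install EMD-signal

    def amplitude_envelope(signal, fs, cutoff_hz=10.0):
        env = np.abs(hilbert(signal))                       # analytic magnitude
        b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
        return filtfilt(b, a, env)                          # smoothed envelope

    def envelope_components(env):
        # EMD splits the envelope into intrinsic mode functions (IMFs);
        # fast IMFs approximate syllabic modulation, slower IMFs
        # approximate supra-syllabic modulation.
        return EMD().emd(env)

    def interval_variability(component):
        # Toy rhythm metric: coefficient of variation of inter-peak intervals.
        peaks = np.flatnonzero((component[1:-1] > component[:-2]) &
                               (component[1:-1] >= component[2:])) + 1
        intervals = np.diff(peaks)
        return (intervals.std() / intervals.mean()
                if intervals.size > 1 else float("nan"))
    ```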

  12. Reliance on auditory feedback in children with childhood apraxia of speech.

    PubMed

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P; Guarino, Anthony J; Green, Jordan R

    2015-01-01

    Children with childhood apraxia of speech (CAS) have been hypothesized to continuously monitor their speech through auditory feedback to minimize speech errors. We used an auditory masking paradigm to determine the effect of attenuating auditory feedback on speech in 30 children: 9 with CAS, 10 with speech delay, and 11 with typical development. The masking only affected the speech of children with CAS as measured by voice onset time and vowel space area. These findings provide preliminary support for greater reliance on auditory feedback among children with CAS. Readers of this article should be able to (i) describe the motivation for investigating the role of auditory feedback in children with CAS; (ii) report the effects of feedback attenuation on speech production in children with CAS, speech delay, and typical development, and (iii) understand how the current findings may support a feedforward program deficit in children with CAS. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  13. English speech acquisition in 3- to 5-year-old children learning Russian and English.

    PubMed

    Gildersleeve-Neumann, Christina E; Wright, Kira L

    2010-10-01

    English speech acquisition in Russian-English (RE) bilingual children was investigated, exploring the effects of Russian phonetic and phonological properties on English single-word productions. Russian has more complex consonants and clusters and a smaller vowel inventory than English. One hundred thirty-seven single-word samples were phonetically transcribed from 14 RE and 28 English-only (E) children, ages 3;3 (years;months) to 5;7. Language and age differences were compared descriptively for phonetic inventories. Multivariate analyses compared phoneme accuracy and error rates between the two language groups. RE children produced Russian-influenced phones in English, including palatalized consonants and trills, and demonstrated significantly higher rates of trill substitution, final devoicing, and vowel errors than E children, suggesting Russian language effects on English. RE and E children did not differ in their overall production complexity, with similar final consonant deletion and cluster reduction error rates, similar phonetic inventories by age, and similar levels of phonetic complexity. Both older language groups were more accurate than the younger language groups. We observed effects of Russian on English speech acquisition; however, there were similarities between the RE and E children that have not been reported in previous studies of speech acquisition in bilingual children. These findings underscore the importance of knowing the phonological properties of both languages of a bilingual child in assessment.

  14. The influence of sexual orientation on vowel production (L)

    NASA Astrophysics Data System (ADS)

    Pierrehumbert, Janet B.; Bent, Tessa; Munson, Benjamin; Bradlow, Ann R.; Bailey, J. Michael

    2004-10-01

    Vowel production in gay, lesbian, bisexual (GLB), and heterosexual speakers was examined. Differences in the acoustic characteristics of vowels were found as a function of sexual orientation. Lesbian and bisexual women produced less fronted /u/ and /ɑ/ than heterosexual women. Gay men produced a more expanded vowel space than heterosexual men. However, the vowels of GLB speakers were not generally shifted toward vowel patterns typical of the opposite sex. These results are inconsistent with the conjecture that innate biological factors have a broadly feminizing influence on the speech of gay men and a broadly masculinizing influence on the speech of lesbian/bisexual women. They are consistent with the idea that innate biological factors influence GLB speech patterns indirectly by causing selective adoption of certain speech patterns characteristic of the opposite sex.

  15. Hearing Lips and Seeing Voices: How Cortical Areas Supporting Speech Production Mediate Audiovisual Speech Perception

    PubMed Central

    Skipper, Jeremy I.; van Wassenhove, Virginie; Nusbaum, Howard C.; Small, Steven L.

    2009-01-01

    Observing a speaker’s mouth profoundly influences speech perception. For example, listeners perceive an “illusory” “ta” when the video of a face producing /ka/ is dubbed onto an audio /pa/. Here, we show how cortical areas supporting speech production mediate this illusory percept and audiovisual (AV) speech perception more generally. Specifically, cortical activity during AV speech perception occurs in many of the same areas that are active during speech production. We find that different perceptions of the same syllable and the perception of different syllables are associated with different distributions of activity in frontal motor areas involved in speech production. Activity patterns in these frontal motor areas resulting from the illusory “ta” percept are more similar to the activity patterns evoked by AV/ta/ than they are to patterns evoked by AV/pa/ or AV/ka/. In contrast to the activity in frontal motor areas, stimulus-evoked activity for the illusory “ta” in auditory and somatosensory areas and visual areas initially resembles activity evoked by AV/pa/ and AV/ka/, respectively. Ultimately, though, activity in these regions comes to resemble activity evoked by AV/ta/. Together, these results suggest that AV speech elicits in the listener a motor plan for the production of the phoneme that the speaker might have been attempting to produce, and that feedback in the form of efference copy from the motor system ultimately influences the phonetic interpretation. PMID:17218482

  16. Speech errors of amnesic H.M.: unlike everyday slips-of-the-tongue.

    PubMed

    MacKay, Donald G; James, Lori E; Hadley, Christopher B; Fogler, Kethera A

    2011-03-01

    Three language production studies indicate that amnesic H.M. produces speech errors unlike everyday slips-of-the-tongue. Study 1 was a naturalistic task: H.M. and six controls closely matched for age, education, background and IQ described what makes captioned cartoons funny. Nine judges rated the descriptions blind to speaker identity and gave reliably more negative ratings for coherence, vagueness, comprehensibility, grammaticality, and adequacy of humor-description for H.M. than the controls. Study 2 examined "major errors", a novel type of speech error that is uncorrected and reduces the coherence, grammaticality, accuracy and/or comprehensibility of an utterance. The results indicated that H.M. produced seven types of major errors reliably more often than controls: substitutions, omissions, additions, transpositions, reading errors, free associations, and accuracy errors. These results contradict recent claims that H.M. retains unconscious or implicit language abilities and produces spoken discourse that is "sophisticated," "intact" and "without major errors." Study 3 examined whether three classical types of errors (omissions, additions, and substitutions of words and phrases) differed for H.M. versus controls in basic nature and relative frequency by error type. The results indicated that omissions, and especially multi-word omissions, were relatively more common for H.M. than the controls; and substitutions violated the syntactic class regularity (whereby, e.g., nouns substitute with nouns but not verbs) relatively more often for H.M. than the controls. These results suggest that H.M.'s medial temporal lobe damage impaired his ability to rapidly form new connections between units in the cortex, a process necessary to form complete and coherent internal representations for novel sentence-level plans. In short, different brain mechanisms underlie H.M.'s major errors (which reflect incomplete and incoherent sentence-level plans) versus everyday slips-of-the-tongue (which reflect errors in activating pre-planned units in fully intact sentence-level plans). Implications of the results of Studies 1-3 are discussed for systems theory, binding theory and relational memory theories. Copyright © 2010 Elsevier Srl. All rights reserved.

  17. Effects of Production Training and Perception Training on Lexical Tone Perception--Are the Effects Domain General or Domain Specific?

    ERIC Educational Resources Information Center

    Lu, Shuang

    2013-01-01

    The relationship between speech perception and production has been debated for a long time. The Motor Theory of speech perception (Liberman et al., 1989) claims that perceiving speech is identifying the intended articulatory gestures rather than perceiving the sound patterns. It seems to suggest that speech production precedes speech perception,…

  18. Is Birdsong More Like Speech or Music?

    PubMed

    Shannon, Robert V

    2016-04-01

    Music and speech share many acoustic cues, but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate, is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. Copyright © 2016 Elsevier Ltd. All rights reserved.

  19. Teaching Speech Improvement to the Disadvantaged.

    ERIC Educational Resources Information Center

    Nash, Rosa Lee

    1967-01-01

    To develop positive speech patterns in disadvantaged students, the More Effective Schools Program in New York City instigated an experimental speech improvement program, K-6, in 20 of its elementary schools. Three typical speech-related problems of the disadvantaged--lack of school "know-how," inability to verbalize well, and the presence of poor…

  20. Perception of speech rhythm in second language: the case of rhythmically similar L1 and L2

    PubMed Central

    Ordin, Mikhail; Polyanskaya, Leona

    2015-01-01

    We investigated the perception of developmental changes in timing patterns that happen in the course of second language (L2) acquisition, provided that the native and the target languages of the learner are rhythmically similar (German and English). It was found that speech rhythm in L2 English produced by German learners becomes increasingly stress-timed as acquisition progresses. This development is captured by the tempo-normalized rhythm measures of durational variability. Advanced learners also deliver speech at a faster rate. However, when native speakers have to classify the timing patterns characteristic of L2 English of German learners at different proficiency levels, they attend to speech rate cues and ignore the differences in speech rhythm. PMID:25859228

  1. Direction of Attentional Focus in Biofeedback Treatment for /R/ Misarticulation

    ERIC Educational Resources Information Center

    McAllister Byun, Tara; Swartz, Michelle T.; Halpin, Peter F.; Szeredi, Daniel; Maas, Edwin

    2016-01-01

    Background: Maintaining an external direction of focus during practice is reported to facilitate acquisition of non-speech motor skills, but it is not known whether these findings also apply to treatment for speech errors. This question has particular relevance for treatment incorporating visual biofeedback, where clinician cueing can direct the…

  2. Do Children with Phonological Delay Have Phonological Short-Term and Phonological Working Memory Deficits?

    ERIC Educational Resources Information Center

    Waring, Rebecca; Eadie, Patricia; Liow, Susan Rickard; Dodd, Barbara

    2017-01-01

    While little is known about why children make speech errors, it has been hypothesized that cognitive-linguistic factors may underlie phonological speech sound disorders. This study compared the phonological short-term and phonological working memory abilities (using immediate memory tasks) and receptive vocabulary size of 14 monolingual preschool…

  3. Detecting and Correcting Speech Rhythm Errors

    ERIC Educational Resources Information Center

    Yurtbasi, Metin

    2015-01-01

    Every language has its own rhythm. Unlike many other languages in the world, English depends on the correct pronunciation of stressed and unstressed or weakened syllables recurring in the same phrase or sentence. Mastering the rhythm of English makes speaking more effective. Experiments have shown that we tend to hear speech as more rhythmical…

  4. Evaluation of Core Vocabulary Therapy for Deaf Children: Four Treatment Case Studies

    ERIC Educational Resources Information Center

    Herman, Rosalind; Ford, Katie; Thomas, Jane; Oyebade, Natalie; Bennett, Danita; Dodd, Barbara

    2015-01-01

    This study evaluated whether core vocabulary intervention (CVT) improved single word speech accuracy, consistency and intelligibility in four 9-11-year-old children with profound sensori-neural deafness fitted with cochlear implants and/or digital hearing aids. Their speech was characterized by inconsistent production of different error forms for…

  5. Hybrid Speaker Recognition Using Universal Acoustic Model

    NASA Astrophysics Data System (ADS)

    Nishimura, Jun; Kuroda, Tadahiro

    We propose a novel speaker recognition approach using a speaker-independent universal acoustic model (UAM) for sensornet applications. In sensornet applications such as “Business Microscope”, interactions among knowledge workers in an organization can be visualized by sensing face-to-face communication using wearable sensor nodes. In conventional studies, speakers are detected by comparing the energy of input speech signals among the nodes. However, there are often synchronization errors among the nodes which degrade the speaker recognition performance. By focusing on the properties of the speaker's acoustic channel, UAM can provide robustness against these synchronization errors. The overall speaker recognition accuracy is improved by combining UAM with the energy-based approach. For 0.1 s speech inputs and 4 subjects, speaker recognition accuracy of 94% is achieved with synchronization errors of less than 100 ms.
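
    For illustration, the energy-based baseline that the UAM is combined with can be sketched as picking the node whose microphone captures the highest short-term energy for a given interval; the framing below is an assumption, not the authors' implementation.

    ```python
    # Sketch of energy-based speaker detection across sensor nodes:
    # the node nearest the speaker should record the most energy.
    import numpy as np

    def loudest_node(frames_by_node):
        """frames_by_node: dict node_id -> 1-D signal for the same interval."""
        return max(frames_by_node,
                   key=lambda n: float(np.mean(np.square(frames_by_node[n]))))
    ```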

  6. Acoustic-Emergent Phonology in the Amplitude Envelope of Child-Directed Speech

    PubMed Central

    Leong, Victoria; Goswami, Usha

    2015-01-01

    When acquiring language, young children may use acoustic spectro-temporal patterns in speech to derive phonological units in spoken language (e.g., prosodic stress patterns, syllables, phonemes). Children appear to learn acoustic-phonological mappings rapidly, without direct instruction, yet the underlying developmental mechanisms remain unclear. Across different languages, a relationship between amplitude envelope sensitivity and phonological development has been found, suggesting that children may make use of amplitude modulation (AM) patterns within the envelope to develop a phonological system. Here we present the Spectral Amplitude Modulation Phase Hierarchy (S-AMPH) model, a set of algorithms for deriving the dominant AM patterns in child-directed speech (CDS). Using Principal Components Analysis, we show that rhythmic CDS contains an AM hierarchy comprising 3 core modulation timescales. These timescales correspond to key phonological units: prosodic stress (Stress AM, ~2 Hz), syllables (Syllable AM, ~5 Hz) and onset-rime units (Phoneme AM, ~20 Hz). We argue that these AM patterns could in principle be used by naïve listeners to compute acoustic-phonological mappings without lexical knowledge. We then demonstrate that the modulation statistics within this AM hierarchy indeed parse the speech signal into a primitive hierarchically-organised phonological system comprising stress feet (proto-words), syllables and onset-rime units. We apply the S-AMPH model to two other CDS corpora, one spontaneous and one deliberately-timed. The model accurately identified 72–82% (freely-read CDS) and 90–98% (rhythmically-regular CDS) stress patterns, syllables and onset-rime units. This in-principle demonstration that primitive phonology can be extracted from speech AMs is termed Acoustic-Emergent Phonology (AEP) theory. AEP theory provides a set of methods for examining how early phonological development is shaped by the temporal modulation structure of speech across languages. The S-AMPH model reveals a crucial developmental role for stress feet (AMs ~2 Hz). Stress feet underpin different linguistic rhythm typologies, and speech rhythm underpins language acquisition by infants in all languages. PMID:26641472

  7. Acoustic-Emergent Phonology in the Amplitude Envelope of Child-Directed Speech.

    PubMed

    Leong, Victoria; Goswami, Usha

    2015-01-01

    When acquiring language, young children may use acoustic spectro-temporal patterns in speech to derive phonological units in spoken language (e.g., prosodic stress patterns, syllables, phonemes). Children appear to learn acoustic-phonological mappings rapidly, without direct instruction, yet the underlying developmental mechanisms remain unclear. Across different languages, a relationship between amplitude envelope sensitivity and phonological development has been found, suggesting that children may make use of amplitude modulation (AM) patterns within the envelope to develop a phonological system. Here we present the Spectral Amplitude Modulation Phase Hierarchy (S-AMPH) model, a set of algorithms for deriving the dominant AM patterns in child-directed speech (CDS). Using Principal Components Analysis, we show that rhythmic CDS contains an AM hierarchy comprising 3 core modulation timescales. These timescales correspond to key phonological units: prosodic stress (Stress AM, ~2 Hz), syllables (Syllable AM, ~5 Hz) and onset-rime units (Phoneme AM, ~20 Hz). We argue that these AM patterns could in principle be used by naïve listeners to compute acoustic-phonological mappings without lexical knowledge. We then demonstrate that the modulation statistics within this AM hierarchy indeed parse the speech signal into a primitive hierarchically-organised phonological system comprising stress feet (proto-words), syllables and onset-rime units. We apply the S-AMPH model to two other CDS corpora, one spontaneous and one deliberately-timed. The model accurately identified 72-82% (freely-read CDS) and 90-98% (rhythmically-regular CDS) stress patterns, syllables and onset-rime units. This in-principle demonstration that primitive phonology can be extracted from speech AMs is termed Acoustic-Emergent Phonology (AEP) theory. AEP theory provides a set of methods for examining how early phonological development is shaped by the temporal modulation structure of speech across languages. The S-AMPH model reveals a crucial developmental role for stress feet (AMs ~2 Hz). Stress feet underpin different linguistic rhythm typologies, and speech rhythm underpins language acquisition by infants in all languages.
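
    A minimal sketch of extracting the three AM timescales named above (Stress ~2 Hz, Syllable ~5 Hz, Phoneme ~20 Hz) by band-pass filtering a speech amplitude envelope. The band edges and filter design here are assumptions for illustration; the published S-AMPH model uses its own spectral decomposition.

    ```python
    # Sketch: split a speech amplitude envelope into stress-, syllable-,
    # and phoneme-rate amplitude modulation (AM) components.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    AM_BANDS = {                 # illustrative band edges in Hz
        "stress":   (0.9, 2.5),
        "syllable": (2.5, 12.0),
        "phoneme":  (12.0, 40.0),
    }

    def am_hierarchy(signal, fs):
        env = np.abs(hilbert(signal))        # amplitude envelope
        # (in practice the envelope is usually downsampled before
        # filtering at these low rates; omitted here for brevity)
        bands = {}
        for name, (lo, hi) in AM_BANDS.items():
            sos = butter(2, [lo / (fs / 2), hi / (fs / 2)],
                         btype="band", output="sos")
            bands[name] = sosfiltfilt(sos, env)
        return bands                         # dict of AM components
    ```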

  8. Speech perception skills of deaf infants following cochlear implantation: a first report

    PubMed Central

    Houston, Derek M.; Pisoni, David B.; Kirk, Karen Iler; Ying, Elizabeth A.; Miyamoto, Richard T.

    2012-01-01

    Summary Objective We adapted a behavioral procedure that has been used extensively with normal-hearing (NH) infants, the visual habituation (VH) procedure, to assess deaf infants’ discrimination and attention to speech. Methods Twenty-four NH 6-month-olds, 24 NH 9-month-olds, and 16 deaf infants at various ages before and following cochlear implantation (CI) were tested in a sound booth on their caregiver’s lap in front of a TV monitor. During the habituation phase, each infant was presented with a repeating speech sound (e.g. ‘hop hop hop’) paired with a visual display of a checkerboard pattern on half of the trials (‘sound trials’) and only the visual display on the other half (‘silent trials’). When the infant’s looking time decreased and reached a habituation criterion, a test phase began. This consisted of two trials: an ‘old trial’ that was identical to the ‘sound trials’ and a ‘novel trial’ that consisted of a different repeating speech sound (e.g. ‘ahhh’) paired with the same checkerboard pattern. Results During the habituation phase, NH infants looked significantly longer during the sound trials than during the silent trials. However, deaf infants who had received cochlear implants (CIs) displayed a much weaker preference for the sound trials. On the other hand, both NH infants and deaf infants with CIs attended significantly longer to the visual display during the novel trial than during the old trial, suggesting that they were able to discriminate the speech patterns. Before receiving CIs, deaf infants did not show any preferences. Conclusions Taken together, the findings suggest that deaf infants who receive CIs are able to detect and discriminate some speech patterns. However, their overall attention to speech sounds may be less than NH infants’. Attention to speech may impact other aspects of speech perception and spoken language development, such as segmenting words from fluent speech and learning novel words. Implications of the effects of early auditory deprivation and age at CI on speech perception and language development are discussed. PMID:12697350

  9. Ultrasound biofeedback treatment for persisting childhood apraxia of speech.

    PubMed

    Preston, Jonathan L; Brick, Nickole; Landi, Nicole

    2013-11-01

    The purpose of this study was to evaluate the efficacy of a treatment program that includes ultrasound biofeedback for children with persisting speech sound errors associated with childhood apraxia of speech (CAS). Six children ages 9-15 years participated in a multiple baseline experiment for 18 treatment sessions during which treatment focused on producing sequences involving lingual sounds. Children were cued to modify their tongue movements using visual feedback from real-time ultrasound images. Probe data were collected before, during, and after treatment to assess word-level accuracy for treated and untreated sound sequences. As participants reached preestablished performance criteria, new sequences were introduced into treatment. All participants met the performance criterion (80% accuracy for 2 consecutive sessions) on at least 2 treated sound sequences. Across the 6 participants, performance criterion was met for 23 of 31 treated sequences in an average of 5 sessions. Some participants showed no improvement in untreated sequences, whereas others showed generalization to untreated sequences that were phonetically similar to the treated sequences. Most gains were maintained 2 months after the end of treatment. The percentage of phonemes correct increased significantly from pretreatment to the 2-month follow-up. A treatment program including ultrasound biofeedback is a viable option for improving speech sound accuracy in children with persisting speech sound errors associated with CAS.

  10. Sensorimotor speech disorders in Parkinson's disease: Programming and execution deficits.

    PubMed

    Ortiz, Karin Zazo; Brabo, Natalia Casagrande; Minett, Thais Soares C

    2016-01-01

    Dysfunction in the basal ganglia circuits is a determining factor in the physiopathology of the classic signs of Parkinson's disease (PD), and hypokinetic dysarthria is commonly related to PD. Regarding speech disorders associated with PD, the latest four-level framework of speech complicates the traditional view of dysarthria as a motor execution disorder. Based on findings that dysfunctions in the basal ganglia can cause speech disorders, and on the premise that the speech deficits seen in PD are not related to an execution motor disorder alone but also to a disorder at the motor programming level, the main objective of this study was to investigate the presence of sensorimotor disorders of programming (besides the execution disorders previously described) in PD patients. A cross-sectional study was conducted in a sample of 60 adults matched for gender, age and education: 30 adult patients diagnosed with idiopathic PD (PDG) and 30 healthy adults (CG). All types of articulation errors were reanalyzed to investigate the nature of these errors. Interjections, hesitations and repetitions of words or sentences (during discourse) were considered typical disfluencies; blocking and episodes of palilalia (words or syllables) were analyzed as atypical disfluencies. We analysed features including successive self-initiated trials, phoneme distortions, self-corrections, repetitions of sounds and syllables, prolonged movement transitions, and additions or omissions of sounds and syllables, in order to identify programming and/or execution failures. Orofacial agility was also investigated. The PDG had worse performance on all sensorimotor speech tasks. All PD patients had hypokinetic dysarthria. The clinical characteristics found suggest both execution and programming sensorimotor speech disorders in PD patients.

  11. Perceptual learning of degraded speech by minimizing prediction error.

    PubMed

    Sohoglu, Ediz; Davis, Matthew H

    2016-03-22

    Human perception is shaped by past experience on multiple timescales. Sudden and dramatic changes in perception occur when prior knowledge or expectations match stimulus content. These immediate effects contrast with the longer-term, more gradual improvements that are characteristic of perceptual learning. Despite extensive investigation of these two experience-dependent phenomena, there is considerable debate about whether they result from common or dissociable neural mechanisms. Here we test single- and dual-mechanism accounts of experience-dependent changes in perception using concurrent magnetoencephalographic and EEG recordings of neural responses evoked by degraded speech. When speech clarity was enhanced by prior knowledge obtained from matching text, we observed reduced neural activity in a peri-auditory region of the superior temporal gyrus (STG). Critically, longer-term improvements in the accuracy of speech recognition following perceptual learning resulted in reduced activity in a nearly identical STG region. Moreover, short-term neural changes caused by prior knowledge and longer-term neural changes arising from perceptual learning were correlated across subjects with the magnitude of learning-induced changes in recognition accuracy. These experience-dependent effects on neural processing could be dissociated from the neural effect of hearing physically clearer speech, which similarly enhanced perception but increased rather than decreased STG responses. Hence, the observed neural effects of prior knowledge and perceptual learning cannot be attributed to epiphenomenal changes in listening effort that accompany enhanced perception. Instead, our results support a predictive coding account of speech perception; computational simulations show how a single mechanism, minimization of prediction error, can drive immediate perceptual effects of prior knowledge and longer-term perceptual learning of degraded speech.

  12. Perceptual learning of degraded speech by minimizing prediction error

    PubMed Central

    Sohoglu, Ediz

    2016-01-01

    Human perception is shaped by past experience on multiple timescales. Sudden and dramatic changes in perception occur when prior knowledge or expectations match stimulus content. These immediate effects contrast with the longer-term, more gradual improvements that are characteristic of perceptual learning. Despite extensive investigation of these two experience-dependent phenomena, there is considerable debate about whether they result from common or dissociable neural mechanisms. Here we test single- and dual-mechanism accounts of experience-dependent changes in perception using concurrent magnetoencephalographic and EEG recordings of neural responses evoked by degraded speech. When speech clarity was enhanced by prior knowledge obtained from matching text, we observed reduced neural activity in a peri-auditory region of the superior temporal gyrus (STG). Critically, longer-term improvements in the accuracy of speech recognition following perceptual learning resulted in reduced activity in a nearly identical STG region. Moreover, short-term neural changes caused by prior knowledge and longer-term neural changes arising from perceptual learning were correlated across subjects with the magnitude of learning-induced changes in recognition accuracy. These experience-dependent effects on neural processing could be dissociated from the neural effect of hearing physically clearer speech, which similarly enhanced perception but increased rather than decreased STG responses. Hence, the observed neural effects of prior knowledge and perceptual learning cannot be attributed to epiphenomenal changes in listening effort that accompany enhanced perception. Instead, our results support a predictive coding account of speech perception; computational simulations show how a single mechanism, minimization of prediction error, can drive immediate perceptual effects of prior knowledge and longer-term perceptual learning of degraded speech. PMID:26957596
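
    A schematic toy version of the single mechanism invoked above: one prediction-error term both sharpens the immediate percept and, scaled by a small learning rate, drives slow perceptual learning. This illustrates the idea only and is not the paper's simulations.

    ```python
    # One predictive-coding step: the same error signal acts on two
    # timescales (fast inference, slow learning). Schematic only.
    def perceive(sensory, prediction, weight, lr=0.01):
        error = sensory - weight * prediction          # prediction error
        percept = weight * prediction + 0.5 * error    # fast: corrected estimate
        weight = weight + lr * error * prediction      # slow: learning
        return percept, weight
    ```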

  13. Role of Grammatical Gender and Semantics in German Word Production

    ERIC Educational Resources Information Center

    Vigliocco, Gabriella; Vinson, David P.; Indefrey, Peter; Levelt, Willem J. M.; Hellwig, Frauke

    2004-01-01

    Semantic substitution errors (e.g., saying "arm" when "leg" is intended) are among the most common types of errors occurring during spontaneous speech. It has been shown that grammatical gender of German target nouns is preserved in the errors (E. Marx, 1999). In 3 experiments, the authors explored different accounts of the grammatical gender…

  14. Listening to speech recruits specific tongue motor synergies as revealed by transcranial magnetic stimulation and tissue-Doppler ultrasound imaging

    PubMed Central

    D'Ausilio, A.; Maffongelli, L.; Bartoli, E.; Campanella, M.; Ferrari, E.; Berry, J.; Fadiga, L.

    2014-01-01

    The activation of listener's motor system during speech processing was first demonstrated by the enhancement of electromyographic tongue potentials as evoked by single-pulse transcranial magnetic stimulation (TMS) over tongue motor cortex. This technique is, however, technically challenging and enables only a rather coarse measurement of this motor mirroring. Here, we applied TMS to listeners’ tongue motor area in association with ultrasound tissue Doppler imaging to describe fine-grained tongue kinematic synergies evoked by passive listening to speech. Subjects listened to syllables requiring different patterns of dorso-ventral and antero-posterior movements (/ki/, /ko/, /ti/, /to/). Results show that passive listening to speech sounds evokes a pattern of motor synergies mirroring those occurring during speech production. Moreover, mirror motor synergies were more evident in those subjects showing good performances in discriminating speech in noise demonstrating a role of the speech-related mirror system in feed-forward processing the speaker's ongoing motor plan. PMID:24778384

  15. The Temporal Prediction of Stress in Speech and Its Relation to Musical Beat Perception.

    PubMed

    Beier, Eleonora J; Ferreira, Fernanda

    2018-01-01

    While rhythmic expectancies are thought to be at the base of beat perception in music, the extent to which stress patterns in speech are similarly represented and predicted during on-line language comprehension is debated. The temporal prediction of stress may be advantageous to speech processing, as stress patterns aid segmentation and mark new information in utterances. However, while linguistic stress patterns may be organized into hierarchical metrical structures similarly to musical meter, they do not typically present the same degree of periodicity. We review the theoretical background for the idea that stress patterns are predicted and address the following questions: First, what is the evidence that listeners can predict the temporal location of stress based on preceding rhythm? If they can, is it thanks to neural entrainment mechanisms similar to those utilized for musical beat perception? And lastly, what linguistic factors other than rhythm may account for the prediction of stress in natural speech? We conclude that while expectancies based on the periodic presentation of stresses are at play in some of the current literature, other processes are likely to affect the prediction of stress in more naturalistic, less isochronous speech. Specifically, aspects of prosody other than amplitude changes (e.g., intonation) as well as lexical, syntactic and information structural constraints on the realization of stress may all contribute to the probabilistic expectation of stress in speech.

  16. Low-dimensional recurrent neural network-based Kalman filter for speech enhancement.

    PubMed

    Xia, Youshen; Wang, Jun

    2015-07-01

    This paper proposes a new recurrent neural network-based Kalman filter for speech enhancement, based on a noise-constrained least squares estimate. The parameters of the speech signal, modeled as an autoregressive process, are first estimated by using the proposed recurrent neural network, and the speech signal is then recovered by Kalman filtering. The proposed recurrent neural network is globally asymptotically stable at the noise-constrained estimate. Because the noise-constrained estimate has a robust performance against non-Gaussian noise, the proposed recurrent neural network-based speech enhancement algorithm can minimize the estimation error of the Kalman filter parameters in non-Gaussian noise. Furthermore, owing to its low-dimensional model structure, the proposed neural network-based speech enhancement algorithm is much faster than two existing recurrent neural network-based speech enhancement algorithms. Simulation results show that the proposed recurrent neural network-based speech enhancement algorithm achieves good performance with fast computation and effective noise reduction. Copyright © 2015 Elsevier Ltd. All rights reserved.
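
    The enhancement step itself can be sketched as a standard Kalman filter over an AR(p) speech model. In the paper the AR parameters come from the proposed recurrent network under a noise-constrained criterion; here they are assumed already estimated (e.g., via Yule-Walker), which is a simplification.

    ```python
    # Kalman filtering of noisy speech under an AR(p) signal model,
    # with the AR coefficients assumed given.
    import numpy as np

    def kalman_enhance(y, ar_coeffs, q_var, r_var):
        """y: noisy samples; ar_coeffs: a_1..a_p of the AR model;
        q_var: driving-noise variance; r_var: observation-noise variance."""
        p = len(ar_coeffs)
        F = np.zeros((p, p))                 # companion-form transition
        F[0, :] = ar_coeffs
        F[1:, :-1] = np.eye(p - 1)
        H = np.zeros((1, p)); H[0, 0] = 1.0  # observe the newest sample
        Q = np.zeros((p, p)); Q[0, 0] = q_var
        x, P = np.zeros((p, 1)), np.eye(p)
        out = np.empty(len(y))
        for t, yt in enumerate(np.asarray(y, dtype=float)):
            x = F @ x                        # predict state
            P = F @ P @ F.T + Q
            S = H @ P @ H.T + r_var          # innovation variance
            K = P @ H.T / S                  # Kalman gain
            x = x + K * (yt - (H @ x))       # update with observation
            P = (np.eye(p) - K @ H) @ P
            out[t] = x[0, 0]                 # enhanced speech sample
        return out
    ```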

  17. Bilateral capacity for speech sound processing in auditory comprehension: evidence from Wada procedures.

    PubMed

    Hickok, G; Okada, K; Barr, W; Pa, J; Rogalsky, C; Donnelly, K; Barde, L; Grant, A

    2008-12-01

    Data from lesion studies suggest that the ability to perceive speech sounds, as measured by auditory comprehension tasks, is supported by temporal lobe systems in both the left and right hemisphere. For example, patients with left temporal lobe damage and auditory comprehension deficits (i.e., Wernicke's aphasics) nonetheless comprehend isolated words better than one would expect if their speech perception system had been largely destroyed (70-80% accuracy). Further, when comprehension fails in such patients, their errors are more often semantically based than phonemically based. The question addressed by the present study is whether this ability of the right hemisphere to process speech sounds is a result of plastic reorganization following chronic left hemisphere damage, or whether the ability exists in undamaged language systems. We sought to test these possibilities by studying auditory comprehension during acute left versus right hemisphere deactivation in Wada procedures. A series of 20 patients undergoing clinically indicated Wada procedures were asked to listen to an auditorily presented stimulus word, and then point to its matching picture on a card that contained the target picture, a semantic foil, a phonemic foil, and an unrelated foil. This task was performed under three conditions: baseline, during left carotid injection of sodium amytal, and during right carotid injection of sodium amytal. Overall, left hemisphere injection led to a significantly higher error rate than right hemisphere injection. However, consistent with lesion work, the majority (75%) of these errors were semantic in nature. These findings suggest that auditory comprehension deficits are predominantly semantic in nature, even following acute left hemisphere disruption. This, in turn, supports the hypothesis that the right hemisphere is capable of speech sound processing in the intact brain.

  18. Dysarthria in Mandarin-Speaking Children with Cerebral Palsy: Speech Subsystem Profiles

    ERIC Educational Resources Information Center

    Chen, Li-Mei; Hustad, Katherine C.; Kent, Ray D.; Lin, Yu Ching

    2018-01-01

    Purpose: This study explored the speech characteristics of Mandarin-speaking children with cerebral palsy (CP) and typically developing (TD) children to determine (a) how children in the 2 groups may differ in their speech patterns and (b) the variables correlated with speech intelligibility for words and sentences. Method: Data from 6 children…

  19. An evaluation of the effectiveness of PROMPT therapy in improving speech production accuracy in six children with cerebral palsy.

    PubMed

    Ward, Roslyn; Leitão, Suze; Strauss, Geoff

    2014-08-01

    This study evaluates perceptual changes in speech production accuracy in six children (3-11 years) with moderate-to-severe speech impairment associated with cerebral palsy before, during, and after participation in a motor-speech intervention program (Prompts for Restructuring Oral Muscular Phonetic Targets). An A1BCA2 single subject research design was implemented. Subsequent to the baseline phase (phase A1), phase B targeted each participant's first intervention priority on the PROMPT motor-speech hierarchy. Phase C then targeted one level higher. Weekly speech probes were administered, containing trained and untrained words at the two levels of intervention, plus an additional level that served as a control goal. The speech probes were analysed for motor-speech-movement-parameters and perceptual accuracy. Analysis of the speech probe data showed all participants recorded a statistically significant change. Between phases A1-B and B-C 6/6 and 4/6 participants, respectively, recorded a statistically significant increase in performance level on the motor speech movement patterns targeted during the training of that intervention. The preliminary data presented in this study make a contribution to providing evidence that supports the use of a treatment approach aligned with dynamic systems theory to improve the motor-speech movement patterns and speech production accuracy in children with cerebral palsy.

  20. Speech rate in Parkinson's disease: A controlled study.

    PubMed

    Martínez-Sánchez, F; Meilán, J J G; Carro, J; Gómez Íñiguez, C; Millian-Morell, L; Pujante Valverde, I M; López-Alburquerque, T; López, D E

    2016-09-01

    Speech disturbances will affect most patients with Parkinson's disease (PD) over the course of the disease. The origin and severity of these symptoms are of clinical and diagnostic interest. To evaluate the clinical pattern of speech impairment in PD patients and identify significant differences in speech rate and articulation compared to control subjects. Speech rate and articulation in a reading task were measured using an automatic analytical method. A total of 39 PD patients in the 'on' state and 45 age- and sex-matched asymptomatic controls participated in the study. None of the patients experienced dyskinesias or motor fluctuations during the test. The patients with PD displayed a significant reduction in speech and articulation rates; there were no significant correlations between the studied speech parameters and patient characteristics such as L-dopa dose, duration of the disorder, age, and UPDRS III scores and Hoehn & Yahr scales. Patients with PD show a characteristic pattern of declining speech rate. These results suggest that in PD, disfluencies are the result of the movement disorder affecting the physiology of speech production systems. Copyright © 2014 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
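
    For reference, the two rate measures contrasted above are conventionally defined as below; in practice the syllable counts and pause durations come from the automatic analysis, which is not reproduced here.

    ```python
    # Conventional definitions: speech rate includes pauses,
    # articulation rate excludes them.
    def speech_rate(n_syllables, total_duration_s):
        return n_syllables / total_duration_s                    # syll/s

    def articulation_rate(n_syllables, total_duration_s, pause_time_s):
        return n_syllables / (total_duration_s - pause_time_s)   # syll/s
    ```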

  2. Cortical activation patterns correlate with speech understanding after cochlear implantation

    PubMed Central

    Olds, Cristen; Pollonini, Luca; Abaya, Homer; Larky, Jannine; Loy, Megan; Bortfeld, Heather; Beauchamp, Michael S.; Oghalai, John S.

    2015-01-01

    Objectives Cochlear implants are a standard therapy for deafness, yet the ability of implanted patients to understand speech varies widely. To better understand this variability in outcomes, we used functional near-infrared spectroscopy (fNIRS) to image activity within regions of the auditory cortex and compare the results to behavioral measures of speech perception. Design We studied 32 deaf adults hearing through cochlear implants and 35 normal-hearing controls. We used fNIRS to measure responses within the lateral temporal lobe and the superior temporal gyrus to speech stimuli of varying intelligibility. The speech stimuli included normal speech, channelized speech (vocoded into 20 frequency bands), and scrambled speech (the 20 frequency bands were shuffled in random order). We also used environmental sounds as a control stimulus. Behavioral measures consisted of the Speech Reception Threshold, CNC words, and AzBio Sentence tests measured in quiet. Results Both control and implanted participants with good speech perception exhibited greater cortical activations to natural speech than to unintelligible speech. In contrast, implanted participants with poor speech perception had large, indistinguishable cortical activations to all stimuli. The ratio of cortical activation to normal speech to that of scrambled speech directly correlated with the CNC Words and AzBio Sentences scores. This pattern of cortical activation was not correlated with auditory threshold, age, side of implantation, or time after implantation. Turning off the implant reduced cortical activations in all implanted participants. Conclusions Together, these data indicate that the responses we measured within the lateral temporal lobe and the superior temporal gyrus correlate with behavioral measures of speech perception, demonstrating a neural basis for the variability in speech understanding outcomes after cochlear implantation. PMID:26709749

  3. Developing a Weighted Measure of Speech Sound Accuracy

    PubMed Central

    Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.

    2010-01-01

    Purpose The purpose is to develop a system for numerically quantifying a speaker’s phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, we describe a system for differentially weighting speech sound errors based on various levels of phonetic accuracy with a Weighted Speech Sound Accuracy (WSSA) score. We then evaluate the reliability and validity of this measure. Method Phonetic transcriptions are analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy is compared to existing measures, is used to discriminate typical and disordered speech production, and is evaluated to determine whether it is sensitive to changes in phonetic accuracy over time. Results Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners’ judgments of severity of a child’s speech disorder. The measure separates children with and without speech sound disorders. WSSA scores also capture growth in phonetic accuracy in toddler’s speech over time. Conclusion Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children’s speech. PMID:20699344
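
    A toy illustration of the central idea of differential weighting: each transcribed segment is scored by error type and the penalties are averaged into a 0-100 score. The categories and weights below are hypothetical and do not reproduce the published WSSA weighting.

    ```python
    # Toy weighted accuracy over transcription-based error labels.
    PENALTY = {                  # hypothetical penalties (0 = correct)
        "correct": 0.0,
        "distortion": 0.25,      # close approximation of the target
        "typical_sub": 0.5,      # common developmental substitution
        "atypical_sub": 0.75,    # unusual substitution
        "omission": 1.0,         # segment deleted entirely
    }

    def weighted_accuracy(labels):
        """labels: one error-type label per target phoneme attempted."""
        if not labels:
            raise ValueError("no segments scored")
        total = sum(PENALTY[label] for label in labels)
        return 100.0 * (1.0 - total / len(labels))

    # weighted_accuracy(["correct", "distortion", "omission"]) -> ~58.3
    ```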

  4. Factors Associated With Negative Attitudes Toward Speaking in Preschool-Age Children Who Do and Do Not Stutter.

    PubMed

    Groner, Stephen; Walden, Tedra; Jones, Robin

    2016-01-01

    This study explored relations between the negativity of children's speech-related attitudes as measured by the Communication Attitude Test for Preschool and Kindergarten Children Who Stutter (KiddyCAT; Vanryckeghem & Brutten, 2007) and (a) age; (b) caregiver reports of stuttering and its social consequences; (c) types of disfluencies; and (d) standardized speech, vocabulary, and language scores. Participants were 46 preschool-age children who stutter (CWS; 12 females, 34 males) and 66 preschool-age children who do not stutter (CWNS; 35 females, 31 males). After a conversation, children completed standardized tests and the KiddyCAT while their caregivers completed scales on observed stuttering behaviors and their consequences. The KiddyCAT scores of both the CWS and the CWNS were significantly negatively correlated with age. Both groups' KiddyCAT scores increased with higher scores on the Speech Fluency Rating Scale of the Test of Childhood Stuttering (Gillam, Logan, & Pearson, 2009). Repetitions were a significant contributor to the CWNS's KiddyCAT scores, but no specific disfluency significantly contributed to the CWS's KiddyCAT scores. Greater articulation errors were associated with higher KiddyCAT scores in the CWNS. No standardized test scores were associated with KiddyCAT scores in the CWS. Attitudes that speech is difficult are not associated with similar aspects of communication for CWS and CWNS. Age significantly contributed to negative speech attitudes for CWS, whereas age, repetitions, and articulation errors contributed to negative speech attitudes for CWNS.

  5. Conduction Aphasia, Sensory-Motor Integration, and Phonological Short-Term Memory--An Aggregate Analysis of Lesion and fMRI Data

    ERIC Educational Resources Information Center

    Buchsbaum, Bradley R.; Baldo, Juliana; Okada, Kayoko; Berman, Karen F.; Dronkers, Nina; D'Esposito, Mark; Hickok, Gregory

    2011-01-01

    Conduction aphasia is a language disorder characterized by frequent speech errors, impaired verbatim repetition, a deficit in phonological short-term memory, and naming difficulties in the presence of otherwise fluent and grammatical speech output. While traditional models of conduction aphasia have typically implicated white matter pathways,…

  6. Speech and Prosody Characteristics of Adolescents and Adults with High-Functioning Autism and Asperger Syndrome.

    ERIC Educational Resources Information Center

    Shriberg, Lawrence D.; Paul, Rhea; McSweeny, Jane L.; Klin, Ami; Cohen, Donald J.; Volkmar, Fred R.

    2001-01-01

    This study compared the speech and prosody-voice profiles for 30 male speakers with either high-functioning autism (HFA) or Asperger syndrome (AS), and 53 typically developing male speakers. Both HFA and AS groups had more residual articulation distortion errors and utterances coded as inappropriate for phrasing, stress, and resonance. AS speakers…

  7. Toward Tense as a Clinical Marker of Specific Language Impairment in English-Speaking Children.

    ERIC Educational Resources Information Center

    Rice, Mabel L.; Wexler, Kenneth

    1996-01-01

    Comparison of the speech of 37 preschool children with speech-language impairment (SLI), 40 language-matched children, and 45 age-matched children found that errors in a set of morphemes marking tense characterized the SLI children. Evidence supporting the use of these morphemes as clinical markers for SLI is offered. (DB)

  8. Phonological and Phonetic Marking of Information Status in Foreign Accent Syndrome

    ERIC Educational Resources Information Center

    Kuschmann, Anja; Lowit, Anja

    2012-01-01

    Background: Foreign Accent Syndrome (FAS) is a motor speech disorder in which a variety of segmental and suprasegmental errors lead to the perception of a new accent in speech. Whilst changes in intonation have been identified to contribute considerably to the perceived alteration in accent, research has rarely focused on how these changes impact…

  9. The effects of four variables on the intelligibility of synthesized sentences

    NASA Astrophysics Data System (ADS)

    Conroy, Carol; Raphael, Lawrence J.; Bell-Berti, Fredericka

    2003-10-01

    The experiments reported here examined the effects of four variables on the intelligibility of synthetic speech: (1) listener age, (2) listener experience, (3) speech rate, and (4) the presence versus absence of interword pauses. The stimuli, eighty IEEE-Harvard Sentences, were generated by a DynaVox augmentative/alternative communication device equipped with a DECtalk synthesizer. The sentences were presented to four groups of 12 listeners each: children (9-11 years), teens (14-16 years), young adults (20-25 years), and adults (38-45 years). In the first experiment the sentences were heard at four rates: 105, 135, 165, and 195 wpm; in the second experiment half of the sentences (presented at two rates: 135 and 165 wpm) contained 250 ms interword pauses. Conditions in both experiments were counterbalanced and no sentence was presented twice. Results indicated a consistent decrease in error rates with increased exposure to the synthesized speech for all age groups. Error rates also varied inversely with listener age. Effects of rate variation were inconsistent across listener groups and between experiments. The presence versus absence of pauses affected listener groups differently: the youngest listeners had higher error rates, and the older listeners lower error rates, when interword pauses were included in the stimuli. [Work supported by St. John's University and New York City Board of Education, Technology Solutions, District 75.]

  10. Does it really matter whether students' contributions are spoken versus typed in an intelligent tutoring system with natural language?

    PubMed

    D'Mello, Sidney K; Dowell, Nia; Graesser, Arthur

    2011-03-01

    There is the question of whether learning differs when students speak versus type their responses when interacting with intelligent tutoring systems with natural language dialogues. Theoretical bases exist for three contrasting hypotheses. The speech facilitation hypothesis predicts that spoken input will increase learning, whereas the text facilitation hypothesis predicts typed input will be superior. The modality equivalence hypothesis claims that learning gains will be equivalent. Previous experiments that tested these hypotheses were confounded by automated speech recognition systems with substantial error rates that were detected by learners. We addressed this concern in two experiments via a Wizard of Oz procedure, where a human intercepted the learner's speech and transcribed the utterances before submitting them to the tutor. The overall pattern of the results supported the following conclusions: (1) learning gains associated with spoken and typed input were on par and quantitatively higher than a no-intervention control, (2) participants' evaluations of the session were not influenced by modality, and (3) there were no modality effects associated with differences in prior knowledge and typing proficiency. Although the results generally support the modality equivalence hypothesis, highly motivated learners reported lower cognitive load and demonstrated increased learning when typing compared with speaking. We discuss the implications of our findings for intelligent tutoring systems that can support typed and spoken input.

  11. Status report on speech research. A report on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications

    NASA Astrophysics Data System (ADS)

    Liberman, A. M.

    1984-08-01

    This report (1 January-30 June) is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. Manuscripts cover the following topics: Sources of variability in early speech development; Invariance: Functional or descriptive?; Brief comments on invariance in phonetic perception; Phonetic category boundaries are flexible; On categorizing aphasic speech errors; Universal and language particular aspects of vowel-to-vowel coarticulation; Functionally specific articulatory cooperation following jaw perturbation during speech: Evidence for coordinative structures; Formant integration and the perception of nasal vowel height; Relative power of cues: F0 shifts vs. voice timing; Laryngeal management at utterance-internal word boundary in American English; Closure duration and release burst amplitude cues to stop consonant manner and place of articulation; Effects of temporal stimulus properties on perception of the /sl/-/spl/ distinction; The physics of controlled conditions: A reverie about locomotion; On the perception of intonation from sinusoidal sentences; Speech Perception; Speech Articulation; Motor Control; Speech Development.

  12. Speech recognition technology: an outlook for human-to-machine interaction.

    PubMed

    Erdel, T; Crooks, S

    2000-01-01

    Speech recognition, as an enabling technology in healthcare-systems computing, is a topic that has been discussed for quite some time, but is just now coming to fruition. Traditionally, speech-recognition software has been constrained by hardware, but improved processors and increased memory capacities are starting to remove some of these limitations. With these barriers removed, companies that create software for the healthcare setting have the opportunity to write more successful applications. Among the criticisms of speech-recognition applications are the high rates of error and steep training curves. However, even in the face of such negative perceptions, there remain significant opportunities for speech recognition to allow healthcare providers and, more specifically, physicians, to work more efficiently and ultimately spend more time with their patients and less time completing necessary documentation. This article will identify opportunities for inclusion of speech-recognition technology in the healthcare setting and examine major categories of speech-recognition software--continuous speech recognition, command and control, and text-to-speech. We will discuss the advantages and disadvantages of each area, the limitations of the software today, and how future trends might affect them.

  13. Radiological reporting that combines continuous speech recognition with error correction by transcriptionists.

    PubMed

    Ichikawa, Tamaki; Kitanosono, Takashi; Koizumi, Jun; Ogushi, Yoichi; Tanaka, Osamu; Endo, Jun; Hashimoto, Takeshi; Kawada, Shuichi; Saito, Midori; Kobayashi, Makiko; Imai, Yutaka

    2007-12-20

    We evaluated the usefulness of radiological reporting that combines continuous speech recognition (CSR) with error correction by transcriptionists. Four transcriptionists (two with more than 10 years' and two with less than 3 months' transcription experience) listened to the same 100 dictation files and created radiological reports using conventional transcription and a method that combined CSR with manual error correction by the transcriptionists. We compared the two experience groups and the two methods on accuracy and report creation time, and evaluated how strongly accuracy and creation time depended on the individual transcriptionist. We used a CSR system that did not require training the system to recognize the user's voice. We observed no significant difference in accuracy between the two groups or between the two methods, though transcriptionists with greater experience transcribed faster than those with less experience using conventional transcription. Using the combined method, error correction speed did not differ significantly between the two groups of transcriptionists with different levels of experience. Combining CSR and manual error correction by transcriptionists enabled convenient and accurate radiological reporting.

  14. Language-Specific Developmental Differences in Speech Production: A Cross-Language Acoustic Study

    ERIC Educational Resources Information Center

    Li, Fangfang

    2012-01-01

    Speech productions of 40 English- and 40 Japanese-speaking children (aged 2-5) were examined and compared with the speech produced by 20 adult speakers (10 speakers per language). Participants were recorded while repeating words that began with "s" and "sh" sounds. Clear language-specific patterns in adults' speech were found,…

  15. The prediction of speech intelligibility in classrooms using computer models

    NASA Astrophysics Data System (ADS)

    Dance, Stephen; Dentoni, Roger

    2005-04-01

    Two classrooms were measured and modeled using the industry-standard CATT model and the Web-based CISM model. Sound levels, reverberation times and speech intelligibility were predicted in these rooms using data for 7 octave bands. It was found that overall sound levels could be predicted to within 2 dB by both models. However, overall reverberation time was accurately predicted by CATT (14% prediction error) but not by CISM (41% prediction error); this compares with a 30% prediction error using classical theory. As for STI, CATT predicted within 11%, CISM within 3% and Sabine within 28% of the measured value. It should be noted that CISM took approximately 15 seconds to calculate, while CATT took 15 minutes. CISM is freely available on-line at www.whyverne.co.uk/acoustics/Pages/cism/cism.html
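    The percentage prediction errors quoted above are, presumably, relative deviations of predicted from measured values. A minimal sketch of that arithmetic (the numbers below are illustrative, not the study's data):

```python
def pct_error(predicted: float, measured: float) -> float:
    """Relative prediction error as a percentage of the measured value."""
    return abs(predicted - measured) / measured * 100

# Illustrative values only -- not the study's data.
measured_rt60 = 0.80                       # measured reverberation time (s)
print(pct_error(0.91, measured_rt60))      # ~14%, comparable to the CATT figure
print(pct_error(1.13, measured_rt60))      # ~41%, comparable to the CISM figure
```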

  16. Talker variability in audio-visual speech perception

    PubMed Central

    Heald, Shannon L. M.; Nusbaum, Howard C.

    2014-01-01

    A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories, and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker’s face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker’s face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker’s face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred. PMID:25076919

  17. Talker variability in audio-visual speech perception.

    PubMed

    Heald, Shannon L M; Nusbaum, Howard C

    2014-01-01

    A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories, and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition is improved under adverse listening (e.g., noise or distortion) conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred.

  18. Do age-related word retrieval difficulties appear (or disappear) in connected speech?

    PubMed

    Kavé, Gitit; Goral, Mira

    2017-09-01

    We conducted a comprehensive literature review of studies of word retrieval in connected speech in healthy aging and reviewed relevant aphasia research that could shed light on the aging literature. Four main hypotheses guided the review: (1) Significant retrieval difficulties would lead to reduced output in connected speech. (2) Significant retrieval difficulties would lead to a more limited lexical variety in connected speech. (3) Significant retrieval difficulties would lead to an increase in word substitution errors and in pronoun use as well as to greater dysfluency and hesitation in connected speech. (4) Retrieval difficulties on tests of single-word production would be associated with measures of word retrieval in connected speech. Studies on aging did not confirm these four hypotheses, unlike studies on aphasia that generally did. The review suggests that future research should investigate how context facilitates word production in old age.

  19. Objective support for subjective reports of successful inner speech in two people with aphasia.

    PubMed

    Hayward, William; Snider, Sarah F; Luta, George; Friedman, Rhonda B; Turkeltaub, Peter E

    2016-01-01

    People with aphasia frequently report being able to say a word correctly in their heads, even if they are unable to say that word aloud. It is difficult to know what is meant by these reports of "successful inner speech". We probe the experience of successful inner speech in two people with aphasia. We show that these reports are associated with correct overt speech and phonologically related nonword errors, that they relate to word characteristics associated with ease of lexical access but not ease of production, and that they predict whether or not individual words are relearned during anomia treatment. These findings suggest that reports of successful inner speech are meaningful and may be useful to study self-monitoring in aphasia, to better understand anomia, and to predict treatment outcomes. Ultimately, the study of inner speech in people with aphasia could provide critical insights that inform our understanding of normal language.

  20. [A modified speech enhancement algorithm for electronic cochlear implant and its digital signal processing realization].

    PubMed

    Wang, Yulin; Tian, Xuelong

    2014-08-01

    In order to improve the speech quality and auditory perception of electronic cochlear implants under strong background noise, a speech enhancement system for the electronic cochlear implant front-end was constructed. Built around a digital signal processor (DSP), the system combines the DSP's multi-channel buffered serial port (McBSP) data transmission channel with the extended audio interface chip TLV320AIC10, so that high-speed speech signal acquisition and output are realized. Meanwhile, because traditional speech enhancement methods suffer from poor adaptability, slow convergence and large steady-state error, the versiera function and the de-correlation principle were used to improve the existing adaptive filtering algorithm, which effectively enhanced the quality of voice communication. Test results verified the stability of the system and the de-noising performance of the algorithm, and showed that they could provide clearer speech signals for deaf or tinnitus patients.
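    The abstract does not spell out the modified algorithm. One common reading is a variable-step-size LMS filter whose step size follows a versiera (witch of Agnesi) shaped curve of the instantaneous error, so adaptation is fast when the error is large and gentle near convergence. The sketch below makes that assumption; the function and parameter choices are illustrative, not the paper's design:

```python
import numpy as np

def versiera_mu(e, mu_max=0.05, a=0.5):
    """Step size from a versiera-shaped curve of the error: near zero for
    small |e| (low steady-state misadjustment), approaching mu_max for
    large |e| (fast initial convergence)."""
    return mu_max * e**2 / (e**2 + a**2)

def vss_lms(x, d, order=16, mu_max=0.05, a=0.5):
    """Variable-step-size LMS noise canceller: x is the noise reference,
    d the noisy speech; the error signal e is the enhanced speech estimate."""
    w = np.zeros(order)
    out = np.zeros(len(x))
    for n in range(order, len(x)):
        u = x[n - order:n][::-1]            # most recent reference samples
        e = d[n] - w @ u                    # instantaneous output error
        w += versiera_mu(e, mu_max, a) * e * u
        out[n] = e
    return out

# Toy demo: a clean tone buried in correlated noise.
rng = np.random.default_rng(0)
noise = rng.normal(size=8000)
clean = np.sin(2 * np.pi * 0.01 * np.arange(8000))
noisy = clean + np.convolve(noise, [0.6, 0.3], mode="same")
enhanced = vss_lms(noise, noisy)
```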

  1. Neural mechanisms underlying auditory feedback control of speech

    PubMed Central

    Reilly, Kevin J.; Guenther, Frank H.

    2013-01-01

    The neural substrates underlying auditory feedback control of speech were investigated using a combination of functional magnetic resonance imaging (fMRI) and computational modeling. Neural responses were measured while subjects spoke monosyllabic words under two conditions: (i) normal auditory feedback of their speech, and (ii) auditory feedback in which the first formant frequency of their speech was unexpectedly shifted in real time. Acoustic measurements showed compensation to the shift within approximately 135 ms of onset. Neuroimaging revealed increased activity in bilateral superior temporal cortex during shifted feedback, indicative of neurons coding mismatches between expected and actual auditory signals, as well as right prefrontal and Rolandic cortical activity. Structural equation modeling revealed increased influence of bilateral auditory cortical areas on right frontal areas during shifted speech, indicating that projections from auditory error cells in posterior superior temporal cortex to motor correction cells in right frontal cortex mediate auditory feedback control of speech. PMID:18035557

  2. Lip movements affect infants' audiovisual speech perception.

    PubMed

    Yeung, H Henny; Werker, Janet F

    2013-05-01

    Speech is robustly audiovisual from early in infancy. Here we show that audiovisual speech perception in 4.5-month-old infants is influenced by sensorimotor information related to the lip movements they make while chewing or sucking. Experiment 1 consisted of a classic audiovisual matching procedure, in which two simultaneously displayed talking faces (visual [i] and [u]) were presented with a synchronous vowel sound (audio /i/ or /u/). Infants' looking patterns were selectively biased away from the audiovisual matching face when the infants were producing lip movements similar to those needed to produce the heard vowel. Infants' looking patterns returned to those of a baseline condition (no lip movements, looking longer at the audiovisual matching face) when they were producing lip movements that did not match the heard vowel. Experiment 2 confirmed that these sensorimotor effects interacted with the heard vowel, as looking patterns differed when infants produced these same lip movements while seeing and hearing a talking face producing an unrelated vowel (audio /a/). These findings suggest that the development of speech perception and speech production may be mutually informative.

  3. Sound Localization and Speech Perception in Noise of Pediatric Cochlear Implant Recipients: Bimodal Fitting Versus Bilateral Cochlear Implants.

    PubMed

    Choi, Ji Eun; Moon, Il Joon; Kim, Eun Yeon; Park, Hee-Sung; Kim, Byung Kil; Chung, Won-Ho; Cho, Yang-Sun; Brown, Carolyn J; Hong, Sung Hwa

    The aim of this study was to compare binaural performance on an auditory localization task and a speech-perception-in-babble measure between children who use a cochlear implant (CI) in one ear and a hearing aid (HA) in the other (bimodal fitting) and those who use bilateral CIs. Thirteen children (mean age ± SD = 10 ± 2.9 years) with bilateral CIs and 19 children with bimodal fitting were recruited to participate. Sound localization was assessed using a 13-loudspeaker array in a quiet sound-treated booth. Speakers were placed in an arc from -90° azimuth to +90° azimuth (15° intervals) in the horizontal plane. To assess the accuracy of sound location identification, we calculated the absolute error in degrees between the target speaker and the response speaker during each trial. The mean absolute error was computed by dividing the sum of absolute errors by the total number of trials. We also calculated the hemifield identification score to reflect the accuracy of right/left discrimination. Speech-in-babble perception was also measured in the sound field using target speech presented from the front speaker. Eight-talker babble was presented in four different listening conditions: from the front speaker (0°), from one of the two side speakers (+90° or -90°), and from both side speakers (±90°). A speech, spatial, and quality of hearing questionnaire was administered. When the two groups of children were directly compared, there was no significant difference in localization accuracy or hemifield identification score under the binaural condition. Performance on the speech perception test was also similar under most babble conditions. However, when the babble came from the first device side (the CI side for children with bimodal stimulation or the first CI side for children with bilateral CIs), speech understanding in babble by bilateral CI users was significantly better than that by bimodal listeners. Speech, spatial, and quality scores were comparable between the two groups. Overall, binaural performance was similar between children fit with two CIs (CI + CI) and those who use bimodal stimulation (HA + CI) in most conditions. However, the bilateral CI group showed better speech perception than the bimodal group when babble came from the first device side. Therefore, if bimodal performance is significantly below the mean bilateral CI performance on speech perception in babble, these results suggest that transitioning the child from bimodal stimulation to bilateral CIs should be considered.
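    The two localization metrics are simple enough to state in code. A minimal sketch of both, assuming (the abstract does not say) that 0° targets count as correct for hemifield scoring; the trial data are invented for illustration:

```python
def mean_absolute_error(targets_deg, responses_deg):
    """Mean absolute localization error: sum of per-trial absolute
    target-response differences divided by the number of trials."""
    errs = [abs(t - r) for t, r in zip(targets_deg, responses_deg)]
    return sum(errs) / len(errs)

def hemifield_score(targets_deg, responses_deg):
    """Proportion of trials where the response falls in the same left/right
    hemifield as the target (0-degree targets treated as always correct,
    which is an assumption)."""
    hits = sum(1 for t, r in zip(targets_deg, responses_deg)
               if t == 0 or t * r > 0)
    return hits / len(targets_deg)

# Invented trials: target and response azimuths in degrees.
targets =   [-90, -45, 0, 45, 90]
responses = [-75, -60, 15, 30, 90]
print(mean_absolute_error(targets, responses))  # 12.0 degrees
print(hemifield_score(targets, responses))      # 1.0
```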

  4. Sensorimotor speech disorders in Parkinson's disease: Programming and execution deficits

    PubMed Central

    Ortiz, Karin Zazo; Brabo, Natalia Casagrande; Minett, Thais Soares C.

    2016-01-01

    ABSTRACT Introduction: Dysfunction in the basal ganglia circuits is a determining factor in the physiopathology of the classic signs of Parkinson's disease (PD), and hypokinetic dysarthria is commonly related to PD. Regarding speech disorders associated with PD, the latest four-level framework of speech complicates the traditional view of dysarthria as a motor execution disorder. Based on findings that dysfunctions in the basal ganglia can cause speech disorders, and on the premise that the speech deficits seen in PD are related not to an execution motor disorder alone but also to a disorder at the motor programming level, the main objective of this study was to investigate the presence of sensorimotor programming disorders (besides the execution disorders previously described) in PD patients. Methods: A cross-sectional study was conducted in a sample of 60 adults matched for gender, age and education: 30 adult patients diagnosed with idiopathic PD (PDG) and 30 healthy adults (CG). All types of articulation errors were reanalyzed to investigate the nature of these errors. Interjections, hesitations and repetitions of words or sentences (during discourse) were considered typical disfluencies; blocking and episodes of palilalia (words or syllables) were analyzed as atypical disfluencies. We analysed features including successive self-initiated trials, phoneme distortions, self-correction, repetition of sounds and syllables, prolonged movement transitions, and additions or omissions of sounds and syllables, in order to identify programming and/or execution failures. Orofacial agility was also investigated. Results: The PDG had worse performance on all sensorimotor speech tasks. All PD patients had hypokinetic dysarthria. Conclusion: The clinical characteristics found suggest both execution and programming sensorimotor speech disorders in PD patients. PMID:29213457

  5. Speech Recognition for Medical Dictation: Overview in Quebec and Systematic Review.

    PubMed

    Poder, Thomas G; Fisette, Jean-François; Déry, Véronique

    2018-04-03

    Speech recognition is increasingly used in medical reporting. The aim of this article is to identify in the literature the strengths and weaknesses of this technology, as well as barriers to and facilitators of its implementation. A systematic review of systematic reviews was performed using PubMed, Scopus, the Cochrane Library and the Center for Reviews and Dissemination through August 2017. The gray literature has also been consulted. The quality of systematic reviews has been assessed with the AMSTAR checklist. The main inclusion criterion was use of speech recognition for medical reporting (front-end or back-end). A survey has also been conducted in Quebec, Canada, to identify the dissemination of this technology in this province, as well as the factors leading to the success or failure of its implementation. Five systematic reviews were identified. These reviews indicated a high level of heterogeneity across studies. The quality of the studies reported was generally poor. Speech recognition is not as accurate as human transcription, but it can dramatically reduce turnaround times for reporting. In front-end use, medical doctors need to spend more time on dictation and correction than required with human transcription. With speech recognition, major errors occur up to three times more frequently. In back-end use, a potential increase in productivity of transcriptionists was noted. In conclusion, speech recognition offers several advantages for medical reporting. However, these advantages are countered by an increased burden on medical doctors and by risks of additional errors in medical reports. It is also hard to identify for which medical specialties and which clinical activities the use of speech recognition will be the most beneficial.

  6. Effects and modeling of phonetic and acoustic confusions in accented speech.

    PubMed

    Fung, Pascale; Liu, Yi

    2005-11-01

    Accented speech recognition is more challenging than standard speech recognition due to the effects of phonetic and acoustic confusions. Phonetic confusion in accented speech occurs when an expected phone is pronounced as a different one, which leads to erroneous recognition. Acoustic confusion occurs when the pronounced phone lies acoustically between two baseform models and can be equally recognized as either one. We propose that it is necessary to analyze and model these confusions separately in order to improve accented speech recognition without degrading standard speech recognition. Since low phonetic confusion units in accented speech do not give rise to automatic speech recognition errors, we focus on analyzing and reducing phonetic and acoustic confusability under high phonetic confusion conditions. We propose using a likelihood ratio test to measure phonetic confusion and an asymmetric acoustic distance to measure acoustic confusion. Only accent-specific phonetic units with low acoustic confusion are used in an augmented pronunciation dictionary, while phonetic units with high acoustic confusion are reconstructed using decision tree merging. Experimental results show that our approach is effective and superior to methods modeling phonetic confusion or acoustic confusion alone in accented speech, with a significant 5.7% absolute WER reduction, without degrading standard speech recognition.
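    A hedged sketch of the likelihood-ratio idea: score the frames of a phone segment under the expected baseform model and under a competing accented variant, and treat a ratio favouring the competitor as evidence of phonetic confusion. The diagonal Gaussians below stand in for the paper's trained acoustic models and are purely illustrative:

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood_ratio(frames, model_expected, model_competitor):
    """Sum of per-frame log-likelihood differences; positive values mean
    the competitor phone model fits the segment better than the baseform."""
    return model_competitor.logpdf(frames).sum() - model_expected.logpdf(frames).sum()

# Toy 2-D "acoustic" models standing in for trained phone models.
expected   = multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2))
competitor = multivariate_normal(mean=[1.5, 0.5], cov=np.eye(2))
frames = np.random.default_rng(0).normal(loc=[1.2, 0.4], scale=0.3, size=(20, 2))
print(log_likelihood_ratio(frames, expected, competitor))  # > 0: likely confusion
```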

  7. Five-year speech and language outcomes in children with cleft lip-palate.

    PubMed

    Prathanee, Benjamas; Pumnum, Tawitree; Seepuaham, Cholada; Jaiyong, Pechcharat

    2016-10-01

    To investigate 5-year speech and language outcomes in children with cleft lip/palate (CLP). Thirty-eight children aged 4 years to 7 years and 8 months were recruited for this study. Speech abilities, including articulation, resonance, voice, and intelligibility, were assessed based on the Thai Universal Parameters of Speech Outcomes. Language ability was assessed with the Language Screening Test. The findings revealed that children with clefts had speech and language delay, abnormal understandability, resonance abnormality, voice disturbance, and articulation defects at rates of 8.33% (1.75, 22.47), 50.00% (32.92, 67.08), 36.11% (20.82, 53.78), 30.56% (16.35, 48.11), and 94.44% (81.34, 99.32), respectively. Articulation errors were the most common speech and language defects in children with clefts, followed by abnormal understandability, resonance abnormality, and voice disturbance. These results should be of critical concern. Protocol review and early intervention programs are needed to improve speech outcomes. Copyright © 2016 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.

  8. Dysfluencies in the speech of adults with intellectual disabilities and reported speech difficulties.

    PubMed

    Coppens-Hofman, Marjolein C; Terband, Hayo R; Maassen, Ben A M; van Schrojenstein Lantman-De Valk, Henny M J; van Zaalen-op't Hof, Yvonne; Snik, Ad F M

    2013-01-01

    In individuals with an intellectual disability, speech dysfluencies are more common than in the general population. In clinical practice, these fluency disorders are generally diagnosed and treated as stuttering rather than cluttering. Our aim was to characterise the type of dysfluencies in adults with intellectual disabilities and reported speech difficulties, with an emphasis on manifestations of stuttering and cluttering, a distinction intended to help optimise treatment aimed at improving fluency and intelligibility. The dysfluencies in the spontaneous speech of 28 adults (18-40 years; 16 men) with mild and moderate intellectual disabilities (IQs 40-70), who were characterised as poorly intelligible by their caregivers, were analysed using the speech norms for typically developing adults and children. The speakers were subsequently assigned to different diagnostic categories by relating their resulting dysfluency profiles to mean articulatory rate and articulatory rate variability. Twenty-two (75%) of the participants showed clinically significant dysfluencies, of which 21% were classified as cluttering, 29% as cluttering-stuttering and 25% as clear cluttering at normal articulatory rate. The characteristic pattern of stuttering did not occur. The dysfluencies in the speech of adults with intellectual disabilities and poor intelligibility show patterns that are specific to this population. Together, the results suggest that in this specific group of dysfluent speakers, interventions should be aimed at cluttering rather than stuttering. The reader will be able to (1) describe patterns of dysfluencies in the speech of adults with intellectual disabilities that are specific to this group of people, (2) explain that a high rate of dysfluencies in speech is potentially a major determiner of poor intelligibility in adults with ID, and (3) describe suggestions for intervention focusing on cluttering rather than stuttering in dysfluent speakers with ID. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Accountability Steps for Highly Reluctant Speech: Tiered-Services Consultation in a Head Start Classroom

    ERIC Educational Resources Information Center

    Howe, Heather; Barnett, David

    2013-01-01

    This consultation description reports parent and teacher problem solving for a preschool child with no typical speech directed to teachers or peers, and, by parent report, normal speech at home. This child's initial pattern of speech was similar to selective mutism, a low-incidence disorder often first detected during the preschool years, but…

  10. Error Monitoring in Speech Production: A Computational Test of the Perceptual Loop Theory.

    ERIC Educational Resources Information Center

    Hartsuiker, Robert J.; Kolk, Herman H. J.

    2001-01-01

    Tested whether an elaborated version of the perceptual loop theory (W. Levelt, 1983) and the main interruption rule was consistent with existing time course data (E. Blackmer and E. Mitton, 1991; C. Oomen and A. Postma, in press). The study suggests that including an inner loop through the speech comprehension system generates predictions that fit…

  11. A Perceptual and Electropalatographic Study of /Esh/ in Young People with Down's Syndrome

    ERIC Educational Resources Information Center

    Timmins, Claire; Cleland, Joanne; Wood, Sara E.; Hardcastle, William J.; Wishart, Jennifer G.

    2009-01-01

    Speech production in young people with Down's syndrome has been found to be variable and inconsistent. Errors tend to be more in the production of sounds that typically develop later, for example, fricatives and affricates, rather than stops and nasals. It has been suggested that inconsistency in production is a result of a motor speech deficit.…

  12. Investigating the Inner Speech of People Who Stutter: Evidence for (and against) the Covert Repair Hypothesis

    ERIC Educational Resources Information Center

    Brocklehurst, Paul H.; Corley, Martin

    2011-01-01

    In their Covert Repair Hypothesis, Postma and Kolk (1993) suggest that people who stutter make greater numbers of phonological encoding errors, which are detected during the monitoring of inner speech and repaired, with stuttering-like disfluencies as a consequence. Here, we report an experiment that documents the frequency with which such errors…

  13. Development of speech perception and production in children with cochlear implants.

    PubMed

    Kishon-Rabin, Liat; Taitelbaum, Riki; Muchnik, Chava; Gehtler, Inbal; Kronenberg, Jona; Hildesheimer, Minka

    2002-05-01

    The purpose of the present study was twofold: 1) to compare the hierarchy of perceived and produced significant speech pattern contrasts in children with cochlear implants, and 2) to compare this hierarchy to developmental data of children with normal hearing. The subjects included 35 prelingual hearing-impaired children with multichannel cochlear implants. The test materials were the Hebrew Speech Pattern Contrast (HeSPAC) test and the Hebrew Picture Speech Pattern Contrast (HePiSPAC) test for older and younger children, respectively. The results show that 1) auditory speech perception performance of children with cochlear implants reaches an asymptote at 76% (after correction for guessing) between 4 and 6 years of implant use; 2) all implant users perceived vowel place extremely well immediately after implantation; 3) most implanted children perceived initial voicing at chance level until 2 to 3 years after implantation, after which scores improved by 60% to 70% with implant use; 4) the hierarchy of phonetic-feature production paralleled that of perception: vowels first, voicing last, and manner and place of articulation in between; and 5) the hierarchy in speech pattern contrast perception and production was similar between the implanted and the normal-hearing children, with the exception of the vowels (possibly because of the interaction between the specific information provided by the implant device and the acoustics of the Hebrew language). The data reported here contribute to our current knowledge about the development of phonological contrasts in children who were deprived of sound in the first few years of their lives and then developed phonetic representations via cochlear implants. The data also provide additional insight into the interrelated skills of speech perception and production.

  14. Differential Gaze Patterns on Eyes and Mouth During Audiovisual Speech Segmentation

    PubMed Central

    Lusk, Laina G.; Mitchel, Aaron D.

    2016-01-01

    Speech is inextricably multisensory: both auditory and visual components provide critical information for all aspects of speech processing, including speech segmentation, the visual components of which have been the target of a growing number of studies. In particular, a recent study (Mitchel and Weiss, 2014) established that adults can utilize facial cues (i.e., visual prosody) to identify word boundaries in fluent speech. The current study expanded upon these results, using an eye tracker to identify highly attended facial features of the audiovisual display used in Mitchel and Weiss (2014). Subjects spent the most time watching the eyes and mouth. A significant trend in gaze durations was found with the longest gaze duration on the mouth, followed by the eyes and then the nose. In addition, eye-gaze patterns changed across familiarization as subjects learned the word boundaries, showing decreased attention to the mouth in later blocks while attention on other facial features remained consistent. These findings highlight the importance of the visual component of speech processing and suggest that the mouth may play a critical role in visual speech segmentation. PMID:26869959

  15. Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

    NASA Astrophysics Data System (ADS)

    Mapp, Peter

    2002-11-01

    Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported that indicates that STI is neither as flawless nor as robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported, where errors of up to 50% are found. The implications for VA and PA system performance verification are discussed.

  16. Effects of irrelevant sounds on phonological coding in reading comprehension and short-term memory.

    PubMed

    Boyle, R; Coltheart, V

    1996-05-01

    The effects of irrelevant sounds on reading comprehension and short-term memory were studied in two experiments. In Experiment 1, adults judged the acceptability of written sentences during irrelevant speech, accompanied and unaccompanied singing, instrumental music, and in silence. Sentences varied in syntactic complexity: Simple sentences contained a right-branching relative clause (The applause pleased the woman that gave the speech) and syntactically complex sentences included a centre-embedded relative clause (The hay that the farmer stored fed the hungry animals). Unacceptable sentences either sounded acceptable (The dog chased the cat that eight up all his food) or did not (The man praised the child that sight up his spinach). Decision accuracy was impaired by syntactic complexity but not by irrelevant sounds. Phonological coding was indicated by increased errors on unacceptable sentences that sounded correct. These error rates were unaffected by irrelevant sounds. Experiment 2 examined effects of irrelevant sounds on ordered recall of phonologically similar and dissimilar word lists. Phonological similarity impaired recall. Irrelevant speech reduced recall but did not interact with phonological similarity. The results of these experiments question assumptions about the relationship between speech input and phonological coding in reading and the short-term store.

  17. A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation

    NASA Astrophysics Data System (ADS)

    Wu, Bo; Yang, Minglei; Li, Kehuang; Huang, Zhen; Siniscalchi, Sabato Marco; Wang, Tong; Lee, Chin-Hui

    2017-12-01

    A reverberation-time-aware deep-neural-network (DNN)-based multi-channel speech dereverberation framework is proposed to handle a wide range of reverberation times (RT60s). There are three key steps in designing a robust system. First, to accomplish simultaneous speech dereverberation and beamforming, we propose a framework, namely DNNSpatial, that selectively concatenates log-power spectral (LPS) input features of reverberant speech from multiple microphones in an array and maps them into the expected output LPS features of anechoic reference speech using a single DNN. Next, the temporal auto-correlation function of received signals at different RT60s is investigated to show that RT60-dependent temporal-spatial contexts in feature selection are needed in the DNNSpatial training stage in order to optimize system performance in diverse reverberant environments. Finally, the RT60 is estimated to select the proper temporal and spatial contexts before feeding the log-power spectrum features to the trained DNNs for speech dereverberation. The experimental evidence gathered in this study indicates that the proposed framework outperforms the state-of-the-art signal processing dereverberation algorithm weighted prediction error (WPE) and conventional DNNSpatial systems that do not take reverberation time into account, even in extremely weak and severe reverberant conditions. The proposed technique generalizes well to unseen room sizes, array geometries and loudspeaker positions, and is robust to reverberation time estimation error.
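    The input assembly step is the part most easily shown compactly. Below is a minimal illustration of RT60-dependent context selection and multi-microphone LPS feature stacking; the context breakpoints, array size and feature dimensions are invented for the example and are not the paper's values:

```python
import numpy as np

def rt60_context(rt60):
    """Choose a temporal context (in frames) from the estimated RT60,
    mirroring the idea that longer reverberation needs a wider input window.
    Breakpoints are illustrative only."""
    if rt60 < 0.3:
        return 3
    if rt60 < 0.6:
        return 5
    return 7

def stack_features(lps, frame, context):
    """Concatenate log-power-spectral features across microphones and a
    symmetric temporal window; lps has shape (mics, frames, bins)."""
    lo, hi = frame - context // 2, frame + context // 2 + 1
    return lps[:, lo:hi, :].reshape(-1)      # one DNN input vector

mics, frames, bins = 4, 100, 257
lps = np.random.default_rng(1).normal(size=(mics, frames, bins))
x = stack_features(lps, frame=50, context=rt60_context(0.5))
print(x.shape)   # (4 * 5 * 257,) = (5140,)
```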

  18. Across-site patterns of modulation detection: Relation to speech recognitiona)

    PubMed Central

    Garadat, Soha N.; Zwolan, Teresa A.; Pfingst, Bryan E.

    2012-01-01

    The aim of this study was to identify across-site patterns of modulation detection thresholds (MDTs) in subjects with cochlear implants and to determine if removal of sites with the poorest MDTs from speech processor programs would result in improved speech recognition. Five hundred millisecond trains of symmetric-biphasic pulses were modulated sinusoidally at 10 Hz and presented at a rate of 900 pps using monopolar stimulation. Subjects were asked to discriminate a modulated pulse train from an unmodulated pulse train for all electrodes in quiet and in the presence of an interleaved unmodulated masker presented on the adjacent site. Across-site patterns of masked MDTs were then used to construct two 10-channel MAPs such that one MAP consisted of sites with the best masked MDTs and the other MAP consisted of sites with the worst masked MDTs. Subjects’ speech recognition skills were compared when they used these two different MAPs. Results showed that MDTs were variable across sites and were elevated in the presence of a masker by various amounts across sites. Better speech recognition was observed when the processor MAP consisted of sites with best masked MDTs, suggesting that temporal modulation sensitivity has important contributions to speech recognition with a cochlear implant. PMID:22559376
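    Constructing the two experimental MAPs amounts to ranking electrode sites by masked MDT and taking the extremes. A minimal sketch, with invented threshold values (lower, i.e. more negative, MDTs in dB indicating better modulation sensitivity):

```python
def build_maps(masked_mdts, n_sites=10):
    """Split electrode sites into 'best' and 'worst' 10-channel MAPs by
    masked modulation detection threshold (lower MDT = better sensitivity)."""
    ranked = sorted(masked_mdts, key=masked_mdts.get)   # most sensitive first
    return ranked[:n_sites], ranked[-n_sites:]

# Invented thresholds (dB) for 22 electrode sites -- illustration only.
mdts = {site: -25.0 + (site % 7) * 2.5 for site in range(1, 23)}
best_map, worst_map = build_maps(mdts)
print(best_map)    # the 10 most modulation-sensitive sites
print(worst_map)   # the 10 least sensitive sites
```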

  19. Profiles of verbal working memory growth predict speech and language development in children with cochlear implants.

    PubMed

    Kronenberger, William G; Pisoni, David B; Harris, Michael S; Hoen, Helena M; Xu, Huiping; Miyamoto, Richard T

    2013-06-01

    Verbal short-term memory (STM) and working memory (WM) skills predict speech and language outcomes in children with cochlear implants (CIs) even after conventional demographic, device, and medical factors are taken into account. However, prior research has focused on single end point outcomes as opposed to the longitudinal process of development of verbal STM/WM and speech-language skills. In this study, the authors investigated relations between profiles of verbal STM/WM development and speech-language development over time. Profiles of verbal STM/WM development were identified through the use of group-based trajectory analysis of repeated digit span measures over at least a 2-year time period in a sample of 66 children (ages 6-16 years) with CIs. Subjects also completed repeated assessments of speech and language skills during the same time period. Clusters representing different patterns of development of verbal STM (digit span forward scores) were related to the growth rate of vocabulary and language comprehension skills over time. Clusters representing different patterns of development of verbal WM (digit span backward scores) were related to the growth rate of vocabulary and spoken word recognition skills over time. Different patterns of development of verbal STM/WM capacity predict the dynamic process of development of speech and language skills in this clinical population.

  20. Discrimination of stress in speech and music: a mismatch negativity (MMN) study.

    PubMed

    Peter, Varghese; McArthur, Genevieve; Thompson, William Forde

    2012-12-01

    The aim of this study was to determine if duration-related stress in speech and music is processed in a similar way in the brain. To this end, we tested 20 adults for their abstract mismatch negativity (MMN) event-related potentials to two duration-related stress patterns: stress on the first syllable or note (long-short), and stress on the second syllable or note (short-long). A significant MMN was elicited for both speech and music except for the short-long speech stimulus. The long-short stimuli elicited larger MMN amplitudes for speech and music compared to short-long stimuli. An extra negativity-the late discriminative negativity (LDN)-was observed only for music. The larger MMN amplitude for long-short stimuli might be due to the familiarity of the stress pattern in speech and music. The presence of LDN for music may reflect greater long-term memory transfer for music stimuli. Copyright © 2012 Society for Psychophysiological Research.

  1. Codeswitching, Convergence and Compliance: The Development of Micro-Community Speech Norms.

    ERIC Educational Resources Information Center

    Burt, Susan Meredith

    1992-01-01

    In conversations between bilinguals, each of whom is a learner of the other's language, two different local patterns of codeswitching may emerge: compliance and mutual convergence. It is argued that a pattern of compliance is ultimately more accommodating than convergence, contrary to the claims of Speech Accommodation Theory. (20 references)…

  2. Disfluency Markers in L1 Attrition

    ERIC Educational Resources Information Center

    Schmid, Monika S.; Fagersten, Kristy Beers

    2010-01-01

    Based on an analysis of the speech of long-term emigres of German and Dutch origin, the present investigation discusses to what extent hesitation patterns in language attrition may be the result of the creation of an interlanguage system, on the one hand, or of language-internal attrition patterns on the other. We compare speech samples elicited…

  3. Maxillary dental arch dimensions in 6-year-old children with articulatory speech disorders.

    PubMed

    Heliövaara, Arja

    2011-01-01

    To evaluate maxillary dental arch dimensions in 6-year-old children with articulatory speech disorders and to compare their dental arch dimensions with age- and sex-matched controls without speech disorders. Fifty-two children (15 girls) with errors in the articulation of the sounds /r/, /s/ or /l/ were compared retrospectively with age- and sex-matched controls from dental casts taken at a mean age of 6.4 years (range 5.0-8.4). All children with articulatory speech disorders had been referred to City of Helsinki Health Care, Dental Care Department by a phoniatrician or a speech therapist in order to get oral-motor activators (removable palatal plates) to be used in their speech therapy. A χ2-test and paired Student's t tests were used in the statistical analyses. The children with articulatory speech disorders had similar maxillary dental arch widths but smaller maxillary dental arch length than the controls. This small series suggests that 6-year-old children with articulatory speech disorders may have decreased maxillary dental arch length. Copyright © 2011 S. Karger AG, Basel.

  4. Developing a weighted measure of speech sound accuracy.

    PubMed

    Preston, Jonathan L; Ramsdell, Heather L; Oller, D Kimbrough; Edwards, Mary Louise; Tobin, Stephen J

    2011-02-01

    To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound Accuracy (WSSA) score. The authors then evaluate the reliability and validity of this measure. Phonetic transcriptions were analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy was validated against existing measures, was used to discriminate typical and disordered speech production, and was evaluated to examine sensitivity to changes in phonetic accuracy over time. Reliability between transcribers and consistency of scores among different word sets and testing points are compared. Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners' judgments of the severity of a child's speech disorder. The measure separates children with and without speech sound disorders and captures growth in phonetic accuracy in toddlers' speech over time. The measure correlates highly across transcribers, word lists, and testing points. Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children's speech.
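    The published WSSA weights are not reproduced in this abstract, so the sketch below only illustrates the shape of such a measure: each attempted sound is scored by a weight reflecting its phonetic accuracy, and the score is the weighted proportion of credit earned. The categories and weights are hypothetical:

```python
def weighted_accuracy(productions, weights=None):
    """WSSA-style score: full credit for correct sounds, partial credit for
    errors in proportion to phonetic accuracy. Weights are hypothetical,
    not the published WSSA values."""
    weights = weights or {"correct": 1.0, "distortion": 0.75,
                          "substitution": 0.5, "omission": 0.0}
    scores = [weights[p] for p in productions]
    return 100 * sum(scores) / len(scores)

# One transcribed sample, coded by error type per target sound.
sample = ["correct", "correct", "distortion", "substitution", "omission"]
print(weighted_accuracy(sample))   # 65.0
```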

  5. A Near-Infrared Spectroscopy Study on Cortical Hemodynamic Responses to Normal and Whispered Speech in 3- to 7-Year-Old Children

    ERIC Educational Resources Information Center

    Remijn, Gerard B.; Kikuchi, Mitsuru; Yoshimura, Yuko; Shitamichi, Kiyomi; Ueno, Sanae; Tsubokawa, Tsunehisa; Kojima, Haruyuki; Higashida, Haruhiro; Minabe, Yoshio

    2017-01-01

    Purpose: The purpose of this study was to assess cortical hemodynamic response patterns in 3- to 7-year-old children listening to two speech modes: normally vocalized and whispered speech. Understanding whispered speech requires processing of the relatively weak, noisy signal, as well as the cognitive ability to understand the speaker's reason for…

  6. Measuring word complexity in speech screening: single-word sampling to identify phonological delay/disorder in preschool children.

    PubMed

    Anderson, Carolyn; Cohen, Wendy

    2012-01-01

    Children's speech sound development is assessed by comparing speech production with the typical development of speech sounds based on a child's age and developmental profile. One widely used method of sampling is to elicit a single-word sample along with connected speech. Words produced spontaneously rather than imitated may give a more accurate indication of a child's speech development. A published word complexity measure can be used to score later-developing speech sounds and more complex word patterns. There is a need for a screening word list that is quick to administer and reliably differentiates children with typically developing speech from children with patterns of delayed/disordered speech. To identify a short word list based on word complexity that could be spontaneously named by most typically developing children aged 3;00-5;05 years. One hundred and five children aged between 3;00 and 5;05 years from three local authority nursery schools took part in the study. Items from a published speech assessment were modified and extended to include a range of phonemic targets in different word positions in 78 monosyllabic and polysyllabic words. The 78 words were ranked both by phonemic/phonetic complexity as measured by word complexity and by ease of spontaneous production. The ten most complex words (hereafter Triage 10) were named spontaneously by more than 90% of the children. There was no significant difference between the complexity measures for five identified age groups when the data were examined in 6-month groups. A qualitative analysis revealed eight children with profiles of phonological delay or disorder. When these children were considered separately, there was a statistically significant difference (p < 0.005) between the mean word complexity measure of the group compared with the mean for the remaining children in all other age groups. The Triage 10 words reliably differentiated children with typically developing speech from those with delayed or disordered speech patterns. The Triage 10 words can be used as a screening tool for triage and general assessment and have the potential to monitor progress during intervention. Further testing is being undertaken to establish reliability with children referred to speech and language therapy services. © 2012 Royal College of Speech and Language Therapists.
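    The selection logic described above -- rank candidate words by complexity, then keep the most complex ones that nearly all typically developing children name spontaneously -- can be sketched briefly. All words, scores and rates below are hypothetical, not the published Triage 10 data:

```python
def triage_words(word_stats, min_naming_rate=0.90, n=10):
    """Keep the n most complex words that at least min_naming_rate of
    typically developing children named spontaneously.
    word_stats maps word -> (complexity_score, spontaneous_naming_rate)."""
    eligible = [(word, score) for word, (score, rate) in word_stats.items()
                if rate >= min_naming_rate]
    eligible.sort(key=lambda item: item[1], reverse=True)   # most complex first
    return [word for word, _ in eligible[:n]]

# Hypothetical entries -- the published word list and scores are not reproduced.
stats = {"helicopter": (9, 0.95), "spider": (6, 0.97),
         "watch": (4, 0.99), "caravan": (7, 0.85)}
print(triage_words(stats, n=2))   # ['helicopter', 'spider']
```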

  7. An experimental version of the MZT (speech-from-text) system with external F(sub 0) control

    NASA Astrophysics Data System (ADS)

    Nowak, Ignacy

    1994-12-01

    The version of the Polish MZT speech-from-text system described in this article extends the base system with additional functions that make it possible to enter commands in the edited orthographic text to control the phrase component and accentuation parameters. This makes it possible to generate a series of modified intonation contours in the texts spoken by the system. The effects obtained are made easier to control by a graphic illustration of the fundamental frequency pattern in the phrases last 'spoken' by the system. This version of the system was designed as a test prototype to help expand and refine the set of rules for automatic generation of intonation contours, which in turn will enable the fully automated speech-from-text system to generate speech with a more varied and precisely formed fundamental frequency pattern.

  8. Phonological awareness predicts activation patterns for print and speech

    PubMed Central

    Frost, Stephen J.; Landi, Nicole; Mencl, W. Einar; Sandak, Rebecca; Fulbright, Robert K.; Tejada, Eleanor T.; Jacobsen, Leslie; Grigorenko, Elena L.; Constable, R. Todd; Pugh, Kenneth R.

    2009-01-01

    Using fMRI, we explored the relationship between phonological awareness (PA), a measure of metaphonological knowledge of the segmental structure of speech, and brain activation patterns during processing of print and speech in young readers from six to ten years of age. Behavioral measures of PA were positively correlated with activation levels for print relative to speech tokens in superior temporal and occipito-temporal regions. Differences between print-elicited activation levels in superior temporal and inferior frontal sites were also correlated with PA measures with the direction of the correlation depending on stimulus type: positive for pronounceable pseudowords and negative for consonant strings. These results support and extend the many indications in the behavioral and neurocognitive literature that PA is a major component of skill in beginning readers and point to a developmental trajectory by which written language engages areas originally shaped by speech for learners on the path toward successful literacy acquisition. PMID:19306061

  9. Without his shirt off he saved the child from almost drowning: interpreting an uncertain input

    PubMed Central

    Frazier, Lyn; Clifton, Charles

    2014-01-01

    Unedited speech and writing often contain errors, e.g., the blending of alternative ways of expressing a message. As a result, comprehenders are faced with decisions about what the speaker may have intended, which may not be the same as the grammatically-licensed compositional interpretation of what was said. Two experiments investigated the comprehension of inputs that may have resulted from blending two syntactic forms. The results of the experiments suggest that readers and listeners tend to repair such utterances, restoring them to the presumed intended structure, and that they assign the interpretation of the corrected utterance. Utterances that are repaired are expected to also be acceptable when they are easy to diagnose/repair and when they are "familiar", i.e., they correspond to natural speech errors. The results established a continuum ranging from outright linguistic illusions with no indication that listeners and readers detected the error (the inclusion of almost in A passerby rescued a child from almost being run over by a bus.), to a majority of unblended interpretations for doubled quantifier sentences (Many students often turn in their assignments late), to unblended interpretations only a third of the time for doubled implicit negation (I just like the way the president looks without his shirt off.). The repair or speech error reversal account offered here is contrasted with the noisy channel approach (Gibson et al., 2013) and the good enough processing approach (Ferreira et al., 2002). PMID:25984551

  10. Obstructive sleep apnea severity estimation: Fusion of speech-based systems.

    PubMed

    Ben Or, D; Dafna, E; Tarasiuk, A; Zigel, Y

    2016-08-01

    Obstructive sleep apnea (OSA) is a common sleep-related breathing disorder. Previous studies associated OSA with anatomical abnormalities of the upper respiratory tract that may be reflected in the acoustic characteristics of speech. We tested the hypothesis that the speech signal carries essential information that can assist in early assessment of OSA severity by estimating apnea-hypopnea index (AHI). 198 men referred to routine polysomnography (PSG) were recorded shortly prior to sleep onset while reading a one-minute speech protocol. The different parts of the speech recordings, i.e., sustained vowels, short-time frames of fluent speech, and the speech recording as a whole, underwent separate analyses, using sustained vowels features, short-term features, and long-term features, respectively. Applying support vector regression and regression trees, these features were used in order to estimate AHI. The fusion of the outputs of the three subsystems resulted in a diagnostic agreement of 67.3% between the speech-estimated AHI and the PSG-determined AHI, and an absolute error rate of 10.8 events/hr. Speech signal analysis may assist in the estimation of AHI, thus allowing the development of a noninvasive tool for OSA screening.
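    The fusion step lends itself to a short sketch. The abstract does not give the fusion rule, so the code below uses a simple weighted average of the three subsystem AHI estimates; the equal weights and the example values are assumptions:

```python
def fuse_ahi_estimates(estimates, weights=(1/3, 1/3, 1/3)):
    """Fuse the three subsystem AHI estimates (sustained vowels, short-term
    frames, long-term features) into one score. A weighted average is only
    one plausible fusion rule; these equal weights are an assumption."""
    return sum(w * e for w, e in zip(weights, estimates))

# Illustrative subsystem outputs (events/hr) for one speaker.
print(fuse_ahi_estimates([18.0, 24.5, 21.1]))   # ~21.2 events/hr
```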

  11. Prevalence and Predictors of Persistent Speech Sound Disorder at Eight Years Old: Findings from a Population Cohort Study

    ERIC Educational Resources Information Center

    Wren, Yvonne; Miller, Laura L.; Peters, Tim J.; Emond, Alan; Roulstone, Sue

    2016-01-01

    Purpose: The purpose of this study was to determine prevalence and predictors of persistent speech sound disorder (SSD) in children aged 8 years after disregarding children presenting solely with common clinical distortions (i.e., residual errors). Method: Data from the Avon Longitudinal Study of Parents and Children (Boyd et al., 2012) were used.…

  12. Verbal Paradata and Survey Error: Respondent Speech, Voice, and Question-Answering Behavior Can Predict Income Item Nonresponse

    ERIC Educational Resources Information Center

    Jans, Matthew E.

    2010-01-01

    Income nonresponse is a significant problem in survey data, with rates as high as 50%, yet we know little about why it occurs. It is plausible that the way respondents answer survey questions (e.g., their voice and speech characteristics, and their question- answering behavior) can predict whether they will provide income data, and will reflect…

  13. Articulatory Placement for /t/, /d/, /k/ and /g/ Targets in School Age Children with Speech Disorders Associated with Cleft Palate

    ERIC Educational Resources Information Center

    Gibbon, Fiona; Ellis, Lucy; Crampin, Lisa

    2004-01-01

    This study used electropalatography (EPG) to identify place of articulation for lingual plosive targets /t/, /d/, /k/ and /g/ in the speech of 15 school age children with repaired cleft palate. Perceptual judgements indicated that all children had correct velar placement for /k/, /g/ targets, but /t/, /d/ targets were produced as errors involving…

  14. Use of the Response to Intervention Model for Remediation of Mild Articulation Errors by Speech-Language Pathologists in Indiana Public Schools

    ERIC Educational Resources Information Center

    Fritz-Ocock, Amy

    2016-01-01

    The role of the school speech-language pathologist (SLP) has recently evolved to reflect national trends of educational reform. In an era of accountability for all student learning, Response to Intervention (RTI) has become the predominant vehicle for providing preventative, intensified instruction to students at risk. School SLPs in Indiana have…

  15. Speech Fluency in Fragile X Syndrome

    ERIC Educational Resources Information Center

    Van Borsel, John; Dor, Orianne; Rondal, Jean

    2008-01-01

    The present study investigated the dysfluencies in the speech of nine French speaking individuals with fragile X syndrome. Type, number, and loci of dysfluencies were analysed. The study confirms that dysfluencies are a common feature of the speech of individuals with fragile X syndrome but also indicates that the dysfluency pattern displayed is…

  16. Hierarchical Spatiotemporal Dynamics of Speech Rhythm and Articulation

    ERIC Educational Resources Information Center

    Tilsen, Samuel Edward

    2009-01-01

    Hierarchy is one of the most important concepts in the scientific study of language. This dissertation aims to understand why we observe hierarchical structures in speech by investigating the cognitive processes from which they emerge. To that end, the dissertation explores how articulatory, rhythmic, and prosodic patterns of speech interact.…

  17. Speech Rhythm of Monolingual and Bilingual Children at age 2;6: Cantonese and English

    ERIC Educational Resources Information Center

    Mok, Peggy P. K.

    2013-01-01

    Previous studies have showed that at age 3;0, monolingual children acquiring rhythmically different languages display distinct rhythmic patterns while the speech rhythm patterns of the languages of bilingual children are more similar. It is unclear whether the same observations can be found for younger children, at 2;6. This study compared five…

  18. Machine Learning Through Signature Trees. Applications to Human Speech.

    ERIC Educational Resources Information Center

    White, George M.

    A signature tree is a binary decision tree used to classify unknown patterns. An attempt was made to develop a computer program for manipulating signature trees as a general research tool for exploring machine learning and pattern recognition. The program was applied to the problem of speech recognition to test its effectiveness for a specific…
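    As the abstract notes, a signature tree is a binary decision tree over pattern features. A minimal sketch of the data structure, using hypothetical acoustic-feature tests rather than anything from the original program:

```python
class SignatureNode:
    """One node of a signature tree: each internal node tests a single
    boolean feature; leaves hold class labels."""
    def __init__(self, feature=None, label=None, on_true=None, on_false=None):
        self.feature, self.label = feature, label
        self.on_true, self.on_false = on_true, on_false

    def classify(self, features):
        if self.label is not None:            # leaf: decision reached
            return self.label
        branch = self.on_true if features[self.feature] else self.on_false
        return branch.classify(features)

# Hypothetical feature tests for a three-phone toy problem.
tree = SignatureNode(
    feature="voiced",
    on_true=SignatureNode(label="/z/"),
    on_false=SignatureNode(feature="high_frequency_energy",
                           on_true=SignatureNode(label="/s/"),
                           on_false=SignatureNode(label="/t/")))
print(tree.classify({"voiced": False, "high_frequency_energy": True}))  # /s/
```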

  19. Automated Classification of Phonological Errors in Aphasic Language

    PubMed Central

    Ahuja, Sanjeev B.; Reggia, James A.; Berndt, Rita S.

    1984-01-01

    Using heuristically-guided state space search, a prototype program has been developed to simulate and classify phonemic errors occurring in the speech of neurologically-impaired patients. Simulations are based on an interchangeable rule/operator set of elementary errors which represent a theory of phonemic processing faults. This work introduces and evaluates a novel approach to error simulation and classification; it provides a prototype simulation tool for neurolinguistic research; and it forms the initial phase of a larger research effort involving computer modelling of neurolinguistic processes.
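
    The search-based classification idea lends itself to a compact illustration. Below is a minimal, hypothetical sketch of heuristically guided state-space search over elementary error operators (only substitution and deletion here; the paper's actual rule/operator set is richer): the cheapest operator sequence that maps the target form onto the observed production serves as the error classification. The phoneme inventory, operator labels, and costs are all illustrative assumptions.

    ```python
    import heapq

    # Illustrative elementary error operators; inventory and costs are assumptions.
    def successors(state):
        """Yield (next_state, operator_label, cost) for one elementary error."""
        for i in range(len(state)):
            for p in "ptkbdg":                       # toy phoneme inventory
                if p != state[i]:
                    yield state[:i] + p + state[i+1:], f"sub:{state[i]}->{p}", 1.0
            yield state[:i] + state[i+1:], f"del:{state[i]}", 1.5  # deletion

    def classify_error(target, observed, max_cost=5.0):
        """Best-first search for the cheapest operator sequence mapping
        target -> observed; that sequence is the error classification."""
        frontier = [(0.0, target, [])]
        seen = {}
        while frontier:
            cost, state, ops = heapq.heappop(frontier)
            if state == observed:
                return ops
            if seen.get(state, float("inf")) <= cost or cost > max_cost:
                continue
            seen[state] = cost
            for nxt, label, c in successors(state):
                heapq.heappush(frontier, (cost + c, nxt, ops + [label]))
        return None

    print(classify_error("tap", "kap"))  # -> ['sub:t->k']
    ```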

  20. A unified explanation for the diverse structural deviations reported for adult schizophrenics with disrupted speech.

    PubMed

    Chaika, E

    1982-06-01

    This paper attempts a unified explanation for the diverse manifestations of deviant speech considered pathognomonic for schizophrenia. Examination of the structure of such speech shows that what appear to be diverse errors are really manifestations of two problems: apparently random or erroneous triggering of sounds and words coupled with inappropriate perseverations. These are shown to be different manifestations of the same problem, possibly a schizophrenic dysfunction in neurotransmitters in the brain. Studies of hemispheric asymmetry in schizophrenia, involuntary eye tracking, and the probable action of antipsychotic medication confirm the linguistic data.

  1. Redistribution of neural phase coherence reflects establishment of feedforward map in speech motor adaptation

    PubMed Central

    Sengupta, Ranit

    2015-01-01

    Despite recent progress in our understanding of sensorimotor integration in speech learning, a comprehensive framework to investigate its neural basis is lacking at behaviorally relevant timescales. Structural and functional imaging studies in humans have helped us identify brain networks that support speech but fail to capture the precise spatiotemporal coordination within the networks that takes place during speech learning. Here we use neuronal oscillations to investigate interactions within speech motor networks in a paradigm of speech motor adaptation under altered feedback, with continuous recording of EEG, in which subjects adapted to real-time auditory perturbation of a target vowel sound. As subjects adapted to the task, concurrent changes were observed in theta-gamma phase coherence during speech planning at several distinct scalp regions, consistent with the establishment of a feedforward map. In particular, there was an increase in coherence over the central region and a decrease over the fronto-temporal regions, revealing a redistribution of coherence over an interacting network of brain regions that could be a general feature of error-based motor learning. Our findings have implications for understanding the neural basis of speech motor learning and could elucidate how transient breakdown of neuronal communication within speech networks relates to speech disorders. PMID:25632078
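
    The abstract does not specify its coherence estimator, but a common way to quantify theta-gamma coupling from a single EEG channel is a phase-amplitude modulation index. The sketch below (band edges and the Canolty-style index are assumptions, not the paper's exact method) shows the general shape of such a computation.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def bandpass(x, lo, hi, fs, order=4):
        b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, x)

    def theta_gamma_coupling(eeg, fs):
        """Modulation index: how strongly the gamma amplitude envelope
        is locked to theta phase, |mean(A_gamma * exp(i*phi_theta))|."""
        theta_phase = np.angle(hilbert(bandpass(eeg, 4, 8, fs)))
        gamma_env = np.abs(hilbert(bandpass(eeg, 30, 80, fs)))
        return np.abs(np.mean(gamma_env * np.exp(1j * theta_phase)))

    fs = 500
    t = np.arange(0, 10, 1 / fs)
    sig = np.sin(2 * np.pi * 6 * t) + 0.3 * np.random.randn(t.size)
    # Near zero here: the test signal contains no genuine theta-gamma coupling.
    print(theta_gamma_coupling(sig, fs))
    ```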

  2. Intonation and dialog context as constraints for speech recognition.

    PubMed

    Taylor, P; King, S; Isard, S; Wright, H

    1998-01-01

    This paper describes a way of using intonation and dialog context to improve the performance of an automatic speech recognition (ASR) system. Our experiments were run on the DCIEM Maptask corpus, a corpus of spontaneous task-oriented dialog speech. This corpus has been tagged according to a dialog analysis scheme that assigns each utterance to one of 12 "move types," such as "acknowledge," "query-yes/no" or "instruct." Most ASR systems use a bigram language model to constrain the possible sequences of words that might be recognized. Here we use a separate bigram language model for each move type. We show that when the "correct" move-specific language model is used for each utterance in the test set, the word error rate of the recognizer drops. Of course, when the recognizer is run on previously unseen data, it cannot know in advance what move type the speaker has just produced. To determine the move type we use an intonation model combined with a dialog model that puts constraints on possible sequences of move types, as well as the speech recognizer likelihoods for the different move-specific models. In the full recognition system, the combination of automatic move-type recognition with the move-specific language models reduces the overall word error rate by a small but significant amount when compared with a baseline system that does not take intonation or dialog acts into account. Interestingly, the word error improvement is restricted to "initiating" move types, where word recognition is important. In "response" move types, where the important information is conveyed by the move type itself--for example, positive versus negative response--there is no word error improvement, but recognition of the response types themselves is good. The paper discusses the intonation model, the language models, and the dialog model in detail and describes the architecture in which they are combined.
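
    As a rough illustration of the decoding idea, the sketch below combines a dialog-model prior over move types with per-move-type bigram language model scores and picks the best-scoring move. The toy vocabularies, probabilities, and function names are invented for illustration and are not the paper's models.

    ```python
    import math

    # Toy per-move-type bigram language models (probabilities are made up).
    move_lms = {
        "acknowledge": {("<s>", "right"): 0.4, ("right", "</s>"): 0.9},
        "instruct":    {("<s>", "go"): 0.3, ("go", "left"): 0.2,
                        ("left", "</s>"): 0.5},
    }

    def bigram_logprob(words, lm, floor=1e-4):
        """Log probability of a word sequence under one bigram LM."""
        toks = ["<s>"] + words + ["</s>"]
        return sum(math.log(lm.get((a, b), floor)) for a, b in zip(toks, toks[1:]))

    def best_move(words, dialog_prior):
        """Score each move type by dialog-model prior + move-specific LM."""
        scores = {m: math.log(dialog_prior[m]) + bigram_logprob(words, lm)
                  for m, lm in move_lms.items()}
        return max(scores, key=scores.get)

    print(best_move(["right"], {"acknowledge": 0.6, "instruct": 0.4}))
    ```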

  3. The motor theory of speech perception revisited.

    PubMed

    Massaro, Dominic W; Chen, Trevor H

    2008-04-01

    Galantucci, Fowler, and Turvey (2006) have claimed that perceiving speech is perceiving gestures and that the motor system is recruited for perceiving speech. We make the counterargument that perceiving speech is not perceiving gestures, that the motor system is not recruited for perceiving speech, and that speech perception can be adequately described by a prototypical pattern recognition model, the fuzzy logical model of perception (FLMP). Empirical evidence taken as support for gesture and motor theory is reconsidered in more detail and in the framework of the FLMP. Additional theoretical and logical arguments are made to challenge gesture and motor theory.

  4. The Effects of Peer Tutoring on University Students' Success, Speaking Skills and Speech Self-Efficacy in the Effective and Good Speech Course

    ERIC Educational Resources Information Center

    Uzuner Yurt, Serap; Aktas, Elif

    2016-01-01

    In this study, the effects of the use of peer tutoring in Effective and Good Speech Course on students' success, perception of speech self-efficacy and speaking skills were examined. The study, designed as a mixed pattern in which quantitative and qualitative research approaches were combined, was carried out together with 57 students in 2014 to…

  5. Frontal brain electrical activity (EEG) and heart rate in response to affective infant-directed (ID) speech in 9-month-old infants.

    PubMed

    Santesso, Diane L; Schmidt, Louis A; Trainor, Laurel J

    2007-10-01

    Many studies have shown that infants prefer infant-directed (ID) speech to adult-directed (AD) speech. ID speech functions to aid language learning, obtain and/or maintain an infant's attention, and create emotional communication between the infant and caregiver. We examined psychophysiological responses to ID speech that varied in affective content (i.e., love/comfort, surprise, fear) in a group of typically developing 9-month-old infants. Regional EEG and heart rate were collected continuously during stimulus presentation. We found that the pattern of overall frontal EEG power was linearly related to the affective intensity of the ID speech, such that EEG power was greatest in response to fear, followed by surprise, then love/comfort; this linear pattern was specific to the frontal region. We also noted that heart rate decelerated to ID speech independent of affective content. As well, infants who were reported by their mothers as temperamentally distressed tended to exhibit greater relative right frontal EEG activity during baseline and in response to affective ID speech, consistent with previous work with visual stimuli and extending it to the auditory modality. Findings are discussed in terms of how increases in frontal EEG power in response to different affective intensity may reflect the cognitive aspects of emotional processing across sensory domains in infancy.

  6. Effects of stimulus response compatibility on covert imitation of vowels.

    PubMed

    Adank, Patti; Nuttall, Helen; Bekkering, Harold; Maegherman, Gwijde

    2018-03-13

    When we observe someone else speaking, we tend to automatically activate the corresponding speech motor patterns. When listening, we therefore covertly imitate the observed speech. Simulation theories of speech perception propose that covert imitation of speech motor patterns supports speech perception. Covert imitation of speech has been studied with interference paradigms, including the stimulus-response compatibility paradigm (SRC). The SRC paradigm measures covert imitation by comparing articulation of a prompt following exposure to a distracter. Responses tend to be faster for congruent than for incongruent distracters; thus, showing evidence of covert imitation. Simulation accounts propose a key role for covert imitation in speech perception. However, covert imitation has thus far only been demonstrated for a select class of speech sounds, namely consonants, and it is unclear whether covert imitation extends to vowels. We aimed to demonstrate that covert imitation effects as measured with the SRC paradigm extend to vowels, in two experiments. We examined whether covert imitation occurs for vowels in a consonant-vowel-consonant context in visual, audio, and audiovisual modalities. We presented the prompt at four time points to examine how covert imitation varied over the distracter's duration. The results of both experiments clearly demonstrated covert imitation effects for vowels, thus supporting simulation theories of speech perception. Covert imitation was not affected by stimulus modality and was maximal for later time points.

  7. Speech-like orofacial oscillations in stump-tailed macaque (Macaca arctoides) facial and vocal signals.

    PubMed

    Toyoda, Aru; Maruhashi, Tamaki; Malaivijitnond, Suchinda; Koda, Hiroki

    2017-10-01

    Speech is unique to humans and characterized by facial actions of ∼5 Hz oscillations of lip, mouth or jaw movements. Lip-smacking, a facial display of primates characterized by oscillatory actions involving the vertical opening and closing of the jaw and lips, exhibits stable 5-Hz oscillation patterns, matching that of speech and suggesting that lip-smacking is a precursor of speech. We tested whether facial or vocal actions exhibiting the same oscillation rate are found across a wide range of facial and vocal displays in various social contexts, which differ among species. We observed facial and vocal actions of wild stump-tailed macaques (Macaca arctoides) and selected video clips including facial displays (teeth chattering; TC), panting calls, and feeding. Ten open-to-open mouth durations during TC and feeding and five amplitude peak-to-peak durations in panting were analyzed. The facial display (TC) and vocalization (panting) oscillated at 5.74 ± 1.19 and 6.71 ± 2.91 Hz, respectively, similar to the reported lip-smacking of long-tailed macaques and the speech of humans. These results indicate a common mechanism for the central pattern generator underlying orofacial movements, which would evolve toward speech. Similar oscillations in panting, which evolved from different muscular control than the orofacial actions, suggest a sensory foundation for perceptual saliency particular to 5-Hz rhythms in macaques. This supports the pre-adaptation hypothesis of speech evolution, which states that a central pattern generator for 5-Hz facial oscillation and a perceptual background tuned to 5-Hz actions existed in common ancestors of macaques and humans before the emergence of speech. © 2017 Wiley Periodicals, Inc.

  8. Cerebral specialization for speech production in persons with Down syndrome.

    PubMed

    Heath, M; Elliott, D

    1999-09-01

    The study of cerebral specialization in persons with Down syndrome (DS) has revealed an anomalous pattern of organization. Specifically, dichotic listening studies (e.g., Elliott & Weeks, 1993) have suggested a left ear/right hemisphere dominance for speech perception for persons with DS. In the current investigation, the cerebral dominance for speech production was examined using the mouth asymmetry technique. In right-handed, nonhandicapped subjects, mouth asymmetry methodology has shown that during speech, the right side of the mouth opens sooner and to a larger degree than the left side (Graves, Goodglass, & Landis, 1982). The phenomenon of right mouth asymmetry (RMA) is believed to reflect the direct access that the musculature on the right side of the face has to the left hemisphere's speech production systems. This direct access may facilitate the transfer of innervatory patterns to the muscles on the right side of the face. In the present study, the lateralization for speech production was investigated in 10 right-handed participants with DS and 10 nonhandicapped subjects. A RMA at the initiation and end of speech production occurred for subjects in both groups. Surprisingly, the degree of asymmetry between groups did not differ, suggesting that the lateralization of speech production is similar for persons with and persons without DS. These results support the biological dissociation model (Elliott, Weeks, & Elliott, 1987), which holds that persons with DS display a unique dissociation between speech perception (right hemisphere) and speech production (left hemisphere). Copyright 1999 Academic Press.

  9. The Relationship Between Speech Production and Speech Perception Deficits in Parkinson's Disease.

    PubMed

    De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet

    2016-10-01

    This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified through a standardized speech intelligibility assessment, acoustic analysis, and speech intensity measurements. Second, an overall estimation task and an intensity estimation task were addressed to evaluate overall speech perception and speech intensity perception, respectively. Finally, correlation analysis was performed between the speech characteristics of the overall estimation task and the corresponding acoustic analysis. The interaction between speech production and speech intensity perception was investigated by an intensity imitation task. Acoustic analysis and speech intensity measurements demonstrated significant differences in speech production between patients with PD and the HCs. A different pattern in the auditory perception of speech and speech intensity was found in the PD group. Auditory perceptual deficits may influence speech production in patients with PD. The present results suggest a disturbed auditory perception related to an automatic monitoring deficit in PD.

  10. An Elicited-Production Study of Inflectional Verb Morphology in Child Finnish

    ERIC Educational Resources Information Center

    Räsänen, Sanna H. M.; Ambridge, Ben; Pine, Julian M.

    2016-01-01

    Many generativist accounts (e.g., Wexler, 1998) argue for very early knowledge of inflection on the basis of very low rates of person/number marking errors in young children's speech. However, studies of Spanish (Aguado-Orea & Pine, 2015) and Brazilian Portuguese (Rubino & Pine, 1998) have revealed that these low overall error rates…

  11. Speech disorders in Israeli Arab children.

    PubMed

    Jaber, L; Nahmani, A; Shohat, M

    1997-10-01

    The aim of this work was to study the frequency of speech disorders in Israeli Arab children and its association with parental consanguinity. A questionnaire was sent to the parents of 1,495 Arab children attending kindergarten and the first two grades of the seven primary schools in the town of Taibe. Eighty-six percent (1,282 parents) responded. The answers to the questionnaire revealed that 25% of the children reportedly had a speech and language disorder. Of the children identified by their parents as having a speech disorder, 44 were selected randomly for examination by a speech specialist. The disorders noted in this subgroup included errors in articulation (48.0%), poor language (18%), poor voice quality (15.9%), stuttering (13.6%), and other problems (4.5%). Rates of affected children of consanguineous and non-consanguineous marriages were 31% and 22.4%, respectively (p < 0.01). We conclude that speech disorders are an important problem among Israeli Arab schoolchildren. More comprehensive programs are needed to facilitate diagnosis and treatment.

  12. Long term rehabilitation of a total glossectomy patient.

    PubMed

    Bachher, Gurmit Kaur; Dholam, Kanchan P

    2010-09-01

    Malignant tumours of the oral cavity that require resection of the tongue result in severe deficiencies in speech and deglutition. Speech misarticulation leads to loss of speech intelligibility, which can prevent or limit communication. Prosthodontic rehabilitation involves fabrication of a Palatal Augmentation Prosthesis (PAP) following partial glossectomy and a mandibular tongue prosthesis after total glossectomy [1]. Speech analysis of a total glossectomy patient rehabilitated with a tongue prosthesis was done with the help of Dr. Speech Software Version 4 (Tiger DRS, Inc., Seattle) twelve years after treatment. Speech therapy sessions along with the prosthesis helped him to correct the dental sounds by using the lower lip and upper dentures (labio-dentals). Speech intelligibility, intonation pattern, speech articulation, and overall loudness were noticeably improved.

  13. Speech endpoint detection with non-language speech sounds for generic speech processing applications

    NASA Astrophysics Data System (ADS)

    McClain, Matthew; Romanowski, Brian

    2009-05-01

    Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden-Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
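
    A minimal sketch of the segmentation machinery described here: per-class acoustic models (e.g., Gaussians over the acoustic features) supply frame log-likelihoods for LSS versus NLSS, and a two-state Viterbi pass enforces temporal continuity. The transition probability and the two-state reduction of the paper's HMM are simplifying assumptions.

    ```python
    import numpy as np

    def viterbi_two_state(loglik, stay=0.95):
        """loglik: (T, 2) frame log-likelihoods for [LSS, NLSS] from
        per-class acoustic models; returns the smoothed state path."""
        trans = np.log(np.array([[stay, 1 - stay], [1 - stay, stay]]))
        T = loglik.shape[0]
        delta = np.zeros((T, 2))
        back = np.zeros((T, 2), dtype=int)
        delta[0] = loglik[0]
        for t in range(1, T):
            scores = delta[t - 1][:, None] + trans   # (from_state, to_state)
            back[t] = scores.argmax(axis=0)
            delta[t] = scores.max(axis=0) + loglik[t]
        path = np.zeros(T, dtype=int)
        path[-1] = delta[-1].argmax()
        for t in range(T - 2, -1, -1):
            path[t] = back[t + 1, path[t + 1]]
        return path                                  # 0 = LSS, 1 = NLSS per frame

    rng = np.random.default_rng(0)
    print(viterbi_two_state(rng.normal(size=(10, 2))))
    ```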

  14. Adaptive plasticity in speech perception: Effects of external information and internal predictions.

    PubMed

    Guediche, Sara; Fiez, Julie A; Holt, Lori L

    2016-07-01

    When listeners encounter speech under adverse listening conditions, adaptive adjustments in perception can improve comprehension over time. In some cases, these adaptive changes require the presence of external information that disambiguates the distorted speech signals, whereas in other cases mere exposure is sufficient. Both external (e.g., written feedback) and internal (e.g., prior word knowledge) sources of information can be used to generate predictions about the correct mapping of a distorted speech signal. We hypothesize that these predictions provide a basis for determining the discrepancy between the expected and actual speech signal that can be used to guide adaptive changes in perception. This study provides the first empirical investigation that manipulates external and internal factors through (a) the availability of explicit external disambiguating information via the presence or absence of postresponse orthographic information paired with a repetition of the degraded stimulus, and (b) the accuracy of internally generated predictions; an acoustic distortion is introduced either abruptly or incrementally. The results demonstrate that the impact of external information on adaptive plasticity is contingent upon whether the intelligibility of the stimuli permits accurate internally generated predictions during exposure. External information sources enhance adaptive plasticity only when input signals are severely degraded and cannot reliably access internal predictions. This is consistent with a computational framework for adaptive plasticity in which error-driven supervised learning relies on the ability to compute sensory prediction error signals from both internal and external sources of information. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
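
    The error-driven supervised learning framework sketched in the conclusion can be illustrated with a toy delta rule: a linear perceptual map is nudged by the discrepancy between the predicted signal and the distorted input. Everything below (dimensionality, the diagonal distortion, the learning rate) is an illustrative assumption, not the authors' model.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    W = np.eye(3)                        # perceptual mapping (starts veridical)
    distort = np.diag([1.3, 0.8, 1.1])   # acoustic distortion applied to inputs

    lr = 0.1
    for _ in range(200):
        clean = rng.normal(size=3)       # intended speech features
        heard = distort @ clean          # degraded input
        predicted = clean                # prediction from internal/external info
        err = predicted - W @ heard      # sensory prediction error
        W += lr * np.outer(err, heard)   # supervised (delta-rule) update

    print(np.round(W @ distort, 2))      # ~identity: perception has adapted
    ```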

  15. Adaptive plasticity in speech perception: effects of external information and internal predictions

    PubMed Central

    Guediche, Sara; Fiez, Julie A.; Holt, Lori L.

    2016-01-01

    When listeners encounter speech under adverse listening conditions, adaptive adjustments in perception can improve comprehension over time. In some cases, these adaptive changes require the presence of external information that disambiguates the distorted speech signals, whereas in other cases mere exposure is sufficient. Both external (e.g. written feedback) and internal (e.g., prior word knowledge) sources of information can be used to generate predictions about the correct mapping of a distorted speech signal. We hypothesize that these predictions provide a basis for determining the discrepancy between the expected and actual speech signal that can be used to guide adaptive changes in perception. This study provides the first empirical investigation that manipulates external and internal factors through 1) the availability of explicit external disambiguating information via the presence or absence of post-response orthographic information paired with a repetition of the degraded stimulus, and 2) the accuracy of internally-generated predictions; an acoustic distortion is introduced either abruptly or incrementally. The results demonstrate that the impact of external information on adaptive plasticity is contingent upon whether the intelligibility of the stimuli permits accurate internally-generated predictions during exposure. External information sources enhance adaptive plasticity only when input signals are severely degraded and cannot reliably access internal predictions. This is consistent with a computational framework for adaptive plasticity in which error-driven supervised learning relies on the ability to compute sensory prediction error signals from both internal and external sources of information. PMID:26854531

  16. Dysarthria and broader motor speech deficits in Dravet syndrome.

    PubMed

    Turner, Samantha J; Brown, Amy; Arpone, Marta; Anderson, Vicki; Morgan, Angela T; Scheffer, Ingrid E

    2017-02-21

    To analyze the oral motor, speech, and language phenotype in 20 children and adults with Dravet syndrome (DS) associated with mutations in SCN1A. Fifteen verbal and 5 minimally verbal DS patients with SCN1A mutations (aged 15 months-28 years) underwent a tailored assessment battery. Speech was characterized by imprecise articulation, abnormal nasal resonance, voice, and pitch, and prosody errors. Half of verbal patients had moderate to severely impaired conversational speech intelligibility. Oral motor impairment, motor planning/programming difficulties, and poor postural control were typical. Nonverbal individuals had intentional communication. Cognitive skills varied markedly, with intellectual functioning ranging from the low average range to severe intellectual disability. Language impairment was congruent with cognition. We describe a distinctive speech, language, and oral motor phenotype in children and adults with DS associated with mutations in SCN1A. Recognizing this phenotype will guide therapeutic intervention in patients with DS. © 2017 American Academy of Neurology.

  17. Dysarthria and broader motor speech deficits in Dravet syndrome

    PubMed Central

    Turner, Samantha J.; Brown, Amy; Arpone, Marta; Anderson, Vicki; Morgan, Angela T.

    2017-01-01

    Objective: To analyze the oral motor, speech, and language phenotype in 20 children and adults with Dravet syndrome (DS) associated with mutations in SCN1A. Methods: Fifteen verbal and 5 minimally verbal DS patients with SCN1A mutations (aged 15 months-28 years) underwent a tailored assessment battery. Results: Speech was characterized by imprecise articulation, abnormal nasal resonance, voice, and pitch, and prosody errors. Half of verbal patients had moderate to severely impaired conversational speech intelligibility. Oral motor impairment, motor planning/programming difficulties, and poor postural control were typical. Nonverbal individuals had intentional communication. Cognitive skills varied markedly, with intellectual functioning ranging from the low average range to severe intellectual disability. Language impairment was congruent with cognition. Conclusions: We describe a distinctive speech, language, and oral motor phenotype in children and adults with DS associated with mutations in SCN1A. Recognizing this phenotype will guide therapeutic intervention in patients with DS. PMID:28148630

  18. The Effect of Feedback Schedule Manipulation on Speech Priming Patterns and Reaction Time

    ERIC Educational Resources Information Center

    Slocomb, Dana; Spencer, Kristie A.

    2009-01-01

    Speech priming tasks are frequently used to delineate stages in the speech process such as lexical retrieval and motor programming. These tasks, often measured in reaction time (RT), require fast and accurate responses, reflecting maximized participant performance, to result in robust priming effects. Encouraging speed and accuracy in responding…

  19. Speech after Mao: Literature and Belonging

    ERIC Educational Resources Information Center

    Hsieh, Victoria Linda

    2012-01-01

    This dissertation aims to understand the apparent failure of speech in post-Mao literature to fulfill its conventional functions of representation and communication. In order to understand this pattern, I begin by looking back on the utility of speech for nation-building in modern China. In addition to literary analysis of key authors and works,…

  20. Children Perceive Speech Onsets by Ear and Eye

    ERIC Educational Resources Information Center

    Jerger, Susan; Damian, Markus F.; Tye-Murray, Nancy; Abdi, Herve

    2017-01-01

    Adults use vision to perceive low-fidelity speech; yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize…

  1. Do individuals with fragile X syndrome show developmental stuttering or not? Comment on "Speech fluency in fragile X syndrome" by van Borsel, Dor and Rondal.

    PubMed

    Howell, Peter

    2008-02-01

    Van Borsel, Dor, and Rondal (2007) examined the speech of seven boys and two young male adults with fragile X syndrome and considered whether their speech was comparable to that reported in the developmental stuttering literature. They listed five criteria which led them to conclude that the speech patterns of speakers with fragile X syndrome differed from those observed in developmental stuttering. The differences noted were: 1) distribution of type of dysfluency; 2) the class of word on which dysfluency occurred; 3) whether word length affected dysfluency; 4) the number of times words and phrases were repeated; and 5) whether there were influences of material type on fluency (spontaneous speech, repeated material, etc.). They concluded that the speech of speakers with fragile X syndrome differed from developmental stuttering. However, the comparisons that van Borsel et al. (2007) made between participant groups were not for speakers of comparable ages. Comparisons with groups of corresponding ages support the opposite conclusion, namely that young speakers with fragile X syndrome show patterns similar to developmental stuttering.

  2. Patterns in Early Interaction between Young Preschool Children with Severe Speech and Physical Impairments and Their Parents

    ERIC Educational Resources Information Center

    Sandberg, Annika Dahlgren; Liliedahl, Marie

    2008-01-01

    The aim of this study is to examine whether the asymmetrical pattern of communication usually found between people who use augmentative and alternative communication and their partners using natural speech was also found in the interaction between non-vocal young preschool children with cerebral palsy and their parents. Three parent-child dyads…

  3. Patterns of Communicative Interaction between a Child with Severe Speech and Physical Impairments and Her Caregiver during a Mealtime Activity

    ERIC Educational Resources Information Center

    Ferm, Ulrika; Ahlsen, Elisabeth; Bjorck-Akesson, Eva

    2012-01-01

    Background: Interaction between caregivers and children with severe impairments is closely related to the demands of daily activities. This study examines the relationship between interaction and the routine mealtime activity at home. Method: Patterns of interaction between a child (aged 6 years and 6 months) with severe speech and physical…

  4. Patterns of lung volume use during an extemporaneous speech task in persons with Parkinson disease.

    PubMed

    Bunton, Kate

    2005-01-01

    This study examined patterns of lung volume use in speakers with Parkinson disease (PD) during an extemporaneous speaking task. The performance of a control group was also examined. The behaviors described are based on acoustic, kinematic, and linguistic measures. Group differences were found in breath group duration, lung volume initiation, and lung volume termination measures. Speakers in the control group alternated between longer and shorter breath groups, with starting lung volumes higher for the longer breath groups and lower for the shorter ones. Speech production was terminated before reaching tidal end-expiratory level (EEL). This pattern was also seen in 4 of 7 speakers with PD. The remaining 3 PD speakers initiated speech at low starting lung volumes and continued speaking below EEL. This subgroup of PD speakers ended breath groups at agrammatical boundaries, whereas control speakers ended at appropriate grammatical boundaries. As a result of participating in this exercise, the reader will (1) be able to describe the patterns of lung volume use in speakers with Parkinson disease and compare them with those employed by control speakers; and (2) obtain information about the influence of speaking task on speech breathing.

  5. Speech recognition training for enhancing written language generation by a traumatic brain injury survivor.

    PubMed

    Manasse, N J; Hux, K; Rankin-Erickson, J L

    2000-11-01

    Impairments in motor functioning, language processing, and cognitive status may impact the written language performance of traumatic brain injury (TBI) survivors. One strategy to minimize the impact of these impairments is to use a speech recognition system. The purpose of this study was to explore the effect of mild dysarthria and mild cognitive-communication deficits secondary to TBI on a 19-year-old survivor's mastery and use of such a system-specifically, Dragon Naturally Speaking. Data included the % of the participant's words accurately perceived by the system over time, the participant's accuracy over time in using commands for navigation and error correction, and quantitative and qualitative changes in the participant's written texts generated with and without the use of the speech recognition system. Results showed that Dragon NaturallySpeaking was approximately 80% accurate in perceiving words spoken by the participant, and the participant quickly and easily mastered all navigation and error correction commands presented. Quantitatively, the participant produced a greater amount of text using traditional word processing and a standard keyboard than using the speech recognition system. Minimal qualitative differences appeared between writing samples. Discussion of factors that may have contributed to the obtained results and that may affect the generalization of the findings to other TBI survivors is provided.

  6. Prevalence and types of articulation errors in Saudi Arabic-speaking children with repaired cleft lip and palate.

    PubMed

    Albustanji, Yusuf M; Albustanji, Mahmoud M; Hegazi, Mohamed M; Amayreh, Mousa M

    2014-10-01

    The purpose of this study was to assess the prevalence and types of consonant production errors and phonological processes in Saudi Arabic-speaking children with repaired cleft lip and palate, and to determine the relationship between the frequency of errors and the type of cleft. Possible relationships between age, gender, and frequency of errors were also investigated. Eighty Saudi children with repaired cleft lip and palate aged 6-15 years (mean 6.7 years) underwent speech, language, and hearing evaluation. The diagnosis of articulation deficits was based on the results of an Arabic articulation test. Phonological processes were reported based on a productivity criterion of a minimum 20% occurrence. Diagnosis of nasality was based on a 5-point scale reflecting severity from 0 through 4. All participants underwent intraoral examination, informal language assessment, and hearing evaluation to assess their speech and language abilities. The chi-square test for independence was used to analyze the results of consonant production as a function of type of CLP and age. Of the 80 participants with CLP, 21 had normal articulation and resonance, while 59 (74%) showed speech abnormalities. Twenty-one of these 59 participants showed only articulation errors; 17 showed only hypernasality; and 21 showed both articulation and resonance deficits. CAs were observed in 20 participants. The productive phonological processes were consonant backing, final consonant deletion, gliding, and stopping. At age 6 and older, 37% of participants had persisting hearing loss. Despite the early age at time of surgery (mean 6.7 months) for the CLP participants in this study, a substantial number of them demonstrated articulation errors and hypernasality. The results provide comparative findings for diverse languages; it is especially interesting to consider the prevalence of glottal stops and pharyngeal fricatives in a population for whom these sounds are phonemic. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  7. Distributed neural signatures of natural audiovisual speech and music in the human auditory cortex.

    PubMed

    Salmi, Juha; Koistinen, Olli-Pekka; Glerean, Enrico; Jylänki, Pasi; Vehtari, Aki; Jääskeläinen, Iiro P; Mäkelä, Sasu; Nummenmaa, Lauri; Nummi-Kuisma, Katarina; Nummi, Ilari; Sams, Mikko

    2017-08-15

    During a conversation or when listening to music, auditory and visual information are combined automatically into audiovisual objects. However, it is still poorly understood how specific types of visual information shape neural processing of sounds in lifelike stimulus environments. Here we applied multi-voxel pattern analysis to investigate how naturally matching visual input modulates supratemporal cortex activity during processing of naturalistic acoustic speech, singing, and instrumental music. Bayesian logistic regression classifiers with sparsity-promoting priors were trained to predict whether the stimulus was audiovisual or auditory, and whether it contained piano playing, speech, or singing. The predictive performance of the classifiers was tested by leaving out one participant at a time for testing and training the model on the remaining 15 participants. The signature patterns associated with unimodal auditory stimuli encompassed distributed locations mostly in the middle and superior temporal gyrus (STG/MTG). A pattern regression analysis, based on a continuous acoustic model, revealed that activity in some of these MTG and STG areas was associated with acoustic features present in speech and music stimuli. Concurrent visual stimulation modulated activity in bilateral MTG (speech), the lateral aspect of right anterior STG (singing), and bilateral parietal opercular cortex (piano). Our results suggest that specific supratemporal brain areas are involved in processing complex natural speech, singing, and piano playing, and that other brain areas located in anterior (facial speech) and posterior (music-related hand actions) supratemporal cortex are influenced by related visual information. Those anterior and posterior supratemporal areas have been linked to stimulus identification and sensory-motor integration, respectively. Copyright © 2017 Elsevier Inc. All rights reserved.
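
    As a rough stand-in for the Bayesian sparse classifiers and leave-one-participant-out evaluation described above, the sketch below uses L1-penalised logistic regression on synthetic data; the array shapes, penalty choice, and C value are assumptions, not the authors' settings.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneGroupOut

    rng = np.random.default_rng(1)
    X = rng.normal(size=(16 * 40, 200))      # 16 subjects x 40 trials, 200 voxels
    y = rng.integers(0, 2, size=16 * 40)     # audiovisual vs auditory label
    groups = np.repeat(np.arange(16), 40)    # participant index per trial

    # Sparsity-promoting classifier (L1 penalty as a frequentist analogue).
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
    accs = []
    for train, test in LeaveOneGroupOut().split(X, y, groups):
        clf.fit(X[train], y[train])
        accs.append(clf.score(X[test], y[test]))
    # Chance-level (~0.5) here because labels are random placeholders.
    print(f"mean leave-one-subject-out accuracy: {np.mean(accs):.2f}")
    ```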

  8. Do Individuals with Fragile X Syndrome Show Developmental Stuttering or Not? Comment on "Speech Fluency in Fragile X Syndrome" by Van Borsel, Dor and Rondal

    ERIC Educational Resources Information Center

    Howell, Peter

    2008-01-01

    Van Borsel, Dor, and Rondal (2007) examined the speech of seven boys and two young male adults with fragile X syndrome and considered whether their speech was comparable to that reported in the developmental stuttering literature. They listed five criteria which led them to conclude that the speech patterns of speakers with fragile X syndrome…

  9. Status Report on Speech Research: A Report on the Status and Progress of Studies on the Nature of Speech, Instrumentation for Its Investigation, and Practical Applications, July 1 - December 31, 1977).

    ERIC Educational Resources Information Center

    Haskins Labs., New Haven, CT.

    This report is one of a regular series about the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. The 17 papers discuss the identification of sine-wave analogues of speech sounds; prosodic information for vowel identity; progressive changes in articulatory patterns in verbal…

  10. Pattern learning with deep neural networks in EMG-based speech recognition.

    PubMed

    Wand, Michael; Schultz, Tanja

    2014-01-01

    We report on classification of phones and phonetic features from facial electromyographic (EMG) data, within the context of our EMG-based Silent Speech interface. In this paper we show that a Deep Neural Network can be used to perform this classification task, yielding a significant improvement over conventional Gaussian Mixture models. Our central contribution is the visualization of patterns which are learned by the neural network. With increasing network depth, these patterns represent more and more intricate electromyographic activity.
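
    For orientation, a small multilayer perceptron over EMG feature frames captures the classification setup, though not the paper's specific architecture or feature extraction; the data below are synthetic placeholders.

    ```python
    import numpy as np
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 60))     # placeholder EMG features per frame
    y = rng.integers(0, 10, size=2000)  # 10 toy phone classes

    # A deeper stack of hidden layers learns increasingly intricate patterns
    # of EMG activity, which is what the paper visualises layer by layer.
    net = MLPClassifier(hidden_layer_sizes=(128, 64, 32), max_iter=300)
    net.fit(X, y)
    print("training accuracy:", net.score(X, y))
    ```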

  11. Assessment of rhythmic entrainment at multiple timescales in dyslexia: evidence for disruption to syllable timing.

    PubMed

    Leong, Victoria; Goswami, Usha

    2014-02-01

    Developmental dyslexia is associated with rhythmic difficulties, including impaired perception of beat patterns in music and prosodic stress patterns in speech. Spoken prosodic rhythm is cued by slow (<10 Hz) fluctuations in speech signal amplitude. Impaired neural oscillatory tracking of these slow amplitude modulation (AM) patterns is one plausible source of impaired rhythm tracking in dyslexia. Here, we characterise the temporal profile of the dyslexic rhythm deficit by examining rhythmic entrainment at multiple speech timescales. Adult dyslexic participants completed two experiments aimed at testing the perception and production of speech rhythm. In the perception task, participants tapped along to the beat of 4 metrically-regular nursery rhyme sentences. In the production task, participants produced the same 4 sentences in time to a metronome beat. Rhythmic entrainment was assessed using both traditional rhythmic indices and a novel AM-based measure, which utilised 3 dominant AM timescales in the speech signal, each associated with a different phonological grain-sized unit (0.9-2.5 Hz, prosodic stress; 2.5-12 Hz, syllables; 12-40 Hz, phonemes). The AM-based measure revealed atypical rhythmic entrainment by dyslexic participants to syllable patterns in speech, in perception and production. In the perception task, both groups showed equally strong phase-locking to Syllable AM patterns, but dyslexic responses were entrained to a significantly earlier oscillatory phase angle than controls. In the production task, dyslexic utterances showed shorter syllable intervals, and differences in Syllable:Phoneme AM cross-frequency synchronisation. Our data support the view that rhythmic entrainment at slow (∼5 Hz, Syllable) rates is atypical in dyslexia, suggesting that neural mechanisms for syllable perception and production may also be atypical. These syllable timing deficits could contribute to the atypical development of phonological representations for spoken words, the central cognitive characteristic of developmental dyslexia across languages. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
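
    The AM-based measure lends itself to a short sketch: band-limit the speech amplitude envelope to the three reported timescales and quantify phase-locking of tap times to the syllable-rate AM as a mean resultant length. The band edges follow the abstract; the filter orders and the tap-locking statistic are assumptions rather than the authors' exact procedure.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    # Band edges from the abstract (Hz); fs must exceed 80 Hz for the top band.
    BANDS = {"stress": (0.9, 2.5), "syllable": (2.5, 12.0), "phoneme": (12.0, 40.0)}

    def am_band(signal, fs, lo, hi):
        """Amplitude envelope of the signal, band-limited to one AM timescale."""
        env = np.abs(hilbert(signal))
        b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, env)

    def tap_phase_locking(signal, fs, tap_times, band="syllable"):
        """Mean resultant length (0..1) of AM phase sampled at tap times."""
        am = am_band(signal, fs, *BANDS[band])
        phase = np.angle(hilbert(am))
        idx = (np.asarray(tap_times) * fs).astype(int)
        return np.abs(np.mean(np.exp(1j * phase[idx])))
    ```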

  12. Assessment of rhythmic entrainment at multiple timescales in dyslexia: Evidence for disruption to syllable timing☆

    PubMed Central

    Leong, Victoria; Goswami, Usha

    2014-01-01

    Developmental dyslexia is associated with rhythmic difficulties, including impaired perception of beat patterns in music and prosodic stress patterns in speech. Spoken prosodic rhythm is cued by slow (<10 Hz) fluctuations in speech signal amplitude. Impaired neural oscillatory tracking of these slow amplitude modulation (AM) patterns is one plausible source of impaired rhythm tracking in dyslexia. Here, we characterise the temporal profile of the dyslexic rhythm deficit by examining rhythmic entrainment at multiple speech timescales. Adult dyslexic participants completed two experiments aimed at testing the perception and production of speech rhythm. In the perception task, participants tapped along to the beat of 4 metrically-regular nursery rhyme sentences. In the production task, participants produced the same 4 sentences in time to a metronome beat. Rhythmic entrainment was assessed using both traditional rhythmic indices and a novel AM-based measure, which utilised 3 dominant AM timescales in the speech signal each associated with a different phonological grain-sized unit (0.9–2.5 Hz, prosodic stress; 2.5–12 Hz, syllables; 12–40 Hz, phonemes). The AM-based measure revealed atypical rhythmic entrainment by dyslexic participants to syllable patterns in speech, in perception and production. In the perception task, both groups showed equally strong phase-locking to Syllable AM patterns, but dyslexic responses were entrained to a significantly earlier oscillatory phase angle than controls. In the production task, dyslexic utterances showed shorter syllable intervals, and differences in Syllable:Phoneme AM cross-frequency synchronisation. Our data support the view that rhythmic entrainment at slow (∼5 Hz, Syllable) rates is atypical in dyslexia, suggesting that neural mechanisms for syllable perception and production may also be atypical. These syllable timing deficits could contribute to the atypical development of phonological representations for spoken words, the central cognitive characteristic of developmental dyslexia across languages. PMID:23916752

  13. Neural Prediction Errors Distinguish Perception and Misperception of Speech.

    PubMed

    Blank, Helen; Spangenberg, Marlene; Davis, Matthew H

    2018-06-11

    Humans use prior expectations to improve perception, especially of sensory signals that are degraded or ambiguous. However, if sensory input deviates from prior expectations, correct perception depends on adjusting or rejecting prior expectations. Failure to adjust or reject the prior leads to perceptual illusions, especially if there is partial overlap (hence partial mismatch) between expectations and input. With speech, "slips of the ear" occur when expectations lead to misperception. For instance, an entomologist might be more susceptible to hear "The ants are my friends" for "The answer, my friend" (in the Bob Dylan song "Blowin' in the Wind"). Here, we contrast two mechanisms by which prior expectations may lead to misperception of degraded speech. Firstly, clear representations of the common sounds in the prior and input (i.e., expected sounds) may lead to incorrect confirmation of the prior. Secondly, insufficient representations of sounds that deviate between prior and input (i.e., prediction errors) could lead to deception. We used cross-modal predictions from written words that partially match degraded speech to compare neural responses when male and female human listeners were deceived into accepting the prior or correctly rejected it. Combined behavioural and multivariate representational similarity analysis of functional magnetic resonance imaging data shows that veridical perception of degraded speech is signalled by representations of prediction error in the left superior temporal sulcus. Instead of using top-down processes to support perception of expected sensory input, our findings suggest that the strength of neural prediction error representations distinguishes correct perception and misperception. SIGNIFICANCE STATEMENT Misperceiving spoken words is an everyday experience with outcomes that range from shared amusement to serious miscommunication. For hearing-impaired individuals, frequent misperception can lead to social withdrawal and isolation with severe consequences for well-being. In this work, we specify the neural mechanisms by which prior expectations - which are so often helpful for perception - can lead to misperception of degraded sensory signals. Most descriptive theories of illusory perception explain misperception as arising from a clear sensory representation of features or sounds that are in common between prior expectations and sensory input. Our work instead provides support for a complementary proposal; namely, that misperception occurs when there is an insufficient sensory representation of the deviation between expectations and sensory signals. Copyright © 2018 the authors.

  14. Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers

    PubMed Central

    Mustafa, Mumtaz Begum; Salim, Siti Salwah; Mohamed, Noraini; Al-Qatab, Bassam; Siong, Chng Eng

    2014-01-01

    Automatic speech recognition (ASR) is currently used in many assistive technologies, such as helping individuals with speech impairment in their communication ability. One challenge in ASR for speech-impaired individuals is the difficulty in obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because there are very few existing databases of impaired speech, which are also limited in size, the obvious solution to build a speech acoustic model of impaired speech is by employing adaptation techniques. However, issues that have not been addressed in existing studies in the area of adaptation for speech impairment are as follows: (1) identifying the most effective adaptation technique for impaired speech; and (2) the use of suitable source models to build an effective impaired-speech acoustic model. This research investigates the above-mentioned two issues on dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques like the maximum likelihood linear regression (MLLR) and the constrained MLLR (C-MLLR). The recognition accuracy of each impaired speech acoustic model is measured in terms of word error rate (WER), with further assessments including phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality impaired-speech data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on the statistical analysis of the WER. It was found that phoneme substitution was the biggest contributing factor to WER in dysarthric speech across all levels of severity. The results show that the speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data. PMID:24466004
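
    The WER and the phoneme insertion/substitution/deletion rates reported here all fall out of a standard Levenshtein alignment between reference and hypothesis. A minimal sketch of that computation (not the authors' scoring tool):

    ```python
    def wer_counts(ref, hyp):
        """Levenshtein alignment returning WER plus substitution, deletion,
        and insertion counts for token sequences ref and hyp."""
        R, H = len(ref), len(hyp)
        # d[i][j] = (cost, subs, dels, ins) for aligning ref[:i] with hyp[:j]
        d = [[None] * (H + 1) for _ in range(R + 1)]
        d[0] = [(j, 0, 0, j) for j in range(H + 1)]
        for i in range(1, R + 1):
            d[i][0] = (i, 0, i, 0)
            for j in range(1, H + 1):
                if ref[i - 1] == hyp[j - 1]:
                    d[i][j] = d[i - 1][j - 1]
                else:
                    sub, dele, ins = d[i - 1][j - 1], d[i - 1][j], d[i][j - 1]
                    d[i][j] = min(
                        (sub[0] + 1, sub[1] + 1, sub[2], sub[3]),
                        (dele[0] + 1, dele[1], dele[2] + 1, dele[3]),
                        (ins[0] + 1, ins[1], ins[2], ins[3] + 1))
        cost, s, dl, ins = d[R][H]
        return {"WER": cost / max(R, 1), "sub": s, "del": dl, "ins": ins}

    # 1 substitution + 1 insertion over 3 reference words -> WER = 2/3
    print(wer_counts("the cat sat".split(), "the bat sat down".split()))
    ```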

  15. Cepstral domain modification of audio signals for data embedding: preliminary results

    NASA Astrophysics Data System (ADS)

    Gopalan, Kaliappan

    2004-06-01

    A method of embedding data in an audio signal using cepstral domain modification is described. Based on successful embedding in the spectral points of perceptually masked regions in each frame of speech, the technique was first extended to embedding in the log spectral domain. This extension resulted in approximately 62 bits/s of embedding with less than a 2 percent bit error rate (BER) for a clean cover speech (from the TIMIT database), and about 2.5 percent for a noisy speech (from an air traffic controller database), when all frames - including silence and transitions between voiced and unvoiced segments - were used. Bit error rate increased significantly when the log spectrum in the vicinity of a formant was modified. In the next procedure, embedding by altering the mean cepstral values of two ranges of indices was studied. Tests on both a noisy utterance and a clean utterance indicated barely noticeable perceptual change in speech quality when the lower range of cepstral indices - corresponding to the vocal tract region - was modified in accordance with the data. With an embedding capacity of approximately 62 bits/s - using one bit per frame regardless of frame energy or type of speech - initial results showed a BER of less than 1.5 percent for a payload capacity of 208 embedded bits using the clean cover speech. A BER of less than 1.3 percent resulted for the noisy host with a capacity of 316 bits. When the cepstrum was modified in the region of excitation, BER increased to over 10 percent. With quantization causing no significant problem, the technique warrants further studies with different cepstral ranges and sizes. Pitch-synchronous cepstrum modification, for example, may be more robust to attacks. In addition, cepstrum modification in regions of speech that are perceptually masked - analogous to embedding in frequency masked regions - may yield imperceptible stego audio with low BER.
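
    Conceptually, the embedding step amounts to shifting the mean of a low-index cepstral range up or down by a small step per frame. The sketch below shows that step only; the index range, step size, and the omitted resynthesis from the modified log spectrum are assumptions rather than the paper's parameters.

    ```python
    import numpy as np

    def embed_bit(frame, bit, lo=2, hi=12, step=0.05):
        """Nudge the mean of a low-index cepstral range (vocal-tract region)
        up or down to carry one bit; frame length is assumed even."""
        spectrum = np.fft.rfft(frame * np.hanning(frame.size))
        cep = np.fft.irfft(np.log(np.abs(spectrum) + 1e-9))  # real cepstrum
        cep[lo:hi] += step if bit else -step                 # embed the bit
        log_mag = np.fft.rfft(cep).real                      # modified log spectrum
        # A full implementation would resynthesise audio from this modified
        # log magnitude combined with the original phase.
        return log_mag
    ```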

  16. Compensation for Coarticulation: Disentangling Auditory and Gestural Theories of Perception of Coarticulatory Effects in Speech

    ERIC Educational Resources Information Center

    Viswanathan, Navin; Magnuson, James S.; Fowler, Carol A.

    2010-01-01

    According to one approach to speech perception, listeners perceive speech by applying general pattern matching mechanisms to the acoustic signal (e.g., Diehl, Lotto, & Holt, 2004). An alternative is that listeners perceive the phonetic gestures that structured the acoustic signal (e.g., Fowler, 1986). The two accounts have offered different…

  17. Management of Non-Progressive Dysarthria: Practice Patterns of Speech and Language Therapists in the Republic of Ireland

    ERIC Educational Resources Information Center

    Conway, Aifric; Walshe, Margaret

    2015-01-01

    Background: Dysarthria is a commonly acquired speech disorder. Rising numbers of people surviving stroke and traumatic brain injury (TBI) mean the numbers of people with non-progressive dysarthria are likely to increase, with increased challenges for speech and language therapists (SLTs), service providers and key stakeholders. The evidence base…

  18. Pedagogical Materials 1. The Yugoslav Serbo-Croatian-English Contrastive Project.

    ERIC Educational Resources Information Center

    Filipovic, Rudolf, Ed.

    The first volume in this series on Serbo-Croatian-English contrastive analysis contains six articles. They are: "Contrastive Analysis and Error Analysis in Pedagogical Materials," by Rudolf Filipovic; "Errors in the Morphology and Syntax of the Parts of Speech in the English of Learners from the Serbo-Croatian-Speaking Area," by Vera Andrassy;…

  19. A Comparison of Fathers' and Mothers' Speech Patterns When Communicating with Three, Four, and Five-Year-Old Children.

    ERIC Educational Resources Information Center

    Kurth, Ruth Justine; Kurth, Lila Mae

    A study compared mothers' and fathers' speech patterns when speaking to preschool children, particularly utterance length, sentence types, and word frequencies. All of the children attended a nursery school with a student population of 136 in a large urban area in the Southwest. Volunteer subjects, 28 mothers and 28 fathers of 28 children who…

  20. Functional overlap between regions involved in speech perception and in monitoring one's own voice during speech production.

    PubMed

    Zheng, Zane Z; Munhall, Kevin G; Johnsrude, Ingrid S

    2010-08-01

    The fluency and the reliability of speech production suggest a mechanism that links motor commands and sensory feedback. Here, we examined the neural organization supporting such links by using fMRI to identify regions in which activity during speech production is modulated according to whether auditory feedback matches the predicted outcome or not and by examining the overlap with the network recruited during passive listening to speech sounds. We used real-time signal processing to compare brain activity when participants whispered a consonant-vowel-consonant word ("Ted") and either heard this clearly or heard voice-gated masking noise. We compared this to when they listened to yoked stimuli (identical recordings of "Ted" or noise) without speaking. Activity along the STS and superior temporal gyrus bilaterally was significantly greater if the auditory stimulus was (a) processed as the auditory concomitant of speaking and (b) did not match the predicted outcome (noise). The network exhibiting this Feedback Type x Production/Perception interaction includes a superior temporal gyrus/middle temporal gyrus region that is activated more when listening to speech than to noise. This is consistent with speech production and speech perception being linked in a control system that predicts the sensory outcome of speech acts and that processes an error signal in speech-sensitive regions when this and the sensory data do not match.

  1. Functional overlap between regions involved in speech perception and in monitoring one’s own voice during speech production

    PubMed Central

    Zheng, Zane Z.; Munhall, Kevin G; Johnsrude, Ingrid S

    2009-01-01

    The fluency and reliability of speech production suggests a mechanism that links motor commands and sensory feedback. Here, we examine the neural organization supporting such links by using fMRI to identify regions in which activity during speech production is modulated according to whether auditory feedback matches the predicted outcome or not, and examining the overlap with the network recruited during passive listening to speech sounds. We use real-time signal processing to compare brain activity when participants whispered a consonant-vowel-consonant word (‘Ted’) and either heard this clearly, or heard voice-gated masking noise. We compare this to when they listened to yoked stimuli (identical recordings of ‘Ted’ or noise) without speaking. Activity along the superior temporal sulcus (STS) and superior temporal gyrus (STG) bilaterally was significantly greater if the auditory stimulus was a) processed as the auditory concomitant of speaking and b) did not match the predicted outcome (noise). The network exhibiting this Feedback type by Production/Perception interaction includes an STG/MTG region that is activated more when listening to speech than to noise. This is consistent with speech production and speech perception being linked in a control system that predicts the sensory outcome of speech acts, and that processes an error signal in speech-sensitive regions when this and the sensory data do not match. PMID:19642886

  2. More About Vector Adaptive/Predictive Coding Of Speech

    NASA Technical Reports Server (NTRS)

    Jedrey, Thomas C.; Gersho, Allen

    1992-01-01

    Report presents additional information about digital speech-encoding and -decoding system described in "Vector Adaptive/Predictive Encoding of Speech" (NPO-17230). Summarizes development of vector adaptive/predictive coding (VAPC) system and describes basic functions of algorithm. Describes refinements introduced enabling receiver to cope with errors. VAPC algorithm implemented in integrated-circuit coding/decoding processors (codecs). VAPC and other codecs tested under variety of operating conditions. Tests designed to reveal effects of various background quiet and noisy environments and of poor telephone equipment. VAPC found competitive with and, in some respects, superior to other 4.8-kb/s codecs and other codecs of similar complexity.

  3. Method for detection and correction of errors in speech pitch period estimates

    NASA Technical Reports Server (NTRS)

    Bhaskar, Udaya (Inventor)

    1989-01-01

    A method of detecting and correcting received values of a pitch period estimate of a speech signal for use in a speech coder or the like. An average is calculated of the nonzero values of received pitch period estimate since the previous reset. If a current pitch period estimate is within a range of 0.75 to 1.25 times the average, it is assumed correct, while if not, a correction process is carried out. If correction is required successively for more than a preset number of times, which will most likely occur when the speaker changes, the average is discarded and a new average calculated.
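
    The acceptance rule described above is concrete enough to sketch. Below is a minimal Python illustration of one plausible reading of it; the class name, the substitution of the running average as the corrected value, and the reset threshold of four successive corrections are assumptions for illustration, since the abstract does not specify the correction step or the preset count.

    ```python
    # Hedged sketch of the pitch-period validation rule described above.
    class PitchValidator:
        def __init__(self, max_successive_corrections=4):  # preset count: assumed
            self.max_failures = max_successive_corrections
            self.history = []   # nonzero estimates received since the last reset
            self.failures = 0   # count of successive out-of-range estimates

        def process(self, estimate):
            """Return an accepted (or corrected) pitch period estimate."""
            avg = sum(self.history) / len(self.history) if self.history else None
            if avg is None or 0.75 * avg <= estimate <= 1.25 * avg:
                self.failures = 0
                if estimate > 0:
                    self.history.append(estimate)
                return estimate
            self.failures += 1
            if self.failures > self.max_failures:
                # Too many successive corrections, most likely a speaker change:
                # discard the old average and start a new one.
                self.history = [estimate] if estimate > 0 else []
                self.failures = 0
                return estimate
            # Otherwise substitute the running average (assumed correction step).
            return avg
    ```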

  4. Masking Period Patterns and Forward Masking for Speech-Shaped Noise: Age-Related Effects.

    PubMed

    Grose, John H; Menezes, Denise C; Porter, Heather L; Griz, Silvana

    2016-01-01

    The purpose of this study was to assess age-related changes in temporal resolution in listeners with relatively normal audiograms. The hypothesis was that increased susceptibility to nonsimultaneous masking contributes to the hearing difficulties experienced by older listeners in complex fluctuating backgrounds. Participants included younger (n = 11), middle-age (n = 12), and older (n = 11) listeners with relatively normal audiograms. The first phase of the study measured masking period patterns for speech-shaped noise maskers and signals. From these data, temporal window shapes were derived. The second phase measured forward-masking functions and assessed how well the temporal window fits accounted for these data. The masking period patterns demonstrated increased susceptibility to backward masking in the older listeners, compatible with a more symmetric temporal window in this group. The forward-masking functions exhibited an age-related decline in recovery to baseline thresholds, and there was also an increase in the variability of the temporal window fits to these data. This study demonstrated an age-related increase in susceptibility to nonsimultaneous masking, supporting the hypothesis that exacerbated nonsimultaneous masking contributes to age-related difficulties understanding speech in fluctuating noise. Further support for this hypothesis comes from limited speech-in-noise data, suggesting an association between susceptibility to forward masking and speech understanding in modulated noise.

  5. On the function of stress rhythms in speech: evidence of a link with grouping effects on serial memory.

    PubMed

    Boucher, Victor J

    2006-01-01

    Language learning requires a capacity to recall novel series of speech sounds. Research shows that prosodic marks create grouping effects enhancing serial recall. However, any restriction on memory affecting the reproduction of prosody would limit the set of patterns that could be learned and subsequently used in speech. By implication, grouping effects of prosody would also be limited to reproducible patterns. This view of the role of prosody and the contribution of memory processes in the organization of prosodic patterns is examined by evaluating the correspondence between a reported tendency to restrict stress intervals in speech and size limits on stress-grouping effects. French speech is used where stress defines the endpoints of groups. In Experiment 1, 40 speakers recalled novel series of syllables containing stress-groups of varying size. Recall was not enhanced by groupings exceeding four syllables, which corresponded to a restriction on the reproducibility of stress-groups. In Experiment 2, the subjects produced given sentences containing phrases of differing length. The results show a strong tendency to insert stress within phrases that exceed four syllables. Since prosody can arise in the recall of syntactically unstructured lists, the results offer initial support for viewing memory processes as a factor of stress-rhythm organization.

  6. Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals.

    PubMed

    Polur, Prasad D; Miller, Gerald E

    2006-10-01

    Computer speech recognition of individuals with dysarthria, such as cerebral palsy patients, requires a robust technique that can handle conditions of very high variability and limited training data. In this study, the application of a 10-state ergodic hidden Markov model (HMM)/artificial neural network (ANN) hybrid structure for a dysarthric speech (isolated word) recognition system, intended to act as an assistive tool, was investigated. A small vocabulary spoken by three cerebral palsy subjects was chosen. The effect of this structure on the recognition rate of the system was investigated by comparing it with an ergodic hidden Markov model as a control, in order to determine whether the modified technique contributed to enhanced recognition of dysarthric speech. The speech was sampled at 11 kHz. Mel frequency cepstral coefficients were extracted using 15 ms frames and served as training input to the hybrid model. The subsequent results demonstrated that the hybrid model structure was quite robust in its ability to handle the large variability and non-conformity of dysarthric speech. The level of variability in input dysarthric speech patterns sometimes limits the reliability of the system. However, its application as a rehabilitation/control tool to assist dysarthric motor-impaired individuals holds sufficient promise.

  7. Speaker diarization system on the 2007 NIST rich transcription meeting recognition evaluation

    NASA Astrophysics Data System (ADS)

    Sun, Hanwu; Nwe, Tin Lay; Koh, Eugene Chin Wei; Bin, Ma; Li, Haizhou

    2007-09-01

    This paper presents a speaker diarization system developed at the Institute for Infocomm Research (I2R) for the NIST Rich Transcription 2007 (RT-07) evaluation task. We describe in detail our primary approaches to speaker diarization under the Multiple Distant Microphones (MDM) condition in the conference room scenario. Our proposed system consists of six modules: 1) a normalized least-mean-square (NLMS) adaptive filter for speaker direction estimation via Time Difference of Arrival (TDOA); 2) initial speaker clustering via a two-stage TDOA histogram distribution quantization approach; 3) multiple-microphone speaker data alignment via GCC-PHAT Time Delay Estimate (TDE) among all the distant microphone channel signals; 4) a speaker clustering algorithm based on GMM modeling; 5) non-speech removal via a speech/non-speech verification mechanism; and 6) silence removal via a "Double-Layer Windowing" (DLW) method. We achieve an error rate of 31.02% on the 2006 Spring (RT-06s) MDM evaluation task and a competitive overall error rate of 15.32% on the NIST Rich Transcription 2007 (RT-07) MDM evaluation task.

  8. Dysgraphia in Patients with Primary Lateral Sclerosis: A Speech-Based Rehearsal Deficit?

    PubMed Central

    Zago, S.; Poletti, B.; Corbo, M.; Adobbati, L.; Silani, V.

    2008-01-01

    The present study aims to demonstrate that writing errors are more common than expected in patients affected by primary lateral sclerosis (PLS) with severe dysarthria or complete mutism, independent of spasticity. Sixteen patients meeting the criteria of Pringle et al. [34] for PLS underwent standard neuropsychological tasks and an evaluation of writing. We assessed spelling ability through dictation, in which a set of words, non-words, and short phrases was presented orally, and through composition of words using a set of preformed letters. Finally, a written copying task was performed with the same words. Relative to controls, PLS patients made a greater number of spelling errors in all writing conditions, but not in the copying task. The error types included omissions, transpositions, insertions, and letter substitutions, and these were equally distributed across the dictation task and the composition of words from preformed letters. This pattern of performance is consistent with a spelling impairment. The results are consistent with the concept that written production is critically dependent on the subvocal articulatory mechanism of rehearsal, perhaps at the level of retaining the sequence of graphemes in a graphemic buffer. In PLS patients, a disturbance of rehearsal may affect the correct sequencing/assembly of an orthographic representation during writing. PMID:19096141

  9. Sentence imitation as a marker of SLI in Czech: disproportionate impairment of verbs and clitics.

    PubMed

    Smolík, Filip; Vávru, Petra

    2014-06-01

    The authors examined sentence imitation as a potential clinical marker of specific language impairment (SLI) in Czech and its use to identify grammatical markers of SLI. Children with SLI and the age- and language-matched control groups (total N = 57) were presented with a sentence imitation task, a receptive vocabulary task, and digit span and nonword repetition tasks. Sentence imitations were scored for accuracy and error types. A separate count of inaccuracies for individual part-of-speech categories was performed. Children with SLI had substantially more inaccurate imitations than the control groups. The differences in the memory measures could not account for the differences between children with SLI and the control groups in imitation accuracy, even though they accounted for the differences between the language-matched and age-matched control groups. The proportion of grammatical errors was larger in children with SLI than in the control groups. The categories that were most affected in imitations of children with SLI were verbs and clitics. Sentence imitation is a sensitive marker of SLI. Verbs and clitics are the most vulnerable categories in Czech SLI. The pattern of errors suggests that impaired syntactic representations are the most likely source of difficulties in children with SLI.

  10. Children perceive speech onsets by ear and eye*

    PubMed Central

    JERGER, SUSAN; DAMIAN, MARKUS F.; TYE-MURRAY, NANCY; ABDI, HERVÉ

    2016-01-01

    Adults use vision to perceive low-fidelity speech; yet how children acquire this ability is not well understood. The literature indicates that children show reduced sensitivity to visual speech from kindergarten to adolescence. We hypothesized that this pattern reflects the effects of complex tasks and a growth period with harder-to-utilize cognitive resources, not lack of sensitivity. We investigated sensitivity to visual speech in children via the phonological priming produced by low-fidelity (non-intact onset) auditory speech presented audiovisually (see dynamic face articulate consonant/rhyme b/ag; hear non-intact onset/rhyme: −b/ag) vs. auditorily (see still face; hear exactly same auditory input). Audiovisual speech produced greater priming from four to fourteen years, indicating that visual speech filled in the non-intact auditory onsets. The influence of visual speech depended uniquely on phonology and speechreading. Children – like adults – perceive speech onsets multimodally. Findings are critical for incorporating visual speech into developmental theories of speech perception. PMID:26752548

  11. GRIN2A: an aptly named gene for speech dysfunction.

    PubMed

    Turner, Samantha J; Mayes, Angela K; Verhoeven, Andrea; Mandelstam, Simone A; Morgan, Angela T; Scheffer, Ingrid E

    2015-02-10

    To delineate the specific speech deficits in individuals with epilepsy-aphasia syndromes associated with mutations in the glutamate receptor subunit gene GRIN2A. We analyzed the speech phenotype associated with GRIN2A mutations in 11 individuals, aged 16 to 64 years, from 3 families. Standardized clinical speech assessments and perceptual analyses of conversational samples were conducted. Individuals showed a characteristic phenotype of dysarthria and dyspraxia with lifelong impact on speech intelligibility in some. Speech was typified by imprecise articulation (11/11, 100%), impaired pitch (monopitch 10/11, 91%) and prosody (stress errors 7/11, 64%), and hypernasality (7/11, 64%). Oral motor impairments and poor performance on maximum vowel duration (8/11, 73%) and repetition of monosyllables (10/11, 91%) and trisyllables (7/11, 64%) supported conversational speech findings. The speech phenotype was present in one individual who did not have seizures. Distinctive features of dysarthria and dyspraxia are found in individuals with GRIN2A mutations, often in the setting of epilepsy-aphasia syndromes; dysarthria has not been previously recognized in these disorders. Of note, the speech phenotype may occur in the absence of a seizure disorder, reinforcing an important role for GRIN2A in motor speech function. Our findings highlight the need for precise clinical speech assessment and intervention in this group. By understanding the mechanisms involved in GRIN2A disorders, targeted therapy may be designed to improve chronic lifelong deficits in intelligibility. © 2015 American Academy of Neurology.

  12. Voxel-based morphometry of auditory and speech-related cortex in stutterers.

    PubMed

    Beal, Deryk S; Gracco, Vincent L; Lafaille, Sophie J; De Nil, Luc F

    2007-08-06

    Stutterers demonstrate unique functional neural activation patterns during speech production, including reduced auditory activation, relative to nonstutterers. The extent to which these functional differences are accompanied by abnormal morphology of the brain in stutterers is unclear. This study examined the neuroanatomical differences in speech-related cortex between stutterers and nonstutterers using voxel-based morphometry. Results revealed significant differences in localized grey matter and white matter densities of left and right hemisphere regions involved in auditory processing and speech production.

  13. Speech therapy and voice recognition instrument

    NASA Technical Reports Server (NTRS)

    Cohen, J.; Babcock, M. L.

    1972-01-01

    Characteristics of an electronic circuit for examining variations in vocal excitation for diagnostic purposes and in speech recognition for determining voice patterns and pitch changes are described. Operation of the circuit is discussed, and a circuit diagram is provided.

  14. Pausing Preceding and Following "Que" in the Production of Native Speakers of French

    ERIC Educational Resources Information Center

    Genc, Bilal; Mavasoglu, Mustafa; Bada, Erdogan

    2011-01-01

    Pausing strategies in read and spontaneous speech have been of significant interest to researchers, since the literature shows that read and spontaneous speech display considerable differences in pausing patterns. This, at least, is the case in English as produced by native speakers. As to what may be the…

  15. On the Function of Stress Rhythms in Speech: Evidence of a Link with Grouping Effects on Serial Memory

    ERIC Educational Resources Information Center

    Boucher, Victor J.

    2006-01-01

    Language learning requires a capacity to recall novel series of speech sounds. Research shows that prosodic marks create grouping effects enhancing serial recall. However, any restriction on memory affecting the reproduction of prosody would limit the set of patterns that could be learned and subsequently used in speech. By implication, grouping…

  16. Spontaneous Speech Events in Two Speech Databases of Human-Computer and Human-Human Dialogs in Spanish

    ERIC Educational Resources Information Center

    Rodriguez, Luis J.; Torres, M. Ines

    2006-01-01

    Previous works in English have revealed that disfluencies follow regular patterns and that incorporating them into the language model of a speech recognizer leads to lower perplexities and sometimes to a better performance. Although work on disfluency modeling has been applied outside the English community (e.g., in Japanese), as far as we know…

  17. Characterizing Intonation Deficit in Motor Speech Disorders: An Autosegmental-Metrical Analysis of Spontaneous Speech in Hypokinetic Dysarthria, Ataxic Dysarthria, and Foreign Accent Syndrome

    ERIC Educational Resources Information Center

    Lowit, Anja; Kuschmann, Anja

    2012-01-01

    Purpose: The autosegmental-metrical (AM) framework represents an established methodology for intonational analysis in unimpaired speaker populations but has found little application in describing intonation in motor speech disorders (MSDs). This study compared the intonation patterns of unimpaired participants (CON) and those with Parkinson's…

  18. Tone classification of syllable-segmented Thai speech based on multilayer perceptron

    NASA Astrophysics Data System (ADS)

    Satravaha, Nuttavudh; Klinkhachorn, Powsiri; Lass, Norman

    2002-05-01

    Thai is a monosyllabic tonal language that uses tone to convey lexical information about the meaning of a syllable. Thus to completely recognize a spoken Thai syllable, a speech recognition system not only has to recognize a base syllable but also must correctly identify a tone. Hence, tone classification of Thai speech is an essential part of a Thai speech recognition system. Thai has five distinctive tones (``mid,'' ``low,'' ``falling,'' ``high,'' and ``rising'') and each tone is represented by a single fundamental frequency (F0) pattern. However, several factors, including tonal coarticulation, stress, intonation, and speaker variability, affect the F0 pattern of a syllable in continuous Thai speech. In this study, an efficient method for tone classification of syllable-segmented Thai speech, which incorporates the effects of tonal coarticulation, stress, and intonation, as well as a method to perform automatic syllable segmentation, were developed. Acoustic parameters were used as the main discriminating parameters. The F0 contour of a segmented syllable was normalized by using a z-score transformation before being presented to a tone classifier. The proposed system was evaluated on 920 test utterances spoken by 8 speakers. A recognition rate of 91.36% was achieved by the proposed system.
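
    The z-score normalization step mentioned above is simple to make concrete. A minimal sketch follows, assuming an F0 contour sampled per frame in Hz and, purely as an illustrative convention, that unvoiced frames are marked with 0; the tone classifier itself is not sketched.

    ```python
    import numpy as np

    def normalize_f0(f0_contour):
        """Z-score transform an F0 contour before tone classification."""
        f0 = np.asarray(f0_contour, dtype=float)
        voiced = f0[f0 > 0]               # assumption: 0 marks unvoiced frames
        mu, sigma = voiced.mean(), voiced.std()
        return (f0 - mu) / sigma if sigma > 0 else f0 - mu
    ```

    Normalizing each syllable's contour in this way removes speaker-level differences in pitch register and range, which is presumably why it helps classification generalize across the eight test speakers.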

  19. Efficacy of Directional Microphones in Hearing Aids Equipped with Wireless Synchronization Technology.

    PubMed

    Geetha, Chinnaraj; Tanniru, Kishore; Rajan, R Raja

    2017-04-01

    This study aimed to evaluate the use of directionality in hearing aids with wireless synchronization on localization and speech intelligibility in noise. This study included 25 individuals with bilateral mild to moderate flat sensorineural hearing loss. For the localization experiment, eight loudspeakers (Genelec 8020B) arranged in a circle covering a 0-360° angle and the Cubase 6 software were used to present the stimulus. A 260-ms car horn was presented from these loudspeakers, one at a time, in random order. The listener was instructed to point to the direction of the source. The localization error in degrees was obtained with and without the directionality and wireless synchronization options. For the speech-perception-in-noise experiment, the signal-to-noise ratio-50 (SNR-50) was obtained using sentences played through a speaker at a fixed angle of 0°. A calibrated eight-talker speech babble was used as noise, and the babble was routed either through 0°, 90°, or 270° (through one speaker at a time) or through both the 90° and 270° speakers. The results revealed that the conditions in which both wireless synchronization and directionality were activated resulted in significantly better performance on both the localization and speech-perception-in-noise tasks. It can be concluded that the directional microphones in wireless synchronization hearing aids coordinate binaurally for better preservation of binaural cues, reducing localization errors and improving speech perception in noise. The results of this study could be used to counsel and justify the selection of directional wireless synchronization hearing aids.

  20. Similarity of Cortical Activity Patterns Predicts Generalization Behavior

    PubMed Central

    Engineer, Crystal T.; Perez, Claudia A.; Carraway, Ryan S.; Chang, Kevin Q.; Roland, Jarod L.; Sloan, Andrew M.; Kilgard, Michael P.

    2013-01-01

    Humans and animals readily generalize previously learned knowledge to new situations. Determining similarity is critical for assigning category membership to a novel stimulus. We tested the hypothesis that category membership is initially encoded by the similarity of the activity pattern evoked by a novel stimulus to the patterns from known categories. We provide behavioral and neurophysiological evidence that activity patterns in primary auditory cortex contain sufficient information to explain behavioral categorization of novel speech sounds by rats. Our results suggest that category membership might be encoded by the similarity of the activity pattern evoked by a novel speech sound to the patterns evoked by known sounds. Categorization based on featureless pattern matching may represent a general neural mechanism for ensuring accurate generalization across sensory and cognitive systems. PMID:24147140

  1. Neural evidence for predictive coding in auditory cortex during speech production.

    PubMed

    Okada, Kayoko; Matchin, William; Hickok, Gregory

    2018-02-01

    Recent models of speech production suggest that motor commands generate forward predictions of the auditory consequences of those commands, that these forward predictions can be used to monitor and correct speech output, and that this system is hierarchically organized (Hickok, Houde, & Rong, Neuron, 69(3), 407–422, 2011; Pickering & Garrod, Behavioral and Brain Sciences, 36(4), 329–347, 2013). Recent psycholinguistic research has shown that internally generated speech (i.e., imagined speech) produces different types of errors than does overt speech (Oppenheim & Dell, Cognition, 106(1), 528–537, 2008; Oppenheim & Dell, Memory & Cognition, 38(8), 1147–1160, 2010). These studies suggest that articulated speech might involve predictive coding at more levels than imagined speech. The current fMRI experiment investigates neural evidence of predictive coding in speech production. Twenty-four participants from UC Irvine were recruited for the study. Participants were scanned while they were visually presented with a sequence of words that they reproduced in sync with a visual metronome. On each trial, they were cued either to silently articulate the sequence or to imagine the sequence without overt articulation. As expected, silent articulation and imagined speech both engaged a left hemisphere network previously implicated in speech production. A contrast of silent articulation with imagined speech revealed greater activation for articulated speech in inferior frontal cortex, premotor cortex and the insula in the left hemisphere, consistent with greater articulatory load. Although both conditions were silent, this contrast also produced significantly greater activation in auditory cortex in dorsal superior temporal gyrus in both hemispheres. We suggest that these activations reflect forward predictions arising from additional levels of the perceptual/motor hierarchy that are involved in monitoring the intended speech output.

  2. Detection of target phonemes in spontaneous and read speech.

    PubMed

    Mehta, G; Cutler, A

    1988-01-01

    Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalise to the recognition of spontaneous speech. In the present study listeners were presented with both spontaneous and read speech materials, and their response time to detect word-initial target phonemes was measured. Responses were, overall, equally fast in each speech mode. However, analysis of effects previously reported in phoneme detection studies revealed significant differences between speech modes. In read speech but not in spontaneous speech, later targets were detected more rapidly than earlier targets, and targets preceded by long words were detected more rapidly than targets preceded by short words. In contrast, in spontaneous speech but not in read speech, targets were detected more rapidly in accented than in unaccented words and in strong than in weak syllables. An explanation for this pattern is offered in terms of characteristic prosodic differences between spontaneous and read speech. The results support claims from previous work that listeners pay great attention to prosodic information in the process of recognising speech.

  3. Explaining errors in children's questions.

    PubMed

    Rowland, Caroline F

    2007-07-01

    The ability to explain the occurrence of errors in children's speech is an essential component of successful theories of language acquisition. The present study tested some generativist and constructivist predictions about error on the questions produced by ten English-learning children between 2 and 5 years of age. The analyses demonstrated that, as predicted by some generativist theories [e.g. Santelmann, L., Berk, S., Austin, J., Somashekar, S. & Lust. B. (2002). Continuity and development in the acquisition of inversion in yes/no questions: dissociating movement and inflection, Journal of Child Language, 29, 813-842], questions with auxiliary DO attracted higher error rates than those with modal auxiliaries. However, in wh-questions, questions with modals and DO attracted equally high error rates, and these findings could not be explained in terms of problems forming questions with why or negated auxiliaries. It was concluded that the data might be better explained in terms of a constructivist account that suggests that entrenched item-based constructions may be protected from error in children's speech, and that errors occur when children resort to other operations to produce questions [e.g. Dabrowska, E. (2000). From formula to schema: the acquisition of English questions. Cognitive Linguistics, 11, 83-102; Rowland, C. F. & Pine, J. M. (2000). Subject-auxiliary inversion errors and wh-question acquisition: What children do know? Journal of Child Language, 27, 157-181; Tomasello, M. (2003). Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press]. However, further work on constructivist theory development is required to allow researchers to make predictions about the nature of these operations.

  4. Audio visual speech source separation via improved context dependent association model

    NASA Astrophysics Data System (ADS)

    Kazemi, Alireza; Boostani, Reza; Sobhanmanesh, Fariborz

    2014-12-01

    In this paper, we exploit the non-linear relation between a speech source and its associated lip video as a source of extra information to propose an improved audio-visual speech source separation (AVSS) algorithm. The audio-visual association is modeled using a neural associator which estimates the visual lip parameters from a temporal context of acoustic observation frames. We define an objective function based on the mean square error (MSE) between estimated and target visual parameters. This function is minimized to estimate the de-mixing vector/filters that separate the relevant source from linear instantaneous or time-domain convolutive mixtures. We have also proposed a hybrid criterion which uses AV coherency together with kurtosis as a non-Gaussianity measure. Experimental results are presented and compared in terms of visually relevant speech detection accuracy and output signal-to-interference ratio (SIR) of source separation. The suggested audio-visual model significantly improves relevant speech classification accuracy compared to an existing GMM-based model, and the proposed AVSS algorithm improves speech separation quality compared to reference ICA- and AVSS-based methods.

  5. The effects of speech output technology in the learning of graphic symbols.

    PubMed Central

    Schlosser, R W; Belfiore, P J; Nigam, R; Blischak, D; Hetzroni, O

    1995-01-01

    The effects of auditory stimuli in the form of synthetic speech output on the learning of graphic symbols were evaluated. Three adults with severe to profound mental retardation and communication impairments were taught to point to lexigrams when presented with words under two conditions. In the first condition, participants used a voice output communication aid to receive synthetic speech as antecedent and consequent stimuli. In the second condition, with a nonelectronic communication board, participants did not receive synthetic speech. A parallel treatments design was used to evaluate the effects of the synthetic speech output as an added component of the augmentative and alternative communication system. All 3 participants reached criterion when provided with the auditory stimuli. Although 2 participants also reached criterion when not provided with the auditory stimuli, the addition of auditory stimuli resulted in more efficient learning and a decreased error rate. Maintenance results, however, indicated no differences between conditions. Findings suggest that auditory stimuli in the form of synthetic speech contribute to the efficient acquisition of graphic communication symbols. PMID:14743828

  6. Robust recognition of loud and Lombard speech in the fighter cockpit environment

    NASA Astrophysics Data System (ADS)

    Stanton, Bill J., Jr.

    1988-08-01

    There are a number of challenges associated with incorporating speech recognition technology into the fighter cockpit. One of the major problems is the wide range of variability in the pilot's voice, which can result from changing levels of stress and workload. Increasing the training set to include abnormal speech is not an attractive option because of the innumerable conditions that would have to be represented and the inordinate amount of time required to collect such a training set. A more promising approach is to study subsets of abnormal speech produced under controlled cockpit conditions, with the purpose of characterizing reliable shifts that occur relative to normal speech. That was the aim of this research. Analyses were conducted for 18 features on 17671 phoneme tokens across eight speakers for normal, loud, and Lombard speech. A consistent migration of energy in the sonorants was discovered. This finding of reliable energy shifts led to the development of a method to reduce or eliminate their effect on the Euclidean distances between LPC log magnitude spectra, which significantly improved recognition performance for loud and Lombard speech. Discrepancies in recognition error rates between normal and abnormal speech were reduced by approximately 50 percent for all eight speakers combined.

  7. Microscopic prediction of speech intelligibility in spatially distributed speech-shaped noise for normal-hearing listeners.

    PubMed

    Geravanchizadeh, Masoud; Fallah, Ali

    2015-12-01

    A binaural and psychoacoustically motivated intelligibility model, based on a well-known monaural microscopic model, is proposed. This model simulates a phoneme recognition task in the presence of spatially distributed speech-shaped noise in anechoic scenarios. In the proposed model, binaural advantage effects are considered by generating a feature vector for a dynamic-time-warping speech recognizer. This vector consists of three subvectors: two monaural subvectors to model better-ear hearing, and a binaural subvector to simulate the binaural unmasking effect. The binaural unit of the model is based on equalization-cancellation theory. The model operates blindly, meaning that separate recordings of speech and noise are not required for the predictions. Speech intelligibility tests were conducted with 12 normal-hearing listeners by collecting speech reception thresholds (SRTs) in the presence of single and multiple sources of speech-shaped noise. Comparison of the model predictions with the measured binaural SRTs, and with the predictions of a macroscopic binaural model called extended equalization-cancellation, shows that this approach predicts intelligibility in anechoic scenarios with good precision. The square of the correlation coefficient (r²) and the mean absolute error between the model predictions and the measurements are 0.98 and 0.62 dB, respectively.
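
    The two reported agreement measures are plain arithmetic and can be reproduced as below; the SRT values in this sketch are placeholders, not data from the study.

    ```python
    import numpy as np

    predicted = np.array([-6.1, -8.3, -9.0, -11.2])  # placeholder model SRTs (dB)
    measured  = np.array([-5.8, -8.9, -9.7, -11.0])  # placeholder measured SRTs (dB)

    r_squared = np.corrcoef(predicted, measured)[0, 1] ** 2  # squared Pearson r
    mae = np.mean(np.abs(predicted - measured))              # mean absolute error, dB
    print(f"r^2 = {r_squared:.2f}, MAE = {mae:.2f} dB")
    ```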

  8. Bilateral Versus Unilateral Cochlear Implantation in Adult Listeners: Speech-On-Speech Masking and Multitalker Localization.

    PubMed

    Rana, Baljeet; Buchholz, Jörg M; Morgan, Catherine; Sharma, Mridula; Weller, Tobias; Konganda, Shivali Appaiah; Shirai, Kyoko; Kawano, Atsushi

    2017-01-01

    Binaural hearing helps normal-hearing listeners localize sound sources and understand speech in noise. However, it is not fully understood to what extent this is the case for bilateral cochlear implant (CI) users. To determine the potential benefits of bilateral over unilateral CIs, speech comprehension thresholds (SCTs) were measured in seven Japanese bilateral CI recipients using Helen test sentences (translated into Japanese) in a two-talker speech interferer presented from the front (co-located with the target speech), ipsilateral to the first-implanted ear (at +90° or -90°), and spatially symmetric at ±90°. Spatial release from masking was calculated as the difference between co-located and spatially separated SCTs. Localization was assessed in the horizontal plane by presenting either male or female speech or both simultaneously. All measurements were performed bilaterally and unilaterally (with the first-implanted ear) inside a loudspeaker array. Both SCTs and spatial release from masking improved with bilateral CIs, demonstrating mean bilateral benefits of 7.5 dB for the spatially asymmetric and 3 dB for the spatially symmetric speech mixtures. Localization performance varied strongly between subjects but was clearly improved with bilateral over unilateral CIs, with the mean localization error reduced by 27°. Surprisingly, adding a second talker had only a negligible effect on localization.

  9. Kurzweil Reading Machine: A Partial Evaluation of Its Optical Character Recognition Error Rate.

    ERIC Educational Resources Information Center

    Goodrich, Gregory L.; And Others

    1979-01-01

    A study designed to assess the ability of the Kurzweil reading machine (a speech reading device for the visually handicapped) to read three different type styles produced by five different means indicated that the machines tested had different error rates depending upon the means of producing the copy and upon the type style used. (Author/CL)

  10. Application of independent component analysis for speech-music separation using an efficient score function estimation

    NASA Astrophysics Data System (ADS)

    Pishravian, Arash; Aghabozorgi Sahaf, Masoud Reza

    2012-12-01

    In this paper, speech-music separation using blind source separation is discussed. The separation algorithm is based on mutual information minimization, with the natural gradient algorithm used for the minimization. This requires estimating the score function from samples of the observed signals (the mixture of speech and music). The accuracy and speed of this estimation affect the quality of the separated signals and the processing time of the algorithm. The score function estimation in the presented algorithm is based on a Gaussian-mixture-based kernel density estimation method. Experimental results of the presented algorithm on speech-music separation, compared against a separation algorithm based on the minimum mean square error estimator, indicate better performance and less processing time.
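
    Since the separation criterion hinges on a score function estimated from samples, a compact sketch may help. This version uses SciPy's Gaussian kernel density estimator with a numerical derivative, which is a generic stand-in for the paper's Gaussian-mixture-based estimator; the function names and the step size `eps` are illustrative.

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    def estimate_score_function(samples, eps=1e-3):
        """Estimate psi(y) = -p'(y) / p(y) from 1-D samples via KDE."""
        kde = gaussian_kde(samples)
        def psi(y):
            y = np.atleast_1d(np.asarray(y, dtype=float))
            p = kde(y)
            dp = (kde(y + eps) - kde(y - eps)) / (2 * eps)  # central difference
            return -dp / p
        return psi
    ```

    In natural-gradient ICA, the score function evaluated on the current output signals drives the update of the de-mixing matrix, so the accuracy and speed of this estimator directly shape separation quality and run time, as the abstract notes.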

  11. Vector Sum Excited Linear Prediction (VSELP) speech coding at 4.8 kbps

    NASA Technical Reports Server (NTRS)

    Gerson, Ira A.; Jasiuk, Mark A.

    1990-01-01

    Code Excited Linear Prediction (CELP) speech coders exhibit good performance at data rates as low as 4800 bps. The major drawback of CELP-type coders is their large computational requirements. The Vector Sum Excited Linear Prediction (VSELP) speech coder utilizes a codebook with a structure that allows for a very efficient search procedure. Other advantages of the VSELP codebook structure are discussed, and a detailed description of a 4.8 kbps VSELP coder is given. This coder is an improved version of the VSELP algorithm, which finished first in the NSA's evaluation of 4.8 kbps speech coders. The coder uses a subsample-resolution single-tap long-term predictor, a single VSELP excitation codebook, a novel gain quantizer that is robust to channel errors, and a new adaptive pre/postfilter arrangement.

  12. Strain map of the tongue in normal and ALS speech patterns from tagged and diffusion MRI

    NASA Astrophysics Data System (ADS)

    Xing, Fangxu; Prince, Jerry L.; Stone, Maureen; Reese, Timothy G.; Atassi, Nazem; Wedeen, Van J.; El Fakhri, Georges; Woo, Jonghye

    2018-03-01

    Amyotrophic Lateral Sclerosis (ALS) is a neurological disease that causes death of the neurons controlling muscle movements. Loss of speech and swallowing function is a major consequence of the degeneration of the tongue muscles. In speech studies using magnetic resonance (MR) techniques, diffusion tensor imaging (DTI) is used to capture internal tongue muscle fiber structures in three dimensions (3D) in a non-invasive manner. Tagged magnetic resonance images (tMRI) are used to record tongue motion during speech. In this work, we aim to combine information obtained with both MR imaging techniques to compare the functional characteristics of the tongue between normal and ALS subjects. We first extracted 3D motion of the tongue using tMRI from fourteen normal subjects during speech. The estimated motion sequences were then warped using diffeomorphic registration into the b0 spaces of the DTI data of two normal subjects and an ALS patient. We then constructed motion atlases by averaging all warped motion fields in each b0 space, and computed strain in the line of action along the muscle fiber directions provided by tractography. Strain in line with the fiber directions provides a quantitative map of the potentially active regions of the tongue during speech. Comparison between normal and ALS subjects explores how the volume of compressing tongue tissue during speech changes in the face of muscle degradation. The proposed framework provides, for the first time, a dynamic map of contracting fibers in ALS speech patterns, and has the potential to provide more insight into the detrimental effects of ALS on speech.

  13. Strain Map of the Tongue in Normal and ALS Speech Patterns from Tagged and Diffusion MRI.

    PubMed

    Xing, Fangxu; Prince, Jerry L; Stone, Maureen; Reese, Timothy G; Atassi, Nazem; Wedeen, Van J; El Fakhri, Georges; Woo, Jonghye

    2018-02-01

    Amyotrophic Lateral Sclerosis (ALS) is a neurological disease that causes death of the neurons controlling muscle movements. Loss of speech and swallowing function is a major consequence of the degeneration of the tongue muscles. In speech studies using magnetic resonance (MR) techniques, diffusion tensor imaging (DTI) is used to capture internal tongue muscle fiber structures in three dimensions (3D) in a non-invasive manner. Tagged magnetic resonance images (tMRI) are used to record tongue motion during speech. In this work, we aim to combine information obtained with both MR imaging techniques to compare the functional characteristics of the tongue between normal and ALS subjects. We first extracted 3D motion of the tongue using tMRI from fourteen normal subjects during speech. The estimated motion sequences were then warped using diffeomorphic registration into the b0 spaces of the DTI data of two normal subjects and an ALS patient. We then constructed motion atlases by averaging all warped motion fields in each b0 space, and computed strain in the line of action along the muscle fiber directions provided by tractography. Strain in line with the fiber directions provides a quantitative map of the potentially active regions of the tongue during speech. Comparison between normal and ALS subjects explores how the volume of compressing tongue tissue during speech changes in the face of muscle degradation. The proposed framework provides, for the first time, a dynamic map of contracting fibers in ALS speech patterns, and has the potential to provide more insight into the detrimental effects of ALS on speech.

  14. Speech in 10-Year-Olds Born With Cleft Lip and Palate: What Do Peers Say?

    PubMed

    Nyberg, Jill; Havstam, Christina

    2016-09-01

    The aim of this study was to explore how 10-year-olds describe, in their own words, the speech and communicative participation of children born with unilateral cleft lip and palate; whether they perceive signs of velopharyngeal insufficiency (VPI) and articulation errors of different degrees; and, if so, which terminology they use. Methods/Participants: Nineteen 10-year-olds participated in three focus group interviews in which they listened to 10 to 12 speech samples with different types of cleft speech characteristics, as assessed by speech and language pathologists (SLPs), and described what they heard. The interviews were transcribed and analyzed with qualitative content analysis. The analysis resulted in three interlinked categories encompassing different aspects of speech, personality, and social implications: descriptions of speech, thoughts on causes and consequences, and emotional reactions and associations. Each category contains four subcategories exemplified with quotes from the children's statements. More pronounced signs of VPI were perceived but referred to in terms relevant to 10-year-olds. Articulatory difficulties, even minor ones, were noted. Peers reflected on the risk of teasing and bullying and on how children with impaired speech might experience their situation. The SLPs and peers did not agree on minor signs of VPI, but they were unanimous in their analysis of clinically normal and more severely impaired speech. Based on what peers say, articulatory impairments may be more important to treat than minor signs of VPI.

  15. Age-Related Neural Oscillation Patterns During the Processing of Temporally Manipulated Speech.

    PubMed

    Rufener, Katharina S; Oechslin, Mathias S; Wöstmann, Malte; Dellwo, Volker; Meyer, Martin

    2016-05-01

    This EEG study investigates age-related differences in neural oscillation patterns during the processing of temporally modulated speech. Taking a lifespan perspective, we recorded electroencephalogram (EEG) data from three age samples: young adults, middle-aged adults and older adults. Stimuli consisted of temporally degraded sentences in Swedish, a language unfamiliar to all participants. We found age-related differences in phonetic pattern matching when participants were presented with envelope-degraded sentences, whereas no such age effect was observed in the processing of fine-structure-degraded sentences. Irrespective of age, the EEG data revealed a relationship between envelope information and theta band (4-8 Hz) activity during speech processing. Additionally, an association between fine-structure information and gamma band (30-48 Hz) activity was found. No interaction, however, was found between the acoustic manipulation of stimuli and age. Importantly, our main finding was paralleled by an overall enhanced power in older adults in high frequencies (gamma: 30-48 Hz), irrespective of condition. For the most part, this result is in line with the Asymmetric Sampling in Time framework (Poeppel in Speech Commun 41:245-255, 2003), which assumes an isomorphic correspondence between frequency modulations in neurophysiological patterns and acoustic oscillations in spoken language. We conclude that speech-specific neural networks show strong stability over adulthood, despite initial processes of cortical degeneration indicated by enhanced gamma power. The results of our study therefore support the view that sensory and cognitive processes undergo multidirectional trajectories in the context of healthy aging.
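
    The band definitions above (theta 4-8 Hz, gamma 30-48 Hz) translate directly into a band-power measure. A minimal sketch follows, with an assumed sampling rate and random placeholder data standing in for a single EEG channel; the study's actual spectral analysis may differ.

    ```python
    import numpy as np
    from scipy.signal import welch

    fs = 250.0                               # Hz; assumed sampling rate
    eeg = np.random.randn(int(10 * fs))      # placeholder: 10 s of one channel

    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))

    def band_power(freqs, psd, lo, hi):
        """Mean power spectral density within [lo, hi] Hz."""
        mask = (freqs >= lo) & (freqs <= hi)
        return psd[mask].mean()

    theta_power = band_power(freqs, psd, 4, 8)
    gamma_power = band_power(freqs, psd, 30, 48)
    ```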

  16. Speech perception in autism spectrum disorder: An activation likelihood estimation meta-analysis.

    PubMed

    Tryfon, Ana; Foster, Nicholas E V; Sharda, Megha; Hyde, Krista L

    2018-02-15

    Autism spectrum disorder (ASD) is often characterized by atypical language profiles and auditory and speech processing. These can contribute to aberrant language and social communication skills in ASD. The study of the neural basis of speech perception in ASD can serve as a potential neurobiological marker of ASD early on, but mixed results across studies renders it difficult to find a reliable neural characterization of speech processing in ASD. To this aim, the present study examined the functional neural basis of speech perception in ASD versus typical development (TD) using an activation likelihood estimation (ALE) meta-analysis of 18 qualifying studies. The present study included separate analyses for TD and ASD, which allowed us to examine patterns of within-group brain activation as well as both common and distinct patterns of brain activation across the ASD and TD groups. Overall, ASD and TD showed mostly common brain activation of speech processing in bilateral superior temporal gyrus (STG) and left inferior frontal gyrus (IFG). However, the results revealed trends for some distinct activation in the TD group showing additional activation in higher-order brain areas including left superior frontal gyrus (SFG), left medial frontal gyrus (MFG), and right IFG. These results provide a more reliable neural characterization of speech processing in ASD relative to previous single neuroimaging studies and motivate future work to investigate how these brain signatures relate to behavioral measures of speech processing in ASD. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Objective support for subjective reports of successful inner speech in two people with aphasia

    PubMed Central

    Hayward, William; Snider, Sarah F.; Luta, George; Friedman, Rhonda B.; Turkeltaub, Peter E.

    2016-01-01

    People with aphasia frequently report being able to say a word correctly in their heads, even if they are unable to say that word aloud. It is difficult to know what is meant by these reports of “successful inner speech”. We probe the experience of successful inner speech in two people with aphasia. We show that these reports are associated with correct overt speech and phonologically related nonword errors, that they relate to word characteristics associated with ease of lexical access but not ease of production, and that they predict whether or not individual words are relearned during anomia treatment. These findings suggest that reports of successful inner speech are meaningful and may be useful for studying self-monitoring in aphasia, for better understanding anomia, and for predicting treatment outcomes. Ultimately, the study of inner speech in people with aphasia could provide critical insights that inform our understanding of normal language. PMID:27469037

  18. Imitation of contrastive lexical stress in children with speech delay

    NASA Astrophysics Data System (ADS)

    Vick, Jennell C.; Moore, Christopher A.

    2005-09-01

    This study examined the relationship between acoustic correlates of stress in trochaic (strong-weak), spondaic (strong-strong), and iambic (weak-strong) nonword bisyllables produced by children (30-50) with normal speech acquisition and children with speech delay. Ratios comparing the acoustic measures (vowel duration, rms, and f0) of the first syllable to the second syllable were calculated to evaluate the extent to which each phonetic parameter was used to mark stress. In addition, a calculation of the variability of jaw movement in each bisyllable was made. Finally, perceptual judgments of accuracy of stress production were made. Analysis of perceptual judgments indicated a robust difference between groups: While both groups of children produced errors in imitating the contrastive lexical stress models (~40%), the children with normal speech acquisition tended to produce trochaic forms in substitution for other stress types, whereas children with speech delay showed no preference for trochees. The relationship between segmental acoustic parameters, kinematic variability, and the ratings of stress by trained listeners will be presented.

  19. Speech reading and learning to read: a comparison of 8-year-old profoundly deaf children with good and poor reading ability.

    PubMed

    Harris, Margaret; Moreno, Constanza

    2006-01-01

    Nine children with severe-profound prelingual hearing loss and single-word reading scores not more than 10 months behind chronological age (Good Readers) were matched with 9 children whose reading lag was at least 15 months (Poor Readers). Good Readers had significantly higher spelling and reading comprehension scores. They produced significantly more phonetic errors (indicating the use of phonological coding) and more often correctly represented the number of syllables in spelling than Poor Readers. They also scored more highly on orthographic awareness and were better at speech reading. Speech intelligibility was the same in the two groups. Cluster analysis revealed that only three Good Readers showed strong evidence of phonetic coding in spelling although seven had good representation of syllables; only four had high orthographic awareness scores. However, all 9 children were good speech readers, suggesting that a phonological code derived through speech reading may underpin reading success for deaf children.

  20. Analysis of facial motion patterns during speech using a matrix factorization algorithm

    PubMed Central

    Lucero, Jorge C.; Munhall, Kevin G.

    2008-01-01

    This paper presents an analysis of facial motion during speech to identify linearly independent kinematic regions. The data consists of three-dimensional displacement records of a set of markers located on a subject’s face while producing speech. A QR factorization with column pivoting algorithm selects a subset of markers with independent motion patterns. The subset is used as a basis to fit the motion of the other facial markers, which determines facial regions of influence of each of the linearly independent markers. Those regions constitute kinematic “eigenregions” whose combined motion produces the total motion of the face. Facial animations may be generated by driving the independent markers with collected displacement records. PMID:19062866
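
    The marker-selection step lends itself to a short sketch. The version below uses SciPy's pivoted QR on a matrix with one column per marker; the matrix shape, the basis size k, and the least-squares fit of the remaining markers are illustrative assumptions rather than the paper's exact pipeline.

    ```python
    import numpy as np
    from scipy.linalg import qr

    n_frames, n_markers = 500, 38
    X = np.random.randn(3 * n_frames, n_markers)   # placeholder displacement data
                                                   # (x/y/z stacked over time)

    # Column pivoting orders markers by how much linearly independent
    # motion each contributes; the first k pivots index the basis markers.
    Q, R, piv = qr(X, mode="economic", pivoting=True)
    k = 6                                          # assumed basis size
    basis = piv[:k]

    # Fit every marker's motion as a linear combination of the basis markers;
    # the fit coefficients delimit each basis marker's region of influence.
    coeffs, *_ = np.linalg.lstsq(X[:, basis], X, rcond=None)
    ```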

  1. Discrimination of speech stimuli based on neuronal response phase patterns depends on acoustics but not comprehension.

    PubMed

    Howard, Mary F; Poeppel, David

    2010-11-01

    Speech stimuli give rise to neural activity in the listener that can be observed as waveforms using magnetoencephalography. Although waveforms vary greatly from trial to trial due to activity unrelated to the stimulus, it has been demonstrated that spoken sentences can be discriminated based on theta-band (3-7 Hz) phase patterns in single-trial response waveforms. Furthermore, manipulations of the speech signal envelope and fine structure that reduced intelligibility were found to produce correlated reductions in discrimination performance, suggesting a relationship between theta-band phase patterns and speech comprehension. This study investigates the nature of this relationship, hypothesizing that theta-band phase patterns primarily reflect cortical processing of low-frequency (<40 Hz) modulations present in the acoustic signal and required for intelligibility, rather than processing exclusively related to comprehension (e.g., lexical, syntactic, semantic). Using stimuli that are quite similar to normal spoken sentences in terms of low-frequency modulation characteristics but are unintelligible (i.e., their time-inverted counterparts), we find that discrimination performance based on theta-band phase patterns is equal for both types of stimuli. Consistent with earlier findings, we also observe that whereas theta-band phase patterns differ across stimuli, power patterns do not. We use a simulation model of the single-trial response to spoken sentence stimuli to demonstrate that phase-locked responses to low-frequency modulations of the acoustic signal can account not only for the phase but also for the power results. The simulation offers insight into the interpretation of the empirical results with respect to phase-resetting and power-enhancement models of the evoked response.
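
    A sketch of the core measurement, extracting a theta-band phase pattern from a single trial, may clarify the analysis. The filter design, sampling rate, and the circular dissimilarity below are illustrative assumptions, not the study's exact implementation.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    fs = 500.0                                # Hz; assumed sampling rate
    trial = np.random.randn(int(3 * fs))      # placeholder: 3 s from one sensor

    b, a = butter(3, [3 / (fs / 2), 7 / (fs / 2)], btype="band")
    theta = filtfilt(b, a, trial)             # zero-phase band-pass, 3-7 Hz
    phase = np.angle(hilbert(theta))          # instantaneous phase per sample

    def circular_dissimilarity(p1, p2):
        """Mean of 1 - cos(phase difference); 0 means identical phase patterns."""
        return np.mean(1 - np.cos(p1 - p2))
    ```

    Classifying a single trial then amounts to comparing its phase time series against each candidate sentence's template and picking the one with the smallest dissimilarity.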

  2. Role of the motor system in language knowledge.

    PubMed

    Berent, Iris; Brem, Anna-Katharine; Zhao, Xu; Seligson, Erica; Pan, Hong; Epstein, Jane; Stern, Emily; Galaburda, Albert M; Pascual-Leone, Alvaro

    2015-02-17

    All spoken languages express words by sound patterns, and certain patterns (e.g., blog) are systematically preferred to others (e.g., lbog). What principles account for such preferences: does the language system encode abstract rules banning syllables like lbog, or does their dislike reflect the increased motor demands associated with speech production? More generally, we ask whether linguistic knowledge is fully embodied or whether some linguistic principles could potentially be abstract. To address this question, here we gauge the sensitivity of English speakers to the putative universal syllable hierarchy (e.g., blif ≻ bnif ≻ bdif ≻ lbif) while undergoing transcranial magnetic stimulation (TMS) over the cortical motor representation of the left orbicularis oris muscle. If syllable preferences reflect motor simulation, then worse-formed syllables (e.g., lbif) should (i) elicit more errors; (ii) engage more strongly motor brain areas; and (iii) elicit stronger effects of TMS on these motor regions. In line with the motor account, we found that repetitive TMS pulses impaired participants' global sensitivity to the number of syllables, and functional MRI confirmed that the cortical stimulation site was sensitive to the syllable hierarchy. Contrary to the motor account, however, ill-formed syllables were least likely to engage the lip sensorimotor area and they were least impaired by TMS. Results suggest that speech perception automatically triggers motor action, but this effect is not causally linked to the computation of linguistic structure. We conclude that the language and motor systems are intimately linked, yet distinct. Language is designed to optimize motor action, but its knowledge includes principles that are disembodied and potentially abstract.

  3. Role of the motor system in language knowledge

    PubMed Central

    Berent, Iris; Brem, Anna-Katharine; Zhao, Xu; Seligson, Erica; Pan, Hong; Epstein, Jane; Stern, Emily; Galaburda, Albert M.; Pascual-Leone, Alvaro

    2015-01-01

    All spoken languages express words by sound patterns, and certain patterns (e.g., blog) are systematically preferred to others (e.g., lbog). What principles account for such preferences: does the language system encode abstract rules banning syllables like lbog, or does their dislike reflect the increased motor demands associated with speech production? More generally, we ask whether linguistic knowledge is fully embodied or whether some linguistic principles could potentially be abstract. To address this question, here we gauge the sensitivity of English speakers to the putative universal syllable hierarchy (e.g., blif≻bnif≻bdif≻lbif) while undergoing transcranial magnetic stimulation (TMS) over the cortical motor representation of the left orbicularis oris muscle. If syllable preferences reflect motor simulation, then worse-formed syllables (e.g., lbif) should (i) elicit more errors; (ii) engage more strongly motor brain areas; and (iii) elicit stronger effects of TMS on these motor regions. In line with the motor account, we found that repetitive TMS pulses impaired participants’ global sensitivity to the number of syllables, and functional MRI confirmed that the cortical stimulation site was sensitive to the syllable hierarchy. Contrary to the motor account, however, ill-formed syllables were least likely to engage the lip sensorimotor area and they were least impaired by TMS. Results suggest that speech perception automatically triggers motor action, but this effect is not causally linked to the computation of linguistic structure. We conclude that the language and motor systems are intimately linked, yet distinct. Language is designed to optimize motor action, but its knowledge includes principles that are disembodied and potentially abstract. PMID:25646465

  4. The relationship between the neural computations for speech and music perception is context-dependent: an activation likelihood estimate study.

    PubMed

    LaCroix, Arianna N; Diaz, Alvaro F; Rogalsky, Corianne

    2015-01-01

    The relationship between the neurobiology of speech and music has been investigated for more than a century. There remains no widespread agreement regarding how (or to what extent) music perception utilizes the neural circuitry that is engaged in speech processing, particularly at the cortical level. Prominent models such as Patel's Shared Syntactic Integration Resource Hypothesis (SSIRH) and Koelsch's neurocognitive model of music perception suggest a high degree of overlap, particularly in the frontal lobe, but also perhaps more distinct representations in the temporal lobe with hemispheric asymmetries. The present meta-analysis study used activation likelihood estimate analyses to identify the brain regions consistently activated for music as compared to speech across the functional neuroimaging (fMRI and PET) literature. Eighty music and 91 speech neuroimaging studies of healthy adult control subjects were analyzed. Peak activations reported in the music and speech studies were divided into four paradigm categories: passive listening, discrimination tasks, error/anomaly detection tasks and memory-related tasks. We then compared activation likelihood estimates within each category for music vs. speech, and each music condition with passive listening. We found that listening to music and to speech preferentially activate distinct temporo-parietal bilateral cortical networks. We also found music and speech to have shared resources in the left pars opercularis but speech-specific resources in the left pars triangularis. The extent to which music recruited speech-activated frontal resources was modulated by task. While there are certainly limitations to meta-analysis techniques particularly regarding sensitivity, this work suggests that the extent of shared resources between speech and music may be task-dependent and highlights the need to consider how task effects may be affecting conclusions regarding the neurobiology of speech and music.
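
    The core ALE computation is compact enough to sketch: each reported peak is blurred into a small probability blob, and per-voxel values are combined across studies as a union of probabilities. A minimal illustration, assuming the standard Gaussian-kernel formulation; the grid, kernel width, and foci below are invented for demonstration, and a real analysis would also model per-study sample sizes and run permutation inference.

    ```python
    # Minimal activation likelihood estimate (ALE) sketch: Gaussian-smoothed
    # modeled-activation maps per study, combined as a probabilistic union.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    GRID = (40, 48, 40)            # toy voxel grid (not a real MNI grid)
    SIGMA = 3.0 / 2.3548           # FWHM of 3 voxels -> Gaussian sigma

    def modeled_activation(foci):
        """One study's modeled-activation map: a Gaussian blob per peak."""
        ma = np.zeros(GRID)
        for x, y, z in foci:
            ma[x, y, z] = 1.0
        return gaussian_filter(ma, SIGMA)   # unit-mass kernels, values << 1

    # toy "studies", each a list of reported activation peaks
    studies = [[(10, 12, 8), (30, 20, 15)], [(11, 13, 9)], [(29, 21, 14)]]

    # ALE: probability that at least one study activates each voxel
    ale = 1.0 - np.prod([1.0 - modeled_activation(f) for f in studies], axis=0)
    print("peak ALE:", float(ale.max()))
    ```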

  5. The relationship between the neural computations for speech and music perception is context-dependent: an activation likelihood estimate study

    PubMed Central

    LaCroix, Arianna N.; Diaz, Alvaro F.; Rogalsky, Corianne

    2015-01-01

    The relationship between the neurobiology of speech and music has been investigated for more than a century. There remains no widespread agreement regarding how (or to what extent) music perception utilizes the neural circuitry that is engaged in speech processing, particularly at the cortical level. Prominent models such as Patel's Shared Syntactic Integration Resource Hypothesis (SSIRH) and Koelsch's neurocognitive model of music perception suggest a high degree of overlap, particularly in the frontal lobe, but also perhaps more distinct representations in the temporal lobe with hemispheric asymmetries. The present meta-analysis study used activation likelihood estimate analyses to identify the brain regions consistently activated for music as compared to speech across the functional neuroimaging (fMRI and PET) literature. Eighty music and 91 speech neuroimaging studies of healthy adult control subjects were analyzed. Peak activations reported in the music and speech studies were divided into four paradigm categories: passive listening, discrimination tasks, error/anomaly detection tasks and memory-related tasks. We then compared activation likelihood estimates within each category for music vs. speech, and each music condition with passive listening. We found that listening to music and to speech preferentially activate distinct temporo-parietal bilateral cortical networks. We also found music and speech to have shared resources in the left pars opercularis but speech-specific resources in the left pars triangularis. The extent to which music recruited speech-activated frontal resources was modulated by task. While there are certainly limitations to meta-analysis techniques particularly regarding sensitivity, this work suggests that the extent of shared resources between speech and music may be task-dependent and highlights the need to consider how task effects may be affecting conclusions regarding the neurobiology of speech and music. PMID:26321976

  6. GRIN2A

    PubMed Central

    Turner, Samantha J.; Mayes, Angela K.; Verhoeven, Andrea; Mandelstam, Simone A.; Morgan, Angela T.

    2015-01-01

    Objective: To delineate the specific speech deficits in individuals with epilepsy-aphasia syndromes associated with mutations in the glutamate receptor subunit gene GRIN2A. Methods: We analyzed the speech phenotype associated with GRIN2A mutations in 11 individuals, aged 16 to 64 years, from 3 families. Standardized clinical speech assessments and perceptual analyses of conversational samples were conducted. Results: Individuals showed a characteristic phenotype of dysarthria and dyspraxia with lifelong impact on speech intelligibility in some. Speech was typified by imprecise articulation (11/11, 100%), impaired pitch (monopitch 10/11, 91%) and prosody (stress errors 7/11, 64%), and hypernasality (7/11, 64%). Oral motor impairments and poor performance on maximum vowel duration (8/11, 73%) and repetition of monosyllables (10/11, 91%) and trisyllables (7/11, 64%) supported conversational speech findings. The speech phenotype was present in one individual who did not have seizures. Conclusions: Distinctive features of dysarthria and dyspraxia are found in individuals with GRIN2A mutations, often in the setting of epilepsy-aphasia syndromes; dysarthria has not been previously recognized in these disorders. Of note, the speech phenotype may occur in the absence of a seizure disorder, reinforcing an important role for GRIN2A in motor speech function. Our findings highlight the need for precise clinical speech assessment and intervention in this group. By understanding the mechanisms involved in GRIN2A disorders, targeted therapy may be designed to improve chronic lifelong deficits in intelligibility. PMID:25596506

  7. The Effect of Furlow Palatoplasty Timing on Speech Outcomes in Submucous Cleft Palate.

    PubMed

    Swanson, Jordan W; Mitchell, Brianne T; Cohen, Marilyn; Solot, Cynthia; Jackson, Oksana; Low, David; Bartlett, Scott P; Taylor, Jesse A

    2017-08-01

    Because some patients with submucous cleft palate (SMCP) are asymptomatic, surgical treatment is conventionally delayed until hypernasal resonance is identified during speech production. We aim to identify whether speech outcomes after repair of an SMCP are influenced by age at repair. We retrospectively studied nonsyndromic children with SMCP. Speech results before and after any surgical treatment or physical management of the palate were compared using the Pittsburgh Weighted Speech Scoring system. Furlow palatoplasty was performed on 40 nonsyndromic patients with SMCP, and 26 patients were not surgically treated. Total composite speech scores improved significantly among children repaired between 3 and 4 years of age (P = 0.02), but not among those older than 4 years (P = 0.63). Twelve (86%) of 14 patients repaired after 4 years of age had borderline or incompetent speech (composite Pittsburgh Weighted Speech Score ≥3) compared with 2 (29%) of 7 repaired between 3 and 4 years of age (P = 0.0068), despite worse prerepair scores in the latter group. Resonance improved in children repaired after 4 years of age, but articulation errors persisted to a greater degree than in those treated before 4 years of age (P = 0.01). Conclusions: Submucous cleft palate repair before 4 years of age appears associated with lower ultimate rates of borderline or incompetent speech. Speech of patients repaired at or after 4 years of age seems to be characterized by persistent misarticulation. These findings highlight the importance of timely diagnosis and management.
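
    As a worked illustration of the group comparison quoted above, the 12/14 versus 2/7 proportions can be tested with Fisher's exact test. The abstract does not state which test the authors used, so the p-value from this sketch need not reproduce the reported 0.0068.

    ```python
    # Compare the proportion of borderline/incompetent speech outcomes
    # between the late-repair and early-repair groups (counts from the
    # abstract; the authors' exact statistical procedure is not stated).
    from scipy.stats import fisher_exact

    #               borderline/incompetent   competent
    table = [[12, 2],   # repaired after 4 years of age
             [2, 5]]    # repaired between 3 and 4 years of age

    odds_ratio, p = fisher_exact(table, alternative="two-sided")
    print(f"odds ratio = {odds_ratio:.2f}, p = {p:.4f}")
    ```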

  8. The Development of the Text Reception Threshold Test: A Visual Analogue of the Speech Reception Threshold Test

    ERIC Educational Resources Information Center

    Zekveld, Adriana A.; George, Erwin L. J.; Kramer, Sophia E.; Goverts, S. Theo; Houtgast, Tammo

    2007-01-01

    Purpose: In this study, the authors aimed to develop a visual analogue of the widely used Speech Reception Threshold (SRT; R. Plomp & A. M. Mimpen, 1979b) test. The Text Reception Threshold (TRT) test, in which visually presented sentences are masked by a bar pattern, enables the quantification of modality-aspecific variance in speech-in-noise…

  9. Between-Word Processes in Children with Speech Difficulties: Insights from a Usage-Based Approach to Phonology

    ERIC Educational Resources Information Center

    Newton, Caroline

    2012-01-01

    There are some children with speech and/or language difficulties who are significantly more difficult to understand in connected speech than in single words. The study reported here explores the between-word behaviours of three such children, aged 11;8, 12;2 and 12;10. It focuses on whether these patterns could be accounted for by lenition, as…

  10. Articulating What Infants Attune to in Native Speech

    PubMed Central

    Best, Catherine T.; Goldstein, Louis M.; Nam, Hosung; Tyler, Michael D.

    2016-01-01

    To become language users, infants must embrace the integrality of speech perception and production. That they do so, and quite rapidly, is implied by the native-language attunement they achieve in each domain by 6–12 months. Yet research has most often addressed one or the other domain, rarely how they interrelate. Moreover, mainstream assumptions that perception relies on acoustic patterns whereas production involves motor patterns entail that the infant would have to translate incommensurable information to grasp the perception–production relationship. We posit the more parsimonious view that both domains depend on commensurate articulatory information. Our proposed framework combines principles of the Perceptual Assimilation Model (PAM) and Articulatory Phonology (AP). According to PAM, infants attune to articulatory information in native speech and detect similarities of nonnative phones to native articulatory patterns. The AP premise that gestures of the speech organs are the basic elements of phonology offers articulatory similarity metrics while satisfying the requirement that phonological information be discrete and contrastive: (a) distinct articulatory organs produce vocal tract constrictions and (b) phonological contrasts recruit different articulators and/or constrictions of a given articulator that differ in degree or location. Various lines of research suggest young children perceive articulatory information, which guides their productions: discrimination of between- versus within-organ contrasts, simulations of attunement to language-specific articulatory distributions, multimodal speech perception, oral/vocal imitation, and perceptual effects of articulator activation or suppression. We conclude that articulatory gesture information serves as the foundation for developmental integrality of speech perception and production. PMID:28367052

  11. Masking Period Patterns & Forward Masking for Speech-Shaped Noise: Age-related effects

    PubMed Central

    Grose, John H.; Menezes, Denise C.; Porter, Heather L.; Griz, Silvana

    2015-01-01

    Objective The purpose of this study was to assess age-related changes in temporal resolution in listeners with relatively normal audiograms. The hypothesis was that increased susceptibility to non-simultaneous masking contributes to the hearing difficulties experienced by older listeners in complex fluctuating backgrounds. Design Participants included younger (n = 11), middle-aged (n = 12), and older (n = 11) listeners with relatively normal audiograms. The first phase of the study measured masking period patterns for speech-shaped noise maskers and signals. From these data, temporal window shapes were derived. The second phase measured forward-masking functions, and assessed how well the temporal window fits accounted for these data. Results The masking period patterns demonstrated increased susceptibility to backward masking in the older listeners, compatible with a more symmetric temporal window in this group. The forward-masking functions exhibited an age-related decline in recovery to baseline thresholds, and there was also an increase in the variability of the temporal window fits to these data. Conclusions This study demonstrated an age-related increase in susceptibility to non-simultaneous masking, supporting the hypothesis that exacerbated non-simultaneous masking contributes to age-related difficulties understanding speech in fluctuating noise. Further support for this hypothesis comes from limited speech-in-noise data suggesting an association between susceptibility to forward masking and speech understanding in modulated noise. PMID:26230495
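
    The forward-masking functions described here are typically summarized by fitting a recovery curve of threshold against masker-signal delay. A sketch under assumed values: the exponential-decay form and the data points are illustrative only, not the study's model or measurements.

    ```python
    # Fit a forward-masking recovery function (masked threshold decaying
    # toward baseline as the masker-signal delay grows). Both the decay
    # form and the data below are assumptions for demonstration.
    import numpy as np
    from scipy.optimize import curve_fit

    def recovery(delay_ms, span_db, tau_ms, baseline_db):
        """Masked threshold decaying toward baseline with delay."""
        return baseline_db + span_db * np.exp(-delay_ms / tau_ms)

    delays = np.array([5.0, 10.0, 20.0, 40.0, 80.0])       # ms after masker offset
    thresholds = np.array([42.0, 35.0, 27.0, 20.0, 16.0])  # dB SPL (made up)

    (span, tau, base), _ = curve_fit(recovery, delays, thresholds,
                                     p0=(30.0, 20.0, 15.0))
    print(f"span={span:.1f} dB, tau={tau:.1f} ms, baseline={base:.1f} dB")
    ```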

  12. Preschoolers' real-time coordination of vocal and facial emotional information.

    PubMed

    Berman, Jared M J; Chambers, Craig G; Graham, Susan A

    2016-02-01

    An eye-tracking methodology was used to examine the time course of 3- and 5-year-olds' ability to link speech bearing different acoustic cues to emotion (i.e., happy-sounding, neutral, and sad-sounding intonation) to photographs of faces reflecting different emotional expressions. Analyses of saccadic eye movement patterns indicated that, for both 3- and 5-year-olds, sad-sounding speech triggered gaze shifts to a matching (sad-looking) face from the earliest moments of speech processing. However, it was not until approximately 800 ms into a happy-sounding utterance that preschoolers began to use the emotional cues from speech to identify a matching (happy-looking) face. Complementary analyses based on conscious/controlled behaviors (children's explicit points toward the faces) indicated that 5-year-olds, but not 3-year-olds, could successfully match happy-sounding and sad-sounding vocal affect to a corresponding emotional face. Together, the findings clarify developmental patterns in preschoolers' implicit versus explicit ability to coordinate emotional cues across modalities and highlight preschoolers' greater sensitivity to sad-sounding speech as the auditory signal unfolds in time. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Lateralization of ERPs to speech and handedness in the early development of Autism Spectrum Disorder.

    PubMed

    Finch, Kayla H; Seery, Anne M; Talbott, Meagan R; Nelson, Charles A; Tager-Flusberg, Helen

    2017-01-01

    Language is a highly lateralized function, with typically developing individuals showing left hemispheric specialization. Individuals with autism spectrum disorder (ASD) often show reduced or reversed hemispheric lateralization in response to language. However, it is unclear when this difference emerges and whether or not it can serve as an early ASD biomarker. Additionally, atypical language lateralization is not specific to ASD as it is also seen more frequently in individuals with mixed- and left-handedness. Here, we examined early asymmetry patterns measured through neural responses to speech sounds at 12 months and behavioral observations of handedness at 36 months in children with and without ASD. Three different groups of children participated in the study: low-risk controls (LRC), high risk for ASD (HRA; infants with older sibling with ASD) without ASD, and HRA infants who later receive a diagnosis of ASD (ASD). Event-related potentials (ERPs) to speech sounds were recorded at 12 months. Utilizing a novel observational approach, handedness was measured by hand preference on a variety of behaviors at 36 months. At 12 months, lateralization patterns of ERPs to speech stimuli differed across the groups with the ASD group showing reversed lateralization compared to the LRC group. At 36 months, factor analysis of behavioral observations of hand preferences indicated a one-factor model with medium to high factor loadings. A composite handedness score was derived; no group differences were observed. There was no association between lateralization to speech at 12 months and handedness at 36 months in the LRC and HRA groups. However, children with ASD did show an association such that infants with lateralization patterns more similar to the LRC group at 12 months were stronger right-handers at 36 months. These results highlight early developmental patterns that might be specific to ASD, including a potential early biomarker of reversed lateralization to speech stimuli at 12 months, and a relation between behavioral and neural asymmetries. Future investigations of early asymmetry patterns, especially atypical hemispheric specialization, may be informative in the early identification of ASD.

  14. Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

    PubMed

    Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

    2017-09-01

    Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Audiovisual cues and perceptual learning of spectrally distorted speech.

    PubMed

    Pilling, Michael; Thomas, Sharon

    2011-12-01

    Two experiments investigate the effectiveness of audiovisual (AV) speech cues (cues derived from both seeing and hearing a talker speak) in facilitating perceptual learning of spectrally distorted speech. Speech was distorted through an eight-channel noise vocoder which shifted the spectral envelope of the speech signal to simulate the properties of a cochlear implant with a 6 mm place mismatch. Experiment 1 found that participants showed significantly greater improvement in perceiving noise-vocoded speech when training gave AV cues than when it gave auditory cues alone. Experiment 2 compared training with AV cues with training which gave written feedback. These two methods did not significantly differ in the pattern of training they produced. Suggestions are made about the types of circumstances in which the two training methods might be found to differ in facilitating auditory perceptual learning of speech.
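
    A noise vocoder of the kind used here is straightforward to sketch: the signal is split into analysis bands, each band's envelope is extracted, and the envelopes modulate band-limited noise carriers; shifting the carrier bands upward mimics a place mismatch. The band edges and shift factor below are illustrative assumptions, not the study's exact parameters.

    ```python
    # Minimal n-channel noise vocoder with an optional basal shift of the
    # output bands (a rough stand-in for a cochlear-implant place mismatch).
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    FS = 16000

    def band_edges(lo, hi, n):
        return np.geomspace(lo, hi, n + 1)   # log-spaced channel boundaries

    def bandpass(x, lo, hi):
        b, a = butter(2, [lo / (FS / 2), hi / (FS / 2)], btype="band")
        return filtfilt(b, a, x)

    def noise_vocode(speech, n_channels=8, shift=1.0):
        rng = np.random.default_rng(0)
        analysis = band_edges(100, 6000, n_channels)
        out = np.zeros_like(speech)
        for i in range(n_channels):
            band = bandpass(speech, analysis[i], analysis[i + 1])
            env = np.abs(hilbert(band))                  # channel envelope
            lo, hi = analysis[i] * shift, analysis[i + 1] * shift
            carrier = bandpass(rng.standard_normal(len(speech)), lo, hi)
            out += env * carrier
        return out / np.max(np.abs(out))

    speech = np.sin(2 * np.pi * 150 * np.arange(FS) / FS)  # 1 s toy input
    vocoded = noise_vocode(speech, n_channels=8, shift=1.3)
    print(vocoded.shape)
    ```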

  16. Direct comparison of the impact of head tracking, reverberation, and individualized head-related transfer functions on the spatial perception of a virtual speech source

    NASA Technical Reports Server (NTRS)

    Begault, D. R.; Wenzel, E. M.; Anderson, M. R.

    2001-01-01

    A study of sound localization performance was conducted using headphone-delivered virtual speech stimuli, rendered via HRTF-based acoustic auralization software and hardware, and blocked-meatus HRTF measurements. The independent variables were chosen to evaluate commonly held assumptions in the literature regarding improved localization: inclusion of head tracking, individualized HRTFs, and early and diffuse reflections. Significant effects were found for azimuth and elevation error, reversal rates, and externalization.

  17. Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior

    PubMed Central

    Peelle, Jonathan E

    2018-01-01

    Everyday conversation frequently includes challenges to the clarity of the acoustic speech signal, including hearing impairment, background noise, and foreign accents. Although an obvious problem is the increased risk of making word identification errors, extracting meaning from a degraded acoustic signal is also cognitively demanding, which contributes to increased listening effort. The concepts of cognitive demand and listening effort are critical in understanding the challenges listeners face in comprehension, which are not fully predicted by audiometric measures. In this article, the authors review converging behavioral, pupillometric, and neuroimaging evidence that understanding acoustically degraded speech requires additional cognitive support and that this cognitive load can interfere with other operations such as language processing and memory for what has been heard. Behaviorally, acoustic challenge is associated with increased errors in speech understanding, poorer performance on concurrent secondary tasks, more difficulty processing linguistically complex sentences, and reduced memory for verbal material. Measures of pupil dilation support the challenge associated with processing a degraded acoustic signal, indirectly reflecting an increase in neural activity. Finally, functional brain imaging reveals that the neural resources required to understand degraded speech extend beyond traditional perisylvian language networks, most commonly including regions of prefrontal cortex, premotor cortex, and the cingulo-opercular network. Far from being exclusively an auditory problem, acoustic degradation presents listeners with a systems-level challenge that requires the allocation of executive cognitive resources. An important point is that a number of dissociable processes can be engaged to understand degraded speech, including verbal working memory and attention-based performance monitoring. The specific resources required likely differ as a function of the acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners’ abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication. PMID:28938250

  18. Listening Effort: How the Cognitive Consequences of Acoustic Challenge Are Reflected in Brain and Behavior.

    PubMed

    Peelle, Jonathan E

    Everyday conversation frequently includes challenges to the clarity of the acoustic speech signal, including hearing impairment, background noise, and foreign accents. Although an obvious problem is the increased risk of making word identification errors, extracting meaning from a degraded acoustic signal is also cognitively demanding, which contributes to increased listening effort. The concepts of cognitive demand and listening effort are critical in understanding the challenges listeners face in comprehension, which are not fully predicted by audiometric measures. In this article, the authors review converging behavioral, pupillometric, and neuroimaging evidence that understanding acoustically degraded speech requires additional cognitive support and that this cognitive load can interfere with other operations such as language processing and memory for what has been heard. Behaviorally, acoustic challenge is associated with increased errors in speech understanding, poorer performance on concurrent secondary tasks, more difficulty processing linguistically complex sentences, and reduced memory for verbal material. Measures of pupil dilation support the challenge associated with processing a degraded acoustic signal, indirectly reflecting an increase in neural activity. Finally, functional brain imaging reveals that the neural resources required to understand degraded speech extend beyond traditional perisylvian language networks, most commonly including regions of prefrontal cortex, premotor cortex, and the cingulo-opercular network. Far from being exclusively an auditory problem, acoustic degradation presents listeners with a systems-level challenge that requires the allocation of executive cognitive resources. An important point is that a number of dissociable processes can be engaged to understand degraded speech, including verbal working memory and attention-based performance monitoring. The specific resources required likely differ as a function of the acoustic, linguistic, and cognitive demands of the task, as well as individual differences in listeners' abilities. A greater appreciation of cognitive contributions to processing degraded speech is critical in understanding individual differences in comprehension ability, variability in the efficacy of assistive devices, and guiding rehabilitation approaches to reducing listening effort and facilitating communication.

  19. Untrained listeners' ratings of speech disorders in a group with cleft palate: a comparison with speech and language pathologists' ratings.

    PubMed

    Brunnegård, Karin; Lohmander, Anette; van Doorn, Jan

    2009-01-01

    Hypernasal resonance, audible nasal air emission and/or nasal turbulence, and articulation errors are typical speech disorders associated with the speech of children with cleft lip and palate. Several studies indicate that hypernasal resonance tends to be perceived negatively by listeners. Most perceptual studies of speech disorders related to cleft palate are carried out with speech and language pathologists as listeners, whereas only a few studies have been conducted to explore how judgements by untrained listeners compare with expert assessments. These types of studies can be used to determine whether children for whom speech and language pathologists recommend intervention have a significant speech deviance that is also detected by untrained listeners. To compare ratings by untrained listeners with ratings by speech and language pathologists for cleft palate speech. An assessment form for untrained listeners was developed using statements and a five-point scale. The assessment form was tailored to facilitate comparison with expert judgements. Twenty-eight untrained listeners assessed the speech of 26 speakers with cleft palate and ten speakers without cleft in a comparison group. This assessment was compared with the joint assessment of two expert speech and language pathologists. Listener groups generally agreed on which speakers were nasal. The untrained listeners detected hyper- and hyponasality when it was present in speech and considered moderate to severe hypernasality to be serious enough to call for intervention. The expert listeners assessed audible nasal air emission and/or nasal turbulence to be present in twice as many speakers as the untrained listeners who were much less sensitive to audible nasal air emission and/or nasal turbulence. The results of untrained listeners' ratings in this study in the main confirm the ratings of speech and language pathologists and show that cleft palate speech disorders may have an impact in the everyday life of the speaker.

  20. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers

    PubMed Central

    He, Ling; Liu, Yin; Yin, Heng; Zhang, Junpeng; Zhang, Jing; Zhang, Jiang

    2017-01-01

    Speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, one syllable is composed of two parts: an initial and a final. In cleft palate speech, resonance disorders occur at the finals and the voiced initials, while articulation disorders occur at the unvoiced initials. Thus, the initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed as an important preprocessing step in cleft palate speech signal processing. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which treats the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. The syllables are first extracted from the speech utterances. The proposed syllable extraction method avoids a training stage and achieves good performance for both voiced and unvoiced speech. The syllables are then classified as having “quasi-unvoiced” or “quasi-voiced” initials, and respective initial/final segmentation methods are proposed for these two types of syllables. Moreover, a two-step segmentation method is proposed: the rough locations of the syllable and initial/final boundaries are refined in the second segmentation step to improve the robustness of the segmentation accuracy. The experiments show that the initial/final segmentation accuracies are higher for syllables with quasi-unvoiced initials than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 across all syllables is 91.69%. For the control samples, P30 across all syllables is 91.24%. PMID:28926572
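
    The two evaluation metrics quoted above can be made concrete with a short sketch, assuming that mean time error is the mean absolute deviation between predicted and reference boundaries and that P30 counts the proportion of boundaries placed within ±30 ms of the reference (the abstract does not define P30 explicitly).

    ```python
    # Boundary-placement metrics for initial/final segmentation, under the
    # assumption of 1:1 paired predicted/reference boundaries.
    import numpy as np

    def boundary_metrics(predicted_s, reference_s, tol_s=0.030):
        pred = np.asarray(predicted_s)
        ref = np.asarray(reference_s)
        errors = np.abs(pred - ref)          # per-boundary timing error
        return {
            "mean_time_error_ms": 1000 * errors.mean(),
            "P30": float(np.mean(errors <= tol_s)),
        }

    # toy example: initial/final boundary times in seconds
    print(boundary_metrics([0.112, 0.487, 0.901], [0.100, 0.480, 0.950]))
    ```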

  1. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers.

    PubMed

    He, Ling; Liu, Yin; Yin, Heng; Zhang, Junpeng; Zhang, Jing; Zhang, Jiang

    2017-01-01

    Speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, one syllable is composed of two parts: an initial and a final. In cleft palate speech, resonance disorders occur at the finals and the voiced initials, while articulation disorders occur at the unvoiced initials. Thus, the initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed as an important preprocessing step in cleft palate speech signal processing. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which treats the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. The syllables are first extracted from the speech utterances. The proposed syllable extraction method avoids a training stage and achieves good performance for both voiced and unvoiced speech. The syllables are then classified as having "quasi-unvoiced" or "quasi-voiced" initials, and respective initial/final segmentation methods are proposed for these two types of syllables. Moreover, a two-step segmentation method is proposed: the rough locations of the syllable and initial/final boundaries are refined in the second segmentation step to improve the robustness of the segmentation accuracy. The experiments show that the initial/final segmentation accuracies are higher for syllables with quasi-unvoiced initials than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 across all syllables is 91.69%. For the control samples, P30 across all syllables is 91.24%.

  2. Nasal and Oral Inspiration During Natural Speech Breathing

    PubMed Central

    Lester, Rosemary A.; Hoit, Jeannette D.

    2015-01-01

    Purpose The purpose of this study was to determine the typical pattern for inspiration during speech breathing in healthy adults, as well as the factors that might influence it. Method Ten healthy adults, 18–45 years of age, performed a variety of speaking tasks while nasal ram pressure, audio, and video recordings were obtained. Inspirations were categorized as a nasal-only, oral-only, simultaneous nasal and oral, or alternating nasal and oral inspiration. The method was validated using nasal airflow, oral airflow, audio, and video recordings for two participants. Results The predominant pattern was simultaneous nasal and oral inspirations for all speaking tasks. This pattern was not affected by the nature of the speaking task or by the phonetic context surrounding the inspiration. The validation procedure confirmed that nearly all inspirations during counting and paragraph reading were simultaneous nasal and oral inspirations, whereas for sentence reading the predominant pattern was alternating nasal and oral inspirations across the three phonetic contexts. Conclusions Healthy adults inspire through both the nose and mouth during natural speech breathing. This pattern of inspiration is likely beneficial in reducing pathway resistance while preserving some of the benefits of nasal breathing. PMID:24129013

  3. Auditory-Perceptual Learning Improves Speech Motor Adaptation in Children

    PubMed Central

    Shiller, Douglas M.; Rochon, Marie-Lyne

    2015-01-01

    Auditory feedback plays an important role in children’s speech development by providing the child with information about speech outcomes that is used to learn and fine-tune speech motor plans. The use of auditory feedback in speech motor learning has been extensively studied in adults by examining oral motor responses to manipulations of auditory feedback during speech production. Children are also capable of adapting speech motor patterns to perceived changes in auditory feedback, however it is not known whether their capacity for motor learning is limited by immature auditory-perceptual abilities. Here, the link between speech perceptual ability and the capacity for motor learning was explored in two groups of 5–7-year-old children who underwent a period of auditory perceptual training followed by tests of speech motor adaptation to altered auditory feedback. One group received perceptual training on a speech acoustic property relevant to the motor task while a control group received perceptual training on an irrelevant speech contrast. Learned perceptual improvements led to an enhancement in speech motor adaptation (proportional to the perceptual change) only for the experimental group. The results indicate that children’s ability to perceive relevant speech acoustic properties has a direct influence on their capacity for sensory-based speech motor adaptation. PMID:24842067

  4. Bilateral Versus Unilateral Cochlear Implantation in Adult Listeners: Speech-On-Speech Masking and Multitalker Localization

    PubMed Central

    Buchholz, Jörg M.; Morgan, Catherine; Sharma, Mridula; Weller, Tobias; Konganda, Shivali Appaiah; Shirai, Kyoko; Kawano, Atsushi

    2017-01-01

    Binaural hearing helps normal-hearing listeners localize sound sources and understand speech in noise. However, it is not fully understood how far this is the case for bilateral cochlear implant (CI) users. To determine the potential benefits of bilateral over unilateral CIs, speech comprehension thresholds (SCTs) were measured in seven Japanese bilateral CI recipients using Helen test sentences (translated into Japanese) in a two-talker speech interferer presented from the front (co-located with the target speech), ipsilateral to the first-implanted ear (at +90° or −90°), and spatially symmetric at ±90°. Spatial release from masking was calculated as the difference between co-located and spatially separated SCTs. Localization was assessed in the horizontal plane by presenting either male or female speech or both simultaneously. All measurements were performed bilaterally and unilaterally (with the first implanted ear) inside a loudspeaker array. Both SCTs and spatial release from masking were improved with bilateral CIs, demonstrating mean bilateral benefits of 7.5 dB in spatially asymmetric and 3 dB in spatially symmetric speech mixture. Localization performance varied strongly between subjects but was clearly improved with bilateral over unilateral CIs with the mean localization error reduced by 27°. Surprisingly, adding a second talker had only a negligible effect on localization. PMID:28752811

  5. Analytic study of the Tadoma method: background and preliminary results.

    PubMed

    Norton, S J; Schultz, M C; Reed, C M; Braida, L D; Durlach, N I; Rabinowitz, W M; Chomsky, C

    1977-09-01

    Certain deaf-blind persons have been taught, through the Tadoma method of speechreading, to use vibrotactile cues from the face and neck to understand speech. This paper reports the results of preliminary tests of the speechreading ability of one adult Tadoma user. The tests were of four major types: (1) discrimination of speech stimuli; (2) recognition of words in isolation and in sentences; (3) interpretation of prosodic and syntactic features in sentences; and (4) comprehension of written (Braille) and oral speech. Words in highly contextual environments were much better perceived than were words in low-context environments. Many of the word errors involved phonemic substitutions which shared articulatory features with the target phonemes, with a higher error rate for vowels than consonants. Relative to performance on word-recognition tests, performance on some of the discrimination tests was worse than expected. Perception of sentences appeared to be mildly sensitive to rate of talking and to speaker differences. Results of the tests on perception of prosodic and syntactic features, while inconclusive, indicate that many of the features tested were not used in interpreting sentences. On an English comprehension test, a higher score was obtained for items administered in Braille than through oral presentation.

  6. Changes in speech following maxillary distraction osteogenesis.

    PubMed

    Guyette, T W; Polley, J W; Figueroa, A; Smith, B E

    2001-05-01

    The purpose of this study was to describe changes in articulation and velopharyngeal function following maxillary distraction osteogenesis. This is a descriptive, post hoc clinical report comparing the performance of patients before and after maxillary distraction. The independent variable was maxillary distraction while the dependent variables were resonance, articulation errors, and velopharyngeal function. The data were collected at a tertiary health care center in Chicago. The data from pre- and postoperative evaluations of 18 maxillary distraction patients were used. The outcome measures were severity of hypernasality and hyponasality, velopharyngeal orifice size as estimated using the pressure-flow technique, and number and type of articulation errors. At the long-term follow-up, 16.7% exhibited a significant increase in hypernasality. Seventy-five percent of patients with preoperative hyponasality experienced improved nasal resonance. Articulation improved in 67% of patients by the 1-year follow-up. In a predominately cleft palate population, the risk for velopharyngeal insufficiency following maxillary distraction is similar to the risk observed in Le Fort I maxillary advancement. Patients being considered for maxillary distraction surgery should receive pre- and postoperative speech evaluations and be counseled about risks for changes in their speech.
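
    The pressure-flow technique mentioned above is conventionally implemented with the Warren-DuBois hydraulic equation, which converts the oral-nasal pressure drop and nasal airflow into an estimated orifice area. A sketch under that assumption (the paper does not spell out its computation):

    ```python
    # Velopharyngeal orifice area via the Warren-DuBois hydraulic equation,
    # area = flow / (k * sqrt(2 * dP / rho)), in CGS units.
    import math

    def orifice_area_cm2(flow_ml_s, dp_cmh2o, k=0.65, rho=0.001):
        """Estimate velopharyngeal orifice area (cm^2).

        flow_ml_s : nasal airflow in mL/s (= cm^3/s)
        dp_cmh2o  : oral-nasal pressure difference in cm H2O
        k         : empirical discharge coefficient
        rho       : density of air in g/cm^3
        """
        dp_dyn_cm2 = dp_cmh2o * 980.665      # cm H2O -> dyn/cm^2
        return flow_ml_s / (k * math.sqrt(2 * dp_dyn_cm2 / rho))

    # e.g. 150 mL/s of nasal airflow across a 4 cm H2O pressure drop
    print(f"{orifice_area_cm2(150, 4):.3f} cm^2")
    ```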

  7. Status report on speech research. A report on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications

    NASA Astrophysics Data System (ADS)

    Liberman, A. M.

    1985-10-01

    This interim status report on speech research discusses the following topics: On Vagueness and Fictions as Cornerstones of a Theory of Perceiving and Acting: A Comment on Walter (1983); The Informational Support for Upright Stance; Determining the Extent of Coarticulation-effects of Experimental Design; The Roles of Phoneme Frequency, Similarity, and Availability in the Experimental Elicitation of Speech Errors; On Learning to Speak; The Motor Theory of Speech Perception Revised; Linguistic and Acoustic Correlates of the Perceptual Structure Found in an Individual Differences Scaling Study of Vowels; Perceptual Coherence of Speech: Stability of Silence-cued Stop Consonants; Development of the Speech Perceptuomotor System; Dependence of Reading on Orthography-Investigations in Serbo-Croatian; The Relationship between Knowledge of Derivational Morphology and Spelling Ability in Fourth, Sixth, and Eighth Graders; Relations among Regular and Irregular, Morphologically-Related Words in the Lexicon as Revealed by Repetition Priming; Grammatical Priming of Inflected Nouns by the Gender of Possessive Adjectives; Grammatical Priming of Inflected Nouns by Inflected Adjectives; Deaf Signers and Serial Recall in the Visual Modality-Memory for Signs, Fingerspelling, and Print; Did Orthographies Evolve?; The Development of Children's Sensitivity to Factors Influencing Vowel Reading.

  8. A Nonword Repetition Task for Speakers with Misarticulations: The Syllable Repetition Task (SRT)

    PubMed Central

    Shriberg, Lawrence D.; Lohmeier, Heather L.; Campbell, Thomas F.; Dollaghan, Christine A.; Green, Jordan R.; Moore, Christopher A.

    2010-01-01

    Purpose Conceptual and methodological confounds occur when non(sense) repetition tasks are administered to speakers who do not have the target speech sounds in their phonetic inventories or who habitually misarticulate targeted speech sounds. We describe a nonword repetition task, the Syllable Repetition Task (SRT), that eliminates this confound, and report findings from three validity studies. Method Ninety-five preschool children with Speech Delay and 63 with Typical Speech completed an assessment battery that included the Nonword Repetition Task (NRT; Dollaghan & Campbell, 1998) and the SRT. SRT stimuli include only four of the earliest occurring consonants and one early occurring vowel. Results Study 1 findings indicated that the SRT eliminated the speech confound in nonword testing with speakers who misarticulate. Study 2 findings indicated that the accuracy of the SRT to identify expressive language impairment was comparable to findings for the NRT. Study 3 findings illustrated the SRT's potential to interrogate speech processing constraints underlying poor nonword repetition accuracy. Results supported both memorial and auditory-perceptual encoding constraints underlying nonword repetition errors in children with speech-language impairment. Conclusion The SRT appears to be a psychometrically stable and substantively informative nonword repetition task for emerging genetic and other research with speakers who misarticulate. PMID:19635944

  9. Stereotactic probability and variability of speech arrest and anomia sites during stimulation mapping of the language dominant hemisphere.

    PubMed

    Chang, Edward F; Breshears, Jonathan D; Raygor, Kunal P; Lau, Darryl; Molinaro, Annette M; Berger, Mitchel S

    2017-01-01

    OBJECTIVE Functional mapping using direct cortical stimulation is the gold standard for the prevention of postoperative morbidity during resective surgery in dominant-hemisphere perisylvian regions. Its role is necessitated by the significant interindividual variability that has been observed for essential language sites. The aim in this study was to determine the statistical probability distribution of eliciting aphasic errors for any given stereotactically based cortical position in a patient cohort and to quantify the variability at each cortical site. METHODS Patients undergoing awake craniotomy for dominant-hemisphere primary brain tumor resection between 1999 and 2014 at the authors' institution were included in this study, which included counting and picture-naming tasks during dense speech mapping via cortical stimulation. Positive and negative stimulation sites were collected using an intraoperative frameless stereotactic neuronavigation system and were converted to Montreal Neurological Institute coordinates. Data were iteratively resampled to create mean and standard deviation probability maps for speech arrest and anomia. Patients were divided into groups with a "classic" or an "atypical" location of speech function, based on the resultant probability maps. Patient and clinical factors were then assessed for their association with an atypical location of speech sites by univariate and multivariate analysis. RESULTS Across 102 patients undergoing speech mapping, the overall probabilities of speech arrest and anomia were 0.51 and 0.33, respectively. Speech arrest was most likely to occur with stimulation of the posterior inferior frontal gyrus (maximum probability from individual bin = 0.025), and variance was highest in the dorsal premotor cortex and the posterior superior temporal gyrus. In contrast, stimulation within the posterior perisylvian cortex resulted in the maximum mean probability of anomia (maximum probability = 0.012), with large variance in the regions surrounding the posterior superior temporal gyrus, including the posterior middle temporal, angular, and supramarginal gyri. Patients with atypical speech localization were far more likely to have tumors in canonical Broca's or Wernicke's areas (OR 7.21, 95% CI 1.67-31.09, p < 0.01) or to have multilobar tumors (OR 12.58, 95% CI 2.22-71.42, p < 0.01) than were patients with classic speech localization. CONCLUSIONS This study provides statistical probability distribution maps for aphasic errors during cortical stimulation mapping in a patient cohort. Thus, the authors provide an expected probability of inducing speech arrest and anomia from specific 10-mm² cortical bins in an individual patient. In addition, they highlight key regions of interindividual mapping variability that should be considered preoperatively. They believe these results will aid surgeons in their preoperative planning of eloquent cortex resection.
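
    The iterative-resampling step can be illustrated with a short sketch: per-bin stimulation outcomes are repeatedly resampled with replacement to yield a mean probability of speech arrest and its variability for each cortical bin. The data shapes and values below are invented; the authors' actual pipeline (stereotactic binning, MNI registration) is not reproduced here.

    ```python
    # Bootstrap mean/SD probability maps from binary stimulation outcomes
    # (1 = speech arrest elicited, 0 = not), one outcome array per bin.
    import numpy as np

    rng = np.random.default_rng(42)

    def probability_map(outcomes_per_bin, n_boot=1000):
        means, sds = [], []
        for outcomes in outcomes_per_bin:
            outcomes = np.asarray(outcomes)
            boots = [rng.choice(outcomes, size=len(outcomes), replace=True).mean()
                     for _ in range(n_boot)]
            means.append(np.mean(boots))
            sds.append(np.std(boots))
        return np.array(means), np.array(sds)

    # toy: three cortical bins stimulated in different numbers of patients
    bins = [[1, 0, 1, 1, 0], [0, 0, 1, 0], [1, 1, 1, 0, 1, 1]]
    mean_p, sd_p = probability_map(bins)
    print(mean_p.round(2), sd_p.round(2))
    ```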

  10. Measurement errors in voice-key naming latency for Hiragana.

    PubMed

    Yamada, Jun; Tamaoka, Katsuo

    2003-12-01

    This study makes explicit the limitations and possibilities of voice-key naming latency research on single hiragana symbols (a Japanese syllabic script) by examining three sets of voice-key naming data against Sakuma, Fushimi, and Tatsumi's 1997 speech-analyzer voice-waveform data. Analysis showed that voice-key measurement errors can be substantial in standard procedures, as they may conceal the true effects of significant variables involved in hiragana-naming behavior. While one can avoid voice-key measurement errors to some extent by applying Sakuma et al.'s deltas and by excluding initial phonemes which induce measurement errors, such errors may be ignored when test items are words and other higher-level linguistic materials.
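
    The voice-key bias at issue is easy to demonstrate: a hardware-style amplitude threshold fires late for phonemes whose energy ramps up slowly, relative to the true acoustic onset visible in the waveform. An illustrative sketch with made-up signal parameters:

    ```python
    # Simulate a voice key (fixed amplitude threshold) on a syllable whose
    # amplitude ramps up over 50 ms, and compare its trigger time with the
    # true acoustic onset.
    import numpy as np

    FS = 10000
    t = np.arange(int(0.3 * FS)) / FS

    # synthetic "syllable": silence, then a slowly ramping 150 Hz voiced tone
    onset_true = 0.10                                 # seconds
    ramp = np.clip((t - onset_true) / 0.05, 0, 1)     # 50 ms amplitude ramp
    signal = ramp * np.sin(2 * np.pi * 150 * t)

    def voice_key_onset(x, threshold=0.5):
        """Time of the first sample whose magnitude crosses the threshold."""
        idx = np.flatnonzero(np.abs(x) >= threshold)
        return idx[0] / FS if idx.size else None

    measured = voice_key_onset(signal)
    print(f"true onset 100.0 ms, voice key fires at {1000 * measured:.1f} ms")
    ```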

  11. Speech-rhythm characteristics of client-centered, Gestalt, and rational-emotive therapy interviews.

    PubMed

    Chen, C L

    1981-07-01

    The aim of this study was to discover whether client-centered, Gestalt, and rational-emotive psychotherapy interviews could be described and differentiated on the basis of quantitative measurement of their speech rhythms. These measures were taken from the sound portion of a film showing interviews by Carl Rogers, Frederick Perls, and Albert Ellis. The variables used were total session and percentage of speaking times, speaking turns, vocalizations, interruptions, inside and switching pauses, and speaking rates. The three types of interview had very distinctive patterns of speech-rhythm variables. These patterns suggested that Rogers's Client-centered therapy interview was patient dominated, that Ellis's rational-emotive therapy interview was therapist dominated, and that Perls's Gestalt therapy interview was neither therapist nor patient dominated.

  12. Positron Emission Tomography Imaging Reveals Auditory and Frontal Cortical Regions Involved with Speech Perception and Loudness Adaptation.

    PubMed

    Berding, Georg; Wilke, Florian; Rode, Thilo; Haense, Cathleen; Joseph, Gert; Meyer, Geerd J; Mamach, Martin; Lenarz, Minoo; Geworski, Lilli; Bengel, Frank M; Lenarz, Thomas; Lim, Hubert H

    2015-01-01

    Considerable progress has been made in the treatment of hearing loss with auditory implants. However, there are still many implanted patients that experience hearing deficiencies, such as limited speech understanding or vanishing perception with continuous stimulation (i.e., abnormal loudness adaptation). The present study aims to identify specific patterns of cerebral cortex activity involved with such deficiencies. We performed O-15-water positron emission tomography (PET) in patients implanted with electrodes within the cochlea, brainstem, or midbrain to investigate the pattern of cortical activation in response to speech or continuous multi-tone stimuli directly inputted into the implant processor that then delivered electrical patterns through those electrodes. Statistical parametric mapping was performed on a single subject basis. Better speech understanding was correlated with a larger extent of bilateral auditory cortex activation. In contrast to speech, the continuous multi-tone stimulus elicited mainly unilateral auditory cortical activity in which greater loudness adaptation corresponded to weaker activation and even deactivation. Interestingly, greater loudness adaptation was correlated with stronger activity within the ventral prefrontal cortex, which could be up-regulated to suppress the irrelevant or aberrant signals into the auditory cortex. The ability to detect these specific cortical patterns and differences across patients and stimuli demonstrates the potential for using PET to diagnose auditory function or dysfunction in implant patients, which in turn could guide the development of appropriate stimulation strategies for improving hearing rehabilitation. Beyond hearing restoration, our study also reveals a potential role of the frontal cortex in suppressing irrelevant or aberrant activity within the auditory cortex, and thus may be relevant for understanding and treating tinnitus.

  13. Positron Emission Tomography Imaging Reveals Auditory and Frontal Cortical Regions Involved with Speech Perception and Loudness Adaptation

    PubMed Central

    Berding, Georg; Wilke, Florian; Rode, Thilo; Haense, Cathleen; Joseph, Gert; Meyer, Geerd J.; Mamach, Martin; Lenarz, Minoo; Geworski, Lilli; Bengel, Frank M.; Lenarz, Thomas; Lim, Hubert H.

    2015-01-01

    Considerable progress has been made in the treatment of hearing loss with auditory implants. However, there are still many implanted patients that experience hearing deficiencies, such as limited speech understanding or vanishing perception with continuous stimulation (i.e., abnormal loudness adaptation). The present study aims to identify specific patterns of cerebral cortex activity involved with such deficiencies. We performed O-15-water positron emission tomography (PET) in patients implanted with electrodes within the cochlea, brainstem, or midbrain to investigate the pattern of cortical activation in response to speech or continuous multi-tone stimuli directly inputted into the implant processor that then delivered electrical patterns through those electrodes. Statistical parametric mapping was performed on a single subject basis. Better speech understanding was correlated with a larger extent of bilateral auditory cortex activation. In contrast to speech, the continuous multi-tone stimulus elicited mainly unilateral auditory cortical activity in which greater loudness adaptation corresponded to weaker activation and even deactivation. Interestingly, greater loudness adaptation was correlated with stronger activity within the ventral prefrontal cortex, which could be up-regulated to suppress the irrelevant or aberrant signals into the auditory cortex. The ability to detect these specific cortical patterns and differences across patients and stimuli demonstrates the potential for using PET to diagnose auditory function or dysfunction in implant patients, which in turn could guide the development of appropriate stimulation strategies for improving hearing rehabilitation. Beyond hearing restoration, our study also reveals a potential role of the frontal cortex in suppressing irrelevant or aberrant activity within the auditory cortex, and thus may be relevant for understanding and treating tinnitus. PMID:26046763

  14. Deep neural network with weight sparsity control and pre-training extracts hierarchical features and enhances classification performance: Evidence from whole-brain resting-state functional connectivity patterns of schizophrenia

    PubMed Central

    Kim, Junghoe; Calhoun, Vince D.; Shim, Eunsoo; Lee, Jong-Hwan

    2015-01-01

    Functional connectivity (FC) patterns obtained from resting-state functional magnetic resonance imaging data are commonly employed to study neuropsychiatric conditions by using pattern classifiers such as the support vector machine (SVM). Meanwhile, a deep neural network (DNN) with multiple hidden layers has shown its ability to systematically extract lower-to-higher level information of image and speech data from lower-to-higher hidden layers, markedly enhancing classification accuracy. The objective of this study was to adopt the DNN for whole-brain resting-state FC pattern classification of schizophrenia (SZ) patients vs. healthy controls (HCs) and identification of aberrant FC patterns associated with SZ. We hypothesized that the lower-to-higher level features learned via the DNN would significantly enhance the classification accuracy, and proposed an adaptive learning algorithm to explicitly control the weight sparsity in each hidden layer via L1-norm regularization. Furthermore, the weights were initialized via stacked autoencoder based pre-training to further improve the classification performance. Classification accuracy was systematically evaluated as a function of (1) the number of hidden layers/nodes, (2) the use of L1-norm regularization, (3) the use of the pre-training, (4) the use of framewise displacement (FD) removal, and (5) the use of anatomical/functional parcellation. Using FC patterns from anatomically parcellated regions without FD removal, an error rate of 14.2% was achieved by employing three hidden layers and 50 hidden nodes with both L1-norm regularization and pre-training, which was substantially lower than the error rate from the SVM (22.3%). Moreover, the trained DNN weights (i.e., the learned features) were found to represent the hierarchical organization of aberrant FC patterns in SZ compared with HC. Specifically, pairs of nodes extracted from the lower hidden layer represented sparse FC patterns implicated in SZ, which was quantified by using kurtosis/modularity measures, while features from the higher hidden layer showed holistic/global FC patterns differentiating SZ from HC. Our proposed schemes and reported findings attained by using the DNN classifier and whole-brain FC data suggest that such approaches show improved ability to learn hidden patterns in brain imaging data, which may be useful for developing diagnostic tools for SZ and other neuropsychiatric disorders and identifying associated aberrant FC patterns. PMID:25987366
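
    The paper's central ingredient, explicit weight-sparsity control via an L1 penalty on the connection weights of a multilayer classifier, can be sketched briefly (here in PyTorch). Layer sizes, the penalty weight, and the random data stand in for the real FC features and labels, and the stacked-autoencoder pre-training is omitted for brevity.

    ```python
    # Small fully connected SZ-vs-HC classifier with an L1 penalty on the
    # weight matrices to encourage sparsity (a simplified stand-in for the
    # paper's weight-sparsity-controlled DNN).
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    n_features, n_hidden, n_layers = 200, 50, 3

    layers, d = [], n_features
    for _ in range(n_layers):
        layers += [nn.Linear(d, n_hidden), nn.ReLU()]
        d = n_hidden
    layers += [nn.Linear(d, 2)]                  # SZ vs. HC logits
    model = nn.Sequential(*layers)

    x = torch.randn(64, n_features)              # stand-in FC patterns
    y = torch.randint(0, 2, (64,))               # stand-in diagnoses

    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    ce = nn.CrossEntropyLoss()
    l1_weight = 1e-4

    for step in range(200):
        opt.zero_grad()
        loss = ce(model(x), y)
        # L1-norm penalty on every linear layer's weights (sparsity control)
        for m in model:
            if isinstance(m, nn.Linear):
                loss = loss + l1_weight * m.weight.abs().sum()
        loss.backward()
        opt.step()

    print("final training loss:", float(loss))
    ```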

  15. Use of intonation contours for speech recognition in noise by cochlear implant recipients.

    PubMed

    Meister, Hartmut; Landwehr, Markus; Pyschny, Verena; Grugel, Linda; Walger, Martin

    2011-05-01

    The corruption of intonation contours has detrimental effects on sentence-based speech recognition in normal-hearing listeners [Binns and Culling (2007). J. Acoust. Soc. Am. 122, 1765-1776]. This paper examines whether this finding also applies to cochlear implant (CI) recipients. The subjects' F0-discrimination and speech perception in the presence of noise were measured, using sentences with regular and inverted F0-contours. The results revealed that speech recognition for regular contours was significantly better than for inverted contours. This difference was related to the subjects' F0-discrimination, providing further evidence that the perception of intonation patterns is important for CI-mediated speech recognition in noise.
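
    Constructing an inverted F0 contour of the kind used in these experiments can be sketched as mirroring each voiced frame's F0 about the utterance mean. The abstract does not say whether the mirroring was done on a linear or log-frequency axis; the log version is assumed here so that rises become falls of equal musical size.

    ```python
    # Invert an F0 track about its mean log-frequency (0 marks unvoiced
    # frames, which are passed through untouched).
    import numpy as np

    def invert_f0(f0_hz):
        f0 = np.asarray(f0_hz, dtype=float)
        voiced = f0 > 0
        log_f0 = np.log2(f0[voiced])
        inverted = f0.copy()
        inverted[voiced] = 2.0 ** (2 * log_f0.mean() - log_f0)
        return inverted

    track = [220, 230, 245, 0, 0, 210, 190, 180]   # toy F0 track in Hz
    print(np.round(invert_f0(track), 1))
    ```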

  16. A variable rate speech compressor for mobile applications

    NASA Technical Reports Server (NTRS)

    Yeldener, S.; Kondoz, A. M.; Evans, B. G.

    1990-01-01

    One of the most promising speech coders for the 9.6 to 4.8 kbits/s range is Code Excited Linear Prediction (CELP), which has dominated this region over the past 3 to 4 years. Its setback, however, is its expensive implementation. As an alternative to CELP, Base-Band CELP (CELP-BB) was developed; as reported previously, it produces speech quality comparable to CELP at a complexity implementable on a single chip. Its robustness was also improved to tolerate error rates of up to 1.0 pct. and to maintain intelligibility at 5.0 pct. and above. Although CELP-BB produces good quality speech at around 4.8 kbits/s, it has a fundamental problem when updating the pitch filter memory, for which a sub-optimal solution is proposed. Below 4.8 kbits/s, however, CELP-BB suffers from noticeable quantization noise as a result of the large vector dimensions used. Efficient representation of speech below 4.8 kbits/s is achieved by introducing Sinusoidal Transform Coding (STC) to represent the LPC excitation, a scheme called Sine Wave Excited LPC (SWELP). In this case, natural-sounding, good quality synthetic speech is obtained at around 2.4 kbits/s.
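
    The short-term linear-prediction analysis underlying all of the CELP-family coders mentioned above can be sketched compactly. The following computes LPC coefficients for one speech frame via the Levinson-Durbin recursion; the 10th-order model and frame length are illustrative choices, not the paper's configuration.

    ```python
    # Hedged sketch: 10th-order LPC analysis of one frame via the
    # Levinson-Durbin recursion, the short-term predictor at the core of
    # CELP-family coders. Order and frame length are illustrative.
    import numpy as np

    def lpc_coefficients(frame, order=10):
        """Return prediction polynomial a[0..order] (a[0] = 1) for one frame."""
        r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        a = np.zeros(order + 1)
        a[0], err = 1.0, r[0]
        for i in range(1, order + 1):
            k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err   # reflection coefficient
            prev = a.copy()
            for j in range(1, i):
                a[j] = prev[j] + k * prev[i - j]
            a[i] = k
            err *= 1.0 - k * k                                  # residual energy shrinks
        return a

    fs = 8000
    t = np.arange(160) / fs                                     # one 20 ms frame
    frame = np.hamming(160) * np.sin(2 * np.pi * 500 * t)       # toy "speech" frame
    print(lpc_coefficients(frame)[:4])
    ```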

  17. Expertise with artificial non-speech sounds recruits speech-sensitive cortical regions

    PubMed Central

    Leech, Robert; Holt, Lori L.; Devlin, Joseph T.; Dick, Frederic

    2009-01-01

    Regions of the human temporal lobe show greater activation for speech than for other sounds. These differences may reflect intrinsically specialized domain-specific adaptations for processing speech, or they may be driven by the significant expertise we have in listening to the speech signal. To test the expertise hypothesis, we used a video-game-based paradigm that tacitly trained listeners to categorize acoustically complex, artificial non-linguistic sounds. Before and after training, we used functional MRI to measure how expertise with these sounds modulated temporal lobe activation. Participants’ ability to explicitly categorize the non-speech sounds predicted the change in pre- to post-training activation in speech-sensitive regions of the left posterior superior temporal sulcus, suggesting that emergent auditory expertise may help drive this functional regionalization. Thus, seemingly domain-specific patterns of neural activation in higher cortical regions may be driven in part by experience-based restructuring of high-dimensional perceptual space. PMID:19386919

  18. The Queen's English: an alternative, biosocial hypothesis for the distinctive features of "gay speech".

    PubMed

    Rendall, Drew; Vasey, Paul L; McKenzie, Jared

    2008-02-01

    Popular stereotypes concerning the speech of homosexuals typically attribute speech patterns characteristic of the opposite sex, i.e., broadly feminized speech in gay men and broadly masculinized speech in lesbian women. A small body of recent empirical research has begun to address the subject more systematically and to consider specific mechanistic hypotheses to account for the potentially distinctive features of homosexual speech. Results do not yet fully endorse the stereotypes, but they do not entirely discount them either; nor do they cleanly favor any single mechanistic hypothesis. To contribute to this growing body of research, we report acoustic analyses of 2,875 vowel sounds from a balanced set of 125 speakers representing heterosexual and homosexual individuals of each sex from southern Alberta, Canada. Analyses focused on voice pitch and formant frequencies, which together determine the principal perceptual features of vowels. There was no significant difference in mean voice pitch between heterosexual and homosexual men or between heterosexual and homosexual women, but there were significant differences in the formant frequencies of vowels produced by both homosexual groups compared to their heterosexual counterparts. Formant frequency differences were specific to only certain vowel sounds, and some could be attributed to basic differences in body size between heterosexual and homosexual speakers. The remaining formant frequency differences were not obviously due to differences in vocal tract anatomy between heterosexual and homosexual speakers, nor did they reflect global feminization or masculinization of vowel production patterns in homosexual men and women, respectively. The vowel-specific differences observed could reflect social modeling processes in which only certain speech patterns of the opposite sex, or of same-sex homosexuals, are selectively adopted. However, we introduce an alternative biosocial hypothesis, specifically that the distinctive, vowel-specific features of homosexual speakers relative to heterosexual speakers arise incidentally as a product of broader psychobehavioral differences between the two groups that are, in turn, continuous with and flow from the physiological processes that affect sexual orientation to begin with.

  19. Schizophrenia alters intra-network functional connectivity in the caudate for detecting speech under informational speech masking conditions.

    PubMed

    Zheng, Yingjun; Wu, Chao; Li, Juanhua; Li, Ruikeng; Peng, Hongjun; She, Shenglin; Ning, Yuping; Li, Liang

    2018-04-04

    Speech recognition under noisy "cocktail-party" environments involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to healthy listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy participants (healthy controls). Moreover, in healthy controls, but not participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with the speech-detection performance under the speech-masking conditions. Compared to controls, patients showed an altered spatial activity pattern and decreased intra-network FC in the caudate. In people with schizophrenia, the decline in speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking by suppressing masking-speech signals.
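
    A minimal sketch of the intra-network FC measure at the center of this result, assuming the common definition as the averaged pairwise correlation of ROI time series. The ROI count and data are simulated placeholders; real pipelines add preprocessing and nuisance regression.

    ```python
    # Hedged sketch: intra-network FC as the mean pairwise correlation among
    # ROI time series (hypothetical caudate subregions; simulated data).
    import numpy as np

    def intra_network_fc(ts):
        """ts: (n_timepoints, n_rois) -> mean Fisher-z of off-diagonal correlations."""
        r = np.corrcoef(ts.T)                      # (n_rois, n_rois) correlation matrix
        iu = np.triu_indices_from(r, k=1)          # unique ROI pairs
        return float(np.arctanh(r[iu]).mean())     # Fisher z before averaging

    rng = np.random.default_rng(0)
    caudate_ts = rng.standard_normal((180, 4))     # 180 volumes, 4 caudate ROIs (hypothetical)
    print(intra_network_fc(caudate_ts))
    ```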

  20. Rapid change in articulatory lip movement induced by preceding auditory feedback during production of bilabial plosives.

    PubMed

    Mochida, Takemi; Gomi, Hiroaki; Kashino, Makio

    2010-11-08

    There has been plentiful evidence of kinesthetically induced rapid compensation for unanticipated perturbation in speech articulatory movements. However, the role of auditory information in stabilizing articulation has been little studied, except for the control of voice fundamental frequency, voice amplitude, and vowel formant frequencies. Although the influence of auditory information on the articulatory control process is evident in unintended speech errors caused by delayed auditory feedback, the direct and immediate effect of auditory alteration on the movements of articulators has not been clarified. This work examined whether temporal changes in the auditory feedback of bilabial plosives immediately affect the subsequent lip movement. We conducted experiments with an auditory feedback alteration system that enabled us to replace or block speech sounds in real time. Participants were asked to produce the syllable /pa/ repeatedly at a constant rate. During the repetition, normal auditory feedback was interrupted, and one of three pre-recorded syllables /pa/, /ɸa/, or /pi/, spoken by the same participant, was presented once at a different time from the anticipated production onset, while no feedback was presented for subsequent repetitions. Comparisons of the labial distance trajectories under altered and normal feedback conditions indicated that the movement quickened during the short period immediately after the alteration onset when /pa/ was presented 50 ms before the expected timing. This change was not significant under the other feedback conditions we tested. The earlier articulation rapidly induced by the temporally advanced auditory input suggests that a compensatory mechanism helps to maintain a constant speech rate by detecting errors between the internally predicted and the actually provided auditory information associated with self-movement. The timing- and context-dependent effects of feedback alteration suggest that this sensory error detection works in a temporally asymmetric window in which acoustic features of the syllable to be produced may be coded.

  1. The Acquisition of Standard English Speech Habits Using Second-Language Techniques: An Experiment in Speech Modification and Generalization in the Verbal Behavior of Prison Inmates.

    ERIC Educational Resources Information Center

    McKee, John M.; And Others

    Many people take for granted the use of language as a tool for coping with everyday occupational and social problems. However, there are those, such as prison inmates, who have difficulty using language in this manner. Realizing that prison inmates are not always able to communicate effectively through standard patterns of speech and thus are…

  2. A novel speech-processing strategy incorporating tonal information for cochlear implants.

    PubMed

    Lan, N; Nie, K B; Gao, S K; Zeng, F G

    2004-05-01

    Good performance in cochlear implant users depends in large part on the ability of a speech processor to effectively decompose speech signals into multiple channels of narrow-band electrical pulses for stimulation of the auditory nerve. Speech processors that extract only envelopes of the narrow-band signals (e.g., the continuous interleaved sampling (CIS) processor) may not provide sufficient information to encode the tonal cues in languages such as Chinese. To improve the performance in cochlear implant users who speak tonal languages, we proposed and developed a novel speech-processing strategy, which extracted both the envelopes of the narrow-band signals and the fundamental frequency (F0) of the speech signal, and used them to modulate both the amplitude and the frequency of the electrical pulses delivered to stimulation electrodes. We developed an algorithm to extract the fundamental frequency and identified the general patterns of pitch variations of four typical tones in Chinese speech. The effectiveness of the extraction algorithm was verified with an artificial neural network that recognized the tonal patterns from the extracted F0 information. We then compared the novel strategy with the envelope-extraction CIS strategy in human subjects with normal hearing. The novel strategy produced significant improvement in perception of Chinese tones, phrases, and sentences. This novel processor with dynamic modulation of both frequency and amplitude is encouraging for the design of a cochlear implant device for sensorineurally deaf patients who speak tonal languages.
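
    The abstract does not specify the authors' F0-extraction algorithm, so the following is only an illustrative sketch of one classical approach: autocorrelation peak-picking within a plausible pitch range, applied frame by frame.

    ```python
    # Hedged sketch: frame-wise F0 by autocorrelation peak-picking within a
    # plausible pitch range. Frame length and search range are illustrative.
    import numpy as np

    def estimate_f0(frame, fs, fmin=80.0, fmax=400.0):
        frame = frame - frame.mean()
        ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        lo, hi = int(fs / fmax), int(fs / fmin)    # lag search window
        lag = lo + np.argmax(ac[lo:hi])
        return fs / lag

    fs = 16000
    t = np.arange(int(0.04 * fs)) / fs             # one 40 ms frame
    frame = np.sign(np.sin(2 * np.pi * 220 * t))   # crude 220 Hz "voiced" frame
    print(estimate_f0(frame, fs))                  # close to 220 Hz
    ```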

  3. Testing the Agreement/Tense Omission Model: Why the Data on Children's Use of Non-Nominative 3psg Subjects Count against the ATOM

    ERIC Educational Resources Information Center

    Pine, Julian M.; Rowland, Caroline F.; Lieven, Elena V. M.; Theakston, Anna L.

    2005-01-01

    One of the most influential recent accounts of pronoun case-marking errors in young children's speech is Schutze & Wexler's (1996) Agreement/Tense Omission Model (ATOM). The ATOM predicts that the rate of agreeing verbs with non-nominative subjects will be so low that such errors can be reasonably disregarded as noise in the data. The present…

  4. Recurring errors among recent history of psychology textbooks.

    PubMed

    Thomas, Roger K

    2007-01-01

    Five recurring errors in history of psychology textbooks are discussed. One involves an identical misquotation. The remaining examples involve factual and interpretational errors that more than one and usually several textbook authors made. In at least 2 cases some facts were fabricated, namely, so-called facts associated with Pavlov's mugging and Descartes's reasons for choosing the pineal gland as the locus for mind-body interaction. A fourth example involves Broca's so-called discovery of the speech center, and the fifth example involves misinterpretations of Lloyd Morgan's intentions regarding his famous canon. When an error involves misinterpretation and thus misrepresentation, I will show why the misinterpretation is untenable.

  5. Engaged listeners: shared neural processing of powerful political speeches

    PubMed Central

    Häcker, Frank E. K.; Honey, Christopher J.; Hasson, Uri

    2015-01-01

    Powerful speeches can captivate audiences, whereas weaker speeches fail to engage their listeners. What is happening in the brains of a captivated audience? Here, we assess audience-wide functional brain dynamics during listening to speeches of varying rhetorical quality. The speeches were given by German politicians and evaluated as rhetorically powerful or weak. Listening to each of the speeches induced similar neural response time courses, as measured by inter-subject correlation analysis, in widespread brain regions involved in spoken language processing. Crucially, alignment of the time course across listeners was stronger for rhetorically powerful speeches, especially for bilateral regions of the superior temporal gyri and medial prefrontal cortex. Thus, during powerful speeches, listeners as a group are more coupled to each other, suggesting that powerful speeches are more potent in taking control of the listeners’ brain responses. Weaker speeches were processed more heterogeneously, although they still prompted substantially correlated responses. These patterns of coupled neural responses bear resemblance to metaphors of resonance, which are often invoked in discussions of speech impact, and contribute to the literature on auditory attention under natural circumstances. Overall, this approach opens up possibilities for research on the neural mechanisms mediating the reception of entertaining or persuasive messages. PMID:25653012
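
    A minimal sketch of the leave-one-out form of inter-subject correlation used in analyses like this one, on simulated data; real analyses run this per voxel or region on preprocessed BOLD time courses.

    ```python
    # Hedged sketch: leave-one-out inter-subject correlation for one region.
    # Simulated data: a shared "speech-driven" signal plus idiosyncratic noise.
    import numpy as np

    def isc(ts):
        """ts: (n_subjects, n_timepoints) response time courses for one region."""
        rs = []
        for s in range(ts.shape[0]):
            others = np.delete(ts, s, axis=0).mean(axis=0)  # average of all other listeners
            rs.append(np.corrcoef(ts[s], others)[0, 1])
        return float(np.mean(rs))

    rng = np.random.default_rng(1)
    shared = rng.standard_normal(300)                          # common stimulus-locked signal
    listeners = shared + 0.8 * rng.standard_normal((20, 300))  # 20 listeners + private noise
    print(isc(listeners))    # stronger shared drive -> higher ISC
    ```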

  6. Discrepant visual speech facilitates covert selective listening in "cocktail party" conditions.

    PubMed

    Williams, Jason A

    2012-06-01

    The presence of congruent visual speech information facilitates the identification of auditory speech, while the addition of incongruent visual speech information often impairs accuracy. This latter arrangement occurs naturally when one is being directly addressed in conversation but listens to a different speaker. Under these conditions, performance may diminish since: (a) one is bereft of the facilitative effects of the corresponding lip motion and (b) one becomes subject to visual distortion by incongruent visual speech; by contrast, speech intelligibility may be improved due to (c) bimodal localization of the central unattended stimulus. Participants were exposed to centrally presented visual and auditory speech while attending to a peripheral speech stream. In some trials, the lip movements of the central visual stimulus matched the unattended speech stream; in others, the lip movements matched the attended peripheral speech. Accuracy for the peripheral stimulus was nearly one standard deviation greater with incongruent visual information, compared to the congruent condition which provided bimodal pattern recognition cues. Likely, the bimodal localization of the central stimulus further differentiated the stimuli and thus facilitated intelligibility. Results are discussed with regard to similar findings in an investigation of the ventriloquist effect, and the relative strength of localization and speech cues in covert listening.

  7. Duration, Pitch, and Loudness in Kunqu Opera Stage Speech.

    PubMed

    Han, Qichao; Sundberg, Johan

    2017-03-01

    Kunqu is a special type of opera within the Chinese tradition with 600 years of history. In it, stage speech is used for the spoken dialogue. It is performed in the Ming Dynasty's Mandarin language and is a much more dominant part of the play than singing. Stage speech deviates considerably from normal conversational speech with respect to duration, loudness, and pitch. This paper compares these properties in stage speech and conversational speech. A famous, highly experienced female singer performed stage speech and read the same lyrics in a conversational speech mode. Clear differences were found. Compared with conversational speech, stage speech had longer word and sentence durations, and word duration was less variable. Average sound level was 16 dB higher. Mean fundamental frequency was also considerably higher and more varied. Within sentences, both loudness and fundamental frequency tended to vary according to a low-high-low pattern. Some of the findings fail to support current opinions regarding the characteristics of stage speech, and in this sense the study demonstrates the relevance of objective measurements in descriptions of vocal styles.

  8. Ethnography of Communication: Cultural Codes and Norms.

    ERIC Educational Resources Information Center

    Carbaugh, Donal

    The primary tasks of the ethnographic researcher are to discover, describe, and comparatively analyze different speech communities' ways of speaking. Two general abstractions occurring in ethnographic analyses are normative and cultural. Communicative norms are formulated in analyzing and explaining the "patterned use of speech."…

  9. Speech rhythm in Kannada speaking adults who stutter.

    PubMed

    Maruthy, Santosh; Venugopal, Sahana; Parakh, Priyanka

    2017-10-01

    A longstanding hypothesis about the underlying mechanisms of stuttering suggests that speech disfluencies may be associated with problems in timing and temporal patterning of speech events. Fifteen adults who do and do not stutter read five sentences, and from these, the vocalic and consonantal durations were measured. Using these, the pairwise variability index (raw PVI for consonantal intervals and normalised PVI for vocalic intervals) and interval-based rhythm metrics (PercV, DeltaC, DeltaV, VarcoC and VarcoV) were calculated for all the participants. Findings suggested higher mean values in adults who stutter when compared to adults who do not stutter for all the rhythm metrics except VarcoV. Further, a statistically significant difference between the two groups was found for all the rhythm metrics except VarcoV. Combining the present results with consistent prior findings based on rhythm deficits in children and adults who stutter, there appears to be strong empirical support for the hypothesis that individuals who stutter may have deficits in generation of rhythmic speech patterns.
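
    The rhythm metrics named above are simple functions of the measured interval durations. A minimal sketch, assuming durations in milliseconds and the standard formulations (nPVI/rPVI after Grabe & Low; %V and the Delta/Varco measures after Ramus et al. and Dellwo):

    ```python
    # Hedged sketch of the rhythm metrics from interval durations in ms.
    import numpy as np

    def raw_pvi(d):             # rPVI, typically for consonantal intervals
        d = np.asarray(d, float)
        return float(np.mean(np.abs(np.diff(d))))

    def normalised_pvi(d):      # nPVI, typically for vocalic intervals
        d = np.asarray(d, float)
        return float(100 * np.mean(np.abs(np.diff(d)) / ((d[:-1] + d[1:]) / 2)))

    def interval_metrics(vowels, consonants):
        v, c = np.asarray(vowels, float), np.asarray(consonants, float)
        return {
            "PercV": 100 * v.sum() / (v.sum() + c.sum()),   # % vocalic duration
            "DeltaV": v.std(), "DeltaC": c.std(),           # raw variability
            "VarcoV": 100 * v.std() / v.mean(),             # rate-normalized
            "VarcoC": 100 * c.std() / c.mean(),
        }

    print(normalised_pvi([80, 120, 95, 140]))    # vocalic nPVI for one utterance
    ```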

  10. Now you hear it, now you don't: vowel devoicing in Japanese infant-directed speech.

    PubMed

    Fais, Laurel; Kajikawa, Sachiyo; Amano, Shigeaki; Werker, Janet F

    2010-03-01

    In this work, we examine a context in which a conflict arises between two roles that infant-directed speech (IDS) plays: making language structure salient and modeling the adult form of a language. Vowel devoicing in fluent adult Japanese creates violations of the canonical Japanese consonant-vowel word structure pattern by systematically devoicing particular vowels, yielding surface consonant clusters. We measured vowel devoicing rates in a corpus of infant- and adult-directed Japanese speech, for both read and spontaneous speech, and found that the mothers in our study preserve the fluent adult form of the language and mask underlying phonological structure by devoicing vowels in infant-directed speech at virtually the same rates as those for adult-directed speech. The results highlight the complex interrelationships among the modifications to adult speech that comprise infant-directed speech, and that form the input from which infants begin to build the eventual mature form of their native language.

  11. Hearing impaired speech in noisy classrooms

    NASA Astrophysics Data System (ADS)

    Shahin, Kimary; McKellin, William H.; Jamieson, Janet; Hodgson, Murray; Pichora-Fuller, M. Kathleen

    2005-04-01

    Noisy classrooms have been shown to induce among students patterns of interaction similar to those used by hearing impaired people [W. H. McKellin et al., GURT (2003)]. In this research, the speech of children in a noisy classroom setting was investigated to determine if noisy classrooms have an effect on students' speech. Audio recordings were made of the speech of students during group work in their regular classrooms (grades 1-7), and of the speech of the same students in a sound booth. Noise level readings in the classrooms were also recorded. Each student's noisy and quiet environment speech samples were acoustically analyzed for prosodic and segmental properties (f0, pitch range, pitch variation, phoneme duration, vowel formants), and compared. The analysis showed that the students' speech in the noisy classrooms had characteristics of the speech of hearing-impaired persons [e.g., R. O'Halpin, Clin. Ling. and Phon. 15, 529-550 (2001)]. Some educational implications of our findings were identified. [Work supported by the Peter Wall Institute for Advanced Studies, University of British Columbia.]

  12. A Comparative Analysis of Fluent and Cerebral Palsied Speech.

    NASA Astrophysics Data System (ADS)

    van Doorn, Janis Lee

    Several features of the acoustic waveforms of fluent and cerebral palsied speech were compared, using six fluent and seven cerebral palsied subjects, with a major emphasis being placed on an investigation of the trajectories of the first three formants (vocal tract resonances). To provide an overall picture which included other acoustic features, fundamental frequency, intensity, speech timing (speech rate and syllable duration), and prevocalization (vocalization prior to initial stop consonants found in cerebral palsied speech) were also investigated. Measurements were made using repetitions of a test sentence which was chosen because it required large excursions of the speech articulators (lips, tongue and jaw), so that differences in the formant trajectories for the fluent and cerebral palsied speakers would be emphasized. The acoustic features were all extracted from the digitized speech waveform (10 kHz sampling rate): the fundamental frequency contours were derived manually, the intensity contours were measured using the signal covariance, speech rate and syllable durations were measured manually, as were the prevocalization durations, while the formant trajectories were derived from short time spectra which were calculated for each 10 ms of speech using linear prediction analysis. Differences which were found in the acoustic features can be summarized as follows. For cerebral palsied speakers, the fundamental frequency contours generally showed inappropriate exaggerated fluctuations, as did some of the intensity contours; the mean fundamental frequencies were either higher or the same as for the fluent subjects; speech rates were reduced, and syllable durations were longer; prevocalization was consistently present at the beginning of the test sentence; formant trajectories were found to have overall reduced frequency ranges, and to contain anomalous transitional features, but it is noteworthy that for any one cerebral palsied subject, the inappropriate trajectory pattern was generally reproducible. The anomalous transitional features took the form of (a) inappropriate transition patterns, (b) reduced frequency excursions, (c) increased transition durations, and (d) decreased maximum rates of frequency change.
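
    Formant trajectories of the kind described above are commonly derived from the roots of the LPC polynomial, evaluated frame by frame (here, the abstract's 10 kHz rate and 10 ms analysis step would mean calling the function once per 10 ms hop). A minimal sketch, assuming librosa for the LPC step; real trackers also reject roots with wide bandwidths, which is omitted here.

    ```python
    # Hedged sketch: formant candidates from the LPC polynomial roots.
    import numpy as np
    import librosa

    def formant_candidates(frame, fs=10000, order=10):
        a = librosa.lpc(frame * np.hamming(len(frame)), order=order)
        roots = [r for r in np.roots(a) if r.imag > 0]     # one root per conjugate pair
        freqs = np.sort(np.angle(roots) * fs / (2 * np.pi))
        return freqs[:3]                                    # F1-F3 candidates

    fs = 10000
    rng = np.random.default_rng(0)
    t = np.arange(250) / fs                                 # one 25 ms frame
    frame = (np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)
             + 0.01 * rng.standard_normal(t.size))          # toy "vowel" with two resonances
    print(formant_candidates(frame, fs))                    # peaks near 700 and 1200 Hz
    ```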

  13. Perceptual Learning and Auditory Training in Cochlear Implant Recipients

    PubMed Central

    Fu, Qian-Jie; Galvin, John J.

    2007-01-01

    Learning electrically stimulated speech patterns can be a new and difficult experience for cochlear implant (CI) recipients. Recent studies have shown that most implant recipients at least partially adapt to these new patterns via passive, daily-listening experiences. Gradually introducing a speech processor parameter (eg, the degree of spectral mismatch) may provide for more complete and less stressful adaptation. Although the implant device restores hearing sensation and the continued use of the implant provides some degree of adaptation, active auditory rehabilitation may be necessary to maximize the benefit of implantation for CI recipients. Currently, there are scant resources for auditory rehabilitation for adult, postlingually deafened CI recipients. We recently developed a computer-assisted speech-training program to provide the means to conduct auditory rehabilitation at home. The training software targets important acoustic contrasts among speech stimuli, provides auditory and visual feedback, and incorporates progressive training techniques, thereby maintaining recipients’ interest during the auditory training exercises. Our recent studies demonstrate the effectiveness of targeted auditory training in improving CI recipients’ speech and music perception. Provided with an inexpensive and effective auditory training program, CI recipients may find the motivation and momentum to get the most from the implant device. PMID:17709574

  14. Speech Enhancement, Gain, and Noise Spectrum Adaptation Using Approximate Bayesian Estimation

    PubMed Central

    Hao, Jiucang; Attias, Hagai; Nagarajan, Srikantan; Lee, Te-Won; Sejnowski, Terrence J.

    2010-01-01

    This paper presents a new approximate Bayesian estimator for enhancing a noisy speech signal. The speech model is assumed to be a Gaussian mixture model (GMM) in the log-spectral domain, in contrast to most current models, which work in the frequency domain. Exact signal estimation is a computationally intractable problem. We derive three approximations to enhance the efficiency of signal estimation. The Gaussian approximation transforms the log-spectral domain GMM into the frequency domain using a minimum Kullback–Leibler (KL) divergence criterion. The frequency-domain Laplace method computes the maximum a posteriori (MAP) estimator for the spectral amplitude; correspondingly, the log-spectral-domain Laplace method computes the MAP estimator for the log-spectral amplitude. Further, gain and noise spectrum adaptation are implemented using the expectation–maximization (EM) algorithm within the GMM under the Gaussian approximation. The proposed algorithms are evaluated by applying them to enhance speech corrupted by speech-shaped noise (SSN). The experimental results demonstrate that the proposed algorithms offer improved signal-to-noise ratio, lower word recognition error rate, and less spectral distortion. PMID:20428253
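
    The modeling core (a GMM over log-spectra) is easy to sketch. The example below, assuming scikit-learn and random placeholder data, fits the mixture and forms a posterior-weighted estimate of a clean log-spectrum from a noisy frame; the paper's KL-based Gaussian transform, Laplace-method MAP estimators, and EM gain/noise adaptation are not reproduced.

    ```python
    # Hedged sketch of the modeling core only: a GMM over log-spectra, with a
    # posterior-weighted mean as a crude clean-speech estimate.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    clean_logspec = rng.standard_normal((2000, 64))      # training frames (placeholder)
    gmm = GaussianMixture(n_components=8, covariance_type="diag",
                          random_state=0).fit(clean_logspec)

    noisy_frame = clean_logspec[0] + 0.5 * rng.standard_normal(64)
    post = gmm.predict_proba(noisy_frame[None, :])[0]    # component responsibilities
    estimate = post @ gmm.means_                         # posterior-weighted clean estimate
    ```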

  15. Modifying Speech to Children based on their Perceived Phonetic Accuracy

    PubMed Central

    Julien, Hannah M.; Munson, Benjamin

    2014-01-01

    Purpose We examined the relationship between adults' perception of the accuracy of children's speech and the acoustic detail in their subsequent productions to children. Methods Twenty-two adults participated in a task in which they rated the accuracy of 2- and 3-year-old children's word-initial /s/ and /ʃ/ using a visual analog scale (VAS), then produced a token of the same word as if they were responding to the child whose speech they had just rated. Results The duration of adults' fricatives varied as a function of their perception of the accuracy of children's speech: longer fricatives were produced following productions that they rated as inaccurate. This tendency to modify duration in response to perceived inaccurate tokens was mediated by measures of self-reported experience interacting with children. However, speakers did not increase the spectral distinctiveness of their fricatives following the perception of inaccurate tokens. Conclusions These results suggest that adults modify temporal features of their speech in response to perceiving children's inaccurate productions. These longer fricatives are potentially both enhanced input to children and an error-corrective signal. PMID:22744140

  16. "… Trial and error …": Speech-language pathologists' perspectives of working with Indigenous Australian adults with acquired communication disorders.

    PubMed

    Cochrane, Frances Clare; Brown, Louise; Siyambalapitiya, Samantha; Plant, Christopher

    2016-10-01

    This study explored speech-language pathologists' (SLPs) perspectives about factors that influence clinical management of Aboriginal and Torres Strait Islander adults with acquired communication disorders (e.g. aphasia, motor speech disorders). Using a qualitative phenomenological approach, seven SLPs working in North Queensland, Australia with experience working with this population participated in semi-structured in-depth interviews. Qualitative content analysis was used to identify categories and overarching themes within the data. Four categories, in relation to barriers and facilitators, were identified from participants' responses: (1) The Practice Context; (2) Working Together; (3) Client Factors; and (4) Speech-Language Pathologist Factors. Three overarching themes were also found to influence effective speech pathology services: (1) Aboriginal and Torres Strait Islander Cultural Practices; (2) Information and Communication; and (3) Time. This study identified many complex and inter-related factors which influenced SLPs' effective clinical management of this caseload. The findings suggest that SLPs should employ a flexible, holistic and collaborative approach in order to facilitate effective clinical management with Aboriginal and Torres Strait Islander people with acquired communication disorders.

  17. Aging and the Vulnerability of Speech to Dual Task Demands

    PubMed Central

    Kemper, Susan; Schmalzried, RaLynn; Hoffman, Lesa; Herman, Ruth

    2010-01-01

    Tracking a digital pursuit rotor task was used to measure dual task costs of language production by young and older adults. Tracking performance by both groups was affected by dual task demands: time on target declined and tracking error increased as dual task demands increased from the baseline condition to a moderately demanding dual task condition to a more demanding dual task condition. When dual task demands were moderate, older adults’ speech rate declined but their fluency, grammatical complexity, and content were unaffected. When the dual task was more demanding, older adults’ speech, like young adults’ speech, became highly fragmented, ungrammatical, and incoherent. Vocabulary, working memory, processing speed, and inhibition affected vulnerability to dual task costs: vocabulary provided some protection for sentence length and grammaticality, working memory conferred some protection for grammatical complexity, and processing speed provided some protection for speech rate, propositional density, coherence, and lexical diversity. Further, vocabulary and working memory capacity provided more protection for older adults than for young adults although the protective effect of processing speed was somewhat reduced for older adults as compared to the young adults. PMID:21186917

  18. Effects of a cochlear implant simulation on immediate memory in normal-hearing adults

    PubMed Central

    Burkholder, Rose A.; Pisoni, David B.; Svirsky, Mario A.

    2012-01-01

    This study assessed the effects of stimulus misidentification and memory processing errors on immediate memory span in 25 normal-hearing adults exposed to degraded auditory input simulating signals provided by a cochlear implant. The identification accuracy of degraded digits in isolation was measured before digit span testing. Forward and backward digit spans were shorter when digits were degraded than when they were normal. Participants’ normal digit spans and their accuracy in identifying isolated digits were used to predict digit spans in the degraded speech condition. The observed digit spans in degraded conditions did not differ significantly from predicted digit spans. This suggests that the decrease in memory span is related primarily to misidentification of digits rather than to memory processing errors related to cognitive load. These findings provide complementary information to earlier research on auditory memory span of listeners exposed to degraded speech either experimentally or as a consequence of a hearing impairment. PMID:16317807

  19. Primary Progressive Speech Abulia.

    PubMed

    Milano, Nicholas J; Heilman, Kenneth M

    2015-01-01

    Primary progressive aphasia (PPA) is a neurodegenerative disorder characterized by progressive language impairment. The three variants of PPA are the nonfluent/agrammatic, semantic, and logopenic types. The goal of this report is to describe two patients with a loss of speech initiation that was associated with bilateral medial frontal atrophy. Two patients with progressive speech deficits were evaluated, and their examinations revealed a paucity of spontaneous speech; however, their naming, repetition, reading, and writing were all normal. The patients had no evidence of agrammatism or apraxia of speech but did have impaired speech fluency. In addition to impaired production of propositional spontaneous speech, these patients had impaired production of automatic speech (e.g., reciting the Lord's Prayer) and singing. Structural brain imaging revealed bilateral medial frontal atrophy in both patients. These patients' language deficits are consistent with a PPA, but in the pattern of a dynamic aphasia. Whereas the signs and symptoms of dynamic aphasia have been described previously, to our knowledge these are the first cases associated with predominantly bilateral medial frontal atrophy impairing both propositional and automatic speech. Thus, this profile may represent a new variant of PPA.

  20. Emotion to emotion speech conversion in phoneme level

    NASA Astrophysics Data System (ADS)

    Bulut, Murtaza; Yildirim, Serdar; Busso, Carlos; Lee, Chul Min; Kazemzadeh, Ebrahim; Lee, Sungbok; Narayanan, Shrikanth

    2004-10-01

    Having the ability to synthesize emotional speech can make human-machine interaction more natural in spoken dialogue management. This study investigates the effectiveness of prosodic and spectral modification at the phoneme level for emotion-to-emotion speech conversion. The prosody modification is performed with the TD-PSOLA algorithm (Moulines and Charpentier, 1990). We also transform the spectral envelopes of source phonemes to match those of target phonemes using an LPC-based spectral transformation approach (Kain, 2001). Prosodic speech parameters (F0, duration, and energy) for target phonemes are estimated from statistics obtained from the analysis of an emotional speech database of happy, angry, sad, and neutral utterances collected from actors. Listening experiments conducted with native American English speakers indicate that modification of prosody only or spectrum only is not sufficient to elicit the targeted emotions. Simultaneous modification of both prosody and spectrum results in higher acceptance rates for the target emotions, suggesting that modeling spectral patterns reflecting the underlying articulation is as important as modeling prosody for synthesizing emotional speech of good quality. We are investigating suprasegmental-level modifications for further improvement in speech quality and expressiveness.
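
    TD-PSOLA itself is not available in common Python libraries, but the overall recipe (scale F0, duration, and energy toward target-emotion statistics) can be sketched with librosa's phase-vocoder tools as a stand-in. The scaling factors below are invented placeholders, not values from the emotional database.

    ```python
    # Hedged sketch: prosody-only modification with librosa's phase-vocoder
    # tools as a stand-in for TD-PSOLA. Scaling factors are hypothetical.
    import numpy as np
    import librosa

    y, sr = librosa.load(librosa.example("trumpet"), sr=None)   # stand-in signal
    f0_ratio, dur_ratio, gain = 1.25, 0.9, 1.4                  # hypothetical "happy" targets

    n_steps = 12 * np.log2(f0_ratio)                            # F0 ratio -> semitones
    y_mod = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
    y_mod = librosa.effects.time_stretch(y_mod, rate=1.0 / dur_ratio)
    y_mod = gain * y_mod                                        # energy scaling
    ```

    Unlike TD-PSOLA, a phase vocoder operates on whole signals rather than pitch-synchronous segments, so the phoneme-level control described in the abstract would additionally require segmenting the signal before applying such transformations.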
