Prior Knowledge Guides Speech Segregation in Human Auditory Cortex.
Wang, Yuanye; Zhang, Jianfeng; Zou, Jiajie; Luo, Huan; Ding, Nai
2018-05-18
Segregating concurrent sound streams is a computationally challenging task that requires integrating bottom-up acoustic cues (e.g. pitch) and top-down prior knowledge about sound streams. In a multi-talker environment, the brain can segregate different speakers in about 100 ms in auditory cortex. Here, we used magnetoencephalographic (MEG) recordings to investigate the temporal and spatial signature of how the brain utilizes prior knowledge to segregate 2 speech streams from the same speaker, which can hardly be separated based on bottom-up acoustic cues. In a primed condition, the participants know the target speech stream in advance, while in an unprimed condition no such prior knowledge is available. Neural encoding of each speech stream is characterized by the MEG responses tracking the speech envelope. We demonstrate that differential neural tracking of the target and non-target streams in bilateral superior temporal gyrus and superior temporal sulcus is much stronger in the primed condition than in the unprimed condition. Priming effects are observed at about 100 ms latency and last more than 600 ms. Interestingly, prior knowledge about the target stream facilitates speech segregation mainly by suppressing the neural tracking of the non-target speech stream. In sum, prior knowledge leads to reliable speech segregation in auditory cortex, even in the absence of reliable bottom-up speech segregation cues.
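Envelope-tracking analyses of the kind described in this abstract can be illustrated with a toy computation (a minimal sketch on simulated data, not the authors' MEG pipeline; every signal parameter below is invented for illustration): extract a slow envelope from a modulated "speech" waveform by rectifying and bin-averaging, then cross-correlate it with a simulated, delayed cortical response to estimate tracking strength and latency.

```python
import numpy as np

rng = np.random.default_rng(0)
fs_audio, fs_env = 8000, 100            # audio rate and envelope-domain rate (Hz)
t = np.arange(0, 10, 1 / fs_audio)

# Toy "speech": a noise carrier modulated by a slow, aperiodic envelope
slow = 3 + np.sin(2 * np.pi * 3.3 * t) + np.sin(2 * np.pi * 4.7 * t) \
         + np.sin(2 * np.pi * 6.1 * t)
speech = slow * rng.standard_normal(t.size)

# Envelope extraction: rectify, then average within 10-ms bins (crude but serviceable)
env = np.abs(speech).reshape(-1, fs_audio // fs_env).mean(axis=1)

# Simulated cortical response: the envelope, delayed by 100 ms, plus noise
delay = int(0.1 * fs_env)
meg = np.roll(env, delay) + 0.5 * rng.standard_normal(env.size)

# Cross-correlate at candidate lags; the peak lag estimates the tracking latency
lags = np.arange(0, int(0.4 * fs_env))
r = [np.corrcoef(env[: -lag or None], meg[lag:])[0, 1] for lag in lags]
best = int(lags[np.argmax(r)])
print(f"estimated tracking latency: {best / fs_env * 1000:.0f} ms")
```

With the simulated 100-ms delay, the correlation peaks near a 100-ms lag; published MEG envelope-tracking analyses typically use regularized regression (temporal response functions) rather than raw cross-correlation.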
Getzmann, Stephan; Näätänen, Risto
2015-11-01
With age, the ability to understand speech in multitalker environments usually deteriorates. The central auditory system has to perceptually segregate and group the acoustic input into sequences of distinct auditory objects. The present study used electrophysiological measures to study effects of age on auditory stream segregation in a multitalker scenario. Younger and older adults were presented with streams of short speech stimuli. When a single target stream was presented, the occurrence of a rare (deviant) syllable among frequent (standard) syllables elicited the mismatch negativity (MMN), an electrophysiological correlate of automatic deviance detection. The presence of a second, concurrent stream consisting of the deviant syllable of the target stream reduced the MMN amplitude, especially when located near the target stream. The decrease in MMN amplitude indicates that the rare syllable of the target stream was perceived as less deviant, suggesting reduced stream segregation with decreasing stream distance. Moreover, the presence of a concurrent stream increased the MMN peak latency of the older group but not that of the younger group. The results provide neurophysiological evidence for the effects of concurrent speech on auditory processing in older adults, suggesting that older adults need more time for stream segregation in the presence of concurrent speech.
Auditory stream segregation in children with Asperger syndrome
Lepistö, T.; Kuitunen, A.; Sussman, E.; Saalasti, S.; Jansson-Verkasalo, E.; Nieminen-von Wendt, T.; Kujala, T.
2009-01-01
Individuals with Asperger syndrome (AS) often have difficulties in perceiving speech in noisy environments. The present study investigated whether this might be explained by deficient auditory stream segregation ability, that is, by a more basic difficulty in separating simultaneous sound sources from each other. To this end, auditory event-related brain potentials (ERPs) were recorded from a group of school-aged children with AS and a group of age-matched controls using a paradigm specifically developed for studying stream segregation. Differences in the amplitudes of ERP components were found between groups only in the stream segregation conditions and not for simple feature discrimination. The results indicated that children with AS have difficulties in segregating concurrent sound streams, which ultimately may contribute to their difficulties in speech-in-noise perception. PMID:19751798
Nawaz, Tabassam; Mehmood, Zahid; Rashid, Muhammad; Habib, Hafiz Adnan
2018-01-01
Recent research on speech segregation and music fingerprinting has led to improvements in speech segregation and music identification algorithms. Speech and music segregation generally involves the identification of music followed by speech segregation. However, music segregation becomes a challenging task in the presence of noise. This paper proposes a novel method of speech segregation for unlabelled stationary noisy audio signals using the deep belief network (DBN) model. The proposed method successfully segregates a music signal from noisy audio streams. A recurrent neural network (RNN)-based hidden-layer segregation model is applied to remove stationary noise. Dictionary-based Fisher algorithms are employed for speech classification. The proposed method is tested on three datasets (TIMIT, MIR-1K, and MusicBrainz), and the results indicate the robustness of the proposed method for speech segregation. The qualitative and quantitative analyses carried out on the three datasets demonstrate the efficiency of the proposed method compared to state-of-the-art speech segregation and classification-based methods. PMID:29558485
Sound stream segregation: a neuromorphic approach to solve the “cocktail party problem” in real-time
Thakur, Chetan Singh; Wang, Runchun M.; Afshar, Saeed; Hamilton, Tara J.; Tapson, Jonathan C.; Shamma, Shihab A.; van Schaik, André
2015-01-01
The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the “cocktail party effect.” It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). 
This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation and speech recognition. PMID:26388721
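The temporal-coherence rule described in this entry (channels whose responses are strongly positively correlated belong to the same stream) can be sketched as a toy computation; this is an illustrative simplification with made-up channel envelopes, not the FPGA implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
fs, dur = 100, 5.0
t = np.arange(0, dur, 1 / fs)

# Two sources with independent slow modulations (3 Hz vs 5 Hz)
mod_a = 1 + np.sin(2 * np.pi * 3 * t)
mod_b = 1 + np.sin(2 * np.pi * 5 * t + 1.0)

# Eight "cochlear channels": 0-3 driven by source A, 4-7 by source B, plus noise
channels = np.vstack(
    [mod_a * (1 + 0.1 * rng.standard_normal(t.size)) for _ in range(4)]
    + [mod_b * (1 + 0.1 * rng.standard_normal(t.size)) for _ in range(4)]
)

# Pairwise envelope correlations: high within a source, near zero across sources
C = np.corrcoef(channels)

# "Attention" to channel 0 selects its coherently modulated partners -> binary mask
attended = 0
mask = C[attended] > 0.5
print("channels grouped with the attended one:", np.nonzero(mask)[0])
```

Attending to a different channel (e.g., `attended = 4`) selects the other source's channels instead; per the abstract, the full system additionally derives the envelopes via a neuromorphic cochlea and modulation filters, and applies the resulting mask to reconstruct the target sound.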
Intelligibility of Target Signals in Sequential and Simultaneous Segregation Tasks
2009-03-01
Subject terms: informational masking; energetic masking; multimasker penalty; speech perception.
Sequential stream segregation in normally-hearing and cochlear-implant listeners
Tejani, Viral D.; Schvartz-Leyzac, Kara C.; Chatterjee, Monita
2017-01-01
Sequential stream segregation by normal hearing (NH) and cochlear implant (CI) listeners was investigated using an irregular rhythm detection (IRD) task. Pure tones and narrowband noises of different bandwidths were presented monaurally to older and younger NH listeners via headphones. For CI users, stimuli were delivered as pure tones via soundfield and via direct electrical stimulation. Results confirmed that tonal pitch is not essential for stream segregation by NH listeners and that aging does not reduce NH listeners' stream segregation. CI listeners' stream segregation was significantly poorer than NH listeners' with pure tone stimuli. With direct stimulation, however, CI listeners showed significantly stronger stream segregation, with a mean normalized pattern similar to NH listeners, implying that the CI speech processors possibly degraded acoustic cues. CI listeners' performance on an electrode discrimination task indicated that cues that are salient enough to make two electrodes highly discriminable may not be sufficiently salient for stream segregation, and that gap detection/discrimination, which must depend on perceptual electrode differences, did not play a role in the IRD task. Although the IRD task does not encompass all aspects of full stream segregation, these results suggest that some CI listeners may demonstrate aspects of stream segregation. PMID:28147600
Integration and segregation in auditory scene analysis
NASA Astrophysics Data System (ADS)
Sussman, Elyse S.
2005-03-01
Assessment of the neural correlates of auditory scene analysis has used an index of sound-change detection that does not require the listener to attend to the sounds [a component of event-related brain potentials called the mismatch negativity (MMN)]. Previous work with this index demonstrated that segregation processes can occur without attention focused on the sounds and that within-stream contextual factors influence how sound elements are integrated and represented in auditory memory. The current study investigated the relationship between the segregation and integration processes when they were called upon to function together. The pattern of MMN results showed that the integration of sound elements within a sound stream occurred after the segregation of sounds into independent streams and, further, that the individual streams were subject to contextual effects. These results are consistent with a view of auditory processing in which the auditory scene is rapidly organized into distinct streams and the integration of sequential elements into perceptual units takes place on the already formed streams. This would allow for the flexibility required to identify changing within-stream sound patterns, needed to appreciate music or comprehend speech.
Some components of the "cocktail-party effect," as revealed when it fails
NASA Astrophysics Data System (ADS)
Divenyi, Pierre L.; Gygi, Brian
2003-04-01
The precise way listeners cope with cocktail-party situations, i.e., understand speech in the midst of other, simultaneously ongoing conversations, has by and large remained a puzzle, despite research committed to studying the problem over the past half century. In contrast, it is widely acknowledged that the cocktail-party effect (CPE) deteriorates in aging. Our investigations during the last decade have assessed the deterioration of the CPE in elderly listeners and attempted to uncover specific auditory tasks on which the performance of the same listeners also exhibits a deficit. Correlated performance on the CPE and such auditory tasks arguably signifies that the tasks in question are necessary for perceptual segregation of the target speech and the background babble. We will present results on three tasks correlated with CPE performance. All three tasks require temporal-processing-based perceptual segregation of specific non-speech stimuli (amplitude- and/or frequency-modulated sinusoidal complexes): discrimination of formant transition patterns, segregation of streams with different syllabic rhythms, and selective attention to AM or FM features in the designated stream. [Work supported by a grant from the National Institute on Aging and by V.A. Medical Research.]
Sutojo, Sarinah; van de Par, Steven; Schoenmaker, Esther
2018-06-01
In situations with competing talkers or in the presence of masking noise, speech intelligibility can be improved by spatially separating the target speaker from the interferers. This advantage is generally referred to as spatial release from masking (SRM), and different mechanisms have been suggested to explain it. One proposed mechanism to benefit from spatial cues is binaural masking release, which is purely stimulus driven. According to this mechanism, the spatial benefit results from differences in the binaural cues of target and masker, which need to appear simultaneously in time and frequency to improve signal detection. In an alternative proposed mechanism, the differences in the interaural cues improve the segregation of auditory streams, a process that involves top-down processing rather than being purely stimulus driven. Unlike the cues that produce binaural masking release, the interaural cue differences between target and interferer required to improve stream segregation do not have to appear simultaneously in time and frequency. This study is concerned with the contribution of binaural masking release to SRM for three masker types that differ with respect to the amount of energetic masking they exert. Speech intelligibility was measured, employing a stimulus manipulation that inhibits binaural masking release, and analyzed with a metric to account for the number of better-ear glimpses. Results indicate that the contribution of the stimulus-driven binaural masking release plays a minor role, while binaural stream segregation and the availability of glimpses in the better ear had a stronger influence on improving speech intelligibility.
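A better-ear glimpsing metric of the general kind mentioned in this abstract can be sketched as follows (a schematic toy with random spectrograms and invented level differences, not the specific metric used in the study): compute a local SNR per time-frequency tile at each ear, take the better ear per tile, and count tiles whose better-ear SNR exceeds a criterion as glimpses.

```python
import numpy as np

rng = np.random.default_rng(2)
n_time, n_freq = 50, 16  # time frames x frequency bands

# Toy log-power spectrograms (dB) for a frontal target and two maskers,
# one masker per side; a 6-dB interaural level difference stands in for head shadow.
target = 10 * rng.standard_normal((n_time, n_freq))       # same at both ears
m1_left = 10 * rng.standard_normal((n_time, n_freq))      # masker 1, left side
m1_right = m1_left - 6
m2_right = 10 * rng.standard_normal((n_time, n_freq))     # masker 2, right side
m2_left = m2_right - 6

def db_sum(a_db, b_db):
    """Combine two masker powers expressed in dB."""
    return 10 * np.log10(10 ** (a_db / 10) + 10 ** (b_db / 10))

snr_l = target - db_sum(m1_left, m2_left)    # local SNR per tile, left ear
snr_r = target - db_sum(m1_right, m2_right)  # local SNR per tile, right ear

# A tile is a "glimpse" if the better ear's local SNR exceeds the criterion
better_ear_snr = np.maximum(snr_l, snr_r)
glimpses = better_ear_snr > -5.0             # criterion in dB (an assumed value)
proportion = glimpses.mean()
print(f"proportion of better-ear glimpses: {proportion:.2f}")
```

Because the better ear is chosen tile by tile, the glimpse count is never lower than what either ear achieves alone, which is the sense in which such metrics quantify a better-ear advantage.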
Attentional Gain Control of Ongoing Cortical Speech Representations in a “Cocktail Party”
Kerlin, Jess R.; Shahin, Antoine J.; Miller, Lee M.
2010-01-01
Normal listeners possess the remarkable perceptual ability to select a single speech stream among many competing talkers. However, few studies of selective attention have addressed the unique nature of speech as a temporally extended and complex auditory object. We hypothesized that sustained selective attention to speech in a multi-talker environment would act as gain control on the early auditory cortical representations of speech. Using high-density electroencephalography and a template-matching analysis method, we found selective gain to the continuous speech content of an attended talker, greatest at a frequency of 4–8 Hz, in auditory cortex. In addition, the difference in alpha power (8–12 Hz) at parietal sites across hemispheres indicated the direction of auditory attention to speech, as has been previously found in visual tasks. The strength of this hemispheric alpha lateralization, in turn, predicted an individual’s attentional gain of the cortical speech signal. These results support a model of spatial speech stream segregation, mediated by a supramodal attention mechanism, enabling selection of the attended representation in auditory cortex. PMID:20071526
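The template-matching idea described above (correlating the ongoing cortical signal with each talker's speech envelope and reading attention off the gain difference) can be sketched with simulated signals; all names and parameters below are invented for illustration, and the actual study used high-density EEG rather than this single noisy channel:

```python
import numpy as np

rng = np.random.default_rng(3)
fs, dur = 64, 30.0
t = np.arange(0, dur, 1 / fs)

def slow_envelope(seed):
    """Band-limited (roughly 4-8 Hz) random fluctuation standing in for a speech envelope."""
    x = np.random.default_rng(seed).standard_normal(t.size)
    spec = np.fft.rfft(x)
    f = np.fft.rfftfreq(t.size, 1 / fs)
    spec[(f < 4) | (f > 8)] = 0  # keep only the 4-8 Hz band
    return np.fft.irfft(spec, t.size)

env_attended = slow_envelope(10)
env_ignored = slow_envelope(11)

# Simulated cortical signal: the attended envelope carries higher gain, plus noise
eeg = 1.0 * env_attended + 0.3 * env_ignored + 1.0 * rng.standard_normal(t.size)

# Template matching: correlate the recording with each talker's envelope
r_att = np.corrcoef(eeg, env_attended)[0, 1]
r_ign = np.corrcoef(eeg, env_ignored)[0, 1]
decoded = "talker A" if r_att > r_ign else "talker B"
print(f"r_attended={r_att:.2f}, r_ignored={r_ign:.2f} -> attended: {decoded}")
```

The sketch covers only the envelope-gain part of the result; the parietal alpha-lateralization measure in the abstract is a separate analysis.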
Humes, Larry E.; Kidd, Gary R.; Lentz, Jennifer J.
2013-01-01
This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group of older adults consisted of 98 adults (50 female and 48 male) ranging in age from 60 to 86 (mean = 69.2). Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures), psychophysical (17 measures), and speech-understanding (9 measures), as well as the Speech, Spatial, and Qualities of Hearing (SSQ) self-report scale. Most of the speech-understanding measures made use of competing speech and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference). All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech understanding) measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection) as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI), and performance on the text-recognition-threshold (TRT) task (a visual analog of interrupted speech recognition). These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance. 
PMID:24098273
NASA Astrophysics Data System (ADS)
Newman, Rochelle S.
2003-04-01
Most work on listeners' ability to separate streams of speech has focused on adults. Yet infants also find themselves in noisy environments. In order to learn from their caregivers' speech in these settings, they must first separate it from background noise such as that from television shows and siblings. Previous work has found that 7.5-month-old infants can separate streams of speech when the target voice is more intense than the distractor voice (Newman and Jusczyk, 1996), when the target voice is known to the infant (Barker and Newman, 2000), or when infants are presented with an audiovisual (rather than auditory-only) signal (Hollich, Jusczyk, and Newman, 2001). Unfortunately, the paradigm in these studies can only be used on infants at least 7.5 months of age, limiting the ability to investigate how stream segregation develops over time. The present work uses a new paradigm to explore younger infants' ability to separate streams of speech. Infants aged 4.5 months heard a female talker repeat either their own name or another infant's name, while several other voices spoke fluently in the background. We present data on infants' ability to recognize their own name in this cocktail-party situation. [Work supported by NSF and NICHD.]
Monaural Speech Segregation by Integrating Primitive and Schema-Based Analysis
2008-02-03
Wang, DeLiang (Principal Investigator), Department of Computer Science & Engineering and Center for Cognitive Science
Neural Systems Involved When Attending to a Speaker
Kamourieh, Salwa; Braga, Rodrigo M.; Leech, Robert; Newbould, Rexford D.; Malhotra, Paresh; Wise, Richard J. S.
2015-01-01
Remembering what a speaker said depends on attention. During conversational speech, the emphasis is on working memory, but listening to a lecture encourages episodic memory encoding. With simultaneous interference from background speech, the need for auditory vigilance increases. We recreated these context-dependent demands on auditory attention in 2 ways. The first was to require participants to attend to one speaker in either the absence or presence of a distracting background speaker. The second was to alter the task demand, requiring either an immediate or delayed recall of the content of the attended speech. Across 2 fMRI studies, common activated regions associated with segregating attended from unattended speech were the right anterior insula and adjacent frontal operculum (aI/FOp), the left planum temporale, and the precuneus. In contrast, activity in a ventral right frontoparietal system was dependent on both the task demand and the presence of a competing speaker. Additional multivariate analyses identified other domain-general frontoparietal systems, where activity increased during attentive listening but was modulated little by the need for speech stream segregation in the presence of 2 speakers. These results make predictions about impairments in attentive listening in different communicative contexts following focal or diffuse brain pathology. PMID:25596592
Attentional modulation of informational masking on early cortical representations of speech signals.
Zhang, Changxin; Arnott, Stephen R; Rabaglia, Cristina; Avivi-Reich, Meital; Qi, James; Wu, Xihong; Li, Liang; Schneider, Bruce A
2016-01-01
To recognize speech in a noisy auditory scene, listeners need to perceptually segregate the target talker's voice from other competing sounds (stream segregation). A number of studies have suggested that the attentional demands placed on listeners increase as the acoustic properties and informational content of the competing sounds become more similar to those of the target voice. Hence we would expect attentional demands to be considerably greater when speech is masked by speech than when it is masked by steady-state noise. To investigate the role of attentional mechanisms in the unmasking of speech sounds, event-related potentials (ERPs) were recorded to a syllable masked by noise or competing speech under both active (the participant was asked to respond when the syllable was presented) and passive (no response was required) listening conditions. The results showed that the long-latency auditory response to a syllable (/bi/), presented at different signal-to-masker ratios (SMRs), was similar in both passive and active listening conditions when the masker was a steady-state noise. In contrast, a switch from the passive listening condition to the active one, when the masker was two-talker speech, significantly enhanced the ERPs to the syllable. These results support the hypothesis that the need to engage attentional mechanisms in aid of scene analysis increases as the similarity (both acoustic and informational) between the target speech and the competing background sounds increases.
Rajan, R; Cainer, K E
2008-06-23
In most everyday settings, speech is heard in the presence of competing sounds, and understanding speech requires skills in auditory streaming and segregation, followed by identification and recognition of the attended signals. Ageing leads to difficulties in understanding speech in noisy backgrounds. In addition to age-related changes in hearing-related factors, cognitive factors also play a role, but it is unclear to what extent these are generalized or modality-specific cognitive factors. We examined how ageing in normal-hearing decade age cohorts from 20 to 69 years affected discrimination of open-set speech in background noise. We used two types of sentences of similar structural and linguistic characteristics but different masking levels (i.e. differences in signal-to-noise ratios required for detection of sentences in a standard masker) so as to vary sentence demand, and two background maskers (one causing purely energetic masking effects and the other causing energetic and informational masking) to vary load conditions. There was a decline in performance (measured as speech reception thresholds for perception of sentences in noise) in the oldest cohort for both types of sentences, but only in the presence of the more demanding informational masker. We interpret these results to indicate a modality-specific decline in cognitive processing, likely a decrease in the ability to use acoustic and phonetic cues efficiently to segregate speech from background noise, in subjects aged over 60.
Sayles, Mark; Stasiak, Arkadiusz; Winter, Ian M.
2015-01-01
The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once), in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into “auditory objects.” Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus (VCN) of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single-vowels' spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double-vowels' spectral energy into two streams (corresponding to the two vowels), on the basis of temporal discharge patterns, is impaired by reverberation; specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. 
These results offer neurophysiological insights into the perceptual organization of complex acoustic scenes under realistically challenging listening conditions. PMID:25628545
The Speech, Spatial and Qualities of Hearing Scale (SSQ)
Gatehouse, Stuart; Noble, William
2017-01-01
The Speech, Spatial and Qualities of Hearing Scale (SSQ) is designed to measure a range of hearing disabilities across several domains. Particular attention is given to hearing speech in a variety of competing contexts, and to the directional, distance and movement components of spatial hearing. In addition, the abilities both to segregate sounds and to attend to simultaneous speech streams are assessed, reflecting the reality of hearing in the everyday world. Qualities of hearing experience include ease of listening, and the naturalness, clarity and identifiability of different speakers, different musical pieces and instruments, and different everyday sounds. Application of the SSQ to 153 new clinic clients prior to hearing aid fitting showed that the greatest difficulty was experienced with simultaneous speech streams, ease of listening, listening in groups and in noise, and judging distance and movement. SSQ ratings were compared with an independent measure of handicap. After differences in hearing level were controlled for, it was found that identification, attention and effort problems, as well as spatial hearing problems, feature prominently in the disability–handicap relationship, along with certain features of speech hearing. The results implicate aspects of temporal and spatial dynamics of hearing disability in the experience of handicap. The SSQ shows promise as an instrument for evaluating interventions of various kinds, particularly (but not exclusively) those that implicate binaural function. PMID:15035561
Musician enhancement for speech-in-noise.
Parbery-Clark, Alexandra; Skoe, Erika; Lam, Carrie; Kraus, Nina
2009-12-01
To investigate the effect of musical training on speech-in-noise (SIN) performance, a complex task requiring the integration of working memory and stream segregation as well as the detection of time-varying perceptual cues. Previous research has indicated that, in combination with lifelong experience with musical stream segregation, musicians have better auditory perceptual skills and working memory. It was hypothesized that musicians would benefit from these factors and perform better on speech perception in noise than age-matched nonmusician controls. The performance of 16 musicians and 15 nonmusicians was compared on clinical measures of speech perception in noise-QuickSIN and Hearing-In-Noise Test (HINT). Working memory capacity and frequency discrimination were also assessed. All participants had normal hearing and were between the ages of 19 and 31 yr. To be categorized as a musician, participants needed to have started musical training before the age of 7 yr, have 10 or more years of consistent musical experience, and have practiced more than three times weekly within the 3 yr before study enrollment. Nonmusicians were categorized by the failure to meet the musician criteria, along with not having received musical training within the 7 yr before the study. Musicians outperformed the nonmusicians on both QuickSIN and HINT, in addition to having more fine-grained frequency discrimination and better working memory. Years of consistent musical practice correlated positively with QuickSIN, working memory, and frequency discrimination but not HINT. The results also indicate that working memory and frequency discrimination are more important for QuickSIN than for HINT. Musical experience appears to enhance the ability to hear speech in challenging listening environments. Large group differences were found for QuickSIN, and the results also suggest that this enhancement is derived in part from musicians' enhanced working memory and frequency discrimination. 
For HINT, in which performance was not linked to frequency discrimination ability and was only moderately linked to working memory, musicians still performed significantly better than the nonmusicians. The group differences for HINT were evident in the most difficult condition in which the speech and noise were presented from the same location and not spatially segregated. Understanding which cognitive and psychoacoustic factors as well as which lifelong experiences contribute to SIN may lead to more effective remediation programs for clinical populations for whom SIN poses a particular perceptual challenge. These results provide further evidence for musical training transferring to nonmusical domains and highlight the importance of taking musical training into consideration when evaluating a person's SIN ability in a clinical setting.
Endogenous Delta/Theta Sound-Brain Phase Entrainment Accelerates the Buildup of Auditory Streaming.
Riecke, Lars; Sack, Alexander T; Schroeder, Charles E
2015-12-21
In many natural listening situations, meaningful sounds (e.g., speech) fluctuate in slow rhythms among other sounds. When a slow rhythmic auditory stream is selectively attended, endogenous delta (1‒4 Hz) oscillations in auditory cortex may shift their timing so that higher-excitability neuronal phases become aligned with salient events in that stream [1, 2]. As a consequence of this stream-brain phase entrainment [3], these events are processed and perceived more readily than temporally non-overlapping events [4-11], essentially enhancing the neural segregation between the attended stream and temporally noncoherent streams [12]. Stream-brain phase entrainment is robust to acoustic interference [13-20] provided that target stream-evoked rhythmic activity can be segregated from noncoherent activity evoked by other sounds [21], a process that usually builds up over time [22-27]. However, it has remained unclear whether stream-brain phase entrainment functionally contributes to this buildup of rhythmic streams or whether it is merely an epiphenomenon of it. Here, we addressed this issue directly by experimentally manipulating endogenous stream-brain phase entrainment in human auditory cortex with non-invasive transcranial alternating current stimulation (TACS) [28-30]. We assessed the consequences of these manipulations on the perceptual buildup of the target stream (the time required to recognize its presence in a noisy background), using behavioral measures in 20 healthy listeners performing a naturalistic listening task. Experimentally induced cyclic 4-Hz variations in stream-brain phase entrainment reliably caused a cyclic 4-Hz pattern in perceptual buildup time. Our findings demonstrate that strong endogenous delta/theta stream-brain phase entrainment accelerates the perceptual emergence of task-relevant rhythmic streams in noisy environments. Copyright © 2015 Elsevier Ltd. All rights reserved.
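Entrainment between a slow stimulus rhythm and a cortical oscillation of the kind manipulated here is commonly quantified with a phase-locking value (PLV). The following is an illustrative Python sketch of that metric, not the study's own analysis; the 4-Hz rhythm, sampling rate, and phase lag are assumed values.

```python
import numpy as np

def phase_locking_value(stim_phase, brain_phase):
    """PLV between two phase time series: 1 means a fixed phase relation
    (perfect entrainment); values near 0 mean no consistent relation."""
    return np.abs(np.mean(np.exp(1j * (brain_phase - stim_phase))))

fs = 1000.0                        # sample rate in Hz (assumed)
t = np.arange(0, 2, 1 / fs)        # 2 s of data
stim_phase = 2 * np.pi * 4.0 * t   # instantaneous phase of a 4-Hz stimulus rhythm

rng = np.random.default_rng(0)
entrained = stim_phase + 0.3       # constant lag: perfectly entrained oscillation
unentrained = stim_phase + rng.uniform(-np.pi, np.pi, t.size)  # no entrainment

plv_entrained = phase_locking_value(stim_phase, entrained)     # -> 1.0
plv_unentrained = phase_locking_value(stim_phase, unentrained) # -> near 0
```

A constant phase lag gives PLV = 1 regardless of the lag's size, which is why PLV indexes the consistency, not the value, of the stream-brain phase relation.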
An Expanded Role for the Dorsal Auditory Pathway in Sensorimotor Control and Integration
Rauschecker, Josef P.
2010-01-01
The dual-pathway model of auditory cortical processing assumes that two largely segregated processing streams originating in the lateral belt subserve the two main functions of hearing: identification of auditory “objects”, including speech; and localization of sounds in space (Rauschecker and Tian, 2000). Evidence has accumulated, chiefly from work in humans and nonhuman primates, that an antero-ventral pathway supports the former function, whereas a postero-dorsal stream supports the latter, i.e. processing of space and motion-in-space. In addition, the postero-dorsal stream has also been postulated to subserve some functions of speech and language in humans. A recent review (Rauschecker and Scott, 2009) has proposed the possibility that both functions of the postero-dorsal pathway can be subsumed under the same structural forward model: an efference copy sent from prefrontal and premotor cortex provides the basis for “optimal state estimation” in the inferior parietal lobe and in sensory areas of the posterior auditory cortex. The current article corroborates this model by adding and discussing recent evidence. PMID:20850511
Masking effects of speech and music: does the masker's hierarchical structure matter?
Shi, Lu-Feng; Law, Yvonne
2010-04-01
Speech and music are time-varying signals organized by parallel hierarchical rules. Through a series of four experiments, this study compared the masking effects of single-talker speech and instrumental music on speech perception while manipulating the complexity of hierarchical and temporal structures of the maskers. Listeners' word recognition was found to be similar between hierarchically intact and disrupted speech or classical music maskers (Experiment 1). When sentences served as the signal, significantly greater masking effects were observed with disrupted than intact speech or classical music maskers (Experiment 2), although not with jazz or serial music maskers, which differed from the classical music masker in their hierarchical structures (Experiment 3). Removing the classical music masker's temporal dynamics or partially restoring it affected listeners' sentence recognition; yet, differences in performance between intact and disrupted maskers remained robust (Experiment 4). Hence, the effect of structural expectancy was largely present across maskers when comparing them before and after their hierarchical structure was purposefully disrupted. This effect seemed to lend support to the auditory stream segregation theory.
Cloutman, Lauren L.; Binney, Richard J.; Morris, David M.; Parker, Geoffrey J.M.; Lambon Ralph, Matthew A.
2013-01-01
Primate studies have recently identified the dorsal stream as constituting multiple dissociable pathways associated with a range of specialized cognitive functions. To elucidate the nature and number of dorsal pathways in the human brain, the current study utilized in vivo probabilistic tractography to map the structural connectivity associated with subdivisions of the left supramarginal gyrus (SMG). The left SMG is a prominent region within the dorsal stream, which has recently been parcellated into five structurally-distinct regions which possess a dorsal–ventral (and rostral-caudal) organisation, postulated to reflect areas of functional specialisation. The connectivity patterns reveal a dissociation of the arcuate fasciculus into at least two segregated pathways connecting frontal-parietal-temporal regions. Specifically, the connectivity of the inferior SMG, implicated as an acoustic-motor speech interface, is carried by an inner/ventro-dorsal arc of fibres, whilst the pathways of the posterior superior SMG, implicated in object use and cognitive control, forms a parallel outer/dorso-dorsal crescent. PMID:23937853
Zeremdini, Jihen; Ben Messaoud, Mohamed Anouar; Bouzid, Aicha
2015-09-01
Thanks to their two ears, humans can easily separate mixed speech and form perceptual representations of the constituent sources in an acoustic mixture. Researchers have long attempted to build computer models of these high-level functions of the auditory system, yet segregating mixed speech remains a very challenging problem. Here, we are interested in approaches to monaural speech segregation. For this purpose, we study computational auditory scene analysis (CASA), which segregates speech from monaural mixtures by reproducing the source organization achieved by listeners. CASA is based on two main stages: segmentation and grouping. In this paper, we present and compare several studies that have used CASA for speech separation and recognition.
Gaudrain, Etienne; Carlyon, Robert P.
2013-01-01
Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish the target and the masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed. PMID:23297922
Neural Representation of Concurrent Vowels in Macaque Primary Auditory Cortex
Micheyl, Christophe; Steinschneider, Mitchell
2016-01-01
Successful speech perception in real-world environments requires that the auditory system segregate competing voices that overlap in frequency and time into separate streams. Vowels are major constituents of speech and are comprised of frequencies (harmonics) that are integer multiples of a common fundamental frequency (F0). The pitch and identity of a vowel are determined by its F0 and spectral envelope (formant structure), respectively. When two spectrally overlapping vowels differing in F0 are presented concurrently, they can be readily perceived as two separate "auditory objects" with pitches at their respective F0s. A difference in pitch between two simultaneous vowels provides a powerful cue for their segregation, which in turn, facilitates their individual identification. The neural mechanisms underlying the segregation of concurrent vowels based on pitch differences are poorly understood. Here, we examine neural population responses in macaque primary auditory cortex (A1) to single and double concurrent vowels (/a/ and /i/) that differ in F0 such that they are heard as two separate auditory objects with distinct pitches. We find that neural population responses in A1 can resolve, via a rate-place code, lower harmonics of both single and double concurrent vowels. Furthermore, we show that the formant structures, and hence the identities, of single vowels can be reliably recovered from the neural representation of double concurrent vowels. We conclude that A1 contains sufficient spectral information to enable concurrent vowel segregation and identification by downstream cortical areas. PMID:27294198
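Double-vowel stimuli of the kind described can be approximated as harmonic complexes whose component amplitudes peak near the vowel formants. A minimal synthesis sketch follows; the formant frequencies (taken from classic vowel measurements), F0 values, and Gaussian envelope width are illustrative assumptions, not the study's stimulus parameters.

```python
import numpy as np

def harmonic_vowel(f0, formants, dur=0.2, fs=16000):
    """Harmonic complex at fundamental f0 whose harmonic amplitudes
    peak near the given formant frequencies (a crude vowel stand-in)."""
    t = np.arange(int(dur * fs)) / fs
    sig = np.zeros_like(t)
    k = 1
    while k * f0 < fs / 2:
        fk = k * f0  # harmonics at integer multiples of F0
        # Gaussian spectral envelope around each formant (width assumed)
        amp = sum(np.exp(-((fk - fm) / 150.0) ** 2) for fm in formants)
        sig += amp * np.sin(2 * np.pi * fk * t)
        k += 1
    return sig / np.max(np.abs(sig))

# /a/-like and /i/-like vowels with F0s ~4 semitones apart (100 vs 126 Hz)
v_a = harmonic_vowel(100.0, [730.0, 1090.0])
v_i = harmonic_vowel(126.0, [270.0, 2290.0])
double_vowel = v_a + v_i   # the concurrent-vowel mixture
```

Because the two F0s differ, the harmonics of the two vowels interleave in frequency, which is the cue a rate-place code could exploit to resolve them separately.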
Segregation of Whispered Speech Interleaved with Noise or Speech Maskers
2011-08-01
… range over which the talker can be heard. Whispered speech is produced by modulating the flow of air through partially open vocal folds. Because the source of excitation is turbulent air flow, the acoustic characteristics of whispered speech differ from voiced speech [1, 2]. Despite the acoustic … signals provided by cochlear implants. Two studies investigated the segregation of simultaneously presented whispered vowels [7, 8] in a standard …
Noise and pitch interact during the cortical segregation of concurrent speech.
Bidelman, Gavin M; Yellamsetty, Anusha
2017-08-01
Behavioral studies reveal listeners exploit intrinsic differences in voice fundamental frequency (F0) to segregate concurrent speech sounds, the so-called "F0-benefit." More favorable signal-to-noise ratio (SNR) in the environment, an extrinsic acoustic factor, similarly benefits the parsing of simultaneous speech. Here, we examined the neurobiological substrates of these two cues in the perceptual segregation of concurrent speech mixtures. We recorded event-related brain potentials (ERPs) while listeners performed a speeded double-vowel identification task. Listeners heard two concurrent vowels whose F0 differed by zero or four semitones presented in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in correctly identifying both vowels for larger F0 separations but F0-benefit was more pronounced at more favorable SNRs (i.e., pitch × SNR interaction). Analysis of the ERPs revealed that only the P2 wave (∼200 ms) showed a similar F0 × SNR interaction as behavior and was correlated with listeners' perceptual F0-benefit. Neural classifiers applied to the ERPs further suggested that speech sounds are segregated neurally within 200 ms based on SNR whereas segregation based on pitch occurs later in time (400-700 ms). The earlier timing of extrinsic SNR compared to intrinsic F0-based segregation implies that the cortical extraction of speech from noise is more efficient than differentiating speech based on pitch cues alone, which may recruit additional cortical processes. Findings indicate that noise and pitch differences interact relatively early in cerebral cortex and that the brain arrives at the identities of concurrent speech mixtures as early as ∼200 ms. Copyright © 2017 Elsevier B.V. All rights reserved.
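The two cues manipulated here have direct numerical definitions: a 4-semitone F0 separation is the equal-tempered frequency ratio 2^(4/12), and a +5 dB SNR fixes the signal-to-noise power ratio. A sketch of both computations (illustrative, not the study's stimulus code; the sine "vowel" and noise are stand-ins):

```python
import numpy as np

def f0_above(f_ref, semitones):
    """Frequency the given number of equal-tempered semitones above f_ref."""
    return f_ref * 2 ** (semitones / 12)

def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so the mixture has the requested SNR in dB."""
    scale = np.sqrt(np.mean(signal ** 2) / (np.mean(noise ** 2) * 10 ** (snr_db / 10)))
    return signal + scale * noise, scale

rng = np.random.default_rng(1)
t = np.arange(16000) / 16000.0
vowel_like = np.sin(2 * np.pi * 100.0 * t)   # stand-in "vowel" at F0 = 100 Hz
f0_b = f0_above(100.0, 4)                    # ~126 Hz: the 4-semitone condition
noise = rng.standard_normal(t.size)
mix, scale = mix_at_snr(vowel_like, noise, 5.0)  # the +5 dB SNR condition
```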
Auditory scene analysis in school-aged children with developmental language disorders
Sussman, E.; Steinschneider, M.; Lee, W.; Lawson, K.
2014-01-01
Natural sound environments are dynamic, with overlapping acoustic input originating from simultaneously active sources. A key function of the auditory system is to integrate sensory inputs that belong together and segregate those that come from different sources. We hypothesized that this skill is impaired in individuals with phonological processing difficulties. There is considerable disagreement about whether phonological impairments observed in children with developmental language disorders can be attributed to specific linguistic deficits or to more general acoustic processing deficits. However, most tests of general auditory abilities have been conducted with a single set of sounds. We assessed the ability of school-aged children (7–15 years) to parse complex auditory non-speech input, and determined whether the presence of phonological processing impairments was associated with stream perception performance. A key finding was that children with language impairments did not show the same developmental trajectory for stream perception as typically developing children. In addition, children with language impairments required larger frequency separations between sounds to hear distinct streams compared to age-matched peers. Furthermore, phonological processing ability was a significant predictor of stream perception measures, but only in the older age groups. No such association was found in the youngest children. These results indicate that children with language impairments have difficulty parsing speech streams, or identifying individual sound events when there are competing sound sources. We conclude that language group differences may in part reflect fundamental maturational disparities in the analysis of complex auditory scenes. PMID:24548430
Multitalker speech perception with ideal time-frequency segregation: Effects of voice characteristics and number of talkers
Brungart, Douglas S.
2009-03-23
Speech perception in multitalker listening environments is limited by two very different types of masking. The first is energetic …
The influence of target-masker similarity on across-ear interference in dichotic listening
NASA Astrophysics Data System (ADS)
Brungart, Douglas; Simpson, Brian
2004-05-01
In most dichotic listening tasks, the comprehension of a target speech signal presented in one ear is unaffected by the presence of irrelevant speech in the opposite ear. However, recent results have shown that contralaterally presented interfering speech signals do influence performance when a second interfering speech signal is present in the same ear as the target speech. In this experiment, we examined the influence of target-masker similarity on this effect by presenting ipsilateral and contralateral masking phrases spoken by the same talker, a different same-sex talker, or a different-sex talker than the one used to generate the target speech. The results show that contralateral target-masker similarity has the greatest influence on performance when an easily segregated different-sex masker is presented in the target ear, and the least influence when a difficult-to-segregate same-talker masker is presented in the target ear. These results indicate that across-ear interference in dichotic listening is not directly related to the difficulty of the segregation task in the target ear, and suggest that contralateral maskers are least likely to interfere with dichotic speech perception when the same general strategy could be used to segregate the target from the masking voices in the ipsilateral and contralateral ears.
Examining explanations for fundamental frequency's contribution to speech intelligibility in noise
NASA Astrophysics Data System (ADS)
Schlauch, Robert S.; Miller, Sharon E.; Watson, Peter J.
2005-09-01
Laures and Weismer [JSLHR, 42, 1148 (1999)] reported that speech with natural variation in fundamental frequency (F0) is more intelligible in noise than speech with a flattened F0 contour. Cognitive-linguistic based explanations have been offered to account for this drop in intelligibility for the flattened condition, but a lower-level mechanism related to auditory streaming may be responsible. Numerous psychoacoustic studies have demonstrated that modulating a tone enables a listener to segregate it from background sounds. To test these rival hypotheses, speech recognition in noise was measured for sentences with six different F0 contours: unmodified, flattened at the mean, natural but exaggerated, reversed, and frequency modulated (rates of 2.5 and 5.0 Hz). The 180 stimulus sentences were produced by five talkers (30 sentences per condition). Speech recognition for fifteen listeners replicate earlier findings showing that flattening the F0 contour results in a roughly 10% reduction in recognition of key words compared with the natural condition. Although the exaggerated condition produced results comparable to those of the flattened condition, the other conditions with unnatural F0 contours all yielded significantly poorer performance than the flattened condition. These results support the cognitive, linguistic-based explanations for the reduction in performance.
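The contour manipulations described (flattening at the mean, reversal) are simple frame-wise operations on an F0 track. A sketch, assuming a voiced-frames-only track in Hz with hypothetical values:

```python
import numpy as np

def flatten_f0(track):
    """Flattened condition: hold every frame at the contour's mean F0."""
    return np.full_like(track, track.mean())

def reverse_f0(track):
    """Reversed condition: same F0 values, mirrored in time."""
    return track[::-1]

f0_track = np.array([110.0, 130.0, 150.0, 120.0, 90.0])  # hypothetical contour
flat = flatten_f0(f0_track)   # every frame at 120.0 Hz (the mean)
rev = reverse_f0(f0_track)    # [90, 120, 150, 130, 110]
```

Note both manipulations preserve the mean F0 and (for reversal) the overall F0 variation, which is what lets the study dissociate linguistic contour information from mere modulation-based streaming cues.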
ERIC Educational Resources Information Center
Bouvet, Lucie; Mottron, Laurent; Valdois, Sylviane; Donnadieu, Sophie
2016-01-01
Auditory stream segregation allows us to organize our sound environment, by focusing on specific information and ignoring what is unimportant. One previous study reported difficulty in stream segregation ability in children with Asperger syndrome. In order to investigate this question further, we used an interleaved melody recognition task with…
Columnar Segregation of Magnocellular and Parvocellular Streams in Human Extrastriate Cortex
2017-01-01
Magnocellular versus parvocellular (M-P) streams are fundamental to the organization of macaque visual cortex. Segregated, paired M-P streams extend from retina through LGN into V1. The M stream extends further into area V5/MT, and parts of V2. However, elsewhere in visual cortex, it remains unclear whether M-P-derived information (1) becomes intermixed or (2) remains segregated in M-P-dominated columns and neurons. Here we tested whether M-P streams exist in extrastriate cortical columns, in 8 human subjects (4 female). We acquired high-resolution fMRI at high field (7T), testing for M- and P-influenced columns within each of four cortical areas (V2, V3, V3A, and V4), based on known functional distinctions in M-P streams in macaque: (1) color versus luminance, (2) binocular disparity, (3) luminance contrast sensitivity, (4) peak spatial frequency, and (5) color/spatial interactions. Additional measurements of resting state activity (eyes closed) tested for segregated functional connections between these columns. We found M- and P-like functions and connections within and between segregated cortical columns in V2, V3, and (in most experiments) area V4. Area V3A was dominated by the M stream, without significant influence from the P stream. These results suggest that M-P streams exist, and extend through, specific columns in early/middle stages of human extrastriate cortex. SIGNIFICANCE STATEMENT The magnocellular and parvocellular (M-P) streams are fundamental components of primate visual cortical organization. These streams segregate both anatomical and functional properties in parallel, from retina through primary visual cortex. However, in most higher-order cortical sites, it is unknown whether such M-P streams exist and/or what form those streams would take. Moreover, it is unknown whether M-P streams exist in human cortex. 
Here, fMRI evidence measured at high field (7T) and high resolution revealed segregated M-P streams in four areas of human extrastriate cortex. These results suggest that M-P information is processed in segregated parallel channels throughout much of human visual cortex; the M-P streams are more than a convenient sorting property in earlier stages of the visual system. PMID:28724749
Toward a Neurophysiological Theory of Auditory Stream Segregation
ERIC Educational Resources Information Center
Snyder, Joel S.; Alain, Claude
2007-01-01
Auditory stream segregation (or streaming) is a phenomenon in which 2 or more repeating sounds differing in at least 1 acoustic attribute are perceived as 2 or more separate sound sources (i.e., streams). This article selectively reviews psychophysical and computational studies of streaming and comprehensively reviews more recent…
Process for the physical segregation of minerals
Yingling, Jon C.; Ganguli, Rajive
2004-01-06
With highly heterogeneous groups or streams of minerals, physical segregation using online quality measurements is an economically important first stage of the mineral beneficiation process. Segregation enables high quality fractions of the stream to bypass processing, such as cleaning operations, thereby reducing the associated costs and avoiding the yield losses inherent in any downstream separation process. The present invention includes various methods for reliably segregating a mineral stream into at least one fraction meeting desired quality specifications while at the same time maximizing yield of that fraction.
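As a toy illustration of the segregation objective (a hypothetical greedy scheme, not the patented method), one can accept the cleanest increments of a stream first, keeping the accepted fraction's average impurity within spec while maximizing its size:

```python
def max_yield_fraction(impurity, spec):
    """Greedy sketch: accept increments in order of increasing impurity,
    stopping when the next increment would push the accepted fraction's
    average impurity above the spec. Returns accepted increment indices."""
    order = sorted(range(len(impurity)), key=lambda i: impurity[i])
    accepted, total = [], 0.0
    for i in order:
        if (total + impurity[i]) / (len(accepted) + 1) <= spec:
            accepted.append(i)
            total += impurity[i]
        else:
            break
    return sorted(accepted)

increments = [0.08, 0.02, 0.12, 0.05, 0.30]          # e.g. measured ash fraction
bypass = max_yield_fraction(increments, spec=0.10)   # fraction that bypasses cleaning
```

Because increments are taken from cleanest upward, the running average is non-decreasing, so the first violation ends the pass with the largest spec-compliant fraction (assuming equal-mass increments).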
Attentional influences on functional mapping of speech sounds in human auditory cortex
Obleser, Jonas; Elbert, Thomas; Eulitz, Carsten
2004-01-01
Background The speech signal contains both information about phonological features such as place of articulation and non-phonological features such as speaker identity. These are different aspects of the 'what'-processing stream (speaker vs. speech content), and here we show that they can be further segregated as they may occur in parallel but within different neural substrates. Subjects listened to two different vowels, each spoken by two different speakers. During one block, they were asked to identify a given vowel irrespectively of the speaker (phonological categorization), while during the other block the speaker had to be identified irrespectively of the vowel (speaker categorization). Auditory evoked fields were recorded using 148-channel magnetoencephalography (MEG), and magnetic source imaging was obtained for 17 subjects. Results During phonological categorization, a vowel-dependent difference of N100m source location perpendicular to the main tonotopic gradient replicated previous findings. In speaker categorization, the relative mapping of vowels remained unchanged but sources were shifted towards more posterior and more superior locations. Conclusions These results imply that the N100m reflects the extraction of abstract invariants from the speech signal. This part of the processing is accomplished in auditory areas anterior to AI, which are part of the auditory 'what' system. This network seems to include spatially separable modules for identifying the phonological information and for associating it with a particular speaker that are activated in synchrony but within different regions, suggesting that the 'what' processing can be more adequately modeled by a stream of parallel stages. The relative activation of the parallel processing stages can be modulated by attentional or task demands. PMID:15268765
Klein, Mike E.; Zatorre, Robert J.
2015-01-01
In categorical perception (CP), continuous physical signals are mapped to discrete perceptual bins: mental categories not found in the physical world. CP has been demonstrated across multiple sensory modalities and, in audition, for certain over-learned speech and musical sounds. The neural basis of auditory CP, however, remains ambiguous, including its robustness in nonspeech processes and the relative roles of left/right hemispheres; primary/nonprimary cortices; and ventral/dorsal perceptual processing streams. Here, highly trained musicians listened to 2-tone musical intervals, which they perceive categorically while undergoing functional magnetic resonance imaging. Multivariate pattern analyses were performed after grouping sounds by interval quality (determined by frequency ratio between tones) or pitch height (perceived noncategorically, frequency ratios remain constant). Distributed activity patterns in spheres of voxels were used to determine sound sample identities. For intervals, significant decoding accuracy was observed in the right superior temporal and left intraparietal sulci, with smaller peaks observed homologously in contralateral hemispheres. For pitch height, no significant decoding accuracy was observed, consistent with the non-CP of this dimension. These results suggest that similar mechanisms are operative for nonspeech categories as for speech; espouse roles for 2 segregated processing streams; and support hierarchical processing models for CP. PMID:24488957
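Interval quality in these stimuli is determined by the frequency ratio between the two tones; in equal temperament that ratio maps to a semitone count via 12·log2. A quick sketch (not the study's stimulus code):

```python
import math

def interval_in_semitones(f_low, f_high):
    """Equal-tempered interval size implied by a frequency ratio."""
    return 12 * math.log2(f_high / f_low)

octave = interval_in_semitones(440.0, 880.0)   # ratio 2:1 -> 12 semitones
fifth = interval_in_semitones(440.0, 659.26)   # ~3:2 ratio -> ~7 semitones
```

Transposing both tones by the same factor leaves the ratio, and hence the interval category, unchanged; that is why pitch height can vary while interval quality stays constant.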
The dorsal stream contribution to phonological retrieval in object naming
Faseyitan, Olufunsho; Kim, Junghoon; Coslett, H. Branch
2012-01-01
Meaningful speech, as exemplified in object naming, calls on knowledge of the mappings between word meanings and phonological forms. Phonological errors in naming (e.g. GHOST named as ‘goath’) are commonly seen in persisting post-stroke aphasia and are thought to signal impairment in retrieval of phonological form information. We performed a voxel-based lesion-symptom mapping analysis of 1718 phonological naming errors collected from 106 individuals with diverse profiles of aphasia. Voxels in which lesion status correlated with phonological error rates localized to dorsal stream areas, in keeping with classical and contemporary brain-language models. Within the dorsal stream, the critical voxels were concentrated in premotor cortex, pre- and postcentral gyri and supramarginal gyrus with minimal extension into auditory-related posterior temporal and temporo-parietal cortices. This challenges the popular notion that error-free phonological retrieval requires guidance from sensory traces stored in posterior auditory regions and points instead to sensory-motor processes located further anterior in the dorsal stream. In a separate analysis, we compared the lesion maps for phonological and semantic errors and determined that there was no spatial overlap, demonstrating that the brain segregates phonological and semantic retrieval operations in word production. PMID:23171662
NASA Astrophysics Data System (ADS)
Fishman, Yonatan I.; Arezzo, Joseph C.; Steinschneider, Mitchell
2004-09-01
Auditory stream segregation refers to the organization of sequential sounds into "perceptual streams" reflecting individual environmental sound sources. In the present study, sequences of alternating high and low tones, "...ABAB...," similar to those used in psychoacoustic experiments on stream segregation, were presented to awake monkeys while neural activity was recorded in primary auditory cortex (A1). Tone frequency separation (ΔF), tone presentation rate (PR), and tone duration (TD) were systematically varied to examine whether neural responses correlate with effects of these variables on perceptual stream segregation. "A" tones were fixed at the best frequency of the recording site, while "B" tones were displaced in frequency from "A" tones by an amount equal to ΔF. As PR increased, "B" tone responses decreased in amplitude to a greater extent than "A" tone responses, yielding neural response patterns dominated by "A" tone responses occurring at half the alternation rate. Increasing TD facilitated the differential attenuation of "B" tone responses. These findings parallel psychoacoustic data and suggest a physiological model of stream segregation whereby increasing ΔF, PR, or TD enhances spatial differentiation of "A" tone and "B" tone responses along the tonotopic map in A1.
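An "...ABAB..." sequence is fully parameterized by the A-tone frequency, ΔF, PR, and TD. A generation sketch in Python (ΔF is expressed in semitones here as an assumption; the study specifies frequency separation without committing to a unit in this abstract):

```python
import numpy as np

def abab_sequence(f_a, delta_semitones, pr, tone_dur, n_tones=8, fs=16000):
    """Alternating A/B tone sequence: B lies delta_semitones above A,
    tone onsets occur every 1/pr seconds, each tone is tone_dur seconds."""
    f_b = f_a * 2 ** (delta_semitones / 12)
    onset_step = int(fs / pr)
    assert tone_dur <= 1 / pr, "tones must fit between onsets"
    seq = np.zeros(onset_step * n_tones)
    t = np.arange(int(tone_dur * fs)) / fs
    for i in range(n_tones):
        f = f_a if i % 2 == 0 else f_b   # alternate A and B tones
        seq[i * onset_step: i * onset_step + t.size] += np.sin(2 * np.pi * f * t)
    return seq

# e.g. A at 1 kHz, ΔF = 6 semitones, PR = 10 tones/s, TD = 50 ms
seq = abab_sequence(1000.0, 6.0, 10.0, 0.05)
```

Raising `delta_semitones`, `pr`, or `tone_dur` in this parameterization mirrors the three manipulations (ΔF, PR, TD) that promote perceptual splitting into separate A and B streams.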
Segregation and Integration of Auditory Streams when Listening to Multi-Part Music
Ragert, Marie; Fairhurst, Merle T.; Keller, Peter E.
2014-01-01
In our daily lives, auditory stream segregation allows us to differentiate concurrent sound sources and to make sense of the scene we are experiencing. However, a combination of segregation and the concurrent integration of auditory streams is necessary in order to analyze the relationship between streams and thus perceive a coherent auditory scene. The present functional magnetic resonance imaging study investigates the relative role and neural underpinnings of these listening strategies in multi-part musical stimuli. We compare a real human performance of a piano duet and a synthetic stimulus of the same duet in a prioritized integrative attention paradigm that required the simultaneous segregation and integration of auditory streams. In so doing, we manipulate the degree to which the attended part of the duet led either structurally (attend melody vs. attend accompaniment) or temporally (asynchronies vs. no asynchronies between parts), and thus the relative contributions of integration and segregation used to make an assessment of the leader-follower relationship. We show that perceptually the relationship between parts is biased towards the conventional structural hierarchy in western music, in which the melody generally dominates (leads) the accompaniment. Moreover, the assessment varies as a function of both cognitive load, as shown through difficulty ratings, and the interaction of the temporal and the structural relationship factors. Neurally, we see that the temporal relationship between parts, as one important cue for stream segregation, revealed distinct neural activity in the planum temporale. By contrast, integration used when listening to both the temporally separated performance stimulus and the temporally fused synthetic stimulus resulted in activation of the intraparietal sulcus.
These results support the hypothesis that the planum temporale and IPS are key structures underlying the mechanisms of segregation and integration of auditory streams, respectively. PMID:24475030
Opposing dorsal/ventral stream dynamics during figure-ground segregation.
Wokke, Martijn E; Scholte, H Steven; Lamme, Victor A F
2014-02-01
The visual system has been commonly subdivided into two segregated visual processing streams: The dorsal pathway processes mainly spatial information, and the ventral pathway specializes in object perception. Recent findings, however, indicate that different forms of interaction (cross-talk) exist between the dorsal and the ventral stream. Here, we used TMS and concurrent EEG recordings to explore these interactions between the dorsal and ventral stream during figure-ground segregation. In two separate experiments, we used repetitive TMS and single-pulse TMS to disrupt processing in the dorsal (V5/HMT⁺) and the ventral (lateral occipital area) stream during a motion-defined figure discrimination task. We presented stimuli that made it possible to differentiate relatively low-level (figure boundary detection) from higher-level (surface segregation) processing steps during figure-ground segregation. Results show that disruption of V5/HMT⁺ impaired performance related to surface segregation; this effect was mainly found when V5/HMT⁺ was perturbed in an early time window (100 msec) after stimulus presentation. Surprisingly, disruption of the lateral occipital area resulted in increased performance scores and enhanced neural correlates of surface segregation. This facilitatory effect was also mainly found in an early time window (100 msec) after stimulus presentation. These results suggest a "push-pull" interaction in which dorsal and ventral extrastriate areas are being recruited or inhibited depending on stimulus category and task demands.
Jaeger, Manuela; Bleichner, Martin G; Bauer, Anna-Katharina R; Mirkovic, Bojana; Debener, Stefan
2018-02-27
The acoustic envelope of human speech correlates with the syllabic rate (4-8 Hz) and carries important information for intelligibility, which is typically compromised in multi-talker, noisy environments. In order to better understand the dynamics of selective auditory attention to low frequency modulated sound sources, we conducted a two-stream auditory steady-state response (ASSR) selective attention electroencephalogram (EEG) study. The two streams consisted of 4 and 7 Hz amplitude and frequency modulated sounds presented from the left and right side. One of two streams had to be attended while the other had to be ignored. The attended stream always contained a target, allowing for the behavioral confirmation of the attention manipulation. EEG ASSR power analysis revealed a significant increase in 7 Hz power for the attend compared to the ignore conditions. There was no significant difference in 4 Hz power when the 4 Hz stream had to be attended compared to when it had to be ignored. This lack of 4 Hz attention modulation could be explained by a distracting effect of a third frequency at 3 Hz (beat frequency) perceivable when the 4 and 7 Hz streams are presented simultaneously. Taken together, our results show that low frequency modulations at syllabic rate are modulated by selective spatial attention. Whether attention effects act as enhancement of the attended stream or suppression of the to-be-ignored stream may depend on how well auditory streams can be segregated.
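The 3 Hz distractor mentioned above is simple difference-frequency arithmetic: |7 − 4| = 3 Hz. A small sketch (ours, not the study's analysis) shows that applying an envelope nonlinearity, here crude rectification, to the summed 4 and 7 Hz modulators exposes energy at the 3 Hz beat, measured with a Goertzel single-bin DFT:

```python
import math

def goertzel_power(signal, fs, freq):
    """Power in a single DFT bin, computed with the Goertzel recursion."""
    n = len(signal)
    k = round(freq * n / fs)                  # nearest exact frequency bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in signal:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2

fs, dur = 200.0, 4.0
t = [i / fs for i in range(int(fs * dur))]
mix = [math.sin(2 * math.pi * 4 * x) + math.sin(2 * math.pi * 7 * x) for x in t]
env = [abs(x) for x in mix]                   # rectification as a crude envelope
p3 = goertzel_power(env, fs, 3.0)             # difference (beat) frequency
p5 = goertzel_power(env, fs, 5.0)             # nearby control frequency
```

Neither 3 Hz component exists in the linear mixture itself; it appears only after the nonlinearity, which is why a perceptual beat at 3 Hz is plausible when both streams are heard together.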
Lotfi, Yones; Mehrkian, Saiedeh; Moossavi, Abdollah; Zadeh, Soghrat Faghih; Sadjedi, Hamed
2016-03-01
This study assessed the relationship between working memory capacity and auditory stream segregation by using the concurrent minimum audible angle in children with a diagnosed auditory processing disorder (APD). The participants in this cross-sectional, comparative study were 20 typically developing children and 15 children with a diagnosed APD (age, 9-11 years) according to the subtests of multiple-processing auditory assessment. Auditory stream segregation was investigated using the concurrent minimum audible angle. Working memory capacity was evaluated using the non-word repetition and forward and backward digit span tasks. Nonparametric statistics were utilized to compare the between-group differences. The Pearson correlation was employed to measure the degree of association between working memory capacity and the localization tests between the 2 groups. The group with APD had significantly lower scores than did the typically developing subjects in auditory stream segregation and working memory capacity. There were significant negative correlations between working memory capacity and the concurrent minimum audible angle in the most frontal reference location (0° azimuth) and lower negative correlations in the most lateral reference location (60° azimuth) in the children with APD. The study revealed a relationship between working memory capacity and auditory stream segregation in children with APD. The research suggests that lower working memory capacity in children with APD may be the possible cause of the inability to segregate and group incoming information.
Zion Golumbic, Elana M.; Poeppel, David; Schroeder, Charles E.
2012-01-01
The human capacity for processing speech is remarkable, especially given that information in speech unfolds over multiple time scales concurrently. Similarly notable is our ability to filter out extraneous sounds and focus our attention on one conversation, epitomized by the ‘Cocktail Party’ effect. Yet, the neural mechanisms underlying on-line speech decoding and attentional stream selection are not well understood. We review findings from behavioral and neurophysiological investigations that underscore the importance of the temporal structure of speech for achieving these perceptual feats. We discuss the hypothesis that entrainment of ambient neuronal oscillations to speech’s temporal structure, across multiple time-scales, serves to facilitate its decoding and underlies the selection of an attended speech stream over other competing input. In this regard, speech decoding and attentional stream selection are examples of ‘active sensing’, emphasizing an interaction between proactive and predictive top-down modulation of neuronal dynamics and bottom-up sensory input. PMID:22285024
Lexical influences on competing speech perception in younger, middle-aged, and older adults
Helfer, Karen S.; Jesse, Alexandra
2015-01-01
The influence of lexical characteristics of words in to-be-attended and to-be-ignored speech streams was examined in a competing speech task. Older, middle-aged, and younger adults heard pairs of low-cloze probability sentences in which the frequency or neighborhood density of words was manipulated in either the target speech stream or the masking speech stream. All participants also completed a battery of cognitive measures. As expected, for all groups, target words that occur frequently or that are from sparse lexical neighborhoods were easier to recognize than words that are infrequent or from dense neighborhoods. Compared to other groups, these neighborhood density effects were largest for older adults; the frequency effect was largest for middle-aged adults. Lexical characteristics of words in the to-be-ignored speech stream also affected recognition of to-be-attended words, but only when overall performance was relatively good (that is, when younger participants listened to the speech streams at a more advantageous signal-to-noise ratio). For these listeners, to-be-ignored masker words from sparse neighborhoods interfered with recognition of target speech more than masker words from dense neighborhoods. Amount of hearing loss and cognitive abilities relating to attentional control modulated overall performance as well as the strength of lexical influences. PMID:26233036
ERIC Educational Resources Information Center
Hickok, Gregory
2012-01-01
Speech recognition is an active process that involves some form of predictive coding. This statement is relatively uncontroversial. What is less clear is the source of the prediction. The dual-stream model of speech processing suggests that there are two possible sources of predictive coding in speech perception: the motor speech system and the…
Cracking the Language Code: Neural Mechanisms Underlying Speech Parsing
McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella
2013-01-01
Word segmentation, detecting word boundaries in continuous speech, is a critical aspect of language learning. Previous research in infants and adults demonstrated that a stream of speech can be readily segmented based solely on the statistical and speech cues afforded by the input. Using functional magnetic resonance imaging (fMRI), the neural substrate of word segmentation was examined on-line as participants listened to three streams of concatenated syllables, containing either statistical regularities alone, statistical regularities and speech cues, or no cues. Despite the participants’ inability to explicitly detect differences between the speech streams, neural activity differed significantly across conditions, with left-lateralized signal increases in temporal cortices observed only when participants listened to streams containing statistical regularities, particularly the stream containing speech cues. In a second fMRI study, designed to verify that word segmentation had implicitly taken place, participants listened to trisyllabic combinations that occurred with different frequencies in the streams of speech they just heard (“words,” 45 times; “partwords,” 15 times; “nonwords,” once). Reliably greater activity in left inferior and middle frontal gyri was observed when comparing words with partwords and, to a lesser extent, when comparing partwords with nonwords. Activity in these regions, taken to index the implicit detection of word boundaries, was positively correlated with participants’ rapid auditory processing skills. These findings provide a neural signature of on-line word segmentation in the mature brain and an initial model with which to study developmental changes in the neural architecture involved in processing speech cues during language learning. PMID:16855090
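The statistical regularities used in these streams are typically modeled as syllable-to-syllable transitional probabilities, which are high within words and dip at word boundaries. A minimal sketch; the syllables, words, and presentation order below are made up for illustration, not the study's stimuli:

```python
from collections import Counter

def transitional_probabilities(stream):
    """P(next syllable | current syllable) over adjacent pairs in the stream."""
    pairs = Counter(zip(stream, stream[1:]))
    firsts = Counter(stream[:-1])               # occurrences that have a successor
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}

words = [["tu", "pi", "ro"], ["go", "la", "bu"], ["pa", "do", "ti"]]
order = [0, 1, 0, 2, 0, 1, 0, 2]                # fixed pseudo-random word order
stream = [syl for w in order for syl in words[w]]
tp = transitional_probabilities(stream)
```

A segmentation learner can posit a word boundary wherever the transitional probability drops, which here happens only at the seams between the three "words".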
Fridriksson, Julius; den Ouden, Dirk-Bart; Hillis, Argye E; Hickok, Gregory; Rorden, Chris; Basilakos, Alexandra; Yourganov, Grigori; Bonilha, Leonardo
2018-01-17
In most cases, aphasia is caused by strokes involving the left hemisphere, with more extensive damage typically being associated with more severe aphasia. The classical model of aphasia commonly adhered to in the Western world is the Wernicke-Lichtheim model. The model has been in existence for over a century, and classification of aphasic symptomatology continues to rely on it. However, far more detailed models of speech and language localization in the brain have been formulated. In this regard, the dual stream model of cortical brain organization proposed by Hickok and Poeppel is particularly influential. Their model describes two processing routes, a dorsal stream and a ventral stream, that roughly support speech production and speech comprehension, respectively, in normal subjects. Despite the strong influence of the dual stream model in current neuropsychological research, there has been relatively limited focus on explaining aphasic symptoms in the context of this model. Given that the dual stream model represents a more nuanced picture of cortical speech and language organization, cortical damage that causes aphasic impairment should map clearly onto the dual processing streams. Here, we present a follow-up study to our previous work that used lesion data to reveal the anatomical boundaries of the dorsal and ventral streams supporting speech and language processing. Specifically, by emphasizing clinical measures, we examine the effect of cortical damage and disconnection involving the dorsal and ventral streams on aphasic impairment. The results reveal that measures of motor speech impairment mostly involve damage to the dorsal stream, whereas measures of impaired speech comprehension are more strongly associated with ventral stream involvement. Equally important, many clinical tests that target behaviours such as naming, speech repetition, or grammatical processing rely on interactions between the two streams. 
This latter finding explains why patients with seemingly disparate lesion locations often experience similar impairments on given subtests. Namely, these individuals' cortical damage, although dissimilar, affects a broad cortical network that plays a role in carrying out a given speech or language task. The current data suggest this is a more accurate characterization than ascribing specific lesion locations as responsible for specific language deficits. © The Author(s) (2018). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene.
Vander Ghinst, Marc; Bourguignon, Mathieu; Op de Beeck, Marc; Wens, Vincent; Marty, Brice; Hassid, Sergio; Choufani, Georges; Jousmäki, Veikko; Hari, Riitta; Van Bogaert, Patrick; Goldman, Serge; De Tiège, Xavier
2016-02-03
Using a continuous listening task, we evaluated the coupling between the listener's cortical activity and the temporal envelopes of different sounds in a multitalker auditory scene using magnetoencephalography and corticovocal coherence analysis. Neuromagnetic signals were recorded from 20 right-handed healthy adult humans who listened to five different recorded stories (attended speech streams), one without any multitalker background (No noise) and four mixed with a "cocktail party" multitalker background noise at four signal-to-noise ratios (5, 0, -5, and -10 dB) to produce speech-in-noise mixtures, here referred to as Global scene. Coherence analysis revealed that the modulations of the attended speech stream, presented without multitalker background, were coupled at ∼0.5 Hz to the activity of both superior temporal gyri, whereas the modulations at 4-8 Hz were coupled to the activity of the right supratemporal auditory cortex. In cocktail party conditions, with the multitalker background noise, the coupling was at both frequencies stronger for the attended speech stream than for the unattended Multitalker background. The coupling strengths decreased as the Multitalker background increased. During the cocktail party conditions, the ∼0.5 Hz coupling became left-hemisphere dominant, compared with bilateral coupling without the multitalker background, whereas the 4-8 Hz coupling remained right-hemisphere lateralized in both conditions. The brain activity was not coupled to the multitalker background or to its individual talkers. The results highlight the key role of listener's left superior temporal gyri in extracting the slow ∼0.5 Hz modulations, likely reflecting the attended speech stream within a multitalker auditory scene. When people listen to one person in a "cocktail party," their auditory cortex mainly follows the attended speech stream rather than the entire auditory scene. 
However, how the brain extracts the attended speech stream from the whole auditory scene and how increasing background noise corrupts this process is still debated. In this magnetoencephalography study, subjects had to attend a speech stream with or without multitalker background noise. Results argue for frequency-dependent cortical tracking mechanisms for the attended speech stream. The left superior temporal gyrus tracked the ∼0.5 Hz modulations of the attended speech stream only when the speech was embedded in multitalker background, whereas the right supratemporal auditory cortex tracked 4-8 Hz modulations during both noiseless and cocktail-party conditions. Copyright © 2016 the authors 0270-6474/16/361597-11$15.00/0.
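Coupling of the kind measured in this study is commonly quantified as magnitude-squared coherence between the speech envelope and the neural signal at a frequency of interest. Below is a toy single-frequency version pooled over epochs; it is our sketch, not the authors' pipeline, which additionally involves filtering, source reconstruction, and statistical thresholding:

```python
import cmath
import math

def dft_coeff(sig, fs, freq):
    """Complex amplitude of one frequency component of a sampled signal."""
    return sum(x * cmath.exp(-2j * math.pi * freq * i / fs)
               for i, x in enumerate(sig))

def coherence(x_epochs, y_epochs, fs, freq):
    """Magnitude-squared coherence at one frequency, pooled over epochs:
    |sum(X * conj(Y))|^2 / (sum|X|^2 * sum|Y|^2); 1 means perfect phase locking."""
    xs = [dft_coeff(e, fs, freq) for e in x_epochs]
    ys = [dft_coeff(e, fs, freq) for e in y_epochs]
    sxy = sum(a * b.conjugate() for a, b in zip(xs, ys))
    sxx = sum(abs(a) ** 2 for a in xs)
    syy = sum(abs(b) ** 2 for b in ys)
    return abs(sxy) ** 2 / (sxx * syy)

fs, n = 100.0, 100
envelope = [[math.sin(2 * math.pi * 4 * i / fs) for i in range(n)]] * 4
locked = envelope                                   # "response" phase-locked to stimulus
phases = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
unlocked = [[math.sin(2 * math.pi * 4 * i / fs + p) for i in range(n)]
            for p in phases]                        # phase scrambled across epochs
```

A phase-locked response yields coherence near 1; a response whose phase varies across epochs yields coherence near 0, which is the contrast such studies exploit.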
ERIC Educational Resources Information Center
O'Brien, Nancy, Ed.
One of a series of reports on the status of speech investigation, this collection of articles deals with topics including intonation and morphological knowledge. The titles of the articles and their authors are as follows: (1) "Integration and Segregation in Speech Perception" (Bruno H. Repp); (2) "Speech Perception Takes Precedence…
Revealing the dual streams of speech processing.
Fridriksson, Julius; Yourganov, Grigori; Bonilha, Leonardo; Basilakos, Alexandra; Den Ouden, Dirk-Bart; Rorden, Christopher
2016-12-27
Several dual route models of human speech processing have been proposed suggesting a large-scale anatomical division between cortical regions that support motor-phonological aspects vs. lexical-semantic aspects of speech processing. However, to date, there is no complete agreement on what areas subserve each route or the nature of interactions across these routes that enables human speech processing. Relying on an extensive behavioral and neuroimaging assessment of a large sample of stroke survivors, we took a data-driven approach, applying principal components analysis to lesion-symptom mapping to identify brain regions crucial for performance on clusters of behavioral tasks without a priori separation into task types. Distinct anatomical boundaries were revealed between a dorsal frontoparietal stream and a ventral temporal-frontal stream associated with separate components. Collapsing over the tasks primarily supported by these streams, we characterize the dorsal stream as a form-to-articulation pathway and the ventral stream as a form-to-meaning pathway. This characterization of the division in the data reflects both the overlap between tasks supported by the two streams as well as the observation that there is a bias for phonological production tasks supported by the dorsal stream and lexical-semantic comprehension tasks supported by the ventral stream. As such, our findings show a division between two processing routes that underlie human speech processing and provide an empirical foundation for studying potential computational differences that distinguish between the two routes.
Poliva, Oren; Bestelmeyer, Patricia E G; Hall, Michelle; Bultitude, Janet H; Koller, Kristin; Rafal, Robert D
2015-09-01
To use functional magnetic resonance imaging to map the auditory cortical fields that are activated, or nonreactive, to sounds in patient M.L., who has auditory agnosia caused by trauma to the inferior colliculi. The patient cannot recognize speech or environmental sounds. Her discrimination is greatly facilitated by context and visibility of the speaker's facial movements, and under forced-choice testing. Her auditory temporal resolution is severely compromised. Her discrimination is more impaired for words differing in voice onset time than place of articulation. Words presented to her right ear are extinguished with dichotic presentation; auditory stimuli in the right hemifield are mislocalized to the left. We used functional magnetic resonance imaging to examine cortical activations to different categories of meaningful sounds embedded in a block design. Sounds activated the caudal sub-area of M.L.'s primary auditory cortex (hA1) bilaterally and her right posterior superior temporal gyrus (auditory dorsal stream), but not the rostral sub-area (hR) of her primary auditory cortex or the anterior superior temporal gyrus in either hemisphere (auditory ventral stream). Auditory agnosia reflects dysfunction of the auditory ventral stream. The ventral and dorsal auditory streams are already segregated as early as the primary auditory cortex, with the ventral stream projecting from hR and the dorsal stream from hA1. M.L.'s leftward localization bias, preserved audiovisual integration, and phoneme perception are explained by preserved processing in her right auditory dorsal stream.
Multistability in auditory stream segregation: a predictive coding view
Winkler, István; Denham, Susan; Mill, Robert; Bőhm, Tamás M.; Bendixen, Alexandra
2012-01-01
Auditory stream segregation involves linking temporally separate acoustic events into one or more coherent sequences. For any non-trivial sequence of sounds, many alternative descriptions can be formed, only one or very few of which emerge in awareness at any time. Evidence from studies showing bi-/multistability in auditory streaming suggest that some, perhaps many of the alternative descriptions are represented in the brain in parallel and that they continuously vie for conscious perception. Here, based on a predictive coding view, we consider the nature of these sound representations and how they compete with each other. Predictive processing helps to maintain perceptual stability by signalling the continuation of previously established patterns as well as the emergence of new sound sources. It also provides a measure of how well each of the competing representations describes the current acoustic scene. This account of auditory stream segregation has been tested on perceptual data obtained in the auditory streaming paradigm. PMID:22371621
Bentsen, Thomas; May, Tobias; Kressner, Abigail A; Dau, Torsten
2018-01-01
Computational speech segregation attempts to automatically separate speech from noise. This is challenging in conditions with interfering talkers and low signal-to-noise ratios. Recent approaches have adopted deep neural networks and successfully demonstrated speech intelligibility improvements. A selection of components may be responsible for the success with these state-of-the-art approaches: the system architecture, a time frame concatenation technique and the learning objective. The aim of this study was to explore the roles and the relative contributions of these components by measuring speech intelligibility in normal-hearing listeners. A substantial improvement of 25.4 percentage points in speech intelligibility scores was found going from a subband-based architecture, in which a Gaussian Mixture Model-based classifier predicts the distributions of speech and noise for each frequency channel, to a state-of-the-art deep neural network-based architecture. Another improvement of 13.9 percentage points was obtained by changing the learning objective from the ideal binary mask, in which individual time-frequency units are labeled as either speech- or noise-dominated, to the ideal ratio mask, where the units are assigned a continuous value between zero and one. Therefore, both components play significant roles and by combining them, speech intelligibility improvements were obtained in a six-talker condition at a low signal-to-noise ratio.
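The two learning objectives contrasted above have simple per-unit definitions given oracle access to the separate speech and noise signals. A sketch for a single time-frequency unit follows; the function names and the 0 dB local criterion default are illustrative choices, not taken from the study:

```python
import math

def ideal_binary_mask(speech_pow, noise_pow, criterion_db=0.0):
    """1 if the unit's local SNR exceeds the criterion (speech-dominated),
    else 0 (noise-dominated)."""
    snr_db = 10.0 * math.log10(speech_pow / noise_pow)
    return 1 if snr_db > criterion_db else 0

def ideal_ratio_mask(speech_pow, noise_pow):
    """Continuous gain in (0, 1): the speech share of the unit's total power."""
    return speech_pow / (speech_pow + noise_pow)
```

In a full system these masks are computed for every time-frequency unit of a spectrogram and serve as the training targets that a classifier or deep network learns to predict from the noisy mixture alone.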
Visual speech segmentation: using facial cues to locate word boundaries in continuous speech
Mitchel, Aaron D.; Weiss, Daniel J.
2014-01-01
Speech is typically a multimodal phenomenon, yet few studies have focused on the exclusive contributions of visual cues to language acquisition. To address this gap, we investigated whether visual prosodic information can facilitate speech segmentation. Previous research has demonstrated that language learners can use lexical stress and pitch cues to segment speech and that learners can extract this information from talking faces. Thus, we created an artificial speech stream that contained minimal segmentation cues and paired it with two synchronous facial displays in which visual prosody was either informative or uninformative for identifying word boundaries. Across three familiarisation conditions (audio stream alone, facial streams alone, and paired audiovisual), learning occurred only when the facial displays were informative about word boundaries, suggesting that facial cues can help learners solve the early challenges of language acquisition. PMID:25018577
Cortical Representations of Speech in a Multitalker Auditory Scene.
Puvvada, Krishna C; Simon, Jonathan Z
2017-09-20
The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically based representations in the auditory nerve, into perceptually distinct auditory-object-based representations in the auditory cortex. Here, using magnetoencephalography recordings from men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of the auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in the auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. We also show that higher-order auditory cortical areas, by contrast, represent the attended stream separately and with significantly higher fidelity than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of the human auditory cortex. SIGNIFICANCE STATEMENT Using magnetoencephalography recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of the auditory cortex. 
We show that the primary-like areas in the auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory scene, with both attended and unattended speech streams represented with almost equal fidelity. We also show that higher-order auditory cortical areas, by contrast, represent an attended speech stream separately from, and with significantly higher fidelity than, unattended speech streams. Furthermore, the unattended background streams are represented as a single undivided background object rather than as distinct background objects. Copyright © 2017 the authors 0270-6474/17/379189-08$15.00/0.
Kong, Ying-Yee; Mullangi, Ala; Ding, Nai
2014-01-01
This study investigates how top-down attention modulates neural tracking of the speech envelope in different listening conditions. In the quiet conditions, a single speech stream was presented and the subjects either paid attention to the speech stream (active listening) or watched a silent movie instead (passive listening). In the competing-speaker (CS) conditions, two speakers of opposite genders were presented diotically. Ongoing electroencephalographic (EEG) responses were measured in each condition and cross-correlated with the speech envelope of each speaker at different time lags. In quiet, active and passive listening resulted in similar neural responses to the speech envelope. In the CS conditions, however, the shape of the cross-correlation function was remarkably different between the attended and unattended speech. The cross-correlation with the attended speech showed stronger N1 and P2 responses but a weaker P1 response compared with the cross-correlation with the unattended speech. Furthermore, the N1 response to the attended speech in the CS condition was enhanced and delayed compared with the active listening condition in quiet, while the P2 response to the unattended speaker in the CS condition was attenuated compared with passive listening in quiet. Taken together, these results demonstrate that top-down attention differentially modulates envelope-tracking neural activity at different time lags and suggest that top-down attention can both enhance the neural responses to the attended sound stream and suppress the responses to the unattended sound stream. PMID:25124153
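The lagged cross-correlation analysis described in this abstract can be illustrated with a minimal sketch (not the authors' actual pipeline; sampling rate, lag range, and the simulated delayed "EEG" signal are assumptions for the toy example):

```python
import numpy as np

def lagged_xcorr(eeg, envelope, fs, max_lag_ms=500):
    """Correlate an EEG channel with a speech envelope at positive lags.

    Returns (lags_ms, r), where r[i] is the Pearson correlation between the
    envelope and the EEG shifted by lags_ms[i] (EEG lagging the stimulus).
    """
    max_lag = int(fs * max_lag_ms / 1000)
    lags = np.arange(max_lag + 1)
    r = np.empty(len(lags))
    for i, lag in enumerate(lags):
        x = envelope[: len(envelope) - lag] if lag else envelope
        y = eeg[lag:]
        n = min(len(x), len(y))
        r[i] = np.corrcoef(x[:n], y[:n])[0, 1]
    return lags * 1000.0 / fs, r

# Toy check: an "EEG" that is the envelope delayed by 100 ms should
# produce a cross-correlation peak at a 100 ms lag.
fs = 100  # Hz
rng = np.random.default_rng(0)
env = rng.standard_normal(fs * 60)
delay = int(0.1 * fs)
eeg = np.concatenate([np.zeros(delay), env[:-delay]])
lags_ms, r = lagged_xcorr(eeg, env, fs)
print(int(lags_ms[np.argmax(r)]))  # 100
```

In a real analysis the peak latencies of such a function are what get interpreted as P1-, N1-, and P2-like components.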
Mechanisms Underlying Selective Neuronal Tracking of Attended Speech at a ‘Cocktail Party’
Zion Golumbic, Elana M.; Ding, Nai; Bickel, Stephan; Lakatos, Peter; Schevon, Catherine A.; McKhann, Guy M.; Goodman, Robert R.; Emerson, Ronald; Mehta, Ashesh D.; Simon, Jonathan Z.; Poeppel, David; Schroeder, Charles E.
2013-01-01
The ability to focus on and understand one talker in a noisy social environment is a critical social-cognitive capacity, whose underlying neuronal mechanisms are unclear. We investigated the manner in which speech streams are represented in brain activity and the way that selective attention governs the brain’s representation of speech using a ‘Cocktail Party’ Paradigm, coupled with direct recordings from the cortical surface in surgical epilepsy patients. We find that brain activity dynamically tracks speech streams using both low frequency phase and high frequency amplitude fluctuations, and that optimal encoding likely combines the two. In and near low level auditory cortices, attention ‘modulates’ the representation by enhancing cortical tracking of attended speech streams, but ignored speech remains represented. In higher order regions, the representation appears to become more ‘selective,’ in that there is no detectable tracking of ignored speech. This selectivity itself seems to sharpen as a sentence unfolds. PMID:23473326
Method and apparatus for obtaining complete speech signals for speech recognition applications
NASA Technical Reports Server (NTRS)
Abrash, Victor (Inventor); Cesari, Federico (Inventor); Franco, Horacio (Inventor); George, Christopher (Inventor); Zheng, Jing (Inventor)
2009-01-01
The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.
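The circular-buffer idea in this patent abstract (recording continuously so that audio from before the user command can be recovered) can be sketched as follows. This is a minimal illustration, not the patented method; the class name, frame representation, and pre-roll logic are invented for the example, and the HMM-based endpoint detection is not implemented:

```python
from collections import deque

class PrerollRecorder:
    """Keep the last `capacity` audio frames in a ring buffer so that frames
    occurring *before* a start command can be prepended to the capture."""
    def __init__(self, capacity):
        self.ring = deque(maxlen=capacity)  # old frames fall off automatically
        self.capture = None

    def feed(self, frame):
        self.ring.append(frame)
        if self.capture is not None:
            self.capture.append(frame)

    def start(self, preroll):
        # Seed the capture with up to `preroll` frames received before the command.
        self.capture = list(self.ring)[-preroll:]

    def stop(self):
        audio, self.capture = self.capture, None
        return audio

rec = PrerollRecorder(capacity=8)
for t in range(20):          # frames 0..19 arrive continuously
    rec.feed(t)
    if t == 14:
        rec.start(preroll=3)  # augment with frames 12..14 from the ring buffer
audio = rec.stop()
print(audio)  # [12, 13, 14, 15, 16, 17, 18, 19]
```

The returned "augmented audio signal" would then be scanned for speech endpoints before being passed to the recognizer.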
Kanwal, Jagmeet S; Medvedev, Andrei V; Micheyl, Christophe
2003-08-01
During navigation and the search phase of foraging, mustached bats emit approximately 25 ms long echolocation pulses (at 10-40 Hz) that contain multiple harmonics of a constant frequency (CF) component followed by a short (3 ms) downward frequency modulation. In the context of auditory stream segregation, therefore, bats may either perceive a coherent pulse-echo sequence (PEPE...), or segregated pulse and echo streams (P-P-P... and E-E-E...). To identify the neural mechanisms for stream segregation in bats, we developed a simple yet realistic neural network model with seven layers and 420 nodes. Our model required recurrent and lateral inhibition to enable output nodes in the network to 'latch-on' to a single tone (corresponding to a CF component in either the pulse or echo), i.e., exhibit differential suppression by the alternating two tones presented at a high rate (> 10 Hz). To test the applicability of our model to echolocation, we obtained neurophysiological data from the primary auditory cortex of awake mustached bats. Event-related potentials reliably reproduced the latching behaviour observed at output nodes in the network. Pulse as well as nontarget (clutter) echo CFs facilitated this latching. Individual single unit responses were erratic, but when summed over several recording sites, they also exhibited reliable latching behaviour even at 40 Hz. On the basis of these findings, we propose that a neural correlate of auditory stream segregation is present within localized synaptic activity in the mustached bat's auditory cortex and this mechanism may enhance the perception of echolocation sounds in the natural environment.
Auditory Stream Segregation Improves Infants' Selective Attention to Target Tones Amid Distracters
ERIC Educational Resources Information Center
Smith, Nicholas A.; Trainor, Laurel J.
2011-01-01
This study examined the role of auditory stream segregation in the selective attention to target tones in infancy. Using a task adapted from Bregman and Rudnicky's 1975 study and implemented in a conditioned head-turn procedure, infant and adult listeners had to discriminate the temporal order of 2,200 and 2,400 Hz target tones presented alone,…
An ALE meta-analysis on the audiovisual integration of speech signals.
Erickson, Laura C; Heeg, Elizabeth; Rauschecker, Josef P; Turkeltaub, Peter E
2014-11-01
The brain improves speech processing through the integration of audiovisual (AV) signals. Situations involving AV speech integration may be crudely dichotomized into those where auditory and visual inputs contain (1) equivalent, complementary signals (validating AV speech) or (2) inconsistent, different signals (conflicting AV speech). This simple framework may allow the systematic examination of broad commonalities and differences between AV neural processes engaged by various experimental paradigms frequently used to study AV speech integration. We conducted an activation likelihood estimation meta-analysis of 22 functional imaging studies comprising 33 experiments, 311 subjects, and 347 foci examining "conflicting" versus "validating" AV speech. Experimental paradigms included content congruency, timing synchrony, and perceptual measures, such as the McGurk effect or synchrony judgments, across AV speech stimulus types (sublexical to sentence). Colocalization of conflicting AV speech experiments revealed consistency across at least two contrast types (e.g., synchrony and congruency) in a network of dorsal stream regions in the frontal, parietal, and temporal lobes. There was consistency across all contrast types (synchrony, congruency, and percept) in the bilateral posterior superior/middle temporal cortex. Although fewer studies were available, validating AV speech experiments were localized to other regions, such as ventral stream visual areas in the occipital and inferior temporal cortex. These results suggest that while equivalent, complementary AV speech signals may evoke activity in regions related to the corroboration of sensory input, conflicting AV speech signals recruit widespread dorsal stream areas likely involved in the resolution of conflicting sensory signals. Copyright © 2014 Wiley Periodicals, Inc.
Connected word recognition using a cascaded neuro-computational model
NASA Astrophysics Data System (ADS)
Hoya, Tetsuya; van Leeuwen, Cees
2016-10-01
We propose a novel framework for processing a continuous speech stream that contains a varying number of words, as well as non-speech periods. Speech samples are segmented into word-tokens and non-speech periods. An augmented version of an earlier-proposed, cascaded neuro-computational model is used for recognising individual words within the stream. Simulation studies using both a multi-speaker-dependent and speaker-independent digit string database show that the proposed method yields a recognition performance comparable to that obtained by a benchmark approach using hidden Markov models with embedded training.
Emergence of Spatial Stream Segregation in the Ascending Auditory Pathway.
Yao, Justin D; Bremen, Peter; Middlebrooks, John C
2015-12-09
Stream segregation enables a listener to disentangle multiple competing sequences of sounds. A recent study from our laboratory demonstrated that cortical neurons in anesthetized cats exhibit spatial stream segregation (SSS) by synchronizing preferentially to one of two sequences of noise bursts that alternate between two source locations. Here, we examine the emergence of SSS along the ascending auditory pathway. Extracellular recordings were made in anesthetized rats from the inferior colliculus (IC), the nucleus of the brachium of the IC (BIN), the medial geniculate body (MGB), and the primary auditory cortex (A1). Stimuli consisted of interleaved sequences of broadband noise bursts that alternated between two source locations. At stimulus presentation rates of 5 and 10 bursts per second, at which human listeners report robust SSS, neural SSS is weak in the central nucleus of the IC (ICC), emerges in the BIN and in approximately two-thirds of neurons in the ventral MGB (MGBv), and is prominent throughout A1. The enhancement of SSS at the cortical level reflects both increased spatial sensitivity and increased forward suppression. We demonstrate that forward suppression in A1 does not result from synaptic inhibition at the cortical level. Instead, forward suppression might reflect synaptic depression in the thalamocortical projection. Together, our findings indicate that auditory streams are increasingly segregated along the ascending auditory pathway as distinct mutually synchronized neural populations. Listeners are capable of disentangling multiple competing sequences of sounds that originate from distinct sources. This stream segregation is aided by differences in spatial location between the sources. A possible substrate of spatial stream segregation (SSS) has been described in the auditory cortex, but the mechanisms leading to those cortical responses are unknown.
Here, we investigated SSS in three levels of the ascending auditory pathway with extracellular unit recordings in anesthetized rats. We found that neural SSS emerges within the ascending auditory pathway as a consequence of sharpening of spatial sensitivity and increasing forward suppression. Our results highlight brainstem mechanisms that culminate in SSS at the level of the auditory cortex. Copyright © 2015 Yao et al.
Big City School Desegregation--Trends and Methods.
ERIC Educational Resources Information Center
Dentler, Robert A.; Elsbery, James
The concerns of this speech are the extent of school segregation in the nation's 20 largest cities, the steps which have been and might be taken to desegregate their school systems, and the strategies necessary to effectively implement school desegregation plans. There is almost total residential segregation in 13 of these cities. Seventy percent…
Suppression of competing speech through entrainment of cortical oscillations
D'Zmura, Michael; Srinivasan, Ramesh
2013-01-01
People are highly skilled at attending to one speaker in the presence of competitors, but the neural mechanisms supporting this remain unclear. Recent studies have argued that the auditory system enhances the gain of a speech stream relative to competitors by entraining (or “phase-locking”) to the rhythmic structure in its acoustic envelope, thus ensuring that syllables arrive during periods of high neuronal excitability. We hypothesized that such a mechanism could also suppress a competing speech stream by ensuring that syllables arrive during periods of low neuronal excitability. To test this, we analyzed high-density EEG recorded from human adults while they attended to one of two competing, naturalistic speech streams. By calculating the cross-correlation between the EEG channels and the speech envelopes, we found evidence of entrainment to the attended speech's acoustic envelope as well as weaker yet significant entrainment to the unattended speech's envelope. An independent component analysis (ICA) decomposition of the data revealed sources in the posterior temporal cortices that displayed robust correlations to both the attended and unattended envelopes. Critically, in these components the signs of the correlations when attended were opposite those when unattended, consistent with the hypothesized entrainment-based suppressive mechanism. PMID:23515789
Neuronal basis of speech comprehension.
Specht, Karsten
2014-01-01
Verbal communication does not rely only on the simple perception of auditory signals. It is rather a parallel and integrative processing of linguistic and non-linguistic information, involving temporal and frontal areas in particular. This review describes the inherent complexity of auditory speech comprehension from a functional-neuroanatomical perspective. The review is divided into two parts. In the first part, structural and functional asymmetry of language-relevant structures will be discussed. The second part of the review will discuss recent neuroimaging studies, which coherently demonstrate that speech comprehension processes rely on a hierarchical network involving the temporal, parietal, and frontal lobes. Further, the results support the dual-stream model for speech comprehension, with a dorsal stream for auditory-motor integration, and a ventral stream for extracting meaning but also the processing of sentences and narratives. Specific patterns of functional asymmetry between the left and right hemisphere can also be demonstrated. The review article concludes with a discussion on interactions between the dorsal and ventral streams, particularly the involvement of motor related areas in speech perception processes, and outlines some remaining unresolved issues. This article is part of a Special Issue entitled Human Auditory Neuroimaging. Copyright © 2013 Elsevier B.V. All rights reserved.
López-Barroso, Diana; Ripollés, Pablo; Marco-Pallarés, Josep; Mohammadi, Bahram; Münte, Thomas F; Bachoud-Lévi, Anne-Catherine; Rodriguez-Fornells, Antoni; de Diego-Balaguer, Ruth
2015-04-15
Although neuroimaging studies using standard subtraction-based analysis from functional magnetic resonance imaging (fMRI) have suggested that frontal and temporal regions are involved in word learning from fluent speech, the possible contribution of different brain networks during this type of learning is still largely unknown. Indeed, univariate fMRI analyses cannot identify the full extent of distributed networks that are engaged by a complex task such as word learning. Here we used Independent Component Analysis (ICA) to characterize the different brain networks subserving word learning from an artificial language speech stream. Results were replicated in a second cohort of participants with a different linguistic background. Four spatially independent networks were associated with the task in both cohorts: (i) a dorsal Auditory-Premotor network; (ii) a dorsal Sensory-Motor network; (iii) a dorsal Fronto-Parietal network; and (iv) a ventral Fronto-Temporal network. The level of engagement of these networks varied through the learning period with only the dorsal Auditory-Premotor network being engaged across all blocks. In addition, the connectivity strength of this network in the second block of the learning phase correlated with the individual variability in word learning performance. These findings suggest that: (i) word learning relies on segregated connectivity patterns involving dorsal and ventral networks; and (ii) specifically, the dorsal auditory-premotor network connectivity strength is directly correlated with word learning performance. Copyright © 2015 Elsevier Inc. All rights reserved.
Perceptual Grouping Affects Pitch Judgments across Time and Frequency
ERIC Educational Resources Information Center
Borchert, Elizabeth M. O.; Micheyl, Christophe; Oxenham, Andrew J.
2011-01-01
Pitch, the perceptual correlate of fundamental frequency (F0), plays an important role in speech, music, and animal vocalizations. Changes in F0 over time help define musical melodies and speech prosody, while comparisons of simultaneous F0 are important for musical harmony, and for segregating competing sound sources. This study compared…
Mapping a lateralization gradient within the ventral stream for auditory speech perception.
Specht, Karsten
2013-01-01
Recent models on speech perception propose a dual-stream processing network, with a dorsal stream, extending from the posterior temporal lobe of the left hemisphere through inferior parietal areas into the left inferior frontal gyrus, and a ventral stream that is assumed to originate in the primary auditory cortex in the upper posterior part of the temporal lobe and to extend toward the anterior part of the temporal lobe, where it may connect to the ventral part of the inferior frontal gyrus. This article describes and reviews the results from a series of complementary functional magnetic resonance imaging studies that aimed to trace the hierarchical processing network for speech comprehension within the left and right hemisphere with a particular focus on the temporal lobe and the ventral stream. As hypothesized, the results demonstrate a bilateral involvement of the temporal lobes in the processing of speech signals. However, an increasing leftward asymmetry was detected from auditory-phonetic to lexico-semantic processing and along the posterior-anterior axis, thus forming a "lateralization" gradient. This increasing leftward lateralization was particularly evident for the left superior temporal sulcus and more anterior parts of the temporal lobe. PMID:24106470
What's in a Face? Visual Contributions to Speech Segmentation
ERIC Educational Resources Information Center
Mitchel, Aaron D.; Weiss, Daniel J.
2010-01-01
Recent research has demonstrated that adults successfully segment two interleaved artificial speech streams with incongruent statistics (i.e., streams whose combined statistics are noisier than the encapsulated statistics) only when provided with an indexical cue of speaker voice. In a series of five experiments, our study explores whether…
Auinger, Alice Barbara; Riss, Dominik; Liepins, Rudolfs; Rader, Tobias; Keck, Tilman; Keintzel, Thomas; Kaider, Alexandra; Baumgartner, Wolf-Dieter; Gstoettner, Wolfgang; Arnoldner, Christoph
2017-07-01
It has been shown that patients with electric acoustic stimulation (EAS) perform better in noisy environments than patients with a cochlear implant (CI). One reason for this could be the preserved access to acoustic low-frequency cues including the fundamental frequency (F0). Therefore, our primary aim was to investigate whether users of EAS experience a release from masking with increasing F0 difference between target talker and masking talker. The study comprised 29 patients and consisted of three groups of subjects: EAS users, CI users and normal-hearing listeners (NH). All CI and EAS users were implanted with a MED-EL cochlear implant and had at least 12 months of experience with the implant. Speech perception was assessed with the Oldenburg sentence test (OlSa) using one sentence from the test corpus as speech masker. The F0 in this masking sentence was shifted upwards by 4, 8, or 12 semitones. For each of these masker conditions the speech reception threshold (SRT) was assessed by adaptively varying the masker level while presenting the target sentences at a fixed level. A statistically significant improvement in speech perception was found for increasing difference in F0 between target sentence and masker sentence in EAS users (p = 0.038) and in NH listeners (p = 0.003). In CI users (classic CI or EAS users with electrical stimulation only) speech perception was independent from differences in F0 between target and masker. A release from masking with increasing difference in F0 between target and masking speech was only observed in listeners and configurations in which the low-frequency region was presented acoustically. Thus, the speech information contained in the low frequencies seems to be crucial for allowing listeners to separate multiple sources. By combining acoustic and electric information, EAS users even manage tasks as complicated as segregating the audio streams from multiple talkers. 
Preserving the natural code, like fine-structure cues in the low-frequency region, seems to be crucial to provide CI users with the best benefit. Copyright © 2017 Elsevier B.V. All rights reserved.
Neural Entrainment to Rhythmically Presented Auditory, Visual, and Audio-Visual Speech in Children
Power, Alan James; Mead, Natasha; Barnes, Lisa; Goswami, Usha
2012-01-01
Auditory cortical oscillations have been proposed to play an important role in speech perception. It is suggested that the brain may take temporal “samples” of information from the speech stream at different rates, phase resetting ongoing oscillations so that they are aligned with similar frequency bands in the input (“phase locking”). Information from these frequency bands is then bound together for speech perception. To date, there are no explorations of neural phase locking and entrainment to speech input in children. However, it is clear from studies of language acquisition that infants use both visual speech information and auditory speech information in learning. In order to study neural entrainment to speech in typically developing children, we use a rhythmic entrainment paradigm (underlying 2 Hz or delta rate) based on repetition of the syllable “ba,” presented in either the auditory modality alone, the visual modality alone, or as auditory-visual speech (via a “talking head”). To ensure attention to the task, children aged 13 years were asked to press a button as fast as possible when the “ba” stimulus violated the rhythm for each stream type. Rhythmic violation depended on delaying the occurrence of a “ba” in the isochronous stream. Neural entrainment was demonstrated for all stream types, and individual differences in standardized measures of language processing were related to auditory entrainment at the theta rate. Further, there was significant modulation of the preferred phase of auditory entrainment in the theta band when visual speech cues were present, indicating cross-modal phase resetting. The rhythmic entrainment paradigm developed here offers a method for exploring individual differences in oscillatory phase locking during development. In particular, a method for assessing neural entrainment and cross-modal phase resetting would be useful for exploring developmental learning difficulties thought to involve temporal sampling, such as dyslexia. 
PMID:22833726
Isolating the Energetic Component of Speech-on-Speech Masking With Ideal Time-Frequency Segregation
2006-12-01
Rapid Learning of Syllable Classes from a Perceptually Continuous Speech Stream
ERIC Educational Resources Information Center
Endress, Ansgar D.; Bonatti, Luca L.
2007-01-01
To learn a language, speakers must learn its words and rules from fluent speech; in particular, they must learn dependencies among linguistic classes. We show that when familiarized with a short artificial, subliminally bracketed stream, participants can learn relations about the structure of its words, which specify the classes of syllables…
The Function of Consciousness in Multisensory Integration
ERIC Educational Resources Information Center
Palmer, Terry D.; Ramsey, Ashley K.
2012-01-01
The function of consciousness was explored in two contexts of audio-visual speech, cross-modal visual attention guidance and McGurk cross-modal integration. Experiments 1, 2, and 3 utilized a novel cueing paradigm in which two different flash-suppressed lip-streams co-occurred with speech sounds matching one of these streams. A visual target was…
Audio-video feature correlation: faces and speech
NASA Astrophysics Data System (ADS)
Durand, Gwenael; Montacie, Claude; Caraty, Marie-Jose; Faudemay, Pascal
1999-08-01
This paper presents a study of the correlation of features automatically extracted from the audio stream and the video stream of audiovisual documents. In particular, we were interested in finding out whether speech analysis tools could be combined with face detection methods, and to what extent they should be combined. A generic audio signal partitioning algorithm was first used to detect Silence/Noise/Music/Speech segments in a full-length movie. A generic object detection method was applied to the keyframes extracted from the movie in order to detect the presence or absence of faces. The correlation between the presence of a face in the keyframes and of the corresponding voice in the audio stream was studied. A third stream, which is the script of the movie, is warped on the speech channel in order to automatically label faces appearing in the keyframes with the name of the corresponding character. We naturally found that extracted audio and video features were related in many cases, and that significant benefits can be obtained from the joint use of audio and video analysis methods.
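In its simplest form, the face/voice correlation studied above reduces to the agreement between two boolean time series sampled at the same keyframe instants: face present in the keyframe, and speech detected in the co-located audio segment. A hypothetical sketch (the function name and toy data are invented for illustration; the paper's actual feature extraction is far richer):

```python
import numpy as np

def presence_correlation(face, speech):
    """Pearson (phi) correlation between two boolean presence streams.

    `face` and `speech` are 0/1 sequences aligned on keyframe instants.
    """
    face = np.asarray(face, dtype=float)
    speech = np.asarray(speech, dtype=float)
    return float(np.corrcoef(face, speech)[0, 1])

# Perfectly aligned face/voice presence gives a correlation of 1.0.
face   = [1, 1, 0, 0, 1, 0, 1, 1]
speech = [1, 1, 0, 0, 1, 0, 1, 1]
print(presence_correlation(face, speech))  # 1.0
```

For boolean indicators the Pearson coefficient coincides with the phi coefficient, so values near +1 indicate that faces and voices tend to appear in the same segments.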
Sequential Organization and Room Reverberation for Speech Segregation
2012-02-28
We have proposed two algorithms for sequential organization: an unsupervised clustering algorithm applicable to monaural recordings and a binaural ...algorithm that integrates monaural and binaural analyses. In addition, we have conducted speech intelligibility tests that firmly establish the...comprehensive version is currently under review for journal publication. A binaural approach in room reverberation: Most existing approaches to binaural or
Speech and Language Therapy/Pathology: Perspectives on a Gendered Profession
ERIC Educational Resources Information Center
Litosseliti, Lia; Leadbeater, Claire
2013-01-01
Background: The speech and language therapy/pathology (SLT/SLP) profession is characterized by extreme "occupational sex segregation", a term used to refer to persistently male- or female-dominated professions. Men make up only 2.5% of all SLTs in the UK, and a similar imbalance is found in other countries. Despite calls to increase…
Interactive MPEG-4 low-bit-rate speech/audio transmission over the Internet
NASA Astrophysics Data System (ADS)
Liu, Fang; Kim, JongWon; Kuo, C.-C. Jay
1999-11-01
The recently developed MPEG-4 technology enables the coding and transmission of natural and synthetic audio-visual data in the form of objects. In an effort to extend the object-based functionality of MPEG-4 to real-time Internet applications, architectural prototypes of the multiplex layer and transport layer tailored for transmission of MPEG-4 data over IP are under debate among the Internet Engineering Task Force (IETF) and the MPEG-4 Systems Ad Hoc group. In this paper, we present an architecture for an interactive MPEG-4 speech/audio transmission system over the Internet. It utilizes a framework of Real Time Streaming Protocol (RTSP) over Real-time Transport Protocol (RTP) to provide controlled, on-demand delivery of real-time speech/audio data. Based on a client-server model, a couple of low bit-rate bit streams (real-time speech/audio, pre-encoded speech/audio) are multiplexed and transmitted via a single RTP channel to the receiver. The MPEG-4 Scene Description (SD) and Object Descriptor (OD) bit streams are securely sent through the RTSP control channel. Upon receiving, an initial MPEG-4 audio-visual scene is constructed after de-multiplexing, decoding of bit streams, and scene composition. A receiver is allowed to manipulate the initial audio-visual scene presentation locally, or interactively arrange scene changes by sending requests to the server. A server may also choose to update the client with new streams and a list of contents for user selection.
Discrepant visual speech facilitates covert selective listening in "cocktail party" conditions.
Williams, Jason A
2012-06-01
The presence of congruent visual speech information facilitates the identification of auditory speech, while the addition of incongruent visual speech information often impairs accuracy. This latter arrangement occurs naturally when one is being directly addressed in conversation but listens to a different speaker. Under these conditions, performance may diminish since: (a) one is bereft of the facilitative effects of the corresponding lip motion and (b) one becomes subject to visual distortion by incongruent visual speech; by contrast, speech intelligibility may be improved due to (c) bimodal localization of the central unattended stimulus. Participants were exposed to centrally presented visual and auditory speech while attending to a peripheral speech stream. In some trials, the lip movements of the central visual stimulus matched the unattended speech stream; in others, the lip movements matched the attended peripheral speech. Accuracy for the peripheral stimulus was nearly one standard deviation greater with incongruent visual information, compared to the congruent condition which provided bimodal pattern recognition cues. Likely, the bimodal localization of the central stimulus further differentiated the stimuli and thus facilitated intelligibility. Results are discussed with regard to similar findings in an investigation of the ventriloquist effect, and the relative strength of localization and speech cues in covert listening.
DETECTION AND IDENTIFICATION OF SPEECH SOUNDS USING CORTICAL ACTIVITY PATTERNS
Centanni, T.M.; Sloan, A.M.; Reed, A.C.; Engineer, C.T.; Rennaker, R.; Kilgard, M.P.
2014-01-01
We have developed a classifier capable of locating and identifying speech sounds using activity from rat auditory cortex with an accuracy equivalent to behavioral performance without the need to specify the onset time of the speech sounds. This classifier can identify speech sounds from a large speech set within 40 ms of stimulus presentation. To compare the temporal limits of the classifier to behavior, we developed a novel task that requires rats to identify individual consonant sounds from a stream of distracter consonants. The classifier successfully predicted the ability of rats to accurately identify speech sounds for syllable presentation rates up to 10 syllables per second (up to 17.9 ± 1.5 bits/sec), which is comparable to human performance. Our results demonstrate that the spatiotemporal patterns generated in primary auditory cortex can be used to quickly and accurately identify consonant sounds from a continuous speech stream without prior knowledge of the stimulus onset times. Improved understanding of the neural mechanisms that support robust speech processing in difficult listening conditions could improve the identification and treatment of a variety of speech processing disorders. PMID:24286757
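A classifier of the kind described, which scans ongoing cortical activity without knowing stimulus onset times, can be illustrated with a minimal template-matching sketch. The sliding-window nearest-template scheme and the `threshold` parameter are illustrative assumptions, not the authors' actual classifier:

```python
import numpy as np

def classify_stream(activity, templates, threshold):
    """Slide a window over a continuous activity stream (channels x time)
    and report (onset, label) wherever the closest class template lies
    within `threshold` Euclidean distance. All names are illustrative."""
    n_ch, win = next(iter(templates.values())).shape
    detections = []
    for t in range(activity.shape[1] - win + 1):
        window = activity[:, t:t + win]
        # distance of this window to each class's spatiotemporal template
        dists = {label: np.linalg.norm(window - tpl)
                 for label, tpl in templates.items()}
        best = min(dists, key=dists.get)
        if dists[best] < threshold:
            detections.append((t, best))
    return detections
```

Because the window is evaluated at every offset, no onset marker is needed; a detection is emitted wherever the activity pattern itself matches a stored template closely enough.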
Inferring Speaker Affect in Spoken Natural Language Communication
ERIC Educational Resources Information Center
Pon-Barry, Heather Roberta
2013-01-01
The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards…
Pitch-Based Segregation of Reverberant Speech
2005-02-01
speaker recognition in real environments, audio information retrieval and hearing prosthesis. Second, although binaural listening improves the...intelligibility of target speech under anechoic conditions (Bronkhorst, 2000), this binaural advantage is largely eliminated by reverberation (Plomp, 1976...Brown and Cooke, 1994; Wang and Brown, 1999; Hu and Wang, 2004) as well as in binaural separation (e.g., Roman et al., 2003; Palomaki et al., 2004
Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing
Rauschecker, Josef P; Scott, Sophie K
2010-01-01
Speech and language are considered uniquely human abilities: animals have communication systems, but they do not match human linguistic skills in terms of recursive structure and combinatorial power. Yet, in evolution, spoken language must have emerged from neural mechanisms at least partially available in animals. In this paper, we will demonstrate how our understanding of speech perception, one important facet of language, has profited from findings and theory in nonhuman primate studies. Chief among these are physiological and anatomical studies showing that primate auditory cortex, across species, shows patterns of hierarchical structure, topographic mapping and streams of functional processing. We will identify roles for different cortical areas in the perceptual processing of speech and review functional imaging work in humans that bears on our understanding of how the brain decodes and monitors speech. A new model connects structures in the temporal, frontal and parietal lobes linking speech perception and production. PMID:19471271
2015-01-01
Table 2: Segregation results in terms of STOI on a variety of novel noises (SNR = -2 dB): Babble-20, Cafeteria, Factory, Babble-100, Living Room, Cafe, Park. ...NOISEX-92 corpus [13], and a living room, a cafe and a park noise from the DEMAND corpus [12]. To put the performance of the noise-independent model in
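The STOI scores reported above measure short-time correlation between clean and processed speech. A much-simplified proxy, omitting the standard's 1/3-octave filterbank, silent-frame removal, and SDR clipping, might look like:

```python
import numpy as np

def simplified_stoi(clean, processed, frame=256):
    """Crude short-time objective intelligibility proxy: mean correlation
    between short-time segments of clean and processed signals. The real
    STOI metric adds a 1/3-octave filterbank, silent-frame removal and
    clipping; this sketch keeps only the core idea."""
    scores = []
    for i in range(0, min(len(clean), len(processed)) - frame + 1, frame):
        x = clean[i:i + frame]
        y = processed[i:i + frame]
        x = x - x.mean()
        y = y - y.mean()
        denom = np.linalg.norm(x) * np.linalg.norm(y)
        if denom > 0:
            scores.append(float(np.dot(x, y) / denom))
    return float(np.mean(scores)) if scores else 0.0
```

An unprocessed copy of the clean signal scores 1.0; added noise lowers the per-frame correlations and hence the score.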
High-frequency neural activity predicts word parsing in ambiguous speech streams.
Kösem, Anne; Basirat, Anahita; Azizi, Leila; van Wassenhove, Virginie
2016-12-01
During speech listening, the brain parses a continuous acoustic stream of information into computational units (e.g., syllables or words) necessary for speech comprehension. Recent neuroscientific hypotheses have proposed that neural oscillations contribute to speech parsing, but whether they do so on the basis of acoustic cues (bottom-up acoustic parsing) or as a function of available linguistic representations (top-down linguistic parsing) is unknown. In this magnetoencephalography study, we contrasted acoustic and linguistic parsing using bistable speech sequences. While listening to the speech sequences, participants were asked to maintain one of the two possible speech percepts through volitional control. We predicted that the tracking of speech dynamics by neural oscillations would not only follow the acoustic properties but also shift in time according to the participant's conscious speech percept. Our results show that the latency of high-frequency activity (specifically, beta and gamma bands) varied as a function of the perceptual report. In contrast, the phase of low-frequency oscillations was not strongly affected by top-down control. Whereas changes in low-frequency neural oscillations were compatible with the encoding of prelexical segmentation cues, high-frequency activity specifically informed on an individual's conscious speech percept. Copyright © 2016 the American Physiological Society.
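Neural tracking of speech dynamics, as measured in studies like this one, is often summarized by the lag at which neural activity best follows the stimulus envelope. A minimal cross-correlation sketch (not the authors' MEG analysis pipeline):

```python
import numpy as np

def tracking_latency(envelope, neural, fs):
    """Latency (in seconds) at which the neural signal best tracks the
    stimulus envelope, estimated from the peak of the cross-correlation.
    A positive lag means the neural signal follows the envelope."""
    e = envelope - envelope.mean()
    n = neural - neural.mean()
    xcorr = np.correlate(n, e, mode='full')
    lag = np.argmax(xcorr) - (len(e) - 1)
    return lag / fs
```

Shifts in this peak latency across conditions are one way the kind of percept-dependent timing changes reported above could be quantified.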
Sheth, Bhavin R; Young, Ryan
2016-01-01
Evidence is strong that the visual pathway is segregated into two distinct streams, ventral and dorsal. Two proposals theorize that the pathways are segregated in function: The ventral stream processes information about object identity, whereas the dorsal stream, according to one model, processes information about object location, and according to another, is responsible for executing movements under visual control. The models are influential; however, recent experimental evidence challenges them, e.g., the ventral stream is not solely responsible for object recognition; conversely, its function is not strictly limited to object vision; the dorsal stream is not responsible by itself for spatial vision or visuomotor control; conversely, its function extends beyond vision or visuomotor control. In their place, we suggest a robust dichotomy consisting of a ventral stream selectively sampling high-resolution/focal spaces, and a dorsal stream sampling nearly all of space with reduced foveal bias. The proposal hews closely to the theme of embodied cognition: Function arises as a consequence of an extant sensory underpinning. A continuous, not sharp, segregation based on function emerges, and carries with it an undercurrent of an exploitation-exploration dichotomy. Under this interpretation, cells of the ventral stream, which individually have more punctate receptive fields that generally include the fovea or parafovea, provide detailed information about object shapes and features and lead to the systematic exploitation of said information; cells of the dorsal stream, which individually have large receptive fields, contribute to visuospatial perception, provide information about the presence/absence of salient objects and their locations for novel exploration and subsequent exploitation by the ventral stream or, under certain conditions, the dorsal stream.
We leverage the dichotomy to unify neuropsychological cases under a common umbrella, account for the increased prevalence of multisensory integration in the dorsal stream under a Bayesian framework, predict conditions under which object recognition utilizes the ventral or dorsal stream, and explain why cells of the dorsal stream drive sensorimotor control and motion processing and have poorer feature selectivity. Finally, the model speculates on a dynamic interaction between the two streams that underscores a unified, seamless perception. Existing theories are subsumed under our proposal.
Speech Segregation based on Binary Classification
2016-07-15
including the IBM, the target binary mask (TBM), the IRM, the short-time Fourier transform spectral magnitude (FFT-MAG) and its corresponding mask (FFT...complementary features and a fixed DNN as the discriminative learning machine. For evaluation metrics, besides SNR, we use the Short-Time Objective...target analysis is a recent successful intelligibility test conducted on both normal-hearing (NH) and hearing-impaired (HI) listeners. The speech
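The masks named in this excerpt have standard definitions in the speech-segregation literature. Assuming magnitude spectrograms of the premixed speech and noise are available, the ideal binary mask (IBM) and ideal ratio mask (IRM) can be computed as:

```python
import numpy as np

def ideal_binary_mask(speech_mag, noise_mag, lc_db=0.0):
    """IBM: 1 where the local SNR exceeds a criterion (in dB), else 0."""
    snr_db = 20.0 * np.log10(speech_mag / np.maximum(noise_mag, 1e-12))
    return (snr_db > lc_db).astype(float)

def ideal_ratio_mask(speech_mag, noise_mag):
    """IRM: soft mask from the speech/noise energy ratio per T-F unit."""
    s2, n2 = speech_mag ** 2, noise_mag ** 2
    return np.sqrt(s2 / (s2 + n2 + 1e-12))
```

Both masks are applied by elementwise multiplication with the mixture spectrogram; the IBM keeps or discards time-frequency units outright, while the IRM scales them smoothly.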
Pattern Specificity in the Effect of Prior [delta]f on Auditory Stream Segregation
ERIC Educational Resources Information Center
Snyder, Joel S.; Weintraub, David M.
2011-01-01
During repeating sequences of low (A) and high (B) tones, perception of two separate streams ("streaming") increases with greater frequency separation ([delta]f) between the A and B tones; in contrast, a prior context with large [delta]f results in less streaming during a subsequent test pattern. The purpose of the present study was to…
Sensory Intelligence for Extraction of an Abstract Auditory Rule: A Cross-Linguistic Study.
Guo, Xiao-Tao; Wang, Xiao-Dong; Liang, Xiu-Yuan; Wang, Ming; Chen, Lin
2018-02-21
In a complex linguistic environment, while speech sounds can greatly vary, some shared features are often invariant. These invariant features constitute so-called abstract auditory rules. Our previous study has shown that with auditory sensory intelligence, the human brain can automatically extract the abstract auditory rules in the speech sound stream, presumably serving as the neural basis for speech comprehension. However, whether the sensory intelligence for extraction of abstract auditory rules in speech is inherent or experience-dependent remains unclear. To address this issue, we constructed a complex speech sound stream using auditory materials in Mandarin Chinese, in which syllables had a flat lexical tone but differed in other acoustic features to form an abstract auditory rule. This rule was occasionally and randomly violated by the syllables with the rising, dipping or falling tone. We found that both Chinese and foreign speakers detected the violations of the abstract auditory rule in the speech sound stream at a pre-attentive stage, as revealed by the whole-head recordings of mismatch negativity (MMN) in a passive paradigm. However, MMNs peaked earlier in Chinese speakers than in foreign speakers. Furthermore, Chinese speakers showed different MMN peak latencies for the three deviant types, which paralleled recognition points. These findings indicate that the sensory intelligence for extraction of abstract auditory rules in speech sounds is innate but shaped by language experience. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
Damage to ventral and dorsal language pathways in acute aphasia
Hartwigsen, Gesa; Kellmeyer, Philipp; Glauche, Volkmar; Mader, Irina; Klöppel, Stefan; Suchan, Julia; Karnath, Hans-Otto; Weiller, Cornelius; Saur, Dorothee
2013-01-01
Converging evidence from neuroimaging studies and computational modelling suggests an organization of language in a dual dorsal–ventral brain network: a dorsal stream connects temporoparietal with frontal premotor regions through the superior longitudinal and arcuate fasciculus and integrates sensorimotor processing, e.g. in repetition of speech. A ventral stream connects temporal and prefrontal regions via the extreme capsule and mediates meaning, e.g. in auditory comprehension. The aim of our study was to test, in a large sample of 100 aphasic stroke patients, how well acute impairments of repetition and comprehension correlate with lesions of either the dorsal or ventral stream. We combined voxelwise lesion-behaviour mapping with the dorsal and ventral white matter fibre tracts determined by probabilistic fibre tracking in our previous study in healthy subjects. We found that repetition impairments were mainly associated with lesions located in the posterior temporoparietal region with a statistical lesion maximum in the periventricular white matter in projection of the dorsal superior longitudinal and arcuate fasciculus. In contrast, lesions associated with comprehension deficits were found more ventral-anterior in the temporoprefrontal region with a statistical lesion maximum between the insular cortex and the putamen in projection of the ventral extreme capsule. Individual lesion overlap with the dorsal fibre tract showed a significant negative correlation with repetition performance, whereas lesion overlap with the ventral fibre tract revealed a significant negative correlation with comprehension performance. To summarize, our results from patients with acute stroke lesions support the claim that language is organized along two segregated dorsal–ventral streams. 
Particularly, this is the first lesion study demonstrating that task performance on auditory comprehension measures requires an interaction between temporal and prefrontal brain regions via the ventral extreme capsule pathway. PMID:23378217
May-McNally, Shannan L; Quinn, Thomas P; Taylor, Eric B
2015-08-01
Understanding the extent of interspecific hybridization and how ecological segregation may influence hybridization requires comprehensively sampling different habitats over a range of life history stages. Arctic char (Salvelinus alpinus) and Dolly Varden (S. malma) are recently diverged salmonid fishes that come into contact in several areas of the North Pacific where they occasionally hybridize. To better quantify the degree of hybridization and ecological segregation between these taxa, we sampled over 700 fish from multiple lake (littoral and profundal) and stream sites in two large, interconnected southwestern Alaskan lakes. Individuals were genotyped at 12 microsatellite markers, and genetic admixture (Q) values generated through Bayesian-based clustering revealed hybridization levels generally lower than reported in a previous study (<0.6% to 5% of samples classified as late-generation hybrids). Dolly Varden and Arctic char tended to make different use of stream habitats with the latter apparently abandoning streams for lake habitats after 2-3 years of age. Our results support the distinct biological species status of Dolly Varden and Arctic char and suggest that ecological segregation may be an important factor limiting opportunities for hybridization and/or the ecological performance of hybrid char.
Data-driven analysis of functional brain interactions during free listening to music and speech.
Fang, Jun; Hu, Xintao; Han, Junwei; Jiang, Xi; Zhu, Dajiang; Guo, Lei; Liu, Tianming
2015-06-01
Natural stimulus functional magnetic resonance imaging (N-fMRI), such as fMRI acquired while participants were watching video streams or listening to audio streams, has been increasingly used to investigate functional mechanisms of the human brain in recent years. One of the fundamental challenges in functional brain mapping based on N-fMRI is to model the brain's functional responses to continuous, naturalistic and dynamic natural stimuli. To address this challenge, in this paper we present a data-driven approach to exploring functional interactions in the human brain during free listening to music and speech streams. Specifically, we model the brain responses using N-fMRI by measuring the functional interactions on large-scale brain networks with intrinsically established structural correspondence, and perform music and speech classification tasks to guide the systematic identification of consistent and discriminative functional interactions when multiple subjects were listening to music and speech in multiple categories. The underlying premise is that the functional interactions derived from N-fMRI data of multiple subjects should exhibit both consistency and discriminability. Our experimental results show that a variety of brain systems including attention, memory, auditory/language, emotion, and action networks are among the most relevant brain systems involved in differentiating classical music, pop music and speech. Our study provides an alternative approach to investigating the human brain's mechanism in comprehension of complex natural music and speech.
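The core idea of classifying brain states from functional interactions can be sketched as correlating regional time series and feeding the resulting features to a simple classifier. The nearest-centroid rule below is a stand-in assumption for illustration, not the authors' actual method:

```python
import numpy as np

def interaction_features(ts):
    """ts: regions x time signals; return the upper triangle of the
    correlation matrix as a 'functional interaction' feature vector."""
    c = np.corrcoef(ts)
    iu = np.triu_indices_from(c, k=1)
    return c[iu]

def centroid_classify(train_feats, train_labels, test_feat):
    """Nearest-centroid classifier over interaction features: a simple
    stand-in for the discriminative selection the abstract describes."""
    labels = sorted(set(train_labels))
    cents = {l: np.mean([f for f, y in zip(train_feats, train_labels)
                         if y == l], axis=0)
             for l in labels}
    return min(cents, key=lambda l: np.linalg.norm(test_feat - cents[l]))
```

Features that are consistent within a stimulus category but differ across categories (the premise stated above) are exactly those that make such a classifier succeed.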
Sixteen-Month-Old Infants Segment Words from Infant- and Adult-Directed Speech
ERIC Educational Resources Information Center
Mani, Nivedita; Pätzold, Wiebke
2016-01-01
One of the first challenges facing the young language learner is the task of segmenting words from a natural language speech stream, without prior knowledge of how these words sound. Studies with younger children find that it is easier for children to segment words from fluent speech when the words are presented in infant-directed speech, i.e., the…
ERIC Educational Resources Information Center
Chinello, Alessandro; Cattani, Veronica; Bonfiglioli, Claudia; Dehaene, Stanislas; Piazza, Manuela
2013-01-01
In the primate brain, sensory information is processed along two partially segregated cortical streams: the ventral stream, mainly coding for objects' shape and identity, and the dorsal stream, mainly coding for objects' quantitative information (including size, number, and spatial position). Neurophysiological measures indicate that such…
Robust audio-visual speech recognition under noisy audio-video conditions.
Stewart, Darryl; Seymour, Rowan; Pass, Adrian; Ming, Ji
2014-02-01
This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added in either or both the video and audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach, and also compared to any fixed-weighted integration approach, in both clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
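A simplified rendering of per-frame dynamic stream weighting in the spirit of MWSP: for each frame, choose the audio/video weight that maximizes the best combined class posterior, then decode that class. The grid search over weights is an illustrative assumption, not the paper's exact formulation:

```python
import numpy as np

def mwsp_combine(audio_post, video_post, weights=None):
    """Per frame, pick the stream weight that maximizes the peak combined
    class posterior, then decode that class. A simplified sketch of the
    maximum weighted stream posterior idea, not the exact algorithm."""
    if weights is None:
        weights = np.linspace(0.0, 1.0, 11)
    decoded = []
    for pa, pv in zip(audio_post, video_post):
        pa = np.asarray(pa, float)
        pv = np.asarray(pv, float)
        best_peak, best_cls = -1.0, None
        for w in weights:
            combined = pa ** w * pv ** (1.0 - w)   # geometric combination
            combined = combined / combined.sum()   # renormalize to posterior
            if combined.max() > best_peak:
                best_peak = float(combined.max())
                best_cls = int(np.argmax(combined))
        decoded.append(best_cls)
    return decoded
```

When one stream is corrupted its posteriors flatten toward uniform, so the weight search naturally shifts emphasis to the more confident modality, frame by frame, with no noise measurement required.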
Near-Term Fetuses Process Temporal Features of Speech
ERIC Educational Resources Information Center
Granier-Deferre, Carolyn; Ribeiro, Aurelie; Jacquet, Anne-Yvonne; Bassereau, Sophie
2011-01-01
The perception of speech and music requires processing of variations in spectra and amplitude over different time intervals. Near-term fetuses can discriminate acoustic features, such as frequencies and spectra, but whether they can process complex auditory streams, such as speech sequences and more specifically their temporal variations, fast or…
Statistical Methods of Latent Structure Discovery in Child-Directed Speech
ERIC Educational Resources Information Center
Panteleyeva, Natalya B.
2010-01-01
This dissertation investigates how distributional information in the speech stream can assist infants in the initial stages of acquisition of their native language phonology. An exploratory statistical analysis derives this information from the adult speech data in the corpus of conversations between adults and young children in Russian. Because…
The Neural Basis of Speech Parsing in Children and Adults
ERIC Educational Resources Information Center
McNealy, Kristin; Mazziotta, John C.; Dapretto, Mirella
2010-01-01
Word segmentation, detecting word boundaries in continuous speech, is a fundamental aspect of language learning that can occur solely by the computation of statistical and speech cues. Fifty-four children underwent functional magnetic resonance imaging (fMRI) while listening to three streams of concatenated syllables that contained either high…
Harnessing Active Fins to Segregate Nanoparticles from Binary Mixtures
NASA Astrophysics Data System (ADS)
Liu, Ya; Kuksenok, Olga; Bhattacharya, Amitabh; Ma, Yongting; He, Ximin; Aizenberg, Joanna; Balazs, Anna
2014-03-01
One of the challenges in creating high-performance polymeric nanocomposites for optoelectronic applications, such as bilayer solar cells, is establishing effective and facile routes for controlling interface properties and the segregation of binary mixtures of hole-conductor and electron-conductor particles. We model nanocomposites that encompass binary particles and binary blends in a microchannel. An array of oscillating microfins is immersed in the fluid and tethered to the floor of the microchannel; the fluid containing the mixture of nanoparticles is driven along the channel by an imposed pressure gradient. During the oscillations, fins with specific chemical wetting reach the upper fluid when they are upright and lie entirely within the lower stream when they are tilted. We introduce specific interactions between the fins and the particulates in solution. The fins can selectively "catch" target nanoparticles within the upper fluid stream and then release them into the lower stream. We focus on different modes of fin motion to optimize the selective segregation of particles within the binary mixture. Our approach provides an effective means of tailoring the properties and ultimate performance of the composites.
Getting the cocktail party started: masking effects in speech perception
Evans, S; McGettigan, C; Agnew, ZK; Rosen, S; Scott, SK
2016-01-01
Spoken conversations typically take place in noisy environments, and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous functional Magnetic Resonance Imaging (fMRI), whilst they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioural task. We demonstrate that competing speech is processed predominantly in the left hemisphere within the same pathway as target speech but is not treated equivalently within that stream, and that individuals who perform better in speech-in-noise tasks activate the left mid-posterior superior temporal gyrus more. Finally, we identify neural responses associated with the onset of sounds in the auditory environment; activity was found within right-lateralised frontal regions, consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise. PMID:26696297
Felix II, Richard A.; Gourévitch, Boris; Gómez-Álvarez, Marcelo; Leijon, Sara C. M.; Saldaña, Enrique; Magnusson, Anna K.
2017-01-01
Auditory streaming enables perception and interpretation of complex acoustic environments that contain competing sound sources. At early stages of central processing, sounds are segregated into separate streams representing attributes that later merge into acoustic objects. Streaming of temporal cues is critical for perceiving vocal communication, such as human speech, but our understanding of circuits that underlie this process is lacking, particularly at subcortical levels. The superior paraolivary nucleus (SPON), a prominent group of inhibitory neurons in the mammalian brainstem, has been implicated in processing temporal information needed for the segmentation of ongoing complex sounds into discrete events. The SPON requires temporally precise and robust excitatory input(s) to convey information about the steep rise in sound amplitude that marks the onset of voiced sound elements. Unfortunately, the sources of excitation to the SPON and the impact of these inputs on the behavior of SPON neurons have yet to be resolved. Using anatomical tract tracing and immunohistochemistry, we identified octopus cells in the contralateral cochlear nucleus (CN) as the primary source of excitatory input to the SPON. Cluster analysis of miniature excitatory events also indicated that the majority of SPON neurons receive one type of excitatory input. Precise octopus cell-driven onset spiking coupled with transient offset spiking make SPON responses well-suited to signal transitions in sound energy contained in vocalizations. Targets of octopus cell projections, including the SPON, are strongly implicated in the processing of temporal sound features, which suggests a common pathway that conveys information critical for perception of complex natural sounds. PMID:28620283
Corollary discharge provides the sensory content of inner speech.
Scott, Mark
2013-09-01
Inner speech is one of the most common, but least investigated, mental activities humans perform. It is an internal copy of one's external voice and so is similar to a well-established component of motor control: corollary discharge. Corollary discharge is a prediction of the sound of one's voice generated by the motor system. This prediction is normally used to filter self-caused sounds from perception, which segregates them from externally caused sounds and prevents the sensory confusion that would otherwise result. The similarity between inner speech and corollary discharge motivates the theory, tested here, that corollary discharge provides the sensory content of inner speech. The results reported here show that inner speech attenuates the impact of external sounds. This attenuation was measured using a context effect (an influence of contextual speech sounds on the perception of subsequent speech sounds), which weakens in the presence of speech imagery that matches the context sound. Results from a control experiment demonstrated this weakening in external speech as well. Such sensory attenuation is a hallmark of corollary discharge.
Enhancing Communication in Noisy Environments
2009-10-01
Source azimuth is derived from the ITD and ILD cues, which are binaural: ITD depends on the azimuthal position of the source, and ILD likewise varies with azimuth. Reported SNR gains include: Perceptual Binaural Speech Enhancement [42], 4.5 dB; Fuzzy Cocktail Party Processor [25], 7.5 dB; Binaural segregation [43], 8.9 dB. [42] Dong, R. Perceptual Binaural Speech Enhancement in Noisy Environments. M.A.Sc. thesis.
ERIC Educational Resources Information Center
Raman, Santhiram R.; Sua, Tan Yao
2010-01-01
Ethnic segregation has become an emerging feature in Malaysia's education system even though the institutional role of education should have been a unifying force for the country's multi-ethnic society. The underlying problem is that, at all levels of education provision in Malaysia, alternative streams are allowed to coexist alongside mainstream…
ERIC Educational Resources Information Center
Golumbic, Elana M. Zion; Poeppel, David; Schroeder, Charles E.
2012-01-01
The human capacity for processing speech is remarkable, especially given that information in speech unfolds over multiple time scales concurrently. Similarly notable is our ability to filter out extraneous sounds and focus our attention on one conversation, epitomized by the "Cocktail Party" effect. Yet, the neural mechanisms underlying on-line…
Multi-stream LSTM-HMM decoding and histogram equalization for noise robust keyword spotting.
Wöllmer, Martin; Marchi, Erik; Squartini, Stefano; Schuller, Björn
2011-09-01
Highly spontaneous, conversational, and potentially emotional and noisy speech is known to be a challenge for today's automatic speech recognition (ASR) systems, which highlights the need for advanced algorithms that improve speech features and models. Histogram Equalization is an efficient method to reduce the mismatch between clean and noisy conditions by normalizing all moments of the probability distribution of the feature vector components. In this article, we propose to combine histogram equalization and multi-condition training for robust keyword detection in noisy speech. To better cope with conversational speaking styles, we show how contextual information can be effectively exploited in a multi-stream ASR framework that dynamically models context-sensitive phoneme estimates generated by a long short-term memory neural network. The proposed techniques are evaluated on the SEMAINE database, a corpus containing emotionally colored conversations with a cognitive system for "Sensitive Artificial Listening".
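The histogram-equalization idea above can be sketched in a few lines: map each feature dimension through its empirical CDF onto a reference distribution (here a standard normal), so that all moments of clean and noisy features match. This is a minimal illustration with synthetic feature values, not the authors' implementation.

```python
import numpy as np
from statistics import NormalDist

def histogram_equalize(features, reference_cdf_inv):
    """Map each feature dimension onto a reference distribution.

    features: (T, D) array of feature vectors over T frames.
    reference_cdf_inv: inverse CDF (quantile function) of the target.
    """
    T, D = features.shape
    equalized = np.empty_like(features, dtype=float)
    for d in range(D):
        # Empirical CDF value of each frame's d-th component (midpoint rule
        # avoids quantiles of exactly 0 or 1).
        ranks = features[:, d].argsort().argsort()
        equalized[:, d] = reference_cdf_inv((ranks + 0.5) / T)
    return equalized

# Equalize skewed "noisy" features toward a standard normal (probit map).
probit = np.vectorize(NormalDist().inv_cdf)
noisy = np.random.default_rng(0).gamma(2.0, size=(1000, 13))
clean_like = histogram_equalize(noisy, probit)
print(clean_like.mean(axis=0).round(2))  # each dimension ≈ 0 after mapping
```

In a real ASR front end the reference distribution is usually estimated from clean training features per cepstral coefficient, rather than assumed Gaussian.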
Bouvet, Lucie; Mottron, Laurent; Valdois, Sylviane; Donnadieu, Sophie
2016-05-01
Auditory stream segregation allows us to organize our sound environment, by focusing on specific information and ignoring what is unimportant. One previous study reported difficulty in stream segregation ability in children with Asperger syndrome. In order to investigate this question further, we used an interleaved melody recognition task with children with autism spectrum disorder (ASD). In this task, a probe melody is followed by a mixed sequence, made up of a target melody interleaved with a distractor melody. These two melodies have either the same [0 semitone (ST)] or a different mean frequency (6, 12 or 24 ST separation conditions). Children had to identify whether the probe melody was present in the mixed sequence. Children with ASD performed better than typical children when melodies were completely embedded. Conversely, they were impaired in the ST separation conditions. Our results confirm the difficulty of children with ASD in using a frequency cue to organize auditory perceptual information. However, superior performance in the completely embedded condition may result from superior perceptual processes in autism. We propose that this atypical pattern of results might reflect the expression of a single cognitive feature in autism.
Content-based TV sports video retrieval using multimodal analysis
NASA Astrophysics Data System (ADS)
Yu, Yiqing; Liu, Huayong; Wang, Hongbin; Zhou, Dongru
2003-09-01
In this paper, we propose content-based video retrieval, i.e., retrieval based on semantic content. Because video data is composed of multimodal information streams such as visual, auditory and textual streams, we describe a strategy that uses multimodal analysis for automatic parsing of sports video. The paper first defines the basic structure of a sports video database system, and then introduces a new approach that integrates visual stream analysis, speech recognition, speech signal processing and text extraction to realize video retrieval. The experimental results for TV sports video of football games indicate that the multimodal analysis is effective for video retrieval by quickly browsing tree-like video clips or inputting keywords within a predefined domain.
Fuel cell gas management system
DuBose, Ronald Arthur
2000-01-11
A fuel cell gas management system including a cathode humidification system for transferring latent and sensible heat from an exhaust stream to the cathode inlet stream of the fuel cell; an anode humidity retention system for maintaining the total enthalpy of the anode stream exiting the fuel cell equal to the total enthalpy of the anode inlet stream; and a cooling water management system having segregated deionized water and cooling water loops interconnected by means of a brazed plate heat exchanger.
Brainstem origins for cortical 'what' and 'where' pathways in the auditory system.
Kraus, Nina; Nicol, Trent
2005-04-01
We have developed a data-driven conceptual framework that links two areas of science: the source-filter model of acoustics and cortical sensory processing streams. The source-filter model describes the mechanics behind speech production: the identity of the speaker is carried largely in the vocal cord source and the message is shaped by the ever-changing filters of the vocal tract. Sensory processing streams, popularly called 'what' and 'where' pathways, are well established in the visual system as a neural scheme for separately carrying different facets of visual objects, namely their identity and their position/motion, to the cortex. A similar functional organization has been postulated in the auditory system. Both speaker identity and the spoken message, which are simultaneously conveyed in the acoustic structure of speech, can be disentangled into discrete brainstem response components. We argue that these two response classes are early manifestations of auditory 'what' and 'where' streams in the cortex. This brainstem link forges a new understanding of the relationship between the acoustics of speech and cortical processing streams, unites two hitherto separate areas in science, and provides a model for future investigations of auditory function.
An algorithm to improve speech recognition in noise for hearing-impaired listeners
Healy, Eric W.; Yoho, Sarah E.; Wang, Yuxuan; Wang, DeLiang
2013-01-01
Despite considerable effort, monaural (single-microphone) algorithms capable of increasing the intelligibility of speech in noise have remained elusive. Successful development of such an algorithm is especially important for hearing-impaired (HI) listeners, given their particular difficulty in noisy backgrounds. In the current study, an algorithm based on binary masking was developed to separate speech from noise. Unlike the ideal binary mask, which requires prior knowledge of the premixed signals, the masks used to segregate speech from noise in the current study were estimated by training the algorithm on speech not used during testing. Sentences were mixed with speech-shaped noise and with babble at various signal-to-noise ratios (SNRs). Testing using normal-hearing and HI listeners indicated that intelligibility increased following processing in all conditions. These increases were larger for HI listeners, for the modulated background, and for the least-favorable SNRs. They were also often substantial, allowing several HI listeners to improve intelligibility from scores near zero to values above 70%. PMID:24116438
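The ideal binary mask mentioned above is easy to state concretely: compute a time-frequency representation of the premixed speech and noise, and keep a unit whenever the local SNR exceeds a criterion. The sketch below is the oracle version (the study's contribution was estimating such a mask without access to the premixed signals); the STFT helper and the toy tone-in-noise signals are illustrative assumptions.

```python
import numpy as np

def stft(x, win=256, hop=128):
    """One-sided short-time Fourier transform with a Hann window."""
    window = np.hanning(win)
    frames = [x[i:i + win] * window
              for i in range(0, len(x) - win + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def ideal_binary_mask(speech, noise, lc_db=0.0):
    """1 where the local speech-to-noise ratio exceeds lc_db, else 0.

    Uses the premixed signals, so it is an oracle; the cited study
    instead trained an estimator of this mask from mixtures.
    """
    s_pow = np.abs(stft(speech)) ** 2
    n_pow = np.abs(stft(noise)) ** 2
    local_snr = 10 * np.log10((s_pow + 1e-12) / (n_pow + 1e-12))
    return (local_snr > lc_db).astype(int)

# Toy demo: a 440 Hz "target" tone in low-level white noise (16 kHz).
rng = np.random.default_rng(1)
t = np.arange(16000) / 16000
speech = np.sin(2 * np.pi * 440 * t)
noise = 0.1 * rng.standard_normal(16000)
mask = ideal_binary_mask(speech, noise)
# The mask keeps T-F units around the 440 Hz bin and rejects the rest.
```

Resynthesizing the mixture through this mask retains mostly target-dominated energy, which is what drives the intelligibility gains reported in the abstract.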
Wutz, Andreas; Weisz, Nathan; Braun, Christoph; Melcher, David
2014-01-22
Dynamic vision requires both stability of the current perceptual representation and sensitivity to the accumulation of sensory evidence over time. Here we study the electrophysiological signatures of this intricate balance between temporal segregation and integration in vision. Within a forward masking paradigm with short and long stimulus onset asynchronies (SOA), we manipulated the temporal overlap of the visual persistence of two successive transients. Human observers enumerated the items presented in the second target display as a measure of the informational capacity read-out from this partly temporally integrated visual percept. We observed higher β-power immediately before mask display onset in incorrect trials, in which enumeration failed due to stronger integration of mask and target visual information. This effect was timescale specific, distinguishing between segregation and integration of visual transients that were distant in time (long SOA). Conversely, for short SOA trials, mask onset evoked a stronger visual response when mask and targets were correctly segregated in time. Examination of the target-related response profile revealed the importance of an evoked α-phase reset for the segregation of those rapid visual transients. Investigating this precise mapping of the temporal relationships of visual signals onto electrophysiological responses highlights how the stream of visual information is carved up into discrete temporal windows that mediate between segregated and integrated percepts. Fragmenting the stream of visual information provides a means to stabilize perceptual events within one instant in time.
Infants with Williams syndrome detect statistical regularities in continuous speech.
Cashon, Cara H; Ha, Oh-Ryeong; Graf Estes, Katharine; Saffran, Jenny R; Mervis, Carolyn B
2016-09-01
Williams syndrome (WS) is a rare genetic disorder associated with delays in language and cognitive development. The reasons for the language delay are unknown. Statistical learning is a domain-general mechanism recruited for early language acquisition. In the present study, we investigated whether infants with WS were able to detect the statistical structure in continuous speech. Eighteen 8- to 20-month-olds with WS were familiarized with 2 min of a continuous stream of synthesized nonsense words; the statistical structure of the speech was the only cue to word boundaries. They were tested on their ability to discriminate statistically-defined "words" and "part-words" (which crossed word boundaries) in the artificial language. Despite significant cognitive and language delays, infants with WS were able to detect the statistical regularities in the speech stream. These findings suggest that an inability to track the statistical properties of speech is unlikely to be the primary basis for the delays in the onset of language observed in infants with WS. These results provide the first evidence of statistical learning by infants with developmental delays. Copyright © 2016 Elsevier B.V. All rights reserved.
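The statistical-structure cue used in studies like this is conventionally modeled with transitional probabilities (TPs) between adjacent syllables: TPs are high within a word and dip at word boundaries. A toy sketch under that assumption; the three-word lexicon, the fixed word order, and the thresholding rule are hypothetical, not the study's stimuli.

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Forward TP: P(next syllable | current syllable)."""
    pairs = Counter(zip(syllables, syllables[1:]))
    firsts = Counter(syllables[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}

def segment(syllables, threshold=0.75):
    """Posit a word boundary wherever the forward TP dips below threshold."""
    tp = transitional_probabilities(syllables)
    words, word = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(word))
            word = []
        word.append(b)
    words.append("".join(word))
    return words

# Hypothetical three-word language: within-word TPs are 1.0, while the
# word order makes every across-word TP about 0.5, so TP dips mark the
# word boundaries.
lexicon = {"A": "pabiku", "B": "tibudo", "C": "golatu"}
order = ["A", "B", "A", "C", "B", "C"] * 10
stream = [lexicon[w][i:i + 2] for w in order for i in (0, 2, 4)]
print(sorted(set(segment(stream))))  # ['golatu', 'pabiku', 'tibudo']
```

The discrimination test in the abstract corresponds to comparing such recovered "words" against "part-words" that straddle a TP dip.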
Sarubbo, Silvio; De Benedictis, Alessandro; Merler, Stefano; Mandonnet, Emmanuel; Barbareschi, Mattia; Dallabona, Monica; Chioffi, Franco; Duffau, Hugues
2016-11-01
The most accepted framework of language processing includes a dorsal phonological and a ventral semantic pathway, connecting a wide network of distributed cortical hubs. However, the cortico-subcortical connectivity and the reciprocal anatomical relationships of this dual-stream system are not completely clarified. We performed an original blunt microdissection of 10 hemispheres, exposing locoregional short fibers and six long-range fascicles involved in language elaboration. Special attention was addressed to the analysis of termination sites and anatomical relationships between long- and short-range fascicles. We correlated these anatomical findings with a topographical analysis of 93 functional responses located at the terminal sites of the language bundles, collected by direct electrical stimulation in 108 right-handers. The locations of phonological and semantic paraphasias, verbal apraxia, speech arrest, pure anomia, and alexia were statistically analyzed, and the respective barycenters were computed in MNI space. We found that the terminations of the main language bundles and the functional responses have a wider distribution than the classical definition of language territories. Our analysis showed that the dorsal and ventral streams have a similar anatomical layer organization. These pathways are parallel and relatively segregated over their subcortical course, while their terminal fibers are strictly overlapping at the cortical level. Finally, the anatomical features of the U-fibers suggest a role of locoregional integration between the phonological, semantic, and executive subnetworks of language, in particular within the inferoventral frontal lobe and the temporoparietal junction, which proved to be the main criss-cross regions between the dorsal and ventral pathways. Hum Brain Mapp 37:3858-3872, 2016. © 2016 Wiley Periodicals, Inc.
Binaural segregation in multisource reverberant environments.
Roman, Nicoleta; Srinivasan, Soundararajan; Wang, DeLiang
2006-12-01
In a natural environment, speech signals are degraded by both reverberation and concurrent noise sources. While human listening is robust under these conditions using only two ears, current two-microphone algorithms perform poorly. The psychological process of figure-ground segregation suggests that the target signal is perceived as a foreground while the remaining stimuli are perceived as a background. Accordingly, the goal is to estimate an ideal time-frequency (T-F) binary mask, which selects the target if it is stronger than the interference in a local T-F unit. In this paper, a binaural segregation system is proposed that extracts the reverberant target signal from multisource reverberant mixtures by utilizing only the location information of the target source. The proposed system combines target cancellation through adaptive filtering and a binary decision rule to estimate the ideal T-F binary mask. The main observation in this work is that the target attenuation in a T-F unit resulting from adaptive filtering is correlated with the relative strength of target to mixture. A comprehensive evaluation shows that the proposed system results in large SNR gains. In addition, comparisons using SNR as well as automatic speech recognition measures show that this system outperforms standard two-microphone beamforming approaches and a recent binaural processor.
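The target-cancellation idea can be illustrated with a fixed (non-adaptive) null: if the target is identical at both microphones, subtracting the channels removes it, and T-F units where the subtraction removes most of the energy are labeled target-dominated. This is a simplified sketch, not the paper's adaptive system; the signals and threshold are illustrative assumptions.

```python
import numpy as np

def stft(x, win=256, hop=128):
    """One-sided short-time Fourier transform with a Hann window."""
    window = np.hanning(win)
    frames = [x[i:i + win] * window
              for i in range(0, len(x) - win + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def cancellation_mask(mic1, mic2, threshold=0.25):
    """Label a T-F unit target-dominated if channel subtraction removes
    most of its energy (assumes the target is identical at both mics)."""
    residual = np.abs(stft(mic1 - mic2)) ** 2
    mixture = np.abs(stft(mic1)) ** 2
    return (residual / (mixture + 1e-12) < threshold).astype(int)

# Toy demo: target tone identical at both mics; the interferer arrives
# 4 samples later at mic2 (a crude stand-in for a different azimuth).
rng = np.random.default_rng(3)
t = np.arange(16000) / 16000
target = np.sin(2 * np.pi * 440 * t)
interf = rng.standard_normal(16000)
mic1 = target + 0.5 * interf
mic2 = target + 0.5 * np.roll(interf, 4)
mask = cancellation_mask(mic1, mic2)
# Units near the 440 Hz bin are labeled target-dominated.
```

A known limitation of this fixed null: T-F units where the interferer also happens to cancel (the comb-filter nulls of the delay) are mislabeled, which the paper's adaptive filtering and reverberation handling mitigate.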
Neilans, Erikson G; Dent, Micheal L
2015-02-01
Auditory scene analysis has been suggested as a universal process that exists across all animals. Relative to humans, however, little work has been devoted to how animals perceptually isolate different sound sources. Frequency separation of sounds is arguably the most common parameter studied in auditory streaming, but it is not the only factor contributing to how the auditory scene is perceived. Researchers have found that in humans, even at large frequency separations, synchronous tones are heard as a single auditory stream, whereas asynchronous tones with the same frequency separations are perceived as 2 distinct sounds. These findings demonstrate how both the timing and frequency separation of sounds are important for auditory scene analysis. It is unclear how animals, such as budgerigars (Melopsittacus undulatus), perceive synchronous and asynchronous sounds. In this study, budgerigars and humans (Homo sapiens) were tested on their perception of synchronous, asynchronous, and partially overlapping pure tones using the same psychophysical procedures. Species differences were found between budgerigars and humans in how partially overlapping sounds were perceived, with budgerigars more likely to segregate overlapping sounds and humans more apt to fuse the 2 sounds together. The results also illustrated that temporal cues are particularly important for stream segregation of overlapping sounds. Lastly, budgerigars were found to segregate partially overlapping sounds in a manner predicted by computational models of streaming, whereas humans were not. PsycINFO Database Record (c) 2015 APA, all rights reserved.
NASA Astrophysics Data System (ADS)
Modegi, Toshio
We are developing audio watermarking techniques that enable extraction of embedded data by cell phones. The data must be embedded in frequency ranges where auditory sensitivity is high, so embedding tends to cause audible noise. We previously proposed exploiting a two-channel stereo playback feature, in which the noise generated by a data-embedded left-channel signal is reduced by the right-channel signal. However, this approach has the practical problem of restricting the location of the extracting terminal. In this paper, we propose mixing the noise-reducing right-channel signal into the left-channel signal, canceling the noise perceptually by inducing an auditory stream segregation phenomenon in listeners. This makes a separate noise-reducing right channel unnecessary and supports monaural playback. Moreover, we propose a wide-band embedding method that induces dual auditory stream segregation phenomena, enabling data embedding across the whole public telephone frequency range and stable extraction with 3G mobile phones. With these proposals, extraction precision is higher than with the previously proposed method, while the quality degradation of the embedded signals is smaller. We present an overview of the new method and experimental results compared with those of the previously proposed method.
Crespi, Bernard; Read, Silven; Hurd, Peter
2017-10-01
We genotyped a healthy population for three haplotype-tagging FOXP2 SNPs, and tested for associations of these SNPs with strength of handedness and questionnaire-based metrics of inner speech characteristics (ISP) and speech fluency (FLU), as derived from the Schizotypal Personality Questionnaire-BR. Levels of mixed-handedness were positively correlated with ISP and FLU, supporting prior work on these two domains. Genotype for rs7799109, a SNP previously linked with lateralization of left frontal regions underlying language, was associated with degree of mixed handedness and with scores for ISP and FLU phenotypes. Genotype of rs1456031, which has previously been linked with auditory hallucinations, was also associated with ISP phenotypes. These results provide evidence that FOXP2 SNPs influence aspects of human inner speech and fluency that are related to lateralized phenotypes, and suggest that the evolution of human language, as mediated by the adaptive evolution of FOXP2, involved features of inner speech. Copyright © 2017 Elsevier Inc. All rights reserved.
Enhancing Auditory Selective Attention Using a Visually Guided Hearing Aid
ERIC Educational Resources Information Center
Kidd, Gerald, Jr.
2017-01-01
Purpose: Listeners with hearing loss, as well as many listeners with clinically normal hearing, often experience great difficulty segregating talkers in a multiple-talker sound field and selectively attending to the desired "target" talker while ignoring the speech from unwanted "masker" talkers and other sources of sound. This…
López-Barroso, Diana; de Diego-Balaguer, Ruth
2017-01-01
Dorsal and ventral pathways connecting perisylvian language areas have been shown to be functionally and anatomically segregated. Whereas the dorsal pathway integrates the sensory-motor information required for verbal repetition, the ventral pathway has classically been associated with semantic processes. The great individual differences characterizing language learning through life partly correlate with brain structure and function within these dorsal and ventral language networks. Variability and plasticity within these networks also underlie inter-individual differences in the recovery of linguistic abilities in aphasia. Despite the division of labor of the dorsal and ventral streams, studies in healthy individuals have shown how the interaction of them and the redundancy in the areas they connect allow for compensatory strategies in functions that are usually segregated. In this mini-review we highlight the need to examine compensatory mechanisms between streams in healthy individuals as a helpful guide to choosing the most appropriate rehabilitation strategies, using spared functions and targeting preserved compensatory networks for brain plasticity. PMID:29021751
Xie, Zilong; Reetzke, Rachel; Chandrasekaran, Bharath
2018-05-24
Increasing visual perceptual load can reduce pre-attentive auditory cortical activity to sounds, a reflection of the limited and shared attentional resources for sensory processing across modalities. Here, we demonstrate that modulating visual perceptual load can impact the early sensory encoding of speech sounds, and that the impact of visual load is highly dependent on the predictability of the incoming speech stream. Participants (n = 20, 9 females) performed a visual search task of high (target similar to distractors) and low (target dissimilar to distractors) perceptual load, while early auditory electrophysiological responses were recorded to native speech sounds. Speech sounds were presented either in a 'repetitive context', or a less predictable 'variable context'. Independent of auditory stimulus context, pre-attentive auditory cortical activity was reduced during high visual load, relative to low visual load. We applied a data-driven machine learning approach to decode speech sounds from the early auditory electrophysiological responses. Decoding performance was found to be poorer under conditions of high (relative to low) visual load, when the incoming acoustic stream was predictable. When the auditory stimulus context was less predictable, decoding performance was substantially greater for the high (relative to low) visual load conditions. Our results provide support for shared attentional resources between visual and auditory modalities that substantially influence the early sensory encoding of speech signals in a context-dependent manner. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.
Lewkowicz, David J; Minar, Nicholas J; Tift, Amy H; Brandon, Melissa
2015-02-01
To investigate the developmental emergence of the perception of the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8- to 10-, and 12- to 14-month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor 8- to 10-month-old infants exhibited audiovisual matching in that they did not look longer at the matching monologue. In contrast, the 12- to 14-month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, perceived the multisensory coherence of native-language monologues earlier in the test trials than that of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12- to 14-month-olds did not depend on audiovisual synchrony, whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audiovisual synchrony cues are more important in the perception of the multisensory coherence of non-native speech than that of native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. Copyright © 2014 Elsevier Inc. All rights reserved.
François, Clément; Schön, Daniele
2014-02-01
There is increasing evidence that humans and other nonhuman mammals are sensitive to the statistical structure of auditory input. Indeed, neural sensitivity to statistical regularities seems to be a fundamental biological property underlying auditory learning. In the case of speech, statistical regularities play a crucial role in the acquisition of several linguistic features, from phonotactic to more complex rules such as morphosyntactic rules. Interestingly, a similar sensitivity has been shown with non-speech streams: sequences of sounds changing in frequency or timbre can be segmented on the sole basis of conditional probabilities between adjacent sounds. We recently ran a set of cross-sectional and longitudinal experiments showing that merging music and speech information in song facilitates stream segmentation and, further, that musical practice enhances sensitivity to statistical regularities in speech at both neural and behavioral levels. Based on recent findings showing the involvement of a fronto-temporal network in speech segmentation, we defend the idea that enhanced auditory learning observed in musicians originates via at least three distinct pathways: enhanced low-level auditory processing, enhanced phono-articulatory mapping via the left Inferior Frontal Gyrus and Pre-Motor cortex and increased functional connectivity within the audio-motor network. Finally, we discuss how these data predict a beneficial use of music for optimizing speech acquisition in both normal and impaired populations. Copyright © 2013 Elsevier B.V. All rights reserved.
Waters, Christopher L.; Janupala, Rajiv R.; Mallinson, Richard G.; ...
2017-05-25
Thermal conversion technologies may be the most efficient means of production of transportation fuels from lignocellulosic biomass. In order to increase the viability and improve the carbon emissions profile of pyrolysis biofuels, improvements must be made to the required catalytic upgrading to increase both hydrogen utilization efficiency and final liquid carbon yields. However, no current single catalytic valorization strategy can be optimized to convert the complex mixture of compounds produced upon fast pyrolysis of biomass. Staged thermal fractionation, which entails a series of sequentially increasing temperature steps to decompose biomass, has been proposed as a simple means to create vapor product streams of enhanced purity as compared to fast pyrolysis. In this work, we use analytical pyrolysis to investigate the effects of time and temperature on a thermal step designed to segregate the lignin and cellulose pyrolysis products of a biomass which has been pre-torrefied to remove hemicellulose. At process conditions of 380 °C and 180 s isothermal hold time, a stream containing less than 20% phenolics (carbon basis) was produced, and upon subsequent fast pyrolysis of the residual solid a stream of 81.5% levoglucosan (carbon basis) was produced. The thermal segregation comes at the expense of vapor product carbon yield, but the improvement in catalytic performance may offset these losses.
Speech Segmentation by Statistical Learning Depends on Attention
ERIC Educational Resources Information Center
Toro, Juan M.; Sinnett, Scott; Soto-Faraco, Salvador
2005-01-01
We addressed the hypothesis that word segmentation based on statistical regularities occurs without the need of attention. Participants were presented with a stream of artificial speech in which the only cue to extract the words was the presence of statistical regularities between syllables. Half of the participants were asked to passively listen…
ERIC Educational Resources Information Center
Hertrich, Ingo; Dietrich, Susanne; Ackermann, Hermann
2011-01-01
During speech communication, visual information may interact with the auditory system at various processing stages. Most noteworthy, recent magnetoencephalography (MEG) data provided first evidence for early and preattentive phonetic/phonological encoding of the visual data stream--prior to its fusion with auditory phonological features [Hertrich,…
Implicit Processing of Phonotactic Cues: Evidence from Electrophysiological and Vascular Responses
ERIC Educational Resources Information Center
Rossi, Sonja; Jurgenson, Ina B.; Hanulikova, Adriana; Telkemeyer, Silke; Wartenburger, Isabell; Obrig, Hellmuth
2011-01-01
Spoken word recognition is achieved via competition between activated lexical candidates that match the incoming speech input. The competition is modulated by prelexical cues that are important for segmenting the auditory speech stream into linguistic units. One such prelexical cue that listeners rely on in spoken word recognition is phonotactics.…
Jones, S J; Longe, O; Vaz Pato, M
1998-03-01
Examination of the cortical auditory evoked potentials to complex tones changing in pitch and timbre suggests a useful new method for investigating higher auditory processes, in particular those concerned with 'streaming' and auditory object formation. The main conclusions were: (i) the N1 evoked by a sudden change in pitch or timbre was more posteriorly distributed than the N1 at the onset of the tone, indicating at least partial segregation of the neuronal populations responsive to sound onset and spectral change; (ii) the T-complex was consistently larger over the right hemisphere, consistent with clinical and PET evidence for particular involvement of the right temporal lobe in the processing of timbral and musical material; (iii) responses to timbral change were relatively unaffected by increasing the rate of interspersed changes in pitch, suggesting a mechanism for detecting the onset of a new voice in a constantly modulated sound stream; (iv) responses to onset, offset and pitch change of complex tones were relatively unaffected by interfering tones when the latter were of a different timbre, suggesting these responses must be generated subsequent to auditory stream segregation.
Emergence of neural encoding of auditory objects while listening to competing speakers
Ding, Nai; Simon, Jonathan Z.
2012-01-01
A visual scene is perceived in terms of visual objects. Similar ideas have been proposed for the analogous case of auditory scene analysis, although their hypothesized neural underpinnings have not yet been established. Here, we address this question by recording from subjects selectively listening to one of two competing speakers, either of different or the same sex, using magnetoencephalography. Individual neural representations are seen for the speech of the two speakers, with each being selectively phase locked to the rhythm of the corresponding speech stream and from which can be exclusively reconstructed the temporal envelope of that speech stream. The neural representation of the attended speech dominates responses (with latency near 100 ms) in posterior auditory cortex. Furthermore, when the intensity of the attended and background speakers is separately varied over an 8-dB range, the neural representation of the attended speech adapts only to the intensity of that speaker but not to the intensity of the background speaker, suggesting an object-level intensity gain control. In summary, these results indicate that concurrent auditory objects, even if spectrotemporally overlapping and not resolvable at the auditory periphery, are neurally encoded individually in auditory cortex and emerge as fundamental representational units for top-down attentional modulation and bottom-up neural adaptation. PMID:22753470
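The envelope phase-locking measure used in studies like this one can be sketched as a lagged correlation between the speech envelope and a neural signal, with the latency near 100 ms appearing as the lag of peak correlation. Everything below is simulated (the "response" is a delayed, noisy copy of the envelope), so it only illustrates the analysis, not MEG data.

```python
import numpy as np

def envelope(x, fs, cutoff_hz=10.0):
    """Broadband amplitude envelope: rectify, then moving-average smooth."""
    width = max(1, int(fs / cutoff_hz))
    return np.convolve(np.abs(x), np.ones(width) / width, mode="same")

def peak_lag(stimulus_env, response, fs, max_lag_s=0.3):
    """Lag (in seconds) at which the response best correlates with the
    stimulus envelope."""
    max_lag = int(max_lag_s * fs)
    corrs = [np.corrcoef(stimulus_env[:-lag or None], response[lag:])[0, 1]
             for lag in range(max_lag)]
    return int(np.argmax(corrs)) / fs

# Simulated data: the "response" is the speech envelope delayed by
# 100 ms plus noise; real MEG analyses fit this per sensor or source.
fs = 200                       # Hz, a typical analysis sampling rate
rng = np.random.default_rng(2)
n = fs * 30                    # 30 s of signal
carrier = rng.standard_normal(n)
speech = carrier * (1 + np.sin(2 * np.pi * 3 * np.arange(n) / fs))
env = envelope(speech, fs)
response = np.roll(env, int(0.1 * fs)) + 0.3 * rng.standard_normal(n)
print(peak_lag(env, response, fs))   # near 0.1, i.e. ~100 ms latency
```

Comparing such correlations (or reconstruction accuracies) between attended and background streams is one way the dominance of the attended speech in the abstract is quantified.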
Rhythmic grouping biases constrain infant statistical learning
Hay, Jessica F.; Saffran, Jenny R.
2012-01-01
Linguistic stress and sequential statistical cues to word boundaries interact during speech segmentation in infancy. However, little is known about how the different acoustic components of stress constrain statistical learning. The current studies were designed to investigate whether intensity and duration each function independently as cues to initial prominence (trochaic-based hypothesis) or whether, as predicted by the Iambic-Trochaic Law (ITL), intensity and duration have characteristic and separable effects on rhythmic grouping (ITL-based hypothesis) in a statistical learning task. Infants were familiarized with an artificial language (Experiments 1 & 3) or a tone stream (Experiment 2) in which there was an alternation in either intensity or duration. In addition to potential acoustic cues, the familiarization sequences also contained statistical cues to word boundaries. In speech (Experiment 1) and non-speech (Experiment 2) conditions, 9-month-old infants demonstrated discrimination patterns consistent with an ITL-based hypothesis: intensity signaled initial prominence and duration signaled final prominence. The results of Experiment 3, in which 6.5-month-old infants were familiarized with the speech streams from Experiment 1, suggest that there is a developmental change in infants’ willingness to treat increased duration as a cue to word offsets in fluent speech. Infants’ perceptual systems interact with linguistic experience to constrain how infants learn from their auditory environment. PMID:23730217
Francis, Alexander L
2010-02-01
Perception of speech in competing speech is facilitated by spatial separation of the target and distracting speech, but this benefit may arise at either a perceptual or a cognitive level of processing. Load theory predicts different effects of perceptual and cognitive (working memory) load on selective attention in flanker task contexts, suggesting that this paradigm may be used to distinguish levels of interference. Two experiments examined interference from competing speech during a word recognition task under different perceptual and working memory loads in a dual-task paradigm. Listeners identified words produced by a talker of one gender while ignoring a talker of the other gender. Perceptual load was manipulated using a nonspeech response cue, with response conditional upon either one or two acoustic features (pitch and modulation). Memory load was manipulated with a secondary task consisting of one or six visually presented digits. In the first experiment, the target and distractor were presented at different virtual locations (0 degrees and 90 degrees, respectively), whereas in the second, all the stimuli were presented from the same apparent location. Results suggest that spatial cues improve resistance to distraction in part by reducing working memory demand.

Native Language Influence in the Segmentation of a Novel Language
ERIC Educational Resources Information Center
Ordin, Mikhail; Nespor, Marina
2016-01-01
A major problem in second language acquisition (SLA) is the segmentation of fluent speech in the target language, i.e., detecting the boundaries of phonological constituents like words and phrases in the speech stream. To this end, among a variety of cues, people extensively use prosody and statistical regularities. We examined the role of pitch,…
Auditory attention strategy depends on target linguistic properties and spatial configuration
McCloy, Daniel R.; Lee, Adrian K. C.
2015-01-01
Whether crossing a busy intersection or attending a large dinner party, listeners sometimes need to attend to multiple spatially distributed sound sources or streams concurrently. How they achieve this is not clear—some studies suggest that listeners cannot truly simultaneously attend to separate streams, but instead combine attention switching with short-term memory to achieve something resembling divided attention. This paper presents two oddball detection experiments designed to investigate whether directing attention to phonetic versus semantic properties of the attended speech impacts listeners' ability to divide their auditory attention across spatial locations. Each experiment uses four spatially distinct streams of monosyllabic words, variation in cue type (providing phonetic or semantic information), and requiring attention to one or two locations. A rapid button-press response paradigm is employed to minimize the role of short-term memory in performing the task. Results show that differences in the spatial configuration of attended and unattended streams interact with linguistic properties of the speech streams to impact performance. Additionally, listeners may leverage phonetic information to make oddball detection judgments even when oddballs are semantically defined. Both of these effects appear to be mediated by the overall complexity of the acoustic scene. PMID:26233011
Heart rate variability as candidate endophenotype of social anxiety: A two-generation family study.
Harrewijn, A; Van der Molen, M J W; Verkuil, B; Sweijen, S W; Houwing-Duistermaat, J J; Westenberg, P M
2018-09-01
Social anxiety disorder (SAD) is the extreme fear and avoidance of one or more social situations. The goal of the current study was to investigate whether heart rate variability (HRV) during resting state and a social performance task (SPT) is a candidate endophenotype of SAD. In this two-generation family study, patients with SAD with their partner and children, and their siblings with partner and children took part in a SPT (total n = 121, 9 families, 3-30 persons per family, age range: 8-61 years, 17 patients with SAD). In this task, participants had to watch and evaluate the speech of a female peer, and had to give a similar speech. HRV was measured during two resting state phases, and during anticipation, speech and recovery phases of the SPT. We tested two criteria for endophenotypes: co-segregation with SAD within families and heritability. HRV did not co-segregate with SAD within families. Root mean square of successive differences during the first resting phase and recovery, and high frequency power during all phases of the task were heritable. It should be noted that few participants were diagnosed with SAD. Results during the speech should be interpreted with caution, because the duration was short and there was a lot of movement. HRV during resting state and the SPT is a possible endophenotype, but not of SAD. As other studies have shown that HRV is related to different internalizing disorders, HRV might reflect a transdiagnostic genetic vulnerability for internalizing disorders. Future research should investigate which factors influence the development of psychopathology in persons with decreased HRV. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Riera-Palou, Felip; den Brinker, Albertus C.
2007-12-01
This paper introduces a new audio and speech broadband coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, the MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) to make it suitable for the modeling of broadband signals, it is shown how pulse and parametric coding complement each other and how they can be merged to yield a layered, bit-stream-scalable coder able to operate at different points in the quality versus bit-rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of bit stream scalability does not come at the price of reduced performance, since the coder is competitive with standardized coders (MP3, AAC, SSC).
Subliminal speech perception and auditory streaming.
Dupoux, Emmanuel; de Gardelle, Vincent; Kouider, Sid
2008-11-01
Current theories of consciousness assume a qualitative dissociation between conscious and unconscious processing: while subliminal stimuli only elicit a transient activity, supraliminal stimuli have long-lasting influences. Nevertheless, the existence of this qualitative distinction remains controversial, as past studies confounded awareness and stimulus strength (energy, duration). Here, we used a masked speech priming method in conjunction with a submillisecond interaural delay manipulation to contrast subliminal and supraliminal processing at constant prime, mask and target strength. This delay induced a perceptual streaming effect, with the prime popping out in the supraliminal condition. By manipulating the prime-target interval (ISI), we show a qualitatively distinct profile of priming longevity as a function of prime awareness. While subliminal priming disappeared after half a second, supraliminal priming was independent of ISI. This shows that the distinction between conscious and unconscious processing depends on high-level perceptual streaming factors rather than low-level features (energy, duration).
Determining the energetic and informational components of speech-on-speech masking
Kidd, Gerald; Mason, Christine R.; Swaminathan, Jayaganesh; Roverud, Elin; Clayton, Kameron K.; Best, Virginia
2016-01-01
Identification of target speech was studied under masked conditions consisting of two or four independent speech maskers. In the reference conditions, the maskers were colocated with the target, the masker talkers were the same sex as the target, and the masker speech was intelligible. The comparison conditions, intended to provide release from masking, included different-sex target and masker talkers, time-reversal of the masker speech, and spatial separation of the maskers from the target. Significant release from masking was found for all comparison conditions. To determine whether these reductions in masking could be attributed to differences in energetic masking, ideal time-frequency segregation (ITFS) processing was applied so that the time-frequency units where the masker energy dominated the target energy were removed. The remaining target-dominated “glimpses” were reassembled as the stimulus. Speech reception thresholds measured using these resynthesized ITFS-processed stimuli were the same for the reference and comparison conditions supporting the conclusion that the amount of energetic masking across conditions was the same. These results indicated that the large release from masking found under all comparison conditions was due primarily to a reduction in informational masking. Furthermore, the large individual differences observed generally were correlated across the three masking release conditions. PMID:27475139
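The ITFS processing described here, which removes time-frequency units where masker energy dominates target energy, amounts to applying an ideal binary mask. The following is a minimal sketch on toy magnitude spectrograms; the 0 dB local-SNR criterion and the toy arrays are illustrative assumptions.

```python
import numpy as np

def itfs_mask(target_tf, masker_tf, criterion_db=0.0):
    """Ideal binary mask: keep a time-frequency unit only if the target's
    power exceeds the masker's by at least `criterion_db`."""
    eps = 1e-12
    local_snr_db = 10 * np.log10((target_tf ** 2 + eps) / (masker_tf ** 2 + eps))
    return (local_snr_db >= criterion_db).astype(float)

# Toy 2x2 magnitude "spectrograms": the target dominates the first
# column of T-F units, the masker dominates the second.
target = np.array([[4.0, 0.1], [3.0, 0.2]])
masker = np.array([[0.5, 2.0], [0.4, 3.0]])

mask = itfs_mask(target, masker)
glimpses = mask * (target + masker)  # keep only target-dominated "glimpses"
```

In a real implementation the mask is computed on short-time Fourier transform (or gammatone filterbank) representations and the masked mixture is resynthesized back to a waveform.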
Rapid Statistical Learning Supporting Word Extraction From Continuous Speech.
Batterink, Laura J
2017-07-01
The identification of words in continuous speech, known as speech segmentation, is a critical early step in language acquisition. This process is partially supported by statistical learning, the ability to extract patterns from the environment. Given that speech segmentation represents a potential bottleneck for language acquisition, patterns in speech may be extracted very rapidly, without extensive exposure. This hypothesis was examined by exposing participants to continuous speech streams composed of novel repeating nonsense words. Learning was measured on-line using a reaction time task. After merely one exposure to an embedded novel word, learners demonstrated significant learning effects, as revealed by faster responses to predictable than to unpredictable syllables. These results demonstrate that learners gained sensitivity to the statistical structure of unfamiliar speech on a very rapid timescale. This ability may play an essential role in early stages of language acquisition, allowing learners to rapidly identify word candidates and "break in" to an unfamiliar language.
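The statistical-learning cue at work in such studies is typically modeled as transitional probabilities between adjacent syllables: high within words, low at word boundaries. Here is a sketch using a hypothetical nonsense-word lexicon (the syllables are illustrative, not the study's actual stimuli).

```python
import random
from collections import Counter

def transitional_probabilities(stream):
    """TP(B | A) = count(A immediately followed by B) / count(A)."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Three hypothetical trisyllabic nonsense words, concatenated in random
# order to form a continuous syllable stream with no pauses.
random.seed(1)
lexicon = [["bi", "da", "ku"], ["go", "la", "tu"], ["pa", "do", "ti"]]
stream = [syl for _ in range(50) for syl in random.choice(lexicon)]

tps = transitional_probabilities(stream)
# Within-word transitions (e.g. bi -> da) have TP = 1.0; transitions that
# straddle a word boundary have lower TP, marking candidate boundaries.
```

Dips in transitional probability are exactly the statistic a learner could use to posit word boundaries in an unfamiliar language.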
Zekveld, Adriana A; Heslenfeld, Dirk J; Johnsrude, Ingrid S; Versfeld, Niek J; Kramer, Sophia E
2014-11-01
An important aspect of hearing is the degree to which listeners have to deploy effort to understand speech. One promising measure of listening effort is task-evoked pupil dilation. Here, we use functional magnetic resonance imaging (fMRI) to identify the neural correlates of pupil dilation during comprehension of degraded spoken sentences in 17 normal-hearing listeners. Subjects listened to sentences degraded in three different ways: the target female speech was masked by fluctuating noise, by speech from a single male speaker, or the target speech was noise-vocoded. The degree of degradation was individually adapted such that 50% or 84% of the sentences were intelligible. Control conditions included clear speech in quiet, and silent trials. The peak pupil dilation was larger for the 50% compared to the 84% intelligibility condition, and largest for speech masked by the single-talker masker, followed by speech masked by fluctuating noise, and smallest for noise-vocoded speech. Activation in the bilateral superior temporal gyrus (STG) showed the same pattern, with most extensive activation for speech masked by the single-talker masker. Larger peak pupil dilation was associated with more activation in the bilateral STG, bilateral ventral and dorsal anterior cingulate cortex and several frontal brain areas. A subset of the temporal region sensitive to pupil dilation was also sensitive to speech intelligibility and degradation type. These results show that pupil dilation during speech perception in challenging conditions reflects both auditory and cognitive processes that are recruited to cope with degraded speech and the need to segregate target speech from interfering sounds. Copyright © 2014 Elsevier Inc. All rights reserved.
Lewkowicz, David J.; Minar, Nicholas J.; Tift, Amy H.; Brandon, Melissa
2014-01-01
To investigate the developmental emergence of the ability to perceive the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8–10, and 12–14 month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor the 8–10 month-old infants exhibited audio-visual matching in that neither group exhibited greater looking at the matching monologue. In contrast, the 12–14 month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, they perceived the multisensory coherence of native-language monologues earlier in the test trials than of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12–14 month olds did not depend on audio-visual synchrony whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audio-visual synchrony cues are more important in the perception of the multisensory coherence of non-native than native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. PMID:25462038
2013-03-31
certainly remain commingled with other solid waste. For example, some bases provided containers for segregation of recyclables including plastic and...prevalent types of solid waste are food (19.1% by average sample weight), wood (18.9%), and plastics (16.0%) based on analysis of bases in...within the interval shown. Food and wood wastes are the largest components of the average waste stream (both at ~19% by weight), followed by plastic
Event-Related Potentials Index Segmentation of Nonsense Sounds
ERIC Educational Resources Information Center
Sanders, Lisa D.; Ameral, Victoria; Sayles, Kathryn
2009-01-01
To understand the world around us, continuous streams of information including speech must be segmented into units that can be mapped onto stored representations. Recent evidence has shown that event-related potentials (ERPs) can index the online segmentation of sound streams. In the current study, listeners were trained to recognize sequences of…
Interdigitated Color- and Disparity-Selective Columns within Human Visual Cortical Areas V2 and V3
Polimeni, Jonathan R.; Tootell, Roger B.H.
2016-01-01
In nonhuman primates (NHPs), secondary visual cortex (V2) is composed of repeating columnar stripes, which are evident in histological variations of cytochrome oxidase (CO) levels. Distinctive “thin” and “thick” stripes of dark CO staining reportedly respond selectively to stimulus variations in color and binocular disparity, respectively. Here, we first tested whether similar color-selective or disparity-selective stripes exist in human V2. If so, available evidence predicts that such stripes should (1) radiate “outward” from the V1–V2 border, (2) interdigitate, (3) differ from each other in both thickness and length, (4) be spaced ∼3.5–4 mm apart (center-to-center), and, perhaps, (5) have segregated functional connections. Second, we tested whether analogous segregated columns exist in a “next-higher” tier area, V3. To answer these questions, we used high-resolution fMRI (1 × 1 × 1 mm3) at high field (7 T), presenting color-selective or disparity-selective stimuli, plus extensive signal averaging across multiple scan sessions and cortical surface-based analysis. All hypotheses were confirmed. V2 stripes and V3 columns were reliably localized in all subjects. The two stripe/column types were largely interdigitated (e.g., nonoverlapping) in both V2 and V3. Color-selective stripes differed from disparity-selective stripes in both width (thickness) and length. Analysis of resting-state functional connections (eyes closed) showed a stronger correlation between functionally alike (compared with functionally unlike) stripes/columns in V2 and V3. These results revealed a fine-scale segregation of color-selective or disparity-selective streams within human areas V2 and V3. Together with prior evidence from NHPs, this suggests that two parallel processing streams extend from visual subcortical regions through V1, V2, and V3. 
SIGNIFICANCE STATEMENT In current textbooks and reviews, diagrams of cortical visual processing highlight two distinct neural-processing streams within the first and second cortical areas in monkeys. Two major streams consist of segregated cortical columns that are selectively activated by either color or ocular interactions. Because such cortical columns are so small, they were not revealed previously by conventional imaging techniques in humans. Here we demonstrate that such segregated columnar systems exist in humans. We find that, in humans, color versus binocular disparity columns extend one full area further, into the third visual area. Our approach can be extended to reveal and study additional types of columns in human cortex, perhaps including columns underlying more cognitive functions. PMID:26865609
ERIC Educational Resources Information Center
O'Brien, Mary Grantham
2014-01-01
In early stages of classroom language learning, many adult second language (L2) learners communicate primarily with one another, yet we know little about which speech stream characteristics learners tune into or the extent to which they understand this lingua franca communication. In the current study, 25 native English speakers learning German as…
NASA Astrophysics Data System (ADS)
Mirkovic, Bojana; Debener, Stefan; Jaeger, Manuela; De Vos, Maarten
2015-08-01
Objective. Recent studies have provided evidence that temporal envelope driven speech decoding from high-density electroencephalography (EEG) and magnetoencephalography recordings can identify the attended speech stream in a multi-speaker scenario. The present work replicated the previous high density EEG study and investigated the necessary technical requirements for practical attended speech decoding with EEG. Approach. Twelve normal hearing participants attended to one out of two simultaneously presented audiobook stories, while high density EEG was recorded. An offline iterative procedure eliminating those channels contributing the least to decoding provided insight into the necessary channel number and optimal cross-subject channel configuration. Aiming towards the future goal of near real-time classification with an individually trained decoder, the minimum duration of training data necessary for successful classification was determined by using a chronological cross-validation approach. Main results. Close replication of the previously reported results confirmed the method robustness. Decoder performance remained stable from 96 channels down to 25. Furthermore, for less than 15 min of training data, the subject-independent (pre-trained) decoder performed better than an individually trained decoder did. Significance. Our study complements previous research and provides information suggesting that efficient low-density EEG online decoding is within reach.
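The attended-speech classification step described here is commonly implemented by correlating the envelope reconstructed by the trained decoder with each candidate speaker's envelope and choosing the better match. A minimal sketch with stand-in signals follows; the envelopes and noise level are illustrative, not taken from the study.

```python
import numpy as np

def classify_attended(reconstructed_env, env_a, env_b):
    """Label the attended stream as the speaker whose speech envelope
    correlates best with the envelope decoded from the EEG."""
    r_a = np.corrcoef(reconstructed_env, env_a)[0, 1]
    r_b = np.corrcoef(reconstructed_env, env_b)[0, 1]
    return ("A", r_a, r_b) if r_a > r_b else ("B", r_a, r_b)

rng = np.random.default_rng(2)
env_a = rng.random(1000)   # speaker A envelope (stand-in)
env_b = rng.random(1000)   # speaker B envelope (stand-in)

# Suppose the decoder's output is a noisy copy of speaker A's envelope,
# i.e. the listener attended speaker A.
recon = env_a + 0.5 * rng.standard_normal(1000)

label, r_a, r_b = classify_attended(recon, env_a, env_b)
```

Channel reduction of the kind tested in the study would simply retrain the decoder on a shrinking sensor subset and repeat this classification to track accuracy.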
Fuel-cell engine stream conditioning system
DuBose, Ronald Arthur
2002-01-01
A stream conditioning system for a fuel cell gas management system or fuel cell engine. The stream conditioning system manages species potential in at least one fuel cell reactant stream. A species transfer device is located in the path of at least one reactant stream of a fuel cell's inlet or outlet, which transfer device conditions that stream to improve the efficiency of the fuel cell. The species transfer device incorporates an exchange media and a sorbent. The fuel cell gas management system can include a cathode loop with the stream conditioning system transferring latent and sensible heat from an exhaust stream to the cathode inlet stream of the fuel cell; an anode humidity retention system for maintaining the total enthalpy of the anode stream exiting the fuel cell related to the total enthalpy of the anode inlet stream; and a cooling water management system having segregated deionized water and cooling water loops interconnected by means of a brazed plate heat exchanger.
Contributions of local speech encoding and functional connectivity to audio-visual speech perception
Giordano, Bruno L; Ince, Robin A A; Gross, Joachim; Schyns, Philippe G; Panzeri, Stefano; Kayser, Christoph
2017-01-01
Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, while during low SNR strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments. DOI: http://dx.doi.org/10.7554/eLife.24763.001 PMID:28590903
Neural integration of iconic and unrelated coverbal gestures: a functional MRI study.
Green, Antonia; Straube, Benjamin; Weis, Susanne; Jansen, Andreas; Willmes, Klaus; Konrad, Kerstin; Kircher, Tilo
2009-10-01
Gestures are an important part of interpersonal communication, for example by illustrating physical properties of speech contents (e.g., "the ball is round"). The meaning of these so-called iconic gestures is strongly intertwined with speech. We investigated the neural correlates of the semantic integration for verbal and gestural information. Participants watched short videos of five speech and gesture conditions performed by an actor, including variation of language (familiar German vs. unfamiliar Russian), variation of gesture (iconic vs. unrelated), as well as isolated familiar language, while brain activation was measured using functional magnetic resonance imaging. For familiar speech with either of both gesture types contrasted to Russian speech-gesture pairs, activation increases were observed at the left temporo-occipital junction. Apart from this shared location, speech with iconic gestures exclusively engaged left occipital areas, whereas speech with unrelated gestures activated bilateral parietal and posterior temporal regions. Our results demonstrate that the processing of speech with speech-related versus speech-unrelated gestures occurs in two distinct but partly overlapping networks. The distinct processing streams (visual versus linguistic/spatial) are interpreted in terms of "auxiliary systems" allowing the integration of speech and gesture in the left temporo-occipital region.
ERIC Educational Resources Information Center
Murakami, Takenobu; Restle, Julia; Ziemann, Ulf
2012-01-01
A left-hemispheric cortico-cortical network involving areas of the temporoparietal junction (Tpj) and the posterior inferior frontal gyrus (pIFG) is thought to support sensorimotor integration of speech perception into articulatory motor activation, but how this network links with the lip area of the primary motor cortex (M1) during speech…
The Human Voice in Speech and Singing
NASA Astrophysics Data System (ADS)
Lindblom, Björn; Sundberg, Johan
This chapter
Separating pitch chroma and pitch height in the human brain
Warren, J. D.; Uppenkamp, S.; Patterson, R. D.; Griffiths, T. D.
2003-01-01
Musicians recognize pitch as having two dimensions. On the keyboard, these are illustrated by the octave and the cycle of notes within the octave. In perception, these dimensions are referred to as pitch height and pitch chroma, respectively. Pitch chroma provides a basis for presenting acoustic patterns (melodies) that do not depend on the particular sound source. In contrast, pitch height provides a basis for segregation of notes into streams to separate sound sources. This paper reports a functional magnetic resonance experiment designed to search for distinct mappings of these two types of pitch change in the human brain. The results show that chroma change is specifically represented anterior to primary auditory cortex, whereas height change is specifically represented posterior to primary auditory cortex. We propose that tracking of acoustic information streams occurs in anterior auditory areas, whereas the segregation of sound objects (a crucial aspect of auditory scene analysis) depends on posterior areas. PMID:12909719
ERIC Educational Resources Information Center
Elias-Olivares, Lucia
The linguistic varieties in use in the Chicano speech community of East Austin (Texas) and the attitudes toward them were studied. Data were collected from field work done in a section of Austin that comprised over half of the Chicano population. The section was a practically segregated urban neighborhood and somewhat isolated from other ethnic…
Binaural model-based dynamic-range compression.
Ernst, Stephan M A; Kortlang, Steffen; Grimm, Giso; Bisitz, Thomas; Kollmeier, Birger; Ewert, Stephan D
2018-01-26
Binaural cues such as interaural level differences (ILDs) are used to organise auditory perception and to segregate sound sources in complex acoustical environments. In bilaterally fitted hearing aids, dynamic-range compression operating independently at each ear potentially alters these ILDs, thus distorting binaural perception and sound source segregation. A binaurally-linked model-based fast-acting dynamic compression algorithm designed to approximate the normal-hearing basilar membrane (BM) input-output function in hearing-impaired listeners is suggested. A multi-center evaluation in comparison with an alternative binaural and two bilateral fittings was performed to assess the effect of binaural synchronisation on (a) speech intelligibility and (b) perceived quality in realistic conditions. 30 and 12 hearing impaired (HI) listeners were aided individually with the algorithms for both experimental parts, respectively. A small preference towards the proposed model-based algorithm in the direct quality comparison was found. However, no benefit of binaural-synchronisation regarding speech intelligibility was found, suggesting a dominant role of the better ear in all experimental conditions. The suggested binaural synchronisation of compression algorithms showed a limited effect on the tested outcome measures, however, linking could be situationally beneficial to preserve a natural binaural perception of the acoustical environment.
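The core issue, that independent per-ear compression shrinks ILDs while binaural linking preserves them, can be illustrated with a static broadband compression curve. The threshold and ratio values below are arbitrary illustrations, and a real hearing-aid compressor is frequency- and time-dependent.

```python
def compress_gain_db(level_db, threshold_db=50.0, ratio=3.0):
    """Static compression: above threshold, output grows 1 dB per `ratio`
    dB of input, so the applied gain becomes increasingly negative."""
    over = max(level_db - threshold_db, 0.0)
    return -over * (1.0 - 1.0 / ratio)

def bilateral_vs_linked(level_left_db, level_right_db):
    """Independent per-ear gains vs. one binaurally linked gain (here: the
    louder ear's gain applied to both ears), which preserves the ILD."""
    g_l = compress_gain_db(level_left_db)
    g_r = compress_gain_db(level_right_db)
    linked = min(g_l, g_r)
    return (g_l, g_r), (linked, linked)

# Source on the left: 70 dB at the left ear, 60 dB at the right (ILD = 10 dB)
(bil_l, bil_r), (lnk_l, lnk_r) = bilateral_vs_linked(70.0, 60.0)
ild_bilateral = (70.0 + bil_l) - (60.0 + bil_r)  # shrinks below 10 dB
ild_linked = (70.0 + lnk_l) - (60.0 + lnk_r)     # stays at 10 dB
```

With a 3:1 ratio the independent scheme compresses the 10 dB ILD to roughly a third of its size, whereas the linked scheme leaves it intact, which is the motivation for binaural synchronisation.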
NASA Astrophysics Data System (ADS)
Liberman, A. M.
1982-03-01
This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. Manuscripts cover the following topics: Speech perception and memory coding in relation to reading ability; The use of orthographic structure by deaf adults: Recognition of finger-spelled letters; Exploring the information support for speech; The stream of speech; Using the acoustic signal to make inferences about place and duration of tongue-palate contact; Patterns of human interlimb coordination emerge from the properties of nonlinear limit cycle oscillatory processes: Theory and data; Motor control: Which themes do we orchestrate? Exploring the nature of motor control in Down's syndrome; Periodicity and auditory memory: A pilot study; Reading skill and language skill: On the role of sign order and morphological structure in memory for American Sign Language sentences; Perception of nasal consonants with special reference to Catalan; and Speech production characteristics of the hearing impaired.
Small intragenic deletion in FOXP2 associated with childhood apraxia of speech and dysarthria.
Turner, Samantha J; Hildebrand, Michael S; Block, Susan; Damiano, John; Fahey, Michael; Reilly, Sheena; Bahlo, Melanie; Scheffer, Ingrid E; Morgan, Angela T
2013-09-01
Relatively little is known about the neurobiological basis of speech disorders although genetic determinants are increasingly recognized. The first gene for primary speech disorder was FOXP2, identified in a large, informative family with verbal and oral dyspraxia. Subsequently, many de novo and familial cases with a severe speech disorder associated with FOXP2 mutations have been reported. These mutations include sequencing alterations, translocations, uniparental disomy, and genomic copy number variants. We studied eight probands with speech disorder and their families. Family members were phenotyped using a comprehensive assessment of speech, oral motor function, language, literacy skills, and cognition. Coding regions of FOXP2 were screened to identify novel variants. Segregation of the variant was determined in the probands' families. Variants were identified in two probands. One child with severe motor speech disorder had a small de novo intragenic FOXP2 deletion. His phenotype included features of childhood apraxia of speech and dysarthria, oral motor dyspraxia, receptive and expressive language disorder, and literacy difficulties. The other variant was found in a family in two of three family members with stuttering, and also in the mother with oral motor impairment. This variant was considered a benign polymorphism as it was predicted to be non-pathogenic with in silico tools and found in database controls. This is the first report of a small intragenic deletion of FOXP2 that is likely to be the cause of severe motor speech disorder associated with language and literacy problems. Copyright © 2013 Wiley Periodicals, Inc.
ERIC Educational Resources Information Center
Santos-Oliveira, Daniela Cristina
2017-01-01
Models of speech perception suggest a dorsal stream connecting the temporal and inferior parietal lobe with the inferior frontal gyrus. This stream is thought to involve an auditory motor loop that translates acoustic information into motor/articulatory commands and is further influenced by decision making processes that involve maintenance of…
A right-ear bias of auditory selective attention is evident in alpha oscillations.
Payne, Lisa; Rogers, Chad S; Wingfield, Arthur; Sekuler, Robert
2017-04-01
Auditory selective attention makes it possible to pick out one speech stream that is embedded in a multispeaker environment. We adapted a cued dichotic listening task to examine suppression of a speech stream lateralized to the nonattended ear, and to evaluate the effects of attention on the right ear's well-known advantage in the perception of linguistic stimuli. After being cued to attend to input from either their left or right ear, participants heard two different four-word streams presented simultaneously to the separate ears. Following each dichotic presentation, participants judged whether a spoken probe word had been in the attended ear's stream. We used EEG signals to track participants' spatial lateralization of auditory attention, which is marked by interhemispheric differences in EEG alpha (8-14 Hz) power. A right-ear advantage (REA) was evident in faster response times and greater sensitivity in distinguishing attended from unattended words. Consistent with the REA, we found strongest parietal and right frontotemporal alpha modulation during the attend-right condition. These findings provide evidence for a link between selective attention and the REA during directed dichotic listening. © 2016 Society for Psychophysiological Research.
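The interhemispheric alpha-power difference that marks attentional lateralization in this study is commonly summarized as a normalized index. Below is a minimal numpy sketch of that generic computation, not the authors' analysis pipeline; the 8-14 Hz band is taken from the abstract, while the function names and the (right - left)/(right + left) index form are illustrative assumptions.

```python
import numpy as np

def band_power(x, fs, lo=8.0, hi=14.0):
    """Total spectral power of x within the [lo, hi] Hz band."""
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return spec[(freqs >= lo) & (freqs <= hi)].sum()

def alpha_lateralization(left_chan, right_chan, fs):
    """Normalized alpha-power asymmetry between hemispheres.

    Negative values indicate stronger alpha over the left hemisphere,
    positive values stronger alpha over the right."""
    l = band_power(left_chan, fs)
    r = band_power(right_chan, fs)
    return (r - l) / (r + l)
```

For example, a 10 Hz oscillation confined to the left-hemisphere channel drives the index toward -1.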
Speech recognition systems on the Cell Broadband Engine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Y; Jones, H; Vaidya, S
In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time--a channel density that is orders-of-magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.
Role of contextual cues on the perception of spectrally reduced interrupted speech.
Patro, Chhayakanta; Mendel, Lisa Lucks
2016-08-01
Understanding speech within an auditory scene is constantly challenged by interfering noise in suboptimal listening environments when noise hinders the continuity of the speech stream. In such instances, a typical auditory-cognitive system perceptually integrates available speech information and "fills in" missing information in the light of semantic context. However, individuals with cochlear implants (CIs) find it difficult and effortful to understand interrupted speech compared to their normal hearing counterparts. This inefficiency in perceptual integration of speech could be attributed to further degradations in the spectral-temporal domain imposed by CIs making it difficult to utilize the contextual evidence effectively. To address these issues, 20 normal hearing adults listened to speech that was spectrally reduced and spectrally reduced interrupted in a manner similar to CI processing. The Revised Speech Perception in Noise test, which includes contextually rich and contextually poor sentences, was used to evaluate the influence of semantic context on speech perception. Results indicated that listeners benefited more from semantic context when they listened to spectrally reduced speech alone. For the spectrally reduced interrupted speech, contextual information was not as helpful under significant spectral reductions, but became beneficial as the spectral resolution improved. These results suggest top-down processing facilitates speech perception up to a point, and it fails to facilitate speech understanding when the speech signals are significantly degraded.
Crosse, Michael J; Lalor, Edmund C
2014-04-01
Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by >10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information.
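The "linear forward-mapping between the speech envelope and EEG data" described above is typically estimated as a temporal response function via time-lagged ridge regression. The following is a generic sketch of that technique, not the authors' code; the lag range and regularization strength are arbitrary illustrative values.

```python
import numpy as np

def lagged_design(x, lags):
    """Design matrix whose columns are copies of x shifted by each lag
    (in samples); out-of-range samples are zero-padded."""
    n = len(x)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = x[:n - lag]
        else:
            X[:lag, j] = x[-lag:]
    return X

def fit_trf(envelope, response, lags, lam=1.0):
    """Ridge-regularized forward model: w = (X'X + lam*I)^-1 X'y.

    w[j] estimates how strongly the response reflects the envelope
    at lags[j] samples of delay."""
    X = lagged_design(envelope, lags)
    XtX = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(XtX, X.T @ response)
```

The fitted weights peak at the latency where the envelope is represented, which is how a >10 ms latency shortening for audiovisual speech can be read off the model.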
Testing the dual-pathway model for auditory processing in human cortex.
Zündorf, Ida C; Lewald, Jörg; Karnath, Hans-Otto
2016-01-01
Analogous to the visual system, auditory information has been proposed to be processed in two largely segregated streams: an anteroventral ("what") pathway mainly subserving sound identification and a posterodorsal ("where") stream mainly subserving sound localization. Despite the popularity of this assumption, the degree of separation of spatial and non-spatial auditory information processing in cortex is still under discussion. In the present study, a statistical approach was implemented to investigate potential behavioral dissociations for spatial and non-spatial auditory processing in stroke patients, and voxel-wise lesion analyses were used to uncover their neural correlates. The results generally provided support for anatomically and functionally segregated auditory networks. However, some degree of anatomo-functional overlap between "what" and "where" aspects of processing was found in the superior pars opercularis of right inferior frontal gyrus (Brodmann area 44), suggesting the potential existence of a shared target area of both auditory streams in this region. Moreover, beyond the typically defined posterodorsal stream (i.e., posterior superior temporal gyrus, inferior parietal lobule, and superior frontal sulcus), occipital lesions were found to be associated with sound localization deficits. These results, indicating anatomically and functionally complex cortical networks for spatial and non-spatial auditory processing, are roughly consistent with the dual-pathway model of auditory processing in its original form, but argue for the need to refine and extend this widely accepted hypothesis. Copyright © 2015 Elsevier Inc. All rights reserved.
Neighborhood Foreclosures, Racial/Ethnic Transitions, and Residential Segregation
Hall, Matthew; Crowder, Kyle; Spring, Amy
2015-01-01
In this article, we use data on virtually all foreclosure events between 2005 and 2009 to calculate neighborhood foreclosure rates for nearly all block groups in the United States to assess the impact of housing foreclosures on neighborhood racial/ethnic change and on broader patterns of racial residential segregation. We find that the foreclosure crisis was patterned strongly along racial lines: black, Latino, and racially integrated neighborhoods had exceptionally high foreclosure rates. Multilevel models of racial/ethnic change reveal that foreclosure concentrations were linked to declining shares of whites and expanding shares of black and Latino residents. Results further suggest that these compositional shifts were driven by both white population loss and minority growth, especially from racially mixed settings with high foreclosure rates. To explore the impact of these racially selective migration streams on patterns of residential segregation, we simulate racial segregation assuming that foreclosure rates remained at their 2005 levels throughout the crisis period. Our simulations suggest that the foreclosure crisis increased racial segregation between blacks and whites by 1.1 dissimilarity points, and between Latinos and whites by 2.2 dissimilarity points. PMID:26120142
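The "dissimilarity points" reported above come from the standard index of dissimilarity: half the sum, over areal units, of the absolute difference between each group's share of its citywide population. A minimal sketch of that standard formula follows (not the authors' multilevel models or simulation code):

```python
def dissimilarity_index(group_a, group_b):
    """Index of dissimilarity between two groups across areal units
    (e.g., block groups). Ranges from 0 (identical distributions) to 1
    (complete segregation); multiply by 100 to express it in 'points'."""
    total_a, total_b = sum(group_a), sum(group_b)
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(group_a, group_b))
```

On this scale, the reported 1.1-point black-white increase corresponds to a 0.011 change in the index.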
Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.
Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana
2017-08-09
The extent to which the sleeping brain processes sensory information remains unclear. This is particularly true for continuous and complex stimuli such as speech, in which information is organized into hierarchically embedded structures. Recently, novel metrics for assessing the neural representation of continuous speech have been developed using noninvasive brain recordings that have thus far only been tested during wakefulness. Here we investigated, for the first time, the sleeping brain's capacity to process continuous speech at different hierarchical levels using a newly developed Concurrent Hierarchical Tracking (CHT) approach that allows monitoring the neural representation and processing-depth of continuous speech online. Speech sequences were compiled with syllables, words, phrases, and sentences occurring at fixed time intervals such that different linguistic levels correspond to distinct frequencies. This enabled us to distinguish their neural signatures in brain activity. We compared the neural tracking of intelligible versus unintelligible (scrambled and foreign) speech across states of wakefulness and sleep using high-density EEG in humans. We found that neural tracking of stimulus acoustics was comparable across wakefulness and sleep and similar across all conditions regardless of speech intelligibility. In contrast, neural tracking of higher-order linguistic constructs (words, phrases, and sentences) was only observed for intelligible speech during wakefulness and could not be detected at all during nonrapid eye movement or rapid eye movement sleep. These results suggest that, whereas low-level auditory processing is relatively preserved during sleep, higher-level hierarchical linguistic parsing is severely disrupted, thereby revealing the capacity and limits of language processing during sleep. 
SIGNIFICANCE STATEMENT Despite the persistence of some sensory processing during sleep, it is unclear whether high-level cognitive processes such as speech parsing are also preserved. We used a novel approach for studying the depth of speech processing across wakefulness and sleep while tracking neuronal activity with EEG. We found that responses to the auditory sound stream remained intact; however, the sleeping brain did not show signs of hierarchical parsing of the continuous stream of syllables into words, phrases, and sentences. The results suggest that sleep imposes a functional barrier between basic sensory processing and high-level cognitive processing. This paradigm also holds promise for studying residual cognitive abilities in a wide array of unresponsive states. Copyright © 2017 the authors 0270-6474/17/377772-10$15.00/0.
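The frequency-tagging logic of the Concurrent Hierarchical Tracking approach can be illustrated numerically: when syllables occur at a fixed rate, each linguistic level appears as a spectral peak at its own frequency in the neural signal. Below is a toy numpy sketch with arbitrary example rates (4 Hz syllables, 1 Hz sentences); it is not the authors' stimuli or analysis.

```python
import numpy as np

fs = 100.0                      # sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)    # 60 s of simulated neural signal

# Simulated response: strong syllable-rate (4 Hz) tracking plus a weaker
# sentence-rate (1 Hz) component, as expected for intelligible speech.
signal = np.sin(2 * np.pi * 4 * t) + 0.5 * np.sin(2 * np.pi * 1 * t)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

syllable_peak = freqs[np.argmax(spectrum)]   # dominant peak, near 4 Hz here
```

In the study's sleep conditions the pattern would change: the syllable/acoustic-rate peak survives, while the word-, phrase-, and sentence-rate peaks disappear.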
Enhancing Auditory Selective Attention Using a Visually Guided Hearing Aid.
Kidd, Gerald
2017-10-17
Listeners with hearing loss, as well as many listeners with clinically normal hearing, often experience great difficulty segregating talkers in a multiple-talker sound field and selectively attending to the desired "target" talker while ignoring the speech from unwanted "masker" talkers and other sources of sound. This listening situation forms the classic "cocktail party problem" described by Cherry (1953) that has received a great deal of study over the past few decades. In this article, a new approach to improving sound source segregation and enhancing auditory selective attention is described. The conceptual design, current implementation, and results obtained to date are reviewed and discussed in this article. This approach, embodied in a prototype "visually guided hearing aid" (VGHA) currently used for research, employs acoustic beamforming steered by eye gaze as a means for improving the ability of listeners to segregate and attend to one sound source in the presence of competing sound sources. The results from several studies demonstrate that listeners with normal hearing are able to use an attention-based "spatial filter" operating primarily on binaural cues to selectively attend to one source among competing spatially distributed sources. Furthermore, listeners with sensorineural hearing loss generally are less able to use this spatial filter as effectively as are listeners with normal hearing especially in conditions high in "informational masking." The VGHA enhances auditory spatial attention for speech-on-speech masking and improves signal-to-noise ratio for conditions high in "energetic masking." Visual steering of the beamformer supports the coordinated actions of vision and audition in selective attention and facilitates following sound source transitions in complex listening situations. 
Both listeners with normal hearing and with sensorineural hearing loss may benefit from the acoustic beamforming implemented by the VGHA, especially for nearby sources in less reverberant sound fields. Moreover, guiding the beam using eye gaze can be an effective means of sound source enhancement for listening conditions where the target source changes frequently over time as often occurs during turn-taking in a conversation. http://cred.pubs.asha.org/article.aspx?articleid=2601621.
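Acoustic beamforming of the kind the VGHA steers with eye gaze can be illustrated in its simplest delay-and-sum form: align each microphone channel by the steering delay for the attended direction, then average. This is a generic sketch with integer-sample delays, not the VGHA's actual implementation.

```python
import numpy as np

def delay_and_sum(channels, steering_delays):
    """Align each channel by its steering delay (in samples) and average.

    Signals arriving from the steered direction add coherently, while
    off-axis sources are attenuated by incoherent averaging."""
    out = np.zeros(channels.shape[1])
    for ch, d in zip(channels, steering_delays):
        out += np.roll(ch, -d)   # undo the propagation delay
    return out / len(channels)
```

Note that np.roll wraps circularly at the edges; a real beamformer would use fractional-delay filtering and, as in the VGHA, update the steering delays continuously from the eye tracker.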
A cortical circuit for voluntary laryngeal control: Implications for the evolution of language.
Hickok, Gregory
2017-02-01
The development of voluntary laryngeal control has been argued to be a key innovation in the evolution of language. Part of the evidence for this hypothesis comes from neuroscience. For example, comparative research has shown that humans have direct cortical innervation of motor neurons controlling the larynx, whereas nonhuman primates do not. Research on cortical motor control circuits has shown that the frontal lobe cortical motor system does not work alone; it is dependent on sensory feedback control circuits. Thus, the human brain must have evolved not only the required efferent motor pathway but also the cortical circuit for controlling those efferent signals. To fill this gap, I propose a link between the evolution of laryngeal control and neuroscience research on the human dorsal auditory-motor speech stream. Specifically, I argue that the dorsal stream Spt (Sylvian parietal-temporal) circuit evolved in step with the direct cortico-laryngeal control pathway and together represented a key advance in the evolution of speech. I suggest that a cortical laryngeal control circuit may play an important role in language by providing a prosodic frame for speech planning.
Design Automation for Streaming Systems
2005-12-16
which are FIFO buffered channels. We develop a process network model for streaming systems (TDFPN) and a hardware description language with built in...and may include an automatic address generator. A complete synthesis system would provide separate segment operator implementations for every...Acoustics, Speech, and Signal Processing (ICASSP '89), pages 988–991, 1989. [Luk et al., 1997] Wayne Luk, Nabeel Shirazi, and Peter Y. K. Cheung
Simonyan, Kristina; Fuertinger, Stefan
2015-04-01
Speech production is one of the most complex human behaviors. Although brain activation during speaking has been well investigated, our understanding of interactions between the brain regions and neural networks remains limited. We combined seed-based interregional correlation analysis with graph theoretical analysis of functional MRI data during the resting state and sentence production in healthy subjects to investigate the interface and topology of functional networks originating from the key brain regions controlling speech, i.e., the laryngeal/orofacial motor cortex, inferior frontal and superior temporal gyri, supplementary motor area, cingulate cortex, putamen, and thalamus. During both resting and speaking, the interactions between these networks were bilaterally distributed and centered on the sensorimotor brain regions. However, speech production preferentially recruited the inferior parietal lobule (IPL) and cerebellum into the large-scale network, suggesting the importance of these regions in facilitation of the transition from the resting state to speaking. Furthermore, the cerebellum (lobule VI) was the most prominent region showing functional influences on speech-network integration and segregation. Although networks were bilaterally distributed, interregional connectivity during speaking was stronger in the left vs. right hemisphere, which may have underlain a more homogeneous overlap between the examined networks in the left hemisphere. Among these, the laryngeal motor cortex (LMC) established a core network that fully overlapped with all other speech-related networks, determining the extent of network interactions. Our data demonstrate complex interactions of large-scale brain networks controlling speech production and point to the critical role of the LMC, IPL, and cerebellum in the formation of the speech production network. Copyright © 2015 the American Physiological Society.
How reading differs from object naming at the neuronal level.
Price, C J; McCrory, E; Noppeney, U; Mechelli, A; Moore, C J; Biggio, N; Devlin, J T
2006-01-15
This paper uses whole brain functional neuroimaging in neurologically normal participants to explore how reading aloud differs from object naming in terms of neuronal implementation. In the first experiment, we directly compared brain activation during reading aloud and object naming. This revealed greater activation for reading in bilateral premotor, left posterior superior temporal and precuneus regions. In a second experiment, we segregated the object-naming system into object recognition and speech production areas by factorially manipulating the presence or absence of objects (pictures of objects or their meaningless scrambled counterparts) with the presence or absence of speech production (vocal vs. finger press responses). This demonstrated that the areas associated with speech production (object naming and repetitively saying "OK" to meaningless scrambled pictures) corresponded exactly to the areas where responses were higher for reading aloud than object naming in Experiment 1. Collectively the results suggest that, relative to object naming, reading increases the demands on shared speech production processes. At a cognitive level, enhanced activation for reading in speech production areas may reflect the multiple and competing phonological codes that are generated from the sublexical parts of written words. At a neuronal level, it may reflect differences in the speed with which different areas are activated and integrate with one another.
Perception of temporally modified speech in auditory neuropathy.
Hassan, Dalia Mohamed
2011-01-01
Disrupted auditory nerve activity in auditory neuropathy (AN) significantly impairs the sequential processing of auditory information, resulting in poor speech perception. This study investigated the ability of AN subjects to perceive temporally modified consonant-vowel (CV) pairs and shed light on their phonological awareness skills. Four Arabic CV pairs were selected: /ki/-/gi/, /to/-/do/, /si/-/sti/ and /so/-/zo/. The formant transitions in consonants and the pauses between CV pairs were prolonged. Rhyming, segmentation and blending skills were tested using words at a natural rate of speech and with prolongation of the speech stream. Fourteen adult AN subjects were compared to a matched group of cochlear-impaired patients in their perception of acoustically processed speech. The AN group distinguished the CV pairs at a low speech rate, in particular with modification of the consonant duration. Phonological awareness skills deteriorated in adult AN subjects but improved with prolongation of the speech inter-syllabic time interval. A rehabilitation program for AN should consider temporal modification of speech, training for auditory temporal processing and the use of devices with innovative signal processing schemes. Verbal modifications as well as visual imaging appear to be promising compensatory strategies for remediating the affected phonological processing skills.
Age and experience shape developmental changes in the neural basis of language-related learning.
McNealy, Kristin; Mazziotta, John C; Dapretto, Mirella
2011-11-01
Very little is known about the neural underpinnings of language learning across the lifespan and how these might be modified by maturational and experiential factors. Building on behavioral research highlighting the importance of early word segmentation (i.e. the detection of word boundaries in continuous speech) for subsequent language learning, here we characterize developmental changes in brain activity as this process occurs online, using data collected in a mixed cross-sectional and longitudinal design. One hundred and fifty-six participants, ranging from age 5 to adulthood, underwent functional magnetic resonance imaging (fMRI) while listening to three novel streams of continuous speech, which contained either strong statistical regularities, strong statistical regularities and speech cues, or weak statistical regularities providing minimal cues to word boundaries. All age groups displayed significant signal increases over time in temporal cortices for the streams with high statistical regularities; however, we observed a significant right-to-left shift in the laterality of these learning-related increases with age. Interestingly, only the 5- to 10-year-old children displayed significant signal increases for the stream with low statistical regularities, suggesting an age-related decrease in sensitivity to more subtle statistical cues. Further, in a sample of 78 10-year-olds, we examined the impact of proficiency in a second language and level of pubertal development on learning-related signal increases, showing that the brain regions involved in language learning are influenced by both experiential and maturational factors. 2011 Blackwell Publishing Ltd.
The Human Voice in Speech and Singing
NASA Astrophysics Data System (ADS)
Lindblom, Björn; Sundberg, Johan
This chapter describes various aspects of the human voice as a means of communication in speech and singing. From the point of view of function, vocal sounds can be regarded as the end result of a three stage process: (1) the compression of air in the respiratory system, which produces an exhalatory airstream, (2) the vibrating vocal folds' transformation of this air stream to an intermittent or pulsating air stream, which is a complex tone, referred to as the voice source, and (3) the filtering of this complex tone in the vocal tract resonator. The main function of the respiratory system is to generate an overpressure of air under the glottis, or a subglottal pressure. Section 16.1 describes different aspects of the respiratory system of significance to speech and singing, including lung volume ranges, subglottal pressures, and how this pressure is affected by the ever-varying recoil forces. The complex tone generated when the air stream from the lungs passes the vibrating vocal folds can be varied in at least three dimensions: fundamental frequency, amplitude and spectrum. Section 16.2 describes how these properties of the voice source are affected by the subglottal pressure, the length and stiffness of the vocal folds and how firmly the vocal folds are adducted. Section 16.3 gives an account of the vocal tract filter, how its form determines the frequencies of its resonances, and Sect. 16.4 gives an account for how these resonance frequencies or formants shape the vocal sounds by imposing spectrum peaks separated by spectrum valleys, and how the frequencies of these peaks determine vowel and voice qualities. The remaining sections of the chapter describe various aspects of the acoustic signals used for vocal communication in speech and singing. The syllable structure is discussed in Sect. 16.5, the closely related aspects of rhythmicity and timing in speech and singing is described in Sect. 16.6, and pitch and rhythm aspects in Sect. 16.7. 
The impressive control of all these acoustic characteristics of vocal signals is discussed in Sect. 16.8, while Sect. 16.9 considers expressive aspects of vocal communication.
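The chapter's three-stage account (subglottal pressure → pulsating glottal airstream → vocal-tract filtering) is the classic source-filter model, which can be sketched numerically as a periodic pulse source passed through a two-pole resonator standing in for one formant. This is a didactic sketch, not a production voice synthesizer; the f0, formant frequency, and bandwidth are arbitrary example values.

```python
import numpy as np

def glottal_source(f0, fs, dur):
    """Idealized voice source: an impulse train at the fundamental f0,
    standing in for the pulsating airstream through the vocal folds."""
    n = int(fs * dur)
    src = np.zeros(n)
    src[::int(fs / f0)] = 1.0
    return src

def formant_filter(x, freq, bw, fs):
    """Two-pole resonator approximating one vocal-tract formant:
    y[n] = x[n] + 2 r cos(theta) y[n-1] - r^2 y[n-2]."""
    r = np.exp(-np.pi * bw / fs)          # pole radius from bandwidth
    theta = 2 * np.pi * freq / fs         # pole angle from center frequency
    a1, a2 = 2 * r * np.cos(theta), -r * r
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n]
        if n >= 1:
            y[n] += a1 * y[n - 1]
        if n >= 2:
            y[n] += a2 * y[n - 2]
    return y

fs = 8000
vowel = formant_filter(glottal_source(100, fs, 1.0), freq=500, bw=80, fs=fs)
```

The output spectrum contains harmonics of f0 whose amplitudes peak near the 500 Hz resonance, mirroring stage (3) of the chapter's account, where formants impose spectrum peaks separated by valleys.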
The upcycling of post-industrial PP/PET waste streams through in-situ microfibrillar preparation
NASA Astrophysics Data System (ADS)
Delva, Laurens; Ragaert, Kim; Cardon, Ludwig
2015-12-01
Post-industrial plastic waste streams can be re-used as secondary material streams for polymer processing by extrusion or injection moulding. One of the major commercially available waste streams contains polypropylene (PP) contaminated with polyesters (mostly polyethylene terephthalate - PET). An important practical hurdle for the direct implementation of this waste stream is the immiscibility of PP and PET in the melt, which leads to segregation within the polymer structure and adversely affects the reproducibility and mechanical properties of the manufactured parts. It has been indicated in the literature that the creation of PET microfibrils in the PP matrix could undo these drawbacks and upcycle the PP/PET combination. Within the current research, a commercially available virgin PP/PET was evaluated for the microfibrillar preparation. The mechanical (tensile and impact) properties, thermal properties and morphology of the composites were characterized at different stages of the microfibrillar preparation.
Neural Integration in Body Perception.
Ramsey, Richard
2018-06-19
The perception of other people is instrumental in guiding social interactions. For example, the appearance of the human body cues a wide range of inferences regarding sex, age, health, and personality, as well as emotional state and intentions, which influence social behavior. To date, most neuroscience research on body perception has aimed to characterize the functional contribution of segregated patches of cortex in the ventral visual stream. In light of the growing prominence of network architectures in neuroscience, the current article reviews neuroimaging studies that measure functional integration between different brain regions during body perception. The review demonstrates that body perception is not restricted to processing in the ventral visual stream but instead reflects a functional alliance between the ventral visual stream and extended neural systems associated with action perception, executive functions, and theory of mind. Overall, these findings demonstrate how body percepts are constructed through interactions in distributed brain networks and underscore that functional segregation and integration should be considered together when formulating neurocognitive theories of body perception. Insight from such an updated model of body perception generalizes to inform the organizational structure of social perception and cognition more generally and also informs disorders of body image, such as anorexia nervosa, which may rely on atypical integration of body-related information.
Harrewijn, Anita; van der Molen, Melle J W; van Vliet, Irene M; Houwing-Duistermaat, Jeanine J; Westenberg, P Michiel
2018-02-01
Social anxiety disorder (SAD) is characterized by an extreme and intense fear and avoidance of social situations. In this two-generation family study we examined delta-beta correlation during a social performance task as candidate endophenotype of SAD. Nine families with a target participant (diagnosed with SAD), their spouse and children, as well as target's siblings with spouse and children performed a social performance task in which they gave a speech in front of a camera. EEG was measured during resting state, anticipation, and recovery. Our analyses focused on two criteria for endophenotypes: co-segregation within families and heritability. Co-segregation analyses revealed increased negative delta-low beta correlation during anticipation in participants with (sub)clinical SAD compared to participants without (sub)clinical SAD. Heritability analyses revealed that delta-low beta and delta-high beta correlation during anticipation were heritable. Delta-beta correlation did not differ between participants with and without (sub)clinical SAD during resting state or recovery, nor between participants with and without SAD during all phases of the task. It should be noted that participants were seen only once, they all performed the EEG tasks in the same order, and some participants were too anxious to give a speech. Delta-low beta correlation during anticipation of giving a speech might be a candidate endophenotype of SAD, possibly reflecting increased crosstalk between cortical and subcortical regions. If validated as endophenotype, delta-beta correlation during anticipation could be useful in studying the genetic basis, as well as improving treatment and early detection of persons at risk for developing SAD. Copyright © 2017 Elsevier B.V. All rights reserved.
Speech processing: from peripheral to hemispheric asymmetry of the auditory system.
Lazard, Diane S; Collette, Jean-Louis; Perrot, Xavier
2012-01-01
Language processing from the cochlea to auditory association cortices shows side-dependent specificities with an apparent left hemispheric dominance. The aim of this article is to offer nonspeech specialists a didactic review of two complementary theories about hemispheric asymmetry in speech processing. Starting from anatomico-physiological and clinical observations of auditory asymmetry and interhemispheric connections, the review then presents behavioral (dichotic listening paradigm) and functional (functional magnetic resonance imaging and positron emission tomography) experiments that assessed hemispheric specialization for speech processing. Even though speech at an early phonological level is regarded as being processed bilaterally, a left-hemispheric dominance exists for higher-level processing. This asymmetry may arise from a segregation of the speech signal, broken apart within nonprimary auditory areas either into two distinct temporal integration windows, a fast one on the left and a slower one on the right, as modeled by the asymmetric sampling in time theory, or through a spectro-temporal trade-off, with higher temporal resolution in the left hemisphere and higher spectral resolution in the right hemisphere, as modeled by the spectral/temporal resolution trade-off theory. Both theories share the concept that lower-order tuning principles for the acoustic signal might drive higher-order organization for speech processing. However, the precise nature, mechanisms, and origin of speech processing asymmetry are still being debated. Finally, an example of altered hemispheric asymmetry, which has direct clinical implications, is given through the case of auditory aging, which mixes peripheral disorder with modifications of central processing. Copyright © 2011 The American Laryngological, Rhinological, and Otological Society, Inc.
Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta
2018-06-19
During natural speech perception, listeners must track the global speaking rate, that is, the overall rate of incoming linguistic information, as well as transient, local speaking rate variations occurring within the global speaking rate. Here, we address the hypothesis that this tracking mechanism is achieved through coupling of cortical signals to the amplitude envelope of the perceived acoustic speech signals. Cortical signals were recorded with magnetoencephalography (MEG) while participants perceived spontaneously produced speech stimuli at three global speaking rates (slow, normal/habitual, and fast). Inherently to spontaneously produced speech, these stimuli also featured local variations in speaking rate. The coupling between cortical and acoustic speech signals was evaluated using audio-MEG coherence. Modulations in audio-MEG coherence spatially differentiated between tracking of global speaking rate, highlighting the temporal cortex bilaterally and the right parietal cortex, and sensitivity to local speaking rate variations, emphasizing the left parietal cortex. Cortical tuning to the temporal structure of natural connected speech thus seems to require the joint contribution of both auditory and parietal regions. These findings suggest that cortical tuning to speech rhythm operates on two functionally distinct levels: one encoding the global rhythmic structure of speech and the other associated with online, rapidly evolving temporal predictions. Thus, it may be proposed that speech perception is shaped by evolutionary tuning, a preference for certain speaking rates, and predictive tuning, associated with cortical tracking of the constantly changing rate of linguistic information in a speech stream.
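The coupling measure described above, audio-MEG coherence, quantifies the frequency-specific correlation between a cortical signal and the amplitude envelope of the speech audio. A minimal sketch (on simulated data, not the authors' pipeline; the sampling rate, segment length, and signals are assumptions for illustration):

```python
import numpy as np
from scipy.signal import coherence

# Hypothetical illustration: magnitude-squared coherence between a simulated
# "cortical" signal and a speech-envelope-like signal that share a common
# 4 Hz modulation (roughly the syllabic rate of normal speech).
fs = 200.0                      # sampling rate in Hz (assumed)
t = np.arange(0, 60, 1 / fs)    # 60 s of data
rng = np.random.default_rng(0)

envelope = np.sin(2 * np.pi * 4 * t) + 0.5 * rng.standard_normal(t.size)
meg = 0.6 * np.sin(2 * np.pi * 4 * t) + rng.standard_normal(t.size)

# Welch-averaged coherence spectrum; values near 1 indicate strong coupling.
f, Cxy = coherence(envelope, meg, fs=fs, nperseg=1024)

# The coherence spectrum should peak near the shared 4 Hz modulation.
peak_freq = f[np.argmax(Cxy)]
print(round(peak_freq, 2))
```

In the study itself, modulations of this coherence across slow, normal, and fast global speaking rates are what differentiate the cortical regions involved.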
Talebi, Hossein; Moossavi, Abdollah; Faghihzadeh, Soghrat
2014-01-01
Older adults with cerebrovascular accident (CVA) show evidence of auditory and speech perception problems. In the present study, it was examined whether these problems are due to impairments of the concurrent auditory segregation process, the basic level of auditory scene analysis and auditory organization in auditory scenes with competing sounds. Concurrent auditory segregation was assessed using the competing sentence test (CST) and the dichotic digits test (DDT) and compared in 30 male older adults (15 normal and 15 cases with right hemisphere CVA) in the same age group (60-75 years old). For the CST, participants were presented with a target message in one ear and a competing message in the other. The task was to listen to the target sentence and repeat it back without attending to the competing sentence. For the DDT, auditory stimuli were monosyllabic digits presented dichotically, and the task was to repeat them. Comparing the mean CST and DDT scores between CVA patients with right hemisphere impairment and normal participants showed statistically significant differences (p=0.001 for CST and p<0.0001 for DDT). The present study revealed that the abnormal CST and DDT scores of participants with right hemisphere CVA could be related to concurrent segregation difficulties. These findings suggest that low-level segregation mechanisms and/or high-level attention mechanisms might contribute to the problems.
The pupil response is sensitive to divided attention during speech processing.
Koelewijn, Thomas; Shinn-Cunningham, Barbara G; Zekveld, Adriana A; Kramer, Sophia E
2014-06-01
Dividing attention over two streams of speech strongly decreases performance compared to focusing on only one. How divided attention affects cognitive processing load as indexed with pupillometry during speech recognition has so far not been investigated. In 12 young adults the pupil response was recorded while they focused on either one or both of two sentences that were presented dichotically and masked by fluctuating noise across a range of signal-to-noise ratios. In line with previous studies, performance decreased when processing two target sentences instead of one. Additionally, dividing attention to process two sentences caused larger pupil dilation and later peak pupil latency than processing only one. This suggests an effect of attention on cognitive processing load (pupil dilation) during speech processing in noise. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
Electrolyte chemistry control in electrodialysis processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hayes, Thomas D.; Severin, Blaine F.
Methods for controlling electrolyte chemistry in electrodialysis units having an anode and a cathode, each in an electrolyte of a selected concentration, with a membrane stack disposed therebetween. The membrane stack includes pairs of cation-selective and anion-selective membranes to segregate increasingly dilute salt streams from a concentrated salt stream. Electrolyte chemistry is controlled via at least one of the following techniques: a single calcium-exclusionary cation-selective membrane at a cathode cell boundary, an exclusionary membrane configured as a hydraulically isolated scavenger cell, a multivalent scavenger co-electrolyte, or combinations thereof.
Women in "Male" Careers: The Case of Higher Technicians in France.
ERIC Educational Resources Information Center
Daune-Richard, Anne-Marie
1992-01-01
French statistics show that in the area of training and employment, differences in behavior patterns between men and women have diminished considerably. Nonetheless, sexual segregation remains strong, especially in scientific and technical fields. Distribution among training streams remains uneven. In tertiary and upper-level education and…
Shao, Xu; Milner, Ben
2005-08-01
This work proposes a method to reconstruct an acoustic speech signal solely from a stream of mel-frequency cepstral coefficients (MFCCs) as may be encountered in a distributed speech recognition (DSR) system. Previous methods for speech reconstruction have required, in addition to the MFCC vectors, fundamental frequency and voicing components. In this work the voicing classification and fundamental frequency are predicted from the MFCC vectors themselves using two maximum a posteriori (MAP) methods. The first method enables fundamental frequency prediction by modeling the joint density of MFCCs and fundamental frequency using a single Gaussian mixture model (GMM). The second scheme uses a set of hidden Markov models (HMMs) to link together a set of state-dependent GMMs, which enables more localized modeling of the joint density of MFCCs and fundamental frequency. Experimental results on speaker-independent male and female speech show that accurate voicing classification and fundamental frequency prediction are attained when compared to hand-corrected reference fundamental frequency measurements. The use of the predicted fundamental frequency and voicing for speech reconstruction is shown to give very similar speech quality to that obtained using the reference fundamental frequency and voicing.
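The core of the first scheme is predicting F0 from the conditional mean of a joint density over [MFCCs, F0]. A minimal sketch of the single-component case (assumptions: synthetic 2-D stand-in features instead of real MFCCs, and one joint Gaussian rather than the paper's full mixture, which would posterior-weight one such predictor per component):

```python
import numpy as np

# Hypothetical training data: 2-D stand-in "MFCC" features correlated with F0.
rng = np.random.default_rng(1)
n = 5000
x = rng.standard_normal((n, 2))
f0 = 120 + 30 * x[:, 0] - 10 * x[:, 1] + 2 * rng.standard_normal(n)

# Fit one joint Gaussian over z = [features, f0].
z = np.column_stack([x, f0])
mu = z.mean(axis=0)
cov = np.cov(z, rowvar=False)

# Conditional-mean (MAP under a Gaussian) predictor of f0 given the features:
#   E[f0 | x] = mu_f + Sigma_fx Sigma_xx^{-1} (x - mu_x)
mu_x, mu_f = mu[:2], mu[2]
S_xx, S_fx = cov[:2, :2], cov[2, :2]
w = np.linalg.solve(S_xx, S_fx)

def predict_f0(mfcc):
    return mu_f + (mfcc - mu_x) @ w

print(round(float(predict_f0(np.array([1.0, 0.0]))), 1))  # ≈ 150 (= 120 + 30)
```

A full GMM version would compute this conditional mean per mixture component and combine them weighted by each component's posterior probability given the observed MFCC vector.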
Buchan, Julie N; Munhall, Kevin G
2011-01-01
Conflicting visual speech information can influence the perception of acoustic speech, causing an illusory percept of a sound not present in the actual acoustic speech (the McGurk effect). We examined whether participants can voluntarily selectively attend to either the auditory or visual modality by instructing participants to pay attention to the information in one modality and to ignore competing information from the other modality. We also examined how performance under these instructions was affected by weakening the influence of the visual information by manipulating the temporal offset between the audio and video channels (experiment 1), and the spatial frequency information present in the video (experiment 2). Gaze behaviour was also monitored to examine whether attentional instructions influenced the gathering of visual information. While task instructions did have an influence on the observed integration of auditory and visual speech information, participants were unable to completely ignore conflicting information, particularly information from the visual stream. Manipulating temporal offset had a more pronounced interaction with task instructions than manipulating the amount of visual information. Participants' gaze behaviour suggests that the attended modality influences the gathering of visual information in audiovisual speech perception.
At what time is the cocktail party? A late locus of selective attention to natural speech.
Power, Alan J; Foxe, John J; Forde, Emma-Jane; Reilly, Richard B; Lalor, Edmund C
2012-05-01
Distinguishing between speakers and focusing attention on one speaker in multi-speaker environments is extremely important in everyday life. Exactly how the brain accomplishes this feat and, in particular, the precise temporal dynamics of this attentional deployment are as yet unknown. A long history of behavioral research using dichotic listening paradigms has debated whether selective attention to speech operates at an early stage of processing based on the physical characteristics of the stimulus or at a later stage during semantic processing. With its poor temporal resolution fMRI has contributed little to the debate, while EEG-ERP paradigms have been hampered by the need to average the EEG in response to discrete stimuli which are superimposed onto ongoing speech. This presents a number of problems, foremost among which is that early attention effects in the form of endogenously generated potentials can be so temporally broad as to mask later attention effects based on the higher level processing of the speech stream. Here we overcome this issue by utilizing the AESPA (auditory evoked spread spectrum analysis) method which allows us to extract temporally detailed responses to two concurrently presented speech streams in natural cocktail-party-like attentional conditions without the need for superimposed probes. We show attentional effects on exogenous stimulus processing in the 200-220 ms range in the left hemisphere. We discuss these effects within the context of research on auditory scene analysis and in terms of a flexible locus of attention that can be deployed at a particular processing stage depending on the task. © 2012 The Authors. European Journal of Neuroscience © 2012 Federation of European Neuroscience Societies and Blackwell Publishing Ltd.
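Methods like AESPA estimate a linear impulse response mapping the continuous speech envelope to the recorded neural signal, removing the need for superimposed probe stimuli. A conceptual sketch on simulated data (assumptions: a lagged least-squares estimator with ridge regularization, which is one standard way to fit such stimulus-response models; it is not the authors' exact code):

```python
import numpy as np

# Simulated "EEG" generated from a speech-envelope-like regressor through a
# known impulse response, which the estimator should recover.
fs = 100                                   # sampling rate in Hz (assumed)
rng = np.random.default_rng(4)
n = 3000
envelope = rng.standard_normal(n)

true_kernel = np.zeros(30)                 # ground truth: response at lag 10
true_kernel[10] = 1.0                      # i.e. 100 ms post-stimulus
eeg = np.convolve(envelope, true_kernel)[:n] + 0.5 * rng.standard_normal(n)

# Lagged design matrix: column k holds the envelope delayed by k samples.
lags = 30
X = np.column_stack([np.concatenate([np.zeros(k), envelope[:n - k]])
                     for k in range(lags)])

# Ridge-regularized least squares: w = (X'X + lambda I)^-1 X'y
ridge = 1.0
w = np.linalg.solve(X.T @ X + ridge * np.eye(lags), X.T @ eeg)

print(int(np.argmax(w)))  # peak of the estimated response at lag 10 (100 ms)
```

Fitting one such response per attended stream is what allows attention effects to be read off at specific latencies, such as the 200-220 ms range reported above.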
Boets, Bart; Wouters, Jan; van Wieringen, Astrid; Ghesquière, Pol
2007-04-09
This study investigates whether the core bottleneck of literacy impairment should be situated at the phonological level or at a more basic sensory level, as postulated by supporters of the auditory temporal processing theory. Phonological ability, speech perception and low-level auditory processing were assessed in a group of 5-year-old pre-school children at high family risk for dyslexia, compared to a group of well-matched low-risk control children. Based on family risk status and first-grade literacy achievement, children were categorized into groups and pre-school data were retrospectively reanalyzed. On average, children showing both increased family risk and literacy impairment at the end of first grade presented significant pre-school deficits in phonological awareness, rapid automatized naming, speech-in-noise perception and frequency modulation detection. The concurrent presence of these deficits before receiving any formal reading instruction might suggest a causal relation with problematic literacy development. However, a closer inspection of the individual data indicates that the core of the literacy problem is situated at the level of higher-order phonological processing. Although auditory and speech perception problems are relatively over-represented in literacy-impaired subjects and might possibly aggravate the phonological and literacy problem, it is unlikely that they would be at the basis of these problems. At a neurobiological level, the results are interpreted as evidence for dysfunctional processing along the auditory-to-articulation stream that is implicated in phonological processing, in combination with a relatively intact or inconsistently impaired functioning of the auditory-to-meaning stream that subserves auditory processing and speech perception.
Lord, Louis-David; Stevner, Angus B.; Kringelbach, Morten L.
2017-01-01
To survive in an ever-changing environment, the brain must seamlessly integrate a rich stream of incoming information into coherent internal representations that can then be used to efficiently plan for action. The brain must, however, balance its ability to integrate information from various sources with a complementary capacity to segregate information into modules which perform specialized computations in local circuits. Importantly, evidence suggests that imbalances in the brain's ability to bind together and/or segregate information over both space and time are a common feature of several neuropsychiatric disorders. Until recently, however, most studies have attempted to characterize the principles of integration and segregation strictly in static (i.e. time-invariant) representations of human brain networks, disregarding the complex spatio-temporal nature of these processes. In the present Review, we describe how the emerging discipline of whole-brain computational connectomics may be used to study the causal mechanisms of the integration and segregation of information on behaviourally relevant timescales. We emphasize how novel methods from network science and whole-brain computational modelling can expand beyond traditional neuroimaging paradigms and help to uncover the neurobiological determinants of the abnormal integration and segregation of information in neuropsychiatric disorders. This article is part of the themed issue ‘Mathematical methods in medicine: neuroscience, cardiology and pathology’. PMID:28507228
Recent advances in exploring the neural underpinnings of auditory scene perception
Snyder, Joel S.; Elhilali, Mounya
2017-01-01
Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds—and conventional behavioral techniques—to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the past few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field. PMID:28199022
Lewald, Jörg; Hanenberg, Christina; Getzmann, Stephan
2016-10-01
Successful speech perception in complex auditory scenes with multiple competing speakers requires spatial segregation of auditory streams into perceptually distinct and coherent auditory objects and focusing of attention toward the speaker of interest. Here, we focused on the neural basis of this remarkable capacity of the human auditory system and investigated the spatiotemporal sequence of neural activity within the cortical network engaged in solving the "cocktail-party" problem. Twenty-eight subjects localized a target word in the presence of three competing sound sources. The analysis of the ERPs revealed an anterior contralateral subcomponent of the N2 (N2ac), computed as the difference waveform for targets to the left minus targets to the right. The N2ac peaked at about 500 ms after stimulus onset, and its amplitude was correlated with better localization performance. Cortical source localization for the contrast of left versus right targets at the time of the N2ac revealed a maximum in the region around left superior frontal sulcus and frontal eye field, both of which are known to be involved in processing of auditory spatial information. In addition, a posterior-contralateral late positive subcomponent (LPCpc) occurred at a latency of about 700 ms. Both these subcomponents are potential correlates of allocation of spatial attention to the target under cocktail-party conditions. © 2016 Society for Psychophysiological Research.
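A lateralized ERP subcomponent like the N2ac is computed as a contralateral difference waveform: the average response for left-target trials minus that for right-target trials, from which a peak latency can be read off. An illustrative sketch on simulated epochs (the sampling rate, trial counts, and component shape are assumptions, not the authors' data):

```python
import numpy as np

fs = 500                                  # sampling rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)             # 0-1000 ms epoch
rng = np.random.default_rng(2)

def simulated_erp(n_trials, lateral_amp):
    # Gaussian "component" peaking at 500 ms post-stimulus, plus trial noise.
    component = lateral_amp * np.exp(-((t - 0.5) ** 2) / (2 * 0.05 ** 2))
    return component + 0.5 * rng.standard_normal((n_trials, t.size))

left = simulated_erp(100, 1.0).mean(axis=0)    # average ERP, targets left
right = simulated_erp(100, -1.0).mean(axis=0)  # average ERP, targets right

n2ac = left - right                        # difference waveform
peak_ms = 1000 * t[np.argmax(np.abs(n2ac))]
print(peak_ms)
```

Subtracting the two target sides cancels activity common to both conditions, isolating the lateralized component whose amplitude the study relates to localization performance.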
Prediction and constraint in audiovisual speech perception
Peelle, Jonathan E.; Sommers, Mitchell S.
2015-01-01
During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. 
Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported by distinct neuroanatomical mechanisms. PMID:25890390
Johnson, Wilson H; Douglas, Marlis R; Lewis, Jeffrey A; Stuecker, Tara N; Carbonero, Franck G; Austin, Bradley J; Evans-White, Michelle A; Entrekin, Sally A; Douglas, Michael E
2017-02-03
Unconventional natural gas (UNG) extraction (fracking) is ongoing in 29 North American shale basins (20 states), with ~6000 wells found within the Fayetteville shale (north-central Arkansas). If the chemical signature of fracking is detectable in streams, it can be employed to bookmark potential impacts. We evaluated benthic biofilm community composition as a proxy for stream chemistry so as to segregate anthropogenic signatures in eight Arkansas River catchments. In doing so, we tested the hypothesis that fracking characteristics in study streams are statistically distinguishable from those produced by agriculture or urbanization. Four tributary catchments had UNG-wells significantly more dense and near to our sampling sites and were grouped as 'potentially-impacted catchment zones' (PICZ). Four others were characterized by significantly larger forested area with greater slope and elevation but reduced pasture, and were classified as 'minimally-impacted' (MICZ). Overall, 46 bacterial phyla/141 classes were identified, with 24 phyla (52%) and 54 classes (38%) across all samples. PICZ-sites were ecologically more variable than MICZ-sites, with significantly greater nutrient levels (total nitrogen, total phosphorous), and elevated Cyanobacteria as bioindicators that tracked these conditions. PICZ-sites also exhibited elevated conductance (a correlate of increased ion concentration) and depressed salt-intolerant Spartobacteria, suggesting the presence of brine as a fracking effect. Biofilm communities at PICZ-sites were significantly less variable than those at MICZ-sites. Study streams differed by Group according to morphology, land use, and water chemistry but not in biofilm community structure. Those at PICZ-sites covaried according to anthropogenic impact, and were qualitatively similar to communities found at sites disturbed by fracking. 
The hypothesis that fracking signatures in study streams are distinguishable from those produced by other anthropogenic effects was statistically rejected. Instead, alterations in biofilm community composition, as induced by fracking, may be less specific than initially predicted, and thus more easily confounded by agriculture and urbanization effects (among others). Study streams must be carefully categorized with regard to the magnitude and extent of anthropogenic impacts. They must also be segregated with statistical confidence (as herein) before fracking impacts are monitored.
Vatakis, Argiro; Maragos, Petros; Rodomagoulakis, Isidoros; Spence, Charles
2012-01-01
We investigated how the physical differences associated with the articulation of speech affect the temporal aspects of audiovisual speech perception. Video clips of consonants and vowels uttered by three different speakers were presented. The video clips were analyzed using an auditory-visual signal saliency model in order to compare signal saliency and behavioral data. Participants made temporal order judgments (TOJs) regarding which speech-stream (auditory or visual) had been presented first. The sensitivity of participants' TOJs and the point of subjective simultaneity (PSS) were analyzed as a function of the place, manner of articulation, and voicing for consonants, and the height/backness of the tongue and lip-roundedness for vowels. We expected that in the case of the place of articulation and roundedness, where the visual-speech signal is more salient, temporal perception of speech would be modulated by the visual-speech signal. No such effect was expected for the manner of articulation or height. The results demonstrate that for place and manner of articulation, participants' temporal percept was affected (although not always significantly) by highly-salient speech-signals with the visual-signals requiring smaller visual-leads at the PSS. This was not the case when height was evaluated. These findings suggest that in the case of audiovisual speech perception, a highly salient visual-speech signal may lead to higher probabilities regarding the identity of the auditory-signal that modulate the temporal window of multisensory integration of the speech-stimulus. PMID:23060756
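The two behavioral measures above, TOJ sensitivity and the PSS, are standardly obtained by fitting a psychometric function to response proportions across stimulus onset asynchronies. A minimal sketch with a cumulative Gaussian fit (the response data, sign convention, and fitting choices here are hypothetical, for illustration only):

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

# Hypothetical TOJ data: proportion of "audio first" responses as a function
# of audiovisual onset asynchrony in ms (negative = visual stream leads).
soa = np.array([-300, -200, -100, -50, 0, 50, 100, 200, 300], float)
p_audio_first = np.array([0.02, 0.05, 0.20, 0.35, 0.55, 0.75, 0.90, 0.97, 0.99])

def cum_gauss(x, pss, sigma):
    # pss: point of subjective simultaneity; sigma: inverse of TOJ sensitivity.
    return norm.cdf(x, loc=pss, scale=sigma)

(pss, sigma), _ = curve_fit(cum_gauss, soa, p_audio_first, p0=(0.0, 100.0))

# A negative PSS means the visual stream must lead for perceived simultaneity,
# as with the visual leads reported for highly salient visual-speech signals.
print(round(pss, 1), round(sigma, 1))
```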
ERIC Educational Resources Information Center
Ferrer-Esteban, Gerard
2016-01-01
This article analyzes whether school social segregation, derived from policies and practices of both between-school student allocation and within-school streaming, is related to the effectiveness of the Italian education system. Hierarchical regression models are used to set out territorially aggregated factors of social sorting influencing…
Auditory Stream Segregation and the Perception of Across-Frequency Synchrony
ERIC Educational Resources Information Center
Micheyl, Christophe; Hunter, Cynthia; Oxenham, Andrew J.
2010-01-01
This study explored the extent to which sequential auditory grouping affects the perception of temporal synchrony. In Experiment 1, listeners discriminated between 2 pairs of asynchronous "target" tones at different frequencies, A and B, in which the B tone either led or lagged. Thresholds were markedly higher when the target tones were temporally…
40 CFR 63.1094 - What waste streams are exempt from the requirements of this subpart?
Code of Federal Regulations, 2010 CFR
2010-07-01
... CATEGORIES (CONTINUED) National Emission Standards for Ethylene Manufacturing Process Units: Heat Exchange... section are exempt from this subpart. (a) Waste in the form of gases or vapors that is emitted from process fluids. (b) Waste that is contained in a segregated storm water sewer system. Waste Requirements ...
Environmental heterogeneity, dispersal mode, and co-occurrence in stream macroinvertebrates
Heino, Jani
2013-01-01
Both environmental heterogeneity and mode of dispersal may affect species co-occurrence in metacommunities. Aquatic invertebrates were sampled in 20–30 streams in each of three drainage basins, differing considerably in environmental heterogeneity. Each drainage basin was further divided into two equally sized sets of sites, again differing profoundly in environmental heterogeneity. Benthic invertebrate data were divided into three groups of taxa based on overland dispersal modes: passive dispersers with aquatic adults, passive dispersers with terrestrial winged adults, and active dispersers with terrestrial winged adults. The co-occurrence of taxa in each dispersal mode group, drainage basin, and heterogeneity site subset was measured using the C-score and its standardized effect size. The probability of finding high levels of species segregation tended to increase with environmental heterogeneity across the drainage basins. These patterns were, however, contingent on both dispersal mode and drainage basin. It thus appears that environmental heterogeneity and dispersal mode interact in affecting co-occurrence in metacommunities, with passive dispersers with aquatic adults showing random patterns irrespective of environmental heterogeneity, and active dispersers with terrestrial winged adults showing increasing segregation with increasing environmental heterogeneity. PMID:23467653
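The C-score used above (Stone and Roberts' checkerboard score) averages, over all taxon pairs, the number of "checkerboard units" formed by sites where one taxon occurs without the other; higher values indicate more segregated co-occurrence. A minimal sketch (the standardized effect size mentioned in the abstract would additionally compare this value against a null distribution from randomized matrices, which is omitted here):

```python
import numpy as np
from itertools import combinations

def c_score(m):
    """Mean checkerboard score of a presence-absence matrix
    (rows = taxa, columns = sites)."""
    m = np.asarray(m, bool)
    pairs = list(combinations(range(m.shape[0]), 2))
    total = 0.0
    for i, j in pairs:
        shared = np.sum(m[i] & m[j])          # sites where both taxa occur
        ri, rj = m[i].sum(), m[j].sum()       # row totals (occurrences)
        total += (ri - shared) * (rj - shared)
    return total / len(pairs)

# A perfect checkerboard: two taxa that never co-occur across four sites.
checker = [[1, 0, 1, 0],
           [0, 1, 0, 1]]
print(c_score(checker))  # 4.0: (2-0)*(2-0) for the single pair
```

For two taxa that always co-occur, the shared count equals both row totals and the score is 0, the opposite (aggregated) extreme.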
Puschmann, Sebastian; Weerda, Riklef; Klump, Georg; Thiel, Christiane M
2013-05-01
Psychophysical experiments show that auditory change detection can be disturbed in situations in which listeners have to monitor complex auditory input. We made use of this change deafness effect to segregate the neural correlates of physical change in auditory input from brain responses related to conscious change perception in an fMRI experiment. Participants listened to two successively presented complex auditory scenes, which consisted of six auditory streams, and had to decide whether scenes were identical or whether the frequency of one stream was changed between presentations. Our results show that physical changes in auditory input, independent of successful change detection, are represented at the level of auditory cortex. Activations related to conscious change perception, independent of physical change, were found in the insula and the ACC. Moreover, our data provide evidence for significant effective connectivity between auditory cortex and the insula in the case of correctly detected auditory changes, but not for missed changes. This underlines the importance of the insula/anterior cingulate network for conscious change detection.
Size segregation in bedload sediment transport at the particle scale
NASA Astrophysics Data System (ADS)
Frey, P.; Martin, T.
2011-12-01
Bedload, the larger material that is transported in stream channels, has major consequences for the management of water resources, for environmental sustainability, and for flood alleviation. In mountains in particular, steep slopes drive intense transport of a wide range of grain sizes. Our ability to compute local and even bulk quantities such as the sediment flux in rivers is poor. One important reason is that grain-grain interactions in stream channels may have been neglected. Arguably the most important difficulty pertains to the very wide range of grain sizes, which leads to grain size sorting, or segregation. This phenomenon largely modifies fluxes and results in patterns that can be seen ubiquitously in nature such as armoring or downstream fining. Most studies have concerned the spontaneous percolation of fine grains into immobile gravels, because of implications for salmonid spawning beds or stratigraphical interpretation. However, when the substrate is moving, the segregation process is different, as statistically occurring void openings permit downward percolation of larger particles. This process, also named "kinetic sieving", has been studied in industrial contexts where segregation of granular or powder materials is often undesirable. We present an experimental study of two-size mixtures of coarse spherical glass beads entrained by a shallow turbulent and supercritical water flow down a steep channel with a mobile bed. The particle diameters were 4 and 6 mm, the channel width 6.5 mm, and the channel inclination ranged from 7.5 to 12.5%. The water flow rate and the particle rate were kept constant at the upstream entrance. First, only the coarser particles were fed in, at a rate adjusted to obtain bedload equilibrium, that is, neither bed degradation nor aggradation over sufficiently long time intervals. Then a low rate of smaller particles (about 1% of the total sediment rate) was introduced to study the spatial and temporal evolution of the segregating smaller particles. 
Flows were filmed from the side by a high-speed camera. Using image processing algorithms made it possible to determine the position, velocity and trajectory of both smaller and coarser particles. After a certain time, a quasi-continuous area of smaller beads developed under moving and above quasi-immobile coarser beads (see figure). Results include the time evolution of segregating smaller beads, assessment of percolation velocity and streamwise and vertical velocity depth profiles.
Subliminal Speech Perception and Auditory Streaming
ERIC Educational Resources Information Center
Dupoux, Emmanuel; de Gardelle, Vincent; Kouider, Sid
2008-01-01
Current theories of consciousness assume a qualitative dissociation between conscious and unconscious processing: while subliminal stimuli only elicit a transient activity, supraliminal stimuli have long-lasting influences. Nevertheless, the existence of this qualitative distinction remains controversial, as past studies confounded awareness and…
Ihlefeld, Antje; Litovsky, Ruth Y
2012-01-01
Spatial release from masking refers to a benefit for speech understanding that occurs when a target talker and a masker talker are spatially separated: speech intelligibility for the target is then typically higher than when both talkers are at the same location. In cochlear implant listeners, spatial release from masking is much reduced or absent compared with normal-hearing listeners, perhaps because cochlear implant listeners cannot effectively attend to spatial cues. Three experiments examined factors that may interfere with deploying spatial attention to a target talker masked by another talker. To simulate cochlear implant listening, stimuli were vocoded with two distinctive features. First, we used 50-Hz low-pass filtered speech envelopes and noise carriers, strongly reducing the possibility of temporal pitch cues; second, co-modulation was imposed on target and masker utterances to enhance perceptual fusion between the two sources. Stimuli were presented over headphones. Experiments 1 and 2 presented high-fidelity spatial cues with unprocessed and vocoded speech. Experiment 3 maintained faithful long-term average interaural level differences but presented scrambled interaural time differences with vocoded speech. Results show a robust spatial release from masking in Experiments 1 and 2, and a greatly reduced spatial release in Experiment 3. Faithful long-term average interaural level differences were thus insufficient for producing spatial release from masking, suggesting that appropriate interaural time differences are necessary for restoring it, at least in situations with few viable alternative segregation cues.
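The vocoding manipulation described (speech envelopes low-pass filtered at 50 Hz and imposed on noise carriers) can be sketched as follows; the band count, band edges, and filter orders are illustrative assumptions, not the study's exact parameters:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def vocode(signal, fs, n_bands=8, env_cutoff=50.0):
    """Noise-carrier envelope vocoder: split the input into log-spaced
    bands, extract each band's Hilbert envelope, low-pass it at
    `env_cutoff` Hz (discarding temporal pitch cues), and impose it on
    band-limited noise. Band layout is an illustrative assumption."""
    edges = np.logspace(np.log10(100.0), np.log10(6000.0), n_bands + 1)
    lp = butter(4, env_cutoff, btype="lowpass", fs=fs, output="sos")
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(signal))
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        bp = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(bp, signal)))   # band envelope
        env = sosfiltfilt(lp, env)                       # smooth to <= 50 Hz
        out += env * sosfiltfilt(bp, noise)              # modulate band noise
    return out

# Usage: vocode one second of a 440-Hz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
vocoded = vocode(np.sin(2 * np.pi * 440 * t), fs)
```

Second-order sections (`sos`) are used here because narrow low-frequency bandpass filters in transfer-function form can be numerically unstable.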
NASA Astrophysics Data System (ADS)
Misurelli, Sara M.
The ability to analyze an "auditory scene" (that is, to selectively attend to a target source while simultaneously segregating and ignoring distracting information) is one of the most important and complex skills utilized by normal-hearing (NH) adults. The NH adult auditory system and brain work rather well to segregate auditory sources in adverse environments. However, for some children and individuals with hearing loss, selectively attending to one source in noisy environments can be extremely challenging. In a normal auditory system, information arriving at each ear is integrated, and these binaural cues aid in speech understanding in noise. A growing number of individuals who are deaf now receive cochlear implants (CIs), which supply hearing through electrical stimulation of the auditory nerve. In particular, bilateral cochlear implants (BiCIs) are becoming more prevalent, especially in children. However, because CI sound processing lacks both fine-structure cues and coordination between stimulation at the two ears, binaural cues may be either absent or inconsistent. For children with NH and with BiCIs, this difficulty in segregating sources is of particular concern because their learning and development commonly occur within the context of complex auditory environments. This dissertation aims to explore and understand the ability of children with NH and with BiCIs to function in everyday noisy environments. The goals of this work are to (1) investigate source segregation abilities in children with NH and with BiCIs; (2) examine the effect of target-interferer similarity and the benefits of source segregation for children with NH and with BiCIs; (3) investigate measures of executive function that may predict performance in complex and realistic auditory tasks of source segregation for listeners with NH; and (4) examine source segregation abilities in NH listeners, from school-age to adults.
Activity in Human Auditory Cortex Represents Spatial Separation Between Concurrent Sounds.
Shiell, Martha M; Hausfeld, Lars; Formisano, Elia
2018-05-23
The primary and posterior auditory cortex (AC) are known for their sensitivity to spatial information, but how this information is processed is not yet understood. AC that is sensitive to spatial manipulations is also modulated by the number of auditory streams present in a scene (Smith et al., 2010), suggesting that spatial and nonspatial cues are integrated for stream segregation. We reasoned that, if this is the case, then it is the distance between sounds rather than their absolute positions that is essential. To test this hypothesis, we measured human brain activity in response to spatially separated concurrent sounds with fMRI at 7 tesla in five men and five women. Stimuli were spatialized amplitude-modulated broadband noises recorded for each participant via in-ear microphones before scanning. Using a linear support vector machine classifier, we investigated whether sound location and/or location plus spatial separation between sounds could be decoded from the activity in Heschl's gyrus and the planum temporale. The classifier was successful only when comparing patterns associated with the conditions that had the largest difference in perceptual spatial separation. Our pattern of results suggests that the representation of spatial separation is not merely the combination of single locations, but rather is an independent feature of the auditory scene. SIGNIFICANCE STATEMENT Often, when we think of auditory spatial information, we think of where sounds are coming from; that is, the process of localization. However, this information can also be used in scene analysis, the process of grouping and segregating features of a soundwave into objects. Essentially, when sounds are further apart, they are more likely to be segregated into separate streams.
Here, we provide evidence that activity in the human auditory cortex represents the spatial separation between sounds rather than their absolute locations, indicating that scene analysis and localization processes may be independent. Copyright © 2018 the authors 0270-6474/18/384977-08$15.00/0.
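The decoding step described above (a linear support vector machine classifier over voxel activity patterns, evaluated by cross-validation) can be illustrated with synthetic data; the trial counts, voxel counts, and effect size below are invented for illustration and are not taken from the study:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for voxel patterns: n_trials x n_voxels,
# two conditions differing in perceptual spatial separation.
rng = np.random.default_rng(42)
n_trials, n_voxels = 80, 200
X = rng.standard_normal((n_trials, n_voxels))
y = np.repeat([0, 1], n_trials // 2)   # 0 = small separation, 1 = large
X[y == 1, :20] += 0.8                  # weak condition-specific signal in 20 voxels

# Standardize features, then cross-validate a linear SVM, as is common
# in pattern-based fMRI analyses.
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean decoding accuracy: {scores.mean():.2f}")
```

Above-chance cross-validated accuracy is the evidence that the condition labels are represented in the activity patterns; at chance (0.5 here), the classifier fails, as it did for the small-separation contrasts in the study.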
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simpson, A. P.; Barber, S.; Abdurrahman, N. M.
2006-07-01
The Super High Efficiency Neutron Coincidence Counter (SuperHENC) was originally developed by BIL Solutions Inc., Los Alamos National Laboratory (LANL) and Rocky Flats Environmental Technology Site (RFETS) for assay of transuranic (TRU) waste in Standard Waste Boxes (SWB) at Rocky Flats. This mobile system was a key component in the shipment of over 4,000 SWBs to the Waste Isolation Pilot Plant (WIPP) in Carlsbad, New Mexico. The system was WIPP certified in 2001 and operated at the site for four years. The success of this system, a passive neutron coincidence counter combined with high-resolution gamma spectroscopy, led to the order of two new units, delivered to Hanford in 2004. Several new challenges were faced at Hanford. For example, the original RFETS system was calibrated for segregated waste streams such that metals, plastics, wet combustibles and dry combustibles were separated by 'Item Description Codes' prior to assay. Furthermore, the RFETS mission of handling only weapons-grade plutonium enabled the original SuperHENC to benefit from the use of known Pu isotopics. Operations at Hanford, as with most other DOE sites, generate un-segregated waste streams with a wide diversity of Pu isotopics. Consequently, the new SuperHENCs are required to deal with new technical challenges. The neutron system's software and calibration methodology have been modified to encompass these new requirements. In addition, PC-FRAM software has been added to the gamma system, providing a robust isotopic measurement capability. Finally, a new software package has been developed that integrates the neutron and gamma data to provide final assay results and an analysis report. The new system's performance has been rigorously tested and validated against WIPP quality requirements. These modifications, together with the mobile platform, make the new SuperHENC far more versatile in handling diverse waste streams and allow for rapid redeployment around the DOE complex.
Disbergen, Niels R.; Valente, Giancarlo; Formisano, Elia; Zatorre, Robert J.
2018-01-01
Polyphonic music listening well exemplifies processes typically involved in daily auditory scene analysis situations, relying on an interactive interplay between bottom-up and top-down processes. Most studies investigating scene analysis have used elementary auditory scenes, however real-world scene analysis is far more complex. In particular, music, contrary to most other natural auditory scenes, can be perceived by either integrating or, under attentive control, segregating sound streams, often carried by different instruments. One of the prominent bottom-up cues contributing to multi-instrument music perception is their timbre difference. In this work, we introduce and validate a novel paradigm designed to investigate, within naturalistic musical auditory scenes, attentive modulation as well as its interaction with bottom-up processes. Two psychophysical experiments are described, employing custom-composed two-voice polyphonic music pieces within a framework implementing a behavioral performance metric to validate listener instructions requiring either integration or segregation of scene elements. In Experiment 1, the listeners' locus of attention was switched between individual instruments or the aggregate (i.e., both instruments together), via a task requiring the detection of temporal modulations (i.e., triplets) incorporated within or across instruments. Subjects responded post-stimulus whether triplets were present in the to-be-attended instrument(s). Experiment 2 introduced the bottom-up manipulation by adding a three-level morphing of instrument timbre distance to the attentional framework. The task was designed to be used within neuroimaging paradigms; Experiment 2 was additionally validated behaviorally in the functional Magnetic Resonance Imaging (fMRI) environment. Experiment 1 subjects (N = 29, non-musicians) completed the task at high levels of accuracy, showing no group differences between any experimental conditions. 
Nineteen listeners also participated in Experiment 2, which showed a main effect of instrument timbre distance, even though timbre-distance contrasts within each attention condition did not demonstrate a timbre effect. Correlating overall scores with morph-distance effects, computed by subtracting the largest from the smallest timbre-distance scores, showed an influence of general task difficulty on the timbre-distance effect. Comparison of laboratory and fMRI data showed that scanner noise had no adverse effect on task performance. These experimental paradigms make it possible to study both bottom-up and top-down contributions to auditory stream segregation and integration within psychophysical and neuroimaging experiments. PMID:29563861
Talebi, Hossein; Moossavi, Abdollah; Faghihzadeh, Soghrat
2014-01-01
Background: Older adults with cerebrovascular accident (CVA) show evidence of auditory and speech perception problems. The present study examined whether these problems are due to impairments of concurrent auditory segregation, the basic level of auditory scene analysis and auditory organization in auditory scenes with competing sounds. Methods: Concurrent auditory segregation, measured with the competing sentence test (CST) and the dichotic digits test (DDT), was assessed and compared in 30 male older adults (15 normal and 15 cases with right-hemisphere CVA) in the same age group (60-75 years old). For the CST, participants were presented with a target message in one ear and a competing message in the other. The task was to listen to the target sentence and repeat it back without attending to the competing sentence. For the DDT, the auditory stimuli were monosyllabic digits presented dichotically, and the task was to repeat them. Results: Comparing the mean CST and DDT scores of CVA patients with right-hemisphere impairment against those of normal participants showed statistically significant differences (p=0.001 for CST and p<0.0001 for DDT). Conclusion: The present study revealed that the abnormal CST and DDT scores of participants with right-hemisphere CVA could be related to concurrent segregation difficulties. These findings suggest that low-level segregation mechanisms and/or high-level attention mechanisms might contribute to the problems. PMID:25679009
Skipper, Jeremy I; Devlin, Joseph T; Lametti, Daniel R
2017-01-01
Does "the motor system" play "a role" in speech perception? If so, where, how, and when? We conducted a systematic review that addresses these questions using both qualitative and quantitative methods. The qualitative review of behavioural, computational modelling, non-human animal, brain damage/disorder, electrical stimulation/recording, and neuroimaging research suggests that distributed brain regions involved in producing speech play specific, dynamic, and contextually determined roles in speech perception. The quantitative review employed region and network based neuroimaging meta-analyses and a novel text mining method to describe relative contributions of nodes in distributed brain networks. Supporting the qualitative review, results show a specific functional correspondence between regions involved in non-linguistic movement of the articulators, covertly and overtly producing speech, and the perception of both nonword and word sounds. This distributed set of cortical and subcortical speech production regions are ubiquitously active and form multiple networks whose topologies dynamically change with listening context. Results are inconsistent with motor and acoustic only models of speech perception and classical and contemporary dual-stream models of the organization of language and the brain. Instead, results are more consistent with complex network models in which multiple speech production related networks and subnetworks dynamically self-organize to constrain interpretation of indeterminant acoustic patterns as listening context requires. Copyright © 2016. Published by Elsevier Inc.
Prediction and constraint in audiovisual speech perception.
Peelle, Jonathan E; Sommers, Mitchell S
2015-07-01
During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. 
Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported by distinct neuroanatomical mechanisms. Copyright © 2015 Elsevier Ltd. All rights reserved.
Is Statistical Learning Constrained by Lower Level Perceptual Organization?
Emberson, Lauren L.; Liu, Ran; Zevin, Jason D.
2013-01-01
In order for statistical information to aid in complex developmental processes such as language acquisition, learning from higher-order statistics (e.g. across successive syllables in a speech stream to support segmentation) must be possible while perceptual abilities (e.g. speech categorization) are still developing. The current study examines how perceptual organization interacts with statistical learning. Adult participants were presented with multiple exemplars from novel, complex sound categories designed to reflect some of the spectral complexity and variability of speech. These categories were organized into sequential pairs and presented such that higher-order statistics, defined based on sound categories, could support stream segmentation. Perceptual similarity judgments and multi-dimensional scaling revealed that participants only perceived three perceptual clusters of sounds and thus did not distinguish the four experimenter-defined categories, creating a tension between lower level perceptual organization and higher-order statistical information. We examined whether the resulting pattern of learning is more consistent with statistical learning being “bottom-up,” constrained by the lower levels of organization, or “top-down,” such that higher-order statistical information of the stimulus stream takes priority over the perceptual organization, and perhaps influences perceptual organization. We consistently find evidence that learning is constrained by perceptual organization. Moreover, participants generalize their learning to novel sounds that occupy a similar perceptual space, suggesting that statistical learning occurs based on regions of or clusters in perceptual space. Overall, these results reveal a constraint on learning of sound sequences, such that statistical information is determined based on lower level organization. These findings have important implications for the role of statistical learning in language acquisition. PMID:23618755
Musical melody and speech intonation: singing a different tune.
Zatorre, Robert J; Baum, Shari R
2012-01-01
Music and speech are often cited as characteristically human forms of communication. Both share the features of hierarchical structure, complex sound systems, and sensorimotor sequencing demands, and both are used to convey and influence emotions, among other functions [1]. Both music and speech also prominently use acoustical frequency modulations, perceived as variations in pitch, as part of their communicative repertoire. Given these similarities, and the fact that pitch perception and production involve the same peripheral transduction system (cochlea) and the same production mechanism (vocal tract), it might be natural to assume that pitch processing in speech and music would also depend on the same underlying cognitive and neural mechanisms. In this essay we argue that the processing of pitch information differs significantly for speech and music; specifically, we suggest that there are two pitch-related processing systems, one for more coarse-grained, approximate analysis and one for more fine-grained accurate representation, and that the latter is unique to music. More broadly, this dissociation offers clues about the interface between sensory and motor systems, and highlights the idea that multiple processing streams are a ubiquitous feature of neuro-cognitive architectures.
Masson-Carro, Ingrid; Goudbeek, Martijn; Krahmer, Emiel
2016-10-01
Past research has sought to elucidate how speakers and addressees establish common ground in conversation, yet few studies have focused on how visual cues such as co-speech gestures contribute to this process. Likewise, the effect of cognitive constraints on multimodal grounding remains to be established. This study addresses the relationship between the verbal and gestural modalities during grounding in referential communication. We report data from a collaborative task where repeated references were elicited, and a time constraint was imposed to increase cognitive load. Our results reveal no differential effects of repetition or cognitive load on the semantic-based gesture rate, suggesting that representational gestures and speech are closely coordinated during grounding. However, gestures and speech differed in their execution, especially under time pressure. We argue that speech and gesture are two complementary streams that might be planned in conjunction but that unfold independently in later stages of language production, with speakers emphasizing the form of their gestures, but not of their words, to better meet the goals of the collaborative task. Copyright © 2016 Cognitive Science Society, Inc.
Dynamic speech representations in the human temporal lobe.
Leonard, Matthew K; Chang, Edward F
2014-09-01
Speech perception requires rapid integration of acoustic input with context-dependent knowledge. Recent methodological advances have allowed researchers to identify underlying information representations in primary and secondary auditory cortex and to examine how context modulates these representations. We review recent studies that focus on contextual modulations of neural activity in the superior temporal gyrus (STG), a major hub for spectrotemporal encoding. Recent findings suggest a highly interactive flow of information processing through the auditory ventral stream, including influences of higher-level linguistic and metalinguistic knowledge, even within individual areas. Such mechanisms may give rise to more abstract representations, such as those for words. We discuss the importance of characterizing representations of context-dependent and dynamic patterns of neural activity in the approach to speech perception research. Copyright © 2014 Elsevier Ltd. All rights reserved.
Using Predictability for Lexical Segmentation
ERIC Educational Resources Information Center
Çöltekin, Çagri
2017-01-01
This study investigates a strategy based on predictability of consecutive sub-lexical units in learning to segment a continuous speech stream into lexical units using computational modeling and simulations. Lexical segmentation is one of the early challenges during language acquisition, and it has been studied extensively through psycholinguistic…
The neural correlates of statistical learning in a word segmentation task: An fMRI study
Karuza, Elisabeth A.; Newport, Elissa L.; Aslin, Richard N.; Starling, Sarah J.; Tivarus, Madalina E.; Bavelier, Daphne
2013-01-01
Functional magnetic resonance imaging (fMRI) was used to assess neural activation as participants learned to segment continuous streams of speech containing syllable sequences varying in their transitional probabilities. Speech streams were presented in four runs, each followed by a behavioral test to measure the extent of learning over time. Behavioral performance indicated that participants could discriminate statistically coherent sequences (words) from less coherent sequences (partwords). Individual rates of learning, defined as the difference in ratings for words and partwords, were used as predictors of neural activation to ask which brain areas showed activity associated with these measures. Results showed significant activity in the pars opercularis and pars triangularis regions of the left inferior frontal gyrus (LIFG). The relationship between these findings and prior work on the neural basis of statistical learning is discussed, and parallels to the frontal/subcortical network involved in other forms of implicit sequence learning are considered. PMID:23312790
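The transitional probabilities underlying such word-segmentation streams admit a compact sketch: TP(B|A) is high within a word and dips at word boundaries. The syllable inventory, word forms, and boundary threshold below are illustrative, not the study's stimuli:

```python
from collections import Counter

def transitional_probabilities(stream):
    """TP(B | A) = count(A followed by B) / count(A)."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

def segment(stream, tps, threshold=0.8):
    """Posit a word boundary wherever the forward TP dips below threshold."""
    words, current = [], [stream[0]]
    for a, b in zip(stream, stream[1:]):
        if tps[(a, b)] < threshold:
            words.append(current)
            current = []
        current.append(b)
    words.append(current)
    return words

# Two hypothetical trisyllabic "words" concatenated without pauses:
# within-word TPs are 1.0, between-word TPs are lower.
order = ["tupiro", "golabu", "tupiro", "tupiro", "golabu", "golabu", "tupiro"]
stream = [w[i:i + 2] for w in order for i in range(0, 6, 2)]
tps = transitional_probabilities(stream)
print(segment(stream, tps))  # recovers ['tu','pi','ro'] and ['go','la','bu'] chunks
```

A learner's "rate of learning" in such designs is then the behavioral advantage for these statistically coherent words over part-words that straddle a boundary.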
Words and possible words in early language acquisition.
Marchetto, Erika; Bonatti, Luca L
2013-11-01
In order to acquire language, infants must extract its building blocks (words) and master the rules governing their legal combinations from speech. These two problems are not independent, however: words also have internal structure. Thus, infants must extract two kinds of information from the same speech input. They must find the actual words of their language. Furthermore, they must identify its possible words, that is, the sequences of sounds that, being morphologically well formed, could be words. Here, we show that infants' sensitivity to possible words appears to be more primitive and fundamental than their ability to find actual words. We expose 12- and 18-month-old infants to an artificial language containing a conflict between statistically coherent and structurally coherent items. We show that 18-month-olds can extract possible words when the familiarization stream contains marks of segmentation, but cannot do so when the stream is continuous. Yet, they can find actual words from a continuous stream by computing statistical relationships among syllables. By contrast, 12-month-olds can find possible words when familiarized with a segmented stream, but seem unable to extract statistically coherent items from a continuous stream that contains minimal conflicts between statistical and structural information. These results suggest that sensitivity to word structure is in place earlier than the ability to analyze distributional information. The ability to compute nontrivial statistical relationships becomes fully effective relatively late in development, when infants have already acquired a considerable amount of linguistic knowledge. Thus, mechanisms for structure extraction that do not rely on extensive sampling of the input are likely to have a much larger role in language acquisition than general-purpose statistical abilities. Copyright © 2013. Published by Elsevier Inc.
Ding, Nai; Pan, Xunyi; Luo, Cheng; Su, Naifei; Zhang, Wen; Zhang, Jianfeng
2018-01-31
How the brain groups sequential sensory events into chunks is a fundamental question in cognitive neuroscience. This study investigates whether top-down attention or specific tasks are required for the brain to apply lexical knowledge to group syllables into words. Neural responses tracking the syllabic and word rhythms of a rhythmic speech sequence were concurrently monitored using electroencephalography (EEG). The participants performed different tasks, attending to either the rhythmic speech sequence or a distractor, which was another speech stream or a nonlinguistic auditory/visual stimulus. Attention to speech, but not a lexical-meaning-related task, was required for reliable neural tracking of words, even when the distractor was a nonlinguistic stimulus presented cross-modally. Neural tracking of syllables, however, was reliably observed in all tested conditions. These results strongly suggest that neural encoding of individual auditory events (i.e., syllables) is automatic, while knowledge-based construction of temporal chunks (i.e., words) crucially relies on top-down attention. SIGNIFICANCE STATEMENT Why we cannot understand speech when not paying attention is an old question in psychology and cognitive neuroscience. Speech processing is a complex process that involves multiple stages, e.g., hearing and analyzing the speech sound, recognizing words, and combining words into phrases and sentences. The current study investigates which speech-processing stage is blocked when we do not listen carefully. We show that the brain can reliably encode syllables, basic units of speech sounds, even when we do not pay attention. Nevertheless, when distracted, the brain cannot group syllables into multisyllabic words, which are basic units for speech meaning. Therefore, the process of converting speech sound into meaning crucially relies on attention. Copyright © 2018 the authors 0270-6474/18/381178-11$15.00/0.
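The frequency-tagging logic behind concurrently monitoring syllable- and word-rate tracking can be illustrated with a simulated response; the rates (4-Hz syllables grouped into 2-Hz two-syllable words), amplitudes, and noise level are assumptions for this sketch, not the study's parameters:

```python
import numpy as np

# Simulated neural response tracking a 4-Hz syllable rhythm and a weaker
# 2-Hz word rhythm, buried in broadband noise.
fs, dur = 128, 50                              # sample rate (Hz), seconds
t = np.arange(fs * dur) / fs
rng = np.random.default_rng(0)
response = (np.sin(2 * np.pi * 4 * t)          # syllable-rate tracking
            + 0.5 * np.sin(2 * np.pi * 2 * t)  # word-rate tracking
            + rng.standard_normal(t.size))     # background noise

# Frequency tagging: each tracked rhythm appears as a spectral peak.
amplitude = np.abs(np.fft.rfft(response)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)
peaks = [float(f) for f in sorted(freqs[np.argsort(amplitude)[-2:]])]
print(peaks)  # → [2.0, 4.0]: both rhythms are recoverable from the spectrum
```

In the attentional manipulation, the logic is that the 2-Hz (word-level) peak disappears when attention is withdrawn, while the 4-Hz (syllable-level) peak survives.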
Improving Acoustic Models by Watching Television
NASA Technical Reports Server (NTRS)
Witbrock, Michael J.; Hauptmann, Alexander G.
1998-01-01
Obtaining sufficient labelled training data is a persistent difficulty for speech recognition research. Although well transcribed data is expensive to produce, there is a constant stream of challenging speech data and poor transcription broadcast as closed-captioned television. We describe a reliable unsupervised method for identifying accurately transcribed sections of these broadcasts, and show how these segments can be used to train a recognition system. Starting from acoustic models trained on the Wall Street Journal database, a single iteration of our training method reduced the word error rate on an independent broadcast television news test set from 62.2% to 59.5%.
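The word error rates quoted (62.2% reduced to 59.5%) follow the standard definition: the minimum number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the reference length. A minimal sketch with an invented caption-style example:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with standard Levenshtein dynamic programming over words."""
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                               # delete all remaining words
    for j in range(len(hyp) + 1):
        d[0][j] = j                               # insert all remaining words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution / match
    return d[-1][-1] / len(ref)

# One substitution plus one deletion over a 5-word reference:
print(word_error_rate("the news at nine tonight", "the noose at nine"))  # → 0.4
```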
Yoo, Sejin; Chung, Jun-Young; Jeon, Hyeon-Ae; Lee, Kyoung-Min; Kim, Young-Bo; Cho, Zang-Hee
2012-07-01
Speech production is inextricably linked to speech perception, yet they are usually investigated in isolation. In this study, we employed a verbal-repetition task to identify the neural substrates of speech processing with two ends active simultaneously using functional MRI. Subjects verbally repeated auditory stimuli containing an ambiguous vowel sound that could be perceived as either a word or a pseudoword depending on the interpretation of the vowel. We found verbal repetition commonly activated the audition-articulation interface bilaterally at Sylvian fissures and superior temporal sulci. Contrasting word-versus-pseudoword trials revealed neural activities unique to word repetition in the left posterior middle temporal areas and activities unique to pseudoword repetition in the left inferior frontal gyrus. These findings imply that the tasks are carried out using different speech codes: an articulation-based code of pseudowords and an acoustic-phonetic code of words. It also supports the dual-stream model and imitative learning of vocabulary. Copyright © 2012 Elsevier Inc. All rights reserved.
Investigating Joint Attention Mechanisms through Spoken Human-Robot Interaction
ERIC Educational Resources Information Center
Staudte, Maria; Crocker, Matthew W.
2011-01-01
Referential gaze during situated language production and comprehension is tightly coupled with the unfolding speech stream (Griffin, 2001; Meyer, Sleiderink, & Levelt, 1998; Tanenhaus, Spivey-Knowlton, Eberhard, & Sedivy, 1995). In a shared environment, utterance comprehension may further be facilitated when the listener can exploit the speaker's…
Developmental changes in sensitivity to vocal paralanguage
Friend, Margaret
2017-01-01
Developmental changes in children’s sensitivity to the role of acoustic variation in the speech stream in conveying speaker affect (vocal paralanguage) were examined. Four-, 7- and 10-year-olds heard utterances in three formats: low-pass filtered, reiterant, and normal speech. The availability of lexical and paralinguistic information varied across these three formats in a way that required children to base their judgments of speaker affect on different configurations of cues in each format. Across ages, the best performance was obtained when a rich array of acoustic cues was present and when there was no competing lexical information. Four-year-olds performed at chance when judgments had to be based solely on speech prosody in the filtered format and they were unable to selectively attend to paralanguage when discrepant lexical cues were present in normal speech. Seven-year-olds were significantly more sensitive to the paralinguistic role of speech prosody in filtered speech than were 4-year-olds and there was a trend toward greater attention to paralanguage when lexical and paralinguistic cues were inconsistent in normal speech. An integration of the ability to utilize prosodic cues to speaker affect with attention to paralanguage in cases of lexical/paralinguistic discrepancy was observed for 10-year-olds. The results are discussed in terms of the development of a perceptual bias emerging out of selective attention to language. PMID:28713218
Lee, Yune-Sang; Turkeltaub, Peter; Granger, Richard; Raizada, Rajeev D S
2012-03-14
Although much effort has been directed toward understanding the neural basis of speech processing, the neural processes involved in the categorical perception of speech have been relatively less studied, and many questions remain open. In this functional magnetic resonance imaging (fMRI) study, we probed the cortical regions mediating categorical speech perception using an advanced brain-mapping technique, whole-brain multivariate pattern-based analysis (MVPA). Normal healthy human subjects (native English speakers) were scanned while they listened to 10 consonant-vowel syllables along the /ba/-/da/ continuum. Outside of the scanner, individuals' own category boundaries were measured to divide the fMRI data into /ba/ and /da/ conditions per subject. The whole-brain MVPA revealed that Broca's area and the left pre-supplementary motor area evoked distinct neural activity patterns between the two perceptual categories (/ba/ vs /da/). Broca's area was also found when the same analysis was applied to another dataset (Raizada and Poldrack, 2007), which previously yielded the supramarginal gyrus using a univariate adaptation-fMRI paradigm. The consistent MVPA findings from two independent datasets strongly indicate that Broca's area participates in categorical speech perception, with a possible role of translating speech signals into articulatory codes. The difference in results between univariate and multivariate pattern-based analyses of the same data suggests that processes in different cortical areas along the dorsal speech perception stream are distributed on different spatial scales.
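The MVPA approach described above decodes perceptual categories from distributed activity patterns rather than from single voxels. As a rough illustration of the core idea only (not the authors' pipeline), the sketch below cross-validates a simple nearest-centroid decoder on synthetic trial-by-voxel patterns; all data, dimensions, and the classifier choice are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for single-trial voxel patterns (trials x voxels);
# a small mean shift separates the two perceptual categories (/ba/ vs /da/).
n_trials, n_voxels = 40, 50
ba = rng.normal(0.0, 1.0, (n_trials, n_voxels))
da = rng.normal(0.5, 1.0, (n_trials, n_voxels))
X = np.vstack([ba, da])
y = np.array([0] * n_trials + [1] * n_trials)

def nearest_centroid_cv(X, y, n_folds=5):
    """Cross-validated nearest-centroid decoding accuracy."""
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    correct = 0
    for test in folds:
        train = np.setdiff1d(idx, test)
        c0 = X[train][y[train] == 0].mean(axis=0)  # class-0 centroid
        c1 = X[train][y[train] == 1].mean(axis=0)  # class-1 centroid
        d0 = np.linalg.norm(X[test] - c0, axis=1)
        d1 = np.linalg.norm(X[test] - c1, axis=1)
        correct += np.sum((d1 < d0) == (y[test] == 1))
    return correct / len(y)

acc = nearest_centroid_cv(X, y)
print(f"decoding accuracy: {acc:.2f}")
```

Accuracy reliably above the 0.5 chance level on held-out trials is the MVPA-style evidence that the pattern, taken as a whole, carries category information.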
The functional neuroanatomy of language
NASA Astrophysics Data System (ADS)
Hickok, Gregory
2009-09-01
There has been substantial progress over the last several years in understanding aspects of the functional neuroanatomy of language. Some of these advances are summarized in this review. It will be argued that recognizing speech sounds is carried out in the superior temporal lobe bilaterally, that the superior temporal sulcus bilaterally is involved in phonological-level aspects of this process, that the frontal/motor system is not central to speech recognition although it may modulate auditory perception of speech, that conceptual access mechanisms are likely located in the lateral posterior temporal lobe (middle and inferior temporal gyri), that speech production involves sensory-related systems in the posterior superior temporal lobe in the left hemisphere, that the interface between perceptual and motor systems is supported by a sensory-motor circuit for vocal tract actions (not dedicated to speech) that is very similar to sensory-motor circuits found in primate parietal lobe, and that verbal short-term memory can be understood as an emergent property of this sensory-motor circuit. These observations are considered within the context of a dual stream model of speech processing in which one pathway supports speech comprehension and the other supports sensory-motor integration. Additional topics of discussion include the functional organization of the planum temporale for spatial hearing and speech-related sensory-motor processes, the anatomical and functional basis of a form of acquired language disorder, conduction aphasia, the neural basis of vocabulary development, and sentence-level/grammatical processing.
Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.
Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth
2017-08-09
Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. 
Using magnetoencephalography, we demonstrate anatomically distinct cortical representations of modulated noise in normal-hearing and hearing-impaired listeners. This work provides the first link among hearing thresholds, the amplitude of cortical representations of modulated sounds, and the ability to understand speech in modulated background noise. In light of previous work, we propose that magnified cortical representations of modulated sounds disrupt the separation of speech from modulated background noise in auditory cortex. Copyright © 2017 Millman et al.
Integration and segregation in auditory streaming
NASA Astrophysics Data System (ADS)
Almonte, Felix; Jirsa, Viktor K.; Large, Edward W.; Tuller, Betty
2005-12-01
We aim to capture the perceptual dynamics of auditory streaming using a neurally inspired model of auditory processing. Traditional approaches view streaming as a competition of streams, realized within a tonotopically organized neural network. In contrast, we view streaming to be a dynamic integration process which resides at locations other than the sensory specific neural subsystems. This process finds its realization in the synchronization of neural ensembles or in the existence of informational convergence zones. Our approach uses two interacting dynamical systems, in which the first system responds to incoming acoustic stimuli and transforms them into a spatiotemporal neural field dynamics. The second system is a classification system coupled to the neural field and evolves to a stationary state. These states are identified with a single perceptual stream or multiple streams. Several results in human perception are modelled including temporal coherence and fission boundaries [L.P.A.S. van Noorden, Temporal coherence in the perception of tone sequences, Ph.D. Thesis, Eindhoven University of Technology, The Netherlands, 1975], and crossing of motions [A.S. Bregman, Auditory Scene Analysis: The Perceptual Organization of Sound, MIT Press, 1990]. Our model predicts phenomena such as the existence of two streams with the same pitch, which cannot be explained by the traditional stream competition models. An experimental study is performed to provide proof of existence of this phenomenon. The model elucidates possible mechanisms that may underlie perceptual phenomena.
Ambient groundwater flow diminishes nitrogen cycling in streams
NASA Astrophysics Data System (ADS)
Azizian, M.; Grant, S. B.; Rippy, M.; Detwiler, R. L.; Boano, F.; Cook, P. L. M.
2017-12-01
Modeling and experimental studies demonstrate that ambient groundwater reduces hyporheic exchange, but the implications of this observation for stream N-cycling are not yet clear. We utilized a simple process-based model (the Pumping and Streamline Segregation or PASS model) to evaluate N-cycling over two scales of hyporheic exchange (fluvial ripples and riffle-pool sequences), ten ambient groundwater and stream flow scenarios (five gaining and losing conditions and two stream discharges), and three biogeochemical settings (identified based on a principal component analysis of previously published measurements in streams throughout the United States). Model-data comparisons indicate that our model provides realistic estimates for direct denitrification of stream nitrate, but overpredicts nitrification and coupled nitrification-denitrification. Riffle-pool sequences are responsible for most of the N-processing, despite the fact that fluvial ripples generate 3-11 times more hyporheic exchange flux. Across all scenarios, hyporheic exchange flux and the Damkohler Number emerge as primary controls on stream N-cycling; the former regulates trafficking of nutrients and oxygen across the sediment-water interface, while the latter quantifies the relative rates of organic carbon mineralization and advective transport in streambed sediments. Vertical groundwater flux modulates both of these master variables in ways that tend to diminish stream N-cycling. Thus, anthropogenic perturbations of ambient groundwater flows (e.g., by urbanization, agricultural activities, groundwater mining, and/or climate change) may compromise some of the key ecosystem services provided by streams.
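The Damkohler number named in the abstract is a dimensionless ratio of reaction rate to transport rate; here it compares organic carbon mineralization to advective transit through the streambed. A minimal sketch, with entirely hypothetical parameter values (the abstract gives none):

```python
# Damkohler number for hyporheic sediments: Da = t_residence * k, i.e. the
# hyporheic residence time divided by the reaction timescale 1/k.
# Da >> 1: solutes react before exiting the bed (transport-limited);
# Da << 1: little processing occurs during transit (reaction-limited).
# All numbers below are illustrative assumptions, not values from the study.

def damkohler(residence_time_s: float, reaction_rate_per_s: float) -> float:
    """Da = t_residence / t_reaction = t_residence * k."""
    return residence_time_s * reaction_rate_per_s

k = 1e-5  # hypothetical first-order carbon mineralization rate (1/s)
ripple_da = damkohler(residence_time_s=3.6e3, reaction_rate_per_s=k)   # ~1 h transit
riffle_da = damkohler(residence_time_s=8.64e4, reaction_rate_per_s=k)  # ~1 day transit

print(f"ripple Da = {ripple_da:.3f}, riffle-pool Da = {riffle_da:.3f}")
```

Under these assumed numbers, the longer residence times of riffle-pool flow paths yield a Da closer to 1, consistent with the abstract's finding that riffle-pool sequences dominate N-processing despite moving less water.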
Spatiotemporal imaging of cortical activation during verb generation and picture naming.
Edwards, Erik; Nagarajan, Srikantan S; Dalal, Sarang S; Canolty, Ryan T; Kirsch, Heidi E; Barbaro, Nicholas M; Knight, Robert T
2010-03-01
One hundred and fifty years of neurolinguistic research has identified the key structures in the human brain that support language. However, neither the classic neuropsychological approaches introduced by Broca (1861) and Wernicke (1874), nor modern neuroimaging employing PET and fMRI has been able to delineate the temporal flow of language processing in the human brain. We recorded the electrocorticogram (ECoG) from indwelling electrodes over left hemisphere language cortices during two common language tasks, verb generation and picture naming. We observed that the very high frequencies of the ECoG (high-gamma, 70-160 Hz) track language processing with spatial and temporal precision. Serial progression of activations is seen at a larger timescale, showing distinct stages of perception, semantic association/selection, and speech production. Within the areas supporting each of these larger processing stages, parallel (or "incremental") processing is observed. In addition to the traditional posterior vs. anterior localization for speech perception vs. production, we provide novel evidence for the role of premotor cortex in speech perception and of Wernicke's and surrounding cortex in speech production. The data are discussed with regards to current leading models of speech perception and production, and a "dual ventral stream" hybrid of leading speech perception models is given. Copyright (c) 2009 Elsevier Inc. All rights reserved.
Auditory stream segregation with multi-tonal complexes in hearing-impaired listeners
NASA Astrophysics Data System (ADS)
Rogers, Deanna S.; Lentz, Jennifer J.
2004-05-01
The ability to segregate sounds into different streams was investigated in normally hearing and hearing-impaired listeners. Fusion and fission boundaries were measured using 6-tone complexes with tones equally spaced in log frequency. An ABA-ABA- sequence was used in which A represents a multitone complex ranging from either 250-1000 Hz (low-frequency region) or 1000-4000 Hz (high-frequency region). B also represents a multitone complex with the same log spacing as A. Multitonal complexes were 100 ms in duration with 20-ms ramps, and "-" represents a silent interval of 100 ms. To measure the fusion boundary, the first tone of the B stimulus was either 375 Hz (low) or 1500 Hz (high) and shifted downward in frequency with each progressive ABA triplet until the listener pressed a button indicating that a "galloping" rhythm was heard. When measuring the fission boundary, the first tone of the B stimulus was 252 or 1030 Hz and shifted upward with each triplet; listeners then pressed a button when the "galloping" rhythm ended. Data suggest that hearing-impaired subjects have different fission and fusion boundaries than normal-hearing listeners. These data will be discussed in terms of both peripheral and central factors.
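The stimulus parameters in this abstract are concrete enough to sketch in code: 6-tone complexes equally spaced in log frequency, 100-ms tones with 20-ms ramps, and an ABA- pattern with a 100-ms silent interval. The sample rate and ramp shape below are assumptions (the abstract does not specify them):

```python
import numpy as np

FS = 44100  # sample rate in Hz; an assumption, not given in the abstract

def complex_tone(f_lo, f_hi, n_tones=6, dur=0.100, ramp=0.020, fs=FS):
    """6-tone complex, components equally spaced in log frequency,
    100-ms duration, 20-ms raised-cosine on/off ramps (assumed shape)."""
    t = np.arange(int(dur * fs)) / fs
    freqs = np.geomspace(f_lo, f_hi, n_tones)  # equal log spacing
    x = sum(np.sin(2 * np.pi * f * t) for f in freqs) / n_tones
    n_ramp = int(ramp * fs)
    env = np.ones_like(x)
    rise = 0.5 * (1 - np.cos(np.pi * np.arange(n_ramp) / n_ramp))
    env[:n_ramp] = rise
    env[-n_ramp:] = rise[::-1]
    return x * env

def aba_sequence(a, b, n_triplets=4, gap=0.100, fs=FS):
    """ABA- pattern: A B A followed by a 100-ms silent interval."""
    silence = np.zeros(int(gap * fs))
    return np.tile(np.concatenate([a, b, a, silence]), n_triplets)

A = complex_tone(250.0, 1000.0)  # low-frequency-region A complex
B = complex_tone(375.0, 1500.0)  # starting B complex for the fusion track
seq = aba_sequence(A, B)         # 4 triplets x 0.4 s = 1.6 s
```

In the actual procedure, the B complex would be re-synthesized with a shifted base frequency on each successive triplet; the sketch shows a single step of that track.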
Seeing the Song: Left Auditory Structures May Track Auditory-Visual Dynamic Alignment
Mossbridge, Julia A.; Grabowecky, Marcia; Suzuki, Satoru
2013-01-01
Auditory and visual signals generated by a single source tend to be temporally correlated, such as the synchronous sounds of footsteps and the limb movements of a walker. Continuous tracking and comparison of the dynamics of auditory-visual streams is thus useful for the perceptual binding of information arising from a common source. Although language-related mechanisms have been implicated in the tracking of speech-related auditory-visual signals (e.g., speech sounds and lip movements), it is not well known what sensory mechanisms generally track ongoing auditory-visual synchrony for non-speech signals in a complex auditory-visual environment. To begin to address this question, we used music and visual displays that varied in the dynamics of multiple features (e.g., auditory loudness and pitch; visual luminance, color, size, motion, and organization) across multiple time scales. Auditory activity (monitored using auditory steady-state responses, ASSR) was selectively reduced in the left hemisphere when the music and dynamic visual displays were temporally misaligned. Importantly, ASSR was not affected when attentional engagement with the music was reduced, or when visual displays presented dynamics clearly dissimilar to the music. These results appear to suggest that left-lateralized auditory mechanisms are sensitive to auditory-visual temporal alignment, but perhaps only when the dynamics of auditory and visual streams are similar. These mechanisms may contribute to correct auditory-visual binding in a busy sensory environment. PMID:24194873
Hertrich, Ingo; Dietrich, Susanne; Ackermann, Hermann
2011-01-01
During speech communication, visual information may interact with the auditory system at various processing stages. Most noteworthy, recent magnetoencephalography (MEG) data provided first evidence for early and preattentive phonetic/phonological encoding of the visual data stream--prior to its fusion with auditory phonological features [Hertrich, I., Mathiak, K., Lutzenberger, W., & Ackermann, H. Time course of early audiovisual interactions during speech and non-speech central-auditory processing: An MEG study. Journal of Cognitive Neuroscience, 21, 259-274, 2009]. Using functional magnetic resonance imaging, the present follow-up study aims to further elucidate the topographic distribution of visual-phonological operations and audiovisual (AV) interactions during speech perception. Ambiguous acoustic syllables--disambiguated to /pa/ or /ta/ by the visual channel (speaking face)--served as test materials, concomitant with various control conditions (nonspeech AV signals, visual-only and acoustic-only speech, and nonspeech stimuli). (i) Visual speech yielded an AV-subadditive activation of primary auditory cortex and the anterior superior temporal gyrus (STG), whereas the posterior STG responded both to speech and nonspeech motion. (ii) The inferior frontal and the fusiform gyrus of the right hemisphere showed a strong phonetic/phonological impact (differential effects of visual /pa/ vs. /ta/) upon hemodynamic activation during presentation of speaking faces. Taken together with the previous MEG data, these results point at a dual-pathway model of visual speech information processing: On the one hand, access to the auditory system via the anterior supratemporal "what" path may give rise to direct activation of "auditory objects." On the other hand, visual speech information seems to be represented in a right-hemisphere visual working memory, providing a potential basis for later interactions with auditory information such as the McGurk effect.
Recognizing speech in a novel accent: the motor theory of speech perception reframed.
Moulin-Frier, Clément; Arbib, Michael A
2013-08-01
The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisit claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
Technological, biological, and acoustical constraints to music perception in cochlear implant users.
Limb, Charles J; Roy, Alexis T
2014-02-01
Despite advances in technology, the ability to perceive music remains limited for many cochlear implant users. This paper reviews the technological, biological, and acoustical constraints that make music an especially challenging stimulus for cochlear implant users, while highlighting recent research efforts to overcome these shortcomings. The limitations of cochlear implant devices, which have been optimized for speech comprehension, become evident when applied to music, particularly with regards to inadequate spectral, fine-temporal, and dynamic range representation. Beyond the impoverished information transmitted by the device itself, both peripheral and central auditory nervous system deficits are seen in the presence of sensorineural hearing loss, such as auditory nerve degeneration and abnormal auditory cortex activation. These technological and biological constraints to effective music perception are further compounded by the complexity of the acoustical features of music itself that require the perceptual integration of varying rhythmic, melodic, harmonic, and timbral elements of sound. Cochlear implant users not only have difficulty perceiving spectral components individually (leading to fundamental disruptions in perception of pitch, melody, and harmony) but also display deficits with higher perceptual integration tasks required for music perception, such as auditory stream segregation. Despite these current limitations, focused musical training programs, new assessment methods, and improvements in the representation and transmission of the complex acoustical features of music through technological innovation offer the potential for significant advancements in cochlear implant-mediated music perception. Copyright © 2013 Elsevier B.V. All rights reserved.
Neural time course of visually enhanced echo suppression.
Bishop, Christopher W; London, Sam; Miller, Lee M
2012-10-01
Auditory spatial perception plays a critical role in day-to-day communication. For instance, listeners utilize acoustic spatial information to segregate individual talkers into distinct auditory "streams" to improve speech intelligibility. However, spatial localization is an exceedingly difficult task in everyday listening environments with numerous distracting echoes from nearby surfaces, such as walls. Listeners' brains overcome this unique challenge by relying on acoustic timing and, quite surprisingly, visual spatial information to suppress short-latency (1-10 ms) echoes through a process known as "the precedence effect" or "echo suppression." In the present study, we employed electroencephalography (EEG) to investigate the neural time course of echo suppression both with and without the aid of coincident visual stimulation in human listeners. We find that echo suppression is a multistage process initialized during the auditory N1 (70-100 ms) and followed by space-specific suppression mechanisms from 150 to 250 ms. Additionally, we find a robust correlate of listeners' spatial perception (i.e., suppressing or not suppressing the echo) over central electrode sites from 300 to 500 ms. Contrary to our hypothesis, vision's powerful contribution to echo suppression occurs late in processing (250-400 ms), suggesting that vision contributes primarily during late sensory or decision making processes. Together, our findings support growing evidence that echo suppression is a slow, progressive mechanism modifiable by visual influences during late sensory and decision making stages. Furthermore, our findings suggest that audiovisual interactions are not limited to early, sensory-level modulations but extend well into late stages of cortical processing.
Best, Virginia; Mason, Christine R.; Swaminathan, Jayaganesh; Roverud, Elin; Kidd, Gerald
2017-01-01
In many situations, listeners with sensorineural hearing loss demonstrate reduced spatial release from masking compared to listeners with normal hearing. This deficit is particularly evident in the “symmetric masker” paradigm in which competing talkers are located to either side of a central target talker. However, there is some evidence that reduced target audibility (rather than a spatial deficit per se) under conditions of spatial separation may contribute to the observed deficit. In this study a simple “glimpsing” model (applied separately to each ear) was used to isolate the target information that is potentially available in binaural speech mixtures. Intelligibility of these glimpsed stimuli was then measured directly. Differences between normally hearing and hearing-impaired listeners observed in the natural binaural condition persisted for the glimpsed condition, despite the fact that the task no longer required segregation or spatial processing. This result is consistent with the idea that the performance of listeners with hearing loss in the spatialized mixture was limited by their ability to identify the target speech based on sparse glimpses, possibly as a result of some of those glimpses being inaudible. PMID:28147587
Lutfi, Robert A.
2014-01-01
Older adults are often reported in the literature to have greater difficulty than younger adults understanding speech in noise [Helfer and Wilber (1988). J. Acoust. Soc. Am., 859–893]. The poorer performance of older adults has been attributed to a general deterioration of cognitive processing, deterioration of cochlear anatomy, and/or greater difficulty segregating speech from noise. The current work used perturbation analysis [Berg (1990). J. Acoust. Soc. Am., 149–158] to provide a more specific assessment of the effect of cognitive factors on speech perception in noise. Sixteen older (age 56–79 years) and seventeen younger (age 19–30 years) adults discriminated a target vowel masked by randomly selected masker vowels immediately preceding and following the target. Relative decision weights on target and maskers resulting from the analysis revealed large individual differences across participants despite similar performance scores in many cases. On the most difficult vowel discriminations, the older adult decision weights were significantly correlated with inhibitory control (Color Word Interference test) and pure-tone threshold averages (PTA). Young adult decision weights were not correlated with any measures of peripheral (PTA) or central function (inhibition or working memory). PMID:25256580
Rate and onset cues can improve cochlear implant synthetic vowel recognition in noise
Mc Laughlin, Myles; Reilly, Richard B.; Zeng, Fan-Gang
2013-01-01
Understanding speech-in-noise is difficult for most cochlear implant (CI) users. Speech-in-noise segregation cues are well understood for acoustic hearing but not for electric hearing. This study investigated the effects of stimulation rate and onset delay on synthetic vowel-in-noise recognition in CI subjects. In experiment I, synthetic vowels were presented at 50, 145, or 795 pulse/s and noise at the same three rates, yielding nine combinations. Recognition improved significantly if the noise had a lower rate than the vowel, suggesting that listeners can use temporal gaps in the noise to detect a synthetic vowel. This hypothesis is supported by accurate prediction of synthetic vowel recognition using a temporal integration window model. Using lower rates, a similar trend was observed in normal-hearing subjects. Experiment II found that for CI subjects, a vowel onset delay improved performance if the noise had a lower or higher rate than the synthetic vowel. These results show that differing rates or onset times can improve synthetic vowel-in-noise recognition, indicating a need to develop speech processing strategies that encode or emphasize these cues. PMID:23464025
Suprasegmental information affects processing of talking faces at birth.
Guellai, Bahia; Mersad, Karima; Streri, Arlette
2015-02-01
From birth, newborns show a preference for faces talking a native language compared to silent faces. The present study addresses two questions that remained unanswered by previous research: (a) Does the familiarity with the language play a role in this process and (b) Are all the linguistic and paralinguistic cues necessary in this case? Experiment 1 extended newborns' preference for native speakers to non-native ones. Given that fetuses and newborns are sensitive to the prosodic characteristics of speech, Experiments 2 and 3 presented faces talking native and nonnative languages with the speech stream being low-pass filtered. Results showed that newborns preferred looking at a person who talked to them even when only the prosodic cues were provided for both languages. Nonetheless, a familiarity preference for the previously talking face is observed in the "normal speech" condition (i.e., Experiment 1) and a novelty preference in the "filtered speech" condition (Experiments 2 and 3). This asymmetry reveals that newborns process these two types of stimuli differently and that they may already be sensitive to a mismatch between the articulatory movements of the face and the corresponding speech sounds. Copyright © 2014 Elsevier Inc. All rights reserved.
Kouider, Sid; Dupoux, Emmanuel
2005-08-01
We present a novel subliminal priming technique that operates in the auditory modality. Masking is achieved by hiding a spoken word within a stream of time-compressed speechlike sounds with similar spectral characteristics. Participants were unable to consciously identify the hidden words, yet reliable repetition priming was found. This effect was unaffected by a change in the speaker's voice and remained restricted to lexical processing. The results show that the speech modality, like the written modality, involves the automatic extraction of abstract word-form representations that do not include nonlinguistic details. In both cases, priming operates at the level of discrete and abstract lexical entries and is little influenced by overlap in form or semantics.
Role of Binaural Temporal Fine Structure and Envelope Cues in Cocktail-Party Listening.
Swaminathan, Jayaganesh; Mason, Christine R; Streeter, Timothy M; Best, Virginia; Roverud, Elin; Kidd, Gerald
2016-08-03
While conversing in a crowded social setting, a listener is often required to follow a target speech signal amid multiple competing speech signals (the so-called "cocktail party" problem). In such situations, separation of the target speech signal in azimuth from the interfering masker signals can lead to an improvement in target intelligibility, an effect known as spatial release from masking (SRM). This study assessed the contributions of two stimulus properties that vary with separation of sound sources, binaural envelope (ENV) and temporal fine structure (TFS), to SRM in normal-hearing (NH) human listeners. Target speech was presented from the front and speech maskers were either colocated with or symmetrically separated from the target in azimuth. The target and maskers were presented either as natural speech or as "noise-vocoded" speech in which the intelligibility was conveyed only by the speech ENVs from several frequency bands; the speech TFS within each band was replaced with noise carriers. The experiments were designed to preserve the spatial cues in the speech ENVs while retaining/eliminating them from the TFS. This was achieved by using the same/different noise carriers in the two ears. A phenomenological auditory-nerve model was used to verify that the interaural correlations in TFS differed across conditions, whereas the ENVs retained a high degree of correlation, as intended. Overall, the results from this study revealed that binaural TFS cues, especially for frequency regions below 1500 Hz, are critical for achieving SRM in NH listeners. Potential implications for studying SRM in hearing-impaired listeners are discussed. Acoustic signals received by the auditory system pass first through an array of physiologically based band-pass filters. 
Conceptually, at the output of each filter, there are two principal forms of temporal information: slowly varying fluctuations in the envelope (ENV) and rapidly varying fluctuations in the temporal fine structure (TFS). The importance of these two types of information in everyday listening (e.g., conversing in a noisy social situation; the "cocktail-party" problem) has not been established. This study assessed the contributions of binaural ENV and TFS cues for understanding speech in multiple-talker situations. Results suggest that, whereas the ENV cues are important for speech intelligibility, binaural TFS cues are critical for perceptually segregating the different talkers and thus for solving the cocktail party problem. Copyright © 2016 the authors 0270-6474/16/368250-08$15.00/0.
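The ENV/TFS distinction described above is commonly formalized with the analytic signal: within one filter band, the envelope is the magnitude of the analytic signal and the fine structure is its instantaneous phase. A minimal sketch on a synthetic amplitude-modulated tone (the signal parameters are arbitrary assumptions; the analytic signal is computed directly with the FFT so the sketch needs only NumPy):

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (same construction as a Hilbert transform)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    h[1:(n + 1) // 2] = 2.0  # double positive frequencies
    if n % 2 == 0:
        h[n // 2] = 1.0      # keep Nyquist bin
    return np.fft.ifft(X * h)

fs = 16000
t = np.arange(int(0.05 * fs)) / fs
# A 1-kHz carrier with a 40-Hz amplitude modulation: within one auditory
# filter band, the slow modulation is the ENV and the carrier is the TFS.
carrier = np.sin(2 * np.pi * 1000 * t)
env_true = 1.0 + 0.8 * np.sin(2 * np.pi * 40 * t)
x = env_true * carrier

z = analytic_signal(x)
env = np.abs(z)            # slowly varying envelope (ENV)
tfs = np.cos(np.angle(z))  # unit-amplitude fine structure (TFS)
```

Noise vocoding, as used in the study, discards the band's TFS and reimposes only `env` on a noise carrier, which is why vocoded conditions isolate the contribution of ENV cues.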
Subcortical processing of speech regularities underlies reading and music aptitude in children.
Strait, Dana L; Hornickel, Jane; Kraus, Nina
2011-10-17
Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input. 
Definition of common biological underpinnings for music and reading supports the usefulness of music for promoting child literacy, with the potential to improve reading remediation.
ERIC Educational Resources Information Center
Sohail, Juwairia; Johnson, Elizabeth K.
2016-01-01
Much of what we know about the development of listeners' word segmentation strategies originates from the artificial language-learning literature. However, many artificial speech streams designed to study word segmentation lack a salient cue found in all natural languages: utterance boundaries. In this study, participants listened to a…
ERIC Educational Resources Information Center
Abla, Dilshat; Okanoya, Kazuo
2008-01-01
Word segmentation, that is, discovering the boundaries between words that are embedded in a continuous speech stream, is an important faculty for language learners; humans solve this task partly by calculating transitional probabilities between sounds. Behavioral and ERP studies suggest that detection of sequential probabilities (statistical…
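The transitional-probability computation described above is simple enough to sketch. The following is a minimal illustration (not the authors' code), computing the forward transitional probability TP(x→y) = count(xy) / count(x) over a syllable stream; the syllable strings are hypothetical, not the study's stimuli:

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Forward transitional probability TP(x -> y) = count(xy) / count(x)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}

# A stream built from two made-up "words", tu-pi-ro and go-la-bu:
stream = ["tu", "pi", "ro", "go", "la", "bu",
          "tu", "pi", "ro", "tu", "pi", "ro", "go", "la", "bu"]
tp = transitional_probabilities(stream)
# Within-word transitions (e.g. tu->pi) have TP = 1.0, while cross-word
# transitions (e.g. ro->go) are lower, signalling a likely word boundary.
```

Learners are hypothesized to posit word boundaries at local dips in TP, which is the regularity the ERP studies above probe.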
Extracting Words from the Speech Stream at First Exposure
ERIC Educational Resources Information Center
Shoemaker, Ellenor; Rast, Rebekah
2013-01-01
The earliest stages of adult language acquisition have received increased attention in recent years (cf. Carroll, introduction to this issue). The study reported here aims to contribute to this discussion by investigating the role of several variables in the development of word recognition strategies during the very first hours of exposure to a…
Captions and Reduced Forms Instruction: The Impact on EFL Students' Listening Comprehension
ERIC Educational Resources Information Center
Yang, Jie Chi; Chang, Peichin
2014-01-01
For many EFL learners, listening poses a grave challenge. The difficulty in segmenting a stream of speech and limited capacity in short-term memory are common weaknesses for language learners. Specifically, reduced forms, which frequently appear in authentic informal conversations, compound the challenges in listening comprehension. Numerous…
Whole-exome sequencing supports genetic heterogeneity in childhood apraxia of speech.
Worthey, Elizabeth A; Raca, Gordana; Laffin, Jennifer J; Wilk, Brandon M; Harris, Jeremy M; Jakielski, Kathy J; Dimmock, David P; Strand, Edythe A; Shriberg, Lawrence D
2013-10-02
Childhood apraxia of speech (CAS) is a rare, severe, persistent pediatric motor speech disorder with associated deficits in sensorimotor, cognitive, language, learning and affective processes. Among other neurogenetic origins, CAS is the disorder segregating with a mutation in FOXP2 in a widely studied, multigenerational London family. We report the first whole-exome sequencing (WES) findings from a cohort of 10 unrelated participants, ages 3 to 19 years, with well-characterized CAS. As part of a larger study of children and youth with motor speech sound disorders, 32 participants were classified as positive for CAS on the basis of a behavioral classification marker using auditory-perceptual and acoustic methods that quantify the competence, precision and stability of a speaker's speech, prosody and voice. WES of 10 randomly selected participants was completed using the Illumina Genome Analyzer IIx Sequencing System. Image analysis, base calling, demultiplexing, read mapping, and variant calling were performed using Illumina software. Software developed in-house was used for variant annotation, prioritization and interpretation to identify those variants likely to be deleterious to neurodevelopmental substrates of speech-language development. Among potentially deleterious variants, clinically reportable findings of interest occurred on a total of five chromosomes (Chr3, Chr6, Chr7, Chr9 and Chr17), which included six genes either strongly associated with CAS (FOXP1 and CNTNAP2) or associated with disorders with phenotypes overlapping CAS (ATP13A4, CNTNAP1, KIAA0319 and SETX). A total of 8 (80%) of the 10 participants had clinically reportable variants in one or two of the six genes, with variants in ATP13A4, KIAA0319 and CNTNAP2 being the most prevalent. 
Similar to the results reported in emerging WES studies of other complex neurodevelopmental disorders, our findings from this first WES study of CAS are interpreted as support for heterogeneous genetic origins of this pediatric motor speech disorder with multiple genes, pathways and complex interactions. We also submit that our findings illustrate the potential use of WES for both gene identification and case-by-case clinical diagnostics in pediatric motor speech disorders.
NASA Astrophysics Data System (ADS)
Sembiring, N.; Nasution, A. H.
2018-02-01
Corrective maintenance, i.e. replacing or repairing a machine component after breakdown, is common practice in manufacturing companies, but it forces the production process to stop: production time is lost while the maintenance team replaces or repairs the damaged component. This paper proposes a preventive maintenance schedule for a critical component of a critical machine in a crude palm oil and kernel company in order to increase maintenance efficiency. Reliability Engineering & Maintenance Value Stream Mapping is used as a method and a tool to analyze the reliability of the component and to reduce waste in the process by segregating value-added and non-value-added activities.
Ambient groundwater flow diminishes nitrate processing in the hyporheic zone of streams
NASA Astrophysics Data System (ADS)
Azizian, Morvarid; Boano, Fulvio; Cook, Perran L. M.; Detwiler, Russell L.; Rippy, Megan A.; Grant, Stanley B.
2017-05-01
Modeling and experimental studies demonstrate that ambient groundwater reduces hyporheic exchange, but the implications of this observation for stream N-cycling are not yet clear. Here we utilize a simple process-based model (the Pumping and Streamline Segregation or PASS model) to evaluate N-cycling over two scales of hyporheic exchange (fluvial ripples and riffle-pool sequences), ten ambient groundwater and stream flow scenarios (five gaining and losing conditions and two stream discharges), and three biogeochemical settings (identified based on a principal component analysis of previously published measurements in streams throughout the United States). Model-data comparisons indicate that our model provides realistic estimates for direct denitrification of stream nitrate, but overpredicts nitrification and coupled nitrification-denitrification. Riffle-pool sequences are responsible for most of the N-processing, despite the fact that fluvial ripples generate 3-11 times more hyporheic exchange flux. Across all scenarios, hyporheic exchange flux and the Damköhler Number emerge as primary controls on stream N-cycling; the former regulates trafficking of nutrients and oxygen across the sediment-water interface, while the latter quantifies the relative rates of organic carbon mineralization and advective transport in streambed sediments. Vertical groundwater flux modulates both of these master variables in ways that tend to diminish stream N-cycling. Thus, anthropogenic perturbations of ambient groundwater flows (e.g., by urbanization, agricultural activities, groundwater mining, and/or climate change) may compromise some of the key ecosystem services provided by streams.
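As a rough illustration of the Damköhler number invoked above: for a first-order reaction it can be written as the reaction rate constant times the residence time. This is a standard textbook definition, not necessarily the exact formulation used in the PASS model, and the numerical values below are hypothetical:

```python
def damkohler(k_per_s, residence_time_s):
    """First-order Damkohler number Da = k * tau: the ratio of the
    reaction rate to the advective transport rate. Da >> 1 means
    transport limits the reaction; Da << 1 means the reaction is slow
    relative to transport through the streambed."""
    return k_per_s * residence_time_s

# Hypothetical values for illustration only (not from the paper):
k = 1e-4      # organic carbon mineralization rate constant [1/s]
tau = 3600.0  # hyporheic residence time [s]
da = damkohler(k, tau)  # < 1: advection outpaces mineralization
```

Ambient groundwater flux shortens hyporheic residence times (smaller tau), which in this formulation lowers Da and hence N-processing.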
Pollution profile and biodegradation characteristics of fur-suede processing effluents.
Yildiz Töre, G; Insel, G; Ubay Cokgör, E; Ferlier, E; Kabdaşli, I; Orhon, D
2011-07-01
This study investigated the effect of stream segregation on the biodegradation characteristics of wastewaters generated by fur-suede processing. It was conducted on a plant located in an organized industrial district in Turkey. A detailed in-plant analysis of the process profile and the resulting pollution profile in terms of significant parameters indicated the characteristics of a strong wastewater with a maximum total COD of 4285 mg L⁻¹, despite the excessive wastewater generation of 205 m³ (ton skin)⁻¹. Respirometric analysis by model calibration yielded slow biodegradation kinetics and showed that around 50% of the particulate organics were utilized at a rate similar to that of endogenous respiration. A similar analysis on the segregated wastewater streams suggested that biodegradation of the plant effluent is controlled largely by the initial washing/pickling operations. The effect of other effluent streams was not significant due to their relatively low contribution to the overall organic load. The respirometric tests showed that the biodegradation kinetics of the joint treatment plant influent of the district were substantially improved and exhibited typical levels reported for tannery wastewater, so that the inhibitory impact was suppressed to a great extent by dilution and mixing with effluents of the other plants. The chemical treatment step in the joint treatment plant removed the majority of the particulate organics so that 80% of the available COD was utilized in the oxygen uptake rate (OUR) test, a ratio quite compatible with the biodegradable COD fractions of tannery wastewater. Consequently, process kinetics and especially the hydrolysis rate appeared to be significantly improved.
Dittinger, Eva; Valizadeh, Seyed Abolfazl; Jäncke, Lutz; Besson, Mireille; Elmer, Stefan
2018-02-01
Current models of speech and language processing postulate the involvement of two parallel processing streams (the dual stream model): a ventral stream involved in mapping sensory and phonological representations onto lexical and conceptual representations and a dorsal stream contributing to sound-to-motor mapping, articulation, and to how verbal information is encoded and manipulated in memory. Based on previous evidence showing that music training has an influence on language processing, cognitive functions, and word learning, we examined EEG-based intracranial functional connectivity in the ventral and dorsal streams while musicians and nonmusicians learned the meaning of novel words through picture-word associations. In accordance with the dual stream model, word learning was generally associated with increased beta functional connectivity in the ventral stream compared to the dorsal stream. In addition, in the linguistically most demanding "semantic task," musicians outperformed nonmusicians, and this behavioral advantage was accompanied by increased left-hemispheric theta connectivity in both streams. Moreover, theta coherence in the left dorsal pathway was positively correlated with the number of years of music training. These results provide evidence for a complex interplay within a network of brain regions involved in semantic processing and verbal memory functions, and suggest that intensive music training can modify its functional architecture leading to advantages in novel word learning. © 2017 Wiley Periodicals, Inc.
Intelligent interfaces for expert systems
NASA Technical Reports Server (NTRS)
Villarreal, James A.; Wang, Lui
1988-01-01
Vital to the success of an expert system is a user interface that performs intelligently. A generic intelligent interface is being developed for expert systems. This intelligent interface was developed around the in-house developed Expert System for the Flight Analysis System (ESFAS). The Flight Analysis System (FAS) comprises 84 configuration-controlled FORTRAN subroutines that are used in the preflight analysis of the space shuttle. In order to use FAS proficiently, a person must be knowledgeable in the areas of flight mechanics, the procedures involved in deploying a certain payload, and an overall understanding of the FAS. ESFAS, still in its developmental stage, is taking into account much of this knowledge. The generic intelligent interface involves the integration of a speech recognizer and synthesizer, a preparser, and a natural language parser with ESFAS. The speech recognizer being used is capable of recognizing 1000 words of connected speech. The natural language parser is a commercial software package which uses caseframe instantiation in processing the streams of words from the speech recognizer or the keyboard. The system's configuration is described, along with its capabilities and drawbacks.
Spatiotemporal dynamics of auditory attention synchronize with speech
Wöstmann, Malte; Herrmann, Björn; Maess, Burkhard
2016-01-01
Attention plays a fundamental role in selectively processing stimuli in our environment despite distraction. Spatial attention induces increasing and decreasing power of neural alpha oscillations (8–12 Hz) in brain regions ipsilateral and contralateral to the locus of attention, respectively. This study tested whether the hemispheric lateralization of alpha power codes not just the spatial location but also the temporal structure of the stimulus. Participants attended to spoken digits presented to one ear and ignored tightly synchronized distracting digits presented to the other ear. In the magnetoencephalogram, spatial attention induced lateralization of alpha power in parietal, but notably also in auditory cortical regions. This alpha power lateralization was not maintained steadily but fluctuated in synchrony with the speech rate and lagged the time course of low-frequency (1–5 Hz) sensory synchronization. Higher amplitude of alpha power modulation at the speech rate was predictive of a listener’s enhanced performance of stream-specific speech comprehension. Our findings demonstrate that alpha power lateralization is modulated in tune with the sensory input and acts as a spatiotemporal filter controlling the read-out of sensory content. PMID:27001861
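The hemispheric alpha-power lateralization discussed above is commonly summarized with a normalized index. The sketch below uses a generic definition, not necessarily the exact metric of this study, and the power values are made up:

```python
def alpha_lateralization_index(power_ipsi, power_contra):
    """Normalized lateralization index in [-1, 1]: positive when alpha
    power is stronger in the hemisphere ipsilateral to the attended ear,
    consistent with alpha suppressing the ignored (contralateral) input."""
    return (power_ipsi - power_contra) / (power_ipsi + power_contra)

# Hypothetical alpha power estimates (arbitrary units):
ali = alpha_lateralization_index(12.0, 8.0)
```

Tracking such an index over time, rather than averaging it, is what reveals the rhythmic modulation at the speech rate reported in the abstract.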
Effect of attentional load on audiovisual speech perception: evidence from ERPs.
Alsius, Agnès; Möttönen, Riikka; Sams, Mikko E; Soto-Faraco, Salvador; Tiippana, Kaisa
2014-01-01
Seeing articulatory movements influences perception of auditory speech. This is often reflected in a shortened latency of auditory event-related potentials (ERPs) generated in the auditory cortex. The present study addressed whether this early neural correlate of audiovisual interaction is modulated by attention. We recorded ERPs in 15 subjects while they were presented with auditory, visual, and audiovisual spoken syllables. Audiovisual stimuli consisted of incongruent auditory and visual components known to elicit a McGurk effect, i.e., a visually driven alteration in the auditory speech percept. In a Dual task condition, participants were asked to identify spoken syllables whilst monitoring a rapid visual stream of pictures for targets, i.e., they had to divide their attention. In a Single task condition, participants identified the syllables without any other tasks, i.e., they were asked to ignore the pictures and focus their attention fully on the spoken syllables. The McGurk effect was weaker in the Dual task than in the Single task condition, indicating an effect of attentional load on audiovisual speech perception. Early auditory ERP components, N1 and P2, peaked earlier to audiovisual stimuli than to auditory stimuli when attention was fully focused on syllables, indicating neurophysiological audiovisual interaction. This latency decrement was reduced when attention was loaded, suggesting that attention influences early neural processing of audiovisual speech. We conclude that reduced attention weakens the interaction between vision and audition in speech.
Visual input enhances selective speech envelope tracking in auditory cortex at a "cocktail party".
Zion Golumbic, Elana; Cogan, Gregory B; Schroeder, Charles E; Poeppel, David
2013-01-23
Our ability to selectively attend to one auditory signal amid competing input streams, epitomized by the "Cocktail Party" problem, continues to stimulate research from various approaches. How this demanding perceptual feat is achieved from a neural systems perspective remains unclear and controversial. It is well established that neural responses to attended stimuli are enhanced compared with responses to ignored ones, but responses to ignored stimuli are nonetheless highly significant, leading to interference in performance. We investigated whether congruent visual input of an attended speaker enhances cortical selectivity in auditory cortex, leading to diminished representation of ignored stimuli. We recorded magnetoencephalographic signals from human participants as they attended to segments of natural continuous speech. Using two complementary methods of quantifying the neural response to speech, we found that viewing a speaker's face enhances the capacity of auditory cortex to track the temporal speech envelope of that speaker. This mechanism was most effective in a Cocktail Party setting, promoting preferential tracking of the attended speaker, whereas without visual input no significant attentional modulation was observed. These neurophysiological results underscore the importance of visual input in resolving perceptual ambiguity in a noisy environment. Since visual cues in speech precede the associated auditory signals, they likely serve a predictive role in facilitating auditory processing of speech, perhaps by directing attentional resources to appropriate points in time when to-be-attended acoustic input is expected to arrive.
Neural correlates of phonetic convergence and speech imitation.
Garnier, Maëva; Lamalle, Laurent; Sato, Marc
2013-01-01
Speakers unconsciously tend to mimic their interlocutor's speech during communicative interaction. This study aims at examining the neural correlates of phonetic convergence and deliberate imitation, in order to explore whether imitation of phonetic features, deliberate, or unconscious, might reflect a sensory-motor recalibration process. Sixteen participants listened to vowels with pitch varying around the average pitch of their own voice, and then produced the identified vowels, while their speech was recorded and their brain activity was imaged using fMRI. Three degrees and types of imitation were compared (unconscious, deliberate, and inhibited) using a go-nogo paradigm, which enabled the comparison of brain activations during the whole imitation process, its active perception step, and its production. Speakers followed the pitch of voices they were exposed to, even unconsciously, without being instructed to do so. After being informed about this phenomenon, 14 participants were able to inhibit it, at least partially. The results of whole brain and ROI analyses support the fact that both deliberate and unconscious imitations are based on similar neural mechanisms and networks, involving regions of the dorsal stream, during both perception and production steps of the imitation process. While no significant difference in brain activation was found between unconscious and deliberate imitations, the degree of imitation, however, appears to be determined by processes occurring during the perception step. Four regions of the dorsal stream: bilateral auditory cortex, bilateral supramarginal gyrus (SMG), and left Wernicke's area, indeed showed an activity that correlated significantly with the degree of imitation during the perception step.
The Hierarchical Cortical Organization of Human Speech Processing
de Heer, Wendy A.; Huth, Alexander G.; Griffiths, Thomas L.
2017-01-01
Speech comprehension requires that the brain extract semantic meaning from the spectral features represented at the cochlea. To investigate this process, we performed an fMRI experiment in which five men and two women passively listened to several hours of natural narrative speech. We then used voxelwise modeling to predict BOLD responses based on three different feature spaces that represent the spectral, articulatory, and semantic properties of speech. The amount of variance explained by each feature space was then assessed using a separate validation dataset. Because some responses might be explained equally well by more than one feature space, we used a variance partitioning analysis to determine the fraction of the variance that was uniquely explained by each feature space. Consistent with previous studies, we found that speech comprehension involves hierarchical representations starting in primary auditory areas and moving laterally on the temporal lobe: spectral features are found in the core of A1, mixtures of spectral and articulatory in STG, mixtures of articulatory and semantic in STS, and semantic in STS and beyond. Our data also show that both hemispheres are equally and actively involved in speech perception and interpretation. Further, responses as early in the auditory hierarchy as in STS are more correlated with semantic than spectral representations. These results illustrate the importance of using natural speech in neurolinguistic research. Our methodology also provides an efficient way to simultaneously test multiple specific hypotheses about the representations of speech without using block designs and segmented or synthetic speech. 
SIGNIFICANCE STATEMENT To investigate the processing steps performed by the human brain to transform natural speech sound into meaningful language, we used models based on a hierarchical set of speech features to predict BOLD responses of individual voxels recorded in an fMRI experiment while subjects listened to natural speech. Both cerebral hemispheres were actively involved in speech processing in large and equal amounts. Also, the transformation from spectral features to semantic elements occurs early in the cortical speech-processing stream. Our experimental and analytical approaches are important alternatives and complements to standard approaches that use segmented speech and block designs, which report more lateralized speech processing and attribute semantic processing to higher levels of cortex than reported here. PMID:28588065
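The variance-partitioning logic described above can be sketched for the two-feature-space case: the variance uniquely explained by space A is the R² of the joint model minus the R² of the other space alone. This is a simplified illustration with synthetic data, not the authors' regularized voxelwise pipeline:

```python
import numpy as np
from numpy.linalg import lstsq

def r_squared(X, y):
    """Ordinary least-squares R^2 of y regressed on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def unique_variance(X_a, X_b, y):
    """Variance in y uniquely explained by feature space A:
    R^2 of the joint model minus R^2 of space B alone."""
    r2_joint = r_squared(np.column_stack([X_a, X_b]), y)
    return r2_joint - r_squared(X_b, y)

# Synthetic demo: the response depends on feature space A only,
# so B should explain essentially nothing uniquely.
rng = np.random.default_rng(0)
X_a = rng.normal(size=(200, 3))
X_b = rng.normal(size=(200, 3))
y = X_a @ np.array([1.0, -0.5, 0.3]) + 0.1 * rng.normal(size=200)
ua = unique_variance(X_a, X_b, y)  # large: A carries the signal
ub = unique_variance(X_b, X_a, y)  # near zero: B adds nothing
```

With three feature spaces, as in the study, the same subtraction logic extends to all set-theoretic partitions (unique and shared portions).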
Effects of the rate of formant-frequency variation on the grouping of formants in speech perception.
Summers, Robert J; Bailey, Peter J; Roberts, Brian
2012-04-01
How speech is separated perceptually from other speech remains poorly understood. Recent research suggests that the ability of an extraneous formant to impair intelligibility depends on the modulation of its frequency, but not its amplitude, contour. This study further examined the effect of formant-frequency variation on intelligibility by manipulating the rate of formant-frequency change. Target sentences were synthetic three-formant (F1 + F2 + F3) analogues of natural utterances. Perceptual organization was probed by presenting stimuli dichotically (F1 + F2C + F3C; F2 + F3), where F2C + F3C constitute a competitor for F2 and F3 that listeners must reject to optimize recognition. Competitors were derived using formant-frequency contours extracted from extended passages spoken by the same talker and processed to alter the rate of formant-frequency variation, such that rate scale factors relative to the target sentences were 0, 0.25, 0.5, 1, 2, and 4 (0 = constant frequencies). Competitor amplitude contours were either constant, or time-reversed and rate-adjusted in parallel with the frequency contour. Adding a competitor typically reduced intelligibility; this reduction increased with competitor rate until the rate was at least twice that of the target sentences. Similarity in the results for the two amplitude conditions confirmed that formant amplitude contours do not influence across-formant grouping. The findings indicate that competitor efficacy is not tuned to the rate of the target sentences; most probably, it depends primarily on the overall rate of frequency variation in the competitor formants. This suggests that, when segregating the speech of concurrent talkers, differences in speech rate may not be a significant cue for across-frequency grouping of formants.
Allar, Ayse D; Beler Baykal, Bilsen
2016-01-01
ECOSAN is a recent domestic wastewater management concept which suggests segregation at the source. One of these streams, yellow water (human urine), has the potential to be used as fertilizer, directly or indirectly, because of its rich content of plant nutrients. One physicochemical method for indirect use is adsorption/ion exchange using clinoptilolite. This paper aims to present the results of a scenario focusing on possible diversion of urine and self-sufficiency of nutrients recovered on site through the use of this process, using actual demographic and territorial information from an existing summer housing site. Specifically, this paper aims to answer the questions: (i) how much nitrogen can be recovered to be used as fertilizer by diverting urine? and (ii) is this sufficient or in surplus within the model housing site? This sets an example of resource-oriented sanitation using stream segregation as a wastewater management strategy in a small community. Nitrogen was taken as the basis of calculations/predictions and the focus was placed on whether nitrogen is self-sufficient or in excess as fertilizer for use within the premises. The results reveal that the proposed application makes sense and that urine coming from the housing site is self-sufficient as fertilizer within the housing site itself.
Comparison of auditory stream segregation in sighted and early blind individuals.
Boroujeni, Fatemeh Moghadasi; Heidari, Fatemeh; Rouzbahani, Masoumeh; Kamali, Mohammad
2017-01-18
An important characteristic of the auditory system is the capacity to analyze complex sounds and make decisions on the source of the constituent parts of these sounds. Blind individuals compensate for the lack of visual information through increased input from other sensory modalities, including increased auditory information. The purpose of the current study was to compare the fission boundary (FB) threshold of sighted and early blind individuals in the spectral domain using a psychoacoustic auditory stream segregation (ASS) test. This study was conducted on 16 sighted and 16 early blind adults. The stimuli were pure tones A and B presented sequentially in a triplet ABA-ABA pattern at an intensity of 40 dB SL. The A-tone frequency was set to base values of 500, 1000, and 2000 Hz, and the B tone was presented at frequencies 4-100% above the base frequency. Blind individuals had significantly lower FB thresholds than sighted individuals. FB was independent of the frequency of tone A when expressed as the difference in the number of equivalent rectangular bandwidths (ERBs). Early blindness may enhance perceptual separation of acoustic stimuli, supporting accurate representations of the world. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
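Expressing the A-B separation in ERBs, as the abstract does, is typically done on the ERB-number (Cam) scale of Glasberg and Moore (1990). The sketch below assumes that formula, which the abstract itself does not specify:

```python
import math

def erb_number(f_hz):
    """Glasberg & Moore (1990) ERB-number (Cam) scale:
    Cam(f) = 21.4 * log10(4.37 * f/1000 + 1)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def separation_in_erbs(f_a_hz, percent_above):
    """Separation between tone A and a tone B lying a given percentage
    above it, expressed in equivalent rectangular bandwidths."""
    f_b_hz = f_a_hz * (1.0 + percent_above / 100.0)
    return erb_number(f_b_hz) - erb_number(f_a_hz)

# A fixed percentage separation spans a comparable number of ERBs
# across the base frequencies used in the study:
seps = {f: separation_in_erbs(f, 10.0) for f in (500, 1000, 2000)}
```

This near-constancy on the ERB scale is why the fission boundary can be independent of the A-tone frequency when expressed in ERBs.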
Comparison of Classification Methods for Detecting Emotion from Mandarin Speech
NASA Astrophysics Data System (ADS)
Pao, Tsang-Long; Chen, Yu-Te; Yeh, Jun-Heng
Technology is often said to arise from humanity, and emotion lies at the core of humanity: it is the basis for all human expression and the underlying theme behind everything that is done, said, thought or imagined. Making computers able to perceive and respond to human emotion would make human-computer interaction more natural. Several classifiers have been adopted for automatically assigning an emotion category, such as anger, happiness or sadness, to a speech utterance. These classifiers were designed independently and tested on various emotional speech corpora, making it difficult to compare and evaluate their performance. In this paper, we first compared several popular classification methods and evaluated their performance by applying them to a Mandarin speech corpus consisting of five basic emotions: anger, happiness, boredom, sadness and neutral. The extracted feature streams contain MFCC, LPCC, and LPC. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% for the 5-class emotion recognition task and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers by applying them to another Mandarin expressive speech corpus consisting of two emotions. The experimental results again show that the proposed WD-MKNN outperforms the others.
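The exact WD-MKNN formulation is not given in the abstract; as a sketch of the family it belongs to, here is a plain distance-weighted KNN vote over toy 2-D vectors standing in for MFCC/LPCC/LPC feature vectors (all names and values below are illustrative, not the authors' method):

```python
import math
from collections import defaultdict

def weighted_knn(train, query, k=3):
    """Distance-weighted k-nearest-neighbour vote: each of the k closest
    training vectors votes for its label with weight 1/distance.
    `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted(
        (math.dist(x, query), label) for x, label in train
    )[:k]
    votes = defaultdict(float)
    for d, label in nearest:
        votes[label] += 1.0 / (d + 1e-9)  # small epsilon avoids div-by-zero
    return max(votes, key=votes.get)

# Toy feature vectors for two hypothetical emotion classes:
train = [((0.0, 0.0), "neutral"), ((0.1, 0.2), "neutral"),
         ((3.0, 3.1), "anger"), ((2.9, 3.3), "anger")]
label = weighted_knn(train, (2.8, 3.0), k=3)
```

Distance weighting lets the two nearby "anger" exemplars outvote a more distant "neutral" neighbour even when class counts among the k are close.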
The Neurobiological Grounding of Persistent Stuttering: from Structure to Function.
Neef, Nicole E; Anwander, Alfred; Friederici, Angela D
2015-09-01
Neuroimaging and transcranial magnetic stimulation provide insights into the neuronal mechanisms underlying speech disfluencies in chronic persistent stuttering. In the present paper, the goal is not to provide an exhaustive review of existing literature, but rather to highlight robust findings. We, therefore, conducted a meta-analysis of diffusion tensor imaging studies which have recently implicated disrupted white matter connectivity in stuttering. A reduction of fractional anisotropy in persistent stuttering has been reported at several different loci. Our meta-analysis revealed consistent deficits in the left dorsal stream and in the interhemispheric connections between the sensorimotor cortices. In addition, recent fMRI meta-analyses link stuttering to reduced left fronto-parieto-temporal activation while greater fluency is associated with boosted co-activations of right fronto-parieto-temporal areas. However, the physiological foundation of these irregularities is not accessible with MRI. Complementarily, transcranial magnetic stimulation (TMS) reveals local excitatory and inhibitory regulation of cortical dynamics. Applied to a speech motor area, TMS revealed reduced speech-planning-related neuronal dynamics at the level of the primary motor cortex in stuttering. Together, this review provides a focused view of the neurobiology of stuttering to date and may guide the rational design of future research. Future research needs to account for the perpetual dynamic interactions between auditory, somatosensory, and speech motor circuits that shape fluent speech.
Proform-Antecedent Linking in Listeners with Language Impairments and Unimpaired Listeners
ERIC Educational Resources Information Center
Engel, Samantha Michelle
2016-01-01
This dissertation explores how listeners extract meaning from personal and reflexive pronouns in spoken language. To be understood, words like her and herself must be linked to a prior element in the speech stream (or antecedent). This process draws on syntactic knowledge and verbal working memory processes. I present two original research studies…
The Effect of Sonority on Word Segmentation: Evidence for the Use of a Phonological Universal
ERIC Educational Resources Information Center
Ettlinger, Marc; Finn, Amy S.; Hudson Kam, Carla L.
2012-01-01
It has been well documented how language-specific cues may be used for word segmentation. Here, we investigate what role a language-independent phonological universal, the sonority sequencing principle (SSP), may also play. Participants were presented with an unsegmented speech stream with non-English word onsets that juxtaposed adherence to the…
Modeling the Contribution of Phonotactic Cues to the Problem of Word Segmentation
ERIC Educational Resources Information Center
Blanchard, Daniel; Heinz, Jeffrey; Golinkoff, Roberta
2010-01-01
How do infants find the words in the speech stream? Computational models help us understand this feat by revealing the advantages and disadvantages of different strategies that infants might use. Here, we outline a computational model of word segmentation that aims both to incorporate cues proposed by language acquisition researchers and to…
Do Newly Formed Word Representations Encode Non-Criterial Information?
ERIC Educational Resources Information Center
Curtin, Suzanne
2011-01-01
Lexical stress is useful for a number of language learning tasks. In particular, it helps infants segment the speech stream and identify phonetic contrasts. Recent work has demonstrated that infants aged 1 ; 0 can learn two novel words differing only in their stress pattern. In the current study, we ask whether infants aged 1 ; 0 store stress…
Implicit Segmentation of a Stream of Syllables Based on Transitional Probabilities: An MEG Study
ERIC Educational Resources Information Center
Teinonen, Tuomas; Huotilainen, Minna
2012-01-01
Statistical segmentation of continuous speech, i.e., the ability to utilise transitional probabilities between syllables in order to detect word boundaries, is reflected in the brain's auditory event-related potentials (ERPs). The N1 and N400 ERP components are typically enhanced for word onsets compared to random syllables during active…
Perceptual Grouping Affects Pitch Judgments Across Time and Frequency
Borchert, Elizabeth M. O.; Micheyl, Christophe; Oxenham, Andrew J.
2010-01-01
Pitch, the perceptual correlate of fundamental frequency (F0), plays an important role in speech, music and animal vocalizations. Changes in F0 over time help define musical melodies and speech prosody, while comparisons of simultaneous F0 are important for musical harmony, and for segregating competing sound sources. This study compared listeners’ ability to detect differences in F0 between pairs of sequential or simultaneous tones that were filtered into separate, non-overlapping spectral regions. The timbre differences induced by filtering led to poor F0 discrimination in the sequential, but not the simultaneous, conditions. Temporal overlap of the two tones was not sufficient to produce good performance; instead performance appeared to depend on the two tones being integrated into the same perceptual object. The results confirm the difficulty of comparing the pitches of sequential sounds with different timbres and suggest that, for simultaneous sounds, pitch differences may be detected through a decrease in perceptual fusion rather than an explicit coding and comparison of the underlying F0s. PMID:21077719
Neural dynamics of feedforward and feedback processing in figure-ground segregation
Layton, Oliver W.; Mingolla, Ennio; Yazdanbakhsh, Arash
2014-01-01
Determining whether a region belongs to the interior or exterior of a shape (figure-ground segregation) is a core competency of the primate brain, yet the underlying mechanisms are not well understood. Many models assume that figure-ground segregation occurs by assembling progressively more complex representations through feedforward connections, with feedback playing only a modulatory role. We present a dynamical model of figure-ground segregation in the primate ventral stream wherein feedback plays a crucial role in disambiguating a figure's interior and exterior. We introduce a processing strategy whereby jitter in RF center locations and variation in RF sizes are exploited to enhance and suppress neural activity inside and outside of figures, respectively. Feedforward projections emanate from units that model cells in V4 known to respond to the curvature of boundary contours (curved contour cells), and feedback projections from units predicted to exist in IT that strategically group neurons with different RF sizes and RF center locations (teardrop cells). Neurons (convex cells) that preferentially respond when centered on a figure dynamically balance feedforward (bottom-up) information and feedback from higher visual areas. The activation is enhanced when an interior portion of a figure is in the RF via feedback from units that detect closure in the boundary contours of a figure. Our model produces maximal activity along the medial axis of well-known figures with and without concavities, and inside algorithmically generated shapes. Our results suggest that the dynamic balancing of feedforward signals with the specific feedback mechanisms proposed by the model is crucial for figure-ground segregation. PMID:25346703
Greenhouse gas emissions of waste management processes and options: A case study.
de la Barrera, Belen; Hooda, Peter S
2016-07-01
Increasing concern about climate change is prompting organisations to mitigate their greenhouse gas emissions. Waste management activities also contribute to greenhouse gas emissions. In the waste management sector, there has been an increasing diversion of waste sent to landfill, with much emphasis on recycling and reuse to prevent emissions. This study evaluates the carbon footprint of the different processes involved in waste management systems, considering the entire waste management stream. Waste management data from the Royal Borough of Kingston upon Thames, London (UK), were used to estimate the carbon footprint of the borough's current source-segregation system. Modelled full and partial co-mingling scenarios were then used to estimate carbon emissions from these proposed waste management approaches. The greenhouse gas emissions from the entire waste management system at the Royal Borough of Kingston upon Thames were 12,347 t CO2e for the source-segregated scenario, and 11,907 t CO2e for the partial co-mingled model. These emissions amount to 203.26 kg CO2e t(-1) and 196.02 kg CO2e t(-1) municipal solid waste for source-segregated and partial co-mingled, respectively. The change from a source-segregation fleet to a partial co-mingling fleet reduced the emissions, at least partly owing to a change in the number and type of vehicles. © The Author(s) 2016.
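As a rough cross-check of the figures quoted in this abstract, the scenario totals and per-tonne intensities together imply the borough handled roughly 60,700 t of municipal solid waste; the tonnage below is derived from the abstract's own numbers, not reported by the study:

```python
# Totals (t CO2e) and intensities (kg CO2e per tonne MSW) quoted in the abstract.
scenarios = {
    "source-segregated": (12_347, 203.26),
    "partial co-mingled": (11_907, 196.02),
}

implied_msw = {}
for name, (total_t_co2e, kg_per_t) in scenarios.items():
    # total emissions in kg divided by kg-per-tonne gives tonnes of waste handled
    implied_msw[name] = total_t_co2e * 1000 / kg_per_t
    print(f"{name}: ~{implied_msw[name]:,.0f} t MSW")
```

The two implied tonnages agree to within about one tonne, so the totals and per-tonne intensities in the abstract are mutually consistent.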
Visual hallucinatory syndromes and the anatomy of the visual brain.
Santhouse, A M; Howard, R J; ffytche, D H
2000-10-01
We have set out to identify phenomenological correlates of cerebral functional architecture within Charles Bonnet syndrome (CBS) hallucinations by looking for associations between specific hallucination categories. Thirty-four CBS patients were examined with a structured interview/questionnaire to establish the presence of 28 different pathological visual experiences. Associations between categories of pathological experience were investigated by an exploratory factor analysis. Twelve of the pathological experiences partitioned into three segregated syndromic clusters. The first cluster consisted of hallucinations of extended landscape scenes and small figures in costumes with hats; the second, hallucinations of grotesque, disembodied and distorted faces with prominent eyes and teeth; and the third, visual perseveration and delayed palinopsia. The three visual psycho-syndromes mirror the segregation of hierarchical visual pathways into streams and suggest a novel theoretical framework for future research into the pathophysiology of neuropsychiatric syndromes.
Scarbel, Lucie; Beautemps, Denis; Schwartz, Jean-Luc; Sato, Marc
2014-01-01
One classical argument in favor of a functional role of the motor system in speech perception comes from the close-shadowing task, in which a subject has to identify and to repeat as quickly as possible an auditory speech stimulus. The fact that close-shadowing can occur very rapidly and much faster than manual identification of the speech target is taken to suggest that perceptually induced speech representations are already shaped in a motor-compatible format. Another argument is provided by audiovisual interactions, often interpreted as referring to a multisensory-motor framework. In this study, we attempted to combine these two paradigms by testing whether the visual modality could speed motor response in a close-shadowing task. To this aim, both oral and manual responses were evaluated during the perception of auditory and audiovisual speech stimuli, clear or embedded in white noise. Overall, oral responses were faster than manual ones, but it also appeared that they were less accurate in noise, which suggests that motor representations evoked by the speech input could be rough at a first processing stage. In the presence of acoustic noise, the audiovisual modality led to both faster and more accurate responses than the auditory modality. No interaction was, however, observed between modality and response. Altogether, these results are interpreted within a two-stage sensory-motor framework, in which the auditory and visual streams are integrated together and with internally generated motor representations before a final decision may be available. PMID:25009512
Effect of attentional load on audiovisual speech perception: evidence from ERPs
Alsius, Agnès; Möttönen, Riikka; Sams, Mikko E.; Soto-Faraco, Salvador; Tiippana, Kaisa
2014-01-01
Seeing articulatory movements influences perception of auditory speech. This is often reflected in a shortened latency of auditory event-related potentials (ERPs) generated in the auditory cortex. The present study addressed whether this early neural correlate of audiovisual interaction is modulated by attention. We recorded ERPs in 15 subjects while they were presented with auditory, visual, and audiovisual spoken syllables. Audiovisual stimuli consisted of incongruent auditory and visual components known to elicit a McGurk effect, i.e., a visually driven alteration in the auditory speech percept. In a Dual task condition, participants were asked to identify spoken syllables whilst monitoring a rapid visual stream of pictures for targets, i.e., they had to divide their attention. In a Single task condition, participants identified the syllables without any other tasks, i.e., they were asked to ignore the pictures and focus their attention fully on the spoken syllables. The McGurk effect was weaker in the Dual task than in the Single task condition, indicating an effect of attentional load on audiovisual speech perception. Early auditory ERP components, N1 and P2, peaked earlier to audiovisual stimuli than to auditory stimuli when attention was fully focused on syllables, indicating neurophysiological audiovisual interaction. This latency decrement was reduced when attention was loaded, suggesting that attention influences early neural processing of audiovisual speech. We conclude that reduced attention weakens the interaction between vision and audition in speech. PMID:25076922
Visual Input Enhances Selective Speech Envelope Tracking in Auditory Cortex at a ‘Cocktail Party’
Golumbic, Elana Zion; Cogan, Gregory B.; Schroeder, Charles E.; Poeppel, David
2013-01-01
Our ability to selectively attend to one auditory signal amidst competing input streams, epitomized by the ‘Cocktail Party’ problem, continues to stimulate research from various approaches. How this demanding perceptual feat is achieved from a neural systems perspective remains unclear and controversial. It is well established that neural responses to attended stimuli are enhanced compared to responses to ignored ones, but responses to ignored stimuli are nonetheless highly significant, leading to interference in performance. We investigated whether congruent visual input of an attended speaker enhances cortical selectivity in auditory cortex, leading to diminished representation of ignored stimuli. We recorded magnetoencephalographic (MEG) signals from human participants as they attended to segments of natural continuous speech. Using two complementary methods of quantifying the neural response to speech, we found that viewing a speaker’s face enhances the capacity of auditory cortex to track the temporal speech envelope of that speaker. This mechanism was most effective in a ‘Cocktail Party’ setting, promoting preferential tracking of the attended speaker, whereas without visual input no significant attentional modulation was observed. These neurophysiological results underscore the importance of visual input in resolving perceptual ambiguity in a noisy environment. Since visual cues in speech precede the associated auditory signals, they likely serve a predictive role in facilitating auditory processing of speech, perhaps by directing attentional resources to appropriate points in time when to-be-attended acoustic input is expected to arrive. PMID:23345218
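A common way to quantify this kind of speech-envelope tracking (a generic illustration, not the authors' analysis pipeline) is to extract the broadband amplitude envelope of the audio via the analytic signal and correlate it with a neural channel. The sketch below simulates both signals; the sample rate, modulation rate, and noise level are toy assumptions:

```python
import numpy as np
from scipy.signal import hilbert

rng = np.random.default_rng(0)
fs = 100                       # Hz; sample rate after downsampling (assumed)
t = np.arange(0, 10, 1 / fs)   # 10 s of signal

# Stand-in "speech": a 30 Hz carrier modulated at a syllable-like ~4 Hz rate.
modulation = 1 + np.sin(2 * np.pi * 4 * t)
audio = modulation * np.sin(2 * np.pi * 30 * t)

# Broadband amplitude envelope from the analytic signal.
envelope = np.abs(hilbert(audio))

# Simulated neural channel: the envelope buried in sensor noise.
neural = modulation + rng.normal(0.0, 0.5, size=t.size)

# Zero-lag Pearson correlation as a simple envelope-tracking index.
r = np.corrcoef(envelope, neural)[0, 1]
print(f"tracking index r = {r:.2f}")
```

Real MEG analyses typically band-limit the envelope and test correlations at multiple neural lags, but the core quantity is this envelope-to-response association.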
Poliva, Oren
2017-01-01
In the brain of primates, the auditory cortex connects with the frontal lobe via the temporal pole (auditory ventral stream; AVS) and via the inferior parietal lobe (auditory dorsal stream; ADS). The AVS is responsible for sound recognition, and the ADS for sound-localization, voice detection and integration of calls with faces. I propose that the primary role of the ADS in non-human primates is the detection and response to contact calls. These calls are exchanged between tribe members (e.g., mother-offspring) and are used for monitoring location. Detection of contact calls occurs by the ADS identifying a voice, localizing it, and verifying that the corresponding face is out of sight. Once a contact call is detected, the primate produces a contact call in return via descending connections from the frontal lobe to a network of limbic and brainstem regions. Because the ADS of present day humans also performs speech production, I further propose an evolutionary course for the transition from contact call exchange to an early form of speech. In accordance with this model, structural changes to the ADS endowed early members of the genus Homo with partial vocal control. This development was beneficial as it enabled offspring to modify their contact calls with intonations for signaling high or low levels of distress to their mother. Eventually, individuals were capable of participating in yes-no question-answer conversations. In these conversations the offspring emitted a low-level distress call for inquiring about the safety of objects (e.g., food), and his/her mother responded with a high- or low-level distress call to signal approval or disapproval of the interaction. Gradually, the ADS and its connections with brainstem motor regions became more robust and vocal control became more volitional. Speech emerged once vocal control was sufficient for inventing novel calls. PMID:28928931
Deike, Susann; Deliano, Matthias; Brechmann, André
2016-10-01
One hypothesis concerning the neural underpinnings of auditory streaming states that frequency tuning of tonotopically organized neurons in primary auditory fields, in combination with physiological forward suppression, is necessary for the separation of representations of high-frequency A and low-frequency B tones. The extent of spatial overlap between the tonotopic activations of A and B tones is thought to underlie the perceptual organization of streaming sequences into one coherent or two separate streams. The present study attempts to interfere with these mechanisms by transcranial direct current stimulation (tDCS) and to probe behavioral outcomes reflecting the perception of ABAB streaming sequences. We hypothesized that tDCS, by modulating cortical excitability, causes a change in the separateness of the representations of A and B tones, which leads to a change in the proportions of one-stream and two-stream percepts. To test this, 22 subjects were presented with ambiguous ABAB sequences of three different frequency separations (∆F) and had to decide on their current percept after receiving sham, anodal, or cathodal tDCS over the left auditory cortex. We could confirm our hypothesis at the most ambiguous ∆F condition of 6 semitones. For anodal compared with sham and cathodal stimulation, we found a significant decrease in the proportion of two-stream perception and an increase in the proportion of one-stream perception. The results demonstrate the feasibility of using tDCS to probe mechanisms underlying auditory streaming through the use of various behavioral measures. Moreover, this approach allows one to probe the functions of auditory regions and their interactions with other processing stages. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Zeamer, Charlotte; Fox Tree, Jean E.
2013-01-01
Literature on auditory distraction has generally focused on the effects of particular kinds of sounds on attention to target stimuli. In support of extensive previous findings that have demonstrated the special role of language as an auditory distractor, we found that a concurrent speech stream impaired recall of a short lecture, especially for…
A Bayesian Framework for Word Segmentation: Exploring the Effects of Context
ERIC Educational Resources Information Center
Goldwater, Sharon; Griffiths, Thomas L.; Johnson, Mark
2009-01-01
Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. "Science," 274, 1926-1928], there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we…
Christiansen, Morten H.; Onnis, Luca; Hockema, Stephen A.
2009-01-01
When learning language young children are faced with many seemingly formidable challenges, including discovering words embedded in a continuous stream of sounds and determining what role these words play in syntactic constructions. We suggest that knowledge of phoneme distributions may play a crucial part in helping children segment words and determine their lexical category, and propose an integrated model of how children might go from unsegmented speech to lexical categories. We corroborated this theoretical model using a two-stage computational analysis of a large corpus of English child-directed speech. First, we used transition probabilities between phonemes to find words in unsegmented speech. Second, we used distributional information about word edges—the beginning and ending phonemes of words—to predict whether the segmented words from the first stage were nouns, verbs, or something else. The results indicate that discovering lexical units and their associated syntactic category in child-directed speech is possible by attending to the statistics of single phoneme transitions and word-initial and final phonemes. Thus, we suggest that a core computational principle in language acquisition is that the same source of information is used to learn about different aspects of linguistic structure. PMID:19371361
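The first stage of such models rests on one statistic: the transitional probability P(b|a) = count(ab) / count(a) between adjacent units, with word boundaries posited where that probability dips to a local minimum. The sketch below illustrates the statistic on a toy syllable stream; it is a minimal illustration of the idea, not the authors' corpus model, and the three nonsense words are invented for the example:

```python
from collections import defaultdict

def transition_probs(stream):
    """Estimate P(next | current) for each adjacent pair in a sequence."""
    pair_counts, unit_counts = defaultdict(int), defaultdict(int)
    for a, b in zip(stream, stream[1:]):
        pair_counts[(a, b)] += 1
        unit_counts[a] += 1
    return {(a, b): n / unit_counts[a] for (a, b), n in pair_counts.items()}

def segment(stream, tp):
    """Posit a word boundary wherever the transitional probability
    is a strict local minimum relative to its neighbors."""
    probs = [tp.get(pair, 0.0) for pair in zip(stream, stream[1:])]
    words, current = [], [stream[0]]
    for i in range(1, len(stream)):
        prev_p = probs[i - 2] if i >= 2 else 1.0
        next_p = probs[i] if i < len(probs) else 1.0
        if probs[i - 1] < prev_p and probs[i - 1] < next_p:
            words.append(current)
            current = []
        current.append(stream[i])
    words.append(current)
    return words

# Toy lexicon of three nonsense words, concatenated without pauses.
lexicon = [["ba", "do", "gu"], ["ti", "ku", "bo"], ["pa", "mi", "la"]]
order = [0, 1, 2, 1, 0, 2, 2, 0, 1] * 8          # word sequence
stream = [syll for i in order for syll in lexicon[i]]

tp = transition_probs(stream)
recovered = ["".join(word) for word in segment(stream, tp)]
print(recovered[:4])  # → ['badogu', 'tikubo', 'pamila', 'tikubo']
```

In this toy stream the within-word probabilities are exactly 1.0 and the across-word probabilities are not, so every boundary is a strict minimum; in real child-directed speech the across-word transitions are merely lower on average, which is why the local-minimum cue is noisier and is combined with other cues such as word-edge phoneme distributions.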
Subcortical processing of speech regularities underlies reading and music aptitude in children
2011-01-01
Background: Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. Methods: We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Results: Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as to auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. Conclusions: These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input.
Definition of common biological underpinnings for music and reading supports the usefulness of music for promoting child literacy, with the potential to improve reading remediation. PMID:22005291
Neural correlates of phonetic convergence and speech imitation
Garnier, Maëva; Lamalle, Laurent; Sato, Marc
2013-01-01
Speakers unconsciously tend to mimic their interlocutor's speech during communicative interaction. This study aims at examining the neural correlates of phonetic convergence and deliberate imitation, in order to explore whether imitation of phonetic features, deliberate, or unconscious, might reflect a sensory-motor recalibration process. Sixteen participants listened to vowels with pitch varying around the average pitch of their own voice, and then produced the identified vowels, while their speech was recorded and their brain activity was imaged using fMRI. Three degrees and types of imitation were compared (unconscious, deliberate, and inhibited) using a go-nogo paradigm, which enabled the comparison of brain activations during the whole imitation process, its active perception step, and its production. Speakers followed the pitch of voices they were exposed to, even unconsciously, without being instructed to do so. After being informed about this phenomenon, 14 participants were able to inhibit it, at least partially. The results of whole brain and ROI analyses support the fact that both deliberate and unconscious imitations are based on similar neural mechanisms and networks, involving regions of the dorsal stream, during both perception and production steps of the imitation process. While no significant difference in brain activation was found between unconscious and deliberate imitations, the degree of imitation, however, appears to be determined by processes occurring during the perception step. Four regions of the dorsal stream: bilateral auditory cortex, bilateral supramarginal gyrus (SMG), and left Wernicke's area, indeed showed an activity that correlated significantly with the degree of imitation during the perception step. PMID:24062704
Getzmann, Stephan; Jasny, Julian; Falkenstein, Michael
2017-02-01
Verbal communication in a "cocktail-party situation" is a major challenge for the auditory system. In particular, changes in target speaker usually result in poorer speech perception. Here, we investigated whether speech cues indicating a subsequent change in target speaker reduce the costs of switching in younger and older adults. We employed event-related potential (ERP) measures and a speech perception task, in which sequences of short words were simultaneously presented by four speakers. Changes in target speaker were either unpredictable or semantically cued by a word within the target stream. Cued changes produced a smaller decline in performance than uncued changes in both age groups. The ERP analysis revealed shorter latencies in the change-related N400 and late positive complex (LPC) after cued changes, suggesting an acceleration in context updating and attention switching. Thus, both younger and older listeners used semantic cues to prepare changes in speaker setting. Copyright © 2016 Elsevier Inc. All rights reserved.
All words are not created equal: Expectations about word length guide infant statistical learning
Lew-Williams, Casey; Saffran, Jenny R.
2011-01-01
Infants have been described as ‘statistical learners’ capable of extracting structure (such as words) from patterned input (such as language). Here, we investigated whether prior knowledge influences how infants track transitional probabilities in word segmentation tasks. Are infants biased by prior experience when engaging in sequential statistical learning? In a laboratory simulation of learning across time, we exposed 9- and 10-month-old infants to a list of either bisyllabic or trisyllabic nonsense words, followed by a pause-free speech stream composed of a different set of bisyllabic or trisyllabic nonsense words. Listening times revealed successful segmentation of words from fluent speech only when words were uniformly bisyllabic or trisyllabic throughout both phases of the experiment. Hearing trisyllabic words during the pre-exposure phase derailed infants’ abilities to segment speech into bisyllabic words, and vice versa. We conclude that prior knowledge about word length equips infants with perceptual expectations that facilitate efficient processing of subsequent language input. PMID:22088408
Loutrari, Ariadne; Lorch, Marjorie Perlman
2017-07-01
We present a follow-up study on the case of a Greek amusic adult, B.Z., whose impaired performance on scale, contour, interval, and meter was reported by Paraskevopoulos, Tsapkini, and Peretz in 2010, employing a culturally tailored version of the Montreal Battery of Evaluation of Amusia. In the present study, we administered a novel set of perceptual judgement tasks designed to investigate the ability to appreciate holistic prosodic aspects of 'expressiveness' and emotion in phrase-length music and speech stimuli. Our results show that, although diagnosed as a congenital amusic, B.Z. scored as well as healthy controls (N=24) on judging 'expressiveness' and emotional prosody in both speech and music stimuli. These findings suggest that the ability to make perceptual judgements about such prosodic qualities may be preserved in individuals who demonstrate difficulties perceiving basic musical features such as melody or rhythm. B.Z.'s case yields new insights into amusia and the processing of speech and music prosody through a holistic approach. The employment of novel stimuli with relatively few non-naturalistic manipulations, as developed for this study, may be a useful tool for revealing unexplored aspects of music and speech cognition and offer the possibility to further the investigation of the perception of acoustic streams in more authentic auditory conditions. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Statistical Learning in a Natural Language by 8-Month-Old Infants
Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.
2013-01-01
Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants’ ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition. PMID:19489896
Navy Electroplating Pollution Control Technology Assessment Manual.
1984-02-01
quality. Dummying of chromium baths is used in the special case where a high cathode-to-anode area ratio has resulted in build-up of trivalent chromium (Cr3+)...Dummying with a high anode-to-cathode area ratio can be used to reoxidize the trivalent to hexavalent chromium (Cr6+). Proper scheduling of work can...unit processes: chromium reduction (if needed) of segregated chromium waste streams to reduce the chromium from its hexavalent form to the trivalent
Droplet-Based Segregation and Extraction of Concentrated Samples
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buie, C R; Buckley, P; Hamilton, J
2007-02-23
Microfluidic analysis often requires sample concentration and separation techniques to isolate and detect analytes of interest. Complex or scarce samples may also require an orthogonal separation and detection method or off-chip analysis to confirm results. To perform these additional steps, the concentrated sample plug must be extracted from the primary microfluidic channel with minimal sample loss and dilution. We investigated two extraction techniques: injection of immiscible fluid droplets into the sample stream ("capping") and injection of the sample into an immiscible fluid stream ("extraction"). From our results we conclude that capping is the more effective partitioning technique. Furthermore, this functionality enables additional off-chip post-processing procedures such as DNA/RNA microarray analysis, real-time polymerase chain reaction (RT-PCR), and culture growth to validate chip performance.
Chakalov, Ivan; Draganova, Rossitza; Wollbrink, Andreas; Preissl, Hubert; Pantev, Christo
2012-06-20
The aim of the present study was to identify a specific neuronal correlate underlying the pre-attentive auditory stream segregation of subsequent sound patterns alternating in spectral or temporal cues. Fifteen participants with normal hearing were presented with series of two consecutive ABA auditory tone-triplet sequences, the initial triplets being the Adaptation sequence and the subsequent triplets being the Test sequence. In the first experiment, the frequency separation (delta-f) between A and B tones in the sequences was varied by 2, 4 and 10 semitones. In the second experiment, a constant delta-f of 6 semitones was maintained but the Inter-Stimulus Intervals (ISIs) between A and B tones were varied. Auditory evoked magnetic fields (AEFs) were recorded using magnetoencephalography (MEG). Participants watched a muted video of their choice and ignored the auditory stimuli. In a subsequent behavioral study, both MEG experiments were replicated to provide information about the participants' perceptual state. MEG measurements showed a significant increase in the amplitude of the B-tone related P1 component of the AEFs as delta-f increased. This effect was seen predominantly in the left hemisphere. A significant increase in the amplitude of the N1 component was only obtained for a Test sequence delta-f of 10 semitones with a prior Adaptation sequence of 2 semitones. This effect was more pronounced in the right hemisphere. The additional behavioral data indicated an increased probability of two-stream perception for delta-f = 4 and delta-f = 10 semitones with a preceding Adaptation sequence of 2 semitones. However, neither the neural activity nor the perception of the successive streaming sequences was modulated when the ISIs were alternated. Our MEG experiment demonstrated differences in the behavior of P1 and N1 components during the automatic segregation of sounds when induced by an initial Adaptation sequence.
The P1 component appeared enhanced in all Test conditions, thus demonstrating the preceding context effect, whereas N1 was specifically modulated only by large delta-f Test sequences induced by a preceding small delta-f Adaptation sequence. These results suggest that P1 and N1 components represent at least partially different systems underlying the neural representation of auditory streaming.
A Visual Cortical Network for Deriving Phonological Information from Intelligible Lip Movements.
Hauswald, Anne; Lithari, Chrysa; Collignon, Olivier; Leonardelli, Elisa; Weisz, Nathan
2018-05-07
Successful lip-reading requires a mapping from visual to phonological information [1]. Recently, visual and motor cortices have been implicated in tracking lip movements (e.g., [2]). It remains unclear, however, whether visuo-phonological mapping occurs already at the level of the visual cortex, that is, whether this structure tracks the acoustic signal in a functionally relevant manner. To elucidate this, we investigated how the cortex tracks (i.e., entrains to) absent acoustic speech signals carried by silent lip movements. Crucially, we contrasted the entrainment to unheard forward (intelligible) and backward (unintelligible) acoustic speech. We observed that the visual cortex exhibited stronger entrainment to the unheard forward acoustic speech envelope compared to the unheard backward acoustic speech envelope. Supporting the notion of a visuo-phonological mapping process, this forward-backward difference of occipital entrainment was not present for actually observed lip movements. Importantly, the respective occipital region received more top-down input, especially from left premotor, primary motor, and somatosensory regions and, to a lesser extent, also from posterior temporal cortex. Strikingly, across participants, the extent of top-down modulation of the visual cortex stemming from these regions partially correlated with the strength of entrainment to the absent acoustic forward speech envelope, but not to present forward lip movements. Our findings demonstrate that a distributed cortical network, including key dorsal stream auditory regions [3-5], influences how the visual cortex shows sensitivity to the intelligibility of speech while tracking silent lip movements. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.
Sheft, Stanley; Shafiro, Valeriy; Lorenzi, Christian; McMullen, Rachel; Farrell, Caitlin
2012-01-01
Objective: The frequency modulation (FM) of speech can convey linguistic information and also enhance speech-stream coherence and segmentation. Using a clinically oriented approach, the purpose of the present study was to examine the effects of age and hearing loss on the ability to discriminate between stochastic patterns of low-rate FM and to determine whether difficulties in speech perception experienced by older listeners relate to a deficit in this ability. Design: Data were collected from 18 normal-hearing young adults and 18 participants who were at least 60 years old, nine normal-hearing and nine with a mild-to-moderate sensorineural hearing loss. Using stochastic frequency modulators derived from 5-Hz lowpass noise applied to a 1-kHz carrier, discrimination thresholds were measured in terms of frequency excursion (ΔF), both in quiet and with a speech-babble masker present; stimulus duration; and signal-to-noise ratio (SNRFM) in the presence of a speech-babble masker. Speech perception ability was evaluated using Quick Speech-in-Noise (QuickSIN) sentences in four-talker babble. Results: Results showed a significant effect of age, but not of hearing loss among the older listeners, for FM discrimination conditions with masking present (ΔF and SNRFM). The effect of age was not significant for the FM measures based on stimulus duration. ΔF and SNRFM were also the two conditions for which performance was significantly correlated with listener age when controlling for the effect of hearing loss as measured by pure-tone average. With respect to speech-in-noise ability, results from the SNRFM condition were significantly correlated with QuickSIN performance. Conclusions: Results indicate that aging is associated with reduced ability to discriminate moderate-duration patterns of low-rate stochastic FM. 
Furthermore, the relationship between QuickSIN performance and the SNRFM thresholds suggests that the difficulty experienced by older listeners with speech-in-noise processing may in part relate to diminished ability to process slower fine-structure modulation at low sensation levels. Results thus suggest that clinical consideration of stochastic FM discrimination measures may offer a fuller picture of auditory processing abilities. PMID:22790319
Duke, Mila Morais; Wolfe, Jace; Schafer, Erin
2016-05-01
Cochlear implant (CI) recipients often experience difficulty understanding speech in noise and speech that originates from a distance. Many CI recipients also experience difficulty understanding speech originating from a television. Use of hearing assistance technology (HAT) may improve speech recognition in noise and for signals that originate from more than a few feet from the listener; however, there are no published studies evaluating the potential benefits of a wireless HAT designed to deliver audio signals from a television directly to a CI sound processor. The objective of this study was to compare speech recognition in quiet and in noise of CI recipients with the use of their CI alone and with the use of their CI and a wireless HAT (Cochlear Wireless TV Streamer). A two-way repeated measures design was used to evaluate performance differences obtained in quiet and in competing noise (65 dBA) with the CI sound processor alone and with the sound processor coupled to the Cochlear Wireless TV Streamer. Sixteen users of Cochlear Nucleus 24 Freedom, CI512, and CI422 implants were included in the study. Participants were evaluated in four conditions including use of the sound processor alone and use of the sound processor with the wireless streamer in quiet and in the presence of competing noise at 65 dBA. Speech recognition was evaluated in each condition with two full lists of Computer-Assisted Speech Perception Testing and Training Sentence-Level Test sentences presented from a light-emitting diode television. Speech recognition in noise was significantly better with use of the wireless streamer compared to participants' performance with their CI sound processor alone. There was also a nonsignificant trend toward better performance in quiet with use of the TV Streamer. Performance was significantly poorer when evaluated in noise compared to performance in quiet when the TV Streamer was not used. 
Use of the Cochlear Wireless TV Streamer, designed to stream audio from a television directly to a CI sound processor, provides better speech recognition in quiet and in noise when compared to performance obtained with use of the CI sound processor alone. American Academy of Audiology.
Ahrens, Merle-Marie; Veniero, Domenica; Gross, Joachim; Harvey, Monika; Thut, Gregor
2015-01-01
Many behaviourally relevant sensory events such as motion stimuli and speech have an intrinsic spatio-temporal structure. This will engage intentional and most likely unintentional (automatic) prediction mechanisms enhancing the perception of upcoming stimuli in the event stream. Here we sought to probe the anticipatory processes that are automatically driven by rhythmic input streams in terms of their spatial and temporal components. To this end, we employed an apparent visual motion paradigm testing the effects of pre-target motion on lateralized visual target discrimination. The motion stimuli either moved towards or away from peripheral target positions (valid vs. invalid spatial motion cueing) at a rhythmic or arrhythmic pace (valid vs. invalid temporal motion cueing). Crucially, we emphasized automatic motion-induced anticipatory processes by rendering the motion stimuli non-predictive of upcoming target position (by design) and task-irrelevant (by instruction), and by creating instead endogenous (orthogonal) expectations using symbolic cueing. Our data revealed that the apparent motion cues automatically engaged both spatial and temporal anticipatory processes, but that these processes were dissociated. We further found evidence for lateralisation of anticipatory temporal but not spatial processes. This indicates that distinct mechanisms may drive automatic spatial and temporal extrapolation of upcoming events from rhythmic event streams. This contrasts with previous findings that instead suggest an interaction between spatial and temporal attention processes when endogenously driven. Our results further highlight the need for isolating intentional from unintentional processes for better understanding the various anticipatory mechanisms engaged in processing behaviourally relevant stimuli with predictable spatio-temporal structure such as motion and speech. PMID:26623650
Allen, Thomas E; Anderson, Melissa L
2010-01-01
This article investigated to what extent age, use of a cochlear implant, parental hearing status, and use of sign in the home determine language of instruction for profoundly deaf children. Categorical data from 8,325 profoundly deaf students from the 2008 Annual Survey of Deaf and Hard-of-Hearing Children and Youth were analyzed using chi-square automated interaction detector, a stepwise analytic procedure that allows the assessment of higher order interactions among categorical variables. Results indicated that all characteristics were significantly related to classroom communication modality. Although younger and older students demonstrated a different distribution of communication modality, for both younger and older students, cochlear implantation had the greatest effect on differentiating students into communication modalities, yielding greater gains in the speech-only category for implanted students. For all subgroups defined by age and implantation status, the use of sign at home further segregated the sample into communication modality subgroups, reducing the likelihood of speech only and increasing the placement of students into signing classroom settings. Implications for future research in the field of deaf education are discussed.
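The core of the stepwise chi-square procedure used above can be illustrated with a toy sketch: a single CHAID-style split that selects, among several categorical predictors, the one most strongly associated with the outcome by chi-square p-value. This is not the authors' analysis; the data and variable names are invented, and full CHAID additionally merges categories and applies Bonferroni-adjusted p-values, which this sketch omits.

```python
import numpy as np
from scipy.stats import chi2_contingency

def crosstab(x, y):
    """Contingency table of two categorical arrays."""
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    table = np.zeros((len(xs), len(ys)), dtype=int)
    np.add.at(table, (xi, yi), 1)
    return table

def best_split(predictors, outcome):
    """CHAID-style selection: the predictor whose contingency table
    with the outcome has the smallest chi-square p-value."""
    pvals = {name: chi2_contingency(crosstab(vals, outcome))[1]
             for name, vals in predictors.items()}
    return min(pvals, key=pvals.get)

# Invented data: modality is driven mainly by implant status.
rng = np.random.default_rng(2)
n = 400
implant = rng.integers(0, 2, n)        # 0 = no CI, 1 = CI
sign_at_home = rng.integers(0, 2, n)   # independent of outcome here
modality = np.where(rng.random(n) < 0.2 + 0.6 * implant, "speech", "sign")

chosen = best_split({"implant": implant, "sign_at_home": sign_at_home},
                    modality)
```

Applied recursively to each resulting subgroup, this selection rule yields the stepwise tree of interactions that the survey analysis reports.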
NASA Astrophysics Data System (ADS)
Aaronson, Neil L.
This dissertation deals with questions important to the problem of human sound source localization in rooms, starting with perceptual studies and moving on to physical measurements made in rooms. In Chapter 1, a perceptual study is performed relevant to a specific phenomenon: the effect of speech reflections occurring in the front-back dimension and the ability of humans to segregate them from unreflected speech. Distracters were presented from the same source as the target speech, a loudspeaker directly in front of the listener, and also from a loudspeaker directly behind the listener, delayed relative to the front loudspeaker. Steps were taken to minimize the contributions of binaural difference cues. For all delays within ±32 ms, a release from informational masking of about 2 dB occurred. This suggested that human listeners are able to segregate speech sources based on spatial cues, even with minimal binaural cues. In moving on to physical measurements in rooms, a method was sought for simultaneous measurement of room characteristics such as impulse response (IR) and reverberation time (RT60), and binaural parameters such as interaural time difference (ITD), interaural level difference (ILD), and the interaural cross-correlation function and coherence. Chapter 2 involves investigations into the usefulness of maximum length sequences (MLS) for these purposes. Comparisons to random telegraph noise (RTN) show that MLS performs better in the measurement of stationary and room transfer functions, IR, and RT60 by an order of magnitude in RMS percent error, even after Wiener filtering and exponential time-domain filtering have improved the accuracy of RTN measurements. Measurements were taken in real rooms in an effort to understand how the reverberant characteristics of rooms affect binaural parameters important to sound source localization. Chapter 3 deals with interaural coherence, a parameter important for localization and perception of auditory source width.
MLS were used to measure waveform and envelope coherences in two rooms for various source distances and 0° azimuth through a head-and-torso simulator (KEMAR). A relationship is sought that relates these two types of coherence, since envelope coherence, while an important quantity, is generally less accessible than waveform coherence. A power law relationship is shown to exist between the two that works well within and across bands, for any source distance, and is robust to reverberant conditions of the room. Measurements of ITD, ILD, and coherence in rooms give insight into the way rooms affect these parameters, and in turn, the ability of listeners to localize sounds in rooms. Such measurements, along with room properties, are made and analyzed using MLS methods in Chapter 4. It was found that the pinnae cause incoherence for sound sources incident between 30° and 90°. In human listeners, this does not seem to adversely affect performance in lateralization experiments. The cause of poor coherence in rooms was studied as part of Chapter 4 as well. It was found that rooms affect coherence by introducing variance into the ITD spectra within the bands in which it is measured. A mathematical model to predict the interaural coherence within a band given the standard deviation of the ITD spectrum and the center frequency of the band gives an exponential relationship. This is found to work well in predicting measured coherence given ITD spectrum variance. The pinnae seem to affect the ITD spectrum in a similar way at incident sound angles for which coherence is poor in an anechoic environment.
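The MLS technique evaluated in Chapter 2 rests on a simple property: the circular autocorrelation of a ±1 maximum length sequence is nearly a delta function, so the impulse response can be recovered by circularly cross-correlating the recording with the excitation. A minimal sketch, using a synthetic "room" rather than a real measurement (`scipy.signal.max_len_seq` generates the sequence):

```python
import numpy as np
from scipy.signal import max_len_seq

# A +/-1 maximum length sequence of period N = 2**10 - 1 = 1023.
seq = max_len_seq(10)[0].astype(float) * 2.0 - 1.0
N = len(seq)

# Synthetic "room": a short impulse response, applied as a circular
# convolution to stand in for a steady-state periodic measurement.
h_true = np.array([1.0, 0.5, 0.25, 0.125])
h_pad = np.zeros(N)
h_pad[:h_true.size] = h_true
recorded = np.real(np.fft.ifft(np.fft.fft(seq) * np.fft.fft(h_pad)))

# Circular cross-correlation of the recording with the MLS.  Since the
# MLS autocorrelation is N at lag 0 and -1 elsewhere, i.e. (N+1)*delta - 1,
# corr[k] = (N+1)*h[k] - sum(h); dividing by N+1 recovers h up to a tiny
# DC offset of sum(h)/(N+1).
corr = np.real(np.fft.ifft(np.fft.fft(recorded) * np.conj(np.fft.fft(seq))))
h_est = corr / (N + 1)
```

The same deconvolution applied to a real loudspeaker-to-microphone recording yields the room IR, from which RT60 and the binaural parameters discussed below can be derived.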
NASA Astrophysics Data System (ADS)
Hasan, Taufiq; Bořil, Hynek; Sangwan, Abhijeet; L Hansen, John H.
2013-12-01
The ability to detect and organize 'hot spots' representing areas of excitement within video streams is a challenging research problem when techniques rely exclusively on video content. A generic method for sports video highlight selection is presented in this study which leverages both video/image structure and audio/speech properties. Processing begins by partitioning the video into small segments and extracting several multi-modal features from each segment. Excitability is computed based on the likelihood of the segmental features residing in certain regions of their joint probability density function space which are considered both exciting and rare. The proposed measure is used to rank order the partitioned segments to compress the overall video sequence and produce a contiguous set of highlights. Experiments are performed on baseball videos, using excitement assessment of the commentators' speech, audio energy, slow-motion replay, scene-cut density, and motion activity as features. A detailed analysis of the correlation between user excitability and various speech production parameters is conducted, and an effective scheme is designed to estimate the excitement level of the commentators' speech from the sports videos. Subjective evaluation of excitability and ranking of video segments demonstrate a higher correlation with the proposed measure compared to well-established techniques, indicating the effectiveness of the overall approach.
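The rarity side of this excitability measure can be sketched as follows: fit the joint density of the per-segment features and score each segment by its negative log-likelihood, so segments in low-density regions rank highest. The features and data below are invented, and this toy omits the paper's additional criterion that the low-density region must also be an "exciting" one:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Invented per-segment features, e.g. [audio energy, motion activity].
# Segments 1-99 are ordinary; segment 0 is a rare, high-energy outlier.
ordinary = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(99, 2))
features = np.vstack([[3.0, 3.0], ordinary])

# Estimate the joint feature density; rarity = negative log-likelihood.
kde = gaussian_kde(features.T)
rarity = -np.log(kde(features.T))

# Rank segments from most to least rare ("exciting" candidates first).
ranking = np.argsort(rarity)[::-1]
```

Truncating this ranked list then compresses the full video into a contiguous set of candidate highlights, as described above.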
NASA Astrophysics Data System (ADS)
Imai, Emiko; Katagiri, Yoshitada; Seki, Keiko; Kawamata, Toshio
2011-06-01
We present a neural model of the production of modulated speech streams in the brain, referred to as prosody, which identifies the limbic structure essential for producing prosody both linguistically and emotionally. This model suggests that activating the fundamental brain, including monoamine neurons at the basal ganglia, could help patients whose prosodic disorders arise from functional defects of the fundamental brain to overcome their speech problems. To establish effective clinical treatment for such prosodic disorders, we examine how sounds affect this fundamental activity using electroencephalographic measurements. Across examinations with various melodious sounds, we found that some melodies with lilting rhythms reliably give rise to fast alpha rhythms in the electroencephalogram, which reflect fundamental brain activity, without evoking any negative feelings.
Applying Spatial Audio to Human Interfaces: 25 Years of NASA Experience
NASA Technical Reports Server (NTRS)
Begault, Durand R.; Wenzel, Elizabeth M.; Godfrey, Martine; Miller, Joel D.; Anderson, Mark R.
2010-01-01
From the perspective of human factors engineering, the inclusion of spatial audio within a human-machine interface is advantageous from several perspectives. Demonstrated benefits include the ability to monitor multiple streams of speech and non-speech warning tones using a cocktail party advantage, and for aurally-guided visual search. Other potential benefits include the spatial coordination and interaction of multimodal events, and evaluation of new communication technologies and alerting systems using virtual simulation. Many of these technologies were developed at NASA Ames Research Center, beginning in 1985. This paper reviews examples and describes the advantages of spatial sound in NASA-related technologies, including space operations, aeronautics, and search and rescue. The work has involved hardware and software development as well as basic and applied research.
Binaural speech processing in individuals with auditory neuropathy.
Rance, G; Ryan, M M; Carew, P; Corben, L A; Yiu, E; Tan, J; Delatycki, M B
2012-12-13
Auditory neuropathy disrupts the neural representation of sound and may therefore impair processes contingent upon inter-aural integration. The aims of this study were to investigate binaural auditory processing in individuals with axonal (Friedreich ataxia) and demyelinating (Charcot-Marie-Tooth disease type 1A) auditory neuropathy and to evaluate the relationship between the degree of auditory deficit and overall clinical severity in patients with neuropathic disorders. Twenty-three subjects with genetically confirmed Friedreich ataxia and 12 subjects with Charcot-Marie-Tooth disease type 1A underwent psychophysical evaluation of basic auditory processing (intensity discrimination/temporal resolution) and binaural speech perception assessment using the Listening in Spatialized Noise test. Age, gender and hearing-level-matched controls were also tested. Speech perception in noise for individuals with auditory neuropathy was abnormal for each listening condition, but was particularly affected in circumstances where binaural processing might have improved perception through spatial segregation. Ability to use spatial cues was correlated with temporal resolution suggesting that the binaural-processing deficit was the result of disordered representation of timing cues in the left and right auditory nerves. Spatial processing was also related to overall disease severity (as measured by the Friedreich Ataxia Rating Scale and Charcot-Marie-Tooth Neuropathy Score) suggesting that the degree of neural dysfunction in the auditory system accurately reflects generalized neuropathic changes. Measures of binaural speech processing show promise for application in the neurology clinic. In individuals with auditory neuropathy due to both axonal and demyelinating mechanisms the assessment provides a measure of functional hearing ability, a biomarker capable of tracking the natural history of progressive disease and a potential means of evaluating the effectiveness of interventions. 
Copyright © 2012 IBRO. Published by Elsevier Ltd. All rights reserved.
A Network Model of Observation and Imitation of Speech
Mashal, Nira; Solodkin, Ana; Dick, Anthony Steven; Chen, E. Elinor; Small, Steven L.
2012-01-01
Much evidence has now accumulated demonstrating and quantifying the extent of shared regional brain activation for observation and execution of speech. However, the nature of the actual networks that implement these functions, i.e., both the brain regions and the connections among them, and the similarities and differences across these networks has not been elucidated. The current study aims to characterize formally a network for observation and imitation of syllables in the healthy adult brain and to compare their structure and effective connectivity. Eleven healthy participants observed or imitated audiovisual syllables spoken by a human actor. We constructed four structural equation models to characterize the networks for observation and imitation in each of the two hemispheres. Our results show that the network models for observation and imitation comprise the same essential structure but differ in important ways from each other (in both hemispheres) based on connectivity. In particular, our results show that the connections from posterior superior temporal gyrus and sulcus to ventral premotor, ventral premotor to dorsal premotor, and dorsal premotor to primary motor cortex in the left hemisphere are stronger during imitation than during observation. The first two connections are implicated in a putative dorsal stream of speech perception, thought to involve translating auditory speech signals into motor representations. Thus, the current results suggest that flow of information during imitation, starting at the posterior superior temporal cortex and ending in the motor cortex, enhances input to the motor cortex in the service of speech execution. PMID:22470360
Why the Body Comes First: Effects of Experimenter Touch on Infants' Word Finding
ERIC Educational Resources Information Center
Seidl, Amanda; Tincoff, Ruth; Baker, Christopher; Cristia, Alejandrina
2015-01-01
The lexicon of 6-month-olds is comprised of names and body part words. Unlike names, body part words do not often occur in isolation in the input. This presents a puzzle: How have infants been able to pull out these words from the continuous stream of speech at such a young age? We hypothesize that caregivers' interactions directed at and on…
Akram, Sahar; Presacco, Alessandro; Simon, Jonathan Z.; Shamma, Shihab A.; Babadi, Behtash
2015-01-01
The underlying mechanism of how the human brain solves the cocktail party problem is largely unknown. Recent neuroimaging studies, however, suggest salient temporal correlations between the auditory neural response and the attended auditory object. Using magnetoencephalography (MEG) recordings of the neural responses of human subjects, we propose a decoding approach for tracking the attentional state while subjects are selectively listening to one of the two speech streams embedded in a competing-speaker environment. We develop a biophysically-inspired state-space model to account for the modulation of the neural response with respect to the attentional state of the listener. The constructed decoder is based on a maximum a posteriori (MAP) estimate of the state parameters via the Expectation Maximization (EM) algorithm. Using only the envelope of the two speech streams as covariates, the proposed decoder enables us to track the attentional state of the listener with a temporal resolution of the order of seconds, together with statistical confidence intervals. We evaluate the performance of the proposed model using numerical simulations and experimentally measured evoked MEG responses from the human brain. Our analysis reveals considerable performance gains provided by the state-space model in terms of temporal resolution, computational complexity and decoding accuracy. PMID:26436490
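As a greatly simplified stand-in for the biophysically inspired state-space decoder described above (which is fit by EM with MAP estimation), the idea of tracking a binary attentional state from envelope-tracking features can be sketched with a two-state hidden Markov model with Gaussian emissions, decoded by the forward-backward algorithm. All quantities below are synthetic, not the authors' model or data:

```python
import numpy as np

def forward_backward(obs, mu, sigma, p_stay=0.95):
    """Posterior P(state_t = k | all observations) for a 2-state HMM
    with Gaussian emissions N(mu[k], sigma) and sticky transitions."""
    T = len(obs)
    A = np.array([[p_stay, 1 - p_stay], [1 - p_stay, p_stay]])
    lik = np.exp(-0.5 * ((obs[:, None] - mu[None, :]) / sigma) ** 2)
    alpha = np.zeros((T, 2))
    beta = np.zeros((T, 2))
    alpha[0] = 0.5 * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                 # forward pass (scaled)
        alpha[t] = lik[t] * (alpha[t - 1] @ A)
        alpha[t] /= alpha[t].sum()
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):        # backward pass (scaled)
        beta[t] = A @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Synthetic session: attend speaker A for 30 s, then switch to B.  The
# observation stands in for a per-second neural-envelope correlation
# feature (positive when tracking A, negative when tracking B).
rng = np.random.default_rng(1)
obs = np.concatenate([rng.normal(1.0, 0.5, 30), rng.normal(-1.0, 0.5, 30)])
posterior = forward_backward(obs, mu=np.array([1.0, -1.0]), sigma=0.5)
decoded = posterior.argmax(axis=1)        # 0 = speaker A, 1 = speaker B
```

The per-time-step posterior plays the role of the paper's statistical confidence intervals: it quantifies how certain the decoder is about the attentional state at each second, not just which state is most likely.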
Processing reafferent and exafferent visual information for action and perception.
Reichenbach, Alexandra; Diedrichsen, Jörn
2015-01-01
A recent study suggests that reafferent hand-related visual information utilizes a privileged, attention-independent processing channel for motor control. This process was termed visuomotor binding to reflect its proposed function: linking visual reafferences to the corresponding motor control centers. Here, we ask whether the advantage of processing reafferent over exafferent visual information is a specific feature of the motor processing stream or whether the improved processing also benefits the perceptual processing stream. Human participants performed a bimanual reaching task in a cluttered visual display, and one of the visual hand cursors could be displaced laterally during the movement. We measured the rapid feedback responses of the motor system as well as matched perceptual judgments of which cursor was displaced. Perceptual judgments were either made by watching the visual scene without moving or made simultaneously to the reaching tasks, such that the perceptual processing stream could also profit from the specialized processing of reafferent information in the latter case. Our results demonstrate that perceptual judgments in the heavily cluttered visual environment were improved when performed based on reafferent information. Even in this case, however, the filtering capability of the perceptual processing stream suffered more from the increasing complexity of the visual scene than the motor processing stream. These findings suggest partly shared and partly segregated processing of reafferent information for vision for motor control versus vision for perception.
Chuen, Lorraine; Schutz, Michael
2016-07-01
An observer's inference that multimodal signals originate from a common underlying source facilitates cross-modal binding. This 'unity assumption' causes asynchronous auditory and visual speech streams to seem simultaneous (Vatakis & Spence, Perception & Psychophysics, 69(5), 744-756, 2007). Subsequent tests of non-speech stimuli such as musical and impact events found no evidence for the unity assumption, suggesting the effect is speech-specific (Vatakis & Spence, Acta Psychologica, 127(1), 12-23, 2008). However, the role of amplitude envelope (the changes in energy of a sound over time) was not previously appreciated within this paradigm. Here, we explore whether previous findings suggesting speech-specificity of the unity assumption were confounded by similarities in the amplitude envelopes of the contrasted auditory stimuli. Experiment 1 used natural events with clearly differentiated envelopes: single notes played on either a cello (bowing motion) or marimba (striking motion). Participants performed an unspeeded temporal order judgment task, viewing audio-visually matched (e.g., marimba audio with marimba video) and mismatched (e.g., cello audio with marimba video) versions of the stimuli at various stimulus onset asynchronies and indicating which modality was presented first. As predicted, participants were less sensitive to temporal order in matched conditions, demonstrating that the unity assumption can facilitate the perception of synchrony outside of speech stimuli. Results from Experiments 2 and 3 revealed that when spectral information was removed from the original auditory stimuli, amplitude envelope alone could not facilitate the influence of audiovisual unity. We propose that both amplitude envelope and spectral acoustic cues affect the percept of audiovisual unity, working in concert to help an observer determine when to integrate across modalities.
Single-sensor multispeaker listening with acoustic metamaterials
Xie, Yangbo; Tsai, Tsung-Han; Konneker, Adam; Popa, Bogdan-Ioan; Brady, David J.; Cummer, Steven A.
2015-01-01
Designing a “cocktail party listener” that functionally mimics the selective perception of a human auditory system has been pursued over the past decades. By exploiting acoustic metamaterials and compressive sensing, we present here a single-sensor listening device that separates simultaneous overlapping sounds from different sources. The device with a compact array of resonant metamaterials is demonstrated to distinguish three overlapping and independent sources with 96.67% correct audio recognition. Segregation of the audio signals is achieved using physical layer encoding without relying on source characteristics. This hardware approach to multichannel source separation can be applied to robust speech recognition and hearing aids and may be extended to other acoustic imaging and sensing applications. PMID:26261314
Buss, Emily; Bowdrie, Kristina
2017-01-01
Previous work has shown that masked-sentence recognition is particularly poor when the masker is composed of two competing talkers, a finding that is attributed to informational masking. Informational masking tends to be largest when the target and masker talkers are perceptually similar. Reductions in masking have been observed for a wide range of target and masker differences, including language: Performance is better when the target and masker talkers speak in different languages, compared with the same language. The present study evaluated normal-hearing adults’ sentence recognition in a two-talker masker as a function of the perceptual similarity between the target and each of the two masker streams. The target was English, and the maskers were composed of English, time-reversed English, or Dutch. These three masker types are known to vary in the informational masking they exert. The two talkers within the two-talker maskers were either congruent (e.g., both English) or incongruent (e.g., one English, one Dutch). As predicted, mean performance was worse for the congruent English masker than the congruent time-reversed English or congruent Dutch maskers. Incongruent two-talker maskers, with just one English masker stream, were only modestly less effective than the congruent English masker. This result indicates that two-talker masker effectiveness was determined predominantly by the one masker stream that was most perceptually similar to the target. Speech recognition in a single-talker masker differed only marginally between the English, Dutch, and time-reversed English masker types, suggesting that perceptual similarity may be more critical in a two-talker than a one-talker masker. PMID:29169315
Mapping the cortical representation of speech sounds in a syllable repetition task.
Markiewicz, Christopher J; Bohland, Jason W
2016-11-01
Speech repetition relies on a series of distributed cortical representations and functional pathways. A speaker must map auditory representations of incoming sounds onto learned speech items, maintain an accurate representation of those items in short-term memory, interface that representation with the motor output system, and fluently articulate the target sequence. A "dorsal stream" consisting of posterior temporal, inferior parietal and premotor regions is thought to mediate auditory-motor representations and transformations, but the nature and activation of these representations for different portions of speech repetition tasks remains unclear. Here we mapped the correlates of phonetic and/or phonological information related to the specific phonemes and syllables that were heard, remembered, and produced using a series of cortical searchlight multi-voxel pattern analyses trained on estimates of BOLD responses from individual trials. Based on responses linked to input events (auditory syllable presentation), predictive vowel-level information was found in the left inferior frontal sulcus, while syllable prediction revealed significant clusters in the left ventral premotor cortex and central sulcus and the left mid superior temporal sulcus. Responses linked to output events (the GO signal cueing overt production) revealed strong clusters of vowel-related information bilaterally in the mid to posterior superior temporal sulcus. For the prediction of onset and coda consonants, input-linked responses yielded distributed clusters in the superior temporal cortices, which were further informative for classifiers trained on output-linked responses. Output-linked responses in the Rolandic cortex made strong predictions for the syllables and consonants produced, but their predictive power was reduced for vowels. 
The results of this study provide a systematic survey of how cortical response patterns covary with the identity of speech sounds, which will help to constrain and guide theoretical models of speech perception, speech production, and phonological working memory. Copyright © 2016 Elsevier Inc. All rights reserved.
Dietrich, Susanne; Hertrich, Ingo; Ackermann, Hermann
2015-01-01
In many functional magnetic resonance imaging (fMRI) studies blind humans were found to show cross-modal reorganization engaging the visual system in non-visual tasks. For example, blind people can manage to understand (synthetic) spoken language at very high speaking rates up to ca. 20 syllables/s (syl/s). FMRI data showed that hemodynamic activation within right-hemispheric primary visual cortex (V1), bilateral pulvinar (Pv), and left-hemispheric supplementary motor area (pre-SMA) covaried with their capability of ultra-fast speech (16 syllables/s) comprehension. It has been suggested that right V1 plays an important role with respect to the perception of ultra-fast speech features, particularly the detection of syllable onsets. Furthermore, left pre-SMA seems to be an interface between these syllabic representations and the frontal speech processing and working memory network. So far, little is known about the networks linking V1 to Pv, auditory cortex (A1), and (mesio-) frontal areas. Dynamic causal modeling (DCM) was applied to investigate (i) the input structure from A1 and Pv toward right V1 and (ii) output from right V1 and A1 to left pre-SMA. As concerns the input, Pv was significantly connected to V1, in addition to A1, in blind participants, but not in sighted controls. Regarding the output, V1 was significantly connected to pre-SMA in blind individuals, and the strength of V1-SMA connectivity correlated with the performance of ultra-fast speech comprehension. By contrast, in sighted controls, who did not understand ultra-fast speech, pre-SMA received input from neither A1 nor V1. Taken together, right V1 might facilitate the “parsing” of the ultra-fast speech stream in blind subjects by receiving subcortical auditory input via the Pv (= secondary visual pathway) and transmitting this information toward contralateral pre-SMA. PMID:26148062
Fost, B A; Ferreri, C P
2015-03-01
The pH preferred and avoided by wild, adult brook trout Salvelinus fontinalis and brown trout Salmo trutta was examined in a series of laboratory tests using gradual and steep-gradient flow-through aquaria. The results were compared with those published for the observed segregation patterns of juvenile S. fontinalis and S. trutta in Pennsylvania streams. The adult S. trutta tested showed a preference for pH 4·0 while adult S. fontinalis did not prefer any pH within the range tested. Salmo trutta are not found in Pennsylvania streams with a base-flow pH < 5·8 which suggests that S. trutta prefer pH well above 4·0. Adult S. trutta displayed a lack of avoidance at pH below 5·0, as also reported earlier for juveniles. The avoidance pH of wild, adult S. fontinalis (between pH 5·5 and 6·0) and S. trutta (between pH 6·5 and 7·0) did not differ appreciably from earlier study results for the avoidance pH of juvenile S. fontinalis and S. trutta. A comparison of c.i. around these avoidance estimates indicates that avoidance pH is similar among adult S. fontinalis and S. trutta in this study. The limited overlap of c.i. for avoidance pH values for the two species, however, suggests that some S. trutta will display avoidance at a higher pH when S. fontinalis will not. The results of this study indicate that segregation patterns of adult S. fontinalis and S. trutta in Pennsylvania streams could be related to pH and that competition with S. trutta could be mediating the occurrence of S. fontinalis at some pH levels. © 2015 The Fisheries Society of the British Isles.
Knockdown of Dyslexia-Gene Dcdc2 Interferes with Speech Sound Discrimination in Continuous Streams.
Centanni, Tracy Michelle; Booker, Anne B; Chen, Fuyi; Sloan, Andrew M; Carraway, Ryan S; Rennaker, Robert L; LoTurco, Joseph J; Kilgard, Michael P
2016-04-27
Dyslexia is the most common developmental language disorder and is marked by deficits in reading and phonological awareness. One theory of dyslexia suggests that the phonological awareness deficit is due to abnormal auditory processing of speech sounds. Variants in DCDC2 and several other neural migration genes are associated with dyslexia and may contribute to auditory processing deficits. In the current study, we tested the hypothesis that RNAi suppression of Dcdc2 in rats causes abnormal cortical responses to sound and impaired speech sound discrimination. In the current study, rats were subjected in utero to RNA interference targeting of the gene Dcdc2 or a scrambled sequence. Primary auditory cortex (A1) responses were acquired from 11 rats (5 with Dcdc2 RNAi; DC-) before any behavioral training. A separate group of 8 rats (3 DC-) were trained on a variety of speech sound discrimination tasks, and auditory cortex responses were acquired following training. Dcdc2 RNAi nearly eliminated the ability of rats to identify specific speech sounds from a continuous train of speech sounds but did not impair performance during discrimination of isolated speech sounds. The neural responses to speech sounds in A1 were not degraded as a function of presentation rate before training. These results suggest that A1 is not directly involved in the impaired speech discrimination caused by Dcdc2 RNAi. This result contrasts earlier results using Kiaa0319 RNAi and suggests that different dyslexia genes may cause different deficits in the speech processing circuitry, which may explain differential responses to therapy. Although dyslexia is diagnosed through reading difficulty, there is a great deal of variation in the phenotypes of these individuals. The underlying neural and genetic mechanisms causing these differences are still widely debated. In the current study, we demonstrate that suppression of a candidate-dyslexia gene causes deficits on tasks of rapid stimulus processing. 
These animals also exhibited abnormal neural plasticity after training, which may be a mechanism for why some children with dyslexia do not respond to intervention. These results are in stark contrast to our previous work with a different candidate gene, which caused a different set of deficits. Our results shed some light on possible neural and genetic mechanisms causing heterogeneity in the dyslexic population. Copyright © 2016 the authors 0270-6474/16/364895-12$15.00/0.
A feedback model of figure-ground assignment.
Domijan, Drazen; Setić, Mia
2008-05-30
A computational model is proposed in order to explain how bottom-up and top-down signals are combined into a unified perception of figure and background. The model is based on the interaction between the ventral and the dorsal stream. The dorsal stream computes saliency based on boundary signals provided by the simple and the complex cortical cells. Output from the dorsal stream is projected to the surface network which serves as a blackboard on which the surface representation is formed. The surface network is a recurrent network which segregates different surfaces by assigning different firing rates to them. The figure is labeled by the maximal firing rate. Computer simulations showed that the model correctly assigns figural status to the surface with a smaller size, a greater contrast, convexity, surroundedness, horizontal-vertical orientation and a higher spatial frequency content. The simple gradient of activity in the dorsal stream enables the simulation of the new principles of the lower region and the top-bottom polarity. The model also explains how the exogenous attention and the endogenous attention may reverse the figural assignment. Due to the local excitation in the surface network, neural activity at the cued region will spread over the whole surface representation. Therefore, the model implements the object-based attentional selection.
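The core mechanism of the surface network described above (different surfaces settle at different firing rates under recurrent competition, and the figure is labeled by the maximal rate) can be sketched as a toy rate model. This is an illustrative reduction, not the authors' implementation; the inhibition constant, iteration count, and surface names are all invented:

```python
def assign_figure(saliency, steps=50, inhibition=0.5):
    """Toy recurrent competition between surface representations:
    each surface's firing rate is driven by its dorsal-stream saliency
    signal and suppressed by the summed rates of the other surfaces.
    The surface that settles at the highest rate is labeled the figure.
    All constants are invented for illustration."""
    rates = dict.fromkeys(saliency, 0.0)
    for _ in range(steps):  # iterate toward a steady state
        for s in rates:
            others = sum(r for t, r in rates.items() if t != s)
            rates[s] = max(0.0, saliency[s] - inhibition * others)
    return max(rates, key=rates.get), rates
```

With a small, high-contrast region given greater saliency than its surround (as the simulations in the abstract predict), the competition drives the surround's rate toward zero and the region is assigned figural status.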
Bidet-Caulet, Aurélie; Fischer, Catherine; Besle, Julien; Aguera, Pierre-Emmanuel; Giard, Marie-Helene; Bertrand, Olivier
2007-08-29
In noisy environments, we use auditory selective attention to actively ignore distracting sounds and select relevant information, as during a cocktail party to follow one particular conversation. The present electrophysiological study aims at deciphering the spatiotemporal organization of the effect of selective attention on the representation of concurrent sounds in the human auditory cortex. Sound onset asynchrony was manipulated to induce the segregation of two concurrent auditory streams. Each stream consisted of amplitude modulated tones at different carrier and modulation frequencies. Electrophysiological recordings were performed in epileptic patients with pharmacologically resistant partial epilepsy, implanted with depth electrodes in the temporal cortex. Patients were presented with the stimuli while they either performed an auditory distracting task or actively selected one of the two concurrent streams. Selective attention was found to affect steady-state responses in the primary auditory cortex, and transient and sustained evoked responses in secondary auditory areas. The results provide new insights on the neural mechanisms of auditory selective attention: stream selection during sound rivalry would be facilitated not only by enhancing the neural representation of relevant sounds, but also by reducing the representation of irrelevant information in the auditory cortex. Finally, they suggest a specialization of the left hemisphere in the attentional selection of fine-grained acoustic information.
Getzmann, Stephan; Lewald, Jörg; Falkenstein, Michael
2014-01-01
Speech understanding in complex and dynamic listening environments requires (a) auditory scene analysis, namely auditory object formation and segregation, and (b) allocation of the attentional focus to the talker of interest. There is evidence that pre-information is actively used to facilitate these two aspects of the so-called "cocktail-party" problem. Here, a simulated multi-talker scenario was combined with electroencephalography to study scene analysis and allocation of attention in young and middle-aged adults. Sequences of short words (combinations of brief company names and stock-price values) from four talkers at different locations were simultaneously presented, and the detection of target names and the discrimination between critical target values were assessed. Immediately prior to speech sequences, auditory pre-information was provided via cues that either prepared auditory scene analysis or attentional focusing, or non-specific pre-information was given. While performance was generally better in younger than older participants, both age groups benefited from auditory pre-information. The analysis of the cue-related event-related potentials revealed age-specific differences in the use of pre-cues: Younger adults showed a pronounced N2 component, suggesting early inhibition of concurrent speech stimuli; older adults exhibited a stronger late P3 component, suggesting increased resource allocation to process the pre-information. In sum, the results argue for an age-specific utilization of auditory pre-information to improve listening in complex dynamic auditory environments.
Noble, William; Gatehouse, Stuart
2004-02-01
A series of comparative analyses is presented between a group with relatively similar degrees of hearing loss in each ear (n = 103: symmetry group) and one with dissimilar losses (n = 50: asymmetry group). Asymmetry was defined as an interaural difference of more than 10 dB in hearing levels averaged over 0.5, 1, 2 and 4 kHz. Comparison was focused on self-rated disabilities as reflected in responses on the Speech, Spatial and Qualities of Hearing Scale (SSQ). The connections between SSQ ratings and a global self-rating of handicap were also observed. The interrelationships among SSQ items for the two groups were analysed to determine how the SSQ behaves when applied to groups in whom binaural hearing is more (asymmetry) versus less compromised. As expected, spatial hearing is severely disabled in the group with asymmetry; this group is generally more disabled than the symmetry group across all SSQ domains. In the linkages with handicap, spatial hearing, especially in dynamic settings, was strongly represented in the asymmetry group, while all aspects of hearing were moderately to strongly represented in the symmetry group. Item intercorrelations showed that speech hearing is a relatively autonomous function for the symmetry group, whereas it is enmeshed with segregation, clarity and naturalness factors for the asymmetry group. Spatial functions were more independent of others in the asymmetry group. The SSQ shows promise in the assessment of outcomes in the case of bilateral versus unilateral amplification and/or implantation.
Stream pH as an abiotic gradient influencing distributions of trout in Pennsylvania streams
Kocovsky, P.M.; Carline, R.F.
2005-01-01
Elevation and stream slope are abiotic gradients that limit upstream distributions of brook trout Salvelinus fontinalis and brown trout Salmo trutta in streams. We sought to determine whether another abiotic gradient, base-flow pH, may also affect distributions of these two species in eastern North America streams. We used historical data from the Pennsylvania Fish and Boat Commission's fisheries management database to explore the effects of reach elevation, slope, and base-flow pH on distributional limits to brook trout and brown trout in Pennsylvania streams in the Appalachian Plateaus and Ridge and Valley physiographic provinces. Discriminant function analysis (DFA) was used to calculate a canonical axis that separated allopatric brook trout populations from allopatric brown trout populations and allowed us to assess which of the three independent variables were important gradients along which communities graded from allopatric brook trout to allopatric brown trout. Canonical structure coefficients from DFA indicated that in both physiographic provinces, stream base-flow pH and slope were important factors in distributional limits; elevation was also an important factor in the Ridge and Valley Province but not the Appalachian Plateaus Province. Graphs of each variable against the proportion of brook trout in a community also identified apparent zones of allopatry for both species on the basis of pH and stream slope. We hypothesize that pH-mediated interspecific competition that favors brook trout in competition with brown trout at lower pH is the most plausible mechanism for segregation of these two species along pH gradients. Our discovery that trout distributions in Pennsylvania are related to stream base-flow pH has important implications for brook trout conservation in acidified regions. 
Carefully designed laboratory and field studies will be required to test our hypothesis and elucidate the mechanisms responsible for the partitioning of brook trout and brown trout along pH gradients. © Copyright by the American Fisheries Society 2005.
Dissociable prefrontal brain systems for attention and emotion
NASA Astrophysics Data System (ADS)
Yamasaki, Hiroshi; Labar, Kevin S.; McCarthy, Gregory
2002-08-01
The prefrontal cortex has been implicated in a variety of attentional, executive, and mnemonic mental operations, yet its functional organization is still highly debated. The present study used functional MRI to determine whether attentional and emotional functions are segregated into dissociable prefrontal networks in the human brain. Subjects discriminated infrequent and irregularly presented attentional targets (circles) from frequent standards (squares) while novel distracting scenes, parametrically varied for emotional arousal, were intermittently presented. Targets differentially activated middle frontal gyrus, posterior parietal cortex, and posterior cingulate gyrus. Novel distracters activated inferior frontal gyrus, amygdala, and fusiform gyrus, with significantly stronger activation evoked by the emotional scenes. The anterior cingulate gyrus was the only brain region with equivalent responses to attentional and emotional stimuli. These results show that attentional and emotional functions are segregated into parallel dorsal and ventral streams that extend into prefrontal cortex and are integrated in the anterior cingulate. These findings may have implications for understanding the neural dynamics underlying emotional distractibility on attentional tasks in affective disorders. Keywords: novelty, prefrontal cortex, amygdala, cingulate gyrus.
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
1998-01-01
This Environmental Assessment (EA) has been prepared by the Department of Energy (DOE) to assess the potential environmental impacts associated with the construction, operation and decontamination and decommissioning (D&D) of the Waste Segregation Facility (WSF) for the sorting, shredding, and compaction of low-level radioactive waste (LLW) at the Savannah River Site (SRS) located near Aiken, South Carolina. The LLW to be processed consists of two waste streams: legacy waste which is currently stored in E-Area Vaults of SRS and new waste generated from continuing operations. The proposed action is to construct, operate, and D&D a facility to process low-activity job-control and equipment waste for volume reduction. The LLW would be processed to make more efficient use of low-level waste disposal capacity (E-Area Vaults) or to meet the waste acceptance criteria for treatment at the Consolidated Incineration Facility (CIF) at SRS.
Dynamic Encoding of Speech Sequence Probability in Human Temporal Cortex
Leonard, Matthew K.; Bouchard, Kristofer E.; Tang, Claire
2015-01-01
Sensory processing involves identification of stimulus features, but also integration with the surrounding sensory and cognitive context. Previous work in animals and humans has shown fine-scale sensitivity to context in the form of learned knowledge about the statistics of the sensory environment, including relative probabilities of discrete units in a stream of sequential auditory input. These statistics are a defining characteristic of one of the most important sequential signals humans encounter: speech. For speech, extensive exposure to a language tunes listeners to the statistics of sound sequences. To address how speech sequence statistics are neurally encoded, we used high-resolution direct cortical recordings from human lateral superior temporal cortex as subjects listened to words and nonwords with varying transition probabilities between sound segments. In addition to their sensitivity to acoustic features (including contextual features, such as coarticulation), we found that neural responses dynamically encoded the language-level probability of both preceding and upcoming speech sounds. Transition probability first negatively modulated neural responses, followed by positive modulation of neural responses, consistent with coordinated predictive and retrospective recognition processes, respectively. Furthermore, transition probability encoding was different for real English words compared with nonwords, providing evidence for online interactions with high-order linguistic knowledge. These results demonstrate that sensory processing of deeply learned stimuli involves integrating physical stimulus features with their contextual sequential structure. Despite not being consciously aware of phoneme sequence statistics, listeners use this information to process spoken input and to link low-level acoustic representations with linguistic information about word identity and meaning. PMID:25948269
NASA Technical Reports Server (NTRS)
Kim, J.; Simon, T. W.
1991-01-01
An experimental investigation of the transition process on flat-plate and concave curved-wall boundary layers for various free-stream turbulence levels was performed. Where possible, sampling according to the intermittency function was made. Such sampling allowed segregation of the signal into two types of behavior: laminar-like and turbulent-like. The results from the investigation are discussed. Documentation is presented in two volumes. Volume one contains the text of the report, including figures and supporting appendices. Volume two contains data reduction program listings and tabulated data.
NASA Technical Reports Server (NTRS)
Johnston, James C.; Hochhaus, Larry; Ruthruff, Eric
2002-01-01
Four experiments tested whether repetition blindness (RB; reduced accuracy reporting repetitions of briefly displayed items) is a perceptual or a memory-recall phenomenon. RB was measured in rapid serial visual presentation (RSVP) streams, with the task altered to reduce memory demands. In Experiment 1 only the number of targets (1 vs. 2) was reported, eliminating the need to remember target identities. Experiment 2 segregated repeated and nonrepeated targets into separate blocks to reduce bias against repeated targets. Experiments 3 and 4 required immediate "online" buttonpress responses to targets as they occurred. All 4 experiments showed very strong RB. Furthermore, the online response data showed clearly that the 2nd of the repeated targets is the one missed. The present results show that in the RSVP paradigm, RB occurs online during initial stimulus encoding and decision making. The authors argue that RB is indeed a perceptual phenomenon.
SNR-adaptive stream weighting for audio-MES ASR.
Lee, Ki-Seung
2008-08-01
Myoelectric signals (MESs) from the speaker's mouth region have been successfully shown to improve the noise robustness of automatic speech recognizers (ASRs), thus promising to extend their usability in implementing noise-robust ASR. In the recognition system presented herein, extracted audio and facial MES features were integrated by a decision fusion method, where the likelihood score of the audio-MES observation vector was given by a linear combination of class-conditional observation log-likelihoods of two classifiers, using appropriate weights. We developed a weighting process adaptive to SNRs. The main objective of the paper involves determining the optimal SNR classification boundaries and constructing a set of optimum stream weights for each SNR class. These two parameters were determined by a method based on a maximum mutual information criterion. Acoustic and facial MES data were collected from five subjects, using a 60-word vocabulary. Four types of acoustic noise including babble, car, aircraft, and white noise were acoustically added to clean speech signals with SNR ranging from -14 to 31 dB. The classification accuracy of the audio ASR was as low as 25.5%. In contrast, the classification accuracy of the MES ASR was 85.2%. The classification accuracy could be further improved by employing the proposed audio-MES weighting method, which was as high as 89.4% in the case of babble noise.
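The decision-fusion rule described in this abstract (a weighted linear combination of the two classifiers' class-conditional log-likelihoods, with the weight selected by SNR class) can be sketched as follows. The SNR boundaries and per-class weights here are illustrative placeholders, not the optimized values the authors obtained from their maximum-mutual-information criterion:

```python
# Illustrative SNR class boundaries (dB) and per-class audio weights;
# placeholder values, not the paper's optimized parameters.
SNR_BOUNDARIES = [0.0, 15.0]      # three classes: <0 dB, 0-15 dB, >=15 dB
AUDIO_WEIGHTS = [0.2, 0.5, 0.8]   # audio is trusted more as SNR rises

def snr_class(snr_db):
    """Map an SNR value in dB to a discrete SNR class index."""
    return sum(snr_db >= b for b in SNR_BOUNDARIES)

def fused_score(log_p_audio, log_p_mes, snr_db):
    """Linear combination of the audio and MES classifiers'
    log-likelihoods, weighted by the SNR class (decision fusion)."""
    w = AUDIO_WEIGHTS[snr_class(snr_db)]
    return w * log_p_audio + (1.0 - w) * log_p_mes

def recognize(candidates, snr_db):
    """candidates maps each word to (log_p_audio, log_p_mes);
    returns the word with the highest fused score."""
    return max(candidates, key=lambda word: fused_score(*candidates[word], snr_db))
```

At very low SNR the fused decision follows the MES classifier, and at high SNR it follows the acoustic classifier, which is the behavior the adaptive weighting is meant to capture.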
Neural Basis of Action Understanding: Evidence from Sign Language Aphasia.
Rogalsky, Corianne; Raphel, Kristin; Tomkovicz, Vivian; O'Grady, Lucinda; Damasio, Hanna; Bellugi, Ursula; Hickok, Gregory
2013-01-01
The neural basis of action understanding is a hotly debated issue. The mirror neuron account holds that motor simulation in fronto-parietal circuits is critical to action understanding including speech comprehension, while others emphasize the ventral stream in the temporal lobe. Evidence from speech strongly supports the ventral stream account, but on the other hand, evidence from manual gesture comprehension (e.g., in limb apraxia) has led to contradictory findings. Here we present a lesion analysis of sign language comprehension. Sign language is an excellent model for studying mirror system function in that it bridges the gap between the visual-manual system in which mirror neurons are best characterized and language systems which have represented a theoretical target of mirror neuron research. Twenty-one lifelong deaf signers with focal cortical lesions performed two tasks: one involving the comprehension of individual signs and the other involving comprehension of signed sentences (commands). Participants' lesions, as indicated on MRI or CT scans, were mapped onto a template brain to explore the relationship between lesion location and sign comprehension measures. Single sign comprehension was not significantly affected by left hemisphere damage. Sentence sign comprehension impairments were associated with left temporal-parietal damage. We found that damage to mirror-system-related regions in the left frontal lobe was not associated with deficits on either of these comprehension tasks. We conclude that the mirror system is not critically involved in action understanding.
Butler, Clare
2013-09-01
Individuals who experience speech dysfluency are often stigmatised because their speech acts differ from the communicative norm. This article is located in and seeks to further the identity debates in exploring how individuals who are subject to the intermittent emergence of a stigmatised characteristic manage this randomised personal discrediting in their identity work. Through a series of focus groups and semi-structured interviews participants grudgingly report their management approaches which include concealing, drafting in unwitting others, role-playing and segregating self from their stammer. In describing how they manage their stammer they detail their use of the social space in a number of ways, including as a hiding place; a site for 'it' (the stammer); a gap in which to switch words; and a different area in which to perform. This study offers important insights, increasing our understanding of the often hidden negotiations of identity work and the sometime ingenious use of space in the management of a social stigma. © 2013 The Author. Sociology of Health & Illness © 2013 Foundation for the Sociology of Health & Illness/John Wiley & Sons Ltd. Published by John Wiley & Sons Ltd.
Exploring the Early Organization and Maturation of Linguistic Pathways in the Human Infant Brain.
Dubois, Jessica; Poupon, Cyril; Thirion, Bertrand; Simonnet, Hina; Kulikova, Sofya; Leroy, François; Hertz-Pannier, Lucie; Dehaene-Lambertz, Ghislaine
2016-05-01
Linguistic processing is based on a close collaboration between temporal and frontal regions connected by two pathways: the "dorsal" and "ventral pathways" (assumed to support phonological and semantic processing, respectively, in adults). We investigated here the development of these pathways at the onset of language acquisition, during the first post-natal weeks, using cross-sectional diffusion imaging in 21 healthy infants (6-22 weeks of age) and 17 young adults. We compared the bundle organization and microstructure at these two ages using tractography and original clustering analyses of diffusion tensor imaging parameters. We observed structural similarities between both groups, especially concerning the dorsal/ventral pathway segregation and the arcuate fasciculus asymmetry. We further highlighted the developmental tempos of the linguistic bundles: The ventral pathway maturation was more advanced than the dorsal pathway maturation, but the latter catches up during the first post-natal months. Its fast development during this period might relate to the learning of speech cross-modal representations and to the first combinatorial analyses of the speech input. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Linking sounds to meanings: infant statistical learning in a natural language.
Hay, Jessica F; Pelucchi, Bruna; Graf Estes, Katharine; Saffran, Jenny R
2011-09-01
The processes of infant word segmentation and infant word learning have largely been studied separately. However, the ease with which potential word forms are segmented from fluent speech seems likely to influence subsequent mappings between words and their referents. To explore this process, we tested the link between the statistical coherence of sequences presented in fluent speech and infants' subsequent use of those sequences as labels for novel objects. Notably, the materials were drawn from a natural language unfamiliar to the infants (Italian). The results of three experiments suggest that there is a close relationship between the statistics of the speech stream and subsequent mapping of labels to referents. Mapping was facilitated when the labels contained high transitional probabilities in the forward and/or backward direction (Experiment 1). When no transitional probability information was available (Experiment 2), or when the internal transitional probabilities of the labels were low in both directions (Experiment 3), infants failed to link the labels to their referents. Word learning appears to be strongly influenced by infants' prior experience with the distribution of sounds that make up words in natural languages. Copyright © 2011 Elsevier Inc. All rights reserved.
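The forward and backward transitional probabilities at the heart of this segmentation account are simple to state: forward TP(A→B) = freq(AB)/freq(A) and backward TP(A→B) = freq(AB)/freq(B). A minimal sketch over a toy syllable stream (the syllables and stream below are illustrative, not the Italian stimuli used in the experiments):

```python
from collections import Counter

def transitional_probabilities(syllables):
    """Forward and backward transitional probabilities for adjacent pairs.

    forward TP(A->B) = freq(AB) / freq(A)
    backward TP(A->B) = freq(AB) / freq(B)
    Returns {(A, B): (forward, backward)}.
    """
    pair_freq = Counter(zip(syllables, syllables[1:]))
    syl_freq = Counter(syllables)
    return {
        (a, b): (f / syl_freq[a], f / syl_freq[b])
        for (a, b), f in pair_freq.items()
    }

# Toy stream in which "fu-ga" is a coherent word-like unit:
# "fu" is always followed by "ga", and "ga" is always preceded by "fu".
stream = ["fu", "ga", "bi", "fu", "ga", "to", "fu", "ga"]
tps = transitional_probabilities(stream)
print(tps[("fu", "ga")])  # (1.0, 1.0): high TP in both directions
```

On this account, a high-TP sequence such as "fu-ga" is segmented as a word form and should therefore be easier to map to a referent than a low-TP sequence spanning a word boundary.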
Listen to your mother! The role of talker familiarity in infant streaming.
Barker, Brittan A; Newman, Rochelle S
2004-12-01
Little is known about the acoustic cues infants might use to selectively attend to one talker in the presence of background noise. This study examined the role of talker familiarity as a possible cue. Infants either heard their own mothers (maternal-voice condition) or a different infant's mother (novel-voice condition) repeating isolated words while a female distracter voice spoke fluently in the background. Subsequently, infants heard passages produced by the target voice containing either the familiarized, target words or novel words. Infants in the maternal-voice condition listened significantly longer to the passages containing familiar words; infants in the novel-voice condition showed no preference. These results suggest that infants are able to separate the simultaneous speech of two women when one of the voices is highly familiar to them. However, infants seem to find separating the simultaneous speech of two unfamiliar women extremely difficult.
Interactive language learning by robots: the transition from babbling to word forms.
Lyon, Caroline; Nehaniv, Chrystopher L; Saunders, Joe
2012-01-01
The advent of humanoid robots has enabled a new approach to investigating the acquisition of language, and we report on the development of robots able to acquire rudimentary linguistic skills. Our work focuses on early stages analogous to some characteristics of a human child of about 6 to 14 months, the transition from babbling to first word forms. We investigate one mechanism among many that may contribute to this process, a key factor being the sensitivity of learners to the statistical distribution of linguistic elements. As well as being necessary for learning word meanings, the acquisition of anchor word forms facilitates the segmentation of an acoustic stream through other mechanisms. In our experiments some salient one-syllable word forms are learnt by a humanoid robot in real-time interactions with naive participants. Words emerge from random syllabic babble through a learning process based on a dialogue between the robot and the human participant, whose speech is perceived by the robot as a stream of phonemes. Numerous ways of representing the speech as syllabic segments are possible. Furthermore, the pronunciation of many words in spontaneous speech is variable. However, in line with research elsewhere, we observe that salient content words are more likely than function words to have consistent canonical representations; thus their relative frequency increases, as does their influence on the learner. Variable pronunciation may contribute to early word form acquisition. The importance of contingent interaction in real-time between teacher and learner is reflected by a reinforcement process, with variable success. The examination of individual cases may be more informative than group results. Nevertheless, word forms are usually produced by the robot after a few minutes of dialogue, employing a simple, real-time, frequency dependent mechanism. 
This work shows the potential of human-robot interaction systems in studies of the dynamics of early language acquisition.
Neural entrainment to rhythmic speech in children with developmental dyslexia
Power, Alan J.; Mead, Natasha; Barnes, Lisa; Goswami, Usha
2013-01-01
A rhythmic paradigm based on repetition of the syllable “ba” was used to study auditory, visual, and audio-visual oscillatory entrainment to speech in children with and without dyslexia using EEG. Children pressed a button whenever they identified a delay in the isochronous stimulus delivery (500 ms; 2 Hz delta band rate). Response power, strength of entrainment and preferred phase of entrainment in the delta and theta frequency bands were compared between groups. The quality of stimulus representation was also measured using cross-correlation of the stimulus envelope with the neural response. The data showed a significant group difference in the preferred phase of entrainment in the delta band in response to the auditory and audio-visual stimulus streams. A different preferred phase has significant implications for the quality of speech information that is encoded neurally, as it implies enhanced neuronal processing (phase alignment) at less informative temporal points in the incoming signal. Consistent with this possibility, the cross-correlogram analysis revealed superior stimulus representation by the control children, who showed a trend for larger peak r-values and significantly later lags in peak r-values compared to participants with dyslexia. Significant relationships between both peak r-values and peak lags were found with behavioral measures of reading. The data indicate that the auditory temporal reference frame for speech processing is atypical in developmental dyslexia, with low frequency (delta) oscillations entraining to a different phase of the rhythmic syllabic input. This would affect the quality of encoding of speech, and could underlie the cognitive impairments in phonological representation that are the behavioral hallmark of this developmental disorder across languages. PMID:24376407
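The cross-correlogram analysis described above reduces to finding the lag at which the correlation between the stimulus envelope and the neural response peaks, and comparing peak r-values and peak lags across groups. A minimal sketch of that computation; the sampling rate, sinusoidal envelope, and 100 ms delay below are illustrative assumptions, not the study's EEG data:

```python
import numpy as np

def peak_crosscorr(stim, resp, fs, max_lag_s=0.5):
    """Slide the response relative to the stimulus envelope and return
    (peak r, lag in seconds) where their correlation is largest.
    Positive lag means the neural response follows the stimulus."""
    def z(v):
        return (v - v.mean()) / v.std()
    best_r, best_lag = -2.0, 0
    for lag in range(int(max_lag_s * fs) + 1):
        x = stim[: len(stim) - lag] if lag else stim
        y = resp[lag:]
        r = float(np.mean(z(x) * z(y)))  # Pearson r of the overlapping samples
        if r > best_r:
            best_r, best_lag = r, lag
    return best_r, best_lag / fs

# Synthetic check: a 2 Hz "syllabic" envelope and a copy delayed by 100 ms
fs = 100
t = np.arange(0, 10, 1 / fs)
env = np.sin(2 * np.pi * 2 * t)
resp = np.roll(env, int(0.1 * fs))  # circular shift stands in for neural delay
r, lag = peak_crosscorr(env, resp, fs)
print(lag)  # 0.1
```

A group difference in `lag` corresponds to the later peak lags reported for the control children; a group difference in `r` corresponds to the trend for larger peak r-values.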
Language learning impairments: integrating basic science, technology, and remediation.
Tallal, P; Merzenich, M M; Miller, S; Jenkins, W
1998-11-01
One of the fundamental goals of the modern field of neuroscience is to understand how neuronal activity gives rise to higher cortical function. However, to bridge the gap between neurobiology and behavior, we must understand higher cortical functions at the behavioral level at least as well as we have come to understand neurobiological processes at the cellular and molecular levels. This is certainly the case in the study of speech processing, where critical studies of behavioral dysfunction have provided key insights into the basic neurobiological mechanisms relevant to speech perception and production. Much of this progress derives from a detailed analysis of the sensory, perceptual, cognitive, and motor abilities of children who fail to acquire speech, language, and reading skills normally within the context of otherwise normal development. Current research now shows that a dysfunction in normal phonological processing, which is critical to the development of oral and written language, may derive, at least in part, from difficulties in perceiving and producing basic sensory-motor information in rapid succession--within tens of ms (see Tallal et al. 1993a for a review). There is now substantial evidence supporting the hypothesis that basic temporal integration processes play a fundamental role in establishing neural representations for the units of speech (phonemes), which must be segmented from the (continuous) speech stream and combined to form words, in order for the normal development of oral and written language to proceed. Results from magnetic resonance imaging (MRI) and positron emission tomography (PET) studies, as well as studies of behavioral performance in normal and language impaired children and adults, will be reviewed to support the view that the integration of rapidly changing successive acoustic events plays a primary role in phonological development and disorders. 
Finally, remediation studies based on this research, coupled with neuroplasticity research, will be presented.
Speech and language therapy/pathology: perspectives on a gendered profession.
Litosseliti, Lia; Leadbeater, Claire
2013-01-01
The speech and language therapy/pathology (SLT/SLP) profession is characterized by extreme 'occupational sex segregation', a term used to refer to persistently male- or female-dominated professions. Men make up only 2.5% of all SLTs in the UK, and a similar imbalance is found in other countries. Despite calls to increase diversity in the allied health professions more generally, research into the reasons for occupational sex segregation and gender as a potential key factor remains scarce. This study aims to explore the potential role of gender/gendered discourses in people's decision to pursue a career in SLT/SLP. It seeks to illustrate how gendered assumptions/expectations/discourses continue to construct SLT as a 'gendered' profession, and to make some recommendations in this area for SLT recruitment and practice. The study adopted a qualitative design which elicited research participants' views, knowledge and experiences (in their own words) in relation to the research questions. Data collection involved two iterative phases: a preliminary data phase--which involved semi-structured interviews with newly qualified SLT graduates and practising SLTs, and the completion of questionnaires by undergraduate SLTs--and a main/focus group phase. In the focus group phase reported in this paper, six focus groups in total were held with SLTs, teachers of SLT, and careers advisors in London, UK. The data were analysed qualitatively using grounded theory principles, thematic analysis and discourse analysis. The findings extend our knowledge and understanding of gender as a parameter of people's motivations and perceptions, which can influence their choice of career (e.g. as regards pay and flexibility). The findings also show that discourses around women as carers, nurturers and communicators constitute key ways through which the SLT profession continues to be constructed as 'women's work'. 
The topic of structural gender inequalities in the profession was also discussed in the data. Some recommendations for change, with implications for SLT recruitment and practice, were made by the participants themselves. Gender imbalance in SLT needs to be researched further in order to help address inequalities, re-evaluate professional practices and develop service delivery in the profession. This area also needs to be researched via analysis that goes beyond gender distribution in numerical terms to consider the complex perceptions or discourses around gender and work. Cross-disciplinary and comparative perspectives in future research would also be fruitful. © 2012 Royal College of Speech and Language Therapists.
Effect of motion on speech recognition.
Davis, Timothy J; Grantham, D Wesley; Gifford, René H
2016-07-01
The benefit of spatial separation for talkers in a multi-talker environment is well documented. However, few studies have examined the effect of talker motion on speech recognition. In the current study, we evaluated the effects of (1) motion of the target or distracters, (2) a priori information about the target and distracter spatial configurations, and (3) target and distracter location. In total, seventeen young adults with normal hearing were tested in a large anechoic chamber in two experiments. In Experiment 1, seven stimulus conditions were tested using the Coordinate Response Measure (Bolia et al., 2000) speech corpus, in which subjects were required to report the key words in a target sentence presented simultaneously with two distracter sentences. As in previous studies, there was a significant improvement in key word identification for conditions in which the target and distracters were spatially separated as compared to the co-located conditions. In addition, (1) motion of either talker or distracter resulted in improved performance compared to stationary presentation (talker motion yielded significantly better performance than distracter motion); (2) a priori information regarding stimulus configuration was not beneficial; and (3) performance was significantly better with key words at 0° azimuth as compared to -60° (on the listener's left). Experiment 2 included two additional conditions designed to assess whether the benefit of motion observed in Experiment 1 was due to the motion itself or to the fact that the motion conditions introduced small spatial separations in the target and distracter key words. Results showed that small spatial separations (on the order of 5-8°) resulted in improved performance (relative to co-located key words) whether the sentences were moving or stationary.
These results suggest that in the presence of distracting messages, motion of either target or distracters and/or small spatial separation of the key words may be beneficial for sound source segregation and thus for improved speech recognition. Copyright © 2016 Elsevier B.V. All rights reserved.
Poeppel, David
2012-01-01
Research on the brain basis of speech and language faces theoretical and empirical challenges. The majority of current research, dominated by imaging, deficit-lesion, and electrophysiological techniques, seeks to identify regions that underpin aspects of language processing such as phonology, syntax, or semantics. The emphasis lies on localization and spatial characterization of function. The first part of the paper deals with a practical challenge that arises in the context of such a research program. This maps problem concerns the extent to which spatial information and localization can satisfy the explanatory needs for perception and cognition. Several areas of investigation exemplify how the neural basis of speech and language is discussed in those terms (regions, streams, hemispheres, networks). The second part of the paper turns to a more troublesome challenge, namely how to formulate the formal links between neurobiology and cognition. This principled problem thus addresses the relation between the primitives of cognition (here speech, language) and neurobiology. Dealing with this mapping problem invites the development of linking hypotheses between the domains. The cognitive sciences provide granular, theoretically motivated claims about the structure of various domains (the ‘cognome’); neurobiology, similarly, provides a list of the available neural structures. However, explanatory connections will require crafting computationally explicit linking hypotheses at the right level of abstraction. For both the practical maps problem and the principled mapping problem, developmental approaches and evidence can play a central role in the resolution. PMID:23017085
Wolfe, Jace; Morais, Mila; Schafer, Erin
2016-02-01
The goals of the present investigation were (1) to evaluate recognition of recorded speech presented over a mobile telephone for a group of adult bimodal cochlear implant users, and (2) to measure the potential benefits of wireless hearing assistance technology (HAT) for mobile telephone speech recognition using bimodal stimulation (i.e., a cochlear implant in one ear and a hearing aid on the other ear). A three-by-two-way repeated measures design was used to evaluate mobile telephone sentence-recognition performance differences obtained in quiet and in noise with and without the wireless HAT accessory coupled to the hearing aid alone, CI sound processor alone, and in the bimodal condition. Outpatient cochlear implant clinic. Sixteen bimodal users with Nucleus 24, Freedom, CI512, or CI422 cochlear implants participated in this study. Performance was measured with and without the use of a wireless HAT for the telephone used with the hearing aid alone, CI alone, and bimodal condition. CNC word recognition in quiet and in noise with and without the use of a wireless HAT telephone accessory in the hearing aid alone, CI alone, and bimodal conditions. Results suggested that the bimodal condition gave significantly better speech recognition on the mobile telephone with the wireless HAT. A wireless HAT for the mobile telephone provides bimodal users with significant improvement in word recognition in quiet and in noise over the mobile telephone.
Centrifugal Sieve for Gravity-Level-Independent Size Segregation of Granular Materials
NASA Technical Reports Server (NTRS)
Walton, Otis R.; Dreyer, Christopher; Riedel, Edward
2013-01-01
Conventional size segregation or screening in batch mode, using stacked vibrated screens, is often a time-consuming process. Utilization of centrifugal force instead of gravity as the primary body force can significantly shorten the time to segregate feedstock into a set of different-sized fractions. Likewise, under reduced gravity or microgravity, a centrifugal sieve system would function as well as it does terrestrially. When vibratory and mechanical blade sieving screens designed for terrestrial conditions were tested under lunar gravity conditions, they did not function well. The centrifugal sieving design of this technology overcomes the issues that prevented sieves designed for terrestrial conditions from functioning under reduced gravity. These sieves feature a rotating outer (cylindrical or conical) screen wall, rotating fast enough for the centrifugal forces near the wall to hold granular material against the rotating screen. Conventional centrifugal sieves have a stationary screen and rapidly rotating blades that shear the granular solid near the stationary screen, and effect the sieving process assisted by the airflow inside the unit. The centrifugal sieves of this new design may (or may not) have an inner blade or blades, moving relative to the rotating wall screen. Some continuous flow embodiments would have no inner auger or blades, but achieve axial motion through vibration. In all cases, the shearing action is gentler than conventional centrifugal sieves, which have very high velocity differences between the stationary outer screen and the rapidly rotating blades. The new design does not depend on airflow in the sieving unit, so it will function just as well in vacuum as in air. One advantage of the innovation for batch sieving is that a batch-mode centrifugal sieve may accomplish the same sieving operation in much less time than a conventional stacked set of vibrated screens (which utilize gravity as the primary driving force for size separation). 
In continuous mode, the centrifugal sieves can provide steady streams of fine and coarse material separated from a mixed feedstock flow stream. The centrifugal sieves can be scaled to any desired size and/or mass flow rate. Thus, they could be made in sizes suitable for small robotic exploratory missions, or for semi-permanent processing of regolith for extraction of volatiles of minerals. An advantage of the continuous-mode system is that it can be made with absolutely no gravity flow components for feeding material into, or for extracting the separated size streams from, the centrifugal sieve. Thus, the system is capable of functioning in a true microgravity environment. Another advantage of the continuous-mode system is that some embodiments of the innovation have no internal blades or vanes, and thus, can be designed to handle a very wide range of feedstock sizes, including occasional very large oversized pieces, without jamming or seizing up.
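The core design constraint, that centrifugal acceleration at the screen wall must dominate local gravity so that material is held against the rotating screen regardless of gravity level, can be sketched as a back-of-the-envelope calculation. The drum radius, safety factor, and use of lunar gravity below are illustrative assumptions, not parameters from the design:

```python
import math

def required_rpm(radius_m, g_body=1.62, factor=3.0):
    """Spin rate (rpm) such that centrifugal acceleration at the screen
    wall exceeds `factor` times local gravity: omega^2 * r >= factor * g.
    Default g_body is lunar surface gravity (m/s^2)."""
    omega = math.sqrt(factor * g_body / radius_m)  # rad/s
    return omega * 60 / (2 * math.pi)

# A hypothetical 15 cm radius drum under lunar gravity
print(round(required_rpm(0.15)))  # ~54 rpm
```

The same drum on Earth (g ≈ 9.81 m/s²) needs a proportionally faster spin, which illustrates why a design sized for lunar or microgravity operation can run at modest rotation rates.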
From attentional gating in macaque primary visual cortex to dyslexia in humans.
Vidyasagar, T R
2001-01-01
Selective attention is an important aspect of brain function that we need in coping with the immense and constant barrage of sensory information. One model of attention (Feature Integration Theory) that suggests an early selection of spatial locations of objects via an attentional spotlight would also solve the 'binding problem' (that is, how do different attributes of each object get correctly bound together?). Our experiments have demonstrated modulation of specific locations of interest at the level of the primary visual cortex both in visual discrimination and memory tasks, where the actual locations of the targets were also important in being able to perform the task. It is suggested that the feedback mediating the modulation arises from the posterior parietal cortex, which would also be consistent with its known role in attentional control. In primates, the magnocellular (M) and parvocellular (P) pathways are the two major streams of inputs from the retina, carrying distinctly different types of information, and they remain fairly segregated in their projections to the primary visual cortex and further into the extra-striate regions. The P inputs go mainly into the ventral (temporal) stream, while the dorsal (parietal) stream is dominated by M inputs. A theory of attentional gating is proposed here where the M-dominated dorsal stream gates the P inputs into the ventral stream. This framework is used to provide a neural explanation of the processes involved in reading and in learning to read. This scheme also explains how a magnocellular deficit could cause the common reading impairment, dyslexia.
Neural Decoding of Bistable Sounds Reveals an Effect of Intention on Perceptual Organization
2018-01-01
Auditory signals arrive at the ear as a mixture that the brain must decompose into distinct sources based to a large extent on acoustic properties of the sounds. An important question concerns whether listeners have voluntary control over how many sources they perceive. This has been studied using pure high (H) and low (L) tones presented in the repeating pattern HLH-HLH-, which can form a bistable percept heard either as an integrated whole (HLH-) or as segregated into high (H-H-) and low (-L-) sequences. Although instructing listeners to try to integrate or segregate sounds affects reports of what they hear, this could reflect a response bias rather than a perceptual effect. We had human listeners (15 males, 12 females) continuously report their perception of such sequences and recorded neural activity using MEG. During neutral listening, a classifier trained on patterns of neural activity distinguished between periods of integrated and segregated perception. In other conditions, participants tried to influence their perception by allocating attention either to the whole sequence or to a subset of the sounds. They reported hearing the desired percept for a greater proportion of time than when listening neutrally. Critically, neural activity supported these reports; stimulus-locked brain responses in auditory cortex were more likely to resemble the signature of segregation when participants tried to hear segregation than when attempting to perceive integration. These results indicate that listeners can influence how many sound sources they perceive, as reflected in neural responses that track both the input and its perceptual organization. SIGNIFICANCE STATEMENT Can we consciously influence our perception of the external world? We address this question using sound sequences that can be heard either as coming from a single source or as two distinct auditory streams. 
Listeners reported spontaneous changes in their perception between these two interpretations while we recorded neural activity to identify signatures of such integration and segregation. They also indicated that they could, to some extent, choose between these alternatives. This claim was supported by corresponding changes in responses in auditory cortex. By linking neural and behavioral correlates of perception, we demonstrate that the number of objects that we perceive can depend not only on the physical attributes of our environment, but also on how we intend to experience it. PMID:29440556
Whitlock, Steven L.; Campbell, Matthew R.; Quist, Michael C.; Dux, Andrew M.
2018-01-01
Genetic and phenotypic traits of spatially and temporally segregated kokanee Oncorhynchus nerka spawning groups in Lake Pend Oreille, Idaho, were compared to test for evidence of divergence on the basis of ecotype (stream spawners versus shoreline spawners) and spawn timing and to describe morphological, life history, and reproductive variation within and among groups. Early and late spawning runs were found to be reproductively isolated; however, there was no clear evidence of genetic differentiation between ecotypes. Spawning groups within the same ecotype differed in length, age distribution, mean length at age, fecundity, and egg size. Variation in reproductive attributes was due primarily to differences in length distributions. Larger‐bodied shore‐spawning kokanee were located in areas where egg survival is known to be enhanced by downwelling, suggesting that the distribution of shore‐spawning kokanee may be partly structured by competition for spawning habitats with groundwater influence. This study contributes to other research indicating that introduced kokanee populations are unlikely to undergo adaptive divergence if they have a history of population fluctuations and are supplemented regularly.
The effect of in-stream activities on the Njoro River, Kenya. Part II: Microbial water quality
NASA Astrophysics Data System (ADS)
Yillia, Paul T.; Kreuzinger, Norbert; Mathooko, Jude M.
The influence of periodic in-stream activities of people and livestock on the microbial water quality of the Njoro River in Kenya was monitored at two disturbed pools (Turkana Flats and Njoro Bridge) at the middle reaches. A total of 96 sets of samples were obtained from the two pools in six weeks during dry weather (January-April) in 2006. On each sampling day, two trips were made before and during in-stream activities and on each trip, two sets of samples were collected upstream and downstream of activities. This schedule was repeated four times each for Wednesday, Saturday and Sunday. Samples were processed for heterotrophic plate count bacteria (HPC), total coliform (TC), presumptive Escherichia coli and presumptive Enterococci. Additional samples were analysed for total suspended solids (TSS), turbidity, BOD5 and ammonium-N. The microbial water quality deteriorated significantly (p < 0.05) downstream during activities at both pools. A similar trend was observed with the chemical indicators (TSS, turbidity, BOD5 and ammonium-N). The two groups of indicators demonstrated high capacity for site segregation based on pollution levels. Pollution levels for specific days were not significantly different (p > 0.05), which was incompatible with the day-to-day variability of in-stream activities. The pooled data were explained largely by three significant principal components: recent pollution (PC1), metabolic activity (PC2) and residual pollution (PC3). It was concluded that the empirical site parity/disparity in the levels of microbial and non-microbial indicators reflected the diurnal periodicity of in-stream activities and the concomitant pollution they caused. However, microbial source tracking studies are required to distinguish faecal sources. In the meantime, measures should be undertaken to regulate in-stream activities along the stream and minimize the movement of livestock in the catchment.
NASA Astrophysics Data System (ADS)
Battin, Tom J.
1999-10-01
The objective of the present paper was to link reach-scale streambed reactive uptake of dissolved organic carbon (DOC) and dissolved oxygen (DO) to subsurface flow paths in an alpine stream (Oberer Seebach (OSB)). The topography adjacent to the stream channel largely determined flow paths, with shallow hillslope groundwater flowing beneath the stream and entering the alluvial groundwater at the opposite bank. As computed from hydrometric data, OSB consistently lost stream water to groundwater, with fluxes out of the stream averaging 943 ± 47 and 664 ± 45 L m-2 h-1 at low (Q < 600 L s-1) and high (Q > 600 L s-1) flow, respectively. Hydrometric segregation of streambed fluxes and physicochemical mixing analysis indicated that stream water was the major input component to the streambed with average contributions of 70-80% to the hyporheic zone (i.e., the subsurface zone where shallow groundwater and stream water mix). Surface water was also the major source of DOC, contributing 0.512 ± 0.043 mg C m-2 h-1 to the streambed. The DOC flux from shallow riparian groundwater was lower (0.309 ± 0.071 mg C m-2 h-1) and peaked in autumn with 1.011 mg C m-2 h-1. I computed the relative proportion of downstream discharge through the streambed as the ratio of the downstream length (Ssw) a stream water parcel travels before entering the streambed to the downstream length (Shyp) a streambed water parcel travels before returning to the stream water. The relative streambed DOC retention efficiency, calculated as (input-output)/input of interstitial DOC, correlated with the proportion (Ssw/Shyp) of downstream discharge (r2 = 0.76, p = 0.006). Streambed metabolism (calculated as DO uptake from mass balance) also decreased when subsurface downstream routing was low, whereas elevated downstream discharge through the streambed stimulated DO uptake (r2 = 0.69, p = 0.019).
Despite the very short DOC turnover times (˜0.05 days, calculated as mean standing stock/annual input) within the streambed, the latter constitutes a net sink of DOC (˜14 mg C m-2 h-1). Along with high standing stocks of sediment associated particulate organic carbon, these results suggest microbial biofilms as the major retention and storage site of DOC in an alpine stream where large hydrologic exchange controls DOC fluxes.
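The retention-efficiency metric used in this analysis, (input - output)/input of interstitial DOC, is straightforward to express. A minimal sketch; the flux values below are hypothetical placeholders, not the OSB measurements:

```python
def retention_efficiency(doc_in, doc_out):
    """Relative streambed DOC retention: (input - output) / input.

    doc_in and doc_out are interstitial DOC fluxes in the same units
    (e.g. mg C m^-2 h^-1); the result is a dimensionless fraction.
    """
    return (doc_in - doc_out) / doc_in

# Hypothetical fluxes: 1.0 mg C m^-2 h^-1 in, 0.25 mg C m^-2 h^-1 out
print(retention_efficiency(1.0, 0.25))  # 0.75
```

In the study this fraction was the quantity regressed against the downstream-discharge proportion Ssw/Shyp, so higher routing of stream water through the streambed corresponds to higher values of this efficiency.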
Real-time alpha monitoring of a radioactive liquid waste stream at Los Alamos National Laboratory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, J.D.; Whitley, C.R.; Rawool-Sullivan, M.
1995-12-31
This poster display concerns the development, installation, and testing of a real-time radioactive liquid waste monitor at Los Alamos National Laboratory (LANL). The detector system was designed for the LANL Radioactive Liquid Waste Treatment Facility so that influent to the plant could be monitored in real time. By knowing the activity of the influent, plant operators can better monitor treatment, better segregate waste (potentially), and monitor the regulatory compliance of users of the LANL Radioactive Liquid Waste Collection System. The detector system uses long-range alpha detection technology, which is a nonintrusive method of characterization that determines alpha activity on the liquid surface by measuring the ionization of ambient air. Extensive testing has been performed to ensure long-term use with a minimal amount of maintenance. The final design was a simple, cost-effective alpha monitor that could be modified for monitoring influent waste streams at various points in the LANL Radioactive Liquid Waste Collection System.
Ammonia removal in food waste anaerobic digestion using a side-stream stripping process.
Serna-Maza, A; Heaven, S; Banks, C J
2014-01-01
Three 35-L anaerobic digesters fed on source segregated food waste were coupled to side-stream ammonia stripping columns and operated semi-continuously over 300 days, with results in terms of performance and stability compared to those of a control digester without stripping. Biogas was used as the stripping medium, and the columns were operated under different conditions of temperature (55, 70, 85 °C), pH (unadjusted and pH 10), and RT (2-5 days). To reduce digester TAN concentrations to a useful level a high temperature (≥70 °C) and a pH of 10 were needed; under these conditions 48% of the TAN was removed over a 138-day period without any detrimental effects on digester performance. Other effects of the stripping process were an overall reduction in digestate organic nitrogen-containing fraction compared to the control and a recovery in the acetoclastic pathway when TAN concentration was 1770±20 mg kg(-1). Copyright © 2013 Elsevier Ltd. All rights reserved.
Segregation of feedforward and feedback projections in mouse visual cortex
Berezovskii, Vladimir K.; Nassi, Jonathan J.; Born, Richard T.
2011-01-01
Hierarchical organization is a common feature of mammalian neocortex. Neurons that send their axons from lower to higher areas of the hierarchy are referred to as “feedforward” (FF) neurons, whereas those projecting in the opposite direction are called “feedback” (FB) neurons. Anatomical, functional and theoretical studies suggest that these different classes of projections play fundamentally different roles in perception. In primates, laminar differences in projection patterns often distinguish the two projection streams. In rodents, however, these differences are less clear, despite an established hierarchy of visual areas. Thus the rodent provides a strong test of the hypothesis that FF and FB neurons form distinct populations. We tested this hypothesis by injecting retrograde tracers into two different hierarchical levels of mouse visual cortex (areas 17 and AL) and then determining the relative proportions of double-labeled FB and FF neurons in an area intermediate to them (LM). Despite finding singly labeled neurons densely intermingled with no laminar segregation, we found few double-labeled neurons (~5% of each singly labeled population). We also examined the development of FF and FB connections. FF connections were present at the earliest time-point we examined (postnatal day two, P2), while FB connections were not detectable until P11. Our findings indicate that, even in cortices without laminar segregation of FF and FB neurons, the two projection systems are largely distinct at the neuronal level and also differ with respect to the timing of their outgrowth. PMID:21618232
Bonato, Karine Orlandi; Fialho, Clarice Bernhardt
2014-01-01
Ontogenetic influences on patterns of niche breadth and feeding overlap were investigated in three species of Siluriformes (Heptapterus sp., Rhamdia quelen and Trichomycterus poikilos), aiming at understanding the species' coexistence. Samplings were conducted bimonthly by electrofishing from June/2012 to June/2013 in ten streams of the northwestern state of Rio Grande do Sul, Brazil. The stomach contents of 1,948 individuals were analyzed by the volumetric method, with 59 food items identified. In general, Heptapterus sp. consumed a high proportion of Aegla sp., terrestrial plant remains and Megaloptera; R. quelen consumed fish and Oligochaeta, followed by Aegla sp.; while the diet of T. poikilos was based on Simuliidae, Ephemeroptera and Trichoptera. Species segregation was observed in the NMDS. PERMANOVA analysis revealed feeding differences among species, and between combinations of species and size classes; IndVal showed which items were indicators of these differences. Niche breadth values were high for all species; they were low only for the larger size classes of R. quelen and Heptapterus sp., while T. poikilos values were more similar across classes. Overall, the species showed low feeding overlap values; the highest frequency of high feeding overlap was observed for the interaction between Heptapterus sp. and T. poikilos. The null model confirmed the niche partitioning between the species. High and intermediate feeding overlap values were most frequent in the smaller size classes, and the null model showed resource sharing between species/size classes. Therefore, overall, the species showed resource partitioning because of the use of occasional items. However, these species share resources mainly in the early ontogenetic stages, until pronounced changes in morphological characteristics lead to trophic niche expansion and the apparent segregation observed. PMID:25340614
Shinn-Cunningham, Barbara
2017-10-17
This review provides clinicians with an overview of recent findings relevant to understanding why listeners with normal hearing thresholds (NHTs) sometimes suffer from communication difficulties in noisy settings. The results from neuroscience and psychoacoustics are reviewed. In noisy settings, listeners focus their attention by engaging cortical brain networks to suppress unimportant sounds; they then can analyze and understand an important sound, such as speech, amidst competing sounds. Differences in the efficacy of top-down control of attention can affect communication abilities. In addition, subclinical deficits in sensory fidelity can disrupt the ability to perceptually segregate sound sources, interfering with selective attention, even in listeners with NHTs. Studies of variability in control of attention and in sensory coding fidelity may help to isolate and identify some of the causes of communication disorders in individuals presenting at the clinic with "normal hearing." How well an individual with NHTs can understand speech amidst competing sounds depends not only on the sound being audible but also on the integrity of cortical control networks and the fidelity of the representation of suprathreshold sound. Understanding the root cause of difficulties experienced by listeners with NHTs ultimately can lead to new, targeted interventions that address specific deficits affecting communication in noise. http://cred.pubs.asha.org/article.aspx?articleid=2601617.
NASA Astrophysics Data System (ADS)
Nur Farid, Mifta; Arifianto, Dhany
2016-11-01
A person suffering from hearing loss can be helped by hearing aids, and binaural hearing aids offer the most optimal performance because they resemble the human auditory system. In a conversation at a cocktail party, a person can focus on a single conversation even though the background sound and other people's conversations are quite loud. This phenomenon is known as the cocktail party effect. Early studies explained that binaural hearing makes an important contribution to the cocktail party effect. In this study, we therefore separate two sound sources from a binaural input captured by 2 microphone sensors, based on both binaural cues, the interaural time difference (ITD) and the interaural level difference (ILD), using a binary mask. To estimate the ITD, a cross-correlation method is used in which the ITD is represented as the time delay of the correlation peak at each time-frequency unit. The binary mask is estimated from the patterns of ITD and ILD relative to the target strength, computed statistically using probability density estimation. Sound source separation performed well, with a speech intelligibility of 86% correct words and an SNR of 3 dB.
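The ITD estimation step described in this abstract can be sketched with a simple broadband cross-correlation; a real system would apply this per time-frequency unit after a filterbank. The function name, sampling rate, and test signal below are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: estimate the interaural time difference (ITD) as the
# lag of the cross-correlation peak between left and right channels.
import numpy as np

def estimate_itd(left, right, fs):
    """Return the ITD in seconds; positive when `left` is delayed."""
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)  # peak offset in samples
    return lag / fs

# Synthetic example: a 500 Hz tone reaching the left ear 8 samples late
fs = 16000
t = np.arange(0, 0.01, 1 / fs)
sig = np.sin(2 * np.pi * 500 * t)
delay = 8                                     # samples (0.5 ms)
left = np.pad(sig, (delay, 0))[: len(sig)]    # delayed copy
right = sig
itd = estimate_itd(left, right, fs)           # recovers delay / fs
```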
The Contribution of Brainstem and Cerebellar Pathways to Auditory Recognition
McLachlan, Neil M.; Wilson, Sarah J.
2017-01-01
The cerebellum has been known to play an important role in motor functions for many years. More recently its role has been expanded to include a range of cognitive and sensory-motor processes, and substantial neuroimaging and clinical evidence now points to cerebellar involvement in most auditory processing tasks. In particular, an increase in the size of the cerebellum over recent human evolution has been attributed in part to the development of speech. Despite this, the auditory cognition literature has largely overlooked afferent auditory connections to the cerebellum that have been implicated in acoustically conditioned reflexes in animals, and could subserve speech and other auditory processing in humans. This review expands our understanding of auditory processing by incorporating cerebellar pathways into the anatomy and functions of the human auditory system. We reason that plasticity in the cerebellar pathways underpins implicit learning of spectrotemporal information necessary for sound and speech recognition. Once learnt, this information automatically recognizes incoming auditory signals and predicts likely subsequent information based on previous experience. Since sound recognition processes involving the brainstem and cerebellum initiate early in auditory processing, learnt information stored in cerebellar memory templates could then support a range of auditory processing functions such as streaming, habituation, the integration of auditory feature information such as pitch, and the recognition of vocal communications. PMID:28373850
NASA Astrophysics Data System (ADS)
Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan
2017-02-01
For more flexible environmental perception by artificial intelligence, supporting software modules are needed that can automate the creation of language-specific syntax and perform further analysis for relevant decisions based on semantic functions. With our proposed approach, pairs of formal rules can be created for given sentences (in the case of natural languages) or statements (in the case of special languages) with the help of computer vision, speech recognition, or an editable text conversion system, for further automatic improvement. In other words, we have developed an approach that significantly improves the automation of the training process of an artificial intelligence, which as a result gives it a higher level of self-developing skills, independent of users. Based on this approach we have developed a software demo version, which includes the algorithm and software code implementing all of the above-mentioned components (computer vision, speech recognition, and an editable text conversion system). The program can work in a multi-stream mode and simultaneously create a syntax based on information received from several sources.
Feature assignment in perception of auditory figure.
Gregg, Melissa K; Samuel, Arthur G
2012-08-01
Because the environment often includes multiple sounds that overlap in time, listeners must segregate a sound of interest (the auditory figure) from other co-occurring sounds (the unattended auditory ground). We conducted a series of experiments to clarify the principles governing the extraction of auditory figures. We distinguish between auditory "objects" (relatively punctate events, such as a dog's bark) and auditory "streams" (sounds involving a pattern over time, such as a galloping rhythm). In Experiments 1 and 2, on each trial 2 sounds-an object (a vowel) and a stream (a series of tones)-were presented with 1 target feature that could be perceptually grouped with either source. In each block of these experiments, listeners were required to attend to 1 of the 2 sounds, and report its perceived category. Across several experimental manipulations, listeners were more likely to allocate the feature to an impoverished object if the result of the grouping was a good, identifiable object. Perception of objects was quite sensitive to feature variation (noise masking), whereas perception of streams was more robust to feature variation. In Experiment 3, the number of sound sources competing for the feature was increased to 3. This produced a shift toward relying more on spatial cues than on the potential contribution of the feature to an object's perceptual quality. The results support a distinction between auditory objects and streams, and provide new information about the way that the auditory world is parsed. (c) 2012 APA, all rights reserved.
The neural processing of hierarchical structure in music and speech at different timescales
Farbood, Morwaread M.; Heeger, David J.; Marcus, Gary; Hasson, Uri; Lerner, Yulia
2015-01-01
Music, like speech, is a complex auditory signal that contains structures at multiple timescales, and as such is a potentially powerful entry point into the question of how the brain integrates complex streams of information. Using an experimental design modeled after previous studies that used scrambled versions of a spoken story (Lerner et al., 2011) and a silent movie (Hasson et al., 2008), we investigate whether listeners perceive hierarchical structure in music beyond short (~6 s) time windows and whether there is cortical overlap between music and language processing at multiple timescales. Experienced pianists were presented with an extended musical excerpt scrambled at multiple timescales—by measure, phrase, and section—while measuring brain activity with functional magnetic resonance imaging (fMRI). The reliability of evoked activity, as quantified by inter-subject correlation of the fMRI responses, was measured. We found that response reliability depended systematically on musical structure coherence, revealing a topographically organized hierarchy of processing timescales. Early auditory areas (at the bottom of the hierarchy) responded reliably in all conditions. For brain areas at the top of the hierarchy, the original (unscrambled) excerpt evoked more reliable responses than any of the scrambled excerpts, indicating that these brain areas process long-timescale musical structures, on the order of minutes. The topography of processing timescales was analogous with that reported previously for speech, but the timescale gradients for music and speech overlapped with one another only partially, suggesting that temporally analogous structures—words/measures, sentences/musical phrases, paragraph/sections—are processed separately. PMID:26029037
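The reliability measure used in this study, inter-subject correlation of fMRI responses, is commonly computed in a leave-one-out fashion: each subject's timecourse is correlated with the mean timecourse of all other subjects. The sketch below illustrates that computation on synthetic data; the data shapes and noise level are assumptions, not the study's analysis pipeline.

```python
# Hedged leave-one-out sketch of inter-subject correlation (ISC).
import numpy as np

def isc(timecourses):
    """timecourses: array of shape (n_subjects, n_timepoints).
    For each subject, correlate their response with the mean
    response of all remaining subjects."""
    n = timecourses.shape[0]
    scores = []
    for i in range(n):
        others = np.delete(timecourses, i, axis=0).mean(axis=0)
        scores.append(np.corrcoef(timecourses[i], others)[0, 1])
    return np.array(scores)

# Synthetic data: a shared stimulus-driven signal plus subject noise
rng = np.random.default_rng(0)
shared = rng.standard_normal(200)
data = shared + 0.5 * rng.standard_normal((5, 200))
scores = isc(data)  # reliably positive correlations for all subjects
```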
Electrostimulation mapping of comprehension of auditory and visual words.
Roux, Franck-Emmanuel; Miskin, Krasimir; Durand, Jean-Baptiste; Sacko, Oumar; Réhault, Emilie; Tanova, Rositsa; Démonet, Jean-François
2015-10-01
In order to spare functional areas during the removal of brain tumours, electrical stimulation mapping was used in 90 patients (77 in the left hemisphere and 13 in the right; 2754 cortical sites tested). Language functions were studied with a special focus on comprehension of auditory and visual words and the semantic system. In addition to naming, patients were asked to perform pointing tasks from auditory and visual stimuli (using sets of 4 different images controlled for familiarity), and also auditory object (sound recognition) and Token test tasks. Ninety-two auditory comprehension interference sites were observed. We found that the process of auditory comprehension involved a few, fine-grained, sub-centimetre cortical territories. Early stages of speech comprehension seem to relate to two posterior regions in the left superior temporal gyrus. Downstream lexical-semantic speech processing and sound analysis involved 2 pathways, along the anterior part of the left superior temporal gyrus, and posteriorly around the supramarginal and middle temporal gyri. Electrostimulation experimentally dissociated perceptual consciousness attached to speech comprehension. The initial word discrimination process can be considered as an "automatic" stage, the attention feedback not being impaired by stimulation as would be the case at the lexical-semantic stage. Multimodal organization of the superior temporal gyrus was also detected since some neurones could be involved in comprehension of visual material and naming. These findings demonstrate a fine graded, sub-centimetre, cortical representation of speech comprehension processing mainly in the left superior temporal gyrus and are in line with those described in dual stream models of language comprehension processing. Copyright © 2015 Elsevier Ltd. All rights reserved.
Lahnakoski, Juha M; Glerean, Enrico; Salmi, Juha; Jääskeläinen, Iiro P; Sams, Mikko; Hari, Riitta; Nummenmaa, Lauri
2012-01-01
Despite the abundant data on brain networks processing static social signals, such as pictures of faces, the neural systems supporting social perception in naturalistic conditions are still poorly understood. Here we delineated brain networks subserving social perception under naturalistic conditions in 19 healthy humans who watched, during 3-T functional magnetic resonance imaging (fMRI), a set of 137 short (approximately 16 s each, total 27 min) audiovisual movie clips depicting pre-selected social signals. Two independent raters estimated how well each clip represented eight social features (faces, human bodies, biological motion, goal-oriented actions, emotion, social interaction, pain, and speech) and six filler features (places, objects, rigid motion, people not in social interaction, non-goal-oriented action, and non-human sounds) lacking social content. These ratings were used as predictors in the fMRI analysis. The posterior superior temporal sulcus (STS) responded to all social features but not to any non-social features, and the anterior STS responded to all social features except bodies and biological motion. We also found four partially segregated, extended networks for processing of specific social signals: (1) a fronto-temporal network responding to multiple social categories, (2) a fronto-parietal network preferentially activated to bodies, motion, and pain, (3) a temporo-amygdalar network responding to faces, social interaction, and speech, and (4) a fronto-insular network responding to pain, emotions, social interactions, and speech. Our results highlight the role of the pSTS in processing multiple aspects of social information, as well as the feasibility and efficiency of fMRI mapping under conditions that resemble the complexity of real life.
Household hazardous waste management: a review.
Inglezakis, Vassilis J; Moustakas, Konstantinos
2015-03-01
This paper deals with the waste stream of household hazardous waste (HHW), presenting existing management systems, a legislation overview and other relevant quantitative and qualitative information. European Union legislation and international management schemes are summarized and presented in a concise manner through diagrams in order to provide crucial information on HHW. Furthermore, sources and types, numerical figures about generation and collection, and relevant management costs are within the scope of the present paper. The review shows that the term used to refer to hazardous waste generated in households is not clearly defined in legislation, and there is an absence of specific acts regulating the management of HHW. The lack of obligation to segregate HHW from household waste and the differing terminology used make it difficult to determine the quantities and composition of this waste stream, while its generated amount is relatively small and, therefore, commonly overlooked in waste statistics. The paper aims to cover the gap in the related literature on a subject that is among the crucial waste management challenges at the world level, considering that HHW can also have an impact on other waste streams by altering redox conditions or causing direct reactions with non-hazardous waste substances. Copyright © 2014 Elsevier Ltd. All rights reserved.
Effects of sensorineural hearing loss on visually guided attention in a multitalker environment.
Best, Virginia; Marrone, Nicole; Mason, Christine R; Kidd, Gerald; Shinn-Cunningham, Barbara G
2009-03-01
This study asked whether or not listeners with sensorineural hearing loss have an impaired ability to use top-down attention to enhance speech intelligibility in the presence of interfering talkers. Listeners were presented with a target string of spoken digits embedded in a mixture of five spatially separated speech streams. The benefit of providing simple visual cues indicating when and/or where the target would occur was measured in listeners with hearing loss, listeners with normal hearing, and a control group of listeners with normal hearing who were tested at a lower target-to-masker ratio to equate their baseline (no cue) performance with the hearing-loss group. All groups received robust benefits from the visual cues. The magnitude of the spatial-cue benefit, however, was significantly smaller in listeners with hearing loss. Results suggest that reduced utility of selective attention for resolving competition between simultaneous sounds contributes to the communication difficulties experienced by listeners with hearing loss in everyday listening situations.
Fama, Mackenzie E; Hayward, William; Snider, Sarah F; Friedman, Rhonda B; Turkeltaub, Peter E
2017-01-01
Many individuals with aphasia describe anomia with comments like "I know it but I can't say it." The exact meaning of such phrases is unclear. We hypothesize that at least two discrete experiences exist: the sense of (1) knowing a concept, but failing to find the right word, and (2) saying the correct word internally but not aloud (successful inner speech, sIS). We propose that sIS reflects successful lexical access; subsequent overt anomia indicates post-lexical output deficits. In this pilot study, we probed the subjective experience of anomia in 37 persons with aphasia. Self-reported sIS related to aphasia severity and phonological output deficits. In multivariate lesion-symptom mapping, sIS was associated with dorsal stream lesions, particularly in ventral sensorimotor cortex. These preliminary results suggest that people with aphasia can often provide meaningful insights about their experience of anomia and that reports of sIS relate to specific lesion locations and language deficits. Copyright © 2016 Elsevier Inc. All rights reserved.
Structuring Broadcast Audio for Information Access
NASA Astrophysics Data System (ADS)
Gauvain, Jean-Luc; Lamel, Lori
2003-12-01
One rapidly expanding application area for state-of-the-art speech recognition technology is the automatic processing of broadcast audiovisual data for information access. Since much of the linguistic information is found in the audio channel, speech recognition is a key enabling technology which, when combined with information retrieval techniques, can be used for searching large audiovisual document collections. Audio indexing must take into account the specificities of audio data such as needing to deal with the continuous data stream and an imperfect word transcription. Other important considerations are dealing with language specificities and facilitating language portability. At Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), broadcast news transcription systems have been developed for seven languages: English, French, German, Mandarin, Portuguese, Spanish, and Arabic. The transcription systems have been integrated into prototype demonstrators for several application areas such as audio data mining, structuring audiovisual archives, selective dissemination of information, and topic tracking for media monitoring. As examples, this paper addresses the spoken document retrieval and topic tracking tasks.
Ostrand, Rachel; Blumstein, Sheila E.; Ferreira, Victor S.; Morgan, James L.
2016-01-01
Human speech perception often includes both an auditory and visual component. A conflict in these signals can result in the McGurk illusion, in which the listener perceives a fusion of the two streams, implying that information from both has been integrated. We report two experiments investigating whether auditory-visual integration of speech occurs before or after lexical access, and whether the visual signal influences lexical access at all. Subjects were presented with McGurk or Congruent primes and performed a lexical decision task on related or unrelated targets. Although subjects perceived the McGurk illusion, McGurk and Congruent primes with matching real-word auditory signals equivalently primed targets that were semantically related to the auditory signal, but not targets related to the McGurk percept. We conclude that the time course of auditory-visual integration is dependent on the lexicality of the auditory and visual input signals, and that listeners can lexically access one word and yet consciously perceive another. PMID:27011021
Lopopolo, Alessandro; Frank, Stefan L; van den Bosch, Antal; Willems, Roel M
2017-01-01
Language comprehension involves the simultaneous processing of information at the phonological, syntactic, and lexical level. We track these three distinct streams of information in the brain by using stochastic measures derived from computational language models to detect neural correlates of phoneme, part-of-speech, and word processing in an fMRI experiment. Probabilistic language models have proven to be useful tools for studying how language is processed as a sequence of symbols unfolding in time. Conditional probabilities between sequences of words are at the basis of probabilistic measures such as surprisal and perplexity which have been successfully used as predictors of several behavioural and neural correlates of sentence processing. Here we computed perplexity from sequences of words and their parts of speech, and their phonemic transcriptions. Brain activity time-locked to each word is regressed on the three model-derived measures. We observe that the brain keeps track of the statistical structure of lexical, syntactic and phonological information in distinct areas.
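The surprisal and perplexity measures described in this abstract can be sketched with a toy bigram model. The conditional probabilities below are invented for illustration; an actual analysis would derive them from a trained language model over words, parts of speech, or phonemes.

```python
# Hedged sketch of surprisal and perplexity from conditional
# probabilities; the bigram probabilities are illustrative only.
import math

bigram_p = {  # toy P(word | previous word)
    ("the", "dog"): 0.2,
    ("dog", "barks"): 0.5,
}

def surprisal(p):
    """Surprisal in bits: -log2 P(w_t | context)."""
    return -math.log2(p)

def perplexity(probs):
    """Perplexity = 2 ** (mean surprisal over the sequence)."""
    return 2 ** (sum(surprisal(p) for p in probs) / len(probs))

probs = [bigram_p[("the", "dog")], bigram_p[("dog", "barks")]]
ppl = perplexity(probs)  # geometric-mean inverse probability
```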
Effect of Blast Injury on Auditory Localization in Military Service Members.
Kubli, Lina R; Brungart, Douglas; Northern, Jerry
Among the many advantages of binaural hearing are the abilities to localize sounds in space and to attend to one sound in the presence of many sounds. Binaural hearing provides benefits for all listeners, but it may be especially critical for military personnel who must maintain situational awareness in complex tactical environments with multiple speech and noise sources. There is concern that Military Service Members who have been exposed to one or more high-intensity blasts during their tour of duty may have difficulty with binaural and spatial ability due to degradation in auditory and cognitive processes. The primary objective of this study was to assess the ability of blast-exposed Military Service Members to localize speech sounds in quiet and in multisource environments with one or two competing talkers. Participants were presented with one, two, or three topic-related (e.g., sports, food, travel) sentences under headphones and required to attend to, and then locate the source of, the sentence pertaining to a prespecified target topic within a virtual space. The listener's head position was monitored by a head-mounted tracking device that continuously updated the apparent spatial location of the target and competing speech sounds as the subject turned within the virtual space. Measurements of auditory localization ability included mean absolute error in locating the source of the target sentence, the time it took to locate the target sentence within 30 degrees, target/competitor confusion errors, response time, and cumulative head motion. Twenty-one blast-exposed Active-Duty or Veteran Military Service Members (blast-exposed group) and 33 non-blast-exposed Service Members and beneficiaries (control group) were evaluated. In general, the blast-exposed group performed as well as the control group if the task involved localizing the source of a single speech target. 
However, if the task involved two or three simultaneous talkers, localization ability was compromised for some participants in the blast-exposed group. Blast-exposed participants were less accurate in their localization responses and required more exploratory head movements to find the location of the target talker. Results suggest that blast-exposed participants have more difficulty than non-blast-exposed participants in localizing sounds in complex acoustic environments. This apparent deficit in spatial hearing ability highlights the need to develop new diagnostic tests using complex listening tasks that involve multiple sound sources that require speech segregation and comprehension.
White matter anisotropy in the ventral language pathway predicts sound-to-word learning success
Wong, Francis C. K.; Chandrasekaran, Bharath; Garibaldi, Kyla; Wong, Patrick C. M.
2011-01-01
According to the dual stream model of auditory language processing, the dorsal stream is responsible for mapping sound to articulation while the ventral stream maps sound to meaning. Most researchers agree that the arcuate fasciculus (AF) is the neuroanatomical correlate of the dorsal stream; however, less is known about what constitutes the ventral one. Two hypotheses exist: one suggests that the segment of the AF that terminates in the middle temporal gyrus corresponds to the ventral stream, and the other that the extreme capsule underlies this sound-to-meaning pathway. The goal of this study is to evaluate these two competing hypotheses. We trained participants with a sound-to-word learning paradigm in which they learned to use a foreign phonetic contrast for signaling word meaning. Using diffusion tensor imaging (DTI), a brain imaging tool for investigating white matter connectivity in humans, we found that fractional anisotropy in the left parietal-temporal region positively correlated with performance in sound-to-word learning. In addition, fiber tracking revealed a ventral pathway, composed of the extreme capsule and the inferior longitudinal fasciculus, that mediated auditory comprehension. Our findings provide converging evidence supporting the importance of the ventral stream, an extreme capsule system, in the frontal-temporal language network. Implications for current models of speech processing are also discussed. PMID:21677162
Multiple Transmitter Receptors in Regions and Layers of the Human Cerebral Cortex
Zilles, Karl; Palomero-Gallagher, Nicola
2017-01-01
We measured the densities (fmol/mg protein) of 15 different receptors of various transmitter systems in the supragranular, granular and infragranular strata of 44 areas of visual, somatosensory, auditory and multimodal association systems of the human cerebral cortex. Receptor densities were obtained after labeling of the receptors using quantitative in vitro receptor autoradiography in human postmortem brains. The mean density of each receptor type over all cortical layers and of each of the three major strata varies between cortical regions. In a single cortical area, the multi-receptor fingerprints of its strata (i.e., polar plots, each visualizing the densities of multiple different receptor types in supragranular, granular or infragranular layers of the same cortical area) differ in shape and size indicating regional and laminar specific balances between the receptors. Furthermore, the three strata are clearly segregated into well definable clusters by their receptor fingerprints. Fingerprints of different cortical areas systematically vary between functional networks, and with the hierarchical levels within sensory systems. Primary sensory areas are clearly separated from all other cortical areas particularly by their very high muscarinic M2 and nicotinic α4β2 receptor densities, and to a lesser degree also by noradrenergic α2 and serotonergic 5-HT2 receptors. Early visual areas of the dorsal and ventral streams are segregated by their multi-receptor fingerprints. The results are discussed on the background of functional segregation, cortical hierarchies, microstructural types, and the horizontal (layers) and vertical (columns) organization in the cerebral cortex. We conclude that a cortical column is composed of segments, which can be assigned to the cortical strata. The segments differ by their patterns of multi-receptor balances, indicating different layer-specific signal processing mechanisms. 
Additionally, the differences between the strata- and area-specific fingerprints of the 44 areas reflect the segregation of the cerebral cortex into functionally and topographically definable groups of cortical areas (visual, auditory, somatosensory, limbic, motor), and reveal their hierarchical position (primary and unimodal (early) sensory to higher sensory and finally to multimodal association areas). Highlights: Densities of transmitter receptors vary between areas of human cerebral cortex. Multi-receptor fingerprints segregate cortical layers. The densities of all examined receptor types together reach highest values in the supragranular stratum of all areas. The lowest values are found in the infragranular stratum. Multi-receptor fingerprints of entire areas and their layers segregate functional systems. Cortical types (primary sensory, motor, multimodal association) differ in their receptor fingerprints. PMID:28970785
Cross-stream distribution of red blood cells in sickle-cell disease
NASA Astrophysics Data System (ADS)
Zhang, Xiao; Lam, Wilbur; Graham, Michael
2017-11-01
Experiments revealed that in blood flow, red blood cells (RBCs) tend to migrate away from the vessel walls, leaving a cell-free layer near the walls, while leukocytes and platelets tend to marginate towards the vessel walls. This segregation behavior of different cellular components in blood flow can be driven by their differences in stiffness and shape. An alteration of this segregation behavior may explain endothelial dysfunction and pain crisis associated with sickle-cell disease (SCD). It is hypothesized that the sickle RBCs, which are considerably stiffer than the healthy RBCs, may marginate towards the vessel walls and exert repeated damage to the endothelial cells. Direct simulations are performed to study the flowing suspensions of deformable biconcave discoids and stiff sickles representing healthy and sickle cells, respectively. It is observed that the sickles exhibit a strong margination towards the walls. The biconcave discoids in flowing suspensions undergo a so-called tank-treading motion, while the sickles behave as rigid bodies and undergo a tumbling motion. The margination behavior and tumbling motion of the sickles may help substantiate the aforementioned hypothesis of the mechanism for the SCD complications and shed some light on the design of novel therapies.
Controlling mixing and segregation in time periodic granular flows
NASA Astrophysics Data System (ADS)
Bhattacharya, Tathagata
Segregation is a major problem for many solids processing industries. Differences in particle size or density can lead to flow-induced segregation. In the present work, we employ the discrete element method (DEM)---one type of particle dynamics (PD) technique---to investigate the mixing and segregation of granular material in some prototypical solid handling devices, such as a rotating drum and chute. In DEM, one calculates the trajectories of individual particles based on Newton's laws of motion by employing suitable contact force models and a collision detection algorithm. Recently, it has been suggested that segregation in particle mixers can be thwarted if the particle flow is inverted at a rate above a critical forcing frequency. Further, it has been hypothesized that, for a rotating drum, the effectiveness of this technique can be linked to the probability distribution of the number of times a particle passes through the flowing layer per rotation of the drum. In the first portion of this work, various configurations of solid mixers are numerically and experimentally studied to investigate the conditions for improved mixing in light of these hypotheses. Besides rotating drums, many studies of granular flow have focused on gravity-driven chute flows owing to their practical importance in granular transportation and to the fact that the relative simplicity of this type of flow allows for development and testing of new theories. In this part of the work, we observe the deposition behavior of both mono-sized and polydisperse dry granular materials in an inclined chute flow. The effects of different parameters such as chute angle, particle size, falling height and charge amount on the mass fraction distribution of granular materials after deposition are investigated. The simulation results obtained using DEM are compared with the experimental findings and a high degree of agreement is observed.
Tuning of the underlying contact force parameters allows the achievement of realistic results and is used as a means of validating the model against available experimental data. The tuned model is then used to find the critical chute length for segregation based on the hypothesis that segregation can be thwarted if the particle flow is inverted at a rate above a critical forcing frequency. The critical frequency, fcrit, is inversely proportional to the characteristic time of segregation, ts. Mixing is observed instead of segregation when the chute length L < U avgts, where Uavg denotes the average stream-wise flow velocity of the particles. While segregation is often an undesired effect, sometimes separating the components of a particle mixture is the ultimate goal. Rate-based separation processes hold promise as both more environmentally benign as well as less energy intensive when compared to conventional particle separations technologies such as vibrating screens or flotation methods. This approach is based on differences in the kinetic properties of the components of a mixture, such as the velocity of migration or diffusivity. In this portion of the work, two examples of novel rate-based separation devices are demonstrated. The first example involves the study of the dynamics of gravity-driven particles through an array of obstacles. Both discrete element (DEM) simulations and experiments are used to augment the understanding of this device. Dissipative collisions (both between the particles themselves and with the obstacles) give rise to a diffusive motion of particles perpendicular to the flow direction and the differences in diffusion lengths are exploited to separate the particles. The second example employs DEM to analyze a ratchet mechanism where a current of particles can be produced in a direction perpendicular to the energy input. In this setup, a vibrating saw-toothed base is employed to induce different mobility for different types of particles. 
The effects of operating conditions and design parameters on the separation efficiency are discussed. Keywords: granular flow, particle, mixing, segregation, discrete element method, particle dynamics, tumbler, chute, periodic flow inversion, collisional flow, rate-based separation, ratchet, static separator, dissipative particle dynamics, non-spherical droplet.
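The mixing criterion in the abstract above can be stated directly in code: segregation is thwarted when the chute length falls below the distance particles travel in one characteristic segregation time, L < U_avg * t_s, with f_crit ~ 1/t_s. A toy sketch under illustrative numbers (the values are not from the study):

```python
def critical_chute_length(u_avg, t_seg):
    """Chute length below which periodic flow inversion yields mixing:
    L_crit = U_avg * t_s, where f_crit is inversely proportional to t_s."""
    return u_avg * t_seg

def mixes(length, u_avg, t_seg):
    """True when the chute is short enough that mixing wins over segregation
    (L < U_avg * t_s)."""
    return length < critical_chute_length(u_avg, t_seg)
```

With an assumed average stream-wise velocity of 0.5 m/s and a segregation time of 2 s, any chute shorter than 1 m would mix rather than segregate under this criterion.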
2017-01-01
Purpose This review provides clinicians with an overview of recent findings relevant to understanding why listeners with normal hearing thresholds (NHTs) sometimes suffer from communication difficulties in noisy settings. Method The results from neuroscience and psychoacoustics are reviewed. Results In noisy settings, listeners focus their attention by engaging cortical brain networks to suppress unimportant sounds; they then can analyze and understand an important sound, such as speech, amidst competing sounds. Differences in the efficacy of top-down control of attention can affect communication abilities. In addition, subclinical deficits in sensory fidelity can disrupt the ability to perceptually segregate sound sources, interfering with selective attention, even in listeners with NHTs. Studies of variability in control of attention and in sensory coding fidelity may help to isolate and identify some of the causes of communication disorders in individuals presenting at the clinic with “normal hearing.” Conclusions How well an individual with NHTs can understand speech amidst competing sounds depends not only on the sound being audible but also on the integrity of cortical control networks and the fidelity of the representation of suprathreshold sound. Understanding the root cause of difficulties experienced by listeners with NHTs ultimately can lead to new, targeted interventions that address specific deficits affecting communication in noise. Presentation Video http://cred.pubs.asha.org/article.aspx?articleid=2601617 PMID:29049598
Reframing the action and perception dissociation in DF: haptics matters, but how?
Whitwell, Robert L; Buckingham, Gavin
2013-02-01
Goodale and Milner's (1992) "vision-for-action" and "vision-for-perception" account of the division of labor between the dorsal and ventral "streams" has come to dominate contemporary views of the functional roles of these two pathways. Nevertheless, some lines of evidence for the model remain controversial. Recently, Thomas Schenk reexamined visual form agnosic patient DF's spared anticipatory grip scaling to object size, one of the principal empirical pillars of the model. Based on this new evidence, Schenk rejects the original interpretation of DF's spared ability that was based on segregated processing of object size and argues that DF's spared grip scaling relies on haptic feedback to calibrate visual egocentric cues that relate the posture of the hand to the visible edges of the goal-object. However, a careful consideration of the tasks that Schenk employed reveals some problems with his claim. We suspect that the core issues of this controversy will require a closer examination of the role that cognition plays in the operation of the dorsal and ventral streams in healthy controls and in patient DF.
Yarch, Jeff; Federer, Frederick
2017-01-01
Decades of anatomical studies on the primate primary visual cortex (V1) have led to a detailed diagram of V1 intrinsic circuitry, but this diagram lacks information about the output targets of V1 cells. Understanding how V1 local processing relates to downstream processing requires identification of neuronal populations defined by their output targets. In primates, V1 layers (L)2/3 and 4B send segregated projections to distinct cytochrome oxidase (CO) stripes in area V2: neurons in CO blob columns project to thin stripes while neurons outside blob columns project to thick and pale stripes, suggesting functional specialization of V1-to-V2 CO streams. However, the conventional diagram of V1 shows all L4B neurons, regardless of their soma location in blob or interblob columns, as projecting selectively to CO blobs in L2/3, suggesting convergence of blob/interblob information in L2/3 blobs and, possibly, some V2 stripes. However, it is unclear whether all L4B projection neurons show similar local circuitries. Using viral-mediated circuit tracing, we have identified the local circuits of L4B neurons projecting to V2 thick stripes in macaque. Consistent with previous studies, we found the somata of this L4B subpopulation to reside predominantly outside blob columns; however, unlike previous descriptions of local L4B circuits, these cells consistently projected outside CO blob columns in all layers. Thus, the local circuits of these L4B output neurons, just like their extrinsic projections to V2, preserve CO streams. Moreover, the intra-V1 laminar patterns of axonal projections identify two distinct neuron classes within this L4B subpopulation, including a rare novel neuron type, suggestive of two functionally specialized output channels. SIGNIFICANCE STATEMENT Conventional diagrams of primate primary visual cortex (V1) depict neuronal connections within and between different V1 layers, but lack information about the cells' downstream targets. 
This information is critical to understanding how local processing in V1 relates to downstream processing. We have identified the local circuits of a population of cells in V1 layer (L)4B that project to area V2. These cells' local circuits differ from classical descriptions of L4B circuits in both the laminar and functional compartments targeted by their axons, and identify two neuron classes. Our results demonstrate that both local intra-V1 and extrinsic V1-to-V2 connections of L4B neurons preserve CO-stream segregation, suggesting that across-stream integration occurs downstream of V1, and that output targets dictate local V1 circuitry. PMID:28077720
Words, rules, and mechanisms of language acquisition.
Endress, Ansgar D; Bonatti, Luca L
2016-01-01
We review recent artificial language learning studies, especially those following Endress and Bonatti (Endress AD, Bonatti LL. Rapid learning of syllable classes from a perceptually continuous speech stream. Cognition 2007, 105:247-299), suggesting that humans can deploy a variety of learning mechanisms to acquire artificial languages. Several experiments provide evidence for multiple learning mechanisms that can be deployed in fluent speech: one mechanism encodes the positions of syllables within words and can be used to extract generalization, while the other registers co-occurrence statistics of syllables and can be used to break a continuum into its components. We review dissociations between these mechanisms and their potential role in language acquisition. We then turn to recent criticisms of the multiple mechanisms hypothesis and show that they are inconsistent with the available data. Our results suggest that artificial and natural language learning is best understood by dissecting the underlying specialized learning abilities, and that these data provide a rare opportunity to link important language phenomena to basic psychological mechanisms. For further resources related to this article, please visit the WIREs website. © 2015 Wiley Periodicals, Inc.
Vieira, Manuel; Fonseca, Paulo J; Amorim, M Clara P; Teixeira, Carlos J C
2015-12-01
The study of acoustic communication in animals often requires not only the recognition of species specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented inspired by successful results obtained in the most widely known and complex acoustical communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed in recognizing other sound types.
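The hidden-Markov-model methodology above can be illustrated with the forward algorithm, which scores how well a model trained for each individual explains an observed sequence of acoustic features; the toy discrete-observation models below are purely illustrative, not the study's trained toadfish models:

```python
import math

def _logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over a list."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm in the log domain."""
    n = len(start)
    # initialize forward variables with the first observation
    alpha = [math.log(start[i]) + math.log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [math.log(emit[j][o]) +
                 _logsumexp([alpha[i] + math.log(trans[i][j]) for i in range(n)])
                 for j in range(n)]
    return _logsumexp(alpha)
```

Identification then amounts to evaluating the same observation sequence under each individual's model and picking the one with the highest log-likelihood.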
Visually-guided attention enhances target identification in a complex auditory scene.
Best, Virginia; Ozmeral, Erol J; Shinn-Cunningham, Barbara G
2007-06-01
In auditory scenes containing many similar sound sources, sorting of acoustic information into streams becomes difficult, which can lead to disruptions in the identification of behaviorally relevant targets. This study investigated the benefit of providing simple visual cues for when and/or where a target would occur in a complex acoustic mixture. Importantly, the visual cues provided no information about the target content. In separate experiments, human subjects either identified learned birdsongs in the presence of a chorus of unlearned songs or recalled strings of spoken digits in the presence of speech maskers. A visual cue indicating which loudspeaker (from an array of five) would contain the target improved accuracy for both kinds of stimuli. A cue indicating which time segment (out of a possible five) would contain the target also improved accuracy, but much more for birdsong than for speech. These results suggest that in real world situations, information about where a target of interest is located can enhance its identification, while information about when to listen can also be helpful when targets are unfamiliar or extremely similar to their competitors.
A FPGA Implementation of the CAR-FAC Cochlear Model.
Xu, Ying; Thakur, Chetan S; Singh, Ram K; Hamilton, Tara Julia; Wang, Runchun M; van Schaik, André
2018-01-01
This paper presents a digital implementation of the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear model. The CAR part simulates the basilar membrane's (BM) response to sound. The FAC part models the outer hair cell (OHC), the inner hair cell (IHC), and the medial olivocochlear efferent system functions. The FAC feeds back to the CAR by moving the poles and zeros of the CAR resonators automatically. We have implemented a 70-section, 44.1 kHz sampling rate CAR-FAC system on an Altera Cyclone V Field Programmable Gate Array (FPGA) with 18% ALM utilization by using time-multiplexing and pipeline parallelizing techniques and present measurement results here. The fully digital reconfigurable CAR-FAC system is stable, scalable, easy to use, and provides an excellent input stage to more complex machine hearing tasks such as sound localization, sound segregation, speech recognition, and so on.
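The cascade structure of the CAR stage can be sketched as a chain of second-order resonators, each section filtering the previous section's output and contributing a tap. The sketch below is a heavily simplified stand-in: it uses fixed floating-point coefficients and omits the FAC feedback, the OHC/IHC models, and all FPGA fixed-point detail:

```python
import math

def make_resonator(f_c, q, fs):
    """Coefficients of a two-pole resonator centered at f_c Hz."""
    r = math.exp(-math.pi * f_c / (q * fs))   # pole radius (< 1, stable)
    theta = 2 * math.pi * f_c / fs            # pole angle
    return -2 * r * math.cos(theta), r * r    # a1, a2

def car_cascade(signal, freqs, fs, q=5.0):
    """Pass a signal through a cascade of resonators, tapping each section's
    output -- a crude sketch of the CAR filterbank without FAC feedback."""
    taps = []
    x = list(signal)
    for f_c in freqs:
        a1, a2 = make_resonator(f_c, q, fs)
        y1 = y2 = 0.0
        out = []
        for s in x:
            y = s - a1 * y1 - a2 * y2
            y2, y1 = y1, y
            out.append(y)
        taps.append(out)
        x = out   # cascade: the next section filters this section's output
    return taps
```

In the real CAR-FAC, section frequencies follow the basilar membrane's place-frequency map and the pole/zero positions move under FAC control; here they are fixed for clarity.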
Teng, Xiangbin; Tian, Xing; Doelling, Keith; Poeppel, David
2017-10-17
Parsing continuous acoustic streams into perceptual units is fundamental to auditory perception. Previous studies have uncovered a cortical entrainment mechanism in the delta and theta bands (~1-8 Hz) that correlates with formation of perceptual units in speech, music, and other quasi-rhythmic stimuli. Whether cortical oscillations in the delta-theta bands are passively entrained by regular acoustic patterns or play an active role in parsing the acoustic stream is debated. Here, we investigate cortical oscillations using novel stimuli with 1/f modulation spectra. These 1/f signals have no rhythmic structure but contain information over many timescales because of their broadband modulation characteristics. We chose 1/f modulation spectra with varying exponents of f, which simulate the dynamics of environmental noise, speech, vocalizations, and music. While undergoing magnetoencephalography (MEG) recording, participants listened to 1/f stimuli and detected embedded target tones. Tone detection performance varied across stimuli of different exponents and could be explained by the local signal-to-noise ratio computed using a temporal window of around 200 ms. Furthermore, theta band oscillations, surprisingly, were observed for all stimuli, but robust phase coherence was preferentially displayed by stimuli with exponents of 1 and 1.5. We constructed an auditory processing model to quantify acoustic information on various timescales and correlated the model outputs with the neural results. We show that cortical oscillations reflect the chunking of the stream into segments longer than about 200 ms. These results suggest an active auditory segmentation mechanism, complementary to entrainment, operating on a timescale of ~200 ms to organize acoustic information. © 2017 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
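A signal with a 1/f-shaped spectrum of a chosen exponent can be synthesized by imposing a 1/f^alpha amplitude spectrum with random phases and inverting the transform. The pure-Python sketch below shows only this core spectral-shaping idea (the study applied 1/f shaping to the modulation spectrum of a carrier, which involves additional steps not shown here):

```python
import cmath, math, random

def one_over_f_noise(n, alpha, seed=0):
    """Real-valued noise whose amplitude spectrum falls off as 1/f**alpha,
    built from random phases and an inverse DFT (O(n^2); use small n)."""
    rng = random.Random(seed)
    spec = [0j] * n
    for k in range(1, n // 2):
        mag = 1.0 / k ** alpha
        phase = rng.uniform(0.0, 2.0 * math.pi)
        spec[k] = mag * cmath.exp(1j * phase)
        spec[n - k] = spec[k].conjugate()   # Hermitian symmetry -> real signal
    # inverse DFT, normalized by n
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]
```

With alpha = 0 this reduces to flat (white) noise; larger exponents concentrate energy at slow fluctuations, mimicking the 0.5-2 range of exponents used for the different stimulus classes.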
Poliva, Oren
2016-01-01
The auditory cortex communicates with the frontal lobe via the middle temporal gyrus (auditory ventral stream; AVS) or the inferior parietal lobule (auditory dorsal stream; ADS). Whereas the AVS is ascribed only with sound recognition, the ADS is ascribed with sound localization, voice detection, prosodic perception/production, lip-speech integration, phoneme discrimination, articulation, repetition, phonological long-term memory and working memory. Previously, I interpreted the juxtaposition of sound localization, voice detection, audio-visual integration and prosodic analysis, as evidence that the behavioral precursor to human speech is the exchange of contact calls in non-human primates. Herein, I interpret the remaining ADS functions as evidence of additional stages in language evolution. According to this model, the role of the ADS in vocal control enabled early Homo (Hominans) to name objects using monosyllabic calls, and allowed children to learn their parents' calls by imitating their lip movements. Initially, the calls were forgotten quickly but gradually were remembered for longer periods. Once the representations of the calls became permanent, mimicry was limited to infancy, and older individuals encoded in the ADS a lexicon for the names of objects (phonological lexicon). Consequently, sound recognition in the AVS was sufficient for activating the phonological representations in the ADS and mimicry became independent of lip-reading. Later, by developing inhibitory connections between acoustic-syllabic representations in the AVS and phonological representations of subsequent syllables in the ADS, Hominans became capable of concatenating the monosyllabic calls for repeating polysyllabic words (i.e., developed working memory). Finally, due to strengthening of connections between phonological representations in the ADS, Hominans became capable of encoding several syllables as a single representation (chunking). 
Consequently, Hominans began vocalizing and mimicking/rehearsing lists of words (sentences). PMID:27445676
Semaphorin6A acts as a gate keeper between the central and the peripheral nervous system
Mauti, Olivier; Domanitskaya, Elena; Andermatt, Irwin; Sadhu, Rejina; Stoeckli, Esther T
2007-01-01
Background During spinal cord development, expression of chicken SEMAPHORIN6A (SEMA6A) is almost exclusively found in the boundary caps at the ventral motor axon exit point and at the dorsal root entry site. The boundary cap cells are derived from a population of late migrating neural crest cells. They form a transient structure at the transition zone between the peripheral nervous system (PNS) and the central nervous system (CNS). Ablation of the boundary cap resulted in emigration of motoneurons from the ventral spinal cord along the ventral roots. Based on its very restricted expression in boundary cap cells, we tested for a role of Sema6A as a gate keeper between the CNS and the PNS. Results Downregulation of Sema6A in boundary cap cells by in ovo RNA interference resulted in motoneurons streaming out of the spinal cord along the ventral roots, and in the failure of dorsal roots to form and segregate properly. PlexinAs interact with class 6 semaphorins and are expressed by both motoneurons and sensory neurons. Knockdown of PlexinA1 reproduced the phenotype seen after loss of Sema6A function both at the ventral motor exit point and at the dorsal root entry site of the lumbosacral spinal cord. Loss of either PlexinA4 or Sema6D function had an effect only at the dorsal root entry site but not at the ventral motor axon exit point. Conclusion Sema6A acts as a gate keeper between the PNS and the CNS both ventrally and dorsally. It is required for the clustering of boundary cap cells at the PNS/CNS interface and, thus, prevents motoneurons from streaming out of the ventral spinal cord. At the dorsal root entry site it organizes the segregation of dorsal roots. PMID:18088409
Kruschwitz, Johann D; Meyer-Lindenberg, Andreas; Veer, Ilya M; Wackerhagen, Carolin; Erk, Susanne; Mohnke, Sebastian; Pöhland, Lydia; Haddad, Leila; Grimm, Oliver; Tost, Heike; Romanczuk-Seiferth, Nina; Heinz, Andreas; Walter, Martin; Walter, Henrik
2015-10-01
The application of global signal regression (GSR) to resting-state functional magnetic resonance imaging data and its usefulness is a widely discussed topic. In this article, we report an observation of segregated distribution of amygdala resting-state functional connectivity (rs-FC) within the fusiform gyrus (FFG) as an effect of GSR in a multi-center-sample of 276 healthy subjects. Specifically, we observed that amygdala rs-FC was distributed within the FFG as distinct anterior versus posterior clusters delineated by positive versus negative rs-FC polarity when GSR was performed. To characterize this effect in more detail, post hoc analyses revealed the following: first, direct overlays of task-functional magnetic resonance imaging derived face sensitive areas and clusters of positive versus negative amygdala rs-FC showed that the positive amygdala rs-FC cluster corresponded best with the fusiform face area, whereas the occipital face area corresponded to the negative amygdala rs-FC cluster. Second, as expected from a hierarchical face perception model, these amygdala rs-FC defined clusters showed differential rs-FC with other regions of the visual stream. Third, dynamic connectivity analyses revealed that these amygdala rs-FC defined clusters also differed in their rs-FC variance across time to the amygdala. Furthermore, subsample analyses of three independent research sites confirmed reliability of the effect of GSR, as revealed by similar patterns of distinct amygdala rs-FC polarity within the FFG. In this article, we discuss the potential of GSR to segregate face sensitive areas within the FFG and furthermore discuss how our results may relate to the functional organization of the face-perception circuit. © 2015 Wiley Periodicals, Inc.
Wang, Qingcui; Bao, Ming; Chen, Lihan
2014-01-01
Previous studies using auditory sequences with rapid repetition of tones revealed that spatiotemporal cues and spectral cues are important for fusing or segregating sound streams. However, the perceptual grouping was partially driven by the cognitive processing of the periodicity cues of the long sequence. Here, we investigate whether perceptual groupings (spatiotemporal grouping vs. frequency grouping) could also be applicable to short auditory sequences, where auditory perceptual organization is mainly subserved by lower levels of perceptual processing. To answer that question, we conducted two experiments using an auditory Ternus display. The display was composed of three speakers (A, B and C), with each speaker consecutively emitting one sound, forming two frames (AB and BC). Experiment 1 manipulated both spatial and temporal factors. We implemented three 'within-frame intervals' (WFIs, or intervals between A and B, and between B and C), seven 'inter-frame intervals' (IFIs, or intervals between AB and BC) and two different speaker layouts (inter-distance of speakers: near or far). Experiment 2 manipulated the differentiation of frequencies between the two auditory frames, in addition to the spatiotemporal cues of Experiment 1. Listeners made two-alternative forced choices (2AFC) to report the perception of a given Ternus display: element motion (auditory apparent motion from sound A to B to C) or group motion (auditory apparent motion from sound 'AB' to 'BC'). The results indicate that the perceptual grouping of short auditory sequences (materialized by the perceptual decisions on the auditory Ternus display) was modulated by temporal and spectral cues, with the latter contributing more to segregating auditory events. Spatial layout played a lesser role in perceptual organization. These results can be accounted for by the 'peripheral channeling' theory.
O'Sullivan, James A; Shamma, Shihab A; Lalor, Edmund C
2015-05-06
The human brain has evolved to operate effectively in highly complex acoustic environments, segregating multiple sound sources into perceptually distinct auditory objects. A recent theory seeks to explain this ability by arguing that stream segregation occurs primarily due to the temporal coherence of the neural populations that encode the various features of an individual acoustic source. This theory has received support from both psychoacoustic and functional magnetic resonance imaging (fMRI) studies that use stimuli which model complex acoustic environments. Termed stochastic figure-ground (SFG) stimuli, they are composed of a "figure" and background that overlap in spectrotemporal space, such that the only way to segregate the figure is by computing the coherence of its frequency components over time. Here, we extend these psychoacoustic and fMRI findings by using the greater temporal resolution of electroencephalography to investigate the neural computation of temporal coherence. We present subjects with modified SFG stimuli wherein the temporal coherence of the figure is modulated stochastically over time, which allows us to use linear regression methods to extract a signature of the neural processing of this temporal coherence. We do this under both active and passive listening conditions. Our findings show an early effect of coherence during passive listening, lasting from ∼115 to 185 ms post-stimulus. When subjects are actively listening to the stimuli, these responses are larger and last longer, up to ∼265 ms. These findings provide evidence for early and preattentive neural computations of temporal coherence that are enhanced by active analysis of an auditory scene. Copyright © 2015 the authors.
Attention selectively modulates cortical entrainment in different regions of the speech spectrum
Baltzell, Lucas S.; Horton, Cort; Shen, Yi; Richards, Virginia M.; D'Zmura, Michael; Srinivasan, Ramesh
2016-01-01
Recent studies have uncovered a neural response that appears to track the envelope of speech, and have shown that this tracking process is mediated by attention. It has been argued that this tracking reflects a process of phase-locking to the fluctuations of stimulus energy, ensuring that this energy arrives during periods of high neuronal excitability. Because all acoustic stimuli are decomposed into spectral channels at the cochlea, and this spectral decomposition is maintained along the ascending auditory pathway and into auditory cortex, we hypothesized that the overall stimulus envelope is not as relevant to cortical processing as the individual frequency channels; attention may be mediating envelope tracking differentially across these spectral channels. To test this we reanalyzed data reported by Horton et al. (2013), where high-density EEG was recorded while adults attended to one of two competing naturalistic speech streams. In order to simulate cochlear filtering, the stimuli were passed through a gammatone filterbank, and temporal envelopes were extracted at each filter output. Following Horton et al. (2013), the attended and unattended envelopes were cross-correlated with the EEG, and local maxima were extracted at three different latency ranges corresponding to distinct peaks in the cross-correlation function (N1, P2, and N2). We found that the ratio between the attended and unattended cross-correlation functions varied across frequency channels in the N1 latency range, consistent with the hypothesis that attention differentially modulates envelope-tracking activity across spectral channels. PMID:27195825
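The per-channel envelope-tracking analysis described above can be sketched end-to-end in a few lines of numpy. This is a minimal illustration under stated assumptions, not the study's pipeline: a brick-wall FFT band-pass stands in for a true gammatone filter, the sampling rate, band edges, and modulation rates are arbitrary choices for the demo, and the "EEG" is simulated as a lagged, noisy copy of the attended stream's envelope.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 250                       # Hz, illustrative sampling rate
t = np.arange(0, 8, 1 / fs)    # 8 s of signal (2000 samples, even length)

def bandpass_fft(x, lo, hi, fs):
    """Crude FFT brick-wall band-pass (a stand-in for one gammatone channel)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0
    return np.fft.irfft(spec, n=len(x))

def envelope(x):
    """Magnitude of the analytic signal (FFT-based Hilbert transform)."""
    n = len(x)
    spec = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:n // 2] = 2
    h[n // 2] = 1              # Nyquist bin keeps weight 1 (n is even here)
    return np.abs(np.fft.ifft(spec * h))

# Two "streams" occupying different spectral channels, with distinct envelopes
env_attended = 1 + 0.5 * np.sin(2 * np.pi * 3 * t)
env_ignored = 1 + 0.5 * np.sin(2 * np.pi * 5 * t)
stimulus = (env_attended * np.sin(2 * np.pi * 40 * t)
            + env_ignored * np.sin(2 * np.pi * 80 * t))

# Simulated EEG: a 100 ms delayed copy of the attended envelope buried in noise
lag = int(0.1 * fs)
eeg = np.roll(env_attended - env_attended.mean(), lag)
eeg = eeg + 0.5 * rng.standard_normal(len(t))

def peak_xcorr(a, b):
    """Peak of the normalized cross-correlation over all lags."""
    a = a - a.mean()
    b = b - b.mean()
    xc = np.correlate(a, b, mode="full") / (len(a) * a.std() * b.std())
    return xc.max()

tracking = {}
for name, (lo, hi) in {"40 Hz channel": (30, 50), "80 Hz channel": (70, 90)}.items():
    ch_env = envelope(bandpass_fft(stimulus, lo, hi, fs))
    tracking[name] = peak_xcorr(eeg, ch_env)
```

With these settings, the peak cross-correlation in the channel carrying the attended envelope clearly exceeds that of the other channel, mirroring the attended/unattended ratio analysis applied per frequency channel.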
An evaluation of dynamic lip-tooth characteristics during speech and smile in adolescents.
Ackerman, Marc B; Brensinger, Colleen; Landis, J Richard
2004-02-01
This retrospective study was conducted to measure lip-tooth characteristics of adolescents. Pretreatment video clips of 1242 consecutive patients were screened for Class I skeletal and dental patterns. After all inclusion criteria were applied, the final sample consisted of 50 patients (27 boys, 23 girls) with a mean age of 12.5 years. The raw digital video stream of each patient was edited to select a single image frame representing the patient saying the syllable "chee" and a second single image representing the patient's posed social smile and saved as part of a 12-frame image sequence. Each animation image was analyzed using a SmileMesh computer application to measure the smile index (the ratio of the intercommissure width divided by the interlabial gap), intercommissure width (mm), interlabial gap (mm), percent incisor below the intercommissure line, and maximum incisor exposure (mm). The data were analyzed using SAS (version 8.1). All recorded differences in linear measures had to be ≥ 2 mm. The results suggest that anterior tooth display at speech and smile should be recorded independently but evaluated as part of a dynamic range. Asking patients to say "cheese" and then smile is no longer a valid method to elicit the parameters of anterior tooth display. When planning the vertical positions of incisors during orthodontic treatment, the orthodontist should view the dynamics of anterior tooth display as a continuum delineated by the time points of rest, speech, posed social smile, and a Duchenne smile.
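The smile index defined above is a simple ratio, which can be computed per video frame. A minimal sketch follows; the measurement values are hypothetical, in millimetres, and are not taken from the study:

```python
def smile_index(intercommissure_width_mm, interlabial_gap_mm):
    """Smile index = intercommissure width / interlabial gap (dimensionless)."""
    if interlabial_gap_mm <= 0:
        raise ValueError("interlabial gap must be positive (lips must be parted)")
    return intercommissure_width_mm / interlabial_gap_mm

# Hypothetical frame measurements: speech ("chee") vs. posed social smile
speech_index = smile_index(55.0, 5.0)   # narrow gap during speech -> high index
smile_idx = smile_index(60.0, 10.0)     # wider interlabial gap -> lower index
```

Comparing the index across the frame sequence (rest, speech, posed smile, Duchenne smile) captures the dynamic range the authors recommend evaluating.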
An oscillator model of the timing of turn-taking.
Wilson, Margaret; Wilson, Thomas P
2005-12-01
When humans talk without conventionalized arrangements, they engage in conversation--that is, a continuous and largely nonsimultaneous exchange in which speakers take turns. Turn-taking is ubiquitous in conversation and is the normal case against which alternatives, such as interruptions, are treated as violations that warrant repair. Furthermore, turn-taking involves highly coordinated timing, including a cyclic rise and fall in the probability of initiating speech during brief silences, and involves the notable rarity, especially in two-party conversations, of two speakers' breaking a silence at once. These phenomena, reported by conversation analysts, have been neglected by cognitive psychologists, and to date there has been no adequate cognitive explanation. Here, we propose that, during conversation, endogenous oscillators in the brains of the speaker and the listeners become mutually entrained, on the basis of the speaker's rate of syllable production. This entrained cyclic pattern governs the potential for initiating speech at any given instant for the speaker and also for the listeners (as potential next speakers). Furthermore, the readiness functions of the listeners are counterphased with that of the speaker, minimizing the likelihood of simultaneous starts by a listener and the previous speaker. This mutual entrainment continues for a brief period when the speech stream ceases, accounting for the cyclic property of silences. This model not only captures the timing phenomena observed in the literature on conversation analysis, but also converges with findings from the literatures on phoneme timing, syllable organization, and interpersonal coordination.
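The core mechanism proposed above, a listener oscillator entrained in counterphase to the speaker's syllable-rate oscillator, can be illustrated with a minimal Kuramoto-style simulation. The syllable rates and coupling strength below are illustrative assumptions, not fitted values from the paper:

```python
import math

def simulate(omega_speaker=2 * math.pi * 4.0,    # speaker: ~4 syllables/s
             omega_listener=2 * math.pi * 3.6,   # listener's natural rate differs
             coupling=8.0, dt=0.001, steps=5000):
    """Listener oscillator entrains to the speaker in antiphase (offset pi).

    Readiness to initiate speech is assumed maximal at a fixed oscillator
    phase, so an antiphase listener is least ready exactly when the speaker
    is most ready, minimizing simultaneous starts.
    """
    theta_s, theta_l = 0.0, 1.0
    for _ in range(steps):
        theta_s += omega_speaker * dt
        # Kuramoto-style coupling pulls theta_l toward theta_s + pi
        theta_l += (omega_listener
                    + coupling * math.sin(theta_s + math.pi - theta_l)) * dt
    return (theta_l - theta_s) % (2 * math.pi)

phase_diff = simulate()  # settles near pi (exact value offset by the rate mismatch)
```

The small residual deviation from exactly pi reflects the frequency mismatch between the two oscillators, a standard property of phase-locked coupled oscillators.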
Howe, M. S.; McGowan, R. S.
2011-01-01
An analysis is made of the sound generated by the time-dependent throttling of a nominally steady stream of air through a small orifice into a flow-through resonant cavity. This is exemplified by the production of voiced speech, where air from the lungs enters the vocal tract through the glottis at a time variable volume flow rate Q(t) controlled by oscillations of the glottis cross-section. Voicing theory has hitherto determined Q from a heuristic, reduced complexity ‘Fant’ differential equation (G. Fant, Acoustic Theory of Speech Production, 1960). A new self-consistent, integro-differential form of this equation is derived in this paper using the theory of aerodynamic sound, with full account taken of the back-reaction of the resonant tract on the glottal flux Q. The theory involves an aeroacoustic Green’s function (G) for flow-surface interactions in a time-dependent glottis, so making the problem non-self-adjoint. In complex problems of this type it is not usually possible to obtain G in an explicit analytic form. The principal objective of the paper is to show how the Fant equation can still be derived in such cases from a consideration of the equation of aerodynamic sound and from the adjoint of the equation governing G in the neighbourhood of the ‘throttle’. The theory is illustrated by application to the canonical problem of throttled flow into a Helmholtz resonator. PMID:21666824
Mostafapour, S P; Lahargoue, K; Gates, G A
1998-12-01
No consensus exists regarding the magnitude of the risk of noise-induced hearing loss (NIHL) associated with leisure noise, in particular, personal listening devices in young adults. We examined the magnitude of hearing loss associated with personal listening devices and other sources of leisure noise in causing NIHL in young adults. Prospective auditory testing of college student volunteers with a retrospective history of exposure to home stereos, personal listening devices, firearms, and other sources of recreational noise. Subjects underwent audiologic examination consisting of estimation of pure-tone thresholds, speech reception thresholds, and word recognition at 45 dB HL. Fifty subjects aged 18 to 30 years were tested. All hearing thresholds of all subjects (save one, a unilateral 30 dB HL threshold at 6 kHz) were normal (i.e., 25 dB HL or better). A 10 dB threshold elevation (notch) in either ear at 3 to 6 kHz as compared with neighboring frequencies was noted in 11 (22%) subjects and an unequivocal notch (15 dB or greater) in either ear was noted in 14 (28%) of subjects. The presence or absence of any notch (small or large) did not correlate with any single or cumulative source of noise exposure. No difference in pure-tone threshold, speech reception threshold, or speech discrimination was found among subjects when segregated by noise exposure level. The majority of young users of personal listening devices are at low risk for substantive NIHL. Interpretation of the significance of these findings in relation to noise exposure must be made with caution. NIHL is an additive process and even subtle deficits may contribute to unequivocal hearing loss with continued exposure. The low prevalence of measurable deficits in this study group may not exclude more substantive deficits in other populations with greater exposures. Continued education of young people about the risk to hearing from recreational noise exposure is warranted.
2015-01-01
An important aspect of speech perception is the ability to group or select formants using cues in the acoustic source characteristics—for example, fundamental frequency (F0) differences between formants promote their segregation. This study explored the role of more radical differences in source characteristics. Three-formant (F1+F2+F3) synthetic speech analogues were derived from natural sentences. In Experiment 1, F1+F3 were generated by passing a harmonic glottal source (F0 = 140 Hz) through second-order resonators (H1+H3); in Experiment 2, F1+F3 were tonal (sine-wave) analogues (T1+T3). F2 could take either form (H2 or T2). In some conditions, the target formants were presented alone, either monaurally or dichotically (left ear = F1+F3; right ear = F2). In others, they were accompanied by a competitor for F2 (F1+F2C+F3; F2), which listeners must reject to optimize recognition. Competitors (H2C or T2C) were created using the time-reversed frequency and amplitude contours of F2. Dichotic presentation of F2 and F2C ensured that the impact of the competitor arose primarily through informational masking. In the absence of F2C, the effect of a source mismatch between F1+F3 and F2 was relatively modest. When F2C was present, intelligibility was lowest when F2 was tonal and F2C was harmonic, irrespective of which type matched F1+F3. This finding suggests that source type and context, rather than similarity, govern the phonetic contribution of a formant. It is proposed that wideband harmonic analogues are more effective informational maskers than narrowband tonal analogues, and so become dominant in across-frequency integration of phonetic information when placed in competition. PMID:25751040
A recurrent neural model for proto-object based contour integration and figure-ground segregation.
Hu, Brian; Niebur, Ernst
2017-12-01
Visual processing of objects makes use of both feedforward and feedback streams of information. However, the nature of feedback signals is largely unknown, as is the identity of the neuronal populations in lower visual areas that receive them. Here, we develop a recurrent neural model to address these questions in the context of contour integration and figure-ground segregation. A key feature of our model is the use of grouping neurons whose activity represents tentative objects ("proto-objects") based on the integration of local feature information. Grouping neurons receive input from an organized set of local feature neurons, and project modulatory feedback to those same neurons. Additionally, inhibition at both the local feature level and the object representation level biases the interpretation of the visual scene in agreement with principles from Gestalt psychology. Our model explains several sets of neurophysiological results (Zhou et al. Journal of Neuroscience, 20(17), 6594-6611 2000; Qiu et al. Nature Neuroscience, 10(11), 1492-1499 2007; Chen et al. Neuron, 82(3), 682-694 2014), and makes testable predictions about the influence of neuronal feedback and attentional selection on neural responses across different visual areas. Our model also provides a framework for understanding how object-based attention is able to select both objects and the features associated with them.
Neurocognitive mechanisms of gaze-expression interactions in face processing and social attention
Graham, Reiko; LaBar, Kevin S.
2012-01-01
The face conveys a rich source of non-verbal information used during social communication. While research has revealed how specific facial channels such as emotional expression are processed, little is known about the prioritization and integration of multiple cues in the face during dyadic exchanges. Classic models of face perception have emphasized the segregation of dynamic versus static facial features along independent information processing pathways. Here we review recent behavioral and neuroscientific evidence suggesting that within the dynamic stream, concurrent changes in eye gaze and emotional expression can yield early independent effects on face judgments and covert shifts of visuospatial attention. These effects are partially segregated within initial visual afferent processing volleys, but are subsequently integrated in limbic regions such as the amygdala or via reentrant visual processing volleys. This spatiotemporal pattern may help to resolve otherwise perplexing discrepancies across behavioral studies of emotional influences on gaze-directed attentional cueing. Theoretical explanations of gaze-expression interactions are discussed, with special consideration of speed-of-processing (discriminability) and contextual (ambiguity) accounts. Future research in this area promises to reveal the mental chronometry of face processing and interpersonal attention, with implications for understanding how social referencing develops in infancy and is impaired in autism and other disorders of social cognition. PMID:22285906
CA1 pyramidal cell diversity enabling parallel information processing in the hippocampus
Soltesz, Ivan; Losonczy, Attila
2018-01-01
Hippocampal network operations supporting spatial navigation and declarative memory are traditionally interpreted in a framework where each hippocampal area, such as the dentate gyrus, CA3, and CA1, consists of homogeneous populations of functionally equivalent principal neurons. However, heterogeneity within hippocampal principal cell populations, in particular within pyramidal cells at the main CA1 output node, is increasingly recognized and includes developmental, molecular, anatomical, and functional differences. Here we review recent progress in the delineation of hippocampal principal cell subpopulations by focusing on radially defined subpopulations of CA1 pyramidal cells, and we consider how functional segregation of information streams, in parallel channels with nonuniform properties, could represent a general organizational principle of the hippocampus supporting diverse behaviors. PMID:29593317
Central Auditory Processing of Temporal and Spectral-Variance Cues in Cochlear Implant Listeners
Pham, Carol Q.; Bremen, Peter; Shen, Weidong; Yang, Shi-Ming; Middlebrooks, John C.; Zeng, Fan-Gang; Mc Laughlin, Myles
2015-01-01
Cochlear implant (CI) listeners have difficulty understanding speech in complex listening environments. This deficit is thought to be largely due to peripheral encoding problems arising from current spread, which results in wide peripheral filters. In normal hearing (NH) listeners, central processing contributes to segregation of speech from competing sounds. We tested the hypothesis that basic central processing abilities are retained in post-lingually deaf CI listeners, but processing is hampered by degraded input from the periphery. In eight CI listeners, we measured auditory nerve compound action potentials to characterize peripheral filters. Then, we measured psychophysical detection thresholds in the presence of multi-electrode maskers placed either inside (peripheral masking) or outside (central masking) the peripheral filter. This was intended to distinguish peripheral from central contributions to signal detection. Introduction of temporal asynchrony between the signal and masker improved signal detection in both peripheral and central masking conditions for all CI listeners. Randomly varying components of the masker created spectral-variance cues, which seemed to benefit only two out of eight CI listeners. Contrastingly, the spectral-variance cues improved signal detection in all five NH listeners who listened to our CI simulation. Together these results indicate that widened peripheral filters significantly hamper central processing of spectral-variance cues but not of temporal cues in post-lingually deaf CI listeners. As indicated by two CI listeners in our study, however, post-lingually deaf CI listeners may retain some central processing abilities similar to NH listeners. PMID:26176553
Language change in a multiple group society
NASA Astrophysics Data System (ADS)
Pop, Cristina-Maria; Frey, Erwin
2013-08-01
The processes leading to change in languages are manifold. In order to reduce ambiguity in the transmission of information, agreement on a set of conventions for recurring problems is favored. In addition to that, speakers tend to use particular linguistic variants associated with the social groups they identify with. The influence of other groups propagating across the speech community as new variant forms sustains the competition between linguistic variants. With the utterance selection model, an evolutionary description of language change, Baxter et al. [Phys. Rev. E 73, 046118 (2006)] have provided a mathematical formulation of the interactions inside a group of speakers, exploring the mechanisms that lead to or inhibit the fixation of linguistic variants. In this paper, we take the utterance selection model one step further by describing a speech community consisting of multiple interacting groups. Tuning the interaction strength between groups allows us to gain deeper understanding about the way in which linguistic variants propagate and how their distribution depends on the group partitioning. Both for the group size and the number of groups we find scaling behaviors with two asymptotic regimes. If groups are strongly connected, the dynamics is that of the standard utterance selection model, whereas if their coupling is weak, the magnitude of the latter along with the system size governs the way consensus is reached. Furthermore, we find that a high influence of the interlocutor on a speaker's utterances can act as a counterweight to group segregation.
Neural Signatures of Stimulus Features in Visual Working Memory—A Spatiotemporal Approach
Jackson, Margaret C.; Klein, Christoph; Mohr, Harald; Shapiro, Kimron L.; Linden, David E. J.
2010-01-01
We examined the neural signatures of stimulus features in visual working memory (WM) by integrating functional magnetic resonance imaging (fMRI) and event-related potential data recorded during mental manipulation of colors, rotation angles, and color–angle conjunctions. The N200, negative slow wave, and P3b were modulated by the information content of WM, and an fMRI-constrained source model revealed a progression in neural activity from posterior visual areas to higher order areas in the ventral and dorsal processing streams. Color processing was associated with activity in inferior frontal gyrus during encoding and retrieval, whereas angle processing involved right parietal regions during the delay interval. WM for color–angle conjunctions did not involve any additional neural processes. The finding that different patterns of brain activity underlie WM for color and spatial information is consistent with ideas that the ventral/dorsal “what/where” segregation of perceptual processing influences WM organization. The absence of characteristic signatures of conjunction-related brain activity, which was generally intermediate between the 2 single conditions, suggests that conjunction judgments are based on the coordinated activity of these 2 streams. PMID:19429863
Halim, Zahid; Abbas, Ghulam
2015-01-01
Sign language provides hearing and speech impaired individuals with an interface to communicate with other members of society. Unfortunately, sign language is not understood by most people. For this, a gadget based on image processing and pattern recognition can provide a vital aid for detecting and translating sign language into a vocal language. This work presents a system for detecting and understanding sign language gestures with a custom-built software tool and later translating the gesture into a vocal language. For the purpose of recognizing a particular gesture, the system employs a Dynamic Time Warping (DTW) algorithm, and an off-the-shelf software tool is employed for vocal language generation. Microsoft Kinect is the primary tool used to capture the video stream of a user. The proposed method is capable of successfully detecting gestures stored in the dictionary with an accuracy of 91%. The proposed system has the ability to define and add custom-made gestures. Based on an experiment in which 10 individuals with impairments used the system to communicate with 5 people with no disability, 87% agreed that the system was useful.
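The gesture-matching step can be illustrated with a textbook DTW implementation: compute the warping distance between an observed trace and each dictionary template, and return the nearest one. The templates below are hypothetical one-dimensional traces for clarity; the actual system matches multi-dimensional Kinect skeleton streams:

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) Dynamic Time Warping distance for 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or diagonal match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify(trace, templates):
    """Return the dictionary gesture whose template is nearest under DTW."""
    return min(templates, key=lambda name: dtw_distance(trace, templates[name]))

# Hypothetical gesture dictionary (1-D joint trajectories)
templates = {
    "wave": [0, 1, 2, 3, 2, 1, 0],
    "push": [0, 0, 3, 3, 3, 0, 0],
}
observed = [0, 0, 1, 2, 2, 3, 2, 1, 1, 0]  # a slowed-down "wave"
```

Because DTW aligns sequences elastically in time, the slowed-down trace still matches its template exactly (distance 0), which is why DTW suits gestures performed at varying speeds.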
Perruchet, Pierre; Tillmann, Barbara
2010-03-01
This study investigates the joint influences of three factors on the discovery of new word-like units in a continuous artificial speech stream: the statistical structure of the ongoing input, the initial word-likeness of parts of the speech flow, and the contextual information provided by the earlier emergence of other word-like units. Results of an experiment conducted with adult participants show that these sources of information have strong and interactive influences on word discovery. The authors then examine the ability of different models of word segmentation to account for these results. PARSER (Perruchet & Vinter, 1998) is compared with the view that word segmentation relies on the exploitation of transitional probabilities between successive syllables, and with models based on the Minimum Description Length principle, such as INCDROP. The authors submit arguments suggesting that PARSER has the advantage of accounting for the whole pattern of data without ad-hoc modifications, while relying exclusively on general-purpose learning principles. This study strengthens the growing notion that nonspecific cognitive processes, mainly based on associative learning and memory principles, are able to account for a larger part of early language acquisition than previously assumed. Copyright © 2009 Cognitive Science Society, Inc.
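The transitional-probability account that PARSER is compared against can be sketched directly: estimate the forward transitional probability P(next syllable | current syllable) from bigram counts, then posit word boundaries wherever the TP dips. The toy lexicon, word-order sequence, and threshold below are illustrative stand-ins, not the study's materials:

```python
from collections import Counter

WORDS = {"tupiro": ["tu", "pi", "ro"],
         "golabu": ["go", "la", "bu"],
         "bidaku": ["bi", "da", "ku"]}

# Word order varies, so between-word transitions are less predictable
order = ["tupiro", "golabu", "bidaku",
         "golabu", "bidaku", "tupiro",
         "bidaku", "tupiro", "golabu"] * 3
stream = [syl for w in order for syl in WORDS[w]]

def transitional_probs(syllables):
    """Forward TP: P(next syllable | current syllable), from bigram counts."""
    bigrams = Counter(zip(syllables, syllables[1:]))
    unigrams = Counter(syllables[:-1])
    return {(a, b): n / unigrams[a] for (a, b), n in bigrams.items()}

def segment(syllables, tp, threshold=0.8):
    """Posit a word boundary wherever the forward TP drops below threshold."""
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

tp = transitional_probs(stream)
found = segment(stream, tp)
```

Within-word TPs here are 1.0 while between-word TPs are at most 0.75, so thresholding recovers exactly the three embedded words; PARSER, by contrast, builds and strengthens chunk representations rather than computing TPs explicitly.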
MPEG-7 audio-visual indexing test-bed for video retrieval
NASA Astrophysics Data System (ADS)
Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian
2003-12-01
This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine and is accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, end-users will be able to retrieve movie shots in the database that were produced in a specific year, that contain the face of a specific actor speaking a specific word, and in which there is no motion activity. Video streaming is performed over the high bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.
Sullivan, Jessica R.; Assmann, Peter F.; Hossain, Shaikat; Schafer, Erin C.
2017-01-01
Two experiments explored the role of differences in voice gender in the recognition of speech masked by a competing talker in cochlear implant simulations. Experiment 1 confirmed that listeners with normal hearing receive little benefit from differences in voice gender between a target and masker sentence in four- and eight-channel simulations, consistent with previous findings that cochlear implants deliver an impoverished representation of the cues for voice gender. However, gender differences led to small but significant improvements in word recognition with 16 and 32 channels. Experiment 2 assessed the benefits of perceptual training on the use of voice gender cues in an eight-channel simulation. Listeners were assigned to one of four groups: (1) word recognition training with target and masker differing in gender; (2) word recognition training with same-gender target and masker; (3) gender recognition training; or (4) control with no training. Significant improvements in word recognition were observed from pre- to post-test sessions for all three training groups compared to the control group. These improvements were maintained at the late session (one week following the last training session) for all three groups. There was an overall improvement in masked word recognition performance provided by gender mismatch following training, but the amount of benefit did not differ as a function of the type of training. The training effects observed here are consistent with a form of rapid perceptual learning that contributes to the segregation of competing voices but does not specifically enhance the benefits provided by voice gender cues. PMID:28372046
Räsänen, Okko; Kakouros, Sofoklis; Soderstrom, Melanie
2018-06-06
The exaggerated intonation and special rhythmic properties of infant-directed speech (IDS) have been hypothesized to attract infants' attention to the speech stream. However, there has been little work actually connecting the properties of IDS to models of attentional processing or perceptual learning. A number of such attention models suggest that surprising or novel perceptual inputs attract attention, where novelty can be operationalized as the statistical (un)predictability of the stimulus in the given context. Since prosodic patterns such as F0 contours are accessible to young infants who are also known to be adept statistical learners, the present paper investigates a hypothesis that F0 contours in IDS are less predictable than those in adult-directed speech (ADS), given previous exposure to both speaking styles, thereby potentially tapping into basic attentional mechanisms of the listeners in a similar manner as relative probabilities of other linguistic patterns are known to modulate attentional processing in infants and adults. Computational modeling analyses with naturalistic IDS and ADS speech from matched speakers and contexts show that IDS intonation has lower overall temporal predictability even when the F0 contours of both speaking styles are normalized to have equal means and variances. A closer analysis reveals that there is a tendency of IDS intonation to be less predictable at the end of short utterances, whereas ADS exhibits more stable average predictability patterns across the full extent of the utterances. The difference between IDS and ADS persists even when the proportion of IDS and ADS exposure is varied substantially, simulating different relative amounts of IDS heard in different family and cultural environments. Exposure to IDS is also found to be more efficient for predicting ADS intonation contours in new utterances than exposure to an equal amount of ADS speech.
This indicates that the more variable prosodic contours of IDS also generalize to ADS, and may therefore enhance prosodic learning in infancy. Overall, the study suggests that one reason behind infant preference for IDS could be its higher information value at the prosodic level, as measured by the amount of surprisal in the F0 contours. This provides the first formal link between the properties of IDS and the models of attentional processing and statistical learning in the brain. However, this finding does not rule out the possibility that other differences between the IDS and ADS also play a role. Copyright © 2018 Elsevier B.V. All rights reserved.
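The predictability analysis described above can be caricatured with a bigram model over quantized F0 bins: train on contours from one speaking style, then measure the mean surprisal (-log2 probability) of new contours. The synthetic contours, bin edges, and smoothing below are assumptions for illustration, not the paper's actual model or data:

```python
import numpy as np
from collections import Counter, defaultdict

def quantize_f0(f0_hz, n_bins=8, lo=75.0, hi=500.0):
    """Map an F0 contour (Hz) onto discrete pitch bins 0..n_bins-1."""
    edges = np.linspace(lo, hi, n_bins - 1)
    return np.digitize(f0_hz, edges)

def train_bigram(contours):
    """Count bin-to-bin transitions across a set of quantized contours."""
    counts = defaultdict(Counter)
    for c in contours:
        for a, b in zip(c, c[1:]):
            counts[a][b] += 1
    return counts

def mean_surprisal(contour, counts, n_bins=8, alpha=1.0):
    """Average -log2 P(next bin | current bin), with add-alpha smoothing."""
    s = []
    for a, b in zip(contour, contour[1:]):
        total = sum(counts[a].values()) + alpha * n_bins
        s.append(-np.log2((counts[a][b] + alpha) / total))
    return float(np.mean(s))

t = np.arange(200)
# Smoothly drifting "ADS-like" contours vs. an erratic "IDS-like" contour
smooth = [quantize_f0(200 + 60 * np.sin(2 * np.pi * t / 80 + p))
          for p in (0.0, 1.0, 2.0)]
jumpy = quantize_f0(np.where(t % 2 == 0, 120, 420))

model = train_bigram(smooth[:2])               # "exposure" phase
ads_surprisal = mean_surprisal(smooth[2], model)
ids_surprisal = mean_surprisal(jumpy, model)
```

After exposure to the smooth contours, the erratic contour yields higher mean surprisal, the same direction of effect the paper reports for IDS versus ADS intonation.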
A centralized audio presentation manager
DOE Office of Scientific and Technical Information (OSTI.GOV)
Papp, A.L. III; Blattner, M.M.
1994-05-16
The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.
Manipulation of Liquids Using Phased Array Generation of Acoustic Radiation Pressure
NASA Technical Reports Server (NTRS)
Oeftering, Richard C. (Inventor)
2000-01-01
A phased array of piezoelectric transducers is used to control and manipulate contained as well as uncontained fluids in space and earth applications. The transducers in the phased array are individually activated while being commonly controlled to produce acoustic radiation pressure and acoustic streaming. The phased array is activated to produce a single pulse, a pulse burst or a continuous pulse to agitate, segregate or manipulate liquids and gases. The phased array generated acoustic radiation pressure is also useful in manipulating a drop, a bubble or other object immersed in a liquid. The transducers can be arranged in any number of layouts including linear single or multi- dimensional, space curved and annular arrays. The individual transducers in the array are activated by a controller, preferably driven by a computer.
Breska, Assaf; Deouell, Leon Y
2017-02-01
Predicting the timing of upcoming events enables efficient resource allocation and action preparation. Rhythmic streams, such as music, speech, and biological motion, constitute a pervasive source for temporal predictions. Widely accepted entrainment theories postulate that rhythm-based predictions are mediated by synchronizing low-frequency neural oscillations to the rhythm, as indicated by increased phase concentration (PC) of low-frequency neural activity for rhythmic compared to random streams. However, we show here that PC enhancement in scalp recordings is not specific to rhythms but is observed to the same extent in less periodic streams if they enable memory-based prediction. This is inconsistent with the predictions of a computational entrainment model of stronger PC for rhythmic streams. Anticipatory change in alpha activity and facilitation of electroencephalogram (EEG) manifestations of response selection are also comparable between rhythm- and memory-based predictions. However, rhythmic sequences uniquely result in obligatory depression of preparation-related premotor brain activity when an on-beat event is omitted, even when it is strategically beneficial to maintain preparation, leading to larger behavioral costs for violation of prediction. Thus, while our findings undermine the validity of PC as a sign of rhythmic entrainment, they constitute the first electrophysiological dissociation, to our knowledge, between mechanisms of rhythmic predictions and of memory-based predictions: the former obligatorily lead to resonance-like preparation patterns (that are in line with entrainment), while the latter allow flexible resource allocation in time regardless of periodicity in the input. Taken together, they delineate the neural mechanisms of three distinct modes of preparation: continuous vigilance, interval-timing-based prediction and rhythm-based prediction.
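The phase concentration (PC) measure discussed above is commonly computed as the length of the mean unit phase vector across trials (also called inter-trial phase coherence). A minimal sketch of that computation, using made-up phase values rather than the study's data:

```python
import cmath
import math

def phase_concentration(phases):
    """Inter-trial phase concentration: length of the mean unit vector
    of per-trial phases (1 = perfectly aligned, near 0 = random)."""
    mean_vec = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vec)

# Tightly clustered phases (strong concentration) vs. uniformly
# spread phases (no concentration) -- illustrative values only
aligned = phase_concentration([0.1, 0.12, 0.09, 0.11])
random_like = phase_concentration([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
```

The study's point is that a high PC value by itself cannot distinguish oscillatory entrainment from memory-based temporal prediction, since both align phases to expected event times.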
Income inequality and income segregation.
Reardon, Sean F; Bischoff, Kendra
2011-01-01
This article investigates how the growth in income inequality from 1970 to 2000 affected patterns of income segregation along three dimensions: the spatial segregation of poverty and affluence, race-specific patterns of income segregation, and the geographic scale of income segregation. The evidence reveals a robust relationship between income inequality and income segregation, an effect that is larger for black families than for white families. In addition, income inequality affects income segregation primarily through its effect on the large-scale spatial segregation of affluence rather than by affecting the spatial segregation of poverty or by altering small-scale patterns of income segregation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mason, John A.; Burke, Kevin J.; Towner, Antony C.N.
This paper describes the development, testing and validation of a shielded waste segregation and clearance monitor designed for the measurement of low-density low-level waste (LLW). The monitor is made of a measurement chamber surrounded by detectors and a shielded outer frame. The shielded chamber consists of a steel frame, which contains typically 1.5 inches (3.81 cm) of lead and 0.5 inches (1.27 cm) of steel shielding. Inside the shielding are plastic scintillator panels, which serve as gross gamma ray detectors. The detector panels, with embedded photomultipliers, completely surround the internal measurement chamber on all 6 sides. Care has been taken to distribute the plastic scintillator detectors in order to optimise both the efficiency for gamma ray detection and at the same time achieve a volumetric sensitivity, which is as uniform as possible. A common high voltage power supply provides the bias voltage for each of the six photomultipliers. The voltage signals arising from the detectors and photomultipliers are amplified by six sensitive amplifiers. Each amplifier incorporates a single channel analyser with both upper and lower thresholds and the digitised counts from each detector are recorded on six scalers. Operation of the device is by means of a microprocessor from which the scalers are controlled. An internal load cell linked to the microprocessor determines the weight of the waste object, and this information is used to calculate the specific activity of the waste. The monitor makes background measurements when the shielded door is closed and a sample, usually a bag of low-density waste, is not present in the measurement chamber. Measurements of the minimum detectable activity (MDA) of an earlier large volume prototype instrument are reported as part of the development of the Waste Segregation and Clearance Monitor (WSCM) described in the paper.
For the optimised WSCM a detection efficiency of greater than 32% was measured using a small Cs-137 source placed in the centre of the measurement chamber. Small sources have also been used to determine the spatial variation of the detection efficiency for various positions within the measurement chamber. The data have been used to establish sentencing limits and different 'fingerprints' for specific waste streams including waste streams containing fission products and others based on other radionuclides including Am-241. Some of the test data that are presented have been used to validate the instrument performance. The monitor is currently in routine use at a nuclear facility for the measurement and sentencing of low-density low activity radioactive waste. (authors)
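The chain from background-corrected count rate, detection efficiency, and load-cell weight to specific activity can be sketched as a simple calculation. All numbers below are hypothetical except the roughly 32% efficiency figure taken from the text, and the formula omits refinements (branching ratios, geometry corrections) a real clearance monitor would apply:

```python
def specific_activity_bq_per_g(gross_cps, background_cps, efficiency, mass_g):
    """Specific activity estimate for a bagged waste item: net count
    rate corrected for detection efficiency, per unit mass."""
    net_cps = gross_cps - background_cps
    if net_cps < 0:
        net_cps = 0.0  # statistical fluctuation below background
    return net_cps / efficiency / mass_g

# Hypothetical: 180 cps gross, 100 cps background measured with the
# door closed, 32% efficiency, 2.5 kg bag of low-density waste
activity = specific_activity_bq_per_g(180.0, 100.0, 0.32, 2500.0)
```

The resulting specific activity would then be compared against the sentencing limits and waste-stream fingerprints mentioned above.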
Robinson, Philip W; Pätynen, Jukka; Lokki, Tapio; Jang, Hyung Suk; Jeon, Jin Yong; Xiang, Ning
2013-06-01
In musical or theatrical performance, some venues allow listeners to individually localize and segregate individual performers, while others produce a well blended ensemble sound. The room acoustic conditions that make this possible, and the psycho-acoustic effects at work are not fully understood. This research utilizes auralizations from measured and simulated performance venues to investigate spatial discrimination of multiple acoustic sources in rooms. Signals were generated from measurements taken in a small theater, and listeners in the audience area were asked to distinguish pairs of speech sources on stage with various spatial separations. This experiment was repeated with the proscenium splay walls treated to be flat, diffusive, or absorptive. Similar experiments were conducted in a simulated hall, utilizing 11 early reflections with various characteristics, and measured late reverberation. The experiments reveal that discriminating the lateral arrangement of two sources is possible at narrower separation angles when reflections come from flat or absorptive rather than diffusive surfaces.
Distribution of Chironomidae in a semiarid intermittent river of Brazil.
Farias, R L; Carvalho, L K; Medeiros, E S F
2012-12-01
The effects of the intermittency of water flow on habitat structure and substrate composition have been reported to create a patch dynamics for the aquatic fauna, mostly for that associated with the substrate. This study aims to describe the spatial distribution of Chironomidae in an intermittent river of semiarid Brazil and to associate assemblage composition with environmental variables. Benthic invertebrates were sampled during the wet and dry seasons using a D-shaped net (40 cm wide and 250 μm mesh), and the Chironomidae were identified to genus level. The most abundant genera were Tanytarsus, Polypedilum, and Saetheria with important contributions of the genera Procladius, Aedokritus, and Dicrotendipes. Richness and density were not significantly different between the study sites, and multiple regression showed that the variation in richness and density explained by the environmental variables was significant only for substrate composition. The composition of genera showed significant spatial segregation across the study sites. Canonical Correspondence Analysis showed significant correspondence between Chironomidae composition and the environmental variables, with submerged vegetation, elevation, and leaf litter being important predictors of the Chironomidae fauna. This study showed that Chironomidae presented important spatial variation along the river and that this variation was substantially explained by environmental variables associated with the habitat structure and river hierarchy. We suggest that the observed spatial segregation in the fauna results in the high diversity of this group of organisms in intermittent streams.
Frontal and parietal theta burst TMS impairs working memory for visual-spatial conjunctions
Morgan, Helen M.; Jackson, Margaret C.; van Koningsbruggen, Martijn G.; Shapiro, Kimron L.; Linden, David E.J.
2013-01-01
In tasks that selectively probe visual or spatial working memory (WM) frontal and posterior cortical areas show a segregation, with dorsal areas preferentially involved in spatial (e.g. location) WM and ventral areas in visual (e.g. object identity) WM. In a previous fMRI study [1], we showed that right parietal cortex (PC) was more active during WM for orientation, whereas left inferior frontal gyrus (IFG) was more active during colour WM. During WM for colour-orientation conjunctions, activity in these areas was intermediate to the level of activity for the single task preferred and non-preferred information. To examine whether these specialised areas play a critical role in coordinating visual and spatial WM to perform a conjunction task, we used theta burst transcranial magnetic stimulation (TMS) to induce a functional deficit. Compared to sham stimulation, TMS to right PC or left IFG selectively impaired WM for conjunctions but not single features. This is consistent with findings from visual search paradigms, in which frontal and parietal TMS selectively affects search for conjunctions compared to single features, and with combined TMS and functional imaging work suggesting that parietal and frontal regions are functionally coupled in tasks requiring integration of visual and spatial information. Our results thus elucidate mechanisms by which the brain coordinates spatially segregated processing streams and have implications beyond the field of working memory. PMID:22483548
Wang, Xue; Gaustad, Gabrielle; Babbitt, Callie W
2016-05-01
Development of lithium-ion battery recycling systems is a current focus of much research; however, significant research remains to optimize the process. One key area not studied is the utilization of mechanical pre-recycling steps to improve overall yield. This work proposes a pre-recycling process, including mechanical shredding and size-based sorting steps, with the goal of potential future scale-up to the industrial level. This pre-recycling process aims to achieve material segregation with a focus on the metallic portion and provide clear targets for subsequent recycling processes. The results show that contained metallic materials can be segregated into different size fractions at different levels. For example, for lithium cobalt oxide batteries, cobalt content has been improved from 35% by weight in the metallic portion before this pre-recycling process to 82% in the ultrafine (<0.5mm) fraction and to 68% in the fine (0.5-1mm) fraction, and been excluded in the larger pieces (>6mm). However, size fractions across multiple battery chemistries showed significant variability in material concentration. This finding indicates that sorting by cathode before pre-treatment could reduce the uncertainty of input materials and therefore improve the purity of output streams. Thus, battery labeling systems may be an important step towards implementation of any pre-recycling process. Copyright © 2015 Elsevier Ltd. All rights reserved.
Quantifying tolerance indicator values for common stream fish species of the United States
Meador, M.R.; Carlisle, D.M.
2007-01-01
The classification of fish species tolerance to environmental disturbance is often used as a means to assess ecosystem conditions. Its use, however, may be problematic because the approach to tolerance classification is based on subjective judgment. We analyzed fish and physicochemical data from 773 stream sites collected as part of the U.S. Geological Survey's National Water-Quality Assessment Program to calculate tolerance indicator values for 10 physicochemical variables using weighted averaging. Tolerance indicator values (TIVs) for ammonia, chloride, dissolved oxygen, nitrite plus nitrate, pH, phosphorus, specific conductance, sulfate, suspended sediment, and water temperature were calculated for 105 common fish species of the United States. Tolerance indicator values for specific conductance and sulfate were correlated (rho = 0.87), and thus, fish species may be co-tolerant to these water-quality variables. We integrated TIVs for each species into an overall tolerance classification for comparisons with judgment-based tolerance classifications. Principal components analysis indicated that the distinction between tolerant and intolerant classifications was determined largely by tolerance to suspended sediment, specific conductance, chloride, and total phosphorus. Factors such as water temperature, dissolved oxygen, and pH may not be as important in distinguishing between tolerant and intolerant classifications, but may help to segregate species classified as moderate. Empirically derived tolerance classifications were 58.8% in agreement with judgment-derived tolerance classifications. Canonical discriminant analysis revealed that few TIVs, primarily chloride, could discriminate among judgment-derived tolerance classifications of tolerant, moderate, and intolerant. To our knowledge, this is the first empirically based understanding of fish species tolerance for stream fishes in the United States.
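Weighted averaging, as used above to derive tolerance indicator values, is the abundance-weighted mean of an environmental variable over the sites where a species occurs. A minimal sketch with hypothetical site data (the function name and values are illustrative, not from the study):

```python
def tolerance_indicator_value(abundances, env_values):
    """Abundance-weighted average of an environmental variable across
    the sites where a species occurs: species found mostly at
    high-value sites get a high (more tolerant) indicator value."""
    total = sum(abundances)
    return sum(a * v for a, v in zip(abundances, env_values)) / total

# Hypothetical species collected at three sites, with site-level
# specific conductance (uS/cm) and the species' abundance at each:
tiv = tolerance_indicator_value([10, 30, 60], [150.0, 400.0, 900.0])
```

Repeating this for each of the ten physicochemical variables yields the per-species TIV profile that the study then integrates into an overall tolerance classification.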
Trophic ecomorphology of Siluriformes (Pisces, Osteichthyes) from a tropical stream.
Pagotto, J P A; Goulart, E; Oliveira, E F; Yamamura, C B
2011-05-01
The present study analysed the relationship between morphology and trophic structure of Siluriformes (Pisces, Osteichthyes) from the Caracu Stream (22° 45' S and 53° 15' W), a tributary of the Paraná River (Brazil). Sampling was carried out at three sites using electrofishing, and two species of Loricariidae and four of Heptapteridae were obtained. A cluster analysis revealed the presence of three trophic guilds (detritivores, insectivores and omnivores). Principal components analysis demonstrated the segregation of two ecomorphotypes: at one extreme there were the detritivores (Loricariidae) with morphological structures that are fundamental in allowing them to fix themselves to substrates characterised by rushing torrents, thus permitting them to graze on the detritus and organic materials encrusted on the substrate; at the other extreme of the gradient there were the insectivores and omnivores (Heptapteridae), with morphological characteristics that promote superior performance in the exploitation of structurally complex habitats with low current velocity, colonised by insects and plants. Canonical discriminant analysis revealed an ecomorphological divergence between insectivores, which have morphological structures that permit them to capture prey in small spaces among rocks, and omnivores, which have a more compressed body and tend to explore food items deposited in marginal backwater zones. Mantel tests showed that trophic structure was significantly related to the body shape of a species, independently of the phylogenetic history, indicating that, in this case, there was an ecomorphotype for each trophic guild. Therefore, the present study demonstrated that the Siluriformes of the Caracu Stream were ecomorphologically structured and that morphology can be applied as an additional tool in predicting the trophic structure of this group.
Improved Open-Microphone Speech Recognition
NASA Astrophysics Data System (ADS)
Abrash, Victor
2002-12-01
Many current and future NASA missions make extreme demands on mission personnel both in terms of work load and in performing under difficult environmental conditions. In situations where hands are impeded or needed for other tasks, eyes are busy attending to the environment, or tasks are sufficiently complex that ease of use of the interface becomes critical, spoken natural language dialog systems offer unique input and output modalities that can improve efficiency and safety. They also offer new capabilities that would not otherwise be available. For example, many NASA applications require astronauts to use computers in micro-gravity or while wearing space suits. Under these circumstances, command and control systems that allow users to issue commands or enter data in hands- and eyes-busy situations become critical. Speech recognition technology designed for current commercial applications limits the performance of the open-ended state-of-the-art dialog systems being developed at NASA. For example, today's recognition systems typically listen to user input only during short segments of the dialog, and user input outside of these short time windows is lost. Mistakes detecting the start and end times of user utterances can lead to mistakes in the recognition output, and the dialog system as a whole has no way to recover from this, or any other, recognition error. Systems also often require the user to signal when that user is going to speak, which is impractical in a hands-free environment, or only allow a system-initiated dialog requiring the user to speak immediately following a system prompt. In this project, SRI has developed software to enable speech recognition in a hands-free, open-microphone environment, eliminating the need for a push-to-talk button or other signaling mechanism. The software continuously captures a user's speech and makes it available to one or more recognizers.
By constantly monitoring and storing the audio stream, it provides the spoken dialog manager extra flexibility to recognize the signal with no audio gaps between recognition requests, as well as to rerecognize portions of the signal, or to rerecognize speech with different grammars, acoustic models, recognizers, start times, and so on. SRI expects that this new open-mic functionality will enable NASA to develop better error-correction mechanisms for spoken dialog systems, and may also enable new interaction strategies.
Spatial Release From Masking in 2-Year-Olds With Normal Hearing and With Bilateral Cochlear Implants
Hess, Christi L.; Misurelli, Sara M.; Litovsky, Ruth Y.
2018-01-01
This study evaluated spatial release from masking (SRM) in 2- to 3-year-old children who are deaf and were implanted with bilateral cochlear implants (BiCIs), and in age-matched normal-hearing (NH) toddlers. Here, we examined whether early activation of bilateral hearing has the potential to promote SRM that is similar to age-matched NH children. Listeners were 13 NH toddlers and 13 toddlers with BiCIs, ages 27 to 36 months. Speech reception thresholds (SRTs) were measured for target speech in front (0°) and for competitors that were either Colocated in front (0°) or Separated toward the right (+90°). SRM was computed as the difference between SRTs in the front versus in the asymmetrical condition. Results show that SRTs were higher in the BiCI than NH group in all conditions. Both groups had higher SRTs in the Colocated and Separated conditions compared with Quiet, indicating masking. SRM was significant only in the NH group. In the BiCI group, the group effect of SRM was not significant, likely limited by the small sample size; however, all but two children had SRM values within the NH range. This work shows that to some extent, the ability to use spatial cues for source segregation develops by age 2 to 3 in NH children and is attainable in most of the children in the BiCI group. There is potential for the paradigm used here to be used in clinical settings to evaluate outcomes of bilateral hearing in very young children. PMID:29761735
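As defined above, SRM is simply the difference between the speech reception thresholds in the colocated and separated conditions. A minimal sketch with illustrative SRT values (not the study's data):

```python
def spatial_release_from_masking(srt_colocated_db, srt_separated_db):
    """SRM in dB: positive values mean the listener needed a lower
    (better) threshold once target and competitors were spatially
    separated, i.e. spatial cues aided source segregation."""
    return srt_colocated_db - srt_separated_db

# Illustrative thresholds: -2 dB with competitors in front (colocated),
# -6.5 dB with competitors moved to +90 degrees (separated)
srm = spatial_release_from_masking(-2.0, -6.5)
```

An SRM near zero, as found for some children in the BiCI group, indicates that moving the competitors away from the target provided no measurable benefit.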
Brungart, Douglas S; Simpson, Brian D
2007-09-01
Similarity between the target and masking voices is known to have a strong influence on performance in monaural and binaural selective attention tasks, but little is known about the role it might play in dichotic listening tasks with a target signal and one masking voice in the one ear and a second independent masking voice in the opposite ear. This experiment examined performance in a dichotic listening task with a target talker in one ear and same-talker, same-sex, or different-sex maskers in both the target and the unattended ears. The results indicate that listeners were most susceptible to across-ear interference with a different-sex within-ear masker and least susceptible with a same-talker within-ear masker, suggesting that the amount of across-ear interference cannot be predicted from the difficulty of selectively attending to the within-ear masking voice. The results also show that the amount of across-ear interference consistently increases when the across-ear masking voice is more similar to the target speech than the within-ear masking voice is, but that no corresponding decline in across-ear interference occurs when the across-ear voice is less similar to the target than the within-ear voice. These results are consistent with an "integrated strategy" model of speech perception where the listener chooses a segregation strategy based on the characteristics of the masker present in the target ear and the amount of across-ear interference is determined by the extent to which this strategy can also effectively be used to suppress the masker in the unattended ear.
The emergency department prediction of disposition (EPOD) study.
Vaghasiya, Milan R; Murphy, Margaret; O'Flynn, Daniel; Shetty, Amith
2014-11-01
Emergency departments (ED) continue to evolve models of care and streaming as interventions to tackle the effects of access block and overcrowding. Tertiary ED may be able to design patient-flow based on predicted dispositions in the department. Segregating discharge-stream patients may help develop patient-flows within the department, which is less affected by availability of beds in a hospital. We aim to determine if triage nurses and ED doctors can predict disposition outcomes early in the patient journey and thus lead to successful streaming of patients in the ED. During this study, triage nurses and ED doctors anonymously predicted disposition outcomes for patients presenting to triage after their brief assessments. Patient disposition at the 24-h post ED presentation was considered as the actual outcome and compared against predicted outcomes. Triage nurses were able to predict actual discharges of 445 patients out of 490 patients with a positive predictive value (PPV) of 90.8% (95% CI 87.8-93.2%). ED registrars were able to predict actual discharges of 85 patients out of 93 patients with PPV of 91.4% (95% CI 83.3-95.9%). ED consultants were able to predict actual discharges of 111 patients out of 118 patients with PPV 94.1% (95% CI 87.7-97.4%). PPVs for admission among ED consultants, ED registrars and Triage nurses were 59.7%, 54.4% and 48.5% respectively. Triage nurses, ED consultants and ED registrars are able to predict a patient's discharge disposition at triage with high levels of confidence. Triage nurses, ED consultants, and ED registrars can predict patients who are likely to be admitted with equal ability. This data may be used to develop specific admission and discharge streams based on early decision-making in EDs by triage nurses, ED registrars or ED consultants. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.
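The PPV figures above follow directly from the reported counts (e.g. 445 correct discharge predictions out of 490). The sketch below computes PPV with a Wilson score 95% confidence interval, which lands close to, but is not guaranteed to match, the interval construction the study used:

```python
import math

def ppv_with_ci(tp, predicted_positive, z=1.96):
    """Positive predictive value (proportion of positive predictions
    that were correct) with a Wilson score 95% confidence interval."""
    n = predicted_positive
    p = tp / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return p, centre - half, centre + half

# Triage nurses: 445 correct discharge predictions out of 490 made
p, lo, hi = ppv_with_ci(445, 490)
```

With these counts the point estimate is 90.8%, matching the abstract, and the Wilson interval is within about 0.1 percentage point of the reported 87.8-93.2%.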
Vertically Integrated Models for Carbon Storage Modeling in Heterogeneous Domains
NASA Astrophysics Data System (ADS)
Bandilla, K.; Celia, M. A.
2017-12-01
Numerical modeling is an essential tool for studying the impacts of geologic carbon storage (GCS). Injection of carbon dioxide (CO2) into deep saline aquifers leads to multi-phase flow (injected CO2 and resident brine), which can be described by a set of three-dimensional governing equations, including mass-balance equation, volumetric flux equations (modified Darcy), and constitutive equations. This is the modeling approach on which commonly used reservoir simulators such as TOUGH2 are based. Due to the large density difference between CO2 and brine, GCS models can often be simplified by assuming buoyant segregation and integrating the three-dimensional governing equations in the vertical direction. The integration leads to a set of two-dimensional equations coupled with reconstruction operators for vertical profiles of saturation and pressure. Vertically-integrated approaches have been shown to give results of comparable quality as three-dimensional reservoir simulators when applied to realistic CO2 injection sites such as the upper sand wedge at the Sleipner site. However, vertically-integrated approaches usually rely on homogeneous properties over the thickness of a geologic layer. Here, we investigate the impact of general (vertical and horizontal) heterogeneity in intrinsic permeability, relative permeability functions, and capillary pressure functions. We consider formations involving complex fluvial deposition environments and compare the performance of vertically-integrated models to full three-dimensional models for a set of hypothetical test cases consisting of high permeability channels (streams) embedded in a low permeability background (floodplains). The domains are randomly generated assuming that stream channels can be represented by sinusoidal waves in the plan-view and by parabolas for the streams' cross-sections. Stream parameters such as width, thickness and wavelength are based on values found at the Ketzin site in Germany. 
Results from the vertically-integrated approach are compared to results using TOUGH2, both in terms of depth-averaged saturation and vertical saturation profiles.
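The randomly generated channel geometry described above (sinusoidal centerline in plan view, parabolic cross-section carved into a low-permeability background) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; all parameter values and permeabilities are assumptions for illustration.

```python
# Illustrative sketch of a fluvial test domain: a sinusoidal channel
# centerline in plan view with a parabolic cross-section, embedded in a
# low-permeability floodplain. Parameters are assumed, not from the paper.
import numpy as np

nx, ny, nz = 100, 50, 20          # grid cells
dx = dy = 10.0; dz = 1.0          # cell sizes, m
width, thickness, wavelength, amp = 60.0, 8.0, 400.0, 50.0

k = np.full((nx, ny, nz), 1e-15)  # floodplain permeability, m^2

x = (np.arange(nx) + 0.5) * dx
centerline = ny * dy / 2 + amp * np.sin(2 * np.pi * x / wavelength)

for i in range(nx):
    for j in range(ny):
        yc = (j + 0.5) * dy
        off = abs(yc - centerline[i])
        if off < width / 2:
            # Parabolic cross-section: deepest at the centerline.
            depth = thickness * (1 - (2 * off / width) ** 2)
            n_cells = int(depth / dz)
            k[i, j, :n_cells] = 1e-12   # channel sand permeability, m^2

print("channel fraction:", float((k > 1e-13).mean()))
```

A vertically-integrated model would collapse the vertical axis of such a permeability field, which is precisely where intra-layer heterogeneity like this parabolic channel becomes a potential source of error relative to a full 3D simulator.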
Male group size, female distribution and changes in sexual segregation by Roosevelt elk
Peterson, Leah M.
2017-01-01
Sexual segregation, or the differential use of space by males and females, is hypothesized to be a function of body size dimorphism. Sexual segregation can also manifest at small (social segregation) and large (habitat segregation) spatial scales for a variety of reasons. Furthermore, the connection between small- and large-scale sexual segregation has rarely been addressed. We studied a population of Roosevelt elk (Cervus elaphus roosevelti) across 21 years in north coastal California, USA, to assess small- and large-scale sexual segregation in winter. We hypothesized that male group size would associate with small-scale segregation and that a change in female distribution would associate with large-scale segregation. Variation in forage biomass might also be coupled to small- and large-scale sexual segregation. Our findings were consistent with male group size associating with small-scale segregation and a change in female distribution associating with large-scale segregation. Females appeared to avoid large groups composed of socially dominant males. Males appeared to occupy a habitat vacated by females because of a wider forage niche, greater tolerance of lethal risks, and, perhaps, to reduce encounters with other elk. Sexual segregation at both spatial scales was a poor predictor of forage biomass. Size dimorphism was coupled to change in sexual segregation at small and large spatial scales. Small-scale segregation can seemingly manifest when all forage habitat is occupied by females, and large-scale segregation might happen when some forage habitat is not occupied by females. PMID:29121076
Greven, Inez M; Ramsey, Richard
2017-02-01
The majority of human neuroscience research has focussed on understanding functional organisation within segregated patches of cortex. The ventral visual stream has been associated with the detection of physical features such as faces and body parts, whereas the theory-of-mind network has been associated with making inferences about mental states and underlying character, such as whether someone is friendly, selfish, or generous. To date, however, it is largely unknown how such distinct processing components integrate neural signals. Using functional magnetic resonance imaging and connectivity analyses, we investigated the contribution of functional integration to social perception. During scanning, participants observed bodies that had previously been associated with trait-based or neutral information. Additionally, we independently localised the body perception and theory-of-mind networks. We demonstrate that when observing someone who cues the recall of stored social knowledge compared to non-social knowledge, a node in the ventral visual stream (extrastriate body area) shows greater coupling with part of the theory-of-mind network (temporal pole). These results show that functional connections provide an interface between perceptual and inferential processing components, thus providing neurobiological evidence that supports the view that understanding the visual environment involves interplay between conceptual knowledge and perceptual processing. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Anatomy of hierarchy: Feedforward and feedback pathways in macaque visual cortex
Markov, Nikola T; Vezoli, Julien; Chameau, Pascal; Falchier, Arnaud; Quilodran, René; Huissoud, Cyril; Lamy, Camille; Misery, Pierre; Giroud, Pascale; Ullman, Shimon; Barone, Pascal; Dehay, Colette; Knoblauch, Kenneth; Kennedy, Henry
2013-01-01
The laminar location of the cell bodies and terminals of interareal connections determines the hierarchical structural organization of the cortex and has been intensively studied. However, we still have only a rudimentary understanding of the connectional principles of feedforward (FF) and feedback (FB) pathways. Quantitative analysis of retrograde tracers was used to extend the notion that the laminar distribution of neurons interconnecting visual areas provides an index of hierarchical distance (percentage of supragranular labeled neurons [SLN]). We show that: 1) SLN values constrain models of cortical hierarchy, revealing previously unsuspected areal relations; 2) SLN reflects the operation of a combinatorial distance rule acting differentially on sets of connections between areas; 3) Supragranular layers contain highly segregated bottom-up and top-down streams, both of which exhibit point-to-point connectivity. This contrasts with the infragranular layers, which contain diffuse bottom-up and top-down streams; 4) Cell filling of the parent neurons of FF and FB pathways provides further evidence of compartmentalization; 5) FF pathways have higher weights, cross fewer hierarchical levels, and are less numerous than FB pathways. Taken together, the present results suggest that cortical hierarchies are built from supra- and infragranular counterstreams. This compartmentalized dual counterstream organization allows point-to-point connectivity in both bottom-up and top-down directions. PMID:23983048
Renier, Laurent A.; Anurova, Irina; De Volder, Anne G.; Carlson, Synnöve; VanMeter, John; Rauschecker, Josef P.
2012-01-01
The segregation between cortical pathways for the identification and localization of objects is thought of as a general organizational principle in the brain. Yet, little is known about the unimodal versus multimodal nature of these processing streams. The main purpose of the present study was to test whether the auditory and tactile dual pathways converged into specialized multisensory brain areas. We used functional magnetic resonance imaging (fMRI) to compare directly in the same subjects the brain activation related to localization and identification of comparable auditory and vibrotactile stimuli. Results indicate that the right inferior frontal gyrus (IFG) and both left and right insula were more activated during identification conditions than during localization in both touch and audition. The reverse dissociation was found for the left and right inferior parietal lobules (IPL), the left superior parietal lobule (SPL) and the right precuneus-SPL, which were all more activated during localization conditions in the two modalities. We propose that specialized areas in the right IFG and the left and right insula are multisensory operators for the processing of stimulus identity whereas parts of the left and right IPL and SPL are specialized for the processing of spatial attributes independently of sensory modality. PMID:19726653
Auditory Scene Analysis: An Attention Perspective
2017-01-01
Purpose This review article provides a new perspective on the role of attention in auditory scene analysis. Method A framework for understanding how attention interacts with stimulus-driven processes to facilitate task goals is presented. Previously reported data obtained through behavioral and electrophysiological measures in adults with normal hearing are summarized to demonstrate attention effects on auditory perception—from passive processes that organize unattended input to attention effects that act at different levels of the system. Data will show that attention can sharpen stream organization toward behavioral goals, identify auditory events obscured by noise, and limit passive processing capacity. Conclusions A model of attention is provided that illustrates how the auditory system performs multilevel analyses that involve interactions between stimulus-driven input and top-down processes. Overall, these studies show that (a) stream segregation occurs automatically and sets the basis for auditory event formation; (b) attention interacts with automatic processing to facilitate task goals; and (c) information about unattended sounds is not lost when selecting one organization over another. Our results support a neural model that allows multiple sound organizations to be held in memory and accessed simultaneously through a balance of automatic and task-specific processes, allowing flexibility for navigating noisy environments with competing sound sources. Presentation Video http://cred.pubs.asha.org/article.aspx?articleid=2601618 PMID:29049599
FGD Additives to Segregate and Sequester Mercury in Solid Byproducts - Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Searcy, K; Blythe, G M; Steen, W A
2012-02-28
Many mercury control strategies for U.S. coal-fired power generating plants involve co-benefit capture of oxidized mercury from flue gases treated by wet flue gas desulfurization (FGD) systems. For these processes to be effective at overall mercury control, the captured mercury must not be re-emitted to the atmosphere or into surface or ground water. The project sought to identify scrubber additives and FGD operating conditions under which mercury re-emissions would decrease and mercury would remain in the liquor and be blown down from the system in the chloride purge stream. After exiting the FGD system, mercury would react with precipitating agents to form stable solid byproducts and would be removed in a dewatering step. The FGD gypsum solids, free of most of the mercury, could then be disposed of or processed for reuse as wallboard or in other beneficial reuses. The project comprised extensive bench-scale FGD scrubber tests in Phases I and II. During Phase II, the approaches developed at the bench scale were tested at the pilot scale. Laboratory wastewater treatment tests measured the performance of precipitating agents in removing mercury from the chloride purge stream. Finally, the economic viability of the approaches tested was evaluated.
Bennett, Ralph G.; Christian, Jerry D.; Kirkham, Robert J.; Tranter, Troy J.
1998-01-01
An improved method for producing ⁹⁹ᵐTc compositions. ¹⁰⁰Mo metal is irradiated with photons in a particle (electron) accelerator to produce ⁹⁹Mo metal, which is dissolved in a solvent. A solvated ⁹⁹Mo product is then dried to generate a supply of ⁹⁹MoO₃ crystals. The crystals are thereafter heated at a temperature which will sublimate the crystals and form a gaseous mixture containing vaporized ⁹⁹ᵐTcO₃ and vaporized ⁹⁹ᵐTcO₂ but will not cause the production of vaporized ⁹⁹MoO₃. The mixture is then combined with an oxidizing gas to generate a gaseous stream containing vaporized ⁹⁹ᵐTc₂O₇. Next, the gaseous stream is cooled to a temperature sufficient to convert the vaporized ⁹⁹ᵐTc₂O₇ into a condensed ⁹⁹ᵐTc-containing product. The product has high purity levels resulting from the use of reduced temperature conditions and ultrafine crystalline ⁹⁹MoO₃ starting materials with segregated ⁹⁹ᵐTc compositions therein, which avoid the production of vaporized ⁹⁹MoO₃ contaminants.
Reardon, Sean F.; Farrell, Chad R.; Matthews, Stephen A.; O'Sullivan, David; Bischoff, Kendra; Firebaugh, Glenn
2014-01-01
We use newly developed methods of measuring spatial segregation across a range of spatial scales to assess changes in racial residential segregation patterns in the 100 largest U.S. metropolitan areas from 1990 to 2000. Our results point to three notable trends in segregation from 1990 to 2000: 1) Hispanic-white and Asian-white segregation levels increased at both micro- and macro-scales; 2) black-white segregation declined at a micro-scale, but was unchanged at a macro-scale; and 3) for all three racial groups and for almost all metropolitan areas, macro-scale segregation accounted for more of the total metropolitan area segregation in 2000 than in 1990. Our examination of the variation in these trends among the metropolitan areas suggests that Hispanic-white and Asian-white segregation changes have been driven largely by increases in macro-scale segregation resulting from the rapid growth of the Hispanic and Asian populations in central cities. The changes in black-white segregation, in contrast, appear to be driven by the continuation of a 30-year trend in declining micro-segregation, coupled with persistent and largely stable patterns of macro-segregation. PMID:19569292
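The micro- vs macro-scale distinction above can be illustrated with a toy computation: a standard dissimilarity index evaluated on a synthetic two-group count grid before and after aggregating cells into larger blocks. The index and the block aggregation are generic textbook choices, not the spatially explicit, scale-profile measures the authors developed.

```python
# Toy multi-scale segregation measurement: a dissimilarity index computed on
# fine cells ("micro") and on aggregated blocks ("macro"). Synthetic data;
# not the authors' spatial segregation measures.
import numpy as np

def dissimilarity(a, b):
    """Index of dissimilarity between two groups' spatial count arrays."""
    return 0.5 * np.abs(a / a.sum() - b / b.sum()).sum()

def coarsen(counts, f):
    """Aggregate an (n, n) count grid into (n//f, n//f) blocks."""
    n = counts.shape[0]
    return counts.reshape(n // f, f, n // f, f).sum(axis=(1, 3))

rng = np.random.default_rng(1)
n = 64
# Synthetic city: group A declines west-to-east, group B the reverse,
# with extra fine-grained cell-to-cell clustering noise.
grad = np.linspace(0.8, 0.2, n)[None, :].repeat(n, axis=0)
noise = rng.random((n, n)) < 0.5
pa = np.clip(grad + 0.3 * noise, 0, 1)
a = rng.poisson(100 * pa)
b = rng.poisson(100 * (1 - pa))

for scale in (1, 8):   # micro (single cells) vs macro (8x8 blocks)
    d = dissimilarity(coarsen(a, scale), coarsen(b, scale))
    print(f"scale {scale}: D = {d:.3f}")
```

Comparing the index across aggregation scales separates fine-grained (cell-to-cell) unevenness from broad (district-level) unevenness, which is the kind of decomposition the trends described above rely on.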
Kabdebon, C; Pena, M; Buiatti, M; Dehaene-Lambertz, G
2015-09-01
Using electroencephalography, we examined 8-month-old infants' ability to discover a systematic dependency between the first and third syllables of successive words, concatenated into a monotonous speech stream, and to subsequently generalize this regularity to new items presented in isolation. Full-term and preterm infants, while exposed to the stream, displayed a significant entrainment (phase-locking) to the syllabic and word frequencies, demonstrating that they were sensitive to the word unit. The acquisition of the systematic dependency defining words was confirmed by the significantly different neural responses to rule-words and part-words subsequently presented during the test phase. Finally, we observed a correlation between syllabic entrainment during learning and the difference in phase coherence between the test conditions (rule-words vs part-words) suggesting that temporal processing of the syllable unit might be crucial in linguistic learning. No group difference was observed suggesting that non-adjacent statistical computations are already robust at 8 months, even in preterm infants, and thus develop during the first year of life, earlier than expected from behavioral studies. Copyright © 2015 Elsevier Inc. All rights reserved.
Designing interaction, voice, and inclusion in AAC research.
Pullin, Graham; Treviranus, Jutta; Patel, Rupal; Higginbotham, Jeff
2017-09-01
The ISAAC 2016 Research Symposium included a Design Stream that examined timely issues across augmentative and alternative communication (AAC), framed in terms of designing interaction, designing voice, and designing inclusion. Each is a complex term with multiple meanings; together they represent challenging yet important frontiers of AAC research. The Design Stream was conceived by the four authors, researchers who have been exploring AAC and disability-related design throughout their careers, brought together by a shared conviction that designing for communication implies more than ensuring access to words and utterances. Each of these presenters came to AAC from a different background: interaction design, inclusive design, speech science, and social science. The resulting discussion among 24 symposium participants included controversies about the role of technology, tensions about independence and interdependence, and a provocation about taste. The paper concludes by proposing new directions for AAC research: (a) new interdisciplinary research could combine scientific and design research methods, as distant yet complementary as microanalysis and interaction design, (b) new research tools could seed accessible and engaging contextual research into voice within a social model of disability, and (c) new open research networks could support inclusive, international and interdisciplinary research.
A Dual-Stream Neuroanatomy of Singing
Loui, Psyche
2015-01-01
Singing requires effortless and efficient use of auditory and motor systems that center around the perception and production of the human voice. Although perception and production are usually tightly coupled functions, occasional mismatches between the two systems inform us of dissociable pathways in the brain systems that enable singing. Here I review the literature on perception and production in the auditory modality, and propose a dual-stream neuroanatomical model that subserves singing. I will discuss studies surrounding the neural functions of feedforward, feedback, and efference systems that control vocal monitoring, as well as the white matter pathways that connect frontal and temporal regions that are involved in perception and production. I will also consider disruptions of the perception-production network that are evident in tone-deaf individuals and poor pitch singers. Finally, by comparing expert singers against other musicians and nonmusicians, I will evaluate the possibility that singing training might offer rehabilitation from these disruptions through neuroplasticity of the perception-production network. Taken together, the best available evidence supports a model of dorsal and ventral pathways in auditory-motor integration that enables singing and is shared with language, music, speech, and human interactions in the auditory environment. PMID:26120242
Binding and unbinding the auditory and visual streams in the McGurk effect.
Nahorna, Olha; Berthommier, Frédéric; Schwartz, Jean-Luc
2012-08-01
Subjects presented with coherent auditory and visual streams generally fuse them into a single percept. This results in enhanced intelligibility in noise, or in visual modification of the auditory percept in the McGurk effect. It is classically considered that processing is done independently in the auditory and visual systems before interaction occurs at a certain representational stage, resulting in an integrated percept. However, some behavioral and neurophysiological data suggest the existence of a two-stage process. A first stage would involve binding together the appropriate pieces of audio and video information before fusion per se in a second stage. Then it should be possible to design experiments leading to unbinding. It is shown here that if a given McGurk stimulus is preceded by an incoherent audiovisual context, the amount of McGurk effect is largely reduced. Various kinds of incoherent contexts (acoustic syllables dubbed on video sentences or phonetic or temporal modifications of the acoustic content of a regular sequence of audiovisual syllables) can significantly reduce the McGurk effect even when they are short (less than 4 s). The data are interpreted in the framework of a two-stage "binding and fusion" model for audiovisual speech perception.
Field Evaluation of Temperature Differential in HMA Mixtures
DOT National Transportation Integrated Search
2012-05-15
Segregation is a common occurrence in hot mix asphalt (HMA) construction. The two types of segregation encountered are gradation segregation and thermal segregation. This investigation report involves mainly thermal segregation, which occurs when...
Why middle-aged listeners have trouble hearing in everyday settings.
Ruggles, Dorea; Bharadwaj, Hari; Shinn-Cunningham, Barbara G
2012-08-07
Anecdotally, middle-aged listeners report difficulty conversing in social settings, even when they have normal audiometric thresholds [1-3]. Moreover, young adult listeners with "normal" hearing vary in their ability to selectively attend to speech amid similar streams of speech. Ignoring age, these individual differences correlate with physiological differences in temporal coding precision present in the auditory brainstem, suggesting that the fidelity of encoding of suprathreshold sound helps explain individual differences [4]. Here, we revisit the conundrum of whether early aging influences an individual's ability to communicate in everyday settings. Although absolute selective attention ability is not predicted by age, reverberant energy interferes more with selective attention as age increases. Breaking the brainstem response down into components corresponding to coding of stimulus fine structure and envelope, we find that age alters which brainstem component predicts performance. Specifically, middle-aged listeners appear to rely heavily on temporal fine structure, which is more disrupted by reverberant energy than temporal envelope structure is. In contrast, the fidelity of envelope cues predicts performance in younger adults. These results hint that temporal envelope cues influence spatial hearing in reverberant settings more than is commonly appreciated and help explain why middle-aged listeners have particular difficulty communicating in daily life. Copyright © 2012 Elsevier Ltd. All rights reserved.
Franco, Ana; Gaillard, Vinciane; Cleeremans, Axel; Destrebecqz, Arnaud
2015-12-01
Statistical learning can be used to extract the words from continuous speech. Gómez, Bion, and Mehler (Language and Cognitive Processes, 26, 212-223, 2011) proposed an online measure of statistical learning: They superimposed auditory clicks on a continuous artificial speech stream made up of a random succession of trisyllabic nonwords. Participants were instructed to detect these clicks, which could be located either within or between words. The results showed that, over the length of exposure, reaction times (RTs) increased more for within-word than for between-word clicks. This result has been accounted for by means of statistical learning of the between-word boundaries. However, even though statistical learning occurs without an intention to learn, it nevertheless requires attentional resources. Therefore, this process could be affected by a concurrent task such as click detection. In the present study, we evaluated the extent to which the click detection task indeed reflects successful statistical learning. Our results suggest that the emergence of RT differences between within- and between-word click detection is neither systematic nor related to the successful segmentation of the artificial language. Therefore, instead of being an online measure of learning, the click detection task seems to interfere with the extraction of statistical regularities.
Impaired Statistical Learning in Developmental Dyslexia
Thiessen, Erik D.; Holt, Lori L.
2015-01-01
Purpose Developmental dyslexia (DD) is commonly thought to arise from phonological impairments. However, an emerging perspective is that a more general procedural learning deficit, not specific to phonological processing, may underlie DD. The current study examined if individuals with DD are capable of extracting statistical regularities across sequences of passively experienced speech and nonspeech sounds. Such statistical learning is believed to be domain-general, to draw upon procedural learning systems, and to relate to language outcomes. Method DD and control groups were familiarized with a continuous stream of syllables or sine-wave tones, the ordering of which was defined by high or low transitional probabilities across adjacent stimulus pairs. Participants subsequently judged two 3-stimulus test items with either high or low statistical coherence as being the most similar to the sounds heard during familiarization. Results As with control participants, the DD group was sensitive to the transitional probability structure of the familiarization materials as evidenced by above-chance performance. However, the performance of participants with DD was significantly poorer than controls across linguistic and nonlinguistic stimuli. In addition, reading-related measures were significantly correlated with statistical learning performance of both speech and nonspeech material. Conclusion Results are discussed in light of procedural learning impairments among participants with DD. PMID:25860795
Atomic scale study of grain boundary segregation before carbide nucleation in Ni-Cr-Fe Alloys
NASA Astrophysics Data System (ADS)
Li, Hui; Xia, Shuang; Liu, Wenqing; Liu, Tingguang; Zhou, Bangxin
2013-08-01
Three-dimensional chemical information concerning grain boundary segregation before carbide nucleation was characterized by atom probe tomography in two Ni-Cr-Fe alloys aged at 500 °C for 0.5 h after a homogenizing treatment. Segregation of B, C and Si atoms at grain boundaries was observed in Alloy 690, and segregation of B, C, N and P atoms at grain boundaries was observed in 304 austenitic stainless steel. Co-segregation of C atoms with Cr atoms at grain boundaries was found in both Alloy 690 and 304 austenitic stainless steel, and its effect on carbide nucleation is discussed. The amount of each segregated element at grain boundaries in the two Ni-Cr-Fe alloys was analyzed quantitatively, and the grain boundary segregation features of the two alloys were compared based on the experimental results. Impurity and solute atoms segregate inhomogeneously within the same grain boundary in both 304 SS and Alloy 690. The grain boundary segregation tendencies (Sav) are B (11.8 ± 1.4) > P (5.4 ± 1.4) > N (4.7 ± 0.3) > C (3.7 ± 0.4) in 304 SS, and B (6.9 ± 0.9) > C (6.7 ± 0.4) > Si (1.5 ± 0.2) in Alloy 690. Cr atoms may co-segregate with C atoms at grain boundaries before carbide nucleation in both 304 SS and Alloy 690. Ni atoms generally deplete at grain boundaries in both alloys; although the literature shows that Ni atoms may co-segregate with P atoms at grain boundaries [28], P segregation did not lead to Ni segregation in the current study. Fe atoms may segregate or deplete at grain boundaries in Alloy 690, but generally deplete at grain boundaries in 304 SS. B atoms show the strongest grain boundary segregation tendency in both 304 SS and Alloy 690, and the segregation tendency and Gibbs free energy of B are higher in 304 SS than in Alloy 690. C atoms segregate readily at grain boundaries in both 304 SS and Alloy 690.
The grain boundary segregation tendency and Gibbs free energy of C are higher in Alloy 690 than in 304 SS, due to the higher bulk C concentration and site competition from P atoms segregating at grain boundaries [29,30]. This implies that the segregation tendency is influenced by the bulk concentration of the segregating species. Si atoms segregate slightly at grain boundaries in Alloy 690 but not in 304 SS. N and P atoms segregate at grain boundaries in 304 SS with similar segregation Gibbs free energies; N atoms may be exhausted by TiN precipitated in the matrix and therefore cannot be observed at grain boundaries in Alloy 690 [19]. Mn atoms deplete at grain boundaries in 304 SS, similar to proton-irradiation-induced segregation in 304 SS [32]. The segregation Gibbs energies of B, C, N and P are similar in 304 SS and Alloy 690. In summary, B and C atoms segregate at grain boundaries in both Alloy 690 and 304 SS; P and N segregate at grain boundaries in 304 SS; Si atoms segregate at grain boundaries in Alloy 690 but not in 304 SS; Cr enriches at grain boundaries in both alloys even though carbides have not nucleated; Ni and Fe may segregate, deplete or distribute homogeneously at grain boundaries in Alloy 690, but deplete at grain boundaries in 304 SS; and C and Cr atoms co-segregate at grain boundaries before carbide nucleation in both alloys. Combined with other results in the literature, the evolution of Cr concentration at grain boundaries should be enrichment before carbide nucleation, depletion after carbide precipitation, and healing after appreciable carbide growth. After aging at 500 °C for 0.5 h, the total reduction of grain boundary free energy due to segregation is 27.489 kJ/mol for Alloy 690 and 45.207 kJ/mol for 304 SS.
Li, Hui; Song, Hui; Liu, Wenqing; Xia, Shuang; Zhou, Bangxin; Su, Cheng; Ding, Wenyan
2015-12-01
The segregation of various elements at grain boundaries and precipitate/matrix interfaces was analyzed using atom probe tomography in an austenitic precipitation-strengthened stainless steel aged at 750 °C for different times. Segregation of P, B and C at all types of interfaces was observed in all specimens. However, Si segregated at all types of interfaces only in the specimen aged for 16 h. Enrichment of Ti at grain boundaries was evident in the specimen aged for 16 h, while Ti did not segregate at other interfaces. Mo behavior varied considerably among interface types, e.g. segregating at grain boundaries at all aging times but never segregating at γ'/γ phase interfaces. Cr co-segregated with C at grain boundaries, although carbides had not yet nucleated there. Beyond the variation in segregation tendency among interface types, the evolution of each element's segregation tendency with aging time was analyzed across all interface types. Based on the experimental results, the enrichment factors, Gibbs interface excesses and segregation free energies of the segregated elements were calculated and discussed. Copyright © 2015 Elsevier B.V. All rights reserved.
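Segregation free energies of the kind reported in the atom probe studies above are commonly extracted from measured interface excesses using the Langmuir-McLean isotherm. The standard form is given below as background; it is an assumption that these papers use this exact relation.

```latex
\frac{X_{\mathrm{gb}}}{1 - X_{\mathrm{gb}}}
  = \frac{X_{\mathrm{c}}}{1 - X_{\mathrm{c}}}
    \exp\!\left( -\frac{\Delta G_{\mathrm{seg}}}{RT} \right)
```

Here $X_{\mathrm{gb}}$ is the solute fraction at the grain boundary, $X_{\mathrm{c}}$ the bulk solute fraction, $\Delta G_{\mathrm{seg}}$ the segregation free energy (negative for elements that enrich the boundary), $R$ the gas constant, and $T$ the aging temperature.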
NASA Astrophysics Data System (ADS)
Kiyohara, Shin; Mizoguchi, Teruyasu
2018-03-01
Grain boundary segregation of dopants plays a crucial role in materials properties. To investigate dopant segregation behavior at grain boundaries, an enormous number of configurations has to be considered when multiple dopants segregate at complex grain boundary structures. Here, two data mining techniques, random-forests regression and a genetic algorithm, were applied to determine stable segregation sites at grain boundaries efficiently. Using the random-forests method, a predictive model was constructed from 2% of the segregation configurations, and it has been shown that this model can determine the stable segregation configurations. Furthermore, the genetic algorithm also successfully determined the most stable segregation configuration with great efficiency. We demonstrate that these approaches are quite effective for investigating dopant segregation behavior at grain boundaries.
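A minimal sketch of the random-forests half of this approach: configurations are encoded as site-occupation vectors, a regressor is trained on a small sample, and all configurations are then ranked by predicted energy. The toy energy model, sample sizes, and scikit-learn usage are illustrative assumptions, not the paper's setup.

```python
# Hypothetical sketch: rank dopant-segregation configurations at a grain
# boundary with a random forest trained on a small sample. The energy model
# below is synthetic; site count, dopant count, and sizes are assumptions.
import itertools
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_sites, n_dopants = 12, 3

# Synthetic "segregation energy": per-site terms plus pair interactions.
site_e = rng.normal(size=n_sites)
pair_e = rng.normal(scale=0.3, size=(n_sites, n_sites))

def energy(config):
    """Total energy of a tuple of occupied site indices (toy model)."""
    e = sum(site_e[i] for i in config)
    e += sum(pair_e[i, j] for i, j in itertools.combinations(config, 2))
    return e

# Enumerate all ways to place the dopants and encode as occupation vectors.
configs = list(itertools.combinations(range(n_sites), n_dopants))
X = np.zeros((len(configs), n_sites))
for row, c in zip(X, configs):
    row[list(c)] = 1.0
y = np.array([energy(c) for c in configs])

# Train on a small random subset (the paper reports ~2% sufficed; 10% here).
train = rng.choice(len(configs), size=len(configs) // 10, replace=False)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[train], y[train])

# Rank every configuration by predicted energy.
pred = model.predict(X)
best = configs[int(np.argmin(pred))]
print("predicted most stable configuration:", best)
```

In the paper's second approach, a genetic algorithm searches the same configuration space directly instead of enumerating and ranking it, which matters when exhaustive enumeration is infeasible.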
Residential Segregation and Racial Cancer Disparities: A Systematic Review.
Landrine, Hope; Corral, Irma; Lee, Joseph G L; Efird, Jimmy T; Hall, Marla B; Bess, Jukelia J
2017-12-01
This paper provides the first review of empirical studies of segregation and black-white cancer disparities. We searched all years of PubMed (through May 2016) using these terms: racial segregation, residential segregation, neighborhood racial composition (first terms) and (second terms) cancer incidence, mortality, survival, stage at diagnosis, screening. The 17 (of 668) articles that measured both segregation and a cancer outcome were retained. Segregation contributed significantly to cancer and to racial cancer disparities in 70% of analyses, even after controlling for socioeconomic status and health insurance. Residing in segregated African-American areas was associated with higher odds of later-stage diagnosis of breast and lung cancers, higher mortality rates and lower survival rates from breast and lung cancers, and higher cumulative cancer risks associated with exposure to ambient air toxics. There were no studies of many types of cancer (e.g., cervical). Studies differed in their measure of segregation, and 40% used an invalid measure. Possible mediators of the segregation effect usually were not tested. Empirical analysis of segregation and racial cancer disparities is a recent area of research. The literature is limited to 17 studies that focused primarily on breast cancer. Studies differed in their measure of segregation, yet segregation nonetheless contributed to cancer and to racial cancer disparities in 70% of analyses. This suggests the need for further research that uses valid measures of segregation, examines a variety of types of cancers, and explores the variables that may mediate the segregation effect.
49 CFR 176.80 - Applicability.
Code of Federal Regulations, 2010 CFR
2010-10-01
... Segregation Requirements § 176.80 Applicability. (a) This subpart sets forth segregation requirements in addition to any segregation requirements set forth elsewhere in this subchapter. (b) Hazardous materials in... segregation requirements of this subpart and any additional segregation specified in this subchapter for...
49 CFR 176.80 - Applicability.
Code of Federal Regulations, 2011 CFR
2011-10-01
... Segregation Requirements § 176.80 Applicability. (a) This subpart sets forth segregation requirements in addition to any segregation requirements set forth elsewhere in this subchapter. (b) Hazardous materials in... segregation requirements of this subpart and any additional segregation specified in this subchapter for...
ERIC Educational Resources Information Center
Laosa, Luis M.
2001-01-01
This issue reviews national demographic trends in school segregation, summarizing research findings. Though the national debate on school segregation emphasizes blacks and whites, present-day school segregation includes segregation by socioeconomic level, ethnicity, and native language. The research study examined features of the ecology of…
Ruggles, Dorea; Shinn-Cunningham, Barbara
2011-06-01
Listeners can selectively attend to a desired target by directing attention to known target source features, such as location or pitch. Reverberation, however, reduces the reliability of the cues that allow a target source to be segregated and selected from a sound mixture. Given this, it is likely that reverberant energy interferes with selective auditory attention. Anecdotal reports suggest that the ability to focus spatial auditory attention degrades even with early aging, yet there is little evidence that middle-aged listeners have behavioral deficits on tasks requiring selective auditory attention. The current study was designed to look for individual differences in selective attention ability and to see if any such differences correlate with age. Normal-hearing adults, ranging in age from 18 to 55 years, were asked to report a stream of digits located directly ahead in a simulated rectangular room. Simultaneous, competing masker digit streams were simulated at locations 15° left and right of center. The level of reverberation was varied to alter task difficulty by interfering with localization cues (increasing localization blur). Overall, performance was best in the anechoic condition and worst in the high-reverberation condition. Listeners nearly always reported a digit from one of the three competing streams, showing that reverberation did not render the digits unintelligible. Importantly, inter-subject differences were extremely large. These differences, however, were not significantly correlated with age, memory span, or hearing status. These results show that listeners with audiometrically normal pure tone thresholds differ in their ability to selectively attend to a desired source, a task important in everyday communication. Further work is necessary to determine if these differences arise from differences in peripheral auditory function or in more central function.
DEEP IMAGING OF M51: A NEW VIEW OF THE WHIRLPOOL’S EXTENDED TIDAL DEBRIS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Watkins, Aaron E.; Mihos, J. Christopher; Harding, Paul
We present deep, wide-field imaging of the M51 system using CWRU’s Burrell Schmidt Telescope at KPNO to study the faint tidal features that constrain its interaction history. Our images trace M51's tidal morphology down to a limiting surface brightness of μ_B,lim ∼ 30 mag arcsec⁻² and provide accurate colors (σ_(B−V) < 0.1) down to μ_B ∼ 28. We identify two new tidal streams in the system (the south and northeast plumes) with surface brightnesses of μ_B = 29 and luminosities of ∼10⁶ L_⊙,B. While the northeast plume may be a faint outer extension of the tidal “crown” north of NGC 5195 (M51b), the south plume has no analog in any existing M51 simulation and may represent a distinct tidal stream or disrupted dwarf galaxy. We also trace the extremely diffuse northwest plume out to a total extent of 20′ (43 kpc) from NGC 5194 (M51a) and show it to be physically distinct from the overlapping bright tidal streams from M51b. The northwest plume’s morphology and red color (B−V = 0.8) instead argue that it originated from tidal stripping of M51a’s extreme outer disk. Finally, we confirm the strong segregation of gas and stars in the southeast tail and do not detect any diffuse stellar component in the H i portion of the tail. Extant simulations of M51 have difficulty matching both the wealth of tidal structure in the system and the lack of stars in the H i tail, motivating new modeling campaigns to study the dynamical evolution of this classic interacting system.
Diel fluctuations in natural organic matter quality in an oligotrophic cave system
NASA Astrophysics Data System (ADS)
Brown, T.; Engel, A. S.; Pfiffner, S. M.
2016-12-01
Transformations of natural organic matter (NOM) and effects of photochemical degradation on dissolved organic matter (DOM) quality in recharge can be readily studied in cave systems with hydrologic connections between the surface and subsurface. Specifically, diel controls on photodegradation, fresh NOM production, and microbial C cycling were examined from recharge to resurgence of an oligotrophic cave stream in Kentucky. We used NOM isolation and spectroscopic analysis to concentrate and characterize DOM, and lipid profiling to evaluate microbial community structure. A hydrophilic fraction of DOM was isolated from bulk waters in the field using diethylaminoethyl (DEAE) weak anion exchange column chromatography, and isolates were characterized with FTIR spectroscopy to identify differences in macromolecular structure between surface and subsurface (downstream) DOM. Lipids from colloidal NOM (retained on 0.2 µm filter) and stream sediments were extracted using a modified Bligh Dyer method, segregated into classes, and converted to fatty acid methyl esters (FAME) for quantification and identification by GC-MS. During a late summer, low flow, 24-hour sampling event, the quality of surface water DOM recharged at night was 40% richer in aliphatic esters, 30% richer in phenols and alkanes, and elevated in polysaccharides compared with DOM recharged during daylight. IR absorptivity in nocturnal DOM isolates was an order of magnitude lower in the cave stream, with recalcitrant DOM interpreted from bands of aliphatic esters, alkanes, and organo-silicates. Phospholipid fatty acid (PLFA) profiles indicated that the abundance of polyunsaturated PLFA associated with algae, fungi, and higher plants decreased along the flowpath. Cave microbes exhibited elevated trans:cis ratios relative to surface communities, and the ratio increased at night. This suggested that downstream microbial communities existed in a state of reduced activity without inputs of photosynthates at night.
Neuromechanistic Model of Auditory Bistability
Rankin, James; Sussman, Elyse; Rinzel, John
2015-01-01
Sequences of higher frequency A and lower frequency B tones repeating in an ABA- triplet pattern are widely used to study auditory streaming. One may experience either an integrated percept, a single ABA-ABA- stream, or a segregated percept, separate but simultaneous streams A-A-A-A- and -B---B--. During minutes-long presentations, subjects may report irregular alternations between these interpretations. We combine neuromechanistic modeling and psychoacoustic experiments to study these persistent alternations and to characterize the effects of manipulating stimulus parameters. Unlike many phenomenological models with abstract, percept-specific competition and fixed inputs, our network model comprises neuronal units with sensory feature dependent inputs that mimic the pulsatile-like A1 responses to tones in the ABA- triplets. It embodies a neuronal computation for percept competition thought to occur beyond primary auditory cortex (A1). Mutual inhibition, adaptation and noise are implemented. We include slow NMDA recurrent excitation for local temporal memory that enables linkage across sound gaps from one triplet to the next. Percepts in our model are identified in the firing patterns of the neuronal units. We predict with the model that manipulations of the frequency difference between tones A and B should affect the dominance durations of the stronger percept, the one that is dominant for a larger fraction of the time, more than those of the weaker percept—a property that has been previously established and generalized across several visual bistable paradigms. We confirm the qualitative prediction with our psychoacoustic experiments and use the behavioral data to further constrain and improve the model, achieving quantitative agreement between experimental and modeling results. Our work and model provide a platform that can be extended to consider other stimulus conditions, including the effects of context and volition. PMID:26562507
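The competition mechanism this abstract describes (mutual inhibition plus slow adaptation and noise) can be caricatured in a few lines. This is not the authors' network model; it is a deliberately simplified winner-take-all toy with made-up parameters, included only to show how adaptation of the dominant unit produces irregular percept alternations.

```python
import random

# Two percept units ("integrated" = 0, "segregated" = 1) compete
# winner-take-all; the dominant unit's slow adaptation builds up until
# the rival takes over, and a little noise makes durations irregular.
rng = random.Random(1)
adapt = [0.0, 0.2]              # slow adaptation level of each unit
dominant = []                   # winning percept at each time step
for step in range(2000):
    drive = [1.0 - a + rng.gauss(0, 0.01) for a in adapt]
    w = 0 if drive[0] > drive[1] else 1
    dominant.append(w)
    adapt[w] += 0.01 * (1.0 - adapt[w])    # winner adapts (fatigues)
    adapt[1 - w] -= 0.02 * adapt[1 - w]    # loser recovers
switches = sum(1 for a, b in zip(dominant, dominant[1:]) if a != b)
print("percept switches:", switches)
```

Because the winner's adaptation grows while the loser's recovers, the two levels repeatedly cross, so the simulated percept alternates indefinitely rather than settling, mirroring the bistability studied in the paper.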
Interdependent encoding of pitch, timbre and spatial location in auditory cortex
Bizley, Jennifer K.; Walker, Kerry M. M.; Silverman, Bernard W.; King, Andrew J.; Schnupp, Jan W. H.
2009-01-01
Because we can perceive the pitch, timbre and spatial location of a sound source independently, it seems natural to suppose that cortical processing of sounds might separate out spatial from non-spatial attributes. Indeed, recent studies support the existence of anatomically segregated ‘what’ and ‘where’ cortical processing streams. However, few attempts have been made to measure the responses of individual neurons in different cortical fields to sounds that vary simultaneously across spatial and non-spatial dimensions. We recorded responses to artificial vowels presented in virtual acoustic space to investigate the representations of pitch, timbre and sound source azimuth in both core and belt areas of ferret auditory cortex. A variance decomposition technique was used to quantify the way in which altering each parameter changed neural responses. Most units were sensitive to two or more of these stimulus attributes. Whilst indicating that neural encoding of pitch, location and timbre cues is distributed across auditory cortex, significant differences in average neuronal sensitivity were observed across cortical areas and depths, which could form the basis for the segregation of spatial and non-spatial cues at higher cortical levels. Some units exhibited significant non-linear interactions between particular combinations of pitch, timbre and azimuth. These interactions were most pronounced for pitch and timbre and were less commonly observed between spatial and non-spatial attributes. Such non-linearities were most prevalent in primary auditory cortex, although they tended to be small compared with stimulus main effects. PMID:19228960
ERIC Educational Resources Information Center
Gandara, Patricia
2010-01-01
Latinos are, after whites, the most segregated student group in the United States, and their segregation is closely tied to poor academic outcomes. Latinos experience a triple segregation: by race/ethnicity, poverty, and language. Racial segregation perpetuates negative stereotypes, reduces the likelihood of a strong teaching staff, and is often…
Oka, Masayoshi; Wong, David W. S.
2014-01-01
Two conceptual and methodological foundations of segregation studies are that (i) segregation involves more than one group, and (ii) segregation measures need to quantify how different population groups are distributed across space. Therefore, the percentage of population belonging to a group is not an appropriate measure of segregation because it does not describe how populations are spread across different areal units or neighborhoods. In principle, evenness and isolation are the two distinct dimensions of segregation that capture the spatial patterns of population groups. To portray people’s daily environment more accurately, segregation measures need to account for the spatial relationships between areal units and to reflect the situations at the neighborhood scale. For these reasons, the local spatial entropy-based diversity index (SHi) and the local spatial isolation index (Si), which capture the evenness and isolation dimensions of segregation, respectively, are preferable. However, these two local spatial segregation indexes have rarely been incorporated into health research; rather, ineffective and insufficient segregation measures have been used in previous studies. Hence, this paper empirically demonstrates how the two measures can reflect the two distinct dimensions of segregation at the neighborhood level, and conceptually argues for, and sets the stage for, their future use to effectively and meaningfully examine the relationships between residential segregation and health. PMID:25202687
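For concreteness, the non-spatial core of the entropy-based diversity score mentioned above can be written as follows. The published SHi additionally smooths counts over surrounding areal units; that spatial kernel step is omitted here, so this sketch is an aspatial simplification with illustrative inputs only.

```python
import math

def local_entropy_diversity(counts):
    """Aspatial entropy-based diversity score for one areal unit.

    counts: population counts of each group in the unit.
    Returns entropy normalized by ln(K): 0 = a single group present,
    1 = all K groups equally represented.
    """
    total = sum(counts)
    k = len(counts)
    entropy = -sum((c / total) * math.log(c / total)
                   for c in counts if c > 0)
    return entropy / math.log(k)

print(local_entropy_diversity([50, 50]))   # ≈ 1.0 (maximal diversity)
print(local_entropy_diversity([100, 0]))   # ≈ 0.0 (a single group)
```

Normalizing by ln(K) keeps scores comparable across units regardless of how many groups are tabulated.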
Residential segregation, dividing walls and mental health: a population-based record linkage study.
Maguire, Aideen; French, Declan; O'Reilly, Dermot
2016-09-01
Neighbourhood segregation has been described as a fundamental determinant of physical health, but literature on its effect on mental health is less clear. While most previous research has relied on conceptualised measures of segregation, Northern Ireland is unique as it contains physical manifestations of segregation in the form of segregation barriers (or 'peacelines') which can be used to accurately identify residential segregation. We used population-wide health record data on over 1.3 million individuals to analyse the effect of residential segregation, measured by both the formal Dissimilarity Index and by proximity to a segregation barrier, on the likelihood of poor mental health. Using multilevel logistic regression models, we found residential segregation measured by the Dissimilarity Index poses no additional risk to the likelihood of poor mental health after adjustment for area-level deprivation. However, residence in an area segregated by a 'peaceline' increases the likelihood of antidepressant medication use by 19% (OR=1.19, 95% CI 1.14 to 1.23) and anxiolytic medication use by 39% (OR=1.39, 95% CI 1.32 to 1.48), even after adjustment for gender, age, conurbation, deprivation and crime. Living in an area segregated by a 'peaceline' is detrimental to mental health, suggesting segregated areas characterised by a heightened sense of 'other' pose a greater risk to mental health. The difference in results based on segregation measure highlights the importance of choice of measure when studying segregation. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
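The Dissimilarity Index named in the abstract above has a standard closed form, D = ½ Σᵢ |aᵢ/A − bᵢ/B|, over per-area counts of two groups. A minimal sketch with made-up counts (not the Northern Ireland data):

```python
def dissimilarity_index(group_a, group_b):
    """Two-group Dissimilarity Index: D = 0.5 * sum_i |a_i/A - b_i/B|.

    group_a, group_b: counts of the two groups in each areal unit.
    D = 0 means both groups are identically distributed across units;
    D = 1 means the groups never share an areal unit.
    """
    total_a = sum(group_a)
    total_b = sum(group_b)
    return 0.5 * sum(abs(a / total_a - b / total_b)
                     for a, b in zip(group_a, group_b))

print(dissimilarity_index([100, 0], [0, 50]))    # 1.0: complete segregation
print(dissimilarity_index([50, 50], [25, 25]))   # 0.0: identical distributions
```

D is interpretable as the share of either group that would have to move to equalize the two distributions, which is why it is the most common evenness measure in this literature.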
The Role of Residential Segregation in Contemporary School Segregation
ERIC Educational Resources Information Center
Frankenberg, Erica
2013-01-01
Inaction to address housing segregation in metropolitan areas has resulted in persistently high levels of residential segregation. As the Supreme Court has recently limited school districts' voluntary integration efforts, this article considers the role of residential segregation in maintaining racially isolated schools, namely what is known about…
41 CFR 109-1.5106 - Segregation of personal property.
Code of Federal Regulations, 2011 CFR
2011-01-01
... 41 Public Contracts and Property Management 3 2011-01-01 2011-01-01 false Segregation of personal...-INTRODUCTION 1.51-Personal Property Management Standards and Practices § 109-1.5106 Segregation of personal...) The segregation of the property would materially hinder the progress of the work (i.e., segregation is...
49 CFR 176.146 - Segregation from non-hazardous materials.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 49 Transportation 2 2010-10-01 2010-10-01 false Segregation from non-hazardous materials. 176.146... VESSEL Detailed Requirements for Class 1 (Explosive) Materials Segregation § 176.146 Segregation from non... for “away from” segregation apply. (2) An explosive substance or article which has a secondary...
49 CFR 176.140 - Segregation from other classes of hazardous materials.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 49 Transportation 2 2010-10-01 2010-10-01 false Segregation from other classes of hazardous... CARRIAGE BY VESSEL Detailed Requirements for Class 1 (Explosive) Materials Segregation § 176.140 Segregation from other classes of hazardous materials. (a) Class 1 (explosive) materials must be segregated...
41 CFR 109-1.5106 - Segregation of personal property.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 41 Public Contracts and Property Management 3 2010-07-01 2010-07-01 false Segregation of personal...-INTRODUCTION 1.51-Personal Property Management Standards and Practices § 109-1.5106 Segregation of personal...) The segregation of the property would materially hinder the progress of the work (i.e., segregation is...
Income Segregation between Schools and School Districts
ERIC Educational Resources Information Center
Owens, Ann; Reardon, Sean F.; Jencks, Christopher
2016-01-01
Although trends in the racial segregation of schools are well documented, less is known about trends in income segregation. We use multiple data sources to document trends in income segregation between schools and school districts. Between-district income segregation of families with children enrolled in public school increased by over 15% from…
49 CFR 176.144 - Segregation of Class 1 (explosive) materials.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 2 2011-10-01 2011-10-01 false Segregation of Class 1 (explosive) materials. 176... VESSEL Detailed Requirements for Class 1 (Explosive) Materials Segregation § 176.144 Segregation of Class... any ferrous metal or aluminum alloy, unless separated by a partition. (e) Segregation on deck: When...
49 CFR 176.144 - Segregation of Class 1 (explosive) materials.
Code of Federal Regulations, 2010 CFR
2010-10-01
... 49 Transportation 2 2010-10-01 2010-10-01 false Segregation of Class 1 (explosive) materials. 176... VESSEL Detailed Requirements for Class 1 (Explosive) Materials Segregation § 176.144 Segregation of Class... any ferrous metal or aluminum alloy, unless separated by a partition. (e) Segregation on deck: When...
49 CFR 176.146 - Segregation from non-hazardous materials.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 2 2011-10-01 2011-10-01 false Segregation from non-hazardous materials. 176.146... VESSEL Detailed Requirements for Class 1 (Explosive) Materials Segregation § 176.146 Segregation from non... for “away from” segregation apply. (2) An explosive substance or article which has a secondary...
49 CFR 176.140 - Segregation from other classes of hazardous materials.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 2 2011-10-01 2011-10-01 false Segregation from other classes of hazardous... CARRIAGE BY VESSEL Detailed Requirements for Class 1 (Explosive) Materials Segregation § 176.140 Segregation from other classes of hazardous materials. (a) Class 1 (explosive) materials must be segregated...
46 CFR 148.120 - Stowage and segregation requirements.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 46 Shipping 5 2011-10-01 2011-10-01 false Stowage and segregation requirements. 148.120 Section... OF BULK SOLID MATERIALS THAT REQUIRE SPECIAL HANDLING Stowage and Segregation § 148.120 Stowage and segregation requirements. (a) Each material listed in Table 148.10 of this part must be segregated from...
Shaping Income Segregation in Schools: The Role of School Attendance Zone Geography
ERIC Educational Resources Information Center
Saporito, Salvatore
2017-01-01
This study investigates how much the geographic shapes of school attendance zones contributes to their levels of income segregation while holding constant levels of income segregation across residential areas. Income segregation across attendance zones is measured with the rank ordered information theory index. Income segregation across…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kenik, E.A.
X-ray microanalysis in an analytical electron microscope is a proven technique for the measurement of solute segregation in alloys. Solute segregation under equilibrium or nonequilibrium conditions can strongly influence material performance. X-ray microanalysis in an analytical electron microscope provides an alternative technique to measure grain boundary segregation, as well as segregation to other defects not accessible to Auger analysis. The utility of the technique is demonstrated by measurements of equilibrium segregation to boundaries in an antimony-containing stainless steel, including the variation of segregation with boundary character, and by measurements of nonequilibrium segregation to boundaries and dislocations in an ion-irradiated stainless steel.
Segregation effects during solidification in weightless melts
NASA Technical Reports Server (NTRS)
Li, C.
1973-01-01
Two types of melt segregation effects were studied: (1) evaporative segregation, or segregation due to surface evaporation; and (2) freezing segregation, or segregation due to the liquid-solid phase transformation. These segregation effects are closely related. In fact, evaporative segregation always precedes freezing segregation to some degree and must often be studied prior to performing meaningful solidification experiments. This is particularly true since evaporation may cause the melt composition, at least at the critical surface regions or layers, to be affected manyfold within seconds, so that the surface region or layer melting point and other thermophysical properties, nucleation characteristics, base for undercooling, and critical velocity to avoid constitutional supercooling may be completely unexpected. An important objective was, therefore, to develop the necessary normal evaporation equations for predicting the compositional changes within specified times at temperature and to correlate these equations with actual experimental data collected from the literature.
Death by Segregation: Does the Dimension of Racial Segregation Matter?
Yang, Tse-Chuan; Matthews, Stephen A
2015-01-01
The county-level geographic mortality differentials have persisted over the past four decades in the United States (US). Though several socioeconomic factors (e.g., inequality) partially explain this phenomenon, the role of race/ethnic segregation in general, and of the different dimensions of segregation more specifically, has been underexplored. Focusing on all-cause age-sex standardized US county-level mortality (2004-2008), this study has two substantive goals: (1) to understand whether segregation is a determinant of mortality and, if so, how the relationship between segregation and mortality varies by racial/ethnic dyad (e.g., white/black), and (2) to explore whether different dimensions of segregation (i.e., evenness, exposure, concentration, centralization, and clustering) are associated with mortality. A third goal is methodological: to assess whether spatial autocorrelation influences our understanding of the associations between the dimensions of segregation and mortality. Race/ethnic segregation was found to contribute to the geographic mortality disparities. Moreover, the relationship with mortality differed by both race/ethnic group and the dimension of segregation. Specifically, white/black segregation is positively related to mortality, whereas the segregation between whites and non-black minorities is negatively associated with mortality. Among the five dimensions of segregation, evenness and exposure are more strongly related to mortality than the other dimensions. Spatial filtering approaches also identified six unique spatial patterns that significantly affect the spatial distribution of mortality. These patterns offer possible insights that help identify omitted variables related to the persistent patterning of mortality in the US.
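Of the five dimensions listed in the abstract above, the exposure dimension is commonly operationalized with the isolation index xP*x. A minimal sketch, with toy counts rather than the study's county data:

```python
def isolation_index(group, totals):
    """Isolation index xP*x: the probability that a randomly drawn member
    of the group shares their areal unit with a same-group member.

    group:  count of the group in each areal unit
    totals: total population of each areal unit
    """
    X = sum(group)
    return sum((x / X) * (x / t) for x, t in zip(group, totals) if t > 0)

print(isolation_index([100, 0], [100, 50]))  # 1.0: total concentration
print(isolation_index([50, 25], [100, 50]))  # ≈ 0.5: even 50% share everywhere
```

Unlike the evenness-based Dissimilarity Index, xP*x depends on the group's overall population share, which is one reason studies report the two dimensions separately.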
Dai, Lengshi; Best, Virginia; Shinn-Cunningham, Barbara G.
2018-01-01
Listeners with sensorineural hearing loss often have trouble understanding speech amid other voices. While poor spatial hearing is often implicated, direct evidence is weak; moreover, studies suggest that reduced audibility and degraded spectrotemporal coding may explain such problems. We hypothesized that poor spatial acuity leads to difficulty deploying selective attention, which normally filters out distracting sounds. In listeners with normal hearing, selective attention causes changes in the neural responses evoked by competing sounds, which can be used to quantify the effectiveness of attentional control. Here, we used behavior and electroencephalography to explore whether control of selective auditory attention is degraded in hearing-impaired (HI) listeners. Normal-hearing (NH) and HI listeners identified a simple melody presented simultaneously with two competing melodies, each simulated from different lateral angles. We quantified performance and attentional modulation of cortical responses evoked by these competing streams. Compared with NH listeners, HI listeners had poorer sensitivity to spatial cues, performed more poorly on the selective attention task, and showed less robust attentional modulation of cortical responses. Moreover, across NH and HI individuals, these measures were correlated. While both groups showed cortical suppression of distracting streams, this modulation was weaker in HI listeners, especially when attending to a target at midline, surrounded by competing streams. These findings suggest that hearing loss interferes with the ability to filter out sound sources based on location, contributing to communication difficulties in social situations. These findings also have implications for technologies aiming to use neural signals to guide hearing aid processing. PMID:29555752
Visual processing affects the neural basis of auditory discrimination.
Kislyuk, Daniel S; Möttönen, Riikka; Sams, Mikko
2008-12-01
The interaction between auditory and visual speech streams is a seamless and surprisingly effective process. An intriguing example is the "McGurk effect": The acoustic syllable /ba/ presented simultaneously with a mouth articulating /ga/ is typically heard as /da/ [McGurk, H., & MacDonald, J. Hearing lips and seeing voices. Nature, 264, 746-748, 1976]. Previous studies have demonstrated the interaction of auditory and visual streams at the auditory cortex level, but the importance of these interactions for the qualitative perception change remained unclear because the change could result from interactions at higher processing levels as well. In our electroencephalogram experiment, we combined the McGurk effect with mismatch negativity (MMN), a response that is elicited in the auditory cortex at a latency of 100-250 msec by any above-threshold change in a sequence of repetitive sounds. An "odd-ball" sequence of acoustic stimuli consisting of frequent /va/ syllables (standards) and infrequent /ba/ syllables (deviants) was presented to 11 participants. Deviant stimuli in the unisensory acoustic stimulus sequence elicited a typical MMN, reflecting discrimination of acoustic features in the auditory cortex. When the acoustic stimuli were dubbed onto a video of a mouth constantly articulating /va/, the deviant acoustic /ba/ was heard as /va/ due to the McGurk effect and was indistinguishable from the standards. Importantly, such deviants did not elicit MMN, indicating that the auditory cortex failed to discriminate between the acoustic stimuli. Our findings show that visual stream can qualitatively change the auditory percept at the auditory cortex level, profoundly influencing the auditory cortex mechanisms underlying early sound discrimination.
Plascak, Jesse J.; Molina, Yamile; Wu-Georges, Samantha; Idris, Ayah; Thompson, Beti
2016-01-01
The relationship between Latino residential segregation and self-rated health (SRH) is unclear, but might be partially affected by social capital. We investigated the association between Latino residential segregation and SRH while also examining the roles of various social capital measures. Washington State Behavioral Risk Factor Surveillance System (2012–2014) and U.S. Census data were linked by zip code and zip code tabulation area. Multilevel logistic regression models were used to estimate odds of good or better SRH by Latino residential segregation, measured by the Gini coefficient, and controlling for sociodemographic, acculturation and social capital measures of neighborhood ties, collective socialization of children, and social control. The Latino residential segregation – SRH relationship was convex, or ‘U’-shaped, such that increases in segregation among Latinos residing in lower segregation areas was associated with lower SRH while increases in segregation among Latinos residing in higher segregation areas was associated with higher SRH. The social capital measures were independently associated with SRH but had little effect on the relationship between Latino residential segregation and SRH. A convex relationship between Latino residential segregation and SRH could explain mixed findings of previous studies. Although important for SRH, social capital measures of neighborhood ties, collective socialization of children, and social control might not account for the relationship between Latino residential segregation and SRH. PMID:27173739
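The Gini coefficient used as the segregation measure in the study above can be illustrated in its plain (aspatial) form; the inputs here are toy values such as per-zip-code population shares, not the Washington State data:

```python
def gini(values):
    """Gini coefficient of non-negative values across areal units.

    0 = perfect equality across units; approaches 1 when the quantity
    is concentrated in a single unit.
    """
    vals = sorted(values)
    n = len(vals)
    total = sum(vals)
    # mean-difference formulation over the sorted values
    return sum((2 * i - n - 1) * v for i, v in enumerate(vals, 1)) / (n * total)

print(gini([1, 1, 1, 1]))  # 0.0
print(gini([0, 0, 0, 1]))  # 0.75
```

The convex ('U'-shaped) relationship the abstract reports is between this kind of scalar segregation score and self-rated health, so the index itself is only the x-axis of that analysis.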
A Revaluation of Indexes of Residential Segregation
ERIC Educational Resources Information Center
Winship, Christopher
1977-01-01
Shows that there are at least two different perspectives from which residential segregation can be examined. Segregation can be measured as it deviates from a situation of complete desegregation or in terms of a situation in which there is random segregation in the city. New criteria for indexes of residential segregation are developed. (Author/JM)
Code of Federal Regulations, 2010 CFR
2010-07-01
... disciplinary segregation and review of inmates in disciplinary segregation. 541.20 Section 541.20 Judicial... disciplinary segregation and review of inmates in disciplinary segregation. (a) Except as provided in paragraph... the physical confines of administrative detention, and (3) upon advice of appropriate medical staff...
43 CFR 3873.1 - Segregation of mineral from non-mineral land.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 43 Public Lands: Interior 2 2011-10-01 2011-10-01 false Segregation of mineral from non-mineral... AND CONFLICTS Segregation § 3873.1 Segregation of mineral from non-mineral land. Where a survey is... satisfactorily established that there are existent prior unpatented mining claims, the segregation of the latter...
Income Segregation between Schools and School Districts. CEPA Working Paper No. 16-04
ERIC Educational Resources Information Center
Owens, Ann; Reardon, Sean F.; Jencks, Christopher
2016-01-01
Although trends in the racial segregation of schools are well documented, less is known about trends in "income" segregation. We use multiple data sources to document trends in income segregation between schools and school districts. Between-district income segregation of families with children enrolled in public school increased by over…
Segregation as Splitting, Segregation as Joining: Schools, Housing, and the Many Modes of Jim Crow
ERIC Educational Resources Information Center
Highsmith, Andrew R.; Erickson, Ansley T.
2015-01-01
Popular understandings of segregation often emphasize the Jim Crow South before the 1954 "Brown" decision and, in many instances, explain continued segregation in schooling as the result of segregated housing patterns. The case of Flint, Michigan, complicates these views, at once illustrating the depth of governmental commitment to…
A Naturalistic Observational Study of Informal Segregation: Seating Patterns in Lectures
ERIC Educational Resources Information Center
Koen, Jennifer; Durrheim, Kevin
2010-01-01
In spite of the removal of legislated racial segregation, a number of observational studies in South Africa and elsewhere have shown that "informal," nonlegislated segregation persists in spaces of everyday interaction. Most of these have been case studies of segregation at single sites. The authors seek to quantify segregation in a…
Park, Yoo Min; Kwan, Mei-Po
2017-10-10
Many environmental justice studies have sought to examine the effect of residential segregation on unequal exposure to environmental factors among different social groups, but little is known about how segregation in non-residential contexts affects such disparity. Based on a review of the relevant literature, this paper discusses the limitations of traditional residence-based approaches in examining the association between socioeconomic or racial/ethnic segregation and unequal environmental exposure in environmental justice research. It emphasizes that future research needs to go beyond residential segregation by considering the full spectrum of segregation experienced by people in various geographic and temporal contexts of everyday life. Along with this comprehensive understanding of segregation, the paper also highlights the importance of assessing environmental exposure at a high spatiotemporal resolution in environmental justice research. The successful integration of a comprehensive concept of segregation, high-resolution data and fine-grained spatiotemporal approaches to assessing segregation and environmental exposure would provide more nuanced and robust findings on the associations between segregation and disparities in environmental exposure and their health impacts. Moreover, it would also contribute to significantly expanding the scope of environmental justice research.
Luminosity segregation in galaxy clusters as an indication of dynamical evolution
NASA Technical Reports Server (NTRS)
Baier, F. W.; Schmidt, K.-H.
1993-01-01
Theoretical models describing the dynamical evolution of self-gravitating systems predict a spatial mass segregation for more evolved systems, with the more massive objects concentrated toward the center of the configuration. From the observational point of view, however, the existence of mass segregation in galaxy clusters seems to be a matter of controversy. A special problem in this connection is the formation of cD galaxies in the centers of galaxy clusters. The most promising scenarios of their formation are galaxy cannibalism (merger scenario) and growing by cooling flows. It seems to be plausible to consider the swallowing of smaller systems by a dominant galaxy as an important process in the evolution of a cD galaxy. The stage of the evolution of the dominant galaxy should be reflected by the surrounding galaxy population, especially by possible mass segregation effects. Assuming that mass segregation is tantamount to luminosity segregation we analyzed luminosity segregation in roughly 40 cD galaxy clusters. Obviously there are three different groups of clusters: (1) clusters with luminosity segregation, (2) clusters without luminosity segregation, and (3) such objects exhibiting a phenomenon which we call antisegregation in luminosity, i.e. a deficiency of bright galaxies in the central regions of clusters. This result is interpreted in the sense of different degrees of mass segregation and as an indication for different evolution stages of these clusters. The clusters are arranged in the three segregation classes 2, 1, and 0 (S2 = strong mass segregation, S1 = moderate mass segregation, S0 = weak or absent mass segregation). We assume that a galaxy cluster starts its dynamical evolution after virialization without any radial mass segregation. Energy exchange during encounters of cluster members as well as merger processes between cluster galaxies lead to an increasing radial mass segregation in the cluster (S1). 
If a certain degree of segregation (S2) has been established, a substantial number of slow-moving and relatively massive cluster members in the center will be cannibalized by the initially brightest cluster galaxy. This process should lead to the growth of the predominant galaxy, which is accompanied by a diminution of the mass segregation (transition to S1 and S0, respectively) in the neighborhood of the central very massive galaxy. An increase of the areal density of brighter galaxies towards the outer cluster regions (antisegregation of luminosity), i.e. an extremely low degree of mass segregation, was estimated for a substantial percentage of cD clusters. This result favors the cannibalism scenario for the formation of cD galaxies.
Two-year-olds can begin to acquire verb meanings in socially impoverished contexts.
Arunachalam, Sudha
2013-12-01
By two years of age, toddlers are adept at recruiting social, observational, and linguistic cues to discover the meanings of words. Here, we ask how they fare in impoverished contexts in which linguistic cues are provided, but no social or visual information is available. Novel verbs are presented in a stream of syntactically informative sentences, but the sentences are not embedded in a social context, and no visual access to the verb's referent is provided until the test phase. The results provide insight into how toddlers may benefit from overhearing contexts in which they are not directly attending to the ambient speech, and in which no conversational context, visual referent, or child-directed conversation is available. Copyright © 2013 Elsevier B.V. All rights reserved.
Fossett, Mark
2011-01-01
This paper considers the potential for using agent models to explore theories of residential segregation in urban areas. Results of generative experiments conducted using an agent-based simulation of segregation dynamics document that varying a small number of model parameters representing constructs from urban-ecological theories of segregation can generate a wide range of qualitatively distinct and substantively interesting segregation patterns. The results suggest how complex, macro-level patterns of residential segregation can arise from a small set of simple micro-level social dynamics operating within particular urban-demographic contexts. The promise and current limitations of agent simulation studies are noted and optimism is expressed regarding the potential for such studies to engage and contribute to the broader research literature on residential segregation. PMID:21379372
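The micro-to-macro dynamic described above can be illustrated with a minimal Schelling-style agent model. This is a generic sketch of the kind of simulation the paper discusses, not the paper's own model; the grid size, tolerance threshold, and move rule are illustrative assumptions:

```python
import random

# Minimal Schelling-style agent model of residential segregation.
# Agents of two types occupy a torus grid; an agent is "unhappy" when
# fewer than `tolerance` of its occupied neighbor cells match its type.

def neighbors(grid, r, c, n):
    """The 8 cells around (r, c) on an n x n torus."""
    return [grid[(r + dr) % n][(c + dc) % n]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)]

def step(grid, n, tolerance):
    """Move one randomly chosen unhappy agent to a random empty cell;
    return the number of unhappy agents before the move."""
    unhappy = []
    for r in range(n):
        for c in range(n):
            agent = grid[r][c]
            if agent is None:
                continue
            nbrs = [a for a in neighbors(grid, r, c, n) if a is not None]
            if nbrs and sum(a == agent for a in nbrs) / len(nbrs) < tolerance:
                unhappy.append((r, c))
    empties = [(r, c) for r in range(n) for c in range(n) if grid[r][c] is None]
    if unhappy and empties:
        r, c = random.choice(unhappy)
        er, ec = random.choice(empties)
        grid[er][ec], grid[r][c] = grid[r][c], None
    return len(unhappy)

# 20 x 20 torus, two agent types (0 and 1), 10% vacant cells.
random.seed(0)
n = 20
cells = [0] * 180 + [1] * 180 + [None] * 40
random.shuffle(cells)
grid = [cells[i * n:(i + 1) * n] for i in range(n)]
for _ in range(5000):
    if step(grid, n, tolerance=0.4) == 0:
        break  # all agents satisfied: clustered neighborhoods have formed
```

Even with a mild preference (agents tolerate being a local minority down to 40% like neighbors), repeated individual moves typically produce strongly clustered macro-patterns, which is the micro-to-macro point the paper's generative experiments make.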
Do, D Phuong; Frank, Reanne; Iceland, John
2017-08-01
While black-white segregation has been consistently linked to detrimental health outcomes for blacks, it remains unclear whether segregation is necessarily a zero-sum arrangement in which some groups accrue health advantages at the expense of other groups, and whether metropolitan segregation impacts the health of racial groups uniformly within the metropolitan area. Using nationally representative data from the 2008-2013 National Health Interview Survey linked to Census data, we investigate whether the association between metropolitan segregation and health is invariant within the metropolitan area or whether it is modified by neighborhood poverty for black and white Americans. In doing so, we assess the extent to which segregation involves direct health tradeoffs between blacks and whites. We conduct race-stratified multinomial and logistic regression models to assess the relationship between 1) segregation and level of neighborhood poverty and 2) segregation, neighborhood poverty, and poor health, respectively. We find that, for blacks, segregation was associated with a higher likelihood of residing in high poverty neighborhoods, net of individual-level socioeconomic characteristics. Segregation was positively associated with poor health for blacks in high poverty neighborhoods, but not for those in lower poverty neighborhoods. Hence, the self-rated health of blacks clearly suffers as a result of black-white segregation - both directly, and indirectly through exposure to high poverty neighborhoods. We do not find consistent evidence for a direct relationship between segregation and poor health for whites. However, we find some suggestive evidence that segregation may indirectly benefit whites through decreasing their exposure to high poverty environments. These findings underscore the critical role of concentrated disadvantage in the complex interconnection between metropolitan segregation and health.
Weakening the link between racial segregation and concentrated poverty via local policy and planning has the potential for broad population-based health improvements and significant reductions in black-white health disparities. Copyright © 2017. Published by Elsevier Ltd.
Public school segregation and juvenile violent crime arrests in metropolitan areas.
Eitle, David; Eitle, Tamela McNulty
2010-01-01
Previous research has established an association between residential segregation and violent crime in urban America. Our study examines whether school-based segregation is predictive of arrests of juveniles for violent crimes in U.S. metro areas. Using Census, Uniform Crime Report, and Common Core data for 204 metro areas, a measure of school-based racial segregation, Theil's entropy index, is decomposed into two components: between- and within-district segregation. Findings reveal evidence of a significant interaction term: Within-district segregation is inversely associated with arrests for juvenile violence, but only in metropolitan areas with higher than average levels of between-district segregation.
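Theil's entropy index H, and the exact between-/within-district decomposition used above, can be sketched with the standard formulas. The school data below are hypothetical, not the study's:

```python
from math import log

# Theil's information-theory segregation index (H) and its exact
# between-/within-district decomposition (standard formulas).

def entropy(props):
    """Entropy of a racial composition given as group proportions."""
    return -sum(p * log(p) for p in props if p > 0)

def theil_h(units):
    """units: list of (population, [group proportions]), one per school."""
    T = sum(t for t, _ in units)
    k = len(units[0][1])
    overall = [sum(t * p[g] for t, p in units) / T for g in range(k)]
    E = entropy(overall)
    return sum(t * (E - entropy(p)) for t, p in units) / (T * E)

# Two hypothetical districts of two schools each (two racial groups).
district_a = [(100, [0.9, 0.1]), (100, [0.7, 0.3])]
district_b = [(100, [0.2, 0.8]), (100, [0.4, 0.6])]
schools = district_a + district_b

# Total H over all schools.
h_total = theil_h(schools)

# Between-district H: treat each district as a single unit.
districts = []
for d in (district_a, district_b):
    Td = sum(t for t, _ in d)
    pd = [sum(t * p[g] for t, p in d) / Td for g in range(2)]
    districts.append((Td, pd))
h_between = theil_h(districts)

# Within-district terms, weighted by district population and entropy.
T = sum(t for t, _ in schools)
E = entropy([sum(t * p[g] for t, p in schools) / T for g in range(2)])
h_within = sum(Td * entropy(pd) / (T * E) * theil_h(d)
               for (Td, pd), d in zip(districts, (district_a, district_b)))
# The decomposition is exact: h_total == h_between + h_within.
```

Decomposing H this way is what lets the study test an interaction: total school segregation splits exactly into a between-district component and population- and entropy-weighted within-district components.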
The high-rate data challenge: computing for the CBM experiment
NASA Astrophysics Data System (ADS)
Friese, V.;
2017-10-01
The Compressed Baryonic Matter experiment (CBM) is a next-generation heavy-ion experiment to be operated at the FAIR facility, currently under construction in Darmstadt, Germany. A key feature of CBM is its very high interaction rate, exceeding those of contemporary nuclear collision experiments by several orders of magnitude. Such interaction rates forbid a conventional, hardware-triggered readout; instead, experiment data will stream freely from self-triggered front-end electronics. In order to reduce the huge raw data volume to a recordable rate, data will be selected exclusively on CPU, which necessitates partial event reconstruction in real time. Consequently, the traditional segregation of online and offline software vanishes; an integrated on- and offline data processing concept is called for. In this paper, we report on concepts and developments in computing for CBM as well as on the status of preparations for its first physics run.
Synthesis and self-assembly of amphiphilic polymeric microparticles.
Dendukuri, Dhananjay; Hatton, T Alan; Doyle, Patrick S
2007-04-10
We report the synthesis and self-assembly of amphiphilic, nonspherical, polymeric microparticles. Wedge-shaped particles bearing segregated hydrophilic and hydrophobic sections were synthesized in a microfluidic channel by polymerizing across laminar coflowing streams of hydrophilic and hydrophobic polymers using continuous flow lithography (CFL). Particle monodispersity was characterized by measuring both the size of the particles formed and the extent of amphiphilicity. The coefficient of variation (COV) was found to be less than 2.5% in all measured dimensions. Particle structure was further characterized by measuring the curvature of the interface between the sections and the extent of cross-linking using FTIR spectroscopy. The amphiphilic particles were allowed to self-assemble in water or at water-oil interfaces. In water, the geometry of the particles enabled the formation of micelle-like structures, while in emulsions, the particles migrated to the oil-water interface and oriented themselves to minimize their surface energy.
Effects of Space Environment on Flow and Concentration During Directional Solidification
NASA Technical Reports Server (NTRS)
Benjapiyaporn, C.; Timchenko, V.; Leonardi, E.; deVahlDavis, G.; deGroh, H. C., III
2000-01-01
A study of directional solidification of a weak binary alloy (specifically, Bi - 1 at% Sn) based on the fixed grid single domain approach is being undertaken. The enthalpy method is used to solve for the temperature field over the computational domain including both the solid and liquid phases; latent heat evolution is treated with the aid of an effective specific heat coefficient. A source term accounting for the release of solute into the liquid during solidification has been incorporated into the solute transport equation. The vorticity-stream function formulation is used to describe thermosolutal convection in the liquid region. In this paper we numerically investigate the effects of g-jitter on directional solidification. A background gravity of 1 micro-g has been assumed, and new results for the effects of periodic disturbances over a range of amplitudes and frequencies on solute field and segregation have been presented.
V/STOL model fan stage rig design report
NASA Technical Reports Server (NTRS)
Cheatham, J. G.; Creason, T. L.
1983-01-01
A model single-stage fan with variable inlet guide vanes (VIGV) was designed to demonstrate efficient point operation while providing flow and pressure ratio modulation capability required for a V/STOL propulsion system. The fan stage incorporates a split-flap VIGV with an independently actuated ID flap to permit independent modulation of fan and core engine airstreams, a flow splitter integrally designed into the blade and vanes to completely segregate fan and core airstreams in order to maximize core stream supercharging for V/STOL operation, and an EGV with a variable leading edge fan flap for rig performance optimization. The stage was designed for a maximum flow size of 37.4 kg/s (82.3 lb/s) for compatibility with LeRC test facility requirements. Design values at maximum flow for blade tip velocity and stage pressure ratio are 472 m/s (1550 ft/s) and 1.68, respectively.
Iqbal, Zafar; Püttmann, Lucia; Musante, Luciana; Razzaq, Attia; Zahoor, Muhammad Yasir; Hu, Hao; Wienker, Thomas F; Garshasbi, Masoud; Fattahi, Zohreh; Gilissen, Christian; Vissers, Lisenka ELM; de Brouwer, Arjan PM; Veltman, Joris A; Pfundt, Rolph; Najmabadi, Hossein; Ropers, Hans-Hilger; Riazuddin, Sheikh; Kahrizi, Kimia; van Bokhoven, Hans
2016-01-01
AIMP1/p43 is a multifunctional non-catalytic component of the multisynthetase complex. The complex consists of nine catalytic and three non-catalytic proteins, which catalyze the ligation of amino acids to their cognate tRNA isoacceptors for use in protein translation. To date, two allelic variants in the AIMP1 gene have been reported as the underlying cause of autosomal recessive primary neurodegenerative disorder. Here, we present two consanguineous families from Pakistan and Iran, presenting with moderate to severe intellectual disability, global developmental delay, and speech impairment without neurodegeneration. By the combination of homozygosity mapping and next generation sequencing, we identified two homozygous missense variants, p.(Gly299Arg) and p.(Val176Gly), in the gene AIMP1 that co-segregated with the phenotype in the respective families. Molecular modeling of the variants revealed deleterious effects on the protein structure that are predicted to result in reduced AIMP1 function. Our findings indicate that the clinical spectrum for AIMP1 defects is broader than witnessed so far. PMID:26173967
NASA Astrophysics Data System (ADS)
Barnard, P. E.; Terblans, J. J.; Swart, H. C.
2015-12-01
The article takes a new look at the process of atomic segregation by considering the influence of surface relaxation on the segregation parameters: the activation energy (Q), segregation energy (ΔG), interaction parameter (Ω) and the pre-exponential factor (D0). Computational modelling, namely Density Functional Theory (DFT) and the Modified Darken Model (MDM), in conjunction with Auger Electron Spectroscopy (AES), was utilized to study the variation of the segregation parameters for S in the surface region of Fe(100). Results indicate a variation in each of the segregation parameters as a function of the atomic layer under consideration. Values of the segregation parameters varied more dramatically as the surface layer was approached, with atomic layer 2 having the largest deviations in comparison to the bulk values. This atomic layer had the highest Q value and formed the rate-limiting step for the segregation of S towards the Fe(100) surface. It was found that the segregation process is influenced by two sets of segregation parameters: those of the surface region formed by atomic layer 2, and those in the bulk material. This article is the first to conduct a full-scale investigation of the influence of surface relaxation on segregation, a phenomenon it labels the "surface effect".
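The roles of Q, ΔG and D0 can be illustrated with two standard relations that segregation-kinetics models such as the MDM build on: the Arrhenius diffusion law and the Langmuir-McLean equilibrium. This is a hedged sketch with illustrative parameter values, not the fitted Fe(100)/S values from the article:

```python
from math import exp

# Two standard relations underlying segregation kinetics models.
# Parameter values below are illustrative assumptions only.

R = 8.314  # gas constant, J/(mol K)

def diffusion_coefficient(D0, Q, T):
    """Arrhenius law: D = D0 * exp(-Q / (R * T))."""
    return D0 * exp(-Q / (R * T))

def surface_fraction(X_bulk, dG, T):
    """Langmuir-McLean equilibrium surface coverage for segregation
    energy dG (J/mol); negative dG favours surface enrichment."""
    ratio = X_bulk / (1 - X_bulk) * exp(-dG / (R * T))
    return ratio / (1 + ratio)

# Illustrative comparison: a higher activation energy Q in atomic layer 2
# slows transport through that layer, making it the rate-limiting step.
D_bulk = diffusion_coefficient(D0=1e-4, Q=2.0e5, T=900)
D_layer2 = diffusion_coefficient(D0=1e-4, Q=2.4e5, T=900)
```

With the same D0, the layer with the larger Q has the smaller D, which is the sense in which atomic layer 2 forms the rate-limiting step for segregation toward the surface.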
Gender Segregation in the Spanish Labor Market: An Alternative Approach
ERIC Educational Resources Information Center
del Rio, Coral; Alonso-Villar, Olga
2010-01-01
The aim of this paper is to study occupational segregation by gender in Spain, which is a country where occupational segregation explains a large part of the gender wage gap. As opposed to previous studies, this paper measures not only overall segregation, but also the segregation of several population subgroups. For this purpose, this paper uses…
ERIC Educational Resources Information Center
Orfield, Gary; And Others
This study shows where school segregation is concentrated and where schools remain highly integrated. It offers the first national comparison of segregation by community size and reveals that segregation remains high in big cities and serious in mid-size central cities. Many African-American and Latino students also attend segregated schools in…