computational auditory scene: Topics by Science.gov

Sample records for computational auditory scene

Recent advances in exploring the neural underpinnings of auditory scene perception

PubMed Central

Snyder, Joel S.; Elhilali, Mounya

2017-01-01

Studies of auditory scene analysis have traditionally relied on paradigms using artificial sounds—and conventional behavioral techniques—to elucidate how we perceptually segregate auditory objects or streams from each other. In the past few decades, however, there has been growing interest in uncovering the neural underpinnings of auditory segregation using human and animal neuroscience techniques, as well as computational modeling. This largely reflects the growth in the fields of cognitive neuroscience and computational neuroscience and has led to new theories of how the auditory system segregates sounds in complex arrays. The current review focuses on neural and computational studies of auditory scene perception published in the past few years. Following the progress that has been made in these studies, we describe (1) theoretical advances in our understanding of the most well-studied aspects of auditory scene perception, namely segregation of sequential patterns of sounds and concurrently presented sounds; (2) the diversification of topics and paradigms that have been investigated; and (3) how new neuroscience techniques (including invasive neurophysiology in awake humans, genotyping, and brain stimulation) have been used in this field. PMID:28199022
The what, where and how of auditory-object perception.

PubMed

Bizley, Jennifer K; Cohen, Yale E

2013-10-01

The fundamental perceptual unit in hearing is the 'auditory object'. Similar to visual objects, auditory objects are the computational result of the auditory system's capacity to detect, extract, segregate and group spectrotemporal regularities in the acoustic environment; the multitude of acoustic stimuli around us together form the auditory scene. However, unlike the visual scene, resolving the component objects within the auditory scene crucially depends on their temporal structure. Neural correlates of auditory objects are found throughout the auditory system. However, neural responses do not become correlated with a listener's perceptual reports until the level of the cortex. The roles of different neural structures and the contribution of different cognitive states to the perception of auditory objects are not yet fully understood.
The what, where and how of auditory-object perception

PubMed Central

Bizley, Jennifer K.; Cohen, Yale E.

2014-01-01

The fundamental perceptual unit in hearing is the ‘auditory object’. Similar to visual objects, auditory objects are the computational result of the auditory system's capacity to detect, extract, segregate and group spectrotemporal regularities in the acoustic environment; the multitude of acoustic stimuli around us together form the auditory scene. However, unlike the visual scene, resolving the component objects within the auditory scene crucially depends on their temporal structure. Neural correlates of auditory objects are found throughout the auditory system. However, neural responses do not become correlated with a listener's perceptual reports until the level of the cortex. The roles of different neural structures and the contribution of different cognitive states to the perception of auditory objects are not yet fully understood. PMID:24052177
Functional neuroanatomy of auditory scene analysis in Alzheimer's disease

PubMed Central

Golden, Hannah L.; Agustus, Jennifer L.; Goll, Johanna C.; Downey, Laura E.; Mummery, Catherine J.; Schott, Jonathan M.; Crutch, Sebastian J.; Warren, Jason D.

2015-01-01

Auditory scene analysis is a demanding computational process that is performed automatically and efficiently by the healthy brain but vulnerable to the neurodegenerative pathology of Alzheimer's disease. Here we assessed the functional neuroanatomy of auditory scene analysis in Alzheimer's disease using the well-known ‘cocktail party effect’ as a model paradigm whereby stored templates for auditory objects (e.g., hearing one's spoken name) are used to segregate auditory ‘foreground’ and ‘background’. Patients with typical amnestic Alzheimer's disease (n = 13) and age-matched healthy individuals (n = 17) underwent functional 3T-MRI using a sparse acquisition protocol with passive listening to auditory stimulus conditions comprising the participant's own name interleaved with or superimposed on multi-talker babble, and spectrally rotated (unrecognisable) analogues of these conditions. Name identification (conditions containing the participant's own name contrasted with spectrally rotated analogues) produced extensive bilateral activation involving superior temporal cortex in both the AD and healthy control groups, with no significant differences between groups. Auditory object segregation (conditions with interleaved name sounds contrasted with superimposed name sounds) produced activation of right posterior superior temporal cortex in both groups, again with no differences between groups. However, the cocktail party effect (interaction of own name identification with auditory object segregation processing) produced activation of right supramarginal gyrus in the AD group that was significantly enhanced compared with the healthy control group. The findings delineate an altered functional neuroanatomical profile of auditory scene analysis in Alzheimer's disease that may constitute a novel computational signature of this neurodegenerative pathology. PMID:26029629
Neural correlates of auditory scene analysis and perception

PubMed Central

Cohen, Yale E.

2014-01-01

The auditory system is designed to transform acoustic information from low-level sensory representations into perceptual representations. These perceptual representations are the computational result of the auditory system's ability to group and segregate spectral, spatial and temporal regularities in the acoustic environment into stable perceptual units (i.e., sounds or auditory objects). Current evidence suggests that the cortex--specifically, the ventral auditory pathway--is responsible for the computations most closely related to perceptual representations. Here, we discuss how the transformations along the ventral auditory pathway relate to auditory percepts, with special attention paid to the processing of vocalizations and categorization, and explore recent models of how these areas may carry out these computations. PMID:24681354
A comparison of several computational auditory scene analysis (CASA) techniques for monaural speech segregation.

PubMed

Zeremdini, Jihen; Ben Messaoud, Mohamed Anouar; Bouzid, Aicha

2015-09-01

Humans have the ability to easily separate a composed speech and to form perceptual representations of the constituent sources in an acoustic mixture thanks to their ears. Until recently, researchers attempt to build computer models of high-level functions of the auditory system. The problem of the composed speech segregation is still a very challenging problem for these researchers. In our case, we are interested in approaches that are addressed to the monaural speech segregation. For this purpose, we study in this paper the computational auditory scene analysis (CASA) to segregate speech from monaural mixtures. CASA is the reproduction of the source organization achieved by listeners. It is based on two main stages: segmentation and grouping. In this work, we have presented, and compared several studies that have used CASA for speech separation and recognition.
Resolving the neural dynamics of visual and auditory scene processing in the human brain: a methodological approach

PubMed Central

Teng, Santani

2017-01-01

In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044019
Resolving the neural dynamics of visual and auditory scene processing in the human brain: a methodological approach.

PubMed

Cichy, Radoslaw Martin; Teng, Santani

2017-02-19

In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear. Further, no single non-invasive brain measurement technique provides a spatio-temporally integrated view. In this opinion piece, we argue that progress can be made by a concerted effort based on three pillars of recent methodological development: (i) sensitive analysis techniques such as decoding and cross-classification, (ii) complex computational modelling using models such as deep neural networks, and (iii) integration across imaging methods (magnetoencephalography/electroencephalography, functional magnetic resonance imaging) and models, e.g. using representational similarity analysis. We showcase two recent efforts that have been undertaken in this spirit and provide novel results about visual and auditory scene analysis. Finally, we discuss the limits of this perspective and sketch a concrete roadmap for future research.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Authors.
A Corticothalamic Circuit Model for Sound Identification in Complex Scenes

PubMed Central

Otazu, Gonzalo H.; Leibold, Christian

2011-01-01

The identification of the sound sources present in the environment is essential for the survival of many animals. However, these sounds are not presented in isolation, as natural scenes consist of a superposition of sounds originating from multiple sources. The identification of a source under these circumstances is a complex computational problem that is readily solved by most animals. We present a model of the thalamocortical circuit that performs level-invariant recognition of auditory objects in complex auditory scenes. The circuit identifies the objects present from a large dictionary of possible elements and operates reliably for real sound signals with multiple concurrently active sources. The key model assumption is that the activities of some cortical neurons encode the difference between the observed signal and an internal estimate. Reanalysis of awake auditory cortex recordings revealed neurons with patterns of activity corresponding to such an error signal. PMID:21931668
Cortical Representations of Speech in a Multitalker Auditory Scene.

PubMed

Puvvada, Krishna C; Simon, Jonathan Z

2017-09-20

The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically based representations in the auditory nerve, into perceptually distinct auditory-object-based representations in the auditory cortex. Here, using magnetoencephalography recordings from men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of the auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in the auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. We also show that higher-order auditory cortical areas, by contrast, represent the attended stream separately and with significantly higher fidelity than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of the human auditory cortex. SIGNIFICANCE STATEMENT Using magnetoencephalography recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of the auditory cortex. We show that the primary-like areas in the auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory scene, with both attended and unattended speech streams represented with almost equal fidelity. We also show that higher-order auditory cortical areas, by contrast, represent an attended speech stream separately from, and with significantly higher fidelity than, unattended speech streams. Furthermore, the unattended background streams are represented as a single undivided background object rather than as distinct background objects. Copyright © 2017 the authors 0270-6474/17/379189-08$15.00/0.
Modelling auditory attention

PubMed Central

Kaya, Emine Merve

2017-01-01

Sounds in everyday life seldom appear in isolation. Both humans and machines are constantly flooded with a cacophony of sounds that need to be sorted through and scoured for relevant information—a phenomenon referred to as the ‘cocktail party problem’. A key component in parsing acoustic scenes is the role of attention, which mediates perception and behaviour by focusing both sensory and cognitive resources on pertinent information in the stimulus space. The current article provides a review of modelling studies of auditory attention. The review highlights how the term attention refers to a multitude of behavioural and cognitive processes that can shape sensory processing. Attention can be modulated by ‘bottom-up’ sensory-driven factors, as well as ‘top-down’ task-specific goals, expectations and learned schemas. Essentially, it acts as a selection process or processes that focus both sensory and cognitive resources on the most relevant events in the soundscape; with relevance being dictated by the stimulus itself (e.g. a loud explosion) or by a task at hand (e.g. listen to announcements in a busy airport). Recent computational models of auditory attention provide key insights into its role in facilitating perception in cluttered auditory scenes. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044012
Assessing Top-Down and Bottom-Up Contributions to Auditory Stream Segregation and Integration With Polyphonic Music

PubMed Central

Disbergen, Niels R.; Valente, Giancarlo; Formisano, Elia; Zatorre, Robert J.

2018-01-01

Polyphonic music listening well exemplifies processes typically involved in daily auditory scene analysis situations, relying on an interactive interplay between bottom-up and top-down processes. Most studies investigating scene analysis have used elementary auditory scenes, however real-world scene analysis is far more complex. In particular, music, contrary to most other natural auditory scenes, can be perceived by either integrating or, under attentive control, segregating sound streams, often carried by different instruments. One of the prominent bottom-up cues contributing to multi-instrument music perception is their timbre difference. In this work, we introduce and validate a novel paradigm designed to investigate, within naturalistic musical auditory scenes, attentive modulation as well as its interaction with bottom-up processes. Two psychophysical experiments are described, employing custom-composed two-voice polyphonic music pieces within a framework implementing a behavioral performance metric to validate listener instructions requiring either integration or segregation of scene elements. In Experiment 1, the listeners' locus of attention was switched between individual instruments or the aggregate (i.e., both instruments together), via a task requiring the detection of temporal modulations (i.e., triplets) incorporated within or across instruments. Subjects responded post-stimulus whether triplets were present in the to-be-attended instrument(s). Experiment 2 introduced the bottom-up manipulation by adding a three-level morphing of instrument timbre distance to the attentional framework. The task was designed to be used within neuroimaging paradigms; Experiment 2 was additionally validated behaviorally in the functional Magnetic Resonance Imaging (fMRI) environment. Experiment 1 subjects (N = 29, non-musicians) completed the task at high levels of accuracy, showing no group differences between any experimental conditions. Nineteen listeners also participated in Experiment 2, showing a main effect of instrument timbre distance, even though within attention-condition timbre-distance contrasts did not demonstrate any timbre effect. Correlation of overall scores with morph-distance effects, computed by subtracting the largest from the smallest timbre distance scores, showed an influence of general task difficulty on the timbre distance effect. Comparison of laboratory and fMRI data showed scanner noise had no adverse effect on task performance. These Experimental paradigms enable to study both bottom-up and top-down contributions to auditory stream segregation and integration within psychophysical and neuroimaging experiments. PMID:29563861
Auditory salience using natural soundscapes.

PubMed

Huang, Nicholas; Elhilali, Mounya

2017-03-01

Salience describes the phenomenon by which an object stands out from a scene. While its underlying processes are extensively studied in vision, mechanisms of auditory salience remain largely unknown. Previous studies have used well-controlled auditory scenes to shed light on some of the acoustic attributes that drive the salience of sound events. Unfortunately, the use of constrained stimuli in addition to a lack of well-established benchmarks of salience judgments hampers the development of comprehensive theories of sensory-driven auditory attention. The present study explores auditory salience in a set of dynamic natural scenes. A behavioral measure of salience is collected by having human volunteers listen to two concurrent scenes and indicate continuously which one attracts their attention. By using natural scenes, the study takes a data-driven rather than experimenter-driven approach to exploring the parameters of auditory salience. The findings indicate that the space of auditory salience is multidimensional (spanning loudness, pitch, spectral shape, as well as other acoustic attributes), nonlinear and highly context-dependent. Importantly, the results indicate that contextual information about the entire scene over both short and long scales needs to be considered in order to properly account for perceptual judgments of salience.
Rendering visual events as sounds: Spatial attention capture by auditory augmented reality.

PubMed

Stone, Scott A; Tata, Matthew S

2017-01-01

Many salient visual events tend to coincide with auditory events, such as seeing and hearing a car pass by. Information from the visual and auditory senses can be used to create a stable percept of the stimulus. Having access to related coincident visual and auditory information can help for spatial tasks such as localization. However not all visual information has analogous auditory percepts, such as viewing a computer monitor. Here, we describe a system capable of detecting and augmenting visual salient events into localizable auditory events. The system uses a neuromorphic camera (DAVIS 240B) to detect logarithmic changes of brightness intensity in the scene, which can be interpreted as salient visual events. Participants were blindfolded and asked to use the device to detect new objects in the scene, as well as determine direction of motion for a moving visual object. Results suggest the system is robust enough to allow for the simple detection of new salient stimuli, as well accurately encoding direction of visual motion. Future successes are probable as neuromorphic devices are likely to become faster and smaller in the future, making this system much more feasible.
Rendering visual events as sounds: Spatial attention capture by auditory augmented reality

PubMed Central

Tata, Matthew S.

2017-01-01

Many salient visual events tend to coincide with auditory events, such as seeing and hearing a car pass by. Information from the visual and auditory senses can be used to create a stable percept of the stimulus. Having access to related coincident visual and auditory information can help for spatial tasks such as localization. However not all visual information has analogous auditory percepts, such as viewing a computer monitor. Here, we describe a system capable of detecting and augmenting visual salient events into localizable auditory events. The system uses a neuromorphic camera (DAVIS 240B) to detect logarithmic changes of brightness intensity in the scene, which can be interpreted as salient visual events. Participants were blindfolded and asked to use the device to detect new objects in the scene, as well as determine direction of motion for a moving visual object. Results suggest the system is robust enough to allow for the simple detection of new salient stimuli, as well accurately encoding direction of visual motion. Future successes are probable as neuromorphic devices are likely to become faster and smaller in the future, making this system much more feasible. PMID:28792518
The singular nature of auditory and visual scene analysis in autism

PubMed Central

Lin, I.-Fan; Shirama, Aya; Kato, Nobumasa

2017-01-01

Individuals with autism spectrum disorder often have difficulty acquiring relevant auditory and visual information in daily environments, despite not being diagnosed as hearing impaired or having low vision. Resent psychophysical and neurophysiological studies have shown that autistic individuals have highly specific individual differences at various levels of information processing, including feature extraction, automatic grouping and top-down modulation in auditory and visual scene analysis. Comparison of the characteristics of scene analysis between auditory and visual modalities reveals some essential commonalities, which could provide clues about the underlying neural mechanisms. Further progress in this line of research may suggest effective methods for diagnosing and supporting autistic individuals. This article is part of the themed issue ‘Auditory and visual scene analysis'. PMID:28044025
Memory for sound, with an ear toward hearing in complex auditory scenes.

PubMed

Snyder, Joel S; Gregg, Melissa K

2011-10-01

An area of research that has experienced recent growth is the study of memory during perception of simple and complex auditory scenes. These studies have provided important information about how well auditory objects are encoded in memory and how well listeners can notice changes in auditory scenes. These are significant developments because they present an opportunity to better understand how we hear in realistic situations, how higher-level aspects of hearing such as semantics and prior exposure affect perception, and the similarities and differences between auditory perception and perception in other modalities, such as vision and touch. The research also poses exciting challenges for behavioral and neural models of how auditory perception and memory work.
Temporal coherence for pure tones in budgerigars (Melopsittacus undulatus) and humans (Homo sapiens).

PubMed

Neilans, Erikson G; Dent, Micheal L

2015-02-01

Auditory scene analysis has been suggested as a universal process that exists across all animals. Relative to humans, however, little work has been devoted to how animals perceptually isolate different sound sources. Frequency separation of sounds is arguably the most common parameter studied in auditory streaming, but it is not the only factor contributing to how the auditory scene is perceived. Researchers have found that in humans, even at large frequency separations, synchronous tones are heard as a single auditory stream, whereas asynchronous tones with the same frequency separations are perceived as 2 distinct sounds. These findings demonstrate how both the timing and frequency separation of sounds are important for auditory scene analysis. It is unclear how animals, such as budgerigars (Melopsittacus undulatus), perceive synchronous and asynchronous sounds. In this study, budgerigars and humans (Homo sapiens) were tested on their perception of synchronous, asynchronous, and partially overlapping pure tones using the same psychophysical procedures. Species differences were found between budgerigars and humans in how partially overlapping sounds were perceived, with budgerigars more likely to segregate overlapping sounds and humans more apt to fuse the 2 sounds together. The results also illustrated that temporal cues are particularly important for stream segregation of overlapping sounds. Lastly, budgerigars were found to segregate partially overlapping sounds in a manner predicted by computational models of streaming, whereas humans were not. PsycINFO Database Record (c) 2015 APA, all rights reserved.
Audio Visual Integration with Competing Sources in the Framework of Audio Visual Speech Scene Analysis.

PubMed

Ganesh, Attigodu Chandrashekara; Berthommier, Frédéric; Schwartz, Jean-Luc

2016-01-01

We introduce "Audio-Visual Speech Scene Analysis" (AVSSA) as an extension of the two-stage Auditory Scene Analysis model towards audiovisual scenes made of mixtures of speakers. AVSSA assumes that a coherence index between the auditory and the visual input is computed prior to audiovisual fusion, enabling to determine whether the sensory inputs should be bound together. Previous experiments on the modulation of the McGurk effect by audiovisual coherent vs. incoherent contexts presented before the McGurk target have provided experimental evidence supporting AVSSA. Indeed, incoherent contexts appear to decrease the McGurk effect, suggesting that they produce lower audiovisual coherence hence less audiovisual fusion. The present experiments extend the AVSSA paradigm by creating contexts made of competing audiovisual sources and measuring their effect on McGurk targets. The competing audiovisual sources have respectively a high and a low audiovisual coherence (that is, large vs. small audiovisual comodulations in time). The first experiment involves contexts made of two auditory sources and one video source associated to either the first or the second audio source. It appears that the McGurk effect is smaller after the context made of the visual source associated to the auditory source with less audiovisual coherence. In the second experiment with the same stimuli, the participants are asked to attend to either one or the other source. The data show that the modulation of fusion depends on the attentional focus. Altogether, these two experiments shed light on audiovisual binding, the AVSSA process and the role of attention.
Exploration of Behavioral, Physiological, and Computational Approaches to Auditory Scene Analysis

DTIC Science & Technology

2004-01-01

Bronkhorst and R. Plomp, "Effects of multiple speechlike maskers on binaural speech recognitions in normal and impaired listening". Journal of the Acoustical...of simultaneous vowels: cues arising from low frequency beating ". Journal of the Acoustical Society of America. 95: pp. 1559-1569. 1994. [41] C.J...and Hearing Research. 12: pp. 229-245. 1969. [44] T. Doll and T. Hanna, "Directional cueing effects in auditory recognition", in Binaural and

A Model of Auditory-Cognitive Processing and Relevance to Clinical Applicability.

PubMed

Edwards, Brent

2016-01-01

Hearing loss and cognitive function interact in both a bottom-up and top-down relationship. Listening effort is tied to these interactions, and models have been developed to explain their relationship. The Ease of Language Understanding model in particular has gained considerable attention in its explanation of the effect of signal distortion on speech understanding. Signal distortion can also affect auditory scene analysis ability, however, resulting in a distorted auditory scene that can affect cognitive function, listening effort, and the allocation of cognitive resources. These effects are explained through an addition to the Ease of Language Understanding model. This model can be generalized to apply to all sounds, not only speech, representing the increased effort required for auditory environmental awareness and other nonspeech auditory tasks. While the authors have measures of speech understanding and cognitive load to quantify these interactions, they are lacking measures of the effect of hearing aid technology on auditory scene analysis ability and how effort and attention varies with the quality of an auditory scene. Additionally, the clinical relevance of hearing aid technology on cognitive function and the application of cognitive measures in hearing aid fittings will be limited until effectiveness is demonstrated in real-world situations.
A Dual-Process Account of Auditory Change Detection

ERIC Educational Resources Information Center

McAnally, Ken I.; Martin, Russell L.; Eramudugolla, Ranmalee; Stuart, Geoffrey W.; Irvine, Dexter R. F.; Mattingley, Jason B.

2010-01-01

Listeners can be "deaf" to a substantial change in a scene comprising multiple auditory objects unless their attention has been directed to the changed object. It is unclear whether auditory change detection relies on identification of the objects in pre- and post-change scenes. We compared the rates at which listeners correctly identify changed…
Auditory Scene Analysis: The Sweet Music of Ambiguity

PubMed Central

Pressnitzer, Daniel; Suied, Clara; Shamma, Shihab A.

2011-01-01

In this review paper aimed at the non-specialist, we explore the use that neuroscientists and musicians have made of perceptual illusions based on ambiguity. The pivotal issue is auditory scene analysis (ASA), or what enables us to make sense of complex acoustic mixtures in order to follow, for instance, a single melody in the midst of an orchestra. In general, ASA uncovers the most likely physical causes that account for the waveform collected at the ears. However, the acoustical problem is ill-posed and it must be solved from noisy sensory input. Recently, the neural mechanisms implicated in the transformation of ambiguous sensory information into coherent auditory scenes have been investigated using so-called bistability illusions (where an unchanging ambiguous stimulus evokes a succession of distinct percepts in the mind of the listener). After reviewing some of those studies, we turn to music, which arguably provides some of the most complex acoustic scenes that a human listener will ever encounter. Interestingly, musicians will not always aim at making each physical source intelligible, but rather express one or more melodic lines with a small or large number of instruments. By means of a few musical illustrations and by using a computational model inspired by neuro-physiological principles, we suggest that this relies on a detailed (if perhaps implicit) knowledge of the rules of ASA and of its inherent ambiguity. We then put forward the opinion that some degree perceptual ambiguity may participate in our appreciation of music. PMID:22174701
Semantic congruency but not temporal synchrony enhances long-term memory performance for audio-visual scenes.

PubMed

Meyerhoff, Hauke S; Huff, Markus

2016-04-01

Human long-term memory for visual objects and scenes is tremendous. Here, we test how auditory information contributes to long-term memory performance for realistic scenes. In a total of six experiments, we manipulated the presentation modality (auditory, visual, audio-visual) as well as semantic congruency and temporal synchrony between auditory and visual information of brief filmic clips. Our results show that audio-visual clips generally elicit more accurate memory performance than unimodal clips. This advantage even increases with congruent visual and auditory information. However, violations of audio-visual synchrony hardly have any influence on memory performance. Memory performance remained intact even with a sequential presentation of auditory and visual information, but finally declined when the matching tracks of one scene were presented separately with intervening tracks during learning. With respect to memory performance, our results therefore show that audio-visual integration is sensitive to semantic congruency but remarkably robust against asymmetries between different modalities.
The Incongruency Advantage for Environmental Sounds Presented in Natural Auditory Scenes

ERIC Educational Resources Information Center

Gygi, Brian; Shafiro, Valeriy

2011-01-01

The effect of context on the identification of common environmental sounds (e.g., dogs barking or cars honking) was tested by embedding them in familiar auditory background scenes (street ambience, restaurants). Initial results with subjects trained on both the scenes and the sounds to be identified showed a significant advantage of about five…
Auditory and Cognitive Effects of Aging on Perception of Environmental Sounds in Natural Auditory Scenes

ERIC Educational Resources Information Center

Gygi, Brian; Shafiro, Valeriy

2013-01-01

Purpose: Previously, Gygi and Shafiro (2011) found that when environmental sounds are semantically incongruent with the background scene (e.g., horse galloping in a restaurant), they can be identified more accurately by young normal-hearing listeners (YNH) than sounds congruent with the scene (e.g., horse galloping at a racetrack). This study…
Using the structure of natural scenes and sounds to predict neural response properties in the brain

NASA Astrophysics Data System (ADS)

Deweese, Michael

2014-03-01

The natural scenes and sounds we encounter in the world are highly structured. The fact that animals and humans are so efficient at processing these sensory signals compared with the latest algorithms running on the fastest modern computers suggests that our brains can exploit this structure. We have developed a sparse mathematical representation of speech that minimizes the number of active model neurons needed to represent typical speech sounds. The model learns several well-known acoustic features of speech such as harmonic stacks, formants, onsets and terminations, but we also find more exotic structures in the spectrogra representation of sound such as localized checkerboard patterns and frequency-modulated excitatory subregions flanked by suppressive sidebands. Moreover, several of these novel features resemble neuronal receptive fields reported in the Inferior Colliculus (IC), as well as auditory thalamus (MGBv) and primary auditory cortex (A1), and our model neurons exhibit the same tradeoff in spectrotemporal resolution as has been observed in IC. To our knowledge, this is the first demonstration that receptive fields of neurons in the ascending mammalian auditory pathway beyond the auditory nerve can be predicted based on coding principles and the statistical properties of recorded sounds. We have also developed a biologically-inspired neural network model of primary visual cortex (V1) that can learn a sparse representation of natural scenes using spiking neurons and strictly local plasticity rules. The representation learned by our model is in good agreement with measured receptive fields in V1, demonstrating that sparse sensory coding can be achieved in a realistic biological setting.
Acoustic and higher-level representations of naturalistic auditory scenes in human auditory and frontal cortex.

PubMed

Hausfeld, Lars; Riecke, Lars; Formisano, Elia

2018-06-01

Often, in everyday life, we encounter auditory scenes comprising multiple simultaneous sounds and succeed to selectively attend to only one sound, typically the most relevant for ongoing behavior. Studies using basic sounds and two-talker stimuli have shown that auditory selective attention aids this by enhancing the neural representations of the attended sound in auditory cortex. It remains unknown, however, whether and how this selective attention mechanism operates on representations of auditory scenes containing natural sounds of different categories. In this high-field fMRI study we presented participants with simultaneous voices and musical instruments while manipulating their focus of attention. We found an attentional enhancement of neural sound representations in temporal cortex - as defined by spatial activation patterns - at locations that depended on the attended category (i.e., voices or instruments). In contrast, we found that in frontal cortex the site of enhancement was independent of the attended category and the same regions could flexibly represent any attended sound regardless of its category. These results are relevant to elucidate the interacting mechanisms of bottom-up and top-down processing when listening to real-life scenes comprised of multiple sound categories. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Auditory conflict and congruence in frontotemporal dementia.

PubMed

Clark, Camilla N; Nicholas, Jennifer M; Agustus, Jennifer L; Hardy, Christopher J D; Russell, Lucy L; Brotherhood, Emilie V; Dick, Katrina M; Marshall, Charles R; Mummery, Catherine J; Rohrer, Jonathan D; Warren, Jason D

2017-09-01

Impaired analysis of signal conflict and congruence may contribute to diverse socio-emotional symptoms in frontotemporal dementias, however the underlying mechanisms have not been defined. Here we addressed this issue in patients with behavioural variant frontotemporal dementia (bvFTD; n = 19) and semantic dementia (SD; n = 10) relative to healthy older individuals (n = 20). We created auditory scenes in which semantic and emotional congruity of constituent sounds were independently probed; associated tasks controlled for auditory perceptual similarity, scene parsing and semantic competence. Neuroanatomical correlates of auditory congruity processing were assessed using voxel-based morphometry. Relative to healthy controls, both the bvFTD and SD groups had impaired semantic and emotional congruity processing (after taking auditory control task performance into account) and reduced affective integration of sounds into scenes. Grey matter correlates of auditory semantic congruity processing were identified in distributed regions encompassing prefrontal, parieto-temporal and insular areas and correlates of auditory emotional congruity in partly overlapping temporal, insular and striatal regions. Our findings suggest that decoding of auditory signal relatedness may probe a generic cognitive mechanism and neural architecture underpinning frontotemporal dementia syndromes. Copyright © 2017 The Author(s). Published by Elsevier Ltd.. All rights reserved.
Auditory Scene Analysis: An Attention Perspective

ERIC Educational Resources Information Center

Sussman, Elyse S.

2017-01-01

Purpose: This review article provides a new perspective on the role of attention in auditory scene analysis. Method: A framework for understanding how attention interacts with stimulus-driven processes to facilitate task goals is presented. Previously reported data obtained through behavioral and electrophysiological measures in adults with normal…
Evidence for Neural Computations of Temporal Coherence in an Auditory Scene and Their Enhancement during Active Listening.

PubMed

O'Sullivan, James A; Shamma, Shihab A; Lalor, Edmund C

2015-05-06

The human brain has evolved to operate effectively in highly complex acoustic environments, segregating multiple sound sources into perceptually distinct auditory objects. A recent theory seeks to explain this ability by arguing that stream segregation occurs primarily due to the temporal coherence of the neural populations that encode the various features of an individual acoustic source. This theory has received support from both psychoacoustic and functional magnetic resonance imaging (fMRI) studies that use stimuli which model complex acoustic environments. Termed stochastic figure-ground (SFG) stimuli, they are composed of a "figure" and background that overlap in spectrotemporal space, such that the only way to segregate the figure is by computing the coherence of its frequency components over time. Here, we extend these psychoacoustic and fMRI findings by using the greater temporal resolution of electroencephalography to investigate the neural computation of temporal coherence. We present subjects with modified SFG stimuli wherein the temporal coherence of the figure is modulated stochastically over time, which allows us to use linear regression methods to extract a signature of the neural processing of this temporal coherence. We do this under both active and passive listening conditions. Our findings show an early effect of coherence during passive listening, lasting from ∼115 to 185 ms post-stimulus. When subjects are actively listening to the stimuli, these responses are larger and last longer, up to ∼265 ms. These findings provide evidence for early and preattentive neural computations of temporal coherence that are enhanced by active analysis of an auditory scene. Copyright © 2015 the authors 0270-6474/15/357256-08$15.00/0.
Effect of a concurrent auditory task on visual search performance in a driving-related image-flicker task.

PubMed

Richard, Christian M; Wright, Richard D; Ee, Cheryl; Prime, Steven L; Shimizu, Yujiro; Vavrik, John

2002-01-01

The effect of a concurrent auditory task on visual search was investigated using an image-flicker technique. Participants were undergraduate university students with normal or corrected-to-normal vision who searched for changes in images of driving scenes that involved either driving-related (e.g., traffic light) or driving-unrelated (e.g., mailbox) scene elements. The results indicated that response times were significantly slower if the search was accompanied by a concurrent auditory task. In addition, slower overall responses to scenes involving driving-unrelated changes suggest that the underlying process affected by the concurrent auditory task is strategic in nature. These results were interpreted in terms of their implications for using a cellular telephone while driving. Actual or potential applications of this research include the development of safer in-vehicle communication devices.
Effects of capacity limits, memory loss, and sound type in change deafness.

PubMed

Gregg, Melissa K; Irsik, Vanessa C; Snyder, Joel S

2017-11-01

Change deafness, the inability to notice changes to auditory scenes, has the potential to provide insights about sound perception in busy situations typical of everyday life. We determined the extent to which change deafness to sounds is due to the capacity of processing multiple sounds and the loss of memory for sounds over time. We also determined whether these processing limitations work differently for varying types of sounds within a scene. Auditory scenes composed of naturalistic sounds, spectrally dynamic unrecognizable sounds, tones, and noise rhythms were presented in a change-detection task. On each trial, two scenes were presented that were same or different. We manipulated the number of sounds within each scene to measure memory capacity and the silent interval between scenes to measure memory loss. For all sounds, change detection was worse as scene size increased, demonstrating the importance of capacity limits. Change detection to the natural sounds did not deteriorate much as the interval between scenes increased up to 2,000 ms, but it did deteriorate substantially with longer intervals. For artificial sounds, in contrast, change-detection performance suffered even for very short intervals. The results suggest that change detection is generally limited by capacity, regardless of sound type, but that auditory memory is more enduring for sounds with naturalistic acoustic structures.
Computational Modeling of Age-Differences In a Visually Demanding Driving Task: Vehicle Detection

DTIC Science & Technology

1997-10-07

overall estimate of d’ for each scene was calculated from the two levels using the method described in MacMillan and Creelman [13]. MODELING VEHICLE...Scialfa, "Visual and auditory aging," In J. Birren & K. W. Schaie (Eds.) Handbook of the Psychology of Aging (4th edition), 1996, New York: Academic...Computational models of Visual Processing, 1991, Boston MA: MIT Press. [13] N. A. MacMillan & C. D. Creelman , Detection Theory: A User’s Guide, 1991
The Incongruency Advantage for Environmental Sounds Presented in Natural Auditory Scenes

PubMed Central

Gygi, Brian; Shafiro, Valeriy

2011-01-01

The effect of context on the identification of common environmental sounds (e.g., dogs barking or cars honking) was tested by embedding them in familiar auditory background scenes (street ambience, restaurants). Initial results with subjects trained on both the scenes and the sounds to be identified showed a significant advantage of about 5 percentage points better accuracy for sounds that were contextually incongruous with the background scene (e.g., a rooster crowing in a hospital). Further studies with naïve (untrained) listeners showed that this Incongruency Advantage (IA) is level-dependent: there is no advantage for incongruent sounds lower than a Sound/Scene ratio (So/Sc) of −7.5 dB, but there is about 5 percentage points better accuracy for sounds with greater So/Sc. Testing a new group of trained listeners on a larger corpus of sounds and scenes showed that the effect is robust and not confined to specific stimulus set. Modeling using spectral-temporal measures showed that neither analyses based on acoustic features, nor semantic assessments of sound-scene congruency can account for this difference, indicating the Incongruency Advantage is a complex effect, possibly arising from the sensitivity of the auditory system to new and unexpected events, under particular listening conditions. PMID:21355664
Scanning silence: mental imagery of complex sounds.

PubMed

Bunzeck, Nico; Wuestenberg, Torsten; Lutz, Kai; Heinze, Hans-Jochen; Jancke, Lutz

2005-07-15

In this functional magnetic resonance imaging (fMRI) study, we investigated the neural basis of mental auditory imagery of familiar complex sounds that did not contain language or music. In the first condition (perception), the subjects watched familiar scenes and listened to the corresponding sounds that were presented simultaneously. In the second condition (imagery), the same scenes were presented silently and the subjects had to mentally imagine the appropriate sounds. During the third condition (control), the participants watched a scrambled version of the scenes without sound. To overcome the disadvantages of the stray acoustic scanner noise in auditory fMRI experiments, we applied sparse temporal sampling technique with five functional clusters that were acquired at the end of each movie presentation. Compared to the control condition, we found bilateral activations in the primary and secondary auditory cortices (including Heschl's gyrus and planum temporale) during perception of complex sounds. In contrast, the imagery condition elicited bilateral hemodynamic responses only in the secondary auditory cortex (including the planum temporale). No significant activity was observed in the primary auditory cortex. The results show that imagery and perception of complex sounds that do not contain language or music rely on overlapping neural correlates of the secondary but not primary auditory cortex.
Integration and segregation in auditory scene analysis

NASA Astrophysics Data System (ADS)

Sussman, Elyse S.

2005-03-01

Assessment of the neural correlates of auditory scene analysis, using an index of sound change detection that does not require the listener to attend to the sounds [a component of event-related brain potentials called the mismatch negativity (MMN)], has previously demonstrated that segregation processes can occur without attention focused on the sounds and that within-stream contextual factors influence how sound elements are integrated and represented in auditory memory. The current study investigated the relationship between the segregation and integration processes when they were called upon to function together. The pattern of MMN results showed that the integration of sound elements within a sound stream occurred after the segregation of sounds into independent streams and, further, that the individual streams were subject to contextual effects. These results are consistent with a view of auditory processing that suggests that the auditory scene is rapidly organized into distinct streams and the integration of sequential elements to perceptual units takes place on the already formed streams. This would allow for the flexibility required to identify changing within-stream sound patterns, needed to appreciate music or comprehend speech..
The Perception of Concurrent Sound Objects in Harmonic Complexes Impairs Gap Detection

ERIC Educational Resources Information Center

Leung, Ada W. S.; Jolicoeur, Pierre; Vachon, Francois; Alain, Claude

2011-01-01

Since the introduction of the concept of auditory scene analysis, there has been a paucity of work focusing on the theoretical explanation of how attention is allocated within a complex auditory scene. Here we examined signal detection in situations that promote either the fusion of tonal elements into a single sound object or the segregation of a…
Children Use Object-Level Category Knowledge to Detect Changes in Complex Auditory Scenes

ERIC Educational Resources Information Center

Vanden Bosch der Nederlanden, Christina M.; Snyder, Joel S.; Hannon, Erin E.

2016-01-01

Children interact with and learn about all types of sound sources, including dogs, bells, trains, and human beings. Although it is clear that knowledge of semantic categories for everyday sights and sounds develops during childhood, there are very few studies examining how children use this knowledge to make sense of auditory scenes. We used a…
Segregating the neural correlates of physical and perceived change in auditory input using the change deafness effect.

PubMed

Puschmann, Sebastian; Weerda, Riklef; Klump, Georg; Thiel, Christiane M

2013-05-01

Psychophysical experiments show that auditory change detection can be disturbed in situations in which listeners have to monitor complex auditory input. We made use of this change deafness effect to segregate the neural correlates of physical change in auditory input from brain responses related to conscious change perception in an fMRI experiment. Participants listened to two successively presented complex auditory scenes, which consisted of six auditory streams, and had to decide whether scenes were identical or whether the frequency of one stream was changed between presentations. Our results show that physical changes in auditory input, independent of successful change detection, are represented at the level of auditory cortex. Activations related to conscious change perception, independent of physical change, were found in the insula and the ACC. Moreover, our data provide evidence for significant effective connectivity between auditory cortex and the insula in the case of correctly detected auditory changes, but not for missed changes. This underlines the importance of the insula/anterior cingulate network for conscious change detection.

Attention, Awareness, and the Perception of Auditory Scenes

PubMed Central

Snyder, Joel S.; Gregg, Melissa K.; Weintraub, David M.; Alain, Claude

2011-01-01

Auditory perception and cognition entails both low-level and high-level processes, which are likely to interact with each other to create our rich conscious experience of soundscapes. Recent research that we review has revealed numerous influences of high-level factors, such as attention, intention, and prior experience, on conscious auditory perception. And recently, studies have shown that auditory scene analysis tasks can exhibit multistability in a manner very similar to ambiguous visual stimuli, presenting a unique opportunity to study neural correlates of auditory awareness and the extent to which mechanisms of perception are shared across sensory modalities. Research has also led to a growing number of techniques through which auditory perception can be manipulated and even completely suppressed. Such findings have important consequences for our understanding of the mechanisms of perception and also should allow scientists to precisely distinguish the influences of different higher-level influences. PMID:22347201
Using auditory pre-information to solve the cocktail-party problem: electrophysiological evidence for age-specific differences.

PubMed

Getzmann, Stephan; Lewald, Jörg; Falkenstein, Michael

2014-01-01

Speech understanding in complex and dynamic listening environments requires (a) auditory scene analysis, namely auditory object formation and segregation, and (b) allocation of the attentional focus to the talker of interest. There is evidence that pre-information is actively used to facilitate these two aspects of the so-called "cocktail-party" problem. Here, a simulated multi-talker scenario was combined with electroencephalography to study scene analysis and allocation of attention in young and middle-aged adults. Sequences of short words (combinations of brief company names and stock-price values) from four talkers at different locations were simultaneously presented, and the detection of target names and the discrimination between critical target values were assessed. Immediately prior to speech sequences, auditory pre-information was provided via cues that either prepared auditory scene analysis or attentional focusing, or non-specific pre-information was given. While performance was generally better in younger than older participants, both age groups benefited from auditory pre-information. The analysis of the cue-related event-related potentials revealed age-specific differences in the use of pre-cues: Younger adults showed a pronounced N2 component, suggesting early inhibition of concurrent speech stimuli; older adults exhibited a stronger late P3 component, suggesting increased resource allocation to process the pre-information. In sum, the results argue for an age-specific utilization of auditory pre-information to improve listening in complex dynamic auditory environments.
Using auditory pre-information to solve the cocktail-party problem: electrophysiological evidence for age-specific differences

PubMed Central

Getzmann, Stephan; Lewald, Jörg; Falkenstein, Michael

2014-01-01

Speech understanding in complex and dynamic listening environments requires (a) auditory scene analysis, namely auditory object formation and segregation, and (b) allocation of the attentional focus to the talker of interest. There is evidence that pre-information is actively used to facilitate these two aspects of the so-called “cocktail-party” problem. Here, a simulated multi-talker scenario was combined with electroencephalography to study scene analysis and allocation of attention in young and middle-aged adults. Sequences of short words (combinations of brief company names and stock-price values) from four talkers at different locations were simultaneously presented, and the detection of target names and the discrimination between critical target values were assessed. Immediately prior to speech sequences, auditory pre-information was provided via cues that either prepared auditory scene analysis or attentional focusing, or non-specific pre-information was given. While performance was generally better in younger than older participants, both age groups benefited from auditory pre-information. The analysis of the cue-related event-related potentials revealed age-specific differences in the use of pre-cues: Younger adults showed a pronounced N2 component, suggesting early inhibition of concurrent speech stimuli; older adults exhibited a stronger late P3 component, suggesting increased resource allocation to process the pre-information. In sum, the results argue for an age-specific utilization of auditory pre-information to improve listening in complex dynamic auditory environments. PMID:25540608
The Central Auditory Processing Kit[TM]. Book 1: Auditory Memory [and] Book 2: Auditory Discrimination, Auditory Closure, and Auditory Synthesis [and] Book 3: Auditory Figure-Ground, Auditory Cohesion, Auditory Binaural Integration, and Compensatory Strategies.

ERIC Educational Resources Information Center

Mokhemar, Mary Ann

This kit for assessing central auditory processing disorders (CAPD), in children in grades 1 through 8 includes 3 books, 14 full-color cards with picture scenes, and a card depicting a phone key pad, all contained in a sturdy carrying case. The units in each of the three books correspond with auditory skill areas most commonly addressed in…
How is visual salience computed in the brain? Insights from behaviour, neurobiology and modelling

PubMed Central

Veale, Richard; Hafed, Ziad M.

2017-01-01

Inherent in visual scene analysis is a bottleneck associated with the need to sequentially sample locations with foveating eye movements. The concept of a ‘saliency map’ topographically encoding stimulus conspicuity over the visual scene has proven to be an efficient predictor of eye movements. Our work reviews insights into the neurobiological implementation of visual salience computation. We start by summarizing the role that different visual brain areas play in salience computation, whether at the level of feature analysis for bottom-up salience or at the level of goal-directed priority maps for output behaviour. We then delve into how a subcortical structure, the superior colliculus (SC), participates in salience computation. The SC represents a visual saliency map via a centre-surround inhibition mechanism in the superficial layers, which feeds into priority selection mechanisms in the deeper layers, thereby affecting saccadic and microsaccadic eye movements. Lateral interactions in the local SC circuit are particularly important for controlling active populations of neurons. This, in turn, might help explain long-range effects, such as those of peripheral cues on tiny microsaccades. Finally, we show how a combination of in vitro neurophysiology and large-scale computational modelling is able to clarify how salience computation is implemented in the local circuit of the SC. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044023
Left Superior Temporal Gyrus Is Coupled to Attended Speech in a Cocktail-Party Auditory Scene.

PubMed

Vander Ghinst, Marc; Bourguignon, Mathieu; Op de Beeck, Marc; Wens, Vincent; Marty, Brice; Hassid, Sergio; Choufani, Georges; Jousmäki, Veikko; Hari, Riitta; Van Bogaert, Patrick; Goldman, Serge; De Tiège, Xavier

2016-02-03

Using a continuous listening task, we evaluated the coupling between the listener's cortical activity and the temporal envelopes of different sounds in a multitalker auditory scene using magnetoencephalography and corticovocal coherence analysis. Neuromagnetic signals were recorded from 20 right-handed healthy adult humans who listened to five different recorded stories (attended speech streams), one without any multitalker background (No noise) and four mixed with a "cocktail party" multitalker background noise at four signal-to-noise ratios (5, 0, -5, and -10 dB) to produce speech-in-noise mixtures, here referred to as Global scene. Coherence analysis revealed that the modulations of the attended speech stream, presented without multitalker background, were coupled at ∼0.5 Hz to the activity of both superior temporal gyri, whereas the modulations at 4-8 Hz were coupled to the activity of the right supratemporal auditory cortex. In cocktail party conditions, with the multitalker background noise, the coupling was at both frequencies stronger for the attended speech stream than for the unattended Multitalker background. The coupling strengths decreased as the Multitalker background increased. During the cocktail party conditions, the ∼0.5 Hz coupling became left-hemisphere dominant, compared with bilateral coupling without the multitalker background, whereas the 4-8 Hz coupling remained right-hemisphere lateralized in both conditions. The brain activity was not coupled to the multitalker background or to its individual talkers. The results highlight the key role of listener's left superior temporal gyri in extracting the slow ∼0.5 Hz modulations, likely reflecting the attended speech stream within a multitalker auditory scene. When people listen to one person in a "cocktail party," their auditory cortex mainly follows the attended speech stream rather than the entire auditory scene. However, how the brain extracts the attended speech stream from the whole auditory scene and how increasing background noise corrupts this process is still debated. In this magnetoencephalography study, subjects had to attend a speech stream with or without multitalker background noise. Results argue for frequency-dependent cortical tracking mechanisms for the attended speech stream. The left superior temporal gyrus tracked the ∼0.5 Hz modulations of the attended speech stream only when the speech was embedded in multitalker background, whereas the right supratemporal auditory cortex tracked 4-8 Hz modulations during both noiseless and cocktail-party conditions. Copyright © 2016 the authors 0270-6474/16/361597-11$15.00/0.
The role of temporal structure in the investigation of sensory memory, auditory scene analysis, and speech perception: a healthy-aging perspective.

PubMed

Rimmele, Johanna Maria; Sussman, Elyse; Poeppel, David

2015-02-01

Listening situations with multiple talkers or background noise are common in everyday communication and are particularly demanding for older adults. Here we review current research on auditory perception in aging individuals in order to gain insights into the challenges of listening under noisy conditions. Informationally rich temporal structure in auditory signals--over a range of time scales from milliseconds to seconds--renders temporal processing central to perception in the auditory domain. We discuss the role of temporal structure in auditory processing, in particular from a perspective relevant for hearing in background noise, and focusing on sensory memory, auditory scene analysis, and speech perception. Interestingly, these auditory processes, usually studied in an independent manner, show considerable overlap of processing time scales, even though each has its own 'privileged' temporal regimes. By integrating perspectives on temporal structure processing in these three areas of investigation, we aim to highlight similarities typically not recognized. Copyright © 2014 Elsevier B.V. All rights reserved.
The role of temporal structure in the investigation of sensory memory, auditory scene analysis, and speech perception: A healthy-aging perspective

PubMed Central

Rimmele, Johanna Maria; Sussman, Elyse; Poeppel, David

2014-01-01

Listening situations with multiple talkers or background noise are common in everyday communication and are particularly demanding for older adults. Here we review current research on auditory perception in aging individuals in order to gain insights into the challenges of listening under noisy conditions. Informationally rich temporal structure in auditory signals - over a range of time scales from milliseconds to seconds - renders temporal processing central to perception in the auditory domain. We discuss the role of temporal structure in auditory processing, in particular from a perspective relevant for hearing in background noise, and focusing on sensory memory, auditory scene analysis, and speech perception. Interestingly, these auditory processes, usually studied in an independent manner, show considerable overlap of processing time scales, even though each has its own ‚privileged‘ temporal regimes. By integrating perspectives on temporal structure processing in these three areas of investigation, we aim to highlight similarities typically not recognized. PMID:24956028
The auditory scene: an fMRI study on melody and accompaniment in professional pianists.

PubMed

Spada, Danilo; Verga, Laura; Iadanza, Antonella; Tettamanti, Marco; Perani, Daniela

2014-11-15

The auditory scene is a mental representation of individual sounds extracted from the summed sound waveform reaching the ears of the listeners. Musical contexts represent particularly complex cases of auditory scenes. In such a scenario, melody may be seen as the main object moving on a background represented by the accompaniment. Both melody and accompaniment vary in time according to harmonic rules, forming a typical texture with melody in the most prominent, salient voice. In the present sparse acquisition functional magnetic resonance imaging study, we investigated the interplay between melody and accompaniment in trained pianists, by observing the activation responses elicited by processing: (1) melody placed in the upper and lower texture voices, leading to, respectively, a higher and lower auditory salience; (2) harmonic violations occurring in either the melody, the accompaniment, or both. The results indicated that the neural activation elicited by the processing of polyphonic compositions in expert musicians depends upon the upper versus lower position of the melodic line in the texture, and showed an overall greater activation for the harmonic processing of melody over accompaniment. Both these two predominant effects were characterized by the involvement of the posterior cingulate cortex and precuneus, among other associative brain regions. We discuss the prominent role of the posterior medial cortex in the processing of melodic and harmonic information in the auditory stream, and propose to frame this processing in relation to the cognitive construction of complex multimodal sensory imagery scenes. Copyright © 2014 Elsevier Inc. All rights reserved.
Stochastic correlative firing for figure-ground segregation.

PubMed

Chen, Zhe

2005-03-01

Segregation of sensory inputs into separate objects is a central aspect of perception and arises in all sensory modalities. The figure-ground segregation problem requires identifying an object of interest in a complex scene, in many cases given binaural auditory or binocular visual observations. The computations required for visual and auditory figure-ground segregation share many common features and can be cast within a unified framework. Sensory perception can be viewed as a problem of optimizing information transmission. Here we suggest a stochastic correlative firing mechanism and an associative learning rule for figure-ground segregation in several classic sensory perception tasks, including the cocktail party problem in binaural hearing, binocular fusion of stereo images, and Gestalt grouping in motion perception.
Modification of computational auditory scene analysis (CASA) for noise-robust acoustic feature

NASA Astrophysics Data System (ADS)

Kwon, Minseok

While there have been many attempts to mitigate interferences of background noise, the performance of automatic speech recognition (ASR) still can be deteriorated by various factors with ease. However, normal hearing listeners can accurately perceive sounds of their interests, which is believed to be a result of Auditory Scene Analysis (ASA). As a first attempt, the simulation of the human auditory processing, called computational auditory scene analysis (CASA), was fulfilled through physiological and psychological investigations of ASA. CASA comprised of Zilany-Bruce auditory model, followed by tracking fundamental frequency for voice segmentation and detecting pairs of onset/offset at each characteristic frequency (CF) for unvoiced segmentation. The resulting Time-Frequency (T-F) representation of acoustic stimulation was converted into acoustic feature, gammachirp-tone frequency cepstral coefficients (GFCC). 11 keywords with various environmental conditions are used and the robustness of GFCC was evaluated by spectral distance (SD) and dynamic time warping distance (DTW). In "clean" and "noisy" conditions, the application of CASA generally improved noise robustness of the acoustic feature compared to a conventional method with or without noise suppression using MMSE estimator. The intial study, however, not only showed the noise-type dependency at low SNR, but also called the evaluation methods in question. Some modifications were made to capture better spectral continuity from an acoustic feature matrix, to obtain faster processing speed, and to describe the human auditory system more precisely. The proposed framework includes: 1) multi-scale integration to capture more accurate continuity in feature extraction, 2) contrast enhancement (CE) of each CF by competition with neighboring frequency bands, and 3) auditory model modifications. The model modifications contain the introduction of higher Q factor, middle ear filter more analogous to human auditory system, the regulation of time constant update for filters in signal/control path as well as level-independent frequency glides with fixed frequency modulation. First, we scrutinized performance development in keyword recognition using the proposed methods in quiet and noise-corrupted environments. The results argue that multi-scale integration should be used along with CE in order to avoid ambiguous continuity in unvoiced segments. Moreover, the inclusion of the all modifications was observed to guarantee the noise-type-independent robustness particularly with severe interference. Moreover, the CASA with the auditory model was implemented into a single/dual-channel ASR using reference TIMIT corpus so as to get more general result. Hidden Markov model (HTK) toolkit was used for phone recognition in various environmental conditions. In a single-channel ASR, the results argue that unmasked acoustic features (unmasked GFCC) should combine with target estimates from the mask to compensate for missing information. From the observation of a dual-channel ASR, the combined GFCC guarantees the highest performance regardless of interferences within speech. Moreover, consistent improvement of noise robustness by GFCC (unmasked or combined) shows the validity of our proposed CASA implementation in dual microphone system. In conclusion, the proposed framework proves the robustness of the acoustic features in various background interferences via both direct distance evaluation and statistical assessment. In addition, the introduction of dual microphone system using the framework in this study shows the potential of the effective implementation of the auditory model-based CASA in ASR.
Auditory Memory Distortion for Spoken Prose

PubMed Central

Hutchison, Joanna L.; Hubbard, Timothy L.; Ferrandino, Blaise; Brigante, Ryan; Wright, Jamie M.; Rypma, Bart

2013-01-01

Observers often remember a scene as containing information that was not presented but that would have likely been located just beyond the observed boundaries of the scene. This effect is called boundary extension (BE; e.g., Intraub & Richardson, 1989). Previous studies have observed BE in memory for visual and haptic stimuli, and the present experiments examined whether BE occurred in memory for auditory stimuli (prose, music). Experiments 1 and 2 varied the amount of auditory content to be remembered. BE was not observed, but when auditory targets contained more content, boundary restriction (BR) occurred. Experiment 3 presented auditory stimuli with less content and BR also occurred. In Experiment 4, white noise was added to stimuli with less content to equalize the durations of auditory stimuli, and BR still occurred. Experiments 5 and 6 presented trained stories and popular music, and BR still occurred. This latter finding ruled out the hypothesis that the lack of BE in Experiments 1–4 reflected a lack of familiarity with the stimuli. Overall, memory for auditory content exhibited BR rather than BE, and this pattern was stronger if auditory stimuli contained more content. Implications for the understanding of general perceptual processing and directions for future research are discussed. PMID:22612172
Isolating the Energetic Component of Speech-on-Speech Masking With Ideal Time-Frequency Segregation

DTIC Science & Technology

2006-12-01

Auditory Scene Analysis MIT Press, Cambridge, MA. Bronkhorst, A., and Plomp, R. 1992. “Effects of multiple speechlike maskers on binaural speech...C. J. 1994. “Perception and computational sepa- ration of simultaneous vowels: Cues arising from low frequency beating ,” J. Acoust. Soc. Am. 95...Litovsky, R., and Culling, J. 2004. “The benefit of binaural hearing in a cocktail party: Effects of location and type of interferer,” J. Acoust. Soc
Sensory Substitution: The Spatial Updating of Auditory Scenes "Mimics" the Spatial Updating of Visual Scenes.

PubMed

Pasqualotto, Achille; Esenkaya, Tayfun

2016-01-01

Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or "soundscapes". Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications to improve training methods with sensory substitution devices (SSD).
The capture and recreation of 3D auditory scenes

NASA Astrophysics Data System (ADS)

Li, Zhiyun

The main goal of this research is to develop the theory and implement practical tools (in both software and hardware) for the capture and recreation of 3D auditory scenes. Our research is expected to have applications in virtual reality, telepresence, film, music, video games, auditory user interfaces, and sound-based surveillance. The first part of our research is concerned with sound capture via a spherical microphone array. The advantage of this array is that it can be steered into any 3D directions digitally with the same beampattern. We develop design methodologies to achieve flexible microphone layouts, optimal beampattern approximation and robustness constraint. We also design novel hemispherical and circular microphone array layouts for more spatially constrained auditory scenes. Using the captured audio, we then propose a unified and simple approach for recreating them by exploring the reciprocity principle that is satisfied between the two processes. Our approach makes the system easy to build, and practical. Using this approach, we can capture the 3D sound field by a spherical microphone array and recreate it using a spherical loudspeaker array, and ensure that the recreated sound field matches the recorded field up to a high order of spherical harmonics. For some regular or semi-regular microphone layouts, we design an efficient parallel implementation of the multi-directional spherical beamformer by using the rotational symmetries of the beampattern and of the spherical microphone array. This can be implemented in either software or hardware and easily adapted for other regular or semi-regular layouts of microphones. In addition, we extend this approach for headphone-based system. Design examples and simulation results are presented to verify our algorithms. Prototypes are built and tested in real-world auditory scenes.
Activity in Human Auditory Cortex Represents Spatial Separation Between Concurrent Sounds.

PubMed

Shiell, Martha M; Hausfeld, Lars; Formisano, Elia

2018-05-23

The primary and posterior auditory cortex (AC) are known for their sensitivity to spatial information, but how this information is processed is not yet understood. AC that is sensitive to spatial manipulations is also modulated by the number of auditory streams present in a scene (Smith et al., 2010), suggesting that spatial and nonspatial cues are integrated for stream segregation. We reasoned that, if this is the case, then it is the distance between sounds rather than their absolute positions that is essential. To test this hypothesis, we measured human brain activity in response to spatially separated concurrent sounds with fMRI at 7 tesla in five men and five women. Stimuli were spatialized amplitude-modulated broadband noises recorded for each participant via in-ear microphones before scanning. Using a linear support vector machine classifier, we investigated whether sound location and/or location plus spatial separation between sounds could be decoded from the activity in Heschl's gyrus and the planum temporale. The classifier was successful only when comparing patterns associated with the conditions that had the largest difference in perceptual spatial separation. Our pattern of results suggests that the representation of spatial separation is not merely the combination of single locations, but rather is an independent feature of the auditory scene. SIGNIFICANCE STATEMENT Often, when we think of auditory spatial information, we think of where sounds are coming from-that is, the process of localization. However, this information can also be used in scene analysis, the process of grouping and segregating features of a soundwave into objects. Essentially, when sounds are further apart, they are more likely to be segregated into separate streams. Here, we provide evidence that activity in the human auditory cortex represents the spatial separation between sounds rather than their absolute locations, indicating that scene analysis and localization processes may be independent. Copyright © 2018 the authors 0270-6474/18/384977-08$15.00/0.
Comparing perceived auditory width to the visual image of a performing ensemble in contrasting bi-modal environmentsa)

PubMed Central

Valente, Daniel L.; Braasch, Jonas; Myrbeck, Shane A.

2012-01-01

Despite many studies investigating auditory spatial impressions in rooms, few have addressed the impact of simultaneous visual cues on localization and the perception of spaciousness. The current research presents an immersive audiovisual environment in which participants were instructed to make auditory width judgments in dynamic bi-modal settings. The results of these psychophysical tests suggest the importance of congruent audio visual presentation to the ecological interpretation of an auditory scene. Supporting data were accumulated in five rooms of ascending volumes and varying reverberation times. Participants were given an audiovisual matching test in which they were instructed to pan the auditory width of a performing ensemble to a varying set of audio and visual cues in rooms. Results show that both auditory and visual factors affect the collected responses and that the two sensory modalities coincide in distinct interactions. The greatest differences between the panned audio stimuli given a fixed visual width were found in the physical space with the largest volume and the greatest source distance. These results suggest, in this specific instance, a predominance of auditory cues in the spatial analysis of the bi-modal scene. PMID:22280585
Evolutionary conservation and neuronal mechanisms of auditory perceptual restoration.

PubMed

Petkov, Christopher I; Sutter, Mitchell L

2011-01-01

Auditory perceptual 'restoration' occurs when the auditory system restores an occluded or masked sound of interest. Behavioral work on auditory restoration in humans began over 50 years ago using it to model a noisy environmental scene with competing sounds. It has become clear that not only humans experience auditory restoration: restoration has been broadly conserved in many species. Behavioral studies in humans and animals provide a necessary foundation to link the insights being obtained from human EEG and fMRI to those from animal neurophysiology. The aggregate of data resulting from multiple approaches across species has begun to clarify the neuronal bases of auditory restoration. Different types of neural responses supporting restoration have been found, supportive of multiple mechanisms working within a species. Yet a general principle has emerged that responses correlated with restoration mimic the response that would have been given to the uninterrupted sound of interest. Using the same technology to study different species will help us to better harness animal models of 'auditory scene analysis' to clarify the conserved neural mechanisms shaping the perceptual organization of sound and to advance strategies to improve hearing in natural environmental settings. © 2010 Elsevier B.V. All rights reserved.
Concurrent auditory perception difficulties in older adults with right hemisphere cerebrovascular accident.

PubMed

Talebi, Hossein; Moossavi, Abdollah; Faghihzadeh, Soghrat

2014-01-01

Older adults with cerebrovascular accident (CVA) show evidence of auditory and speech perception problems. In present study, it was examined whether these problems are due to impairments of concurrent auditory segregation procedure which is the basic level of auditory scene analysis and auditory organization in auditory scenes with competing sounds. Concurrent auditory segregation using competing sentence test (CST) and dichotic digits test (DDT) was assessed and compared in 30 male older adults (15 normal and 15 cases with right hemisphere CVA) in the same age groups (60-75 years old). For the CST, participants were presented with target message in one ear and competing message in the other one. The task was to listen to target sentence and repeat back without attention to competing sentence. For the DDT, auditory stimuli were monosyllabic digits presented dichotically and the task was to repeat those. Comparing mean score of CST and DDT between CVA patients with right hemisphere impairment and normal participants showed statistically significant difference (p=0.001 for CST and p<0.0001 for DDT). The present study revealed that abnormal CST and DDT scores of participants with right hemisphere CVA could be related to concurrent segregation difficulties. These findings suggest that low level segregation mechanisms and/or high level attention mechanisms might contribute to the problems.
Sensory Substitution: The Spatial Updating of Auditory Scenes “Mimics” the Spatial Updating of Visual Scenes

PubMed Central

Pasqualotto, Achille; Esenkaya, Tayfun

2016-01-01

Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video-camera into the equivalent auditory images, or “soundscapes”. Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localizing sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution and then a judgment of relative direction task (JRD) was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications to improve training methods with sensory substitution devices (SSD). PMID:27148000

Neural Correlates of Auditory Figure-Ground Segregation Based on Temporal Coherence

PubMed Central

Teki, Sundeep; Barascud, Nicolas; Picard, Samuel; Payne, Christopher; Griffiths, Timothy D.; Chait, Maria

2016-01-01

To make sense of natural acoustic environments, listeners must parse complex mixtures of sounds that vary in frequency, space, and time. Emerging work suggests that, in addition to the well-studied spectral cues for segregation, sensitivity to temporal coherence—the coincidence of sound elements in and across time—is also critical for the perceptual organization of acoustic scenes. Here, we examine pre-attentive, stimulus-driven neural processes underlying auditory figure-ground segregation using stimuli that capture the challenges of listening in complex scenes where segregation cannot be achieved based on spectral cues alone. Signals (“stochastic figure-ground”: SFG) comprised a sequence of brief broadband chords containing random pure tone components that vary from 1 chord to another. Occasional tone repetitions across chords are perceived as “figures” popping out of a stochastic “ground.” Magnetoencephalography (MEG) measurement in naïve, distracted, human subjects revealed robust evoked responses, commencing from about 150 ms after figure onset that reflect the emergence of the “figure” from the randomly varying “ground.” Neural sources underlying this bottom-up driven figure-ground segregation were localized to planum temporale, and the intraparietal sulcus, demonstrating that this area, outside the “classic” auditory system, is also involved in the early stages of auditory scene analysis.” PMID:27325682
Neural Correlates of Auditory Figure-Ground Segregation Based on Temporal Coherence.

PubMed

Teki, Sundeep; Barascud, Nicolas; Picard, Samuel; Payne, Christopher; Griffiths, Timothy D; Chait, Maria

2016-09-01

To make sense of natural acoustic environments, listeners must parse complex mixtures of sounds that vary in frequency, space, and time. Emerging work suggests that, in addition to the well-studied spectral cues for segregation, sensitivity to temporal coherence-the coincidence of sound elements in and across time-is also critical for the perceptual organization of acoustic scenes. Here, we examine pre-attentive, stimulus-driven neural processes underlying auditory figure-ground segregation using stimuli that capture the challenges of listening in complex scenes where segregation cannot be achieved based on spectral cues alone. Signals ("stochastic figure-ground": SFG) comprised a sequence of brief broadband chords containing random pure tone components that vary from 1 chord to another. Occasional tone repetitions across chords are perceived as "figures" popping out of a stochastic "ground." Magnetoencephalography (MEG) measurement in naïve, distracted, human subjects revealed robust evoked responses, commencing from about 150 ms after figure onset that reflect the emergence of the "figure" from the randomly varying "ground." Neural sources underlying this bottom-up driven figure-ground segregation were localized to planum temporale, and the intraparietal sulcus, demonstrating that this area, outside the "classic" auditory system, is also involved in the early stages of auditory scene analysis." © The Author 2016. Published by Oxford University Press.
Concurrent auditory perception difficulties in older adults with right hemisphere cerebrovascular accident

PubMed Central

Talebi, Hossein; Moossavi, Abdollah; Faghihzadeh, Soghrat

2014-01-01

Background: Older adults with cerebrovascular accident (CVA) show evidence of auditory and speech perception problems. In present study, it was examined whether these problems are due to impairments of concurrent auditory segregation procedure which is the basic level of auditory scene analysis and auditory organization in auditory scenes with competing sounds. Methods: Concurrent auditory segregation using competing sentence test (CST) and dichotic digits test (DDT) was assessed and compared in 30 male older adults (15 normal and 15 cases with right hemisphere CVA) in the same age groups (60-75 years old). For the CST, participants were presented with target message in one ear and competing message in the other one. The task was to listen to target sentence and repeat back without attention to competing sentence. For the DDT, auditory stimuli were monosyllabic digits presented dichotically and the task was to repeat those. Results: Comparing mean score of CST and DDT between CVA patients with right hemisphere impairment and normal participants showed statistically significant difference (p=0.001 for CST and p<0.0001 for DDT). Conclusion: The present study revealed that abnormal CST and DDT scores of participants with right hemisphere CVA could be related to concurrent segregation difficulties. These findings suggest that low level segregation mechanisms and/or high level attention mechanisms might contribute to the problems. PMID:25679009
Demonstrating the Potential for Dynamic Auditory Stimulation to Contribute to Motion Sickness

PubMed Central

Keshavarz, Behrang; Hettinger, Lawrence J.; Kennedy, Robert S.; Campos, Jennifer L.

2014-01-01

Auditory cues can create the illusion of self-motion (vection) in the absence of visual or physical stimulation. The present study aimed to determine whether auditory cues alone can also elicit motion sickness and how auditory cues contribute to motion sickness when added to visual motion stimuli. Twenty participants were seated in front of a curved projection display and were exposed to a virtual scene that constantly rotated around the participant's vertical axis. The virtual scene contained either visual-only, auditory-only, or a combination of corresponding visual and auditory cues. All participants performed all three conditions in a counterbalanced order. Participants tilted their heads alternately towards the right or left shoulder in all conditions during stimulus exposure in order to create pseudo-Coriolis effects and to maximize the likelihood for motion sickness. Measurements of motion sickness (onset, severity), vection (latency, strength, duration), and postural steadiness (center of pressure) were recorded. Results showed that adding auditory cues to the visual stimuli did not, on average, affect motion sickness and postural steadiness, but it did reduce vection onset times and increased vection strength compared to pure visual or pure auditory stimulation. Eighteen of the 20 participants reported at least slight motion sickness in the two conditions including visual stimuli. More interestingly, six participants also reported slight motion sickness during pure auditory stimulation and two of the six participants stopped the pure auditory test session due to motion sickness. The present study is the first to demonstrate that motion sickness may be caused by pure auditory stimulation, which we refer to as “auditorily induced motion sickness”. PMID:24983752
Brain correlates of the orientation of auditory spatial attention onto speaker location in a "cocktail-party" situation.

PubMed

Lewald, Jörg; Hanenberg, Christina; Getzmann, Stephan

2016-10-01

Successful speech perception in complex auditory scenes with multiple competing speakers requires spatial segregation of auditory streams into perceptually distinct and coherent auditory objects and focusing of attention toward the speaker of interest. Here, we focused on the neural basis of this remarkable capacity of the human auditory system and investigated the spatiotemporal sequence of neural activity within the cortical network engaged in solving the "cocktail-party" problem. Twenty-eight subjects localized a target word in the presence of three competing sound sources. The analysis of the ERPs revealed an anterior contralateral subcomponent of the N2 (N2ac), computed as the difference waveform for targets to the left minus targets to the right. The N2ac peaked at about 500 ms after stimulus onset, and its amplitude was correlated with better localization performance. Cortical source localization for the contrast of left versus right targets at the time of the N2ac revealed a maximum in the region around left superior frontal sulcus and frontal eye field, both of which are known to be involved in processing of auditory spatial information. In addition, a posterior-contralateral late positive subcomponent (LPCpc) occurred at a latency of about 700 ms. Both these subcomponents are potential correlates of allocation of spatial attention to the target under cocktail-party conditions. © 2016 Society for Psychophysiological Research.
A model of head-related transfer functions based on a state-space analysis

NASA Astrophysics Data System (ADS)

Adams, Norman Herkamp

This dissertation develops and validates a novel state-space method for binaural auditory display. Binaural displays seek to immerse a listener in a 3D virtual auditory scene with a pair of headphones. The challenge for any binaural display is to compute the two signals to supply to the headphones. The present work considers a general framework capable of synthesizing a wide variety of auditory scenes. The framework models collections of head-related transfer functions (HRTFs) simultaneously. This framework improves the flexibility of contemporary displays, but it also compounds the steep computational cost of the display. The cost is reduced dramatically by formulating the collection of HRTFs in the state-space and employing order-reduction techniques to design efficient approximants. Order-reduction techniques based on the Hankel-operator are found to yield accurate low-cost approximants. However, the inter-aural time difference (ITD) of the HRTFs degrades the time-domain response of the approximants. Fortunately, this problem can be circumvented by employing a state-space architecture that allows the ITD to be modeled outside of the state-space. Accordingly, three state-space architectures are considered. Overall, a multiple-input, single-output (MISO) architecture yields the best compromise between performance and flexibility. The state-space approximants are evaluated both empirically and psychoacoustically. An array of truncated FIR filters is used as a pragmatic reference system for comparison. For a fixed cost bound, the state-space systems yield lower approximation error than FIR arrays for D>10, where D is the number of directions in the HRTF collection. A series of headphone listening tests are also performed to validate the state-space approach, and to estimate the minimum order N of indiscriminable approximants. For D = 50, the state-space systems yield order thresholds less than half those of the FIR arrays. Depending upon the stimulus uncertainty, a minimum state-space order of 7≤N≤23 appears to be adequate. In conclusion, the proposed state-space method enables a more flexible and immersive binaural display with low computational cost.
Issues in Humanoid Audition and Sound Source Localization by Active Audition

NASA Astrophysics Data System (ADS)

Nakadai, Kazuhiro; Okuno, Hiroshi G.; Kitano, Hiroaki

In this paper, we present an active audition system which is implemented on the humanoid robot "SIG the humanoid". The audition system for highly intelligent humanoids localizes sound sources and recognizes auditory events in the auditory scene. Active audition reported in this paper enables SIG to track sources by integrating audition, vision, and motor movements. Given the multiple sound sources in the auditory scene, SIG actively moves its head to improve localization by aligning microphones orthogonal to the sound source and by capturing the possible sound sources by vision. However, such an active head movement inevitably creates motor noises.The system adaptively cancels motor noises using motor control signals and the cover acoustics. The experimental result demonstrates that active audition by integration of audition, vision, and motor control attains sound source tracking in variety of conditions.onditions.
Competing streams at the cocktail party: Exploring the mechanisms of attention and temporal integration

PubMed Central

Xiang, Juanjuan; Simon, Jonathan; Elhilali, Mounya

2010-01-01

Processing of complex acoustic scenes depends critically on the temporal integration of sensory information as sounds evolve naturally over time. It has been previously speculated that this process is guided by both innate mechanisms of temporal processing in the auditory system, as well as top-down mechanisms of attention, and possibly other schema-based processes. In an effort to unravel the neural underpinnings of these processes and their role in scene analysis, we combine Magnetoencephalography (MEG) with behavioral measures in humans in the context of polyrhythmic tone sequences. While maintaining unchanged sensory input, we manipulate subjects’ attention to one of two competing rhythmic streams in the same sequence. The results reveal that the neural representation of the attended rhythm is significantly enhanced both in its steady-state power and spatial phase coherence relative to its unattended state, closely correlating with its perceptual detectability for each listener. Interestingly, the data reveals a differential efficiency of rhythmic rates of the order of few hertz during the streaming process, closely following known neural and behavioral measures of temporal modulation sensitivity in the auditory system. These findings establish a direct link between known temporal modulation tuning in the auditory system (particularly at the level of auditory cortex) and the temporal integration of perceptual features in a complex acoustic scene, while mediated by processes of attention. PMID:20826671
Emergence of neural encoding of auditory objects while listening to competing speakers

PubMed Central

Ding, Nai; Simon, Jonathan Z.

2012-01-01

A visual scene is perceived in terms of visual objects. Similar ideas have been proposed for the analogous case of auditory scene analysis, although their hypothesized neural underpinnings have not yet been established. Here, we address this question by recording from subjects selectively listening to one of two competing speakers, either of different or the same sex, using magnetoencephalography. Individual neural representations are seen for the speech of the two speakers, with each being selectively phase locked to the rhythm of the corresponding speech stream and from which can be exclusively reconstructed the temporal envelope of that speech stream. The neural representation of the attended speech dominates responses (with latency near 100 ms) in posterior auditory cortex. Furthermore, when the intensity of the attended and background speakers is separately varied over an 8-dB range, the neural representation of the attended speech adapts only to the intensity of that speaker but not to the intensity of the background speaker, suggesting an object-level intensity gain control. In summary, these results indicate that concurrent auditory objects, even if spectrotemporally overlapping and not resolvable at the auditory periphery, are neurally encoded individually in auditory cortex and emerge as fundamental representational units for top-down attentional modulation and bottom-up neural adaptation. PMID:22753470
Auditory Scene Analysis: An Attention Perspective

PubMed Central

2017-01-01

Purpose This review article provides a new perspective on the role of attention in auditory scene analysis. Method A framework for understanding how attention interacts with stimulus-driven processes to facilitate task goals is presented. Previously reported data obtained through behavioral and electrophysiological measures in adults with normal hearing are summarized to demonstrate attention effects on auditory perception—from passive processes that organize unattended input to attention effects that act at different levels of the system. Data will show that attention can sharpen stream organization toward behavioral goals, identify auditory events obscured by noise, and limit passive processing capacity. Conclusions A model of attention is provided that illustrates how the auditory system performs multilevel analyses that involve interactions between stimulus-driven input and top-down processes. Overall, these studies show that (a) stream segregation occurs automatically and sets the basis for auditory event formation; (b) attention interacts with automatic processing to facilitate task goals; and (c) information about unattended sounds is not lost when selecting one organization over another. Our results support a neural model that allows multiple sound organizations to be held in memory and accessed simultaneously through a balance of automatic and task-specific processes, allowing flexibility for navigating noisy environments with competing sound sources. Presentation Video http://cred.pubs.asha.org/article.aspx?articleid=2601618 PMID:29049599
Auditory Memory Distortion for Spoken Prose

ERIC Educational Resources Information Center

Hutchison, Joanna L.; Hubbard, Timothy L.; Ferrandino, Blaise; Brigante, Ryan; Wright, Jamie M.; Rypma, Bart

2012-01-01

Observers often remember a scene as containing information that was not presented but that would have likely been located just beyond the observed boundaries of the scene. This effect is called "boundary extension" (BE; e.g., Intraub & Richardson, 1989). Previous studies have observed BE in memory for visual and haptic stimuli, and…
A roadmap for the study of conscious audition and its neural basis

PubMed Central

Cariani, Peter A.; Gutschalk, Alexander

2017-01-01

How and which aspects of neural activity give rise to subjective perceptual experience—i.e. conscious perception—is a fundamental question of neuroscience. To date, the vast majority of work concerning this question has come from vision, raising the issue of generalizability of prominent resulting theories. However, recent work has begun to shed light on the neural processes subserving conscious perception in other modalities, particularly audition. Here, we outline a roadmap for the future study of conscious auditory perception and its neural basis, paying particular attention to how conscious perception emerges (and of which elements or groups of elements) in complex auditory scenes. We begin by discussing the functional role of the auditory system, particularly as it pertains to conscious perception. Next, we ask: what are the phenomena that need to be explained by a theory of conscious auditory perception? After surveying the available literature for candidate neural correlates, we end by considering the implications that such results have for a general theory of conscious perception as well as prominent outstanding questions and what approaches/techniques can best be used to address them. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044014
Comparison on driving fatigue related hemodynamics activated by auditory and visual stimulus

NASA Astrophysics Data System (ADS)

Deng, Zishan; Gao, Yuan; Li, Ting

2018-02-01

As one of the main causes of traffic accidents, driving fatigue deserves researchers' attention and its detection and monitoring during long-term driving require a new technique to realize. Since functional near-infrared spectroscopy (fNIRS) can be applied to detect cerebral hemodynamic responses, we can promisingly expect its application in fatigue level detection. Here, we performed three different kinds of experiments on a driver and recorded his cerebral hemodynamic responses when driving for long hours utilizing our device based on fNIRS. Each experiment lasted for 7 hours and one of the three specific experimental tests, detecting the driver's response to sounds, traffic lights and direction signs respectively, was done every hour. The results showed that visual stimulus was easier to cause fatigue compared with auditory stimulus and visual stimulus induced by traffic lights scenes was easier to cause fatigue compared with visual stimulus induced by direction signs in the first few hours. We also found that fatigue related hemodynamics caused by auditory stimulus increased fastest, then traffic lights scenes, and direction signs scenes slowest. Our study successfully compared audio, visual color, and visual character stimulus in sensitivity to cause driving fatigue, which is meaningful for driving safety management.
Psychophysical and Neural Correlates of Auditory Attraction and Aversion

NASA Astrophysics Data System (ADS)

Patten, Kristopher Jakob

This study explores the psychophysical and neural processes associated with the perception of sounds as either pleasant or aversive. The underlying psychophysical theory is based on auditory scene analysis, the process through which listeners parse auditory signals into individual acoustic sources. The first experiment tests and confirms that a self-rated pleasantness continuum reliably exists for 20 various stimuli (r = .48). In addition, the pleasantness continuum correlated with the physical acoustic characteristics of consonance/dissonance (r = .78), which can facilitate auditory parsing processes. The second experiment uses an fMRI block design to test blood oxygen level dependent (BOLD) changes elicited by a subset of 5 exemplar stimuli chosen from Experiment 1 that are evenly distributed over the pleasantness continuum. Specifically, it tests and confirms that the pleasantness continuum produces systematic changes in brain activity for unpleasant acoustic stimuli beyond what occurs with pleasant auditory stimuli. Results revealed that the combination of two positively and two negatively valenced experimental sounds compared to one neutral baseline control elicited BOLD increases in the primary auditory cortex, specifically the bilateral superior temporal gyrus, and left dorsomedial prefrontal cortex; the latter being consistent with a frontal decision-making process common in identification tasks. The negatively-valenced stimuli yielded additional BOLD increases in the left insula, which typically indicates processing of visceral emotions. The positively-valenced stimuli did not yield any significant BOLD activation, consistent with consonant, harmonic stimuli being the prototypical acoustic pattern of auditory objects that is optimal for auditory scene analysis. Both the psychophysical findings of Experiment 1 and the neural processing findings of Experiment 2 support that consonance is an important dimension of sound that is processed in a manner that aids auditory parsing and functional representation of acoustic objects and was found to be a principal feature of pleasing auditory stimuli.
Statistics of natural binaural sounds.

PubMed

Młynarski, Wiktor; Jost, Jürgen

2014-01-01

Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are utilized to identify a position of a sound source. In natural conditions however, binaural circuits are exposed to a stimulation by sound waves originating from multiple, often moving and overlapping sources. Therefore statistics of binaural cues depend on acoustic properties and the spatial configuration of the environment. Distribution of cues encountered naturally and their dependence on physical properties of an auditory scene have not been studied before. In the present work we analyzed statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed empirical cue distributions from each scene. We have found that certain properties such as the spread of IPD distributions as well as an overall shape of ILD distributions do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much weaker across frequency channels and IPDs often attain much higher values, than can be predicted from head filtering properties. In order to understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of learned basis functions indicate that in natural conditions soundwaves in each ear are predominantly generated by independent sources. This implies that the real-world sound localization must rely on mechanisms more complex than a mere cue extraction.
Statistics of Natural Binaural Sounds

PubMed Central

Młynarski, Wiktor; Jost, Jürgen

2014-01-01

Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are utilized to identify a position of a sound source. In natural conditions however, binaural circuits are exposed to a stimulation by sound waves originating from multiple, often moving and overlapping sources. Therefore statistics of binaural cues depend on acoustic properties and the spatial configuration of the environment. Distribution of cues encountered naturally and their dependence on physical properties of an auditory scene have not been studied before. In the present work we analyzed statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed empirical cue distributions from each scene. We have found that certain properties such as the spread of IPD distributions as well as an overall shape of ILD distributions do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much weaker across frequency channels and IPDs often attain much higher values, than can be predicted from head filtering properties. In order to understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of learned basis functions indicate that in natural conditions soundwaves in each ear are predominantly generated by independent sources. This implies that the real-world sound localization must rely on mechanisms more complex than a mere cue extraction. PMID:25285658
Brain bases for auditory stimulus-driven figure-ground segregation.

PubMed

Teki, Sundeep; Chait, Maria; Kumar, Sukhbinder; von Kriegstein, Katharina; Griffiths, Timothy D

2011-01-05

Auditory figure-ground segregation, listeners' ability to selectively hear out a sound of interest from a background of competing sounds, is a fundamental aspect of scene analysis. In contrast to the disordered acoustic environment we experience during everyday listening, most studies of auditory segregation have used relatively simple, temporally regular signals. We developed a new figure-ground stimulus that incorporates stochastic variation of the figure and background that captures the rich spectrotemporal complexity of natural acoustic scenes. Figure and background signals overlap in spectrotemporal space, but vary in the statistics of fluctuation, such that the only way to extract the figure is by integrating the patterns over time and frequency. Our behavioral results demonstrate that human listeners are remarkably sensitive to the appearance of such figures. In a functional magnetic resonance imaging experiment, aimed at investigating preattentive, stimulus-driven, auditory segregation mechanisms, naive subjects listened to these stimuli while performing an irrelevant task. Results demonstrate significant activations in the intraparietal sulcus (IPS) and the superior temporal sulcus related to bottom-up, stimulus-driven figure-ground decomposition. We did not observe any significant activation in the primary auditory cortex. Our results support a role for automatic, bottom-up mechanisms in the IPS in mediating stimulus-driven, auditory figure-ground segregation, which is consistent with accumulating evidence implicating the IPS in structuring sensory input and perceptual organization.
Neural Correlates of Sound Localization in Complex Acoustic Environments

PubMed Central

Zündorf, Ida C.; Lewald, Jörg; Karnath, Hans-Otto

2013-01-01

Listening to and understanding people in a “cocktail-party situation” is a remarkable feature of the human auditory system. Here we investigated the neural correlates of the ability to localize a particular sound among others in an acoustically cluttered environment with healthy subjects. In a sound localization task, five different natural sounds were presented from five virtual spatial locations during functional magnetic resonance imaging (fMRI). Activity related to auditory stream segregation was revealed in posterior superior temporal gyrus bilaterally, anterior insula, supplementary motor area, and frontoparietal network. Moreover, the results indicated critical roles of left planum temporale in extracting the sound of interest among acoustical distracters and the precuneus in orienting spatial attention to the target sound. We hypothesized that the left-sided lateralization of the planum temporale activation is related to the higher specialization of the left hemisphere for analysis of spectrotemporal sound features. Furthermore, the precuneus − a brain area known to be involved in the computation of spatial coordinates across diverse frames of reference for reaching to objects − seems to be also a crucial area for accurately determining locations of auditory targets in an acoustically complex scene of multiple sound sources. The precuneus thus may not only be involved in visuo-motor processes, but may also subserve related functions in the auditory modality. PMID:23691185
Volume Attenuation and High Frequency Loss as Auditory Depth Cues in Stereoscopic 3D Cinema

NASA Astrophysics Data System (ADS)

Manolas, Christos; Pauletto, Sandra

2014-09-01

Assisted by the technological advances of the past decades, stereoscopic 3D (S3D) cinema is currently in the process of being established as a mainstream form of entertainment. The main focus of this collaborative effort is placed on the creation of immersive S3D visuals. However, with few exceptions, little attention has been given so far to the potential effect of the soundtrack on such environments. The potential of sound both as a means to enhance the impact of the S3D visual information and to expand the S3D cinematic world beyond the boundaries of the visuals is large. This article reports on our research into the possibilities of using auditory depth cues within the soundtrack as a means of affecting the perception of depth within cinematic S3D scenes. We study two main distance-related auditory cues: high-end frequency loss and overall volume attenuation. A series of experiments explored the effectiveness of these auditory cues. Results, although not conclusive, indicate that the studied auditory cues can influence the audience judgement of depth in cinematic 3D scenes, sometimes in unexpected ways. We conclude that 3D filmmaking can benefit from further studies on the effectiveness of specific sound design techniques to enhance S3D cinema.
Hearing Scenes: A Neuromagnetic Signature of Auditory Source and Reverberant Space Separation

PubMed Central

Oliva, Aude

2017-01-01

Abstract Perceiving the geometry of surrounding space is a multisensory process, crucial to contextualizing object perception and guiding navigation behavior. Humans can make judgments about surrounding spaces from reverberation cues, caused by sounds reflecting off multiple interior surfaces. However, it remains unclear how the brain represents reverberant spaces separately from sound sources. Here, we report separable neural signatures of auditory space and source perception during magnetoencephalography (MEG) recording as subjects listened to brief sounds convolved with monaural room impulse responses (RIRs). The decoding signature of sound sources began at 57 ms after stimulus onset and peaked at 130 ms, while space decoding started at 138 ms and peaked at 386 ms. Importantly, these neuromagnetic responses were readily dissociable in form and time: while sound source decoding exhibited an early and transient response, the neural signature of space was sustained and independent of the original source that produced it. The reverberant space response was robust to variations in sound source, and vice versa, indicating a generalized response not tied to specific source-space combinations. These results provide the first neuromagnetic evidence for robust, dissociable auditory source and reverberant space representations in the human brain and reveal the temporal dynamics of how auditory scene analysis extracts percepts from complex naturalistic auditory signals. PMID:28451630

The Influence of Adaptation and Inhibition on the Effects of Onset Asynchrony on Auditory Grouping

ERIC Educational Resources Information Center

Holmes, Stephen D.; Roberts, Brian

2011-01-01

Onset asynchrony is an important cue for auditory scene analysis. For example, a harmonic of a vowel that begins before the other components contributes less to the perceived phonetic quality. This effect was thought primarily to involve high-level grouping processes, because the contribution can be partly restored by accompanying the leading…
Multisensory and Modality-Specific Influences on Adaptation to Optical Prisms

PubMed Central

Calzolari, Elena; Albini, Federica; Bolognini, Nadia; Vallar, Giuseppe

2017-01-01

Visuo-motor adaptation to optical prisms displacing the visual scene (prism adaptation, PA) is a method used for investigating visuo-motor plasticity in healthy individuals and, in clinical settings, for the rehabilitation of unilateral spatial neglect. In the standard paradigm, the adaptation phase involves repeated pointings to visual targets, while wearing optical prisms displacing the visual scene laterally. Here we explored differences in PA, and its aftereffects (AEs), as related to the sensory modality of the target. Visual, auditory, and multisensory – audio-visual – targets in the adaptation phase were used, while participants wore prisms displacing the visual field rightward by 10°. Proprioceptive, visual, visual-proprioceptive, auditory-proprioceptive straight-ahead shifts were measured. Pointing to auditory and to audio-visual targets in the adaptation phase produces proprioceptive, visual-proprioceptive, and auditory-proprioceptive AEs, as the typical visual targets did. This finding reveals that cross-modal plasticity effects involve both the auditory and the visual modality, and their interactions (Experiment 1). Even a shortened PA phase, requiring only 24 pointings to visual and audio-visual targets (Experiment 2), is sufficient to bring about AEs, as compared to the standard 92-pointings procedure. Finally, pointings to auditory targets cause AEs, although PA with a reduced number of pointings (24) to auditory targets brings about smaller AEs, as compared to the 92-pointings procedure (Experiment 3). Together, results from the three experiments extend to the auditory modality the sensorimotor plasticity underlying the typical AEs produced by PA to visual targets. Importantly, PA to auditory targets appears characterized by less accurate pointings and error correction, suggesting that the auditory component of the PA process may be less central to the building up of the AEs, than the sensorimotor pointing activity per se. These findings highlight both the effectiveness of a reduced number of pointings for bringing about AEs, and the possibility of inducing PA with auditory targets, which may be used as a compensatory route in patients with visual deficits. PMID:29213233
Interactive physically-based sound simulation

NASA Astrophysics Data System (ADS)

Raghuvanshi, Nikunj

The realization of interactive, immersive virtual worlds requires the ability to present a realistic audio experience that convincingly compliments their visual rendering. Physical simulation is a natural way to achieve such realism, enabling deeply immersive virtual worlds. However, physically-based sound simulation is very computationally expensive owing to the high-frequency, transient oscillations underlying audible sounds. The increasing computational power of desktop computers has served to reduce the gap between required and available computation, and it has become possible to bridge this gap further by using a combination of algorithmic improvements that exploit the physical, as well as perceptual properties of audible sounds. My thesis is a step in this direction. My dissertation concentrates on developing real-time techniques for both sub-problems of sound simulation: synthesis and propagation. Sound synthesis is concerned with generating the sounds produced by objects due to elastic surface vibrations upon interaction with the environment, such as collisions. I present novel techniques that exploit human auditory perception to simulate scenes with hundreds of sounding objects undergoing impact and rolling in real time. Sound propagation is the complementary problem of modeling the high-order scattering and diffraction of sound in an environment as it travels from source to listener. I discuss my work on a novel numerical acoustic simulator (ARD) that is hundred times faster and consumes ten times less memory than a high-accuracy finite-difference technique, allowing acoustic simulations on previously-intractable spaces, such as a cathedral, on a desktop computer. Lastly, I present my work on interactive sound propagation that leverages my ARD simulator to render the acoustics of arbitrary static scenes for multiple moving sources and listener in real time, while accounting for scene-dependent effects such as low-pass filtering and smooth attenuation behind obstructions, reverberation, scattering from complex geometry and sound focusing. This is enabled by a novel compact representation that takes a thousand times less memory than a direct scheme, thus reducing memory footprints to fit within available main memory. To the best of my knowledge, this is the only technique and system in existence to demonstrate auralization of physical wave-based effects in real-time on large, complex 3D scenes.
Estimating the relative weights of visual and auditory tau versus heuristic-based cues for time-to-contact judgments in realistic, familiar scenes by older and younger adults.

PubMed

Keshavarz, Behrang; Campos, Jennifer L; DeLucia, Patricia R; Oberfeld, Daniel

2017-04-01

Estimating time to contact (TTC) involves multiple sensory systems, including vision and audition. Previous findings suggested that the ratio of an object's instantaneous optical size/sound intensity to its instantaneous rate of change in optical size/sound intensity (τ) drives TTC judgments. Other evidence has shown that heuristic-based cues are used, including final optical size or final sound pressure level. Most previous studies have used decontextualized and unfamiliar stimuli (e.g., geometric shapes on a blank background). Here we evaluated TTC estimates by using a traffic scene with an approaching vehicle to evaluate the weights of visual and auditory TTC cues under more realistic conditions. Younger (18-39 years) and older (65+ years) participants made TTC estimates in three sensory conditions: visual-only, auditory-only, and audio-visual. Stimuli were presented within an immersive virtual-reality environment, and cue weights were calculated for both visual cues (e.g., visual τ, final optical size) and auditory cues (e.g., auditory τ, final sound pressure level). The results demonstrated the use of visual τ as well as heuristic cues in the visual-only condition. TTC estimates in the auditory-only condition, however, were primarily based on an auditory heuristic cue (final sound pressure level), rather than on auditory τ. In the audio-visual condition, the visual cues dominated overall, with the highest weight being assigned to visual τ by younger adults, and a more equal weighting of visual τ and heuristic cues in older adults. Overall, better characterizing the effects of combined sensory inputs, stimulus characteristics, and age on the cues used to estimate TTC will provide important insights into how these factors may affect everyday behavior.
Modelling the Emergence and Dynamics of Perceptual Organisation in Auditory Streaming

PubMed Central

Mill, Robert W.; Bőhm, Tamás M.; Bendixen, Alexandra; Winkler, István; Denham, Susan L.

2013-01-01

Many sound sources can only be recognised from the pattern of sounds they emit, and not from the individual sound events that make up their emission sequences. Auditory scene analysis addresses the difficult task of interpreting the sound world in terms of an unknown number of discrete sound sources (causes) with possibly overlapping signals, and therefore of associating each event with the appropriate source. There are potentially many different ways in which incoming events can be assigned to different causes, which means that the auditory system has to choose between them. This problem has been studied for many years using the auditory streaming paradigm, and recently it has become apparent that instead of making one fixed perceptual decision, given sufficient time, auditory perception switches back and forth between the alternatives—a phenomenon known as perceptual bi- or multi-stability. We propose a new model of auditory scene analysis at the core of which is a process that seeks to discover predictable patterns in the ongoing sound sequence. Representations of predictable fragments are created on the fly, and are maintained, strengthened or weakened on the basis of their predictive success, and conflict with other representations. Auditory perceptual organisation emerges spontaneously from the nature of the competition between these representations. We present detailed comparisons between the model simulations and data from an auditory streaming experiment, and show that the model accounts for many important findings, including: the emergence of, and switching between, alternative organisations; the influence of stimulus parameters on perceptual dominance, switching rate and perceptual phase durations; and the build-up of auditory streaming. The principal contribution of the model is to show that a two-stage process of pattern discovery and competition between incompatible patterns can account for both the contents (perceptual organisations) and the dynamics of human perception in auditory streaming. PMID:23516340
Acoustic facilitation of object movement detection during self-motion

PubMed Central

Calabro, F. J.; Soto-Faraco, S.; Vaina, L. M.

2011-01-01

In humans, as well as most animal species, perception of object motion is critical to successful interaction with the surrounding environment. Yet, as the observer also moves, the retinal projections of the various motion components add to each other and extracting accurate object motion becomes computationally challenging. Recent psychophysical studies have demonstrated that observers use a flow-parsing mechanism to estimate and subtract self-motion from the optic flow field. We investigated whether concurrent acoustic cues for motion can facilitate visual flow parsing, thereby enhancing the detection of moving objects during simulated self-motion. Participants identified an object (the target) that moved either forward or backward within a visual scene containing nine identical textured objects simulating forward observer translation. We found that spatially co-localized, directionally congruent, moving auditory stimuli enhanced object motion detection. Interestingly, subjects who performed poorly on the visual-only task benefited more from the addition of moving auditory stimuli. When auditory stimuli were not co-localized to the visual target, improvements in detection rates were weak. Taken together, these results suggest that parsing object motion from self-motion-induced optic flow can operate on multisensory object representations. PMID:21307050
Contextual effects of noise on vocalization encoding in primary auditory cortex

PubMed Central

Ni, Ruiye; Bender, David A.; Shanechi, Amirali M.; Gamble, Jeffrey R.

2016-01-01

Robust auditory perception plays a pivotal function for processing behaviorally relevant sounds, particularly with distractions from the environment. The neuronal coding enabling this ability, however, is still not well understood. In this study, we recorded single-unit activity from the primary auditory cortex (A1) of awake marmoset monkeys (Callithrix jacchus) while delivering conspecific vocalizations degraded by two different background noises: broadband white noise and vocalization babble. Noise effects on neural representation of target vocalizations were quantified by measuring the responses' similarity to those elicited by natural vocalizations as a function of signal-to-noise ratio. A clustering approach was used to describe the range of response profiles by reducing the population responses to a summary of four response classes (robust, balanced, insensitive, and brittle) under both noise conditions. This clustering approach revealed that, on average, approximately two-thirds of the neurons change their response class when encountering different noises. Therefore, the distortion induced by one particular masking background in single-unit responses is not necessarily predictable from that induced by another, suggesting the low likelihood of a unique group of noise-invariant neurons across different background conditions in A1. Regarding noise influence on neural activities, the brittle response group showed addition of spiking activity both within and between phrases of vocalizations relative to clean vocalizations, whereas the other groups generally showed spiking activity suppression within phrases, and the alteration between phrases was noise dependent. Overall, the variable single-unit responses, yet consistent response types, imply that primate A1 performs scene analysis through the collective activity of multiple neurons. NEW & NOTEWORTHY The understanding of where and how auditory scene analysis is accomplished is of broad interest to neuroscientists. In this paper, we systematically investigated neuronal coding of multiple vocalizations degraded by two distinct noises at various signal-to-noise ratios in nonhuman primates. In the process, we uncovered heterogeneity of single-unit representations for different auditory scenes yet homogeneity of responses across the population. PMID:27881720
Contextual effects of noise on vocalization encoding in primary auditory cortex.

PubMed

Ni, Ruiye; Bender, David A; Shanechi, Amirali M; Gamble, Jeffrey R; Barbour, Dennis L

2017-02-01

Robust auditory perception plays a pivotal function for processing behaviorally relevant sounds, particularly with distractions from the environment. The neuronal coding enabling this ability, however, is still not well understood. In this study, we recorded single-unit activity from the primary auditory cortex (A1) of awake marmoset monkeys (Callithrix jacchus) while delivering conspecific vocalizations degraded by two different background noises: broadband white noise and vocalization babble. Noise effects on neural representation of target vocalizations were quantified by measuring the responses' similarity to those elicited by natural vocalizations as a function of signal-to-noise ratio. A clustering approach was used to describe the range of response profiles by reducing the population responses to a summary of four response classes (robust, balanced, insensitive, and brittle) under both noise conditions. This clustering approach revealed that, on average, approximately two-thirds of the neurons change their response class when encountering different noises. Therefore, the distortion induced by one particular masking background in single-unit responses is not necessarily predictable from that induced by another, suggesting the low likelihood of a unique group of noise-invariant neurons across different background conditions in A1. Regarding noise influence on neural activities, the brittle response group showed addition of spiking activity both within and between phrases of vocalizations relative to clean vocalizations, whereas the other groups generally showed spiking activity suppression within phrases, and the alteration between phrases was noise dependent. Overall, the variable single-unit responses, yet consistent response types, imply that primate A1 performs scene analysis through the collective activity of multiple neurons. The understanding of where and how auditory scene analysis is accomplished is of broad interest to neuroscientists. In this paper, we systematically investigated neuronal coding of multiple vocalizations degraded by two distinct noises at various signal-to-noise ratios in nonhuman primates. In the process, we uncovered heterogeneity of single-unit representations for different auditory scenes yet homogeneity of responses across the population. Copyright © 2017 the American Physiological Society.
The origins of music in auditory scene analysis and the roles of evolution and culture in musical creation.

PubMed

Trainor, Laurel J

2015-03-19

Whether music was an evolutionary adaptation that conferred survival advantages or a cultural creation has generated much debate. Consistent with an evolutionary hypothesis, music is unique to humans, emerges early in development and is universal across societies. However, the adaptive benefit of music is far from obvious. Music is highly flexible, generative and changes rapidly over time, consistent with a cultural creation hypothesis. In this paper, it is proposed that much of musical pitch and timing structure adapted to preexisting features of auditory processing that evolved for auditory scene analysis (ASA). Thus, music may have emerged initially as a cultural creation made possible by preexisting adaptations for ASA. However, some aspects of music, such as its emotional and social power, may have subsequently proved beneficial for survival and led to adaptations that enhanced musical behaviour. Ontogenetic and phylogenetic evidence is considered in this regard. In particular, enhanced auditory-motor pathways in humans that enable movement entrainment to music and consequent increases in social cohesion, and pathways enabling music to affect reward centres in the brain should be investigated as possible musical adaptations. It is concluded that the origins of music are complex and probably involved exaptation, cultural creation and evolutionary adaptation.
A dual-process account of auditory change detection.

PubMed

McAnally, Ken I; Martin, Russell L; Eramudugolla, Ranmalee; Stuart, Geoffrey W; Irvine, Dexter R F; Mattingley, Jason B

2010-08-01

Listeners can be "deaf" to a substantial change in a scene comprising multiple auditory objects unless their attention has been directed to the changed object. It is unclear whether auditory change detection relies on identification of the objects in pre- and post-change scenes. We compared the rates at which listeners correctly identify changed objects with those predicted by change-detection models based on signal detection theory (SDT) and high-threshold theory (HTT). Detected changes were not identified as accurately as predicted by models based on either theory, suggesting that some changes are detected by a process that does not support change identification. Undetected changes were identified as accurately as predicted by the HTT model but much less accurately than predicted by the SDT models. The process underlying change detection was investigated further by determining receiver-operating characteristics (ROCs). ROCs did not conform to those predicted by either a SDT or a HTT model but were well modeled by a dual-process that incorporated HTT and SDT components. The dual-process model also accurately predicted the rates at which detected and undetected changes were correctly identified.
Auditory scene analysis in school-aged children with developmental language disorders

PubMed Central

Sussman, E.; Steinschneider, M.; Lee, W.; Lawson, K.

2014-01-01

Natural sound environments are dynamic, with overlapping acoustic input originating from simultaneously active sources. A key function of the auditory system is to integrate sensory inputs that belong together and segregate those that come from different sources. We hypothesized that this skill is impaired in individuals with phonological processing difficulties. There is considerable disagreement about whether phonological impairments observed in children with developmental language disorders can be attributed to specific linguistic deficits or to more general acoustic processing deficits. However, most tests of general auditory abilities have been conducted with a single set of sounds. We assessed the ability of school-aged children (7–15 years) to parse complex auditory non-speech input, and determined whether the presence of phonological processing impairments was associated with stream perception performance. A key finding was that children with language impairments did not show the same developmental trajectory for stream perception as typically developing children. In addition, children with language impairments required larger frequency separations between sounds to hear distinct streams compared to age-matched peers. Furthermore, phonological processing ability was a significant predictor of stream perception measures, but only in the older age groups. No such association was found in the youngest children. These results indicate that children with language impairments have difficulty parsing speech streams, or identifying individual sound events when there are competing sound sources. We conclude that language group differences may in part reflect fundamental maturational disparities in the analysis of complex auditory scenes. PMID:24548430
Development of the auditory system

PubMed Central

Litovsky, Ruth

2015-01-01

Auditory development involves changes in the peripheral and central nervous system along the auditory pathways, and these occur naturally, and in response to stimulation. Human development occurs along a trajectory that can last decades, and is studied using behavioral psychophysics, as well as physiologic measurements with neural imaging. The auditory system constructs a perceptual space that takes information from objects and groups, segregates sounds, and provides meaning and access to communication tools such as language. Auditory signals are processed in a series of analysis stages, from peripheral to central. Coding of information has been studied for features of sound, including frequency, intensity, loudness, and location, in quiet and in the presence of maskers. In the latter case, the ability of the auditory system to perform an analysis of the scene becomes highly relevant. While some basic abilities are well developed at birth, there is a clear prolonged maturation of auditory development well into the teenage years. Maturation involves auditory pathways. However, non-auditory changes (attention, memory, cognition) play an important role in auditory development. The ability of the auditory system to adapt in response to novel stimuli is a key feature of development throughout the nervous system, known as neural plasticity. PMID:25726262
Segregation and Integration of Auditory Streams when Listening to Multi-Part Music

PubMed Central

Ragert, Marie; Fairhurst, Merle T.; Keller, Peter E.

2014-01-01

In our daily lives, auditory stream segregation allows us to differentiate concurrent sound sources and to make sense of the scene we are experiencing. However, a combination of segregation and the concurrent integration of auditory streams is necessary in order to analyze the relationship between streams and thus perceive a coherent auditory scene. The present functional magnetic resonance imaging study investigates the relative role and neural underpinnings of these listening strategies in multi-part musical stimuli. We compare a real human performance of a piano duet and a synthetic stimulus of the same duet in a prioritized integrative attention paradigm that required the simultaneous segregation and integration of auditory streams. In so doing, we manipulate the degree to which the attended part of the duet led either structurally (attend melody vs. attend accompaniment) or temporally (asynchronies vs. no asynchronies between parts), and thus the relative contributions of integration and segregation used to make an assessment of the leader-follower relationship. We show that perceptually the relationship between parts is biased towards the conventional structural hierarchy in western music in which the melody generally dominates (leads) the accompaniment. Moreover, the assessment varies as a function of both cognitive load, as shown through difficulty ratings and the interaction of the temporal and the structural relationship factors. Neurally, we see that the temporal relationship between parts, as one important cue for stream segregation, revealed distinct neural activity in the planum temporale. By contrast, integration used when listening to both the temporally separated performance stimulus and the temporally fused synthetic stimulus resulted in activation of the intraparietal sulcus. These results support the hypothesis that the planum temporale and IPS are key structures underlying the mechanisms of segregation and integration of auditory streams, respectively. PMID:24475030
Segregation and integration of auditory streams when listening to multi-part music.

PubMed

Ragert, Marie; Fairhurst, Merle T; Keller, Peter E

2014-01-01

In our daily lives, auditory stream segregation allows us to differentiate concurrent sound sources and to make sense of the scene we are experiencing. However, a combination of segregation and the concurrent integration of auditory streams is necessary in order to analyze the relationship between streams and thus perceive a coherent auditory scene. The present functional magnetic resonance imaging study investigates the relative role and neural underpinnings of these listening strategies in multi-part musical stimuli. We compare a real human performance of a piano duet and a synthetic stimulus of the same duet in a prioritized integrative attention paradigm that required the simultaneous segregation and integration of auditory streams. In so doing, we manipulate the degree to which the attended part of the duet led either structurally (attend melody vs. attend accompaniment) or temporally (asynchronies vs. no asynchronies between parts), and thus the relative contributions of integration and segregation used to make an assessment of the leader-follower relationship. We show that perceptually the relationship between parts is biased towards the conventional structural hierarchy in western music in which the melody generally dominates (leads) the accompaniment. Moreover, the assessment varies as a function of both cognitive load, as shown through difficulty ratings and the interaction of the temporal and the structural relationship factors. Neurally, we see that the temporal relationship between parts, as one important cue for stream segregation, revealed distinct neural activity in the planum temporale. By contrast, integration used when listening to both the temporally separated performance stimulus and the temporally fused synthetic stimulus resulted in activation of the intraparietal sulcus. These results support the hypothesis that the planum temporale and IPS are key structures underlying the mechanisms of segregation and integration of auditory streams, respectively.
Hearing through the noise: Biologically inspired noise reduction

NASA Astrophysics Data System (ADS)

Lee, Tyler Paul

Vocal communication in the natural world demands that a listener perform a remarkably complicated task in real-time. Vocalizations mix with all other sounds in the environment as they travel to the listener, arriving as a jumbled low-dimensional signal. A listener must then use this signal to extract the structure corresponding to individual sound sources. How this computation is implemented in the brain remains poorly understood, yet an accurate description of such mechanisms would impact a variety of medical and technological applications of sound processing. In this thesis, I describe initial work on how neurons in the secondary auditory cortex of the Zebra Finch extract song from naturalistic background noise. I then build on our understanding of the function of these neurons by creating an algorithm that extracts speech from natural background noise using spectrotemporal modulations. The algorithm, implemented as an artificial neural network, can be flexibly applied to any class of signal or noise and performs better than an optimal frequency-based noise reduction algorithm for a variety of background noises and signal-to-noise ratios. One potential drawback to using spectrotemporal modulations for noise reduction, though, is that analyzing the modulations present in an ongoing sound requires a latency set by the slowest temporal modulation computed. The algorithm avoids this problem by reducing noise predictively, taking advantage of the large amount of temporal structure present in natural sounds. This predictive denoising has ties to recent work suggesting that the auditory system uses attention to focus on predicted regions of spectrotemporal space when performing auditory scene analysis.
Change deafness for real spatialized environmental scenes.

PubMed

Gaston, Jeremy; Dickerson, Kelly; Hipp, Daniel; Gerhardstein, Peter

2017-01-01

The everyday auditory environment is complex and dynamic; often, multiple sounds co-occur and compete for a listener's cognitive resources. 'Change deafness', framed as the auditory analog to the well-documented phenomenon of 'change blindness', describes the finding that changes presented within complex environments are often missed. The present study examines a number of stimulus factors that may influence change deafness under real-world listening conditions. Specifically, an AX (same-different) discrimination task was used to examine the effects of both spatial separation over a loudspeaker array and the type of change (sound source additions and removals) on discrimination of changes embedded in complex backgrounds. Results using signal detection theory and accuracy analyses indicated that, under most conditions, errors were significantly reduced for spatially distributed relative to non-spatial scenes. A second goal of the present study was to evaluate a possible link between memory for scene contents and change discrimination. Memory was evaluated by presenting a cued recall test following each trial of the discrimination task. Results using signal detection theory and accuracy analyses indicated that recall ability was similar in terms of accuracy, but there were reductions in sensitivity compared to previous reports. Finally, the present study used a large and representative sample of outdoor, urban, and environmental sounds, presented in unique combinations of nearly 1000 trials per participant. This enabled the exploration of the relationship between change perception and the perceptual similarity between change targets and background scene sounds. These (post hoc) analyses suggest both a categorical and a stimulus-level relationship between scene similarity and the magnitude of change errors.
Sound Classification in Hearing Aids Inspired by Auditory Scene Analysis

NASA Astrophysics Data System (ADS)

Büchler, Michael; Allegro, Silvia; Launer, Stefan; Dillier, Norbert

2005-12-01

A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes "clean speech," "speech in noise," "noise," and "music." A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmonicity, amplitude onsets, and rhythm. They are evaluated together with different pattern classifiers. Simple classifiers, such as rule-based and minimum-distance classifiers, are compared with more complex approaches, such as Bayes classifier, neural network, and hidden Markov model. Sounds from a large database are employed for both training and testing of the system. The achieved recognition rates are very high except for the class "speech in noise." Problems arise in the classification of compressed pop music, strongly reverberated speech, and tonal or fluctuating noises.
Contributions of low- and high-level properties to neural processing of visual scenes in the human brain.

PubMed

Groen, Iris I A; Silson, Edward H; Baker, Chris I

2017-02-19

Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).
Contributions of low- and high-level properties to neural processing of visual scenes in the human brain

PubMed Central

2017-01-01

Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044013
Multistability in auditory stream segregation: a predictive coding view

PubMed Central

Winkler, István; Denham, Susan; Mill, Robert; Bőhm, Tamás M.; Bendixen, Alexandra

2012-01-01

Auditory stream segregation involves linking temporally separate acoustic events into one or more coherent sequences. For any non-trivial sequence of sounds, many alternative descriptions can be formed, only one or very few of which emerge in awareness at any time. Evidence from studies showing bi-/multistability in auditory streaming suggest that some, perhaps many of the alternative descriptions are represented in the brain in parallel and that they continuously vie for conscious perception. Here, based on a predictive coding view, we consider the nature of these sound representations and how they compete with each other. Predictive processing helps to maintain perceptual stability by signalling the continuation of previously established patterns as well as the emergence of new sound sources. It also provides a measure of how well each of the competing representations describes the current acoustic scene. This account of auditory stream segregation has been tested on perceptual data obtained in the auditory streaming paradigm. PMID:22371621

Psychoacoustics

NASA Astrophysics Data System (ADS)

Moore, Brian C. J.

Psychoacoustics psychological is concerned with the relationships between the physical characteristics of sounds and their perceptual attributes. This chapter describes: the absolute sensitivity of the auditory system for detecting weak sounds and how that sensitivity varies with frequency; the frequency selectivity of the auditory system (the ability to resolve or hear out the sinusoidal components in a complex sound) and its characterization in terms of an array of auditory filters; the processes that influence the masking of one sound by another; the range of sound levels that can be processed by the auditory system; the perception and modeling of loudness; level discrimination; the temporal resolution of the auditory system (the ability to detect changes over time); the perception and modeling of pitch for pure and complex tones; the perception of timbre for steady and time-varying sounds; the perception of space and sound localization; and the mechanisms underlying auditory scene analysis that allow the construction of percepts corresponding to individual sounds sources when listening to complex mixtures of sounds.
Emergent selectivity for task-relevant stimuli in higher-order auditory cortex

PubMed Central

Atiani, Serin; David, Stephen V.; Elgueda, Diego; Locastro, Michael; Radtke-Schuller, Susanne; Shamma, Shihab A.; Fritz, Jonathan B.

2014-01-01

A variety of attention-related effects have been demonstrated in primary auditory cortex (A1). However, an understanding of the functional role of higher auditory cortical areas in guiding attention to acoustic stimuli has been elusive. We recorded from neurons in two tonotopic cortical belt areas in the dorsal posterior ectosylvian gyrus (dPEG) of ferrets trained on a simple auditory discrimination task. Neurons in dPEG showed similar basic auditory tuning properties to A1, but during behavior we observed marked differences between these areas. In the belt areas, changes in neuronal firing rate and response dynamics greatly enhanced responses to target stimuli relative to distractors, allowing for greater attentional selection during active listening. Consistent with existing anatomical evidence, the pattern of sensory tuning and behavioral modulation in auditory belt cortex links the spectro-temporal representation of the whole acoustic scene in A1 to a more abstracted representation of task-relevant stimuli observed in frontal cortex. PMID:24742467
Broad attention to multiple individual objects may facilitate change detection with complex auditory scenes.

PubMed

Irsik, Vanessa C; Vanden Bosch der Nederlanden, Christina M; Snyder, Joel S

2016-11-01

Attention and other processing constraints limit the perception of objects in complex scenes, which has been studied extensively in the visual sense. We used a change deafness paradigm to examine how attention to particular objects helps and hurts the ability to notice changes within complex auditory scenes. In a counterbalanced design, we examined how cueing attention to particular objects affected performance in an auditory change-detection task through the use of valid or invalid cues and trials without cues (Experiment 1). We further examined how successful encoding predicted change-detection performance using an object-encoding task and we addressed whether performing the object-encoding task along with the change-detection task affected performance overall (Experiment 2). Participants had more error for invalid compared to valid and uncued trials, but this effect was reduced in Experiment 2 compared to Experiment 1. When the object-encoding task was present, listeners who completed the uncued condition first had less overall error than those who completed the cued condition first. All participants showed less change deafness when they successfully encoded change-relevant compared to irrelevant objects during valid and uncued trials. However, only participants who completed the uncued condition first also showed this effect during invalid cue trials, suggesting a broader scope of attention. These findings provide converging evidence that attention to change-relevant objects is crucial for successful detection of acoustic changes and that encouraging broad attention to multiple objects is the best way to reduce change deafness. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Application of Data Mining and Knowledge Discovery Techniques to Enhance Binary Target Detection and Decision-Making for Compromised Visual Images

DTIC Science & Technology

2004-11-01

affords exciting opportunities in target detection. The input signal may be a sum of sine waves, it could be an auditory signal, or possibly a visual...rendering of a scene. Since image processing is an area in which the original data are stationary in some sense ( auditory signals suffer from...11 Example 1 of SR - Identification of a Subliminal Signal below a Threshold .......................... 13 Example 2 of SR
Stable individual characteristics in the perception of multiple embedded patterns in multistable auditory stimuli

PubMed Central

Denham, Susan; Bõhm, Tamás M.; Bendixen, Alexandra; Szalárdy, Orsolya; Kocsis, Zsuzsanna; Mill, Robert; Winkler, István

2014-01-01

The ability of the auditory system to parse complex scenes into component objects in order to extract information from the environment is very robust, yet the processing principles underlying this ability are still not well understood. This study was designed to investigate the proposal that the auditory system constructs multiple interpretations of the acoustic scene in parallel, based on the finding that when listening to a long repetitive sequence listeners report switching between different perceptual organizations. Using the “ABA-” auditory streaming paradigm we trained listeners until they could reliably recognize all possible embedded patterns of length four which could in principle be extracted from the sequence, and in a series of test sessions investigated their spontaneous reports of those patterns. With the training allowing them to identify and mark a wider variety of possible patterns, participants spontaneously reported many more patterns than the ones traditionally assumed (Integrated vs. Segregated). Despite receiving consistent training and despite the apparent randomness of perceptual switching, we found individual switching patterns were idiosyncratic; i.e., the perceptual switching patterns of each participant were more similar to their own switching patterns in different sessions than to those of other participants. These individual differences were found to be preserved even between test sessions held a year after the initial experiment. Our results support the idea that the auditory system attempts to extract an exhaustive set of embedded patterns which can be used to generate expectations of future events and which by competing for dominance give rise to (changing) perceptual awareness, with the characteristics of pattern discovery and perceptual competition having a strong idiosyncratic component. Perceptual multistability thus provides a means for characterizing both general mechanisms and individual differences in human perception. PMID:24616656
Stable individual characteristics in the perception of multiple embedded patterns in multistable auditory stimuli.

PubMed

Denham, Susan; Bõhm, Tamás M; Bendixen, Alexandra; Szalárdy, Orsolya; Kocsis, Zsuzsanna; Mill, Robert; Winkler, István

2014-01-01

The ability of the auditory system to parse complex scenes into component objects in order to extract information from the environment is very robust, yet the processing principles underlying this ability are still not well understood. This study was designed to investigate the proposal that the auditory system constructs multiple interpretations of the acoustic scene in parallel, based on the finding that when listening to a long repetitive sequence listeners report switching between different perceptual organizations. Using the "ABA-" auditory streaming paradigm we trained listeners until they could reliably recognize all possible embedded patterns of length four which could in principle be extracted from the sequence, and in a series of test sessions investigated their spontaneous reports of those patterns. With the training allowing them to identify and mark a wider variety of possible patterns, participants spontaneously reported many more patterns than the ones traditionally assumed (Integrated vs. Segregated). Despite receiving consistent training and despite the apparent randomness of perceptual switching, we found individual switching patterns were idiosyncratic; i.e., the perceptual switching patterns of each participant were more similar to their own switching patterns in different sessions than to those of other participants. These individual differences were found to be preserved even between test sessions held a year after the initial experiment. Our results support the idea that the auditory system attempts to extract an exhaustive set of embedded patterns which can be used to generate expectations of future events and which by competing for dominance give rise to (changing) perceptual awareness, with the characteristics of pattern discovery and perceptual competition having a strong idiosyncratic component. Perceptual multistability thus provides a means for characterizing both general mechanisms and individual differences in human perception.
The origins of music in auditory scene analysis and the roles of evolution and culture in musical creation

PubMed Central

Trainor, Laurel J.

2015-01-01

Whether music was an evolutionary adaptation that conferred survival advantages or a cultural creation has generated much debate. Consistent with an evolutionary hypothesis, music is unique to humans, emerges early in development and is universal across societies. However, the adaptive benefit of music is far from obvious. Music is highly flexible, generative and changes rapidly over time, consistent with a cultural creation hypothesis. In this paper, it is proposed that much of musical pitch and timing structure adapted to preexisting features of auditory processing that evolved for auditory scene analysis (ASA). Thus, music may have emerged initially as a cultural creation made possible by preexisting adaptations for ASA. However, some aspects of music, such as its emotional and social power, may have subsequently proved beneficial for survival and led to adaptations that enhanced musical behaviour. Ontogenetic and phylogenetic evidence is considered in this regard. In particular, enhanced auditory–motor pathways in humans that enable movement entrainment to music and consequent increases in social cohesion, and pathways enabling music to affect reward centres in the brain should be investigated as possible musical adaptations. It is concluded that the origins of music are complex and probably involved exaptation, cultural creation and evolutionary adaptation. PMID:25646512
Selective entrainment of brain oscillations drives auditory perceptual organization.

PubMed

Costa-Faidella, Jordi; Sussman, Elyse S; Escera, Carles

2017-10-01

Perceptual sound organization supports our ability to make sense of the complex acoustic environment, to understand speech and to enjoy music. However, the neuronal mechanisms underlying the subjective experience of perceiving univocal auditory patterns that can be listened to, despite hearing all sounds in a scene, are poorly understood. We hereby investigated the manner in which competing sound organizations are simultaneously represented by specific brain activity patterns and the way attention and task demands prime the internal model generating the current percept. Using a selective attention task on ambiguous auditory stimulation coupled with EEG recordings, we found that the phase of low-frequency oscillatory activity dynamically tracks multiple sound organizations concurrently. However, whereas the representation of ignored sound patterns is circumscribed to auditory regions, large-scale oscillatory entrainment in auditory, sensory-motor and executive-control network areas reflects the active perceptual organization, thereby giving rise to the subjective experience of a unitary percept. Copyright © 2017 Elsevier Inc. All rights reserved.
Auditory memory can be object based.

PubMed

Dyson, Benjamin J; Ishfaq, Feraz

2008-04-01

Identifying how memories are organized remains a fundamental issue in psychology. Previous work has shown that visual short-term memory is organized according to the object of origin, with participants being better at retrieving multiple pieces of information from the same object than from different objects. However, it is not yet clear whether similar memory structures are employed for other modalities, such as audition. Under analogous conditions in the auditory domain, we found that short-term memories for sound can also be organized according to object, with a same-object advantage being demonstrated for the retrieval of information in an auditory scene defined by two complex sounds overlapping in both space and time. Our results provide support for the notion of an auditory object, in addition to the continued identification of similar processing constraints across visual and auditory domains. The identification of modality-independent organizational principles of memory, such as object-based coding, suggests possible mechanisms by which the human processing system remembers multimodal experiences.
Separating pitch chroma and pitch height in the human brain

PubMed Central

Warren, J. D.; Uppenkamp, S.; Patterson, R. D.; Griffiths, T. D.

2003-01-01

Musicians recognize pitch as having two dimensions. On the keyboard, these are illustrated by the octave and the cycle of notes within the octave. In perception, these dimensions are referred to as pitch height and pitch chroma, respectively. Pitch chroma provides a basis for presenting acoustic patterns (melodies) that do not depend on the particular sound source. In contrast, pitch height provides a basis for segregation of notes into streams to separate sound sources. This paper reports a functional magnetic resonance experiment designed to search for distinct mappings of these two types of pitch change in the human brain. The results show that chroma change is specifically represented anterior to primary auditory cortex, whereas height change is specifically represented posterior to primary auditory cortex. We propose that tracking of acoustic information streams occurs in anterior auditory areas, whereas the segregation of sound objects (a crucial aspect of auditory scene analysis) depends on posterior areas. PMID:12909719
Separating pitch chroma and pitch height in the human brain.

PubMed

Warren, J D; Uppenkamp, S; Patterson, R D; Griffiths, T D

2003-08-19

Musicians recognize pitch as having two dimensions. On the keyboard, these are illustrated by the octave and the cycle of notes within the octave. In perception, these dimensions are referred to as pitch height and pitch chroma, respectively. Pitch chroma provides a basis for presenting acoustic patterns (melodies) that do not depend on the particular sound source. In contrast, pitch height provides a basis for segregation of notes into streams to separate sound sources. This paper reports a functional magnetic resonance experiment designed to search for distinct mappings of these two types of pitch change in the human brain. The results show that chroma change is specifically represented anterior to primary auditory cortex, whereas height change is specifically represented posterior to primary auditory cortex. We propose that tracking of acoustic information streams occurs in anterior auditory areas, whereas the segregation of sound objects (a crucial aspect of auditory scene analysis) depends on posterior areas.
Bigger Brains or Bigger Nuclei? Regulating the Size of Auditory Structures in Birds

PubMed Central

Kubke, M. Fabiana; Massoglia, Dino P.; Carr, Catherine E.

2012-01-01

Increases in the size of the neuronal structures that mediate specific behaviors are believed to be related to enhanced computational performance. It is not clear, however, what developmental and evolutionary mechanisms mediate these changes, nor whether an increase in the size of a given neuronal population is a general mechanism to achieve enhanced computational ability. We addressed the issue of size by analyzing the variation in the relative number of cells of auditory structures in auditory specialists and generalists. We show that bird species with different auditory specializations exhibit variation in the relative size of their hindbrain auditory nuclei. In the barn owl, an auditory specialist, the hind-brain auditory nuclei involved in the computation of sound location show hyperplasia. This hyperplasia was also found in songbirds, but not in non-auditory specialists. The hyperplasia of auditory nuclei was also not seen in birds with large body weight suggesting that the total number of cells is selected for in auditory specialists. In barn owls, differences observed in the relative size of the auditory nuclei might be attributed to modifications in neurogenesis and cell death. Thus, hyperplasia of circuits used for auditory computation accompanies auditory specialization in different orders of birds. PMID:14726625
Auditory Task Irrelevance: A Basis for Inattentional Deafness

PubMed Central

Scheer, Menja; Bülthoff, Heinrich H.; Chuang, Lewis L.

2018-01-01

Objective This study investigates the neural basis of inattentional deafness, which could result from task irrelevance in the auditory modality. Background Humans can fail to respond to auditory alarms under high workload situations. This failure, termed inattentional deafness, is often attributed to high workload in the visual modality, which reduces one’s capacity for information processing. Besides this, our capacity for processing auditory information could also be selectively diminished if there is no obvious task relevance in the auditory channel. This could be another contributing factor given the rarity of auditory warnings. Method Forty-eight participants performed a visuomotor tracking task while auditory stimuli were presented: a frequent pure tone, an infrequent pure tone, and infrequent environmental sounds. Participants were required either to respond to the presentation of the infrequent pure tone (auditory task-relevant) or not (auditory task-irrelevant). We recorded and compared the event-related potentials (ERPs) that were generated by environmental sounds, which were always task-irrelevant for both groups. These ERPs served as an index for our participants’ awareness of the task-irrelevant auditory scene. Results Manipulation of auditory task relevance influenced the brain’s response to task-irrelevant environmental sounds. Specifically, the late novelty-P3 to irrelevant environmental sounds, which underlies working memory updating, was found to be selectively enhanced by auditory task relevance independent of visuomotor workload. Conclusion Task irrelevance in the auditory modality selectively reduces our brain’s responses to unexpected and irrelevant sounds regardless of visuomotor workload. Application Presenting relevant auditory information more often could mitigate the risk of inattentional deafness. PMID:29578754
Representation of complex vocalizations in the Lusitanian toadfish auditory system: evidence of fine temporal, frequency and amplitude discrimination

PubMed Central

Vasconcelos, Raquel O.; Fonseca, Paulo J.; Amorim, M. Clara P.; Ladich, Friedrich

2011-01-01

Many fishes rely on their auditory skills to interpret crucial information about predators and prey, and to communicate intraspecifically. Few studies, however, have examined how complex natural sounds are perceived in fishes. We investigated the representation of conspecific mating and agonistic calls in the auditory system of the Lusitanian toadfish Halobatrachus didactylus, and analysed auditory responses to heterospecific signals from ecologically relevant species: a sympatric vocal fish (meagre Argyrosomus regius) and a potential predator (dolphin Tursiops truncatus). Using auditory evoked potential (AEP) recordings, we showed that both sexes can resolve fine features of conspecific calls. The toadfish auditory system was most sensitive to frequencies well represented in the conspecific vocalizations (namely the mating boatwhistle), and revealed a fine representation of duration and pulsed structure of agonistic and mating calls. Stimuli and corresponding AEP amplitudes were highly correlated, indicating an accurate encoding of amplitude modulation. Moreover, Lusitanian toadfish were able to detect T. truncatus foraging sounds and A. regius calls, although at higher amplitudes. We provide strong evidence that the auditory system of a vocal fish, lacking accessory hearing structures, is capable of resolving fine features of complex vocalizations that are probably important for intraspecific communication and other relevant stimuli from the auditory scene. PMID:20861044
Auditory object salience: human cortical processing of non-biological action sounds and their acoustic signal attributes

PubMed Central

Lewis, James W.; Talkington, William J.; Tallaksen, Katherine C.; Frum, Chris A.

2012-01-01

Whether viewed or heard, an object in action can be segmented as a distinct salient event based on a number of different sensory cues. In the visual system, several low-level attributes of an image are processed along parallel hierarchies, involving intermediate stages wherein gross-level object form and/or motion features are extracted prior to stages that show greater specificity for different object categories (e.g., people, buildings, or tools). In the auditory system, though relying on a rather different set of low-level signal attributes, meaningful real-world acoustic events and “auditory objects” can also be readily distinguished from background scenes. However, the nature of the acoustic signal attributes or gross-level perceptual features that may be explicitly processed along intermediate cortical processing stages remain poorly understood. Examining mechanical and environmental action sounds, representing two distinct non-biological categories of action sources, we had participants assess the degree to which each sound was perceived as object-like versus scene-like. We re-analyzed data from two of our earlier functional magnetic resonance imaging (fMRI) task paradigms (Engel et al., 2009) and found that scene-like action sounds preferentially led to activation along several midline cortical structures, but with strong dependence on listening task demands. In contrast, bilateral foci along the superior temporal gyri (STG) showed parametrically increasing activation to action sounds rated as more “object-like,” independent of sound category or task demands. Moreover, these STG regions also showed parametric sensitivity to spectral structure variations (SSVs) of the action sounds—a quantitative measure of change in entropy of the acoustic signals over time—and the right STG additionally showed parametric sensitivity to measures of mean entropy and harmonic content of the environmental sounds. Analogous to the visual system, intermediate stages of the auditory system appear to process or extract a number of quantifiable low-order signal attributes that are characteristic of action events perceived as being object-like, representing stages that may begin to dissociate different perceptual dimensions and categories of every-day, real-world action sounds. PMID:22582038
Predictability effects in auditory scene analysis: a review

PubMed Central

Bendixen, Alexandra

2014-01-01

Many sound sources emit signals in a predictable manner. The idea that predictability can be exploited to support the segregation of one source's signal emissions from the overlapping signals of other sources has been expressed for a long time. Yet experimental evidence for a strong role of predictability within auditory scene analysis (ASA) has been scarce. Recently, there has been an upsurge in experimental and theoretical work on this topic resulting from fundamental changes in our perspective on how the brain extracts predictability from series of sensory events. Based on effortless predictive processing in the auditory system, it becomes more plausible that predictability would be available as a cue for sound source decomposition. In the present contribution, empirical evidence for such a role of predictability in ASA will be reviewed. It will be shown that predictability affects ASA both when it is present in the sound source of interest (perceptual foreground) and when it is present in other sound sources that the listener wishes to ignore (perceptual background). First evidence pointing toward age-related impairments in the latter capacity will be addressed. Moreover, it will be illustrated how effects of predictability can be shown by means of objective listening tests as well as by subjective report procedures, with the latter approach typically exploiting the multi-stable nature of auditory perception. Critical aspects of study design will be delineated to ensure that predictability effects can be unambiguously interpreted. Possible mechanisms for a functional role of predictability within ASA will be discussed, and an analogy with the old-plus-new heuristic for grouping simultaneous acoustic signals will be suggested. PMID:24744695
Auditory spatial processing in Alzheimer’s disease

PubMed Central

Golden, Hannah L.; Nicholas, Jennifer M.; Yong, Keir X. X.; Downey, Laura E.; Schott, Jonathan M.; Mummery, Catherine J.; Crutch, Sebastian J.

2015-01-01

The location and motion of sounds in space are important cues for encoding the auditory world. Spatial processing is a core component of auditory scene analysis, a cognitively demanding function that is vulnerable in Alzheimer’s disease. Here we designed a novel neuropsychological battery based on a virtual space paradigm to assess auditory spatial processing in patient cohorts with clinically typical Alzheimer’s disease (n = 20) and its major variant syndrome, posterior cortical atrophy (n = 12) in relation to healthy older controls (n = 26). We assessed three dimensions of auditory spatial function: externalized versus non-externalized sound discrimination, moving versus stationary sound discrimination and stationary auditory spatial position discrimination, together with non-spatial auditory and visual spatial control tasks. Neuroanatomical correlates of auditory spatial processing were assessed using voxel-based morphometry. Relative to healthy older controls, both patient groups exhibited impairments in detection of auditory motion, and stationary sound position discrimination. The posterior cortical atrophy group showed greater impairment for auditory motion processing and the processing of a non-spatial control complex auditory property (timbre) than the typical Alzheimer’s disease group. Voxel-based morphometry in the patient cohort revealed grey matter correlates of auditory motion detection and spatial position discrimination in right inferior parietal cortex and precuneus, respectively. These findings delineate auditory spatial processing deficits in typical and posterior Alzheimer’s disease phenotypes that are related to posterior cortical regions involved in both syndromic variants and modulated by the syndromic profile of brain degeneration. Auditory spatial deficits contribute to impaired spatial awareness in Alzheimer’s disease and may constitute a novel perceptual model for probing brain network disintegration across the Alzheimer’s disease syndromic spectrum. PMID:25468732
Frequency-Selective Attention in Auditory Scenes Recruits Frequency Representations Throughout Human Superior Temporal Cortex.

PubMed

Riecke, Lars; Peters, Judith C; Valente, Giancarlo; Kemper, Valentin G; Formisano, Elia; Sorger, Bettina

2017-05-01

A sound of interest may be tracked amid other salient sounds by focusing attention on its characteristic features including its frequency. Functional magnetic resonance imaging findings have indicated that frequency representations in human primary auditory cortex (AC) contribute to this feat. However, attentional modulations were examined at relatively low spatial and spectral resolutions, and frequency-selective contributions outside the primary AC could not be established. To address these issues, we compared blood oxygenation level-dependent (BOLD) responses in the superior temporal cortex of human listeners while they identified single frequencies versus listened selectively for various frequencies within a multifrequency scene. Using best-frequency mapping, we observed that the detailed spatial layout of attention-induced BOLD response enhancements in primary AC follows the tonotopy of stimulus-driven frequency representations-analogous to the "spotlight" of attention enhancing visuospatial representations in retinotopic visual cortex. Moreover, using an algorithm trained to discriminate stimulus-driven frequency representations, we could successfully decode the focus of frequency-selective attention from listeners' BOLD response patterns in nonprimary AC. Our results indicate that the human brain facilitates selective listening to a frequency of interest in a scene by reinforcing the fine-grained activity pattern throughout the entire superior temporal cortex that would be evoked if that frequency was present alone. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Cat and mouse search: the influence of scene and object analysis on eye movements when targets change locations during search.

PubMed

Hillstrom, Anne P; Segabinazi, Joice D; Godwin, Hayward J; Liversedge, Simon P; Benson, Valerie

2017-02-19

We explored the influence of early scene analysis and visible object characteristics on eye movements when searching for objects in photographs of scenes. On each trial, participants were shown sequentially either a scene preview or a uniform grey screen (250 ms), a visual mask, the name of the target and the scene, now including the target at a likely location. During the participant's first saccade during search, the target location was changed to: (i) a different likely location, (ii) an unlikely but possible location or (iii) a very implausible location. The results showed that the first saccade landed more often on the likely location in which the target re-appeared than on unlikely or implausible locations, and overall the first saccade landed nearer the first target location with a preview than without. Hence, rapid scene analysis influenced initial eye movement planning, but availability of the target rapidly modified that plan. After the target moved, it was found more quickly when it appeared in a likely location than when it appeared in an unlikely or implausible location. The findings show that both scene gist and object properties are extracted rapidly, and are used in conjunction to guide saccadic eye movements during visual search.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).
A Method for Assessing Auditory Spatial Analysis in Reverberant Multitalker Environments.

PubMed

Weller, Tobias; Best, Virginia; Buchholz, Jörg M; Young, Taegan

2016-07-01

Deficits in spatial hearing can have a negative impact on listeners' ability to orient in their environment and follow conversations in noisy backgrounds and may exacerbate the experience of hearing loss as a handicap. However, there are no good tools available for reliably capturing the spatial hearing abilities of listeners in complex acoustic environments containing multiple sounds of interest. The purpose of this study was to explore a new method to measure auditory spatial analysis in a reverberant multitalker scenario. This study was a descriptive case control study. Ten listeners with normal hearing (NH) aged 20-31 yr and 16 listeners with hearing impairment (HI) aged 52-85 yr participated in the study. The latter group had symmetrical sensorineural hearing losses with a four-frequency average hearing loss of 29.7 dB HL. A large reverberant room was simulated using a loudspeaker array in an anechoic chamber. In this simulated room, 96 scenes comprising between one and six concurrent talkers at different locations were generated. Listeners were presented with 45-sec samples of each scene, and were required to count, locate, and identify the gender of all talkers, using a graphical user interface on an iPad. Performance was evaluated in terms of correctly counting the sources and accuracy in localizing their direction. Listeners with NH were able to reliably analyze scenes with up to four simultaneous talkers, while most listeners with hearing loss demonstrated errors even with two talkers at a time. Localization performance decreased in both groups with increasing number of talkers and was significantly poorer in listeners with HI. Overall performance was significantly correlated with hearing loss. This new method appears to be useful for estimating spatial abilities in realistic multitalker scenes. The method is sensitive to the number of sources in the scene, and to effects of sensorineural hearing loss. Further work will be needed to compare this method to more traditional single-source localization tests. American Academy of Audiology.

Turning down the noise: the benefit of musical training on the aging auditory brain.

PubMed

Alain, Claude; Zendel, Benjamin Rich; Hutka, Stefanie; Bidelman, Gavin M

2014-02-01

Age-related decline in hearing abilities is a ubiquitous part of aging, and commonly impacts speech understanding, especially when there are competing sound sources. While such age effects are partially due to changes within the cochlea, difficulties typically exist beyond measurable hearing loss, suggesting that central brain processes, as opposed to simple peripheral mechanisms (e.g., hearing sensitivity), play a critical role in governing hearing abilities late into life. Current training regimens aimed to improve central auditory processing abilities have experienced limited success in promoting listening benefits. Interestingly, recent studies suggest that in young adults, musical training positively modifies neural mechanisms, providing robust, long-lasting improvements to hearing abilities as well as to non-auditory tasks that engage cognitive control. These results offer the encouraging possibility that musical training might be used to counteract age-related changes in auditory cognition commonly observed in older adults. Here, we reviewed studies that have examined the effects of age and musical experience on auditory cognition with an emphasis on auditory scene analysis. We infer that musical training may offer potential benefits to complex listening and might be utilized as a means to delay or even attenuate declines in auditory perception and cognition that often emerge later in life. Copyright © 2013 Elsevier B.V. All rights reserved.
Clinical Features of Auditory Hallucinations in Patients With Dementia With Lewy Bodies: A Soundtrack of Visual Hallucinations.

PubMed

Tsunoda, Naoko; Hashimoto, Mamoru; Ishikawa, Tomohisa; Fukuhara, Ryuji; Yuki, Seiji; Tanaka, Hibiki; Hatada, Yutaka; Miyagawa, Yusuke; Ikeda, Manabu

2018-05-08

Auditory hallucinations are an important symptom for diagnosing dementia with Lewy bodies (DLB), yet they have received less attention than visual hallucinations. We investigated the clinical features of auditory hallucinations and the possible mechanisms by which they arise in patients with DLB. We recruited 124 consecutive patients with probable DLB (diagnosis based on the DLB International Workshop 2005 criteria; study period: June 2007-January 2015) from the dementia referral center of Kumamoto University Hospital. We used the Neuropsychiatric Inventory to assess the presence of auditory hallucinations, visual hallucinations, and other neuropsychiatric symptoms. We reviewed all available clinical records of patients with auditory hallucinations to assess their clinical features. We performed multiple logistic regression analysis to identify significant independent predictors of auditory hallucinations. Of the 124 patients, 44 (35.5%) had auditory hallucinations and 75 (60.5%) had visual hallucinations. The majority of patients (90.9%) with auditory hallucinations also had visual hallucinations. Auditory hallucinations consisted mostly of human voices, and 90% of patients described them as like hearing a soundtrack of the scene. Multiple logistic regression showed that the presence of auditory hallucinations was significantly associated with female sex (P = .04) and hearing impairment (P = .004). The analysis also revealed independent correlations between the presence of auditory hallucinations and visual hallucinations (P < .001), phantom boarder delusions (P = .001), and depression (P = .038). Auditory hallucinations are common neuropsychiatric symptoms in DLB and usually appear as a background soundtrack accompanying visual hallucinations. Auditory hallucinations in patients with DLB are more likely to occur in women and those with impaired hearing, depression, delusions, or visual hallucinations. © Copyright 2018 Physicians Postgraduate Press, Inc.
Change Deafness and the Organizational Properties of Sounds

ERIC Educational Resources Information Center

Gregg, Melissa K.; Samuel, Arthur G.

2008-01-01

Change blindness, or the failure to detect (often large) changes to visual scenes, has been demonstrated in a variety of different situations. Failures to detect auditory changes are far less studied, and thus little is known about the nature of change deafness. Five experiments were conducted to explore the processes involved in change deafness…
Visually-guided attention enhances target identification in a complex auditory scene.

PubMed

Best, Virginia; Ozmeral, Erol J; Shinn-Cunningham, Barbara G

2007-06-01

In auditory scenes containing many similar sound sources, sorting of acoustic information into streams becomes difficult, which can lead to disruptions in the identification of behaviorally relevant targets. This study investigated the benefit of providing simple visual cues for when and/or where a target would occur in a complex acoustic mixture. Importantly, the visual cues provided no information about the target content. In separate experiments, human subjects either identified learned birdsongs in the presence of a chorus of unlearned songs or recalled strings of spoken digits in the presence of speech maskers. A visual cue indicating which loudspeaker (from an array of five) would contain the target improved accuracy for both kinds of stimuli. A cue indicating which time segment (out of a possible five) would contain the target also improved accuracy, but much more for birdsong than for speech. These results suggest that in real world situations, information about where a target of interest is located can enhance its identification, while information about when to listen can also be helpful when targets are unfamiliar or extremely similar to their competitors.
Visually-guided Attention Enhances Target Identification in a Complex Auditory Scene

PubMed Central

Ozmeral, Erol J.; Shinn-Cunningham, Barbara G.

2007-01-01

In auditory scenes containing many similar sound sources, sorting of acoustic information into streams becomes difficult, which can lead to disruptions in the identification of behaviorally relevant targets. This study investigated the benefit of providing simple visual cues for when and/or where a target would occur in a complex acoustic mixture. Importantly, the visual cues provided no information about the target content. In separate experiments, human subjects either identified learned birdsongs in the presence of a chorus of unlearned songs or recalled strings of spoken digits in the presence of speech maskers. A visual cue indicating which loudspeaker (from an array of five) would contain the target improved accuracy for both kinds of stimuli. A cue indicating which time segment (out of a possible five) would contain the target also improved accuracy, but much more for birdsong than for speech. These results suggest that in real world situations, information about where a target of interest is located can enhance its identification, while information about when to listen can also be helpful when targets are unfamiliar or extremely similar to their competitors. PMID:17453308
Physics Based Modeling and Rendering of Vegetation in the Thermal Infrared

NASA Technical Reports Server (NTRS)

Smith, J. A.; Ballard, J. R., Jr.

1999-01-01

We outline a procedure for rendering physically-based thermal infrared images of simple vegetation scenes. Our approach incorporates the biophysical processes that affect the temperature distribution of the elements within a scene. Computer graphics plays a key role in two respects. First, in computing the distribution of scene shaded and sunlit facets and, second, in the final image rendering once the temperatures of all the elements in the scene have been computed. We illustrate our approach for a simple corn scene where the three-dimensional geometry is constructed based on measured morphological attributes of the row crop. Statistical methods are used to construct a representation of the scene in agreement with the measured characteristics. Our results are quite good. The rendered images exhibit realistic behavior in directional properties as a function of view and sun angle. The root-mean-square error in measured versus predicted brightness temperatures for the scene was 2.1 deg C.
Emotion modulates eye movement patterns and subsequent memory for the gist and details of movie scenes.

PubMed

Subramanian, Ramanathan; Shankar, Divya; Sebe, Nicu; Melcher, David

2014-03-26

A basic question in vision research regards where people look in complex scenes and how this influences their performance in various tasks. Previous studies with static images have demonstrated a close link between where people look and what they remember. Here, we examined the pattern of eye movements when participants watched neutral and emotional clips from Hollywood-style movies. Participants answered multiple-choice memory questions concerning visual and auditory scene details immediately upon viewing 1-min-long neutral or emotional movie clips. Fixations were more narrowly focused for emotional clips, and immediate memory for object details was worse compared to matched neutral scenes, implying preferential attention to emotional events. Although we found the expected correlation between where people looked and what they remembered for neutral clips, this relationship broke down for emotional clips. When participants were subsequently presented with key frames (static images) extracted from the movie clips such that presentation duration of the target objects (TOs) corresponding to the multiple-choice questions was matched and the earlier questions were repeated, more fixations were observed on the TOs, and memory performance also improved significantly, confirming that emotion modulates the relationship between gaze position and memory performance. Finally, in a long-term memory test, old/new recognition performance was significantly better for emotional scenes as compared to neutral scenes. Overall, these results are consistent with the hypothesis that emotional content draws eye fixations and strengthens memory for the scene gist while weakening encoding of peripheral scene details.
Hierarchical neurocomputations underlying concurrent sound segregation: connecting periphery to percept.

PubMed

Bidelman, Gavin M; Alain, Claude

2015-02-01

Natural soundscapes often contain multiple sound sources at any given time. Numerous studies have reported that in human observers, the perception and identification of concurrent sounds is paralleled by specific changes in cortical event-related potentials (ERPs). Although these studies provide a window into the cerebral mechanisms governing sound segregation, little is known about the subcortical neural architecture and hierarchy of neurocomputations that lead to this robust perceptual process. Using computational modeling, scalp-recorded brainstem/cortical ERPs, and human psychophysics, we demonstrate that a primary cue for sound segregation, i.e., harmonicity, is encoded at the auditory nerve level within tens of milliseconds after the onset of sound and is maintained, largely untransformed, in phase-locked activity of the rostral brainstem. As then indexed by auditory cortical responses, (in)harmonicity is coded in the signature and magnitude of the cortical object-related negativity (ORN) response (150-200 ms). The salience of the resulting percept is then captured in a discrete, categorical-like coding scheme by a late negativity response (N5; ~500 ms latency), just prior to the elicitation of a behavioral judgment. Subcortical activity correlated with cortical evoked responses such that weaker phase-locked brainstem responses (lower neural harmonicity) generated larger ORN amplitude, reflecting the cortical registration of multiple sound objects. Studying multiple brain indices simultaneously helps illuminate the mechanisms and time-course of neural processing underlying concurrent sound segregation and may lead to further development and refinement of physiologically driven models of auditory scene analysis. Copyright © 2014 Elsevier Ltd. All rights reserved.
Sustained selective attention to competing amplitude-modulations in human auditory cortex.

PubMed

Riecke, Lars; Scharke, Wolfgang; Valente, Giancarlo; Gutschalk, Alexander

2014-01-01

Auditory selective attention plays an essential role for identifying sounds of interest in a scene, but the neural underpinnings are still incompletely understood. Recent findings demonstrate that neural activity that is time-locked to a particular amplitude-modulation (AM) is enhanced in the auditory cortex when the modulated stream of sounds is selectively attended to under sensory competition with other streams. However, the target sounds used in the previous studies differed not only in their AM, but also in other sound features, such as carrier frequency or location. Thus, it remains uncertain whether the observed enhancements reflect AM-selective attention. The present study aims at dissociating the effect of AM frequency on response enhancement in auditory cortex by using an ongoing auditory stimulus that contains two competing targets differing exclusively in their AM frequency. Electroencephalography results showed a sustained response enhancement for auditory attention compared to visual attention, but not for AM-selective attention (attended AM frequency vs. ignored AM frequency). In contrast, the response to the ignored AM frequency was enhanced, although a brief trend toward response enhancement occurred during the initial 15 s. Together with the previous findings, these observations indicate that selective enhancement of attended AMs in auditory cortex is adaptive under sustained AM-selective attention. This finding has implications for our understanding of cortical mechanisms for feature-based attentional gain control.
Sustained Selective Attention to Competing Amplitude-Modulations in Human Auditory Cortex

PubMed Central

Riecke, Lars; Scharke, Wolfgang; Valente, Giancarlo; Gutschalk, Alexander

2014-01-01

Auditory selective attention plays an essential role for identifying sounds of interest in a scene, but the neural underpinnings are still incompletely understood. Recent findings demonstrate that neural activity that is time-locked to a particular amplitude-modulation (AM) is enhanced in the auditory cortex when the modulated stream of sounds is selectively attended to under sensory competition with other streams. However, the target sounds used in the previous studies differed not only in their AM, but also in other sound features, such as carrier frequency or location. Thus, it remains uncertain whether the observed enhancements reflect AM-selective attention. The present study aims at dissociating the effect of AM frequency on response enhancement in auditory cortex by using an ongoing auditory stimulus that contains two competing targets differing exclusively in their AM frequency. Electroencephalography results showed a sustained response enhancement for auditory attention compared to visual attention, but not for AM-selective attention (attended AM frequency vs. ignored AM frequency). In contrast, the response to the ignored AM frequency was enhanced, although a brief trend toward response enhancement occurred during the initial 15 s. Together with the previous findings, these observations indicate that selective enhancement of attended AMs in auditory cortex is adaptive under sustained AM-selective attention. This finding has implications for our understanding of cortical mechanisms for feature-based attentional gain control. PMID:25259525
Cortical mechanisms for the segregation and representation of acoustic textures.

PubMed

Overath, Tobias; Kumar, Sukhbinder; Stewart, Lauren; von Kriegstein, Katharina; Cusack, Rhodri; Rees, Adrian; Griffiths, Timothy D

2010-02-10

Auditory object analysis requires two fundamental perceptual processes: the definition of the boundaries between objects, and the abstraction and maintenance of an object's characteristic features. Although it is intuitive to assume that the detection of the discontinuities at an object's boundaries precedes the subsequent precise representation of the object, the specific underlying cortical mechanisms for segregating and representing auditory objects within the auditory scene are unknown. We investigated the cortical bases of these two processes for one type of auditory object, an "acoustic texture," composed of multiple frequency-modulated ramps. In these stimuli, we independently manipulated the statistical rules governing (1) the frequency-time space within individual textures (comprising ramps with a given spectrotemporal coherence) and (2) the boundaries between textures (adjacent textures with different spectrotemporal coherences). Using functional magnetic resonance imaging, we show mechanisms defining boundaries between textures with different coherences in primary and association auditory cortices, whereas texture coherence is represented only in association cortex. Furthermore, participants' superior detection of boundaries across which texture coherence increased (as opposed to decreased) was reflected in a greater neural response in auditory association cortex at these boundaries. The results suggest a hierarchical mechanism for processing acoustic textures that is relevant to auditory object analysis: boundaries between objects are first detected as a change in statistical rules over frequency-time space, before a representation that corresponds to the characteristics of the perceived object is formed.
Feasibility of and Design Parameters for a Computer-Based Attitudinal Research Information System

DTIC Science & Technology

1975-08-01

Auditory Displays Auditory Evoked Potentials Auditory Feedback Auditory Hallucinations Auditory Localization Auditory Maski ng Auditory Neurons...surprising to hear these prob- lems e:qpressed once again and in the same old refrain. The Navy attitude surveyors were frustrated when they...Audiolcgy Audiometers Aud iometry Audiotapes Audiovisual Communications Media Audiovisual Instruction Auditory Cortex Auditory
Contextual modulation of primary visual cortex by auditory signals.

PubMed

Petro, L S; Paton, A T; Muckli, L

2017-02-19

Early visual cortex receives non-feedforward input from lateral and top-down connections (Muckli & Petro 2013 Curr. Opin. Neurobiol. 23, 195-201. (doi:10.1016/j.conb.2013.01.020)), including long-range projections from auditory areas. Early visual cortex can code for high-level auditory information, with neural patterns representing natural sound stimulation (Vetter et al. 2014 Curr. Biol. 24, 1256-1262. (doi:10.1016/j.cub.2014.04.020)). We discuss a number of questions arising from these findings. What is the adaptive function of bimodal representations in visual cortex? What type of information projects from auditory to visual cortex? What are the anatomical constraints of auditory information in V1, for example, periphery versus fovea, superficial versus deep cortical layers? Is there a putative neural mechanism we can infer from human neuroimaging data and recent theoretical accounts of cortex? We also present data showing we can read out high-level auditory information from the activation patterns of early visual cortex even when visual cortex receives simple visual stimulation, suggesting independent channels for visual and auditory signals in V1. We speculate which cellular mechanisms allow V1 to be contextually modulated by auditory input to facilitate perception, cognition and behaviour. Beyond cortical feedback that facilitates perception, we argue that there is also feedback serving counterfactual processing during imagery, dreaming and mind wandering, which is not relevant for immediate perception but for behaviour and cognition over a longer time frame.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Authors.
Contextual modulation of primary visual cortex by auditory signals

PubMed Central

Paton, A. T.

2017-01-01

Early visual cortex receives non-feedforward input from lateral and top-down connections (Muckli & Petro 2013 Curr. Opin. Neurobiol. 23, 195–201. (doi:10.1016/j.conb.2013.01.020)), including long-range projections from auditory areas. Early visual cortex can code for high-level auditory information, with neural patterns representing natural sound stimulation (Vetter et al. 2014 Curr. Biol. 24, 1256–1262. (doi:10.1016/j.cub.2014.04.020)). We discuss a number of questions arising from these findings. What is the adaptive function of bimodal representations in visual cortex? What type of information projects from auditory to visual cortex? What are the anatomical constraints of auditory information in V1, for example, periphery versus fovea, superficial versus deep cortical layers? Is there a putative neural mechanism we can infer from human neuroimaging data and recent theoretical accounts of cortex? We also present data showing we can read out high-level auditory information from the activation patterns of early visual cortex even when visual cortex receives simple visual stimulation, suggesting independent channels for visual and auditory signals in V1. We speculate which cellular mechanisms allow V1 to be contextually modulated by auditory input to facilitate perception, cognition and behaviour. Beyond cortical feedback that facilitates perception, we argue that there is also feedback serving counterfactual processing during imagery, dreaming and mind wandering, which is not relevant for immediate perception but for behaviour and cognition over a longer time frame. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044015
Time-Frequency Masking for Speech Separation and Its Potential for Hearing Aid Design

PubMed Central

Wang, DeLiang

2008-01-01

A new approach to the separation of speech from speech-in-noise mixtures is the use of time-frequency (T-F) masking. Originated in the field of computational auditory scene analysis, T-F masking performs separation in the time-frequency domain. This article introduces the T-F masking concept and reviews T-F masking algorithms that separate target speech from either monaural or binaural mixtures, as well as microphone-array recordings. The review emphasizes techniques that are promising for hearing aid design. This article also surveys recent studies that evaluate the perceptual effects of T-F masking techniques, particularly their effectiveness in improving human speech recognition in noise. An assessment is made of the potential benefits of T-F masking methods for the hearing impaired in light of the processing constraints of hearing aids. Finally, several issues pertinent to T-F masking are discussed. PMID:18974204
Summary statistics in auditory perception.

PubMed

McDermott, Josh H; Schemitsch, Michael; Simoncelli, Eero P

2013-04-01

Sensory signals are transduced at high resolution, but their structure must be stored in a more compact format. Here we provide evidence that the auditory system summarizes the temporal details of sounds using time-averaged statistics. We measured discrimination of 'sound textures' that were characterized by particular statistical properties, as normally result from the superposition of many acoustic features in auditory scenes. When listeners discriminated examples of different textures, performance improved with excerpt duration. In contrast, when listeners discriminated different examples of the same texture, performance declined with duration, a paradoxical result given that the information available for discrimination grows with duration. These results indicate that once these sounds are of moderate length, the brain's representation is limited to time-averaged statistics, which, for different examples of the same texture, converge to the same values with increasing duration. Such statistical representations produce good categorical discrimination, but limit the ability to discern temporal detail.
Perceptual congruency of audio-visual speech affects ventriloquism with bilateral visual stimuli.

PubMed

Kanaya, Shoko; Yokosawa, Kazuhiko

2011-02-01

Many studies on multisensory processes have focused on performance in simplified experimental situations, with a single stimulus in each sensory modality. However, these results cannot necessarily be applied to explain our perceptual behavior in natural scenes where various signals exist within one sensory modality. We investigated the role of audio-visual syllable congruency on participants' auditory localization bias or the ventriloquism effect using spoken utterances and two videos of a talking face. Salience of facial movements was also manipulated. Results indicated that more salient visual utterances attracted participants' auditory localization. Congruent pairing of audio-visual utterances elicited greater localization bias than incongruent pairing, while previous studies have reported little dependency on the reality of stimuli in ventriloquism. Moreover, audio-visual illusory congruency, owing to the McGurk effect, caused substantial visual interference on auditory localization. Multisensory performance appears more flexible and adaptive in this complex environment than in previous studies.
Efficacy of Individual Computer-Based Auditory Training for People with Hearing Loss: A Systematic Review of the Evidence

PubMed Central

Henshaw, Helen; Ferguson, Melanie A.

2013-01-01

Background Auditory training involves active listening to auditory stimuli and aims to improve performance in auditory tasks. As such, auditory training is a potential intervention for the management of people with hearing loss. Objective This systematic review (PROSPERO 2011: CRD42011001406) evaluated the published evidence-base for the efficacy of individual computer-based auditory training to improve speech intelligibility, cognition and communication abilities in adults with hearing loss, with or without hearing aids or cochlear implants. Methods A systematic search of eight databases and key journals identified 229 articles published since 1996, 13 of which met the inclusion criteria. Data were independently extracted and reviewed by the two authors. Study quality was assessed using ten pre-defined scientific and intervention-specific measures. Results Auditory training resulted in improved performance for trained tasks in 9/10 articles that reported on-task outcomes. Although significant generalisation of learning was shown to untrained measures of speech intelligibility (11/13 articles), cognition (1/1 articles) and self-reported hearing abilities (1/2 articles), improvements were small and not robust. Where reported, compliance with computer-based auditory training was high, and retention of learning was shown at post-training follow-ups. Published evidence was of very-low to moderate study quality. Conclusions Our findings demonstrate that published evidence for the efficacy of individual computer-based auditory training for adults with hearing loss is not robust and therefore cannot be reliably used to guide intervention at this time. We identify a need for high-quality evidence to further examine the efficacy of computer-based auditory training for people with hearing loss. PMID:23675431
Comparable mechanisms of working memory interference by auditory and visual motion in youth and aging

PubMed Central

Mishra, Jyoti; Zanto, Theodore; Nilakantan, Aneesha; Gazzaley, Adam

2013-01-01

Intrasensory interference during visual working memory (WM) maintenance by object stimuli (such as faces and scenes), has been shown to negatively impact WM performance, with greater detrimental impacts of interference observed in aging. Here we assessed age-related impacts by intrasensory WM interference from lower-level stimulus features such as visual and auditory motion stimuli. We consistently found that interference in the form of ignored distractions and secondary task i nterruptions presented during a WM maintenance period, degraded memory accuracy in both the visual and auditory domain. However, in contrast to prior studies assessing WM for visual object stimuli, feature-based interference effects were not observed to be significantly greater in older adults. Analyses of neural oscillations in the alpha frequency band further revealed preserved mechanisms of interference processing in terms of post-stimulus alpha suppression, which was observed maximally for secondary task interruptions in visual and auditory modalities in both younger and older adults. These results suggest that age-related sensitivity of WM to interference may be limited to complex object stimuli, at least at low WM loads. PMID:23791629
Time-compressed spoken word primes crossmodally enhance processing of semantically congruent visual targets.

PubMed

Mahr, Angela; Wentura, Dirk

2014-02-01

Findings from three experiments support the conclusion that auditory primes facilitate the processing of related targets. In Experiments 1 and 2, we employed a crossmodal Stroop color identification task with auditory color words (as primes) and visual color patches (as targets). Responses were faster for congruent priming, in comparison to neutral or incongruent priming. This effect also emerged for different levels of time compression of the auditory primes (to 30 % and 10 % of the original length; i.e., 120 and 40 ms) and turned out to be even more pronounced under high-perceptual-load conditions (Exps. 1 and 2). In Experiment 3, target-present or -absent decisions for brief target displays had to be made, thereby ruling out response-priming processes as a cause of the congruency effects. Nevertheless, target detection (d') was increased by congruent primes (30 % compression) in comparison to incongruent or neutral primes. Our results suggest semantic object-based auditory-visual interactions, which rapidly increase the denoted target object's salience. This would apply, in particular, to complex visual scenes.

PROCRU: A model for analyzing crew procedures in approach to landing

NASA Technical Reports Server (NTRS)

Baron, S.; Muralidharan, R.; Lancraft, R.; Zacharias, G.

1980-01-01

A model for analyzing crew procedures in approach to landing is developed. The model employs the information processing structure used in the optimal control model and in recent models for monitoring and failure detection. Mechanisms are added to this basic structure to model crew decision making in this multi task environment. Decisions are based on probability assessments and potential mission impact (or gain). Sub models for procedural activities are included. The model distinguishes among external visual, instrument visual, and auditory sources of information. The external visual scene perception models incorporate limitations in obtaining information. The auditory information channel contains a buffer to allow for storage in memory until that information can be processed.
Guidance of visual attention by semantic information in real-world scenes

PubMed Central

Wu, Chia-Chien; Wick, Farahnaz Ahmed; Pomplun, Marc

2014-01-01

Recent research on attentional guidance in real-world scenes has focused on object recognition within the context of a scene. This approach has been valuable for determining some factors that drive the allocation of visual attention and determine visual selection. This article provides a review of experimental work on how different components of context, especially semantic information, affect attentional deployment. We review work from the areas of object recognition, scene perception, and visual search, highlighting recent studies examining semantic structure in real-world scenes. A better understanding on how humans parse scene representations will not only improve current models of visual attention but also advance next-generation computer vision systems and human-computer interfaces. PMID:24567724
Large-Scale Analysis of Auditory Segregation Behavior Crowdsourced via a Smartphone App.

PubMed

Teki, Sundeep; Kumar, Sukhbinder; Griffiths, Timothy D

2016-01-01

The human auditory system is adept at detecting sound sources of interest from a complex mixture of several other simultaneous sounds. The ability to selectively attend to the speech of one speaker whilst ignoring other speakers and background noise is of vital biological significance-the capacity to make sense of complex 'auditory scenes' is significantly impaired in aging populations as well as those with hearing loss. We investigated this problem by designing a synthetic signal, termed the 'stochastic figure-ground' stimulus that captures essential aspects of complex sounds in the natural environment. Previously, we showed that under controlled laboratory conditions, young listeners sampled from the university subject pool (n = 10) performed very well in detecting targets embedded in the stochastic figure-ground signal. Here, we presented a modified version of this cocktail party paradigm as a 'game' featured in a smartphone app (The Great Brain Experiment) and obtained data from a large population with diverse demographical patterns (n = 5148). Despite differences in paradigms and experimental settings, the observed target-detection performance by users of the app was robust and consistent with our previous results from the psychophysical study. Our results highlight the potential use of smartphone apps in capturing robust large-scale auditory behavioral data from normal healthy volunteers, which can also be extended to study auditory deficits in clinical populations with hearing impairments and central auditory disorders.
Dimensionality of visual complexity in computer graphics scenes

NASA Astrophysics Data System (ADS)

Ramanarayanan, Ganesh; Bala, Kavita; Ferwerda, James A.; Walter, Bruce

2008-02-01

How do human observers perceive visual complexity in images? This problem is especially relevant for computer graphics, where a better understanding of visual complexity can aid in the development of more advanced rendering algorithms. In this paper, we describe a study of the dimensionality of visual complexity in computer graphics scenes. We conducted an experiment where subjects judged the relative complexity of 21 high-resolution scenes, rendered with photorealistic methods. Scenes were gathered from web archives and varied in theme, number and layout of objects, material properties, and lighting. We analyzed the subject responses using multidimensional scaling of pooled subject responses. This analysis embedded the stimulus images in a two-dimensional space, with axes that roughly corresponded to "numerosity" and "material / lighting complexity". In a follow-up analysis, we derived a one-dimensional complexity ordering of the stimulus images. We compared this ordering with several computable complexity metrics, such as scene polygon count and JPEG compression size, and did not find them to be very correlated. Understanding the differences between these measures can lead to the design of more efficient rendering algorithms in computer graphics.
Electrophysiological correlates of cocktail-party listening.

PubMed

Lewald, Jörg; Getzmann, Stephan

2015-10-01

Detecting, localizing, and selectively attending to a particular sound source of interest in complex auditory scenes composed of multiple competing sources is a remarkable capacity of the human auditory system. The neural basis of this so-called "cocktail-party effect" has remained largely unknown. Here, we studied the cortical network engaged in solving the "cocktail-party" problem, using event-related potentials (ERPs) in combination with two tasks demanding horizontal localization of a naturalistic target sound presented either in silence or in the presence of multiple competing sound sources. Presentation of multiple sound sources, as compared to single sources, induced an increased P1 amplitude, a reduction in N1, and a strong N2 component, resulting in a pronounced negativity in the ERP difference waveform (N2d) around 260 ms after stimulus onset. About 100 ms later, the anterior contralateral N2 subcomponent (N2ac) occurred in the multiple-sources condition, as computed from the amplitude difference for targets in the left minus right hemispaces. Cortical source analyses of the ERP modulation, resulting from the contrast of multiple vs. single sources, generally revealed an initial enhancement of electrical activity in right temporo-parietal areas, including auditory cortex, by multiple sources (at P1) that is followed by a reduction, with the primary sources shifting from right inferior parietal lobule (at N1) to left dorso-frontal cortex (at N2d). Thus, cocktail-party listening, as compared to single-source localization, appears to be based on a complex chronology of successive electrical activities within a specific cortical network involved in spatial hearing in complex situations. Copyright © 2015 Elsevier B.V. All rights reserved.
Association of auditory-verbal and visual hallucinations with impaired and improved recognition of colored pictures.

PubMed

Brébion, Gildas; Stephan-Otto, Christian; Usall, Judith; Huerta-Ramos, Elena; Perez del Olmo, Mireia; Cuevas-Esteban, Jorge; Haro, Josep Maria; Ochoa, Susana

2015-09-01

A number of cognitive underpinnings of auditory hallucinations have been established in schizophrenia patients, but few have, as yet, been uncovered for visual hallucinations. In previous research, we unexpectedly observed that auditory hallucinations were associated with poor recognition of color, but not black-and-white (b/w), pictures. In this study, we attempted to replicate and explain this finding. Potential associations with visual hallucinations were explored. B/w and color pictures were presented to 50 schizophrenia patients and 45 healthy individuals under 2 conditions of visual context presentation corresponding to 2 levels of visual encoding complexity. Then, participants had to recognize the target pictures among distractors. Auditory-verbal hallucinations were inversely associated with the recognition of the color pictures presented under the most effortful encoding condition. This association was fully mediated by working-memory span. Visual hallucinations were associated with improved recognition of the color pictures presented under the less effortful condition. Patients suffering from visual hallucinations were not impaired, relative to the healthy participants, in the recognition of these pictures. Decreased working-memory span in patients with auditory-verbal hallucinations might impede the effortful encoding of stimuli. Visual hallucinations might be associated with facilitation in the visual encoding of natural scenes, or with enhanced color perception abilities. (c) 2015 APA, all rights reserved).
Children's and Adults' Ability to Build Online Emotional Inferences during Comprehension of Audiovisual and Auditory Texts

ERIC Educational Resources Information Center

Diergarten, Anna Katharina; Nieding, Gerhild

2015-01-01

Two studies examined inferences drawn about the protagonist's emotional state in movies (Study 1) or audiobooks (Study 2). Children aged 5, 8, and 10 years old and adults took part. Participants saw or heard 20 movie scenes or sections of audiobooks taken or adapted from the TV show Lassie. An online measure of emotional inference was designed…
Acoustical Awareness for Intelligent Robotic Action

DTIC Science & Technology

2007-12-01

sound is desired or needed for some other purposes, but is interfering with the intended application, it is called noise. The Soundscape refers...to that which can be heard. Although often used interchangeably with the term Auditory Scene, the soundscape is a narrower definition, referring...difficult is the underlying complexity of the acoustical domain. The soundscape is always changing with time, more so than even the visual domain tends
Auditory training improves auditory performance in cochlear implanted children.

PubMed

Roman, Stephane; Rochette, Françoise; Triglia, Jean-Michel; Schön, Daniele; Bigand, Emmanuel

2016-07-01

While the positive benefits of pediatric cochlear implantation on language perception skills are now proven, the heterogeneity of outcomes remains high. The understanding of this heterogeneity and possible strategies to minimize it is of utmost importance. Our scope here is to test the effects of an auditory training strategy, "sound in Hands", using playful tasks grounded on the theoretical and empirical findings of cognitive sciences. Indeed, several basic auditory operations, such as auditory scene analysis (ASA) are not trained in the usual therapeutic interventions in deaf children. However, as they constitute a fundamental basis in auditory cognition, their development should imply general benefit in auditory processing and in turn enhance speech perception. The purpose of the present study was to determine whether cochlear implanted children could improve auditory performances in trained tasks and whether they could develop a transfer of learning to a phonetic discrimination test. Nineteen prelingually unilateral cochlear implanted children without additional handicap (4-10 year-olds) were recruited. The four main auditory cognitive processing (identification, discrimination, ASA and auditory memory) were stimulated and trained in the Experimental Group (EG) using Sound in Hands. The EG followed 20 training weekly sessions of 30 min and the untrained group was the control group (CG). Two measures were taken for both groups: before training (T1) and after training (T2). EG showed a significant improvement in the identification, discrimination and auditory memory tasks. The improvement in the ASA task did not reach significance. CG did not show any significant improvement in any of the tasks assessed. Most importantly, improvement was visible in the phonetic discrimination test for EG only. Moreover, younger children benefited more from the auditory training program to develop their phonetic abilities compared to older children, supporting the idea that rehabilitative care is most efficient when it takes place early on during childhood. These results are important to pinpoint the auditory deficits in CI children, to gather a better understanding of the links between basic auditory skills and speech perception which will in turn allow more efficient rehabilitative programs. Copyright © 2016 Elsevier B.V. All rights reserved.
Spatial Modulation Improves Performance in CTIS

NASA Technical Reports Server (NTRS)

Bearman, Gregory H.; Wilson, Daniel W.; Johnson, William R.

2009-01-01

Suitably formulated spatial modulation of a scene imaged by a computed-tomography imaging spectrometer (CTIS) has been found to be useful as a means of improving the imaging performance of the CTIS. As used here, "spatial modulation" signifies the imposition of additional, artificial structure on a scene from within the CTIS optics. The basic principles of a CTIS were described in "Improvements in Computed- Tomography Imaging Spectrometry" (NPO-20561) NASA Tech Briefs, Vol. 24, No. 12 (December 2000), page 38 and "All-Reflective Computed-Tomography Imaging Spectrometers" (NPO-20836), NASA Tech Briefs, Vol. 26, No. 11 (November 2002), page 7a. To recapitulate: A CTIS offers capabilities for imaging a scene with spatial, spectral, and temporal resolution. The spectral disperser in a CTIS is a two-dimensional diffraction grating. It is positioned between two relay lenses (or on one of two relay mirrors) in a video imaging system. If the disperser were removed, the system would produce ordinary images of the scene in its field of view. In the presence of the grating, the image on the focal plane of the system contains both spectral and spatial information because the multiple diffraction orders of the grating give rise to multiple, spectrally dispersed images of the scene. By use of algorithms adapted from computed tomography, the image on the focal plane can be processed into an image cube a three-dimensional collection of data on the image intensity as a function of the two spatial dimensions (x and y) in the scene and of wavelength (lambda). Thus, both spectrally and spatially resolved information on the scene at a given instant of time can be obtained, without scanning, from a single snapshot; this is what makes the CTIS such a potentially powerful tool for spatially, spectrally, and temporally resolved imaging. A CTIS performs poorly in imaging some types of scenes in particular, scenes that contain little spatial or spectral variation. The computed spectra of such scenes tend to approximate correct values to within acceptably small errors near the edges of the field of view but to be poor approximations away from the edges. The additional structure imposed on a scene according to the present method enables the CTIS algorithms to reconstruct acceptable approximations of the spectral data throughout the scene.
A new method for text detection and recognition in indoor scene for assisting blind people

NASA Astrophysics Data System (ADS)

Jabnoun, Hanen; Benzarti, Faouzi; Amiri, Hamid

2017-03-01

Developing assisting system of handicapped persons become a challenging ask in research projects. Recently, a variety of tools are designed to help visually impaired or blind people object as a visual substitution system. The majority of these tools are based on the conversion of input information into auditory or tactile sensory information. Furthermore, object recognition and text retrieval are exploited in the visual substitution systems. Text detection and recognition provides the description of the surrounding environments, so that the blind person can readily recognize the scene. In this work, we aim to introduce a method for detecting and recognizing text in indoor scene. The process consists on the detection of the regions of interest that should contain the text using the connected component. Then, the text detection is provided by employing the images correlation. This component of an assistive blind person should be simple, so that the users are able to obtain the most informative feedback within the shortest time.
The Auditory Kuleshov Effect: Multisensory Integration in Movie Editing.

PubMed

Baranowski, Andreas M; Hecht, H

2017-05-01

Almost a hundred years ago, the Russian filmmaker Lev Kuleshov conducted his now famous editing experiment in which different objects were added to a given film scene featuring a neutral face. It is said that the audience interpreted the unchanged facial expression as a function of the added object (e.g., an added soup made the face express hunger). This interaction effect has been dubbed "Kuleshov effect." In the current study, we explored the role of sound in the evaluation of facial expressions in films. Thirty participants watched different clips of faces that were intercut with neutral scenes, featuring either happy music, sad music, or no music at all. This was crossed with the facial expressions of happy, sad, or neutral. We found that the music significantly influenced participants' emotional judgments of facial expression. Thus, the intersensory effects of music are more specific than previously thought. They alter the evaluation of film scenes and can give meaning to ambiguous situations.
Toward a Neural Basis of Music Perception – A Review and Updated Model

PubMed Central

Koelsch, Stefan

2011-01-01

Music perception involves acoustic analysis, auditory memory, auditory scene analysis, processing of interval relations, of musical syntax and semantics, and activation of (pre)motor representations of actions. Moreover, music perception potentially elicits emotions, thus giving rise to the modulation of emotional effector systems such as the subjective feeling system, the autonomic nervous system, the hormonal, and the immune system. Building on a previous article (Koelsch and Siebel, 2005), this review presents an updated model of music perception and its neural correlates. The article describes processes involved in music perception, and reports EEG and fMRI studies that inform about the time course of these processes, as well as about where in the brain these processes might be located. PMID:21713060
Towards a neural basis of music perception.

PubMed

Koelsch, Stefan; Siebel, Walter A

2005-12-01

Music perception involves complex brain functions underlying acoustic analysis, auditory memory, auditory scene analysis, and processing of musical syntax and semantics. Moreover, music perception potentially affects emotion, influences the autonomic nervous system, the hormonal and immune systems, and activates (pre)motor representations. During the past few years, research activities on different aspects of music processing and their neural correlates have rapidly progressed. This article provides an overview of recent developments and a framework for the perceptual side of music processing. This framework lays out a model of the cognitive modules involved in music perception, and incorporates information about the time course of activity of some of these modules, as well as research findings about where in the brain these modules might be located.
Auditory presentation and synchronization in Adobe Flash and HTML5/JavaScript Web experiments.

PubMed

Reimers, Stian; Stewart, Neil

2016-09-01

Substantial recent research has examined the accuracy of presentation durations and response time measurements for visually presented stimuli in Web-based experiments, with a general conclusion that accuracy is acceptable for most kinds of experiments. However, many areas of behavioral research use auditory stimuli instead of, or in addition to, visual stimuli. Much less is known about auditory accuracy using standard Web-based testing procedures. We used a millisecond-accurate Black Box Toolkit to measure the actual durations of auditory stimuli and the synchronization of auditory and visual presentation onsets. We examined the distribution of timings for 100 presentations of auditory and visual stimuli across two computers with difference specs, three commonly used browsers, and code written in either Adobe Flash or JavaScript. We also examined different coding options for attempting to synchronize the auditory and visual onsets. Overall, we found that auditory durations were very consistent, but that the lags between visual and auditory onsets varied substantially across browsers and computer systems.
A Review of Auditory Prediction and Its Potential Role in Tinnitus Perception.

PubMed

Durai, Mithila; O'Keeffe, Mary G; Searchfield, Grant D

2018-06-01

The precise mechanisms underlying tinnitus perception and distress are still not fully understood. A recent proposition is that auditory prediction errors and related memory representations may play a role in driving tinnitus perception. It is of interest to further explore this. To obtain a comprehensive narrative synthesis of current research in relation to auditory prediction and its potential role in tinnitus perception and severity. A narrative review methodological framework was followed. The key words Prediction Auditory, Memory Prediction Auditory, Tinnitus AND Memory, Tinnitus AND Prediction in Article Title, Abstract, and Keywords were extensively searched on four databases: PubMed, Scopus, SpringerLink, and PsychINFO. All study types were selected from 2000-2016 (end of 2016) and had the following exclusion criteria applied: minimum age of participants <18, nonhuman participants, and article not available in English. Reference lists of articles were reviewed to identify any further relevant studies. Articles were short listed based on title relevance. After reading the abstracts and with consensus made between coauthors, a total of 114 studies were selected for charting data. The hierarchical predictive coding model based on the Bayesian brain hypothesis, attentional modulation and top-down feedback serves as the fundamental framework in current literature for how auditory prediction may occur. Predictions are integral to speech and music processing, as well as in sequential processing and identification of auditory objects during auditory streaming. Although deviant responses are observable from middle latency time ranges, the mismatch negativity (MMN) waveform is the most commonly studied electrophysiological index of auditory irregularity detection. However, limitations may apply when interpreting findings because of the debatable origin of the MMN and its restricted ability to model real-life, more complex auditory phenomenon. Cortical oscillatory band activity may act as neurophysiological substrates for auditory prediction. Tinnitus has been modeled as an auditory object which may demonstrate incomplete processing during auditory scene analysis resulting in tinnitus salience and therefore difficulty in habituation. Within the electrophysiological domain, there is currently mixed evidence regarding oscillatory band changes in tinnitus. There are theoretical proposals for a relationship between prediction error and tinnitus but few published empirical studies. American Academy of Audiology.
Thalamic and cortical pathways supporting auditory processing

PubMed Central

Lee, Charles C.

2012-01-01

The neural processing of auditory information engages pathways that begin initially at the cochlea and that eventually reach forebrain structures. At these higher levels, the computations necessary for extracting auditory source and identity information rely on the neuroanatomical connections between the thalamus and cortex. Here, the general organization of these connections in the medial geniculate body (thalamus) and the auditory cortex is reviewed. In addition, we consider two models organizing the thalamocortical pathways of the non-tonotopic and multimodal auditory nuclei. Overall, the transfer of information to the cortex via the thalamocortical pathways is complemented by the numerous intracortical and corticocortical pathways. Although interrelated, the convergent interactions among thalamocortical, corticocortical, and commissural pathways enable the computations necessary for the emergence of higher auditory perception. PMID:22728130
[In Process Citation

PubMed

Ackermann; Mathiak

1999-11-01

Pure word deafness (auditory verbal agnosia) is characterized by an impairment of auditory comprehension, repetition of verbal material and writing to dictation whereas spontaneous speech production and reading largely remain unaffected. Sometimes, this syndrome is preceded by complete deafness (cortical deafness) of varying duration. Perception of vowels and suprasegmental features of verbal utterances (e.g., intonation contours) seems to be less disrupted than the processing of consonants and, therefore, might mediate residual auditory functions. Often, lip reading and/or slowing of speaking rate allow within some limits to compensate for speech comprehension deficits. Apart from a few exceptions, the available reports of pure word deafness documented a bilateral temporal lesion. In these instances, as a rule, identification of nonverbal (environmental) sounds, perception of music, temporal resolution of sequential auditory cues and/or spatial localization of acoustic events were compromised as well. The observed variable constellation of auditory signs and symptoms in central hearing disorders following bilateral temporal disorders, most probably, reflects the multitude of functional maps at the level of the auditory cortices subserving, as documented in a variety of non-human species, the encoding of specific stimulus parameters each. Thus, verbal/nonverbal auditory agnosia may be considered a paradigm of distorted "auditory scene analysis" (Bregman 1990) affecting both primitive and schema-based perceptual processes. It cannot be excluded, however, that disconnection of the Wernicke-area from auditory input (Geschwind 1965) and/or an impairment of suggested "phonetic module" (Liberman 1996) contribute to the observed deficits as well. Conceivably, these latter mechanisms underly the rare cases of pure word deafness following a lesion restricted to the dominant hemisphere. Only few instances of a rather isolated disruption of the discrimination/identification of nonverbal sound sources, in the presence of uncompromised speech comprehension, have been reported so far (nonverbal auditory agnosia). As a rule, unilateral right-sided damage has been found to be the relevant lesion.
Auditory Environment Across the Life Span of Cochlear Implant Users: Insights From Data Logging.

PubMed

Busch, Tobias; Vanpoucke, Filiep; van Wieringen, Astrid

2017-05-24

We describe the natural auditory environment of people with cochlear implants (CIs), how it changes across the life span, and how it varies between individuals. We performed a retrospective cross-sectional analysis of Cochlear Nucleus 6 CI sound-processor data logs. The logs were obtained from 1,501 people with CIs (ages 0-96 years). They covered over 2.4 million hr of implant use and indicated how much time the CI users had spent in various acoustical environments. We investigated exposure to spoken language, noise, music, and quiet, and analyzed variation between age groups, users, and countries. CI users spent a substantial part of their daily life in noisy environments. As a consequence, most speech was presented in background noise. We found significant differences between age groups for all auditory scenes. Yet even within the same age group and country, variability between individuals was substantial. Regardless of their age, people with CIs face challenging acoustical environments in their daily life. Our results underline the importance of supporting them with assistive listening technology. Moreover, we found large differences between individuals' auditory diets that might contribute to differences in rehabilitation outcomes. Their causes and effects should be investigated further.
Perception of Complex Auditory Scenes

DTIC Science & Technology

2014-07-02

Simpson, B. D., & Romigh, G., (2014). “Ear dominance in a dichotic cocktail party .” Journal of the Association for Research in Otolaryngology, Abstract...B. D., & Romigh, G. (2014). Ear dominance in a dichotic cocktail party . Journal of the Association for Research in Otolaryngology, Abstract 37, p...dominance in a dichotic cocktail party .” Journal of the Association for Research in Otolaryngology, Abstract 37, p 518. Cherry, E. C. (1953). Some

Selective attention in normal and impaired hearing.

PubMed

Shinn-Cunningham, Barbara G; Best, Virginia

2008-12-01

A common complaint among listeners with hearing loss (HL) is that they have difficulty communicating in common social settings. This article reviews how normal-hearing listeners cope in such settings, especially how they focus attention on a source of interest. Results of experiments with normal-hearing listeners suggest that the ability to selectively attend depends on the ability to analyze the acoustic scene and to form perceptual auditory objects properly. Unfortunately, sound features important for auditory object formation may not be robustly encoded in the auditory periphery of HL listeners. In turn, impaired auditory object formation may interfere with the ability to filter out competing sound sources. Peripheral degradations are also likely to reduce the salience of higher-order auditory cues such as location, pitch, and timbre, which enable normal-hearing listeners to select a desired sound source out of a sound mixture. Degraded peripheral processing is also likely to increase the time required to form auditory objects and focus selective attention so that listeners with HL lose the ability to switch attention rapidly (a skill that is particularly important when trying to participate in a lively conversation). Finally, peripheral deficits may interfere with strategies that normal-hearing listeners employ in complex acoustic settings, including the use of memory to fill in bits of the conversation that are missed. Thus, peripheral hearing deficits are likely to cause a number of interrelated problems that challenge the ability of HL listeners to communicate in social settings requiring selective attention.
Selective Attention in Normal and Impaired Hearing

PubMed Central

Shinn-Cunningham, Barbara G.; Best, Virginia

2008-01-01

A common complaint among listeners with hearing loss (HL) is that they have difficulty communicating in common social settings. This article reviews how normal-hearing listeners cope in such settings, especially how they focus attention on a source of interest. Results of experiments with normal-hearing listeners suggest that the ability to selectively attend depends on the ability to analyze the acoustic scene and to form perceptual auditory objects properly. Unfortunately, sound features important for auditory object formation may not be robustly encoded in the auditory periphery of HL listeners. In turn, impaired auditory object formation may interfere with the ability to filter out competing sound sources. Peripheral degradations are also likely to reduce the salience of higher-order auditory cues such as location, pitch, and timbre, which enable normal-hearing listeners to select a desired sound source out of a sound mixture. Degraded peripheral processing is also likely to increase the time required to form auditory objects and focus selective attention so that listeners with HL lose the ability to switch attention rapidly (a skill that is particularly important when trying to participate in a lively conversation). Finally, peripheral deficits may interfere with strategies that normal-hearing listeners employ in complex acoustic settings, including the use of memory to fill in bits of the conversation that are missed. Thus, peripheral hearing deficits are likely to cause a number of interrelated problems that challenge the ability of HL listeners to communicate in social settings requiring selective attention. PMID:18974202
Large-scale synchronized activity during vocal deviance detection in the zebra finch auditory forebrain.

PubMed

Beckers, Gabriël J L; Gahr, Manfred

2012-08-01

Auditory systems bias responses to sounds that are unexpected on the basis of recent stimulus history, a phenomenon that has been widely studied using sequences of unmodulated tones (mismatch negativity; stimulus-specific adaptation). Such a paradigm, however, does not directly reflect problems that neural systems normally solve for adaptive behavior. We recorded multiunit responses in the caudomedial auditory forebrain of anesthetized zebra finches (Taeniopygia guttata) at 32 sites simultaneously, to contact calls that recur probabilistically at a rate that is used in communication. Neurons in secondary, but not primary, auditory areas respond preferentially to calls when they are unexpected (deviant) compared with the same calls when they are expected (standard). This response bias is predominantly due to sites more often not responding to standard events than to deviant events. When two call stimuli alternate between standard and deviant roles, most sites exhibit a response bias to deviant events of both stimuli. This suggests that biases are not based on a use-dependent decrease in response strength but involve a more complex mechanism that is sensitive to auditory deviance per se. Furthermore, between many secondary sites, responses are tightly synchronized, a phenomenon that is driven by internal neuronal interactions rather than by the timing of stimulus acoustic features. We hypothesize that this deviance-sensitive, internally synchronized network of neurons is involved in the involuntary capturing of attention by unexpected and behaviorally potentially relevant events in natural auditory scenes.
Modern Approaches to the Computation of the Probability of Target Detection in Cluttered Environments

NASA Astrophysics Data System (ADS)

Meitzler, Thomas J.

The field of computer vision interacts with fields such as psychology, vision research, machine vision, psychophysics, mathematics, physics, and computer science. The focus of this thesis is new algorithms and methods for the computation of the probability of detection (Pd) of a target in a cluttered scene. The scene can be either a natural visual scene such as one sees with the naked eye (visual), or, a scene displayed on a monitor with the help of infrared sensors. The relative clutter and the temperature difference between the target and background (DeltaT) are defined and then used to calculate a relative signal -to-clutter ratio (SCR) from which the Pd is calculated for a target in a cluttered scene. It is shown how this definition can include many previous definitions of clutter and (DeltaT). Next, fuzzy and neural -fuzzy techniques are used to calculate the Pd and it is shown how these methods can give results that have a good correlation with experiment. The experimental design for actually measuring the Pd of a target by observers is described. Finally, wavelets are applied to the calculation of clutter and it is shown how this new definition of clutter based on wavelets can be used to compute the Pd of a target.
The forensic holodeck: an immersive display for forensic crime scene reconstructions.

PubMed

Ebert, Lars C; Nguyen, Tuan T; Breitbeck, Robert; Braun, Marcel; Thali, Michael J; Ross, Steffen

2014-12-01

In forensic investigations, crime scene reconstructions are created based on a variety of three-dimensional image modalities. Although the data gathered are three-dimensional, their presentation on computer screens and paper is two-dimensional, which incurs a loss of information. By applying immersive virtual reality (VR) techniques, we propose a system that allows a crime scene to be viewed as if the investigator were present at the scene. We used a low-cost VR headset originally developed for computer gaming in our system. The headset offers a large viewing volume and tracks the user's head orientation in real-time, and an optical tracker is used for positional information. In addition, we created a crime scene reconstruction to demonstrate the system. In this article, we present a low-cost system that allows immersive, three-dimensional and interactive visualization of forensic incident scene reconstructions.
Technological Areas to Improve Soldier Decisiveness: Insights From the Soldier-System Design Perspective

DTIC Science & Technology

2012-03-01

learning state of the Soldier (e.g., frustrated, confused, engaged), to select the best learning strategies (e.g., feedback, reflection, hints), and...targeted to areas of weakness. This training can be enhanced by the use of “intelligent” agents to perceive learner attributes (e.g., competence...auditory scene would be made, and outlying objects and sounds, or missing activity, could be automatically identified and displayed aurally or visually
Hidden Hearing Loss and Computational Models of the Auditory Pathway: Predicting Speech Intelligibility Decline

DTIC Science & Technology

2016-11-28

of low spontaneous rate auditory nerve fibers (ANFs) and reduction of auditory brainstem response wave-I amplitudes. The goal of this research is...auditory nerve (AN) responses to speech stimuli under a variety of difficult listening conditions. The resulting cochlear neurogram, a spectrogram
Content Representation in the Human Medial Temporal Lobe

PubMed Central

Liang, Jackson C.; Wagner, Anthony D.

2013-01-01

Current theories of medial temporal lobe (MTL) function focus on event content as an important organizational principle that differentiates MTL subregions. Perirhinal and parahippocampal cortices may play content-specific roles in memory, whereas hippocampal processing is alternately hypothesized to be content specific or content general. Despite anatomical evidence for content-specific MTL pathways, empirical data for content-based MTL subregional dissociations are mixed. Here, we combined functional magnetic resonance imaging with multiple statistical approaches to characterize MTL subregional responses to different classes of novel event content (faces, scenes, spoken words, sounds, visual words). Univariate analyses revealed that responses to novel faces and scenes were distributed across the anterior–posterior axis of MTL cortex, with face responses distributed more anteriorly than scene responses. Moreover, multivariate pattern analyses of perirhinal and parahippocampal data revealed spatially organized representational codes for multiple content classes, including nonpreferred visual and auditory stimuli. In contrast, anterior hippocampal responses were content general, with less accurate overall pattern classification relative to MTL cortex. Finally, posterior hippocampal activation patterns consistently discriminated scenes more accurately than other forms of content. Collectively, our findings indicate differential contributions of MTL subregions to event representation via a distributed code along the anterior–posterior axis of MTL that depends on the nature of event content. PMID:22275474
An analysis of nonlinear dynamics underlying neural activity related to auditory induction in the rat auditory cortex.

PubMed

Noto, M; Nishikawa, J; Tateno, T

2016-03-24

A sound interrupted by silence is perceived as discontinuous. However, when high-intensity noise is inserted during the silence, the missing sound may be perceptually restored and be heard as uninterrupted. This illusory phenomenon is called auditory induction. Recent electrophysiological studies have revealed that auditory induction is associated with the primary auditory cortex (A1). Although experimental evidence has been accumulating, the neural mechanisms underlying auditory induction in A1 neurons are poorly understood. To elucidate this, we used both experimental and computational approaches. First, using an optical imaging method, we characterized population responses across auditory cortical fields to sound and identified five subfields in rats. Next, we examined neural population activity related to auditory induction with high temporal and spatial resolution in the rat auditory cortex (AC), including the A1 and several other AC subfields. Our imaging results showed that tone-burst stimuli interrupted by a silent gap elicited early phasic responses to the first tone and similar or smaller responses to the second tone following the gap. In contrast, tone stimuli interrupted by broadband noise (BN), considered to cause auditory induction, considerably suppressed or eliminated responses to the tone following the noise. Additionally, tone-burst stimuli that were interrupted by notched noise centered at the tone frequency, which is considered to decrease the strength of auditory induction, partially restored the second responses from the suppression caused by BN. To phenomenologically mimic the neural population activity in the A1 and thus investigate the mechanisms underlying auditory induction, we constructed a computational model from the periphery through the AC, including a nonlinear dynamical system. The computational model successively reproduced some of the above-mentioned experimental results. Therefore, our results suggest that a nonlinear, self-exciting system is a key element for qualitatively reproducing A1 population activity and to understand the underlying mechanisms. Copyright © 2016 IBRO. Published by Elsevier Ltd. All rights reserved.
Efficient coding of spectrotemporal binaural sounds leads to emergence of the auditory space representation

PubMed Central

Młynarski, Wiktor

2014-01-01

To date a number of studies have shown that receptive field shapes of early sensory neurons can be reproduced by optimizing coding efficiency of natural stimulus ensembles. A still unresolved question is whether the efficient coding hypothesis explains formation of neurons which explicitly represent environmental features of different functional importance. This paper proposes that the spatial selectivity of higher auditory neurons emerges as a direct consequence of learning efficient codes for natural binaural sounds. Firstly, it is demonstrated that a linear efficient coding transform—Independent Component Analysis (ICA) trained on spectrograms of naturalistic simulated binaural sounds extracts spatial information present in the signal. A simple hierarchical ICA extension allowing for decoding of sound position is proposed. Furthermore, it is shown that units revealing spatial selectivity can be learned from a binaural recording of a natural auditory scene. In both cases a relatively small subpopulation of learned spectrogram features suffices to perform accurate sound localization. Representation of the auditory space is therefore learned in a purely unsupervised way by maximizing the coding efficiency and without any task-specific constraints. This results imply that efficient coding is a useful strategy for learning structures which allow for making behaviorally vital inferences about the environment. PMID:24639644
Three-dimensional scene encryption and display based on computer-generated holograms.

PubMed

Kong, Dezhao; Cao, Liangcai; Jin, Guofan; Javidi, Bahram

2016-10-10

An optical encryption and display method for a three-dimensional (3D) scene is proposed based on computer-generated holograms (CGHs) using a single phase-only spatial light modulator. The 3D scene is encoded as one complex Fourier CGH. The Fourier CGH is then decomposed into two phase-only CGHs with random distributions by the vector stochastic decomposition algorithm. Two CGHs are interleaved as one final phase-only CGH for optical encryption and reconstruction. The proposed method can support high-level nonlinear optical 3D scene security and complex amplitude modulation of the optical field. The exclusive phase key offers strong resistances of decryption attacks. Experimental results demonstrate the validity of the novel method.
Listeners' expectation of room acoustical parameters based on visual cues

NASA Astrophysics Data System (ADS)

Valente, Daniel L.

Despite many studies investigating auditory spatial impressions in rooms, few have addressed the impact of simultaneous visual cues on localization and the perception of spaciousness. The current research presents an immersive audio-visual study, in which participants are instructed to make spatial congruency and quantity judgments in dynamic cross-modal environments. The results of these psychophysical tests suggest the importance of consilient audio-visual presentation to the legibility of an auditory scene. Several studies have looked into audio-visual interaction in room perception in recent years, but these studies rely on static images, speech signals, or photographs alone to represent the visual scene. Building on these studies, the aim is to propose a testing method that uses monochromatic compositing (blue-screen technique) to position a studio recording of a musical performance in a number of virtual acoustical environments and ask subjects to assess these environments. In the first experiment of the study, video footage was taken from five rooms varying in physical size from a small studio to a small performance hall. Participants were asked to perceptually align two distinct acoustical parameters---early-to-late reverberant energy ratio and reverberation time---of two solo musical performances in five contrasting visual environments according to their expectations of how the room should sound given its visual appearance. In the second experiment in the study, video footage shot from four different listening positions within a general-purpose space was coupled with sounds derived from measured binaural impulse responses (IRs). The relationship between the presented image, sound, and virtual receiver position was examined. It was found that many visual cues caused different perceived events of the acoustic environment. This included the visual attributes of the space in which the performance was located as well as the visual attributes of the performer. The addressed visual makeup of the performer included: (1) an actual video of the performance, (2) a surrogate image of the performance, for example a loudspeaker's image reproducing the performance, (3) no visual image of the performance (empty room), or (4) a multi-source visual stimulus (actual video of the performance coupled with two images of loudspeakers positioned to the left and right of the performer). For this experiment, perceived auditory events of sound were measured in terms of two subjective spatial metrics: Listener Envelopment (LEV) and Apparent Source Width (ASW) These metrics were hypothesized to be dependent on the visual imagery of the presented performance. Data was also collected by participants matching direct and reverberant sound levels for the presented audio-visual scenes. In the final experiment, participants judged spatial expectations of an ensemble of musicians presented in the five physical spaces from Experiment 1. Supporting data was accumulated in two stages. First, participants were given an audio-visual matching test, in which they were instructed to align the auditory width of a performing ensemble to a varying set of audio and visual cues. In the second stage, a conjoint analysis design paradigm was explored to extrapolate the relative magnitude of explored audio-visual factors in affecting three assessed response criteria: Congruency (the perceived match-up of the auditory and visual cues in the assessed performance), ASW and LEV. Results show that both auditory and visual factors affect the collected responses, and that the two sensory modalities coincide in distinct interactions. This study reveals participant resiliency in the presence of forced auditory-visual mismatch: Participants are able to adjust the acoustic component of the cross-modal environment in a statistically similar way despite randomized starting values for the monitored parameters. Subjective results of the experiments are presented along with objective measurements for verification.
Listening Into 2030 Workshop: An Experiment in Envisioning the Future of Hearing and Communication Science

PubMed Central

Carlile, Simon; Ciccarelli, Gregory; Cockburn, Jane; Diedesch, Anna C.; Finnegan, Megan K.; Hafter, Ervin; Henin, Simon; Kalluri, Sridhar; Kell, Alexander J. E.; Ozmeral, Erol J.; Roark, Casey L.

2017-01-01

Here we report the methods and output of a workshop examining possible futures of speech and hearing science out to 2030. Using a design thinking approach, a range of human-centered problems in communication were identified that could provide the motivation for a wide range of research. Nine main research programs were distilled and are summarized: (a) measuring brain and other physiological parameters, (b) auditory and multimodal displays of information, (c) auditory scene analysis, (d) enabling and understanding shared auditory virtual spaces, (e) holistic approaches to health management and hearing impairment, (f) universal access to evolving and individualized technologies, (g) biological intervention for hearing dysfunction, (h) understanding the psychosocial interactions with technology and other humans as mediated by technology, and (i) the impact of changing models of security and privacy. The design thinking approach attempted to link the judged level of importance of different research areas to the “end in mind” through empathy for the real-life problems embodied in the personas created during the workshop. PMID:29090640
Reducing involuntary memory by interfering consolidation of stressful auditory information: A pilot study.

PubMed

Tabrizi, Fara; Jansson, Billy

2016-03-01

Intrusive emotional memories were induced by aversive auditory stimuli and modulated with cognitive tasks performed post-encoding (i.e., during consolidation). A between-subjects design was used with four conditions; three consolidation-interference tasks (a visuospatial and two verbal interference tasks) and a no-task control condition. Forty-one participants listened to a soundtrack depicting traumatic scenes (e.g., police brutality, torture and rape). Immediately after listening to the soundtrack, the subjects completed a randomly assigned task for 10 min. Intrusions from the soundtrack were reported in a diary during the following seven-day period. In line with a modality-specific approach to intrusion modulation, auditory intrusions were reduced by verbal tasks compared to both a no-task and a visuospatial interference task.. The study did not control for individual differences in imagery ability which may be a feature in intrusion development. The results provide an increased understanding of how intrusive mental images can be modulated which may have implications for preventive treatment.. Copyright © 2015 Elsevier Ltd. All rights reserved.
Incorporating Auditory Models in Speech/Audio Applications

NASA Astrophysics Data System (ADS)

Krishnamoorthi, Harish

2011-12-01

Following the success in incorporating perceptual models in audio coding algorithms, their application in other speech/audio processing systems is expanding. In general, all perceptual speech/audio processing algorithms involve minimization of an objective function that directly/indirectly incorporates properties of human perception. This dissertation primarily investigates the problems associated with directly embedding an auditory model in the objective function formulation and proposes possible solutions to overcome high complexity issues for use in real-time speech/audio algorithms. Specific problems addressed in this dissertation include: 1) the development of approximate but computationally efficient auditory model implementations that are consistent with the principles of psychoacoustics, 2) the development of a mapping scheme that allows synthesizing a time/frequency domain representation from its equivalent auditory model output. The first problem is aimed at addressing the high computational complexity involved in solving perceptual objective functions that require repeated application of auditory model for evaluation of different candidate solutions. In this dissertation, a frequency pruning and a detector pruning algorithm is developed that efficiently implements the various auditory model stages. The performance of the pruned model is compared to that of the original auditory model for different types of test signals in the SQAM database. Experimental results indicate only a 4-7% relative error in loudness while attaining up to 80-90 % reduction in computational complexity. Similarly, a hybrid algorithm is developed specifically for use with sinusoidal signals and employs the proposed auditory pattern combining technique together with a look-up table to store representative auditory patterns. The second problem obtains an estimate of the auditory representation that minimizes a perceptual objective function and transforms the auditory pattern back to its equivalent time/frequency representation. This avoids the repeated application of auditory model stages to test different candidate time/frequency vectors in minimizing perceptual objective functions. In this dissertation, a constrained mapping scheme is developed by linearizing certain auditory model stages that ensures obtaining a time/frequency mapping corresponding to the estimated auditory representation. This paradigm was successfully incorporated in a perceptual speech enhancement algorithm and a sinusoidal component selection task.
Speech-in-noise perception deficit in adults with dyslexia: effects of background type and listening configuration.

PubMed

Dole, Marjorie; Hoen, Michel; Meunier, Fanny

2012-06-01

Developmental dyslexia is associated with impaired speech-in-noise perception. The goal of the present research was to further characterize this deficit in dyslexic adults. In order to specify the mechanisms and processing strategies used by adults with dyslexia during speech-in-noise perception, we explored the influence of background type, presenting single target-words against backgrounds made of cocktail party sounds, modulated speech-derived noise or stationary noise. We also evaluated the effect of three listening configurations differing in terms of the amount of spatial processing required. In a monaural condition, signal and noise were presented to the same ear while in a dichotic situation, target and concurrent sound were presented to two different ears, finally in a spatialised configuration, target and competing signals were presented as if they originated from slightly differing positions in the auditory scene. Our results confirm the presence of a speech-in-noise perception deficit in dyslexic adults, in particular when the competing signal is also speech, and when both signals are presented to the same ear, an observation potentially relating to phonological accounts of dyslexia. However, adult dyslexics demonstrated better levels of spatial release of masking than normal reading controls when the background was speech, suggesting that they are well able to rely on denoising strategies based on spatial auditory scene analysis strategies. Copyright © 2012 Elsevier Ltd. All rights reserved.
Treefrogs as Animal Models for Research on Auditory Scene Analysis and the Cocktail Party Problem

PubMed Central

Bee, Mark A.

2014-01-01

The perceptual analysis of acoustic scenes involves binding together sounds from the same source and separating them from other sounds in the environment. In large social groups, listeners experience increased difficulty performing these tasks due to high noise levels and interference from the concurrent signals of multiple individuals. While a substantial body of literature on these issues pertains to human hearing and speech communication, few studies have investigated how nonhuman animals may be evolutionarily adapted to solve biologically analogous communication problems. Here, I review recent and ongoing work aimed at testing hypotheses about perceptual mechanisms that enable treefrogs in the genus Hyla to communicate vocally in noisy, multi-source social environments. After briefly introducing the genus and the methods used to study hearing in frogs, I outline several functional constraints on communication posed by the acoustic environment of breeding “choruses”. Then, I review studies of sound source perception aimed at uncovering how treefrog listeners may be adapted to cope with these constraints. Specifically, this review covers research on the acoustic cues used in sequential and simultaneous auditory grouping, spatial release from masking, and dip listening. Throughout the paper, I attempt to illustrate how broad-scale, comparative studies of carefully considered animal models may ultimately reveal an evolutionary diversity of underlying mechanisms for solving cocktail-party-like problems in communication. PMID:24424243
Computational mechanisms underlying cortical responses to the affordance properties of visual scenes

PubMed Central

Epstein, Russell A.

2018-01-01

Biologically inspired deep convolutional neural networks (CNNs), trained for computer vision tasks, have been found to predict cortical responses with remarkable accuracy. However, the internal operations of these models remain poorly understood, and the factors that account for their success are unknown. Here we develop a set of techniques for using CNNs to gain insights into the computational mechanisms underlying cortical responses. We focused on responses in the occipital place area (OPA), a scene-selective region of dorsal occipitoparietal cortex. In a previous study, we showed that fMRI activation patterns in the OPA contain information about the navigational affordances of scenes; that is, information about where one can and cannot move within the immediate environment. We hypothesized that this affordance information could be extracted using a set of purely feedforward computations. To test this idea, we examined a deep CNN with a feedforward architecture that had been previously trained for scene classification. We found that responses in the CNN to scene images were highly predictive of fMRI responses in the OPA. Moreover the CNN accounted for the portion of OPA variance relating to the navigational affordances of scenes. The CNN could thus serve as an image-computable candidate model of affordance-related responses in the OPA. We then ran a series of in silico experiments on this model to gain insights into its internal operations. These analyses showed that the computation of affordance-related features relied heavily on visual information at high-spatial frequencies and cardinal orientations, both of which have previously been identified as low-level stimulus preferences of scene-selective visual cortex. These computations also exhibited a strong preference for information in the lower visual field, which is consistent with known retinotopic biases in the OPA. Visualizations of feature selectivity within the CNN suggested that affordance-based responses encoded features that define the layout of the spatial environment, such as boundary-defining junctions and large extended surfaces. Together, these results map the sensory functions of the OPA onto a fully quantitative model that provides insights into its visual computations. More broadly, they advance integrative techniques for understanding visual cortex across multiple level of analysis: from the identification of cortical sensory functions to the modeling of their underlying algorithms. PMID:29684011
Predictive Ensemble Decoding of Acoustical Features Explains Context-Dependent Receptive Fields.

PubMed

Yildiz, Izzet B; Mesgarani, Nima; Deneve, Sophie

2016-12-07

A primary goal of auditory neuroscience is to identify the sound features extracted and represented by auditory neurons. Linear encoding models, which describe neural responses as a function of the stimulus, have been primarily used for this purpose. Here, we provide theoretical arguments and experimental evidence in support of an alternative approach, based on decoding the stimulus from the neural response. We used a Bayesian normative approach to predict the responses of neurons detecting relevant auditory features, despite ambiguities and noise. We compared the model predictions to recordings from the primary auditory cortex of ferrets and found that: (1) the decoding filters of auditory neurons resemble the filters learned from the statistics of speech sounds; (2) the decoding model captures the dynamics of responses better than a linear encoding model of similar complexity; and (3) the decoding model accounts for the accuracy with which the stimulus is represented in neural activity, whereas linear encoding model performs very poorly. Most importantly, our model predicts that neuronal responses are fundamentally shaped by "explaining away," a divisive competition between alternative interpretations of the auditory scene. Neural responses in the auditory cortex are dynamic, nonlinear, and hard to predict. Traditionally, encoding models have been used to describe neural responses as a function of the stimulus. However, in addition to external stimulation, neural activity is strongly modulated by the responses of other neurons in the network. We hypothesized that auditory neurons aim to collectively decode their stimulus. In particular, a stimulus feature that is decoded (or explained away) by one neuron is not explained by another. We demonstrated that this novel Bayesian decoding model is better at capturing the dynamic responses of cortical neurons in ferrets. Whereas the linear encoding model poorly reflects selectivity of neurons, the decoding model can account for the strong nonlinearities observed in neural data. Copyright © 2016 Yildiz et al.
Qualitative spatial logic descriptors from 3D indoor scenes to generate explanations in natural language.

PubMed

Falomir, Zoe; Kluth, Thomas

2017-06-24

The challenge of describing 3D real scenes is tackled in this paper using qualitative spatial descriptors. A key point to study is which qualitative descriptors to use and how these qualitative descriptors must be organized to produce a suitable cognitive explanation. In order to find answers, a survey test was carried out with human participants which openly described a scene containing some pieces of furniture. The data obtained in this survey are analysed, and taking this into account, the QSn3D computational approach was developed which uses a XBox 360 Kinect to obtain 3D data from a real indoor scene. Object features are computed on these 3D data to identify objects in indoor scenes. The object orientation is computed, and qualitative spatial relations between the objects are extracted. These qualitative spatial relations are the input to a grammar which applies saliency rules obtained from the survey study and generates cognitive natural language descriptions of scenes. Moreover, these qualitative descriptors can be expressed as first-order logical facts in Prolog for further reasoning. Finally, a validation study is carried out to test whether the descriptions provided by QSn3D approach are human readable. The obtained results show that their acceptability is higher than 82%.

EEG signatures accompanying auditory figure-ground segregation

PubMed Central

Tóth, Brigitta; Kocsis, Zsuzsanna; Háden, Gábor P.; Szerafin, Ágnes; Shinn-Cunningham, Barbara; Winkler, István

2017-01-01

In everyday acoustic scenes, figure-ground segregation typically requires one to group together sound elements over both time and frequency. Electroencephalogram was recorded while listeners detected repeating tonal complexes composed of a random set of pure tones within stimuli consisting of randomly varying tonal elements. The repeating pattern was perceived as a figure over the randomly changing background. It was found that detection performance improved both as the number of pure tones making up each repeated complex (figure coherence) increased, and as the number of repeated complexes (duration) increased – i.e., detection was easier when either the spectral or temporal structure of the figure was enhanced. Figure detection was accompanied by the elicitation of the object related negativity (ORN) and the P400 event-related potentials (ERPs), which have been previously shown to be evoked by the presence of two concurrent sounds. Both ERP components had generators within and outside of auditory cortex. The amplitudes of the ORN and the P400 increased with both figure coherence and figure duration. However, only the P400 amplitude correlated with detection performance. These results suggest that 1) the ORN and P400 reflect processes involved in detecting the emergence of a new auditory object in the presence of other concurrent auditory objects; 2) the ORN corresponds to the likelihood of the presence of two or more concurrent sound objects, whereas the P400 reflects the perceptual recognition of the presence of multiple auditory objects and/or preparation for reporting the detection of a target object. PMID:27421185
Stimulus-specific adaptation and deviance detection in the inferior colliculus

PubMed Central

Ayala, Yaneri A.; Malmierca, Manuel S.

2013-01-01

Deviancy detection in the continuous flow of sensory information into the central nervous system is of vital importance for animals. The task requires neuronal mechanisms that allow for an efficient representation of the environment by removing statistically redundant signals. Recently, the neuronal principles of auditory deviance detection have been approached by studying the phenomenon of stimulus-specific adaptation (SSA). SSA is a reduction in the responsiveness of a neuron to a common or repetitive sound while the neuron remains highly sensitive to rare sounds (Ulanovsky et al., 2003). This phenomenon could enhance the saliency of unexpected, deviant stimuli against a background of repetitive signals. SSA shares many similarities with the evoked potential known as the “mismatch negativity,” (MMN) and it has been linked to cognitive process such as auditory memory and scene analysis (Winkler et al., 2009) as well as to behavioral habituation (Netser et al., 2011). Neurons exhibiting SSA can be found at several levels of the auditory pathway, from the inferior colliculus (IC) up to the auditory cortex (AC). In this review, we offer an account of the state-of-the art of SSA studies in the IC with the aim of contributing to the growing interest in the single-neuron electrophysiology of auditory deviance detection. The dependence of neuronal SSA on various stimulus features, e.g., probability of the deviant stimulus and repetition rate, and the roles of the AC and inhibition in shaping SSA at the level of the IC are addressed. PMID:23335883
Automatic acquisition of motion trajectories: tracking hockey players

NASA Astrophysics Data System (ADS)

Okuma, Kenji; Little, James J.; Lowe, David

2003-12-01

Computer systems that have the capability of analyzing complex and dynamic scenes play an essential role in video annotation. Scenes can be complex in such a way that there are many cluttered objects with different colors, shapes and sizes, and can be dynamic with multiple interacting moving objects and a constantly changing background. In reality, there are many scenes that are complex, dynamic, and challenging enough for computers to describe. These scenes include games of sports, air traffic, car traffic, street intersections, and cloud transformations. Our research is about the challenge of inventing a descriptive computer system that analyzes scenes of hockey games where multiple moving players interact with each other on a constantly moving background due to camera motions. Ultimately, such a computer system should be able to acquire reliable data by extracting the players" motion as their trajectories, querying them by analyzing the descriptive information of data, and predict the motions of some hockey players based on the result of the query. Among these three major aspects of the system, we primarily focus on visual information of the scenes, that is, how to automatically acquire motion trajectories of hockey players from video. More accurately, we automatically analyze the hockey scenes by estimating parameters (i.e., pan, tilt, and zoom) of the broadcast cameras, tracking hockey players in those scenes, and constructing a visual description of the data by displaying trajectories of those players. Many technical problems in vision such as fast and unpredictable players' motions and rapid camera motions make our challenge worth tackling. To the best of our knowledge, there have not been any automatic video annotation systems for hockey developed in the past. Although there are many obstacles to overcome, our efforts and accomplishments would hopefully establish the infrastructure of the automatic hockey annotation system and become a milestone for research in automatic video annotation in this domain.
Scene-based nonuniformity corrections for optical and SWIR pushbroom sensors.

PubMed

Leathers, Robert; Downes, Trijntje; Priest, Richard

2005-06-27

We propose and evaluate several scene-based methods for computing nonuniformity corrections for visible or near-infrared pushbroom sensors. These methods can be used to compute new nonuniformity correction values or to repair or refine existing radiometric calibrations. For a given data set, the preferred method depends on the quality of the data, the type of scenes being imaged, and the existence and quality of a laboratory calibration. We demonstrate our methods with data from several different sensor systems and provide a generalized approach to be taken for any new data set.
Assessing Auditory Discrimination Skill of Malay Children Using Computer-based Method.

PubMed

Ting, H; Yunus, J; Mohd Nordin, M Z

2005-01-01

The purpose of this paper is to investigate the auditory discrimination skill of Malay children using computer-based method. Currently, most of the auditory discrimination assessments are conducted manually by Speech-Language Pathologist. These conventional tests are actually general tests of sound discrimination, which do not reflect the client's specific speech sound errors. Thus, we propose computer-based Malay auditory discrimination test to automate the whole process of assessment as well as to customize the test according to the specific speech error sounds of the client. The ability in discriminating voiced and unvoiced Malay speech sounds was studied for the Malay children aged between 7 and 10 years old. The study showed no major difficulty for the children in discriminating the Malay speech sounds except differentiating /g/-/k/ sounds. Averagely the children of 7 years old failed to discriminate /g/-/k/ sounds.
Estimators of The Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty

PubMed Central

Lu, Yang; Loizou, Philipos C.

2011-01-01

Statistical estimators of the magnitude-squared spectrum are derived based on the assumption that the magnitude-squared spectrum of the noisy speech signal can be computed as the sum of the (clean) signal and noise magnitude-squared spectra. Maximum a posterior (MAP) and minimum mean square error (MMSE) estimators are derived based on a Gaussian statistical model. The gain function of the MAP estimator was found to be identical to the gain function used in the ideal binary mask (IdBM) that is widely used in computational auditory scene analysis (CASA). As such, it was binary and assumed the value of 1 if the local SNR exceeded 0 dB, and assumed the value of 0 otherwise. By modeling the local instantaneous SNR as an F-distributed random variable, soft masking methods were derived incorporating SNR uncertainty. The soft masking method, in particular, which weighted the noisy magnitude-squared spectrum by the a priori probability that the local SNR exceeds 0 dB was shown to be identical to the Wiener gain function. Results indicated that the proposed estimators yielded significantly better speech quality than the conventional MMSE spectral power estimators, in terms of yielding lower residual noise and lower speech distortion. PMID:21886543
Sound-by-sound thalamic stimulation modulates midbrain auditory excitability and relative binaural sensitivity in frogs

PubMed Central

Ponnath, Abhilash; Farris, Hamilton E.

2014-01-01

Descending circuitry can modulate auditory processing, biasing sensitivity to particular stimulus parameters and locations. Using awake in vivo single unit recordings, this study tested whether electrical stimulation of the thalamus modulates auditory excitability and relative binaural sensitivity in neurons of the amphibian midbrain. In addition, by using electrical stimuli that were either longer than the acoustic stimuli (i.e., seconds) or presented on a sound-by-sound basis (ms), experiments addressed whether the form of modulation depended on the temporal structure of the electrical stimulus. Following long duration electrical stimulation (3–10 s of 20 Hz square pulses), excitability (spikes/acoustic stimulus) to free-field noise stimuli decreased by 32%, but returned over 600 s. In contrast, sound-by-sound electrical stimulation using a single 2 ms duration electrical pulse 25 ms before each noise stimulus caused faster and varied forms of modulation: modulation lasted <2 s and, in different cells, excitability either decreased, increased or shifted in latency. Within cells, the modulatory effect of sound-by-sound electrical stimulation varied between different acoustic stimuli, including for different male calls, suggesting modulation is specific to certain stimulus attributes. For binaural units, modulation depended on the ear of input, as sound-by-sound electrical stimulation preceding dichotic acoustic stimulation caused asymmetric modulatory effects: sensitivity shifted for sounds at only one ear, or by different relative amounts for both ears. This caused a change in the relative difference in binaural sensitivity. Thus, sound-by-sound electrical stimulation revealed fast and ear-specific (i.e., lateralized) auditory modulation that is potentially suited to shifts in auditory attention during sound segregation in the auditory scene. PMID:25120437
Sound-by-sound thalamic stimulation modulates midbrain auditory excitability and relative binaural sensitivity in frogs.

PubMed

Ponnath, Abhilash; Farris, Hamilton E

2014-01-01

Descending circuitry can modulate auditory processing, biasing sensitivity to particular stimulus parameters and locations. Using awake in vivo single unit recordings, this study tested whether electrical stimulation of the thalamus modulates auditory excitability and relative binaural sensitivity in neurons of the amphibian midbrain. In addition, by using electrical stimuli that were either longer than the acoustic stimuli (i.e., seconds) or presented on a sound-by-sound basis (ms), experiments addressed whether the form of modulation depended on the temporal structure of the electrical stimulus. Following long duration electrical stimulation (3-10 s of 20 Hz square pulses), excitability (spikes/acoustic stimulus) to free-field noise stimuli decreased by 32%, but returned over 600 s. In contrast, sound-by-sound electrical stimulation using a single 2 ms duration electrical pulse 25 ms before each noise stimulus caused faster and varied forms of modulation: modulation lasted <2 s and, in different cells, excitability either decreased, increased or shifted in latency. Within cells, the modulatory effect of sound-by-sound electrical stimulation varied between different acoustic stimuli, including for different male calls, suggesting modulation is specific to certain stimulus attributes. For binaural units, modulation depended on the ear of input, as sound-by-sound electrical stimulation preceding dichotic acoustic stimulation caused asymmetric modulatory effects: sensitivity shifted for sounds at only one ear, or by different relative amounts for both ears. This caused a change in the relative difference in binaural sensitivity. Thus, sound-by-sound electrical stimulation revealed fast and ear-specific (i.e., lateralized) auditory modulation that is potentially suited to shifts in auditory attention during sound segregation in the auditory scene.
PC Scene Generation

NASA Astrophysics Data System (ADS)

Buford, James A., Jr.; Cosby, David; Bunfield, Dennis H.; Mayhall, Anthony J.; Trimble, Darian E.

2007-04-01

AMRDEC has successfully tested hardware and software for Real-Time Scene Generation for IR and SAL Sensors on COTS PC based hardware and video cards. AMRDEC personnel worked with nVidia and Concurrent Computer Corporation to develop a Scene Generation system capable of frame rates of at least 120Hz while frame locked to an external source (such as a missile seeker) with no dropped frames. Latency measurements and image validation were performed using COTS and in-house developed hardware and software. Software for the Scene Generation system was developed using OpenSceneGraph.
Training leads to increased auditory brain-computer interface performance of end-users with motor impairments.

PubMed

Halder, S; Käthner, I; Kübler, A

2016-02-01

Auditory brain-computer interfaces are an assistive technology that can restore communication for motor impaired end-users. Such non-visual brain-computer interface paradigms are of particular importance for end-users that may lose or have lost gaze control. We attempted to show that motor impaired end-users can learn to control an auditory speller on the basis of event-related potentials. Five end-users with motor impairments, two of whom with additional visual impairments, participated in five sessions. We applied a newly developed auditory brain-computer interface paradigm with natural sounds and directional cues. Three of five end-users learned to select symbols using this method. Averaged over all five end-users the information transfer rate increased by more than 1800% from the first session (0.17 bits/min) to the last session (3.08 bits/min). The two best end-users achieved information transfer rates of 5.78 bits/min and accuracies of 92%. Our results show that an auditory BCI with a combination of natural sounds and directional cues, can be controlled by end-users with motor impairment. Training improves the performance of end-users to the level of healthy controls. To our knowledge, this is the first time end-users with motor impairments controlled an auditory brain-computer interface speller with such high accuracy and information transfer rates. Further, our results demonstrate that operating a BCI with event-related potentials benefits from training and specifically end-users may require more than one session to develop their full potential. Copyright © 2015 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Development and application of operational techniques for the inventory and monitoring of resources and uses for the Texas coastal zone

NASA Technical Reports Server (NTRS)

Harwood, P. (Principal Investigator); Malin, P.; Finley, R.; Mcculloch, S.; Murphy, D.; Hupp, B.; Schell, J. A.

1977-01-01

The author has identified the following significant results. Four LANDSAT scenes were analyzed for the Harbor Island area test sites to produce land cover and land use maps using both image interpretation and computer-assisted techniques. When evaluated against aerial photography, the mean accuracy for three scenes was 84% for the image interpretation product and 62% for the computer-assisted classification maps. Analysis of the fourth scene was not completed using the image interpretation technique, because of poor quality, false color composite, but was available from the computer technique. Preliminary results indicate that these LANDSAT products can be applied to a variety of planning and management activities in the Texas coastal zone.
Compressed digital holography: from micro towards macro

NASA Astrophysics Data System (ADS)

Schretter, Colas; Bettens, Stijn; Blinder, David; Pesquet-Popescu, Béatrice; Cagnazzo, Marco; Dufaux, Frédéric; Schelkens, Peter

2016-09-01

signal processing methods from software-driven computer engineering and applied mathematics. The compressed sensing theory in particular established a practical framework for reconstructing the scene content using few linear combinations of complex measurements and a sparse prior for regularizing the solution. Compressed sensing found direct applications in digital holography for microscopy. Indeed, the wave propagation phenomenon in free space mixes in a natural way the spatial distribution of point sources from the 3-dimensional scene. As the 3-dimensional scene is mapped to a 2-dimensional hologram, the hologram samples form a compressed representation of the scene as well. This overview paper discusses contributions in the field of compressed digital holography at the micro scale. Then, an outreach on future extensions towards the real-size macro scale is discussed. Thanks to advances in sensor technologies, increasing computing power and the recent improvements in sparse digital signal processing, holographic modalities are on the verge of practical high-quality visualization at a macroscopic scale where much higher resolution holograms must be acquired and processed on the computer.
A novel scene management technology for complex virtual battlefield environment

NASA Astrophysics Data System (ADS)

Sheng, Changchong; Jiang, Libing; Tang, Bo; Tang, Xiaoan

2018-04-01

The efficient scene management of virtual environment is an important research content of computer real-time visualization, which has a decisive influence on the efficiency of drawing. However, Traditional scene management methods do not suitable for complex virtual battlefield environments, this paper combines the advantages of traditional scene graph technology and spatial data structure method, using the idea of management and rendering separation, a loose object-oriented scene graph structure is established to manage the entity model data in the scene, and the performance-based quad-tree structure is created for traversing and rendering. In addition, the collaborative update relationship between the above two structural trees is designed to achieve efficient scene management. Compared with the previous scene management method, this method is more efficient and meets the needs of real-time visualization.
The effect of distraction on change detection in crowded acoustic scenes.

PubMed

Petsas, Theofilos; Harrison, Jemma; Kashino, Makio; Furukawa, Shigeto; Chait, Maria

2016-11-01

In this series of behavioural experiments we investigated the effect of distraction on the maintenance of acoustic scene information in short-term memory. Stimuli are artificial acoustic 'scenes' composed of several (up to twelve) concurrent tone-pip streams ('sources'). A gap (1000 ms) is inserted partway through the 'scene'; Changes in the form of an appearance of a new source or disappearance of an existing source, occur after the gap in 50% of the trials. Listeners were instructed to monitor the unfolding 'soundscapes' for these events. Distraction was measured by presenting distractor stimuli during the gap. Experiments 1 and 2 used a dual task design where listeners were required to perform a task with varying attentional demands ('High Demand' vs. 'Low Demand') on brief auditory (Experiment 1a) or visual (Experiment 1b) signals presented during the gap. Experiments 2 and 3 required participants to ignore distractor sounds and focus on the change detection task. Our results demonstrate that the maintenance of scene information in short-term memory is influenced by the availability of attentional and/or processing resources during the gap, and that this dependence appears to be modality specific. We also show that these processes are susceptible to bottom up driven distraction even in situations when the distractors are not novel, but occur on each trial. Change detection performance is systematically linked with the, independently determined, perceptual salience of the distractor sound. The findings also demonstrate that the present task may be a useful objective means for determining relative perceptual salience. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Coding Strategies and Implementations of Compressive Sensing

NASA Astrophysics Data System (ADS)

Tsai, Tsung-Han

This dissertation studies the coding strategies of computational imaging to overcome the limitation of conventional sensing techniques. The information capacity of conventional sensing is limited by the physical properties of optics, such as aperture size, detector pixels, quantum efficiency, and sampling rate. These parameters determine the spatial, depth, spectral, temporal, and polarization sensitivity of each imager. To increase sensitivity in any dimension can significantly compromise the others. This research implements various coding strategies subject to optical multidimensional imaging and acoustic sensing in order to extend their sensing abilities. The proposed coding strategies combine hardware modification and signal processing to exploiting bandwidth and sensitivity from conventional sensors. We discuss the hardware architecture, compression strategies, sensing process modeling, and reconstruction algorithm of each sensing system. Optical multidimensional imaging measures three or more dimensional information of the optical signal. Traditional multidimensional imagers acquire extra dimensional information at the cost of degrading temporal or spatial resolution. Compressive multidimensional imaging multiplexes the transverse spatial, spectral, temporal, and polarization information on a two-dimensional (2D) detector. The corresponding spectral, temporal and polarization coding strategies adapt optics, electronic devices, and designed modulation techniques for multiplex measurement. This computational imaging technique provides multispectral, temporal super-resolution, and polarization imaging abilities with minimal loss in spatial resolution and noise level while maintaining or gaining higher temporal resolution. The experimental results prove that the appropriate coding strategies may improve hundreds times more sensing capacity. Human auditory system has the astonishing ability in localizing, tracking, and filtering the selected sound sources or information from a noisy environment. Using engineering efforts to accomplish the same task usually requires multiple detectors, advanced computational algorithms, or artificial intelligence systems. Compressive acoustic sensing incorporates acoustic metamaterials in compressive sensing theory to emulate the abilities of sound localization and selective attention. This research investigates and optimizes the sensing capacity and the spatial sensitivity of the acoustic sensor. The well-modeled acoustic sensor allows localizing multiple speakers in both stationary and dynamic auditory scene; and distinguishing mixed conversations from independent sources with high audio recognition rate.
DOE Office of Scientific and Technical Information (OSTI.GOV)

DOREN,NEALL E.

Wavefront curvature defocus effects occur in spotlight-mode SAR imagery when reconstructed via the well-known polar-formatting algorithm (PFA) under certain imaging scenarios. These include imaging at close range, using a very low radar center frequency, utilizing high resolution, and/or imaging very large scenes. Wavefront curvature effects arise from the unrealistic assumption of strictly planar wavefronts illuminating the imaged scene. This dissertation presents a method for the correction of wavefront curvature defocus effects under these scenarios, concentrating on the generalized: squint-mode imaging scenario and its computational aspects. This correction is accomplished through an efficient one-dimensional, image domain filter applied as a post-processingmore » step to PF.4. This post-filter, referred to as SVPF, is precalculated from a theoretical derivation of the wavefront curvature effect and varies as a function of scene location. Prior to SVPF, severe restrictions were placed on the imaged scene size in order to avoid defocus effects under these scenarios when using PFA. The SVPF algorithm eliminates the need for scene size restrictions when wavefront curvature effects are present, correcting for wavefront curvature in broadside as well as squinted collection modes while imposing little additional computational penalty for squinted images. This dissertation covers the theoretical development, implementation and analysis of the generalized, squint-mode SVPF algorithm (of which broadside-mode is a special case) and provides examples of its capabilities and limitations as well as offering guidelines for maximizing its computational efficiency. Tradeoffs between the PFA/SVPF combination and other spotlight-mode SAR image formation techniques are discussed with regard to computational burden, image quality, and imaging geometry constraints. It is demonstrated that other methods fail to exhibit a clear computational advantage over polar-formatting in conjunction with SVPF. This research concludes that PFA in conjunction with SVPF provides a computationally efficient spotlight-mode image formation solution that solves the wavefront curvature problem for most standoff distances and patch sizes, regardless of squint, resolution or radar center frequency. Additional advantages are that SVPF is not iterative and has no dependence on the visual contents of the scene: resulting in a deterministic computational complexity which typically adds only thirty percent to the overall image formation time.« less
SeaTouch: A Haptic and Auditory Maritime Environment for Non Visual Cognitive Mapping of Blind Sailors

NASA Astrophysics Data System (ADS)

Simonnet, Mathieu; Jacobson, Dan; Vieilledent, Stephane; Tisseau, Jacques

Navigating consists of coordinating egocentric and allocentric spatial frames of reference. Virtual environments have afforded researchers in the spatial community with tools to investigate the learning of space. The issue of the transfer between virtual and real situations is not trivial. A central question is the role of frames of reference in mediating spatial knowledge transfer to external surroundings, as is the effect of different sensory modalities accessed in simulated and real worlds. This challenges the capacity of blind people to use virtual reality to explore a scene without graphics. The present experiment involves a haptic and auditory maritime virtual environment. In triangulation tasks, we measure systematic errors and preliminary results show an ability to learn configurational knowledge and to navigate through it without vision. Subjects appeared to take advantage of getting lost in an egocentric “haptic” view in the virtual environment to improve performances in the real environment.
Responses to deceleration during car following: roles of optic flow, warnings, expectations, and interruptions.

PubMed

DeLucia, Patricia R; Tharanathan, Anand

2009-12-01

More than 25% of accidents are rear-end collisions. It is essential to identify the factors that contribute to such collisions. One such factor is a driver's ability to respond to the deceleration of the car ahead. In Experiment 1, we measured effects of optic flow information and discrete visual and auditory warnings (brake lights, tones) on responses to deceleration during car following. With computer simulations of car-following scenes, university students pressed a button when the lead car decelerated. Both classes of information affected responses. Observers relied on discrete warnings when optic flow information was relatively less effective as determined by the lead car's headway and deceleration rate. This is consistent with DeLucia's (2008) conceptual framework of space perception that emphasized the importance of viewing distance and motion (and task). In Experiment 2, we measured responses to deceleration after a visual interruption. Scenes were designed to tease apart the role of expectations and optic flow. Responses mostly were consistent with optic flow information presented after the interruption rather than with putative mental expectations that were set up by the lead car's motion prior to the interruption. The theoretical implication of the present results is that responses to deceleration are based on multiple sources of information, including optical size, optical expansion rate and tau, and discrete warnings that are independent of optic flow. The practical implication is that in-vehicle collision-avoidance warning systems may be more useful when optic flow is less effective (e.g., slow deceleration rates), implicating a role for adaptive collision-warning systems. Copyright 2009 APA
Integration and segregation in auditory streaming

NASA Astrophysics Data System (ADS)

Almonte, Felix; Jirsa, Viktor K.; Large, Edward W.; Tuller, Betty

2005-12-01

We aim to capture the perceptual dynamics of auditory streaming using a neurally inspired model of auditory processing. Traditional approaches view streaming as a competition of streams, realized within a tonotopically organized neural network. In contrast, we view streaming to be a dynamic integration process which resides at locations other than the sensory specific neural subsystems. This process finds its realization in the synchronization of neural ensembles or in the existence of informational convergence zones. Our approach uses two interacting dynamical systems, in which the first system responds to incoming acoustic stimuli and transforms them into a spatiotemporal neural field dynamics. The second system is a classification system coupled to the neural field and evolves to a stationary state. These states are identified with a single perceptual stream or multiple streams. Several results in human perception are modelled including temporal coherence and fission boundaries [L.P.A.S. van Noorden, Temporal coherence in the perception of tone sequences, Ph.D. Thesis, Eindhoven University of Technology, The Netherlands, 1975], and crossing of motions [A.S. Bregman, Auditory Scene Analysis: The Perceptual Organization of Sound, MIT Press, 1990]. Our model predicts phenomena such as the existence of two streams with the same pitch, which cannot be explained by the traditional stream competition models. An experimental study is performed to provide proof of existence of this phenomenon. The model elucidates possible mechanisms that may underlie perceptual phenomena.
Development of visual category selectivity in ventral visual cortex does not require visual experience

PubMed Central

van den Hurk, Job; Van Baelen, Marc; Op de Beeck, Hans P.

2017-01-01

To what extent does functional brain organization rely on sensory input? Here, we show that for the penultimate visual-processing region, ventral-temporal cortex (VTC), visual experience is not the origin of its fundamental organizational property, category selectivity. In the fMRI study reported here, we presented 14 congenitally blind participants with face-, body-, scene-, and object-related natural sounds and presented 20 healthy controls with both auditory and visual stimuli from these categories. Using macroanatomical alignment, response mapping, and surface-based multivoxel pattern analysis, we demonstrated that VTC in blind individuals shows robust discriminatory responses elicited by the four categories and that these patterns of activity in blind subjects could successfully predict the visual categories in sighted controls. These findings were confirmed in a subset of blind participants born without eyes and thus deprived from all light perception since conception. The sounds also could be decoded in primary visual and primary auditory cortex, but these regions did not sustain generalization across modalities. Surprisingly, although not as strong as visual responses, selectivity for auditory stimulation in visual cortex was stronger in blind individuals than in controls. The opposite was observed in primary auditory cortex. Overall, we demonstrated a striking similarity in the cortical response layout of VTC in blind individuals and sighted controls, demonstrating that the overall category-selective map in extrastriate cortex develops independently from visual experience. PMID:28507127

An initial investigation into the validity of a computer-based auditory processing assessment (Feather Squadron).

PubMed

Barker, Matthew D; Purdy, Suzanne C

2016-01-01

This research investigates a novel method for identifying and measuring school-aged children with poor auditory processing through a tablet computer. Feasibility and test-retest reliability are investigated by examining the percentage of Group 1 participants able to complete the tasks and developmental effects on performance. Concurrent validity was investigated against traditional tests of auditory processing using Group 2. There were 847 students aged 5 to 13 years in group 1, and 46 aged 5 to 14 years in group 2. Some tasks could not be completed by the youngest participants. Significant correlations were found between results of most auditory processing areas assessed by the Feather Squadron test and traditional auditory processing tests. Test-retest comparisons indicated good reliability for most of the Feather Squadron assessments and some of the traditional tests. The results indicate the Feather Squadron assessment is a time-efficient, feasible, concurrently valid, and reliable approach for measuring auditory processing in school-aged children. Clinically, this may be a useful option for audiologists when performing auditory processing assessments as it is a relatively fast, engaging, and easy way to assess auditory processing abilities. Research is needed to investigate further the construct validity of this new assessment by examining the association between performance on Feather Squadron and objective evoked potential, lesion studies, and/or functional imaging measures of auditory function.
An Investigation of Spatial Hearing in Children with Normal Hearing and with Cochlear Implants and the Impact of Executive Function

NASA Astrophysics Data System (ADS)

Misurelli, Sara M.

The ability to analyze an "auditory scene"---that is, to selectively attend to a target source while simultaneously segregating and ignoring distracting information---is one of the most important and complex skills utilized by normal hearing (NH) adults. The NH adult auditory system and brain work rather well to segregate auditory sources in adverse environments. However, for some children and individuals with hearing loss, selectively attending to one source in noisy environments can be extremely challenging. In a normal auditory system, information arriving at each ear is integrated, and thus these binaural cues aid in speech understanding in noise. A growing number of individuals who are deaf now receive cochlear implants (CIs), which supply hearing through electrical stimulation to the auditory nerve. In particular, bilateral cochlear implants (BICIs) are now becoming more prevalent, especially in children. However, because CI sound processing lacks both fine structure cues and coordination between stimulation at the two ears, binaural cues may either be absent or inconsistent. For children with NH and with BiCIs, this difficulty in segregating sources is of particular concern because their learning and development commonly occurs within the context of complex auditory environments. This dissertation intends to explore and understand the ability of children with NH and with BiCIs to function in everyday noisy environments. The goals of this work are to (1) Investigate source segregation abilities in children with NH and with BiCIs; (2) Examine the effect of target-interferer similarity and the benefits of source segregation for children with NH and with BiCIs; (3) Investigate measures of executive function that may predict performance in complex and realistic auditory tasks of source segregation for listeners with NH; and (4) Examine source segregation abilities in NH listeners, from school-age to adults.
The Attentional Boost Effect: Transient increases in attention to one task enhance performance in a second task.

PubMed

Swallow, Khena M; Jiang, Yuhong V

2010-04-01

Recent work on event perception suggests that perceptual processing increases when events change. An important question is how such changes influence the way other information is processed, particularly during dual-task performance. In this study, participants monitored a long series of distractor items for an occasional target as they simultaneously encoded unrelated background scenes. The appearance of an occasional target could have two opposite effects on the secondary task: It could draw attention away from the second task, or, as a change in the ongoing event, it could improve secondary task performance. Results were consistent with the second possibility. Memory for scenes presented simultaneously with the targets was better than memory for scenes that preceded or followed the targets. This effect was observed when the primary detection task involved visual feature oddball detection, auditory oddball detection, and visual color-shape conjunction detection. It was eliminated when the detection task was omitted, and when it required an arbitrary response mapping. The appearance of occasional, task-relevant events appears to trigger a temporal orienting response that facilitates processing of concurrently attended information (Attentional Boost Effect). Copyright 2009 Elsevier B.V. All rights reserved.
The Attentional Boost Effect: Transient Increases in Attention to One Task Enhance Performance in a Second Task

PubMed Central

Swallow, Khena M.; Jiang, Yuhong V.

2009-01-01

Recent work on event perception suggests that perceptual processing increases when events change. An important question is how such changes influence the way other information is processed, particularly during dual-task performance. In this study, participants monitored a long series of distractor items for an occasional target as they simultaneously encoded unrelated background scenes. The appearance of an occasional target could have two opposite effects on the secondary task: It could draw attention away from the second task, or, as a change in the ongoing event, it could improve secondary task performance. Results were consistent with the second possibility. Memory for scenes presented simultaneously with the targets was better than memory for scenes that preceded or followed the targets. This effect was observed when the primary detection task involved visual feature oddball detection, auditory oddball detection, and visual color-shape conjunction detection. It was eliminated when the detection task was omitted, and when it required an arbitrary response mapping. The appearance of occasional, task-relevant events appears to trigger a temporal orienting response that facilitates processing of concurrently attended information (Attentional Boost Effect). PMID:20080232
Generality and specificity in the effects of musical expertise on perception and cognition.

PubMed

Carey, Daniel; Rosen, Stuart; Krishnan, Saloni; Pearce, Marcus T; Shepherd, Alex; Aydelott, Jennifer; Dick, Frederic

2015-04-01

Performing musicians invest thousands of hours becoming experts in a range of perceptual, attentional, and cognitive skills. The duration and intensity of musicians' training - far greater than that of most educational or rehabilitation programs - provides a useful model to test the extent to which skills acquired in one particular context (music) generalize to different domains. Here, we asked whether the instrument-specific and more instrument-general skills acquired during professional violinists' and pianists' training would generalize to superior performance on a wide range of analogous (largely non-musical) skills, when compared to closely matched non-musicians. Violinists and pianists outperformed non-musicians on fine-grained auditory psychophysical measures, but surprisingly did not differ from each other, despite the different demands of their instruments. Musician groups did differ on a tuning system perception task: violinists showed clearest biases towards the tuning system specific to their instrument, suggesting that long-term experience leads to selective perceptual benefits given a training-relevant context. However, we found only weak evidence of group differences in non-musical skills, with musicians differing marginally in one measure of sustained auditory attention, but not significantly on auditory scene analysis or multi-modal sequencing measures. Further, regression analyses showed that this sustained auditory attention metric predicted more variance in one auditory psychophysical measure than did musical expertise. Our findings suggest that specific musical expertise may yield distinct perceptual outcomes within contexts close to the area of training. Generalization of expertise to relevant cognitive domains may be less clear, particularly where the task context is non-musical. Copyright © 2014 Elsevier B.V. All rights reserved.
EEG signatures accompanying auditory figure-ground segregation.

PubMed

Tóth, Brigitta; Kocsis, Zsuzsanna; Háden, Gábor P; Szerafin, Ágnes; Shinn-Cunningham, Barbara G; Winkler, István

2016-11-01

In everyday acoustic scenes, figure-ground segregation typically requires one to group together sound elements over both time and frequency. Electroencephalogram was recorded while listeners detected repeating tonal complexes composed of a random set of pure tones within stimuli consisting of randomly varying tonal elements. The repeating pattern was perceived as a figure over the randomly changing background. It was found that detection performance improved both as the number of pure tones making up each repeated complex (figure coherence) increased, and as the number of repeated complexes (duration) increased - i.e., detection was easier when either the spectral or temporal structure of the figure was enhanced. Figure detection was accompanied by the elicitation of the object related negativity (ORN) and the P400 event-related potentials (ERPs), which have been previously shown to be evoked by the presence of two concurrent sounds. Both ERP components had generators within and outside of auditory cortex. The amplitudes of the ORN and the P400 increased with both figure coherence and figure duration. However, only the P400 amplitude correlated with detection performance. These results suggest that 1) the ORN and P400 reflect processes involved in detecting the emergence of a new auditory object in the presence of other concurrent auditory objects; 2) the ORN corresponds to the likelihood of the presence of two or more concurrent sound objects, whereas the P400 reflects the perceptual recognition of the presence of multiple auditory objects and/or preparation for reporting the detection of a target object. Copyright © 2016. Published by Elsevier Inc.
Concurrent 3-D sonifications enable the head-up monitoring of two interrelated aircraft navigation instruments.

PubMed

Towers, John; Burgess-Limerick, Robin; Riek, Stephan

2014-12-01

The aim of this study was to enable the head-up monitoring of two interrelated aircraft navigation instruments by developing a 3-D auditory display that encodes this navigation information within two spatially discrete sonifications. Head-up monitoring of aircraft navigation information utilizing 3-D audio displays, particularly involving concurrently presented sonifications, requires additional research. A flight simulator's head-down waypoint bearing and course deviation instrument readouts were conveyed to participants via a 3-D auditory display. Both readouts were separately represented by a colocated pair of continuous sounds, one fixed and the other varying in pitch, which together encoded the instrument value's deviation from the norm. Each sound pair's position in the listening space indicated the left/right parameter of its instrument's readout. Participants' accuracy in navigating a predetermined flight plan was evaluated while performing a head-up task involving the detection of visual flares in the out-of-cockpit scene. The auditory display significantly improved aircraft heading and course deviation accuracy, head-up time, and flare detections. Head tracking did not improve performance by providing participants with the ability to orient potentially conflicting sounds, suggesting that the use of integrated localizing cues was successful. Conclusion: A supplementary 3-D auditory display enabled effective head-up monitoring of interrelated navigation information normally attended to through a head-down display. Pilots operating aircraft, such as helicopters and unmanned aerial vehicles, may benefit from a supplementary auditory display because they navigate in two dimensions while performing head-up, out-of-aircraft, visual tasks.
Computational modeling of the human auditory periphery: Auditory-nerve responses, evoked potentials and hearing loss.

PubMed

Verhulst, Sarah; Altoè, Alessandro; Vasilkov, Viacheslav

2018-03-01

Models of the human auditory periphery range from very basic functional descriptions of auditory filtering to detailed computational models of cochlear mechanics, inner-hair cell (IHC), auditory-nerve (AN) and brainstem signal processing. It is challenging to include detailed physiological descriptions of cellular components into human auditory models because single-cell data stems from invasive animal recordings while human reference data only exists in the form of population responses (e.g., otoacoustic emissions, auditory evoked potentials). To embed physiological models within a comprehensive human auditory periphery framework, it is important to capitalize on the success of basic functional models of hearing and render their descriptions more biophysical where possible. At the same time, comprehensive models should capture a variety of key auditory features, rather than fitting their parameters to a single reference dataset. In this study, we review and improve existing models of the IHC-AN complex by updating their equations and expressing their fitting parameters into biophysical quantities. The quality of the model framework for human auditory processing is evaluated using recorded auditory brainstem response (ABR) and envelope-following response (EFR) reference data from normal and hearing-impaired listeners. We present a model with 12 fitting parameters from the cochlea to the brainstem that can be rendered hearing impaired to simulate how cochlear gain loss and synaptopathy affect human population responses. The model description forms a compromise between capturing well-described single-unit IHC and AN properties and human population response features. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Attentional modulation of informational masking on early cortical representations of speech signals.

PubMed

Zhang, Changxin; Arnott, Stephen R; Rabaglia, Cristina; Avivi-Reich, Meital; Qi, James; Wu, Xihong; Li, Liang; Schneider, Bruce A

2016-01-01

To recognize speech in a noisy auditory scene, listeners need to perceptually segregate the target talker's voice from other competing sounds (stream segregation). A number of studies have suggested that the attentional demands placed on listeners increase as the acoustic properties and informational content of the competing sounds become more similar to that of the target voice. Hence we would expect attentional demands to be considerably greater when speech is masked by speech than when it is masked by steady-state noise. To investigate the role of attentional mechanisms in the unmasking of speech sounds, event-related potentials (ERPs) were recorded to a syllable masked by noise or competing speech under both active (the participant was asked to respond when the syllable was presented) or passive (no response was required) listening conditions. The results showed that the long-latency auditory response to a syllable (/bi/), presented at different signal-to-masker ratios (SMRs), was similar in both passive and active listening conditions, when the masker was a steady-state noise. In contrast, a switch from the passive listening condition to the active one, when the masker was two-talker speech, significantly enhanced the ERPs to the syllable. These results support the hypothesis that the need to engage attentional mechanisms in aid of scene analysis increases as the similarity (both acoustic and informational) between the target speech and the competing background sounds increases. Copyright © 2015 Elsevier B.V. All rights reserved.
[STS-41 Onboard 16mm Photography Quick Release

NASA Technical Reports Server (NTRS)

1990-01-01

This videotape features scenes of onboard activities. The videotape was shot by the crew. The scenes include the following: Ulysses' deployment, middeck experiments, computer workstations, and Earth payload bay views.
Scene-based nonuniformity correction technique for infrared focal-plane arrays.

PubMed

Liu, Yong-Jin; Zhu, Hong; Zhao, Yi-Gong

2009-04-20

A scene-based nonuniformity correction algorithm is presented to compensate for the gain and bias nonuniformity in infrared focal-plane array sensors, which can be separated into three parts. First, an interframe-prediction method is used to estimate the true scene, since nonuniformity correction is a typical blind-estimation problem and both scene values and detector parameters are unavailable. Second, the estimated scene, along with its corresponding observed data obtained by detectors, is employed to update the gain and the bias by means of a line-fitting technique. Finally, with these nonuniformity parameters, the compensated output of each detector is obtained by computing a very simple formula. The advantages of the proposed algorithm lie in its low computational complexity and storage requirements and ability to capture temporal drifts in the nonuniformity parameters. The performance of every module is demonstrated with simulated and real infrared image sequences. Experimental results indicate that the proposed algorithm exhibits a superior correction effect.
A qualitative approach for recovering relative depths in dynamic scenes

NASA Technical Reports Server (NTRS)

Haynes, S. M.; Jain, R.

1987-01-01

This approach to dynamic scene analysis is a qualitative one. It computes relative depths using very general rules. The depths calculated are qualitative in the sense that the only information obtained is which object is in front of which others. The motion is qualitative in the sense that the only required motion data is whether objects are moving toward or away from the camera. Reasoning, which takes into account the temporal character of the data and the scene, is qualitative. This approach to dynamic scene analysis can tolerate imprecise data because in dynamic scenes the data are redundant.
Fear Processing in Dental Phobia during Crossmodal Symptom Provocation: An fMRI Study

PubMed Central

Maslowski, Nina Isabel; Wittchen, Hans-Ulrich; Lueken, Ulrike

2014-01-01

While previous studies successfully identified the core neural substrates of the animal subtype of specific phobia, only few and inconsistent research is available for dental phobia. These findings might partly relate to the fact that, typically, visual stimuli were employed. The current study aimed to investigate the influence of stimulus modality on neural fear processing in dental phobia. Thirteen dental phobics (DP) and thirteen healthy controls (HC) attended a block-design functional magnetic resonance imaging (fMRI) symptom provocation paradigm encompassing both visual and auditory stimuli. Drill sounds and matched neutral sinus tones served as auditory stimuli and dentist scenes and matched neutral videos as visual stimuli. Group comparisons showed increased activation in the insula, anterior cingulate cortex, orbitofrontal cortex, and thalamus in DP compared to HC during auditory but not visual stimulation. On the contrary, no differential autonomic reactions were observed in DP. Present results are largely comparable to brain areas identified in animal phobia, but also point towards a potential downregulation of autonomic outflow by neural fear circuits in this disorder. Findings enlarge our knowledge about neural correlates of dental phobia and may help to understand the neural underpinnings of the clinical and physiological characteristics of the disorder. PMID:24738049
Spatial selective attention in a complex auditory environment such as polyphonic music.

PubMed

Saupe, Katja; Koelsch, Stefan; Rübsamen, Rudolf

2010-01-01

To investigate the influence of spatial information in auditory scene analysis, polyphonic music (three parts in different timbres) was composed and presented in free field. Each part contained large falling interval jumps in the melody and the task of subjects was to detect these events in one part ("target part") while ignoring the other parts. All parts were either presented from the same location (0 degrees; overlap condition) or from different locations (-28 degrees, 0 degrees, and 28 degrees or -56 degrees, 0 degrees, and 56 degrees in the azimuthal plane), with the target part being presented either at 0 degrees or at one of the right-sided locations. Results showed that spatial separation of 28 degrees was sufficient for a significant improvement in target detection (i.e., in the detection of large interval jumps) compared to the overlap condition, irrespective of the position (frontal or right) of the target part. A larger spatial separation of the parts resulted in further improvements only if the target part was lateralized. These data support the notion of improvement in the suppression of interfering signals with spatial sound source separation. Additionally, the data show that the position of the relevant sound source influences auditory performance.
Evidence for a Non-Lexical Influence on Children's Auditory Repetition of Familiar Words

ERIC Educational Resources Information Center

Budd, Mary-Jane; Hanley, J. Richard; Nozari, Nazbanou

2012-01-01

This paper examines evidence for a nonlexical influence on children's repetition of real words. We investigate the extent to which two computational models of auditory repetition can simulate the performance of 68 children aged between 5 and 11 years-old when they are attempting to repeat familiar words. Both computational accounts were derived…
Learning Auditory Discrimination with Computer-Assisted Instruction: A Comparison of Two Different Performance Objectives.

ERIC Educational Resources Information Center

Steinhaus, Kurt A.

A 12-week study of two groups of 14 college freshmen music majors was conducted to determine which group demonstrated greater achievement in learning auditory discrimination using computer-assisted instruction (CAI). The method employed was a pre-/post-test experimental design using subjects randomly assigned to a control group or an experimental…
The Use of Spatialized Speech in Auditory Interfaces for Computer Users Who Are Visually Impaired

ERIC Educational Resources Information Center

Sodnik, Jaka; Jakus, Grega; Tomazic, Saso

2012-01-01

Introduction: This article reports on a study that explored the benefits and drawbacks of using spatially positioned synthesized speech in auditory interfaces for computer users who are visually impaired (that is, are blind or have low vision). The study was a practical application of such systems--an enhanced word processing application compared…
Continuous video coherence computing model for detecting scene boundaries

NASA Astrophysics Data System (ADS)

Kang, Hang-Bong

2001-07-01

The scene boundary detection is important in the semantic understanding of video data and is usually determined by coherence between shots. To measure the coherence, two approaches have been proposed. One is a discrete approach and the other one is a continuous approach. In this paper, we use the continuous approach and propose some modifications on the causal First-In-First-Out(FIFO) short-term memory-based model. One modification is that we allow dynamic memory size in computing coherence reliably regardless of the size of each shot. Another modification is that some shots can be removed from the memory buffer not by the FIFO rule. These removed shots have no or small foreground objects. Using this model, we detect scene boundaries by computing shot coherence. In computing coherence, we add one new term which is the number of intermediate shots between two comparing shots because the effect of intermediate shots is important in computing shot recall. In addition, we also consider shot activity because this is important to reflect human perception. We experiment our computing model on different genres of videos and have obtained reasonable results.
Basic level scene understanding: categories, attributes and structures

PubMed Central

Xiao, Jianxiong; Hays, James; Russell, Bryan C.; Patterson, Genevieve; Ehinger, Krista A.; Torralba, Antonio; Oliva, Aude

2013-01-01

A longstanding goal of computer vision is to build a system that can automatically understand a 3D scene from a single image. This requires extracting semantic concepts and 3D information from 2D images which can depict an enormous variety of environments that comprise our visual world. This paper summarizes our recent efforts toward these goals. First, we describe the richly annotated SUN database which is a collection of annotated images spanning 908 different scene categories with object, attribute, and geometric labels for many scenes. This database allows us to systematically study the space of scenes and to establish a benchmark for scene and object recognition. We augment the categorical SUN database with 102 scene attributes for every image and explore attribute recognition. Finally, we present an integrated system to extract the 3D structure of the scene and objects depicted in an image. PMID:24009590
Programmable personality interface for the dynamic infrared scene generator (IRSG2)

NASA Astrophysics Data System (ADS)

Buford, James A., Jr.; Mobley, Scott B.; Mayhall, Anthony J.; Braselton, William J.

1998-07-01

As scene generator platforms begin to rely specifically on commercial off-the-shelf (COTS) hardware and software components, the need for high speed programmable personality interfaces (PPIs) are required for interfacing to Infrared (IR) flight computer/processors and complex IR projectors in the hardware-in-the-loop (HWIL) simulation facilities. Recent technological advances and innovative applications of established technologies are beginning to allow development of cost effective PPIs to interface to COTS scene generators. At the U.S. Army Aviation and Missile Command (AMCOM) Missile Research, Development, and Engineering Center (MRDEC) researchers have developed such a PPI to reside between the AMCOM MRDEC IR Scene Generator (IRSG) and either a missile flight computer or the dynamic Laser Diode Array Projector (LDAP). AMCOM MRDEC has developed several PPIs for the first and second generation IRSGs (IRSG1 and IRSG2), which are based on Silicon Graphics Incorporated (SGI) Onyx and Onyx2 computers with Reality Engine 2 (RE2) and Infinite Reality (IR/IR2) graphics engines. This paper provides an overview of PPIs designed, integrated, tested, and verified at AMCOM MRDEC, specifically the IRSG2's PPI.

A model of proto-object based saliency

PubMed Central

Russell, Alexander F.; Mihalaş, Stefan; von der Heydt, Rudiger; Niebur, Ernst; Etienne-Cummings, Ralph

2013-01-01

Organisms use the process of selective attention to optimally allocate their computational resources to the instantaneously most relevant subsets of a visual scene, ensuring that they can parse the scene in real time. Many models of bottom-up attentional selection assume that elementary image features, like intensity, color and orientation, attract attention. Gestalt psychologists, how-ever, argue that humans perceive whole objects before they analyze individual features. This is supported by recent psychophysical studies that show that objects predict eye-fixations better than features. In this report we present a neurally inspired algorithm of object based, bottom-up attention. The model rivals the performance of state of the art non-biologically plausible feature based algorithms (and outperforms biologically plausible feature based algorithms) in its ability to predict perceptual saliency (eye fixations and subjective interest points) in natural scenes. The model achieves this by computing saliency as a function of proto-objects that establish the perceptual organization of the scene. All computational mechanisms of the algorithm have direct neural correlates, and our results provide evidence for the interface theory of attention. PMID:24184601
The Effect of Neurocognitive Function on Math Computation in Pediatric ADHD: Moderating Influences of Anxious Perfectionism and Gender.

PubMed

Sturm, Alexandra; Rozenman, Michelle; Piacentini, John C; McGough, James J; Loo, Sandra K; McCracken, James T

2018-03-20

Predictors of math achievement in attention-deficit/hyperactivity disorder (ADHD) are not well-known. To address this gap in the literature, we examined individual differences in neurocognitive functioning domains on math computation in a cross-sectional sample of youth with ADHD. Gender and anxiety symptoms were explored as potential moderators. The sample consisted of 281 youth (aged 8-15 years) diagnosed with ADHD. Neurocognitive tasks assessed auditory-verbal working memory, visuospatial working memory, and processing speed. Auditory-verbal working memory speed significantly predicted math computation. A three-way interaction revealed that at low levels of anxious perfectionism, slower processing speed predicted poorer math computation for boys compared to girls. These findings indicate the uniquely predictive values of auditory-verbal working memory and processing speed on math computation, and their differential moderation. These findings provide preliminary support that gender and anxious perfectionism may influence the relationship between neurocognitive functioning and academic achievement.
The effect of non-visual working memory load on top-down modulation of visual processing

PubMed Central

Rissman, Jesse; Gazzaley, Adam; D'Esposito, Mark

2009-01-01

While a core function of the working memory (WM) system is the active maintenance of behaviorally relevant sensory representations, it is also critical that distracting stimuli are appropriately ignored. We used functional magnetic resonance imaging to examine the role of domain-general WM resources in the top-down attentional modulation of task-relevant and irrelevant visual representations. In our dual-task paradigm, each trial began with the auditory presentation of six random (high load) or sequentially-ordered (low load) digits. Next, two relevant visual stimuli (e.g., faces), presented amongst two temporally interspersed visual distractors (e.g., scenes), were to be encoded and maintained across a 7-sec delay interval, after which memory for the relevant images and digits was probed. When taxed by high load digit maintenance, participants exhibited impaired performance on the visual WM task and a selective failure to attenuate the neural processing of task-irrelevant scene stimuli. The over-processing of distractor scenes under high load was indexed by elevated encoding activity in a scene-selective region-of-interest relative to low load and passive viewing control conditions, as well as by improved long-term recognition memory for these items. In contrast, the load manipulation did not affect participants' ability to upregulate activity in this region when scenes were task-relevant. These results highlight the critical role of domain-general WM resources in the goal-directed regulation of distractor processing. Moreover, the consequences of increased WM load in young adults closely resemble the effects of cognitive aging on distractor filtering [Gazzaley et al., (2005) Nature Neuroscience 8, 1298-1300], suggesting the possibility of a common underlying mechanism. PMID:19397858
Simulating Scenes In Outer Space

NASA Technical Reports Server (NTRS)

Callahan, John D.

1989-01-01

Multimission Interactive Picture Planner, MIP, computer program for scientifically accurate and fast, three-dimensional animation of scenes in deep space. Versatile, reasonably comprehensive, and portable, and runs on microcomputers. New techniques developed to perform rapidly calculations and transformations necessary to animate scenes in scientifically accurate three-dimensional space. Written in FORTRAN 77 code. Primarily designed to handle Voyager, Galileo, and Space Telescope. Adapted to handle other missions.
From Image Analysis to Computer Vision: Motives, Methods, and Milestones.

DTIC Science & Technology

1998-07-01

images. Initially, work on digital image analysis dealt with specific classes of images such as text, photomicrographs, nuclear particle tracks, and aerial...photographs; but by the 1960’s, general algorithms and paradigms for image analysis began to be formulated. When the artificial intelligence...scene, but eventually from image sequences obtained by a moving camera; at this stage, image analysis had become scene analysis or computer vision
Automatic temperature computation for realistic IR simulation

NASA Astrophysics Data System (ADS)

Le Goff, Alain; Kersaudy, Philippe; Latger, Jean; Cathala, Thierry; Stolte, Nilo; Barillot, Philippe

2000-07-01

Polygon temperature computation in 3D virtual scenes is fundamental for IR image simulation. This article describes in detail the temperature calculation software and its current extensions, briefly presented in [1]. This software, called MURET, is used by the simulation workshop CHORALE of the French DGA. MURET is a one-dimensional thermal software, which accurately takes into account the material thermal attributes of three-dimensional scene and the variation of the environment characteristics (atmosphere) as a function of the time. Concerning the environment, absorbed incident fluxes are computed wavelength by wavelength, for each half an hour, druing 24 hours before the time of the simulation. For each polygon, incident fluxes are compsed of: direct solar fluxes, sky illumination (including diffuse solar fluxes). Concerning the materials, classical thermal attributes are associated to several layers, such as conductivity, absorption, spectral emissivity, density, specific heat, thickness and convection coefficients are taken into account. In the future, MURET will be able to simulate permeable natural materials (water influence) and vegetation natural materials (woods). This model of thermal attributes induces a very accurate polygon temperature computation for the complex 3D databases often found in CHORALE simulations. The kernel of MUET consists of an efficient ray tracer allowing to compute the history (over 24 hours) of the shadowed parts of the 3D scene and a library, responsible for the thermal computations. The great originality concerns the way the heating fluxes are computed. Using ray tracing, the flux received in each 3D point of the scene accurately takes into account the masking (hidden surfaces) between objects. By the way, this library supplies other thermal modules such as a thermal shows computation tool.
Anteverted internal auditory canal as an inner ear anomaly in patients with craniofacial microsomia.

PubMed

L'Heureux-Lebeau, Bénédicte; Saliba, Issam

2014-09-01

Craniofacial microsomia involves structure of the first and second branchial arches. A wide range of ear anomalies, affecting external, middle and inner ear, has been described in association with this condition. We report three cases of anteverted internal auditory canal in patients presenting craniofacial microsomia. This unique internal auditory canal orientation was found on high-resolution computed tomography of the temporal bones. This internal auditory canal anomaly is yet unreported in craniofacial anomalies. Copyright © 2014. Published by Elsevier Ireland Ltd.
The Role of Auditory Features Within Slot-Themed Social Casino Games and Online Slot Machine Games.

PubMed

Bramley, Stephanie; Gainsbury, Sally M

2015-12-01

Over the last few years playing social casino games has become a popular entertainment activity. Social casino games are offered via social media platforms and mobile apps and resemble gambling activities. However, social casino games are not classified as gambling as they can be played for free, outcomes may not be determined by chance, and players receive no monetary payouts. Social casino games appear to be somewhat similar to online gambling activities in terms of their visual and auditory features, but to date little research has investigated the cross over between these games. This study examines the auditory features of slot-themed social casino games and online slot machine games using a case study design. An example of each game type was played on three separate occasions during which, the auditory features (i.e., music, speech, sound effects, and the absence of sound) within the games were logged. The online slot-themed game was played in demo mode. This is the first study to provide a qualitative account of the role of auditory features within a slot-themed social casino game and an online slot machine game. Our results found many similarities between how sound is utilised within the two games. Therefore the sounds within these games may serve functions including: setting the scene for gaming, creating an image, demarcating space, interacting with visual features, prompting players to act, communicating achievements to players, providing reinforcement, heightening player emotions and the gaming experience. As a result this may reduce the ability of players to make a clear distinction between these two activities, which may facilitate migration between games.
Neural Representation of Concurrent Harmonic Sounds in Monkey Primary Auditory Cortex: Implications for Models of Auditory Scene Analysis

PubMed Central

Steinschneider, Mitchell; Micheyl, Christophe

2014-01-01

The ability to attend to a particular sound in a noisy environment is an essential aspect of hearing. To accomplish this feat, the auditory system must segregate sounds that overlap in frequency and time. Many natural sounds, such as human voices, consist of harmonics of a common fundamental frequency (F0). Such harmonic complex tones (HCTs) evoke a pitch corresponding to their F0. A difference in pitch between simultaneous HCTs provides a powerful cue for their segregation. The neural mechanisms underlying concurrent sound segregation based on pitch differences are poorly understood. Here, we examined neural responses in monkey primary auditory cortex (A1) to two concurrent HCTs that differed in F0 such that they are heard as two separate “auditory objects” with distinct pitches. We found that A1 can resolve, via a rate-place code, the lower harmonics of both HCTs, a prerequisite for deriving their pitches and for their perceptual segregation. Onset asynchrony between the HCTs enhanced the neural representation of their harmonics, paralleling their improved perceptual segregation in humans. Pitches of the concurrent HCTs could also be temporally represented by neuronal phase-locking at their respective F0s. Furthermore, a model of A1 responses using harmonic templates could qualitatively reproduce psychophysical data on concurrent sound segregation in humans. Finally, we identified a possible intracortical homolog of the “object-related negativity” recorded noninvasively in humans, which correlates with the perceptual segregation of concurrent sounds. Findings indicate that A1 contains sufficient spectral and temporal information for segregating concurrent sounds based on differences in pitch. PMID:25209282
"Change deafness" arising from inter-feature masking within a single auditory object.

PubMed

Barascud, Nicolas; Griffiths, Timothy D; McAlpine, David; Chait, Maria

2014-03-01

Our ability to detect prominent changes in complex acoustic scenes depends not only on the ear's sensitivity but also on the capacity of the brain to process competing incoming information. Here, employing a combination of psychophysics and magnetoencephalography (MEG), we investigate listeners' sensitivity in situations when two features belonging to the same auditory object change in close succession. The auditory object under investigation is a sequence of tone pips characterized by a regularly repeating frequency pattern. Signals consisted of an initial, regularly alternating sequence of three short (60 msec) pure tone pips (in the form ABCABC…) followed by a long pure tone with a frequency that is either expected based on the on-going regular pattern ("LONG expected"-i.e., "LONG-expected") or constitutes a pattern violation ("LONG-unexpected"). The change in LONG-expected is manifest as a change in duration (when the long pure tone exceeds the established duration of a tone pip), whereas the change in LONG-unexpected is manifest as a change in both the frequency pattern and a change in the duration. Our results reveal a form of "change deafness," in that although changes in both the frequency pattern and the expected duration appear to be processed effectively by the auditory system-cortical signatures of both changes are evident in the MEG data-listeners often fail to detect changes in the frequency pattern when that change is closely followed by a change in duration. By systematically manipulating the properties of the changing features and measuring behavioral and MEG responses, we demonstrate that feature changes within the same auditory object, which occur close together in time, appear to compete for perceptual resources.
Abnormal Complex Auditory Pattern Analysis in Schizophrenia Reflected in an Absent Missing Stimulus Mismatch Negativity.

PubMed

Salisbury, Dean F; McCathern, Alexis G

2016-11-01

The simple mismatch negativity (MMN) to tones deviating physically (in pitch, loudness, duration, etc.) from repeated standard tones is robustly reduced in schizophrenia. Although generally interpreted to reflect memory or cognitive processes, simple MMN likely contains some activity from non-adapted sensory cells, clouding what process is affected in schizophrenia. Research in healthy participants has demonstrated that MMN can be elicited by deviations from abstract auditory patterns and complex rules that do not cause sensory adaptation. Whether persons with schizophrenia show abnormalities in the complex MMN is unknown. Fourteen schizophrenia participants and 16 matched healthy underwent EEG recording while listening to 400 groups of 6 tones 330 ms apart, separated by 800 ms. Occasional deviant groups were missing the 4th or 6th tone (50 groups each). Healthy participants generated a robust response to a missing but expected tone. The schizophrenia group was significantly impaired in activating the missing stimulus MMN, generating no significant activity at all. Schizophrenia affects the ability of "primitive sensory intelligence" and pre-attentive perceptual mechanisms to form implicit groups in the auditory environment. Importantly, this deficit must relate to abnormalities in abstract complex pattern analysis rather than sensory problems in the disorder. The results indicate a deficit in parsing of the complex auditory scene which likely impacts negatively on successful social navigation in schizophrenia. Knowledge of the location and circuit architecture underlying the true novelty-related MMN and its pathophysiology in schizophrenia will help target future interventions.
Using an auditory sensory substitution device to augment vision: evidence from eye movements.

PubMed

Wright, Thomas D; Margolis, Aaron; Ward, Jamie

2015-03-01

Sensory substitution devices convert information normally associated with one sense into another sense (e.g. converting vision into sound). This is often done to compensate for an impaired sense. The present research uses a multimodal approach in which both natural vision and sound-from-vision ('soundscapes') are simultaneously presented. Although there is a systematic correspondence between what is seen and what is heard, we introduce a local discrepancy between the signals (the presence of a target object that is heard but not seen) that the participant is required to locate. In addition to behavioural responses, the participants' gaze is monitored with eye-tracking. Although the target object is only presented in the auditory channel, behavioural performance is enhanced when visual information relating to the non-target background is presented. In this instance, vision may be used to generate predictions about the soundscape that enhances the ability to detect the hidden auditory object. The eye-tracking data reveal that participants look for longer in the quadrant containing the auditory target even when they subsequently judge it to be located elsewhere. As such, eye movements generated by soundscapes reveal the knowledge of the target location that does not necessarily correspond to the actual judgment made. The results provide a proof of principle that multimodal sensory substitution may be of benefit to visually impaired people with some residual vision and, in normally sighted participants, for guiding search within complex scenes.
Patterns of Auditory Perception Skills in Children with Learning Disabilities: A Computer-Assisted Approach.

ERIC Educational Resources Information Center

Pressman, E.; And Others

1986-01-01

The auditory receptive language skills of 40 learning disabled (LD) and 40 non-disabled boys (all 7 - 11 years old) were assessed via computerized versions of subtests of the Goldman-Fristoe-Woodcock Auditory Skills Test Battery. The computerized assessment correctly identified 92.5% of the LD group and 65% of the normal control children. (DB)
Brainstem auditory-evoked potentials as an objective tool for evaluating hearing dysfunction in traumatic brain injury.

PubMed

Lew, Henry L; Lee, Eun Ha; Miyoshi, Yasushi; Chang, Douglas G; Date, Elaine S; Jerger, James F

2004-03-01

Because of the violent nature of traumatic brain injury, traumatic brain injury patients are susceptible to various types of trauma involving the auditory system. We report a case of a 55-yr-old man who presented with communication problems after traumatic brain injury. Initial results from behavioral audiometry and Weber/Rinne tests were not reliable because of poor cooperation. He was transferred to our service for inpatient rehabilitation, where review of the initial head computed tomographic scan showed only left temporal bone fracture. Brainstem auditory-evoked potential was then performed to evaluate his hearing function. The results showed bilateral absence of auditory-evoked responses, which strongly suggested bilateral deafness. This finding led to a follow-up computed tomographic scan, with focus on bilateral temporal bones. A subtle transverse fracture of the right temporal bone was then detected, in addition to the left temporal bone fracture previously identified. Like children with hearing impairment, traumatic brain injury patients may not be able to verbalize their auditory deficits in a timely manner. If hearing loss is suspected in a patient who is unable to participate in traditional behavioral audiometric testing, brainstem auditory-evoked potential may be an option for evaluating hearing dysfunction.
Neural mechanisms underlying auditory feedback control of speech

PubMed Central

Reilly, Kevin J.; Guenther, Frank H.

2013-01-01

The neural substrates underlying auditory feedback control of speech were investigated using a combination of functional magnetic resonance imaging (fMRI) and computational modeling. Neural responses were measured while subjects spoke monosyllabic words under two conditions: (i) normal auditory feedback of their speech, and (ii) auditory feedback in which the first formant frequency of their speech was unexpectedly shifted in real time. Acoustic measurements showed compensation to the shift within approximately 135 ms of onset. Neuroimaging revealed increased activity in bilateral superior temporal cortex during shifted feedback, indicative of neurons coding mismatches between expected and actual auditory signals, as well as right prefrontal and Rolandic cortical activity. Structural equation modeling revealed increased influence of bilateral auditory cortical areas on right frontal areas during shifted speech, indicating that projections from auditory error cells in posterior superior temporal cortex to motor correction cells in right frontal cortex mediate auditory feedback control of speech. PMID:18035557
Optic Flow Dominates Visual Scene Polarity in Causing Adaptive Modification of Locomotor Trajectory

NASA Technical Reports Server (NTRS)

Nomura, Y.; Mulavara, A. P.; Richards, J. T.; Brady, R.; Bloomberg, Jacob J.

2005-01-01

Locomotion and posture are influenced and controlled by vestibular, visual and somatosensory information. Optic flow and scene polarity are two characteristics of a visual scene that have been identified as being critical in how they affect perceived body orientation and self-motion. The goal of this study was to determine the role of optic flow and visual scene polarity on adaptive modification in locomotor trajectory. Two computer-generated virtual reality scenes were shown to subjects during 20 minutes of treadmill walking. One scene was a highly polarized scene while the other was composed of objects displayed in a non-polarized fashion. Both virtual scenes depicted constant rate self-motion equivalent to walking counterclockwise around the perimeter of a room. Subjects performed Stepping Tests blindfolded before and after scene exposure to assess adaptive changes in locomotor trajectory. Subjects showed a significant difference in heading direction, between pre and post adaptation stepping tests, when exposed to either scene during treadmill walking. However, there was no significant difference in the subjects heading direction between the two visual scene polarity conditions. Therefore, it was inferred from these data that optic flow has a greater role than visual polarity in influencing adaptive locomotor function.
Reliable Acquisition of RAM Dumps from Intel-Based Apple Mac Computers over FireWire

NASA Astrophysics Data System (ADS)

Gladyshev, Pavel; Almansoori, Afrah

RAM content acquisition is an important step in live forensic analysis of computer systems. FireWire offers an attractive way to acquire RAM content of Apple Mac computers equipped with a FireWire connection. However, the existing techniques for doing so require substantial knowledge of the target computer configuration and cannot be used reliably on a previously unknown computer in a crime scene. This paper proposes a novel method for acquiring RAM content of Apple Mac computers over FireWire, which automatically discovers necessary information about the target computer and can be used in the crime scene setting. As an application of the developed method, the techniques for recovery of AOL Instant Messenger (AIM) conversation fragments from RAM dumps are also discussed in this paper.
How Might People Near National Roads Be Affected by Traffic Noise as Electric Vehicles Increase in Number? A Laboratory Study of Subjective Evaluations of Environmental Noise.

PubMed

Walker, Ian; Kennedy, John; Martin, Susanna; Rice, Henry

2016-01-01

We face a likely shift to electric vehicles (EVs) but the environmental and human consequences of this are not yet well understood. Simulated auditory traffic scenes were synthesized from recordings of real conventional and EVs. These sounded similar to what might be heard by a person near a major national road. Versions of the simulation had 0%, 20%, 40%, 60%, 80% and 100% EVs. Participants heard the auditory scenes in random order, rating each on five perceptual dimensions such as pleasant-unpleasant and relaxing-stressful. Ratings of traffic noise were, overall, towards the negative end of these scales, but improved significantly when there were high proportions of EVs in the traffic mix, particularly when there were 80% or 100% EVs. This suggests a shift towards a high proportion of EVs is likely to improve the subjective experiences of people exposed to traffic noise from major roads. The effects were not a simple result of EVs being quieter: ratings of bandpass-filtered versions of the recordings suggested that people's perceptions of traffic noise were specifically influenced by energy in the 500-2000 Hz band. Engineering countermeasures to reduce noise in this band might be effective for improving the subjective experience of people living or working near major roads, even for conventional vehicles; energy in the 0-100 Hz band was particularly associated with people identifying sound as 'quiet' and, again, this might feed into engineering to reduce the impact of traffic noise on people.
How Might People Near National Roads Be Affected by Traffic Noise as Electric Vehicles Increase in Number? A Laboratory Study of Subjective Evaluations of Environmental Noise

PubMed Central

Walker, Ian; Kennedy, John; Martin, Susanna; Rice, Henry

2016-01-01

We face a likely shift to electric vehicles (EVs) but the environmental and human consequences of this are not yet well understood. Simulated auditory traffic scenes were synthesized from recordings of real conventional and EVs. These sounded similar to what might be heard by a person near a major national road. Versions of the simulation had 0%, 20%, 40%, 60%, 80% and 100% EVs. Participants heard the auditory scenes in random order, rating each on five perceptual dimensions such as pleasant–unpleasant and relaxing–stressful. Ratings of traffic noise were, overall, towards the negative end of these scales, but improved significantly when there were high proportions of EVs in the traffic mix, particularly when there were 80% or 100% EVs. This suggests a shift towards a high proportion of EVs is likely to improve the subjective experiences of people exposed to traffic noise from major roads. The effects were not a simple result of EVs being quieter: ratings of bandpass-filtered versions of the recordings suggested that people’s perceptions of traffic noise were specifically influenced by energy in the 500–2000 Hz band. Engineering countermeasures to reduce noise in this band might be effective for improving the subjective experience of people living or working near major roads, even for conventional vehicles; energy in the 0–100 Hz band was particularly associated with people identifying sound as ‘quiet’ and, again, this might feed into engineering to reduce the impact of traffic noise on people. PMID:26938865
Music Perception in Dementia.

PubMed

Golden, Hannah L; Clark, Camilla N; Nicholas, Jennifer M; Cohen, Miriam H; Slattery, Catherine F; Paterson, Ross W; Foulkes, Alexander J M; Schott, Jonathan M; Mummery, Catherine J; Crutch, Sebastian J; Warren, Jason D

2017-01-01

Despite much recent interest in music and dementia, music perception has not been widely studied across dementia syndromes using an information processing approach. Here we addressed this issue in a cohort of 30 patients representing major dementia syndromes of typical Alzheimer's disease (AD, n = 16), logopenic aphasia (LPA, an Alzheimer variant syndrome; n = 5), and progressive nonfluent aphasia (PNFA; n = 9) in relation to 19 healthy age-matched individuals. We designed a novel neuropsychological battery to assess perception of musical patterns in the dimensions of pitch and temporal information (requiring detection of notes that deviated from the established pattern based on local or global sequence features) and musical scene analysis (requiring detection of a familiar tune within polyphonic harmony). Performance on these tests was referenced to generic auditory (timbral) deviance detection and recognition of familiar tunes and adjusted for general auditory working memory performance. Relative to healthy controls, patients with AD and LPA had group-level deficits of global pitch (melody contour) processing while patients with PNFA as a group had deficits of local (interval) as well as global pitch processing. There was substantial individual variation within syndromic groups. Taking working memory performance into account, no specific deficits of musical temporal processing, timbre processing, musical scene analysis, or tune recognition were identified. The findings suggest that particular aspects of music perception such as pitch pattern analysis may open a window on the processing of information streams in major dementia syndromes. The potential selectivity of musical deficits for particular dementia syndromes and particular dimensions of processing warrants further systematic investigation.

Prediction of Auditory and Visual P300 Brain-Computer Interface Aptitude

PubMed Central

Halder, Sebastian; Hammer, Eva Maria; Kleih, Sonja Claudia; Bogdan, Martin; Rosenstiel, Wolfgang; Birbaumer, Niels; Kübler, Andrea

2013-01-01

Objective Brain-computer interfaces (BCIs) provide a non-muscular communication channel for patients with late-stage motoneuron disease (e.g., amyotrophic lateral sclerosis (ALS)) or otherwise motor impaired people and are also used for motor rehabilitation in chronic stroke. Differences in the ability to use a BCI vary from person to person and from session to session. A reliable predictor of aptitude would allow for the selection of suitable BCI paradigms. For this reason, we investigated whether P300 BCI aptitude could be predicted from a short experiment with a standard auditory oddball. Methods Forty healthy participants performed an electroencephalography (EEG) based visual and auditory P300-BCI spelling task in a single session. In addition, prior to each session an auditory oddball was presented. Features extracted from the auditory oddball were analyzed with respect to predictive power for BCI aptitude. Results Correlation between auditory oddball response and P300 BCI accuracy revealed a strong relationship between accuracy and N2 amplitude and the amplitude of a late ERP component between 400 and 600 ms. Interestingly, the P3 amplitude of the auditory oddball response was not correlated with accuracy. Conclusions Event-related potentials recorded during a standard auditory oddball session moderately predict aptitude in an audiory and highly in a visual P300 BCI. The predictor will allow for faster paradigm selection. Significance Our method will reduce strain on patients because unsuccessful training may be avoided, provided the results can be generalized to the patient population. PMID:23457444
Eye movements to audiovisual scenes reveal expectations of a just world.

PubMed

Callan, Mitchell J; Ferguson, Heather J; Bindemann, Markus

2013-02-01

When confronted with bad things happening to good people, observers often engage reactive strategies, such as victim derogation, to maintain a belief in a just world. Although such reasoning is usually made retrospectively, we investigated the extent to which knowledge of another person's good or bad behavior can also bias people's online expectations for subsequent good or bad outcomes. Using a fully crossed design, participants listened to auditory scenarios that varied in terms of whether the characters engaged in morally good or bad behavior while their eye movements were tracked around concurrent visual scenes depicting good and bad outcomes. We found that the good (bad) behavior of the characters influenced gaze preferences for good (bad) outcomes just prior to the actual outcomes being revealed. These findings suggest that beliefs about a person's moral worth encourage observers to foresee a preferred deserved outcome as the event unfolds. We include evidence to show that this effect cannot be explained in terms of affective priming or matching strategies. 2013 APA, all rights reserved
A class of temporal boundaries derived by quantifying the sense of separation.

PubMed

Paine, Llewyn Elise; Gilden, David L

2013-12-01

The perception of moment-to-moment environmental flux as being composed of meaningful events requires that memory processes coordinate with cues that signify beginnings and endings. We have constructed a technique that allows this coordination to be monitored indirectly. This technique works by embedding a sequential priming task into the event under study. Memory and perception must be coordinated to resolve temporal flux into scenes. The implicit memory processes inherent in sequential priming are able to effectively shadow then mirror scene-forming processes. Certain temporal boundaries are found to weaken the strength of irrelevant feature priming, a signal which can then be used in more ambiguous cases to infer how people segment time. Over the course of 13 independent studies, we were able to calibrate the technique and then use it to measure the strength of event segmentation in several instructive contexts that involved both visual and auditory modalities. The signal generated by sequential priming may permit the sense of separation between events to be measured as an extensive psychophysical quantity.
Perception of scent over-marks by golden hamsters (Mesocricetus auratus): novel mechanisms for determining which individual's mark is on top.

PubMed

Johnston, R E; Bhorade, A

1998-09-01

Hamsters preferentially remember or value the top scent of a scent over-mark. What cues do they use to do this? Using habituation-discrimination techniques, we exposed male golden hamsters (Mesocricetus auratus) on 3 to 4 trials to genital over-marks from 2 females and then tested subjects for their familiarity with these 2 scents compared with that of a novel female's secretion. Preferential memory for 1 of the 2 individuals' scents did not occur if the 2 marks did not overlap or did not overlap but differed in age, but it did occur if a region of overlap existed or 1 mark apparently occluded another (but did not overlap it). Thus, hamsters use regions of overlap and the spatial configuration of scents to evaluate over-marks. These phenomena constitute evidence for previously unsuspected perceptual abilities, including olfactory scene analysis, which is analogous to visual and auditory scene analysis.
a Super Voxel-Based Riemannian Graph for Multi Scale Segmentation of LIDAR Point Clouds

NASA Astrophysics Data System (ADS)

Li, Minglei

2018-04-01

Automatically segmenting LiDAR points into respective independent partitions has become a topic of great importance in photogrammetry, remote sensing and computer vision. In this paper, we cast the problem of point cloud segmentation as a graph optimization problem by constructing a Riemannian graph. The scale space of the observed scene is explored by an octree-based over-segmentation with different depths. The over-segmentation produces many super voxels which restrict the structure of the scene and will be used as nodes of the graph. The Kruskal coordinates are used to compute edge weights that are proportional to the geodesic distance between nodes. Then we compute the edge-weight matrix in which the elements reflect the sectional curvatures associated with the geodesic paths between super voxel nodes on the scene surface. The final segmentation results are generated by clustering similar super voxels and cutting off the weak edges in the graph. The performance of this method was evaluated on LiDAR point clouds for both indoor and outdoor scenes. Additionally, extensive comparisons to state of the art techniques show that our algorithm outperforms on many metrics.
Enhanced Graphics for Extended Scale Range

NASA Technical Reports Server (NTRS)

Hanson, Andrew J.; Chi-Wing Fu, Philip

2012-01-01

Enhanced Graphics for Extended Scale Range is a computer program for rendering fly-through views of scene models that include visible objects differing in size by large orders of magnitude. An example would be a scene showing a person in a park at night with the moon, stars, and galaxies in the background sky. Prior graphical computer programs exhibit arithmetic and other anomalies when rendering scenes containing objects that differ enormously in scale and distance from the viewer. The present program dynamically repartitions distance scales of objects in a scene during rendering to eliminate almost all such anomalies in a way compatible with implementation in other software and in hardware accelerators. By assigning depth ranges correspond ing to rendering precision requirements, either automatically or under program control, this program spaces out object scales to match the precision requirements of the rendering arithmetic. This action includes an intelligent partition of the depth buffer ranges to avoid known anomalies from this source. The program is written in C++, using OpenGL, GLUT, and GLUI standard libraries, and nVidia GEForce Vertex Shader extensions. The program has been shown to work on several computers running UNIX and Windows operating systems.
Immediate integration of prosodic information from speech and visual information from pictures in the absence of focused attention: a mismatch negativity study.

PubMed

Li, X; Yang, Y; Ren, G

2009-06-16

Language is often perceived together with visual information. Recent experimental evidences indicated that, during spoken language comprehension, the brain can immediately integrate visual information with semantic or syntactic information from speech. Here we used the mismatch negativity to further investigate whether prosodic information from speech could be immediately integrated into a visual scene context or not, and especially the time course and automaticity of this integration process. Sixteen Chinese native speakers participated in the study. The materials included Chinese spoken sentences and picture pairs. In the audiovisual situation, relative to the concomitant pictures, the spoken sentence was appropriately accented in the standard stimuli, but inappropriately accented in the two kinds of deviant stimuli. In the purely auditory situation, the speech sentences were presented without pictures. It was found that the deviants evoked mismatch responses in both audiovisual and purely auditory situations; the mismatch negativity in the purely auditory situation peaked at the same time as, but was weaker than that evoked by the same deviant speech sounds in the audiovisual situation. This pattern of results suggested immediate integration of prosodic information from speech and visual information from pictures in the absence of focused attention.
Psychophysical evidence for auditory motion parallax.

PubMed

Genzel, Daria; Schutte, Michael; Brimijoin, W Owen; MacNeilage, Paul R; Wiegrebe, Lutz

2018-04-17

Distance is important: From an ecological perspective, knowledge about the distance to either prey or predator is vital. However, the distance of an unknown sound source is particularly difficult to assess, especially in anechoic environments. In vision, changes in perspective resulting from observer motion produce a reliable, consistent, and unambiguous impression of depth known as motion parallax. Here we demonstrate with formal psychophysics that humans can exploit auditory motion parallax, i.e., the change in the dynamic binaural cues elicited by self-motion, to assess the relative depths of two sound sources. Our data show that sensitivity to relative depth is best when subjects move actively; performance deteriorates when subjects are moved by a motion platform or when the sound sources themselves move. This is true even though the dynamic binaural cues elicited by these three types of motion are identical. Our data demonstrate a perceptual strategy to segregate intermittent sound sources in depth and highlight the tight interaction between self-motion and binaural processing that allows assessment of the spatial layout of complex acoustic scenes.
Research in interactive scene analysis

NASA Technical Reports Server (NTRS)

Tenenbaum, J. M.; Barrow, H. G.; Weyl, S. A.

1976-01-01

Cooperative (man-machine) scene analysis techniques were developed whereby humans can provide a computer with guidance when completely automated processing is infeasible. An interactive approach promises significant near-term payoffs in analyzing various types of high volume satellite imagery, as well as vehicle-based imagery used in robot planetary exploration. This report summarizes the work accomplished over the duration of the project and describes in detail three major accomplishments: (1) the interactive design of texture classifiers; (2) a new approach for integrating the segmentation and interpretation phases of scene analysis; and (3) the application of interactive scene analysis techniques to cartography.
View generated database

NASA Technical Reports Server (NTRS)

Downward, James G.

1992-01-01

This document represents the final report for the View Generated Database (VGD) project, NAS7-1066. It documents the work done on the project up to the point at which all project work was terminated due to lack of project funds. The VGD was to provide the capability to accurately represent any real-world object or scene as a computer model. Such models include both an accurate spatial/geometric representation of surfaces of the object or scene, as well as any surface detail present on the object. Applications of such models are numerous, including acquisition and maintenance of work models for tele-autonomous systems, generation of accurate 3-D geometric/photometric models for various 3-D vision systems, and graphical models for realistic rendering of 3-D scenes via computer graphics.
Mission Driven Scene Understanding: Candidate Model Training and Validation

DTIC Science & Technology

2016-09-01

driven scene understanding. One of the candidate engines that we are evaluating is a convolutional neural network (CNN) program installed on a Windows 10...Theano-AlexNet6,7) installed on a Windows 10 notebook computer. To the best of our knowledge, an implementation of the open-source, Python-based...AlexNet CNN on a Windows notebook computer has not been previously reported. In this report, we present progress toward the proof-of-principle testing
Graded zooming

DOEpatents

Coffland, Douglas R.

2006-04-25

A system for increasing the resolution in the far field resolution of video or still frame images, while maintaining full coverage in the near field. The system includes a camera connected to a computer. The computer applies a specific zooming scale factor to each of line of pixels and continuously increases the scale factor of the line of pixels from the bottom to the top to capture the scene in the near field, yet maintain resolution in the scene in the far field.
Diminished auditory sensory gating during active auditory verbal hallucinations.

PubMed

Thoma, Robert J; Meier, Andrew; Houck, Jon; Clark, Vincent P; Lewine, Jeffrey D; Turner, Jessica; Calhoun, Vince; Stephen, Julia

2017-10-01

Auditory sensory gating, assessed in a paired-click paradigm, indicates the extent to which incoming stimuli are filtered, or "gated", in auditory cortex. Gating is typically computed as the ratio of the peak amplitude of the event related potential (ERP) to a second click (S2) divided by the peak amplitude of the ERP to a first click (S1). Higher gating ratios are purportedly indicative of incomplete suppression of S2 and considered to represent sensory processing dysfunction. In schizophrenia, hallucination severity is positively correlated with gating ratios, and it was hypothesized that a failure of sensory control processes early in auditory sensation (gating) may represent a larger system failure within the auditory data stream; resulting in auditory verbal hallucinations (AVH). EEG data were collected while patients (N=12) with treatment-resistant AVH pressed a button to indicate the beginning (AVH-on) and end (AVH-off) of each AVH during a paired click protocol. For each participant, separate gating ratios were computed for the P50, N100, and P200 components for each of the AVH-off and AVH-on states. AVH trait severity was assessed using the Psychotic Symptoms Rating Scales AVH Total score (PSYRATS). The results of a mixed model ANOVA revealed an overall effect for AVH state, such that gating ratios were significantly higher during the AVH-on state than during AVH-off for all three components. PSYRATS score was significantly and negatively correlated with N100 gating ratio only in the AVH-off state. These findings link onset of AVH with a failure of an empirically-defined auditory inhibition system, auditory sensory gating, and pave the way for a sensory gating model of AVH. Copyright © 2017 Elsevier B.V. All rights reserved.
MONET: multidimensional radiative cloud scene model

NASA Astrophysics Data System (ADS)

Chervet, Patrick

1999-12-01

All cloud fields exhibit variable structures (bulge) and heterogeneities in water distributions. With the development of multidimensional radiative models by the atmospheric community, it is now possible to describe horizontal heterogeneities of the cloud medium, to study these influences on radiative quantities. We have developed a complete radiative cloud scene generator, called MONET (French acronym for: MOdelisation des Nuages En Tridim.) to compute radiative cloud scene from visible to infrared wavelengths for various viewing and solar conditions, different spatial scales, and various locations on the Earth. MONET is composed of two parts: a cloud medium generator (CSSM -- Cloud Scene Simulation Model) developed by the Air Force Research Laboratory, and a multidimensional radiative code (SHDOM -- Spherical Harmonic Discrete Ordinate Method) developed at the University of Colorado by Evans. MONET computes images for several scenario defined by user inputs: date, location, viewing angles, wavelength, spatial resolution, meteorological conditions (atmospheric profiles, cloud types)... For the same cloud scene, we can output different viewing conditions, or/and various wavelengths. Shadowing effects on clouds or grounds are taken into account. This code is useful to study heterogeneity effects on satellite data for various cloud types and spatial resolutions, and to determine specifications of new imaging sensor.
Fusion of monocular cues to detect man-made structures in aerial imagery

NASA Technical Reports Server (NTRS)

Shufelt, Jefferey; Mckeown, David M.

1991-01-01

The extraction of buildings from aerial imagery is a complex problem for automated computer vision. It requires locating regions in a scene that possess properties distinguishing them as man-made objects as opposed to naturally occurring terrain features. It is reasonable to assume that no single detection method can correctly delineate or verify buildings in every scene. A cooperative-methods paradigm is useful in approaching the building extraction problem. Using this paradigm, each extraction technique provides information which can be added or assimilated into an overall interpretation of the scene. Thus, the main objective is to explore the development of computer vision system that integrates the results of various scene analysis techniques into an accurate and robust interpretation of the underlying three dimensional scene. The problem of building hypothesis fusion in aerial imagery is discussed. Building extraction techniques are briefly surveyed, including four building extraction, verification, and clustering systems. A method for fusing the symbolic data generated by these systems is described, and applied to monocular image and stereo image data sets. Evaluation methods for the fusion results are described, and the fusion results are analyzed using these methods.
Creating Three-Dimensional Scenes

ERIC Educational Resources Information Center

Krumpe, Norm

2005-01-01

Persistence of Vision Raytracer (POV-Ray), a free computer program for creating photo-realistic, three-dimensional scenes and a link for Mathematica users interested in generating POV-Ray files from within Mathematica, is discussed. POV-Ray has great potential in secondary mathematics classrooms and helps in strengthening students' visualization…
Navigating the auditory scene: an expert role for the hippocampus.

PubMed

Teki, Sundeep; Kumar, Sukhbinder; von Kriegstein, Katharina; Stewart, Lauren; Lyness, C Rebecca; Moore, Brian C J; Capleton, Brian; Griffiths, Timothy D

2012-08-29

Over a typical career piano tuners spend tens of thousands of hours exploring a specialized acoustic environment. Tuning requires accurate perception and adjustment of beats in two-note chords that serve as a navigational device to move between points in previously learned acoustic scenes. It is a two-stage process that depends on the following: first, selective listening to beats within frequency windows, and, second, the subsequent use of those beats to navigate through a complex soundscape. The neuroanatomical substrates underlying brain specialization for such fundamental organization of sound scenes are unknown. Here, we demonstrate that professional piano tuners are significantly better than controls matched for age and musical ability on a psychophysical task simulating active listening to beats within frequency windows that is based on amplitude modulation rate discrimination. Tuners show a categorical increase in gray matter volume in the right frontal operculum and right superior temporal lobe. Tuners also show a striking enhancement of gray matter volume in the anterior hippocampus, parahippocampal gyrus, and superior temporal gyrus, and an increase in white matter volume in the posterior hippocampus as a function of years of tuning experience. The relationship with gray matter volume is sensitive to years of tuning experience and starting age but not actual age or level of musicality. Our findings support a role for a core set of regions in the hippocampus and superior temporal cortex in skilled exploration of complex sound scenes in which precise sound "templates" are encoded and consolidated into memory over time in an experience-dependent manner.
Probability distributions of whisker-surface contact: quantifying elements of the rat vibrissotactile natural scene.

PubMed

Hobbs, Jennifer A; Towal, R Blythe; Hartmann, Mitra J Z

2015-08-01

Analysis of natural scene statistics has been a powerful approach for understanding neural coding in the auditory and visual systems. In the field of somatosensation, it has been more challenging to quantify the natural tactile scene, in part because somatosensory signals are so tightly linked to the animal's movements. The present work takes a step towards quantifying the natural tactile scene for the rat vibrissal system by simulating rat whisking motions to systematically investigate the probabilities of whisker-object contact in naturalistic environments. The simulations permit an exhaustive search through the complete space of possible contact patterns, thereby allowing for the characterization of the patterns that would most likely occur during long sequences of natural exploratory behavior. We specifically quantified the probabilities of 'concomitant contact', that is, given that a particular whisker makes contact with a surface during a whisk, what is the probability that each of the other whiskers will also make contact with the surface during that whisk? Probabilities of concomitant contact were quantified in simulations that assumed increasingly naturalistic conditions: first, the space of all possible head poses; second, the space of behaviorally preferred head poses as measured experimentally; and third, common head poses in environments such as cages and burrows. As environments became more naturalistic, the probability distributions shifted from exhibiting a 'row-wise' structure to a more diagonal structure. Results also reveal that the rat appears to use motor strategies (e.g. head pitches) that generate contact patterns that are particularly well suited to extract information in the presence of uncertainty. © 2015. Published by The Company of Biologists Ltd.
Auditory-Motor Interactions in Pediatric Motor Speech Disorders: Neurocomputational Modeling of Disordered Development

PubMed Central

Terband, H.; Maassen, B.; Guenther, F.H.; Brumberg, J.

2014-01-01

Background/Purpose Differentiating the symptom complex due to phonological-level disorders, speech delay and pediatric motor speech disorders is a controversial issue in the field of pediatric speech and language pathology. The present study investigated the developmental interaction between neurological deficits in auditory and motor processes using computational modeling with the DIVA model. Method In a series of computer simulations, we investigated the effect of a motor processing deficit alone (MPD), and the effect of a motor processing deficit in combination with an auditory processing deficit (MPD+APD) on the trajectory and endpoint of speech motor development in the DIVA model. Results Simulation results showed that a motor programming deficit predominantly leads to deterioration on the phonological level (phonemic mappings) when auditory self-monitoring is intact, and on the systemic level (systemic mapping) if auditory self-monitoring is impaired. Conclusions These findings suggest a close relation between quality of auditory self-monitoring and the involvement of phonological vs. motor processes in children with pediatric motor speech disorders. It is suggested that MPD+APD might be involved in typically apraxic speech output disorders and MPD in pediatric motor speech disorders that also have a phonological component. Possibilities to verify these hypotheses using empirical data collected from human subjects are discussed. PMID:24491630
Hearing in Insects.

PubMed

Göpfert, Martin C; Hennig, R Matthias

2016-01-01

Insect hearing has independently evolved multiple times in the context of intraspecific communication and predator detection by transforming proprioceptive organs into ears. Research over the past decade, ranging from the biophysics of sound reception to molecular aspects of auditory transduction to the neuronal mechanisms of auditory signal processing, has greatly advanced our understanding of how insects hear. Apart from evolutionary innovations that seem unique to insect hearing, parallels between insect and vertebrate auditory systems have been uncovered, and the auditory sensory cells of insects and vertebrates turned out to be evolutionarily related. This review summarizes our current understanding of insect hearing. It also discusses recent advances in insect auditory research, which have put forward insect auditory systems for studying biological aspects that extend beyond hearing, such as cilium function, neuronal signal computation, and sensory system evolution.

Computer Vision Research and its Applications to Automated Cartography

DTIC Science & Technology

1985-09-01

D Scene Geometry Thomas M. Strat and Martin A. Fischler Appendix D A New Sense for Depth of Field Alex P. Pentland iv 9.* qb CONTENTS (cont’d...D modeling. A. Baseline Stereo System As a framework for integration and evaluation of our research in modeling * 3-D scene geometry , as well as a...B. New Methods for Stereo Compilation As we previously indicated, the conventional approach to recovering scene geometry from a stereo pair of
Integration of an open interface PC scene generator using COTS DVI converter hardware

NASA Astrophysics Data System (ADS)

Nordland, Todd; Lyles, Patrick; Schultz, Bret

2006-05-01

Commercial-Off-The-Shelf (COTS) personal computer (PC) hardware is increasingly capable of computing high dynamic range (HDR) scenes for military sensor testing at high frame rates. New electro-optical and infrared (EO/IR) scene projectors feature electrical interfaces that can accept the DVI output of these PC systems. However, military Hardware-in-the-loop (HWIL) facilities such as those at the US Army Aviation and Missile Research Development and Engineering Center (AMRDEC) utilize a sizeable inventory of existing projection systems that were designed to use the Silicon Graphics Incorporated (SGI) digital video port (DVP, also known as DVP2 or DD02) interface. To mate the new DVI-based scene generation systems to these legacy projection systems, CG2 Inc., a Quantum3D Company (CG2), has developed a DVI-to-DVP converter called Delta DVP. This device takes progressive scan DVI input, converts it to digital parallel data, and combines and routes color components to derive a 16-bit wide luminance channel replicated on a DVP output interface. The HWIL Functional Area of AMRDEC has developed a suite of modular software to perform deterministic real-time, wave band-specific rendering of sensor scenes, leveraging the features of commodity graphics hardware and open source software. Together, these technologies enable sensor simulation and test facilities to integrate scene generation and projection components with diverse pedigrees.
Computer-generated hologram calculation for real scenes using a commercial portable plenoptic camera

NASA Astrophysics Data System (ADS)

Endo, Yutaka; Wakunami, Koki; Shimobaba, Tomoyoshi; Kakue, Takashi; Arai, Daisuke; Ichihashi, Yasuyuki; Yamamoto, Kenji; Ito, Tomoyoshi

2015-12-01

This paper shows the process used to calculate a computer-generated hologram (CGH) for real scenes under natural light using a commercial portable plenoptic camera. In the CGH calculation, a light field captured with the commercial plenoptic camera is converted into a complex amplitude distribution. Then the converted complex amplitude is propagated to a CGH plane. We tested both numerical and optical reconstructions of the CGH and showed that the CGH calculation from captured data with the commercial plenoptic camera was successful.
[Ventriloquism and audio-visual integration of voice and face].

PubMed

Yokosawa, Kazuhiko; Kanaya, Shoko

2012-07-01

Presenting synchronous auditory and visual stimuli in separate locations creates the illusion that the sound originates from the direction of the visual stimulus. Participants' auditory localization bias, called the ventriloquism effect, has revealed factors affecting the perceptual integration of audio-visual stimuli. However, many studies on audio-visual processes have focused on performance in simplified experimental situations, with a single stimulus in each sensory modality. These results cannot necessarily explain our perceptual behavior in natural scenes, where various signals exist within a single sensory modality. In the present study we report the contributions of a cognitive factor, that is, the audio-visual congruency of speech, although this factor has often been underestimated in previous ventriloquism research. Thus, we investigated the contribution of speech congruency on the ventriloquism effect using a spoken utterance and two videos of a talking face. The salience of facial movements was also manipulated. As a result, when bilateral visual stimuli are presented in synchrony with a single voice, cross-modal speech congruency was found to have a significant impact on the ventriloquism effect. This result also indicated that more salient visual utterances attracted participants' auditory localization. The congruent pairing of audio-visual utterances elicited greater localization bias than did incongruent pairing, whereas previous studies have reported little dependency on the reality of stimuli in ventriloquism. Moreover, audio-visual illusory congruency, owing to the McGurk effect, caused substantial visual interference to auditory localization. This suggests that a greater flexibility in responding to multi-sensory environments exists than has been previously considered.
Impairing the useful field of view in natural scenes: Tunnel vision versus general interference.

PubMed

Ringer, Ryan V; Throneburg, Zachary; Johnson, Aaron P; Kramer, Arthur F; Loschky, Lester C

2016-01-01

A fundamental issue in visual attention is the relationship between the useful field of view (UFOV), the region of visual space where information is encoded within a single fixation, and eccentricity. A common assumption is that impairing attentional resources reduces the size of the UFOV (i.e., tunnel vision). However, most research has not accounted for eccentricity-dependent changes in spatial resolution, potentially conflating fixed visual properties with flexible changes in visual attention. Williams (1988, 1989) argued that foveal loads are necessary to reduce the size of the UFOV, producing tunnel vision. Without a foveal load, it is argued that the attentional decrement is constant across the visual field (i.e., general interference). However, other research asserts that auditory working memory (WM) loads produce tunnel vision. To date, foveal versus auditory WM loads have not been compared to determine if they differentially change the size of the UFOV. In two experiments, we tested the effects of a foveal (rotated L vs. T discrimination) task and an auditory WM (N-back) task on an extrafoveal (Gabor) discrimination task. Gabor patches were scaled for size and processing time to produce equal performance across the visual field under single-task conditions, thus removing the confound of eccentricity-dependent differences in visual sensitivity. The results showed that although both foveal and auditory loads reduced Gabor orientation sensitivity, only the foveal load interacted with retinal eccentricity to produce tunnel vision, clearly demonstrating task-specific changes to the form of the UFOV. This has theoretical implications for understanding the UFOV.
Investigation of scene identification algorithms for radiation budget measurements

NASA Technical Reports Server (NTRS)

Diekmann, F. J.

1986-01-01

The computation of Earth radiation budget from satellite measurements requires the identification of the scene in order to select spectral factors and bidirectional models. A scene identification procedure is developed for AVHRR SW and LW data by using two radiative transfer models. These AVHRR GAC pixels are then attached to corresponding ERBE pixels and the results are sorted into scene identification probability matrices. These scene intercomparisons show that there generally is a higher tendency for underestimation of cloudiness over ocean at high cloud amounts, e.g., mostly cloudy instead of overcast, partly cloudy instead of mostly cloudy, for the ERBE relative to the AVHRR results. Reasons for this are explained. Preliminary estimates of the errors of exitances due to scene misidentification demonstrates the high dependency on the probability matrices. While the longwave error can generally be neglected the shortwave deviations have reached maximum values of more than 12% of the respective exitances.
AgRISTARS. Supporting research: Algorithms for scene modelling

NASA Technical Reports Server (NTRS)

Rassbach, M. E. (Principal Investigator)

1982-01-01

The requirements for a comprehensive analysis of LANDSAT or other visual data scenes are defined. The development of a general model of a scene and a computer algorithm for finding the particular model for a given scene is discussed. The modelling system includes a boundary analysis subsystem, which detects all the boundaries and lines in the image and builds a boundary graph; a continuous variation analysis subsystem, which finds gradual variations not well approximated by a boundary structure; and a miscellaneous features analysis, which includes texture, line parallelism, etc. The noise reduction capabilities of this method and its use in image rectification and registration are discussed.
Rational-operator-based depth-from-defocus approach to scene reconstruction.

PubMed

Li, Ang; Staunton, Richard; Tjahjadi, Tardi

2013-09-01

This paper presents a rational-operator-based approach to depth from defocus (DfD) for the reconstruction of three-dimensional scenes from two-dimensional images, which enables fast DfD computation that is independent of scene textures. Two variants of the approach, one using the Gaussian rational operators (ROs) that are based on the Gaussian point spread function (PSF) and the second based on the generalized Gaussian PSF, are considered. A novel DfD correction method is also presented to further improve the performance of the approach. Experimental results are considered for real scenes and show that both approaches outperform existing RO-based methods.
Three-dimensional information hierarchical encryption based on computer-generated holograms

NASA Astrophysics Data System (ADS)

Kong, Dezhao; Shen, Xueju; Cao, Liangcai; Zhang, Hao; Zong, Song; Jin, Guofan

2016-12-01

A novel approach for encrypting three-dimensional (3-D) scene information hierarchically based on computer-generated holograms (CGHs) is proposed. The CGHs of the layer-oriented 3-D scene information are produced by angular-spectrum propagation algorithm at different depths. All the CGHs are then modulated by different chaotic random phase masks generated by the logistic map. Hierarchical encryption encoding is applied when all the CGHs are accumulated one by one, and the reconstructed volume of the 3-D scene information depends on permissions of different users. The chaotic random phase masks could be encoded into several parameters of the chaotic sequences to simplify the transmission and preservation of the keys. Optical experiments verify the proposed method and numerical simulations show the high key sensitivity, high security, and application flexibility of the method.
Exploiting current-generation graphics hardware for synthetic-scene generation

NASA Astrophysics Data System (ADS)

Tanner, Michael A.; Keen, Wayne A.

2010-04-01

Increasing seeker frame rate and pixel count, as well as the demand for higher levels of scene fidelity, have driven scene generation software for hardware-in-the-loop (HWIL) and software-in-the-loop (SWIL) testing to higher levels of parallelization. Because modern PC graphics cards provide multiple computational cores (240 shader cores for a current NVIDIA Corporation GeForce and Quadro cards), implementation of phenomenology codes on graphics processing units (GPUs) offers significant potential for simultaneous enhancement of simulation frame rate and fidelity. To take advantage of this potential requires algorithm implementation that is structured to minimize data transfers between the central processing unit (CPU) and the GPU. In this paper, preliminary methodologies developed at the Kinetic Hardware In-The-Loop Simulator (KHILS) will be presented. Included in this paper will be various language tradeoffs between conventional shader programming, Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL), including performance trades and possible pathways for future tool development.
Real time imaging and infrared background scene analysis using the Naval Postgraduate School infrared search and target designation (NPS-IRSTD) system

NASA Astrophysics Data System (ADS)

Bernier, Jean D.

1991-09-01

The imaging in real time of infrared background scenes with the Naval Postgraduate School Infrared Search and Target Designation (NPS-IRSTD) System was achieved through extensive software developments in protected mode assembly language on an Intel 80386 33 MHz computer. The new software processes the 512 by 480 pixel images directly in the extended memory area of the computer where the DT-2861 frame grabber memory buffers are mapped. Direct interfacing, through a JDR-PR10 prototype card, between the frame grabber and the host computer AT bus enables each load of the frame grabber memory buffers to be effected under software control. The protected mode assembly language program can refresh the display of a six degree pseudo-color sector in the scanner rotation within the two second period of the scanner. A study of the imaging properties of the NPS-IRSTD is presented with preliminary work on image analysis and contrast enhancement of infrared background scenes.
CYCLOPS-3 System Research.

ERIC Educational Resources Information Center

Marill, Thomas; And Others

The aim of the CYCLOPS Project research is the development of techniques for allowing computers to perform visual scene analysis, pre-processing of visual imagery, and perceptual learning. Work on scene analysis and learning has previously been described. The present report deals with research on pre-processing and with further work on scene…
Neural Correlates of Fixation Duration during Real-world Scene Viewing: Evidence from Fixation-related (FIRE) fMRI.

PubMed

Henderson, John M; Choi, Wonil

2015-06-01

During active scene perception, our eyes move from one location to another via saccadic eye movements, with the eyes fixating objects and scene elements for varying amounts of time. Much of the variability in fixation duration is accounted for by attentional, perceptual, and cognitive processes associated with scene analysis and comprehension. For this reason, current theories of active scene viewing attempt to account for the influence of attention and cognition on fixation duration. Yet almost nothing is known about the neurocognitive systems associated with variation in fixation duration during scene viewing. We addressed this topic using fixation-related fMRI, which involves coregistering high-resolution eye tracking and magnetic resonance scanning to conduct event-related fMRI analysis based on characteristics of eye movements. We observed that activation in visual and prefrontal executive control areas was positively correlated with fixation duration, whereas activation in ventral areas associated with scene encoding and medial superior frontal and paracentral regions associated with changing action plans was negatively correlated with fixation duration. The results suggest that fixation duration in scene viewing is controlled by cognitive processes associated with real-time scene analysis interacting with motor planning, consistent with current computational models of active vision for scene perception.
Scene analysis for a breadboard Mars robot functioning in an indoor environment

NASA Technical Reports Server (NTRS)

Levine, M. D.

1973-01-01

The problem is delt with of computer perception in an indoor laboratory environment containing rocks of various sizes. The sensory data processing is required for the NASA/JPL breadboard mobile robot that is a test system for an adaptive variably-autonomous vehicle that will conduct scientific explorations on the surface of Mars. Scene analysis is discussed in terms of object segmentation followed by feature extraction, which results in a representation of the scene in the robot's world model.
Theoretical Limits of Lunar Vision Aided Navigation with Inertial Navigation System

DTIC Science & Technology

2015-03-26

camera model. Light reflected or projected from objects in the scene of the outside world is taken in by the aperture (or opening) shaped as a double...model’s analog aspects with an analog-to-digital interface converting raw images of the outside world scene into digital information a computer can use to...Figure 2.7. Digital Image Coordinate System. Used with permission [30]. Angular Field of View. The angular field of view is the angle of the world scene
Analytical techniques for the study of some parameters of multispectral scanner systems for remote sensing

NASA Technical Reports Server (NTRS)

Wiswell, E. R.; Cooper, G. R. (Principal Investigator)

1978-01-01

The author has identified the following significant results. The concept of average mutual information in the received spectral random process about the spectral scene was developed. Techniques amenable to implementation on a digital computer were also developed to make the required average mutual information calculations. These techniques required identification of models for the spectral response process of scenes. Stochastic modeling techniques were adapted for use. These techniques were demonstrated on empirical data from wheat and vegetation scenes.
The vectorization of a ray tracing program for image generation

NASA Technical Reports Server (NTRS)

Plunkett, D. J.; Cychosz, J. M.; Bailey, M. J.

1984-01-01

Ray tracing is a widely used method for producing realistic computer generated images. Ray tracing involves firing an imaginary ray from a view point, through a point on an image plane, into a three dimensional scene. The intersections of the ray with the objects in the scene determines what is visible at the point on the image plane. This process must be repeated many times, once for each point (commonly called a pixel) in the image plane. A typical image contains more than a million pixels making this process computationally expensive. A traditional ray tracing program processes one ray at a time. In such a serial approach, as much as ninety percent of the execution time is spent computing the intersection of a ray with the surface in the scene. With the CYBER 205, many rays can be intersected with all the bodies im the scene with a single series of vector operations. Vectorization of this intersection process results in large decreases in computation time. The CADLAB's interest in ray tracing stems from the need to produce realistic images of mechanical parts. A high quality image of a part during the design process can increase the productivity of the designer by helping him visualize the results of his work. To be useful in the design process, these images must be produced in a reasonable amount of time. This discussion will explain how the ray tracing process was vectorized and gives examples of the images obtained.
Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis

PubMed Central

McDermott, Josh H.; Simoncelli, Eero P.

2014-01-01

Rainstorms, insect swarms, and galloping horses produce “sound textures” – the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures. However, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation. PMID:21903084
Encoding of Natural Sounds at Multiple Spectral and Temporal Resolutions in the Human Auditory Cortex

PubMed Central

Santoro, Roberta; Moerel, Michelle; De Martino, Federico; Goebel, Rainer; Ugurbil, Kamil; Yacoub, Essa; Formisano, Elia

2014-01-01

Functional neuroimaging research provides detailed observations of the response patterns that natural sounds (e.g. human voices and speech, animal cries, environmental sounds) evoke in the human brain. The computational and representational mechanisms underlying these observations, however, remain largely unknown. Here we combine high spatial resolution (3 and 7 Tesla) functional magnetic resonance imaging (fMRI) with computational modeling to reveal how natural sounds are represented in the human brain. We compare competing models of sound representations and select the model that most accurately predicts fMRI response patterns to natural sounds. Our results show that the cortical encoding of natural sounds entails the formation of multiple representations of sound spectrograms with different degrees of spectral and temporal resolution. The cortex derives these multi-resolution representations through frequency-specific neural processing channels and through the combined analysis of the spectral and temporal modulations in the spectrogram. Furthermore, our findings suggest that a spectral-temporal resolution trade-off may govern the modulation tuning of neuronal populations throughout the auditory cortex. Specifically, our fMRI results suggest that neuronal populations in posterior/dorsal auditory regions preferably encode coarse spectral information with high temporal precision. Vice-versa, neuronal populations in anterior/ventral auditory regions preferably encode fine-grained spectral information with low temporal precision. We propose that such a multi-resolution analysis may be crucially relevant for flexible and behaviorally-relevant sound processing and may constitute one of the computational underpinnings of functional specialization in auditory cortex. PMID:24391486
Estimating the Intended Sound Direction of the User: Toward an Auditory Brain-Computer Interface Using Out-of-Head Sound Localization

PubMed Central

Nambu, Isao; Ebisawa, Masashi; Kogure, Masumi; Yano, Shohei; Hokari, Haruhide; Wada, Yasuhiro

2013-01-01

The auditory Brain-Computer Interface (BCI) using electroencephalograms (EEG) is a subject of intensive study. As a cue, auditory BCIs can deal with many of the characteristics of stimuli such as tone, pitch, and voices. Spatial information on auditory stimuli also provides useful information for a BCI. However, in a portable system, virtual auditory stimuli have to be presented spatially through earphones or headphones, instead of loudspeakers. We investigated the possibility of an auditory BCI using the out-of-head sound localization technique, which enables us to present virtual auditory stimuli to users from any direction, through earphones. The feasibility of a BCI using this technique was evaluated in an EEG oddball experiment and offline analysis. A virtual auditory stimulus was presented to the subject from one of six directions. Using a support vector machine, we were able to classify whether the subject attended the direction of a presented stimulus from EEG signals. The mean accuracy across subjects was 70.0% in the single-trial classification. When we used trial-averaged EEG signals as inputs to the classifier, the mean accuracy across seven subjects reached 89.5% (for 10-trial averaging). Further analysis showed that the P300 event-related potential responses from 200 to 500 ms in central and posterior regions of the brain contributed to the classification. In comparison with the results obtained from a loudspeaker experiment, we confirmed that stimulus presentation by out-of-head sound localization achieved similar event-related potential responses and classification performances. These results suggest that out-of-head sound localization enables us to provide a high-performance and loudspeaker-less portable BCI system. PMID:23437338

Sound localization by echolocating bats

NASA Astrophysics Data System (ADS)

Aytekin, Murat

Echolocating bats emit ultrasonic vocalizations and listen to echoes reflected back from objects in the path of the sound beam to build a spatial representation of their surroundings. Important to understanding the representation of space through echolocation are detailed studies of the cues used for localization, the sonar emission patterns and how this information is assembled. This thesis includes three studies, one on the directional properties of the sonar receiver, one on the directional properties of the sonar transmitter, and a model that demonstrates the role of action in building a representation of auditory space. The general importance of this work to a broader understanding of spatial localization is discussed. Investigations of the directional properties of the sonar receiver reveal that interaural level difference and monaural spectral notch cues are both dependent on sound source azimuth and elevation. This redundancy allows flexibility that an echolocating bat may need when coping with complex computational demands for sound localization. Using a novel method to measure bat sonar emission patterns from freely behaving bats, I show that the sonar beam shape varies between vocalizations. Consequently, the auditory system of a bat may need to adapt its computations to accurately localize objects using changing acoustic inputs. Extra-auditory signals that carry information about pinna position and beam shape are required for auditory localization of sound sources. The auditory system must learn associations between extra-auditory signals and acoustic spatial cues. Furthermore, the auditory system must adapt to changes in acoustic input that occur with changes in pinna position and vocalization parameters. These demands on the nervous system suggest that sound localization is achieved through the interaction of behavioral control and acoustic inputs. A sensorimotor model demonstrates how an organism can learn space through auditory-motor contingencies. The model also reveals how different aspects of sound localization, such as experience-dependent acquisition, adaptation, and extra-auditory influences, can be brought together under a comprehensive framework. This thesis presents a foundation for understanding the representation of auditory space that builds upon acoustic cues, motor control, and learning dynamic associations between action and auditory inputs.
Towards a Computational Comparative Neuroprimatology: Framing the language-ready brain.

PubMed

Arbib, Michael A

2016-03-01

We make the case for developing a Computational Comparative Neuroprimatology to inform the analysis of the function and evolution of the human brain. First, we update the mirror system hypothesis on the evolution of the language-ready brain by (i) modeling action and action recognition and opportunistic scheduling of macaque brains to hypothesize the nature of the last common ancestor of macaque and human (LCA-m); and then we (ii) introduce dynamic brain modeling to show how apes could acquire gesture through ontogenetic ritualization, hypothesizing the nature of evolution from LCA-m to the last common ancestor of chimpanzee and human (LCA-c). We then (iii) hypothesize the role of imitation, pantomime, protosign and protospeech in biological and cultural evolution from LCA-c to Homo sapiens with a language-ready brain. Second, we suggest how cultural evolution in Homo sapiens led from protolanguages to full languages with grammar and compositional semantics. Third, we assess the similarities and differences between the dorsal and ventral streams in audition and vision as the basis for presenting and comparing two models of language processing in the human brain: A model of (i) the auditory dorsal and ventral streams in sentence comprehension; and (ii) the visual dorsal and ventral streams in defining "what language is about" in both production and perception of utterances related to visual scenes provide the basis for (iii) a first step towards a synthesis and a look at challenges for further research. Copyright © 2015 Elsevier B.V. All rights reserved.
Towards a Computational Comparative Neuroprimatology: Framing the language-ready brain

NASA Astrophysics Data System (ADS)

Arbib, Michael A.

2016-03-01

We make the case for developing a Computational Comparative Neuroprimatology to inform the analysis of the function and evolution of the human brain. First, we update the mirror system hypothesis on the evolution of the language-ready brain by (i) modeling action and action recognition and opportunistic scheduling of macaque brains to hypothesize the nature of the last common ancestor of macaque and human (LCA-m); and then we (ii) introduce dynamic brain modeling to show how apes could acquire gesture through ontogenetic ritualization, hypothesizing the nature of evolution from LCA-m to the last common ancestor of chimpanzee and human (LCA-c). We then (iii) hypothesize the role of imitation, pantomime, protosign and protospeech in biological and cultural evolution from LCA-c to Homo sapiens with a language-ready brain. Second, we suggest how cultural evolution in Homo sapiens led from protolanguages to full languages with grammar and compositional semantics. Third, we assess the similarities and differences between the dorsal and ventral streams in audition and vision as the basis for presenting and comparing two models of language processing in the human brain: A model of (i) the auditory dorsal and ventral streams in sentence comprehension; and (ii) the visual dorsal and ventral streams in defining ;what language is about; in both production and perception of utterances related to visual scenes provide the basis for (iii) a first step towards a synthesis and a look at challenges for further research.
External audio for IBM-compatible computers

NASA Technical Reports Server (NTRS)

Washburn, David A.

1992-01-01

Numerous applications benefit from the presentation of computer-generated auditory stimuli at points discontiguous with the computer itself. Modification of an IBM-compatible computer for use of an external speaker is relatively easy but not intuitive. This modification is briefly described.
The theory and measurement of noncoherent microwave scattering parameters. [for remote sensing of scenes via radar scatterometry

NASA Technical Reports Server (NTRS)

Claassen, J. P.; Fung, A. K.

1977-01-01

The radar equation for incoherent scenes is derived and scattering coefficients are introduced in a systematic way to account for the complete interaction between the incident wave and the random scene. Intensity (power) and correlation techniques similar to that for coherent targets are proposed to measure all the scattering parameters. The sensitivity of the intensity technique to various practical realizations of the antenna polarization requirements is evaluated by means of computer simulated measurements, conducted with a scattering characteristic similar to that of the sea. It was shown that for scenes satisfying reciprocity one must admit three new cross-correlation scattering coefficients in addition to the commonly measured autocorrelation coefficients.
Research on the generation of the background with sea and sky in infrared scene

NASA Astrophysics Data System (ADS)

Dong, Yan-zhi; Han, Yan-li; Lou, Shu-li

2008-03-01

It is important for scene generation to keep the texture of infrared images in simulation of anti-ship infrared imaging guidance. We studied the fractal method and applied it to the infrared scene generation. We adopted the method of horizontal-vertical (HV) partition to encode the original image. Basing on the properties of infrared image with sea-sky background, we took advantage of Local Iteration Function System (LIFS) to decrease the complexity of computation and enhance the processing rate. Some results were listed. The results show that the fractal method can keep the texture of infrared image better and can be used in the infrared scene generation widely in future.
An Adapting Auditory-motor Feedback Loop Can Contribute to Generating Vocal Repetition

PubMed Central

Brainard, Michael S.; Jin, Dezhe Z.

2015-01-01

Consecutive repetition of actions is common in behavioral sequences. Although integration of sensory feedback with internal motor programs is important for sequence generation, if and how feedback contributes to repetitive actions is poorly understood. Here we study how auditory feedback contributes to generating repetitive syllable sequences in songbirds. We propose that auditory signals provide positive feedback to ongoing motor commands, but this influence decays as feedback weakens from response adaptation during syllable repetitions. Computational models show that this mechanism explains repeat distributions observed in Bengalese finch song. We experimentally confirmed two predictions of this mechanism in Bengalese finches: removal of auditory feedback by deafening reduces syllable repetitions; and neural responses to auditory playback of repeated syllable sequences gradually adapt in sensory-motor nucleus HVC. Together, our results implicate a positive auditory-feedback loop with adaptation in generating repetitive vocalizations, and suggest sensory adaptation is important for feedback control of motor sequences. PMID:26448054
Computed tomography imaging spectrometer (CTIS) with 2D reflective grating for ultraviolet to long-wave infrared detection especially useful for surveying transient events

NASA Technical Reports Server (NTRS)

Muller, Richard E. (Inventor); Mouroulis, Pantazis Z. (Inventor); Maker, Paul D. (Inventor); Wilson, Daniel W. (Inventor)

2003-01-01

The optical system of this invention is an unique type of imaging spectrometer, i.e. an instrument that can determine the spectra of all points in a two-dimensional scene. The general type of imaging spectrometer under which this invention falls has been termed a computed-tomography imaging spectrometer (CTIS). CTIS's have the ability to perform spectral imaging of scenes containing rapidly moving objects or evolving features, hereafter referred to as transient scenes. This invention, a reflective CTIS with an unique two-dimensional reflective grating, can operate in any wavelength band from the ultraviolet through long-wave infrared. Although this spectrometer is especially useful for rapidly occurring events it is also useful for investigation of some slow moving phenomena as in the life sciences.
Ventral-stream-like shape representation: from pixel intensity values to trainable object-selective COSFIRE models

PubMed Central

Azzopardi, George; Petkov, Nicolai

2014-01-01

The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses) and use it to localize and recognize objects of interests embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background which on its turn requires recognition. An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work. We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot. S-COSFIRE filters are effective to recognize and localize (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms. PMID:25126068
When viewing natural scenes, do abnormal colors impact on spatial or temporal parameters of eye movements?

PubMed

Ho-Phuoc, Tien; Guyader, Nathalie; Landragin, Frédéric; Guérin-Dugué, Anne

2012-02-03

Since Treisman's theory, it has been generally accepted that color is an elementary feature that guides eye movements when looking at natural scenes. Hence, most computational models of visual attention predict eye movements using color as an important visual feature. In this paper, using experimental data, we show that color does not affect where observers look when viewing natural scene images. Neither colors nor abnormal colors modify observers' fixation locations when compared to the same scenes in grayscale. In the same way, we did not find any significant difference between the scanpaths under grayscale, color, or abnormal color viewing conditions. However, we observed a decrease in fixation duration for color and abnormal color, and this was particularly true at the beginning of scene exploration. Finally, we found that abnormal color modifies saccade amplitude distribution.
Gaussian Radial Basis Function for Efficient Computation of Forest Indirect Illumination

NASA Astrophysics Data System (ADS)

Abbas, Fayçal; Babahenini, Mohamed Chaouki

2018-06-01

Global illumination of natural scenes in real time like forests is one of the most complex problems to solve, because the multiple inter-reflections between the light and material of the objects composing the scene. The major problem that arises is the problem of visibility computation. In fact, the computing of visibility is carried out for all the set of leaves visible from the center of a given leaf, given the enormous number of leaves present in a tree, this computation performed for each leaf of the tree which also reduces performance. We describe a new approach that approximates visibility queries, which precede in two steps. The first step is to generate point cloud representing the foliage. We assume that the point cloud is composed of two classes (visible, not-visible) non-linearly separable. The second step is to perform a point cloud classification by applying the Gaussian radial basis function, which measures the similarity in term of distance between each leaf and a landmark leaf. It allows approximating the visibility requests to extract the leaves that will be used to calculate the amount of indirect illumination exchanged between neighbor leaves. Our approach allows efficiently treat the light exchanges in the scene of a forest, it allows a fast computation and produces images of good visual quality, all this takes advantage of the immense power of computation of the GPU.
Virtual reality and 3D animation in forensic visualization.

PubMed

Ma, Minhua; Zheng, Huiru; Lallie, Harjinder

2010-09-01

Computer-generated three-dimensional (3D) animation is an ideal media to accurately visualize crime or accident scenes to the viewers and in the courtrooms. Based upon factual data, forensic animations can reproduce the scene and demonstrate the activity at various points in time. The use of computer animation techniques to reconstruct crime scenes is beginning to replace the traditional illustrations, photographs, and verbal descriptions, and is becoming popular in today's forensics. This article integrates work in the areas of 3D graphics, computer vision, motion tracking, natural language processing, and forensic computing, to investigate the state-of-the-art in forensic visualization. It identifies and reviews areas where new applications of 3D digital technologies and artificial intelligence could be used to enhance particular phases of forensic visualization to create 3D models and animations automatically and quickly. Having discussed the relationships between major crime types and level-of-detail in corresponding forensic animations, we recognized that high level-of-detail animation involving human characters, which is appropriate for many major crime types but has had limited use in courtrooms, could be useful for crime investigation. © 2010 American Academy of Forensic Sciences.
Human-Avatar Symbiosis for the Treatment of Auditory Verbal Hallucinations in Schizophrenia through Virtual/Augmented Reality and Brain-Computer Interfaces

PubMed Central

Fernández-Caballero, Antonio; Navarro, Elena; Fernández-Sotos, Patricia; González, Pascual; Ricarte, Jorge J.; Latorre, José M.; Rodriguez-Jimenez, Roberto

2017-01-01

This perspective paper faces the future of alternative treatments that take advantage of a social and cognitive approach with regards to pharmacological therapy of auditory verbal hallucinations (AVH) in patients with schizophrenia. AVH are the perception of voices in the absence of auditory stimulation and represents a severe mental health symptom. Virtual/augmented reality (VR/AR) and brain computer interfaces (BCI) are technologies that are growing more and more in different medical and psychological applications. Our position is that their combined use in computer-based therapies offers still unforeseen possibilities for the treatment of physical and mental disabilities. This is why, the paper expects that researchers and clinicians undergo a pathway toward human-avatar symbiosis for AVH by taking full advantage of new technologies. This outlook supposes to address challenging issues in the understanding of non-pharmacological treatment of schizophrenia-related disorders and the exploitation of VR/AR and BCI to achieve a real human-avatar symbiosis. PMID:29209193
Human-Avatar Symbiosis for the Treatment of Auditory Verbal Hallucinations in Schizophrenia through Virtual/Augmented Reality and Brain-Computer Interfaces.

PubMed

Fernández-Caballero, Antonio; Navarro, Elena; Fernández-Sotos, Patricia; González, Pascual; Ricarte, Jorge J; Latorre, José M; Rodriguez-Jimenez, Roberto

2017-01-01

This perspective paper faces the future of alternative treatments that take advantage of a social and cognitive approach with regards to pharmacological therapy of auditory verbal hallucinations (AVH) in patients with schizophrenia. AVH are the perception of voices in the absence of auditory stimulation and represents a severe mental health symptom. Virtual/augmented reality (VR/AR) and brain computer interfaces (BCI) are technologies that are growing more and more in different medical and psychological applications. Our position is that their combined use in computer-based therapies offers still unforeseen possibilities for the treatment of physical and mental disabilities. This is why, the paper expects that researchers and clinicians undergo a pathway toward human-avatar symbiosis for AVH by taking full advantage of new technologies. This outlook supposes to address challenging issues in the understanding of non-pharmacological treatment of schizophrenia-related disorders and the exploitation of VR/AR and BCI to achieve a real human-avatar symbiosis.
Scene-Aware Adaptive Updating for Visual Tracking via Correlation Filters

PubMed Central

Zhang, Sirou; Qiao, Xiaoya

2017-01-01

In recent years, visual object tracking has been widely used in military guidance, human-computer interaction, road traffic, scene monitoring and many other fields. The tracking algorithms based on correlation filters have shown good performance in terms of accuracy and tracking speed. However, their performance is not satisfactory in scenes with scale variation, deformation, and occlusion. In this paper, we propose a scene-aware adaptive updating mechanism for visual tracking via a kernel correlation filter (KCF). First, a low complexity scale estimation method is presented, in which the corresponding weight in five scales is employed to determine the final target scale. Then, the adaptive updating mechanism is presented based on the scene-classification. We classify the video scenes as four categories by video content analysis. According to the target scene, we exploit the adaptive updating mechanism to update the kernel correlation filter to improve the robustness of the tracker, especially in scenes with scale variation, deformation, and occlusion. We evaluate our tracker on the CVPR2013 benchmark. The experimental results obtained with the proposed algorithm are improved by 33.3%, 15%, 6%, 21.9% and 19.8% compared to those of the KCF tracker on the scene with scale variation, partial or long-time large-area occlusion, deformation, fast motion and out-of-view. PMID:29140311
Illumination discrimination in real and simulated scenes

PubMed Central

Radonjić, Ana; Pearce, Bradley; Aston, Stacey; Krieger, Avery; Dubin, Hilary; Cottaris, Nicolas P.; Brainard, David H.; Hurlbert, Anya C.

2016-01-01

Characterizing humans' ability to discriminate changes in illumination provides information about the visual system's representation of the distal stimulus. We have previously shown that humans are able to discriminate illumination changes and that sensitivity to such changes depends on their chromatic direction. Probing illumination discrimination further would be facilitated by the use of computer-graphics simulations, which would, in practice, enable a wider range of stimulus manipulations. There is no a priori guarantee, however, that results obtained with simulated scenes generalize to real illuminated scenes. To investigate this question, we measured illumination discrimination in real and simulated scenes that were well-matched in mean chromaticity and scene geometry. Illumination discrimination thresholds were essentially identical for the two stimulus types. As in our previous work, these thresholds varied with illumination change direction. We exploited the flexibility offered by the use of graphics simulations to investigate whether the differences across direction are preserved when the surfaces in the scene are varied. We show that varying the scene's surface ensemble in a manner that also changes mean scene chromaticity modulates the relative sensitivity to illumination changes along different chromatic directions. Thus, any characterization of sensitivity to changes in illumination must be defined relative to the set of surfaces in the scene. PMID:28558392
Firearms in major motion pictures, 1995-2004.

PubMed

Binswanger, Ingrid A; Cowan, John A

2009-03-01

Firearms are a major cause of injury and death. We sought to determine (1) the prevalence of movie scenes that depicted firearms and verbal firearm safety messages; (2) the context and health outcomes in firearm scenes; and (3) the association between the Motion Picture Association of America ratings and firearm scene characteristics. Ten top revenue-grossing motion pictures were selected for each year from 1995 to 2004 in descending order of gross revenues. Data on firearm scenes were collected by movie coders using dual-monitor computer workstations and real-time collection tools. Seventy of the 100 movies had scenes with firearms and the majority of movies with firearms were rated PG-13. Firearm scenes (N = 624) accounted for 17% of screen time in movies with firearms. Among firearm scenes, crime or illegal activity was involved in 45%, deaths occurred in 19%, and injuries occurred in 12%. A verbal reference to safety was made in 0.8%. Depictions of firearms in top revenue-grossing movies were common, but safety messages were exceedingly rare. Major motion pictures present an under-used opportunity for education about firearm safety.
Toward a Neurophysiological Theory of Auditory Stream Segregation

ERIC Educational Resources Information Center

Snyder, Joel S.; Alain, Claude

2007-01-01

Auditory stream segregation (or streaming) is a phenomenon in which 2 or more repeating sounds differing in at least 1 acoustic attribute are perceived as 2 or more separate sound sources (i.e., streams). This article selectively reviews psychophysical and computational studies of streaming and comprehensively reviews more recent…
Smartphone-Based Escalator Recognition for the Visually Impaired

PubMed Central

Nakamura, Daiki; Takizawa, Hotaka; Aoyagi, Mayumi; Ezaki, Nobuo; Mizuno, Shinji

2017-01-01

It is difficult for visually impaired individuals to recognize escalators in everyday environments. If the individuals ride on escalators in the wrong direction, they will stumble on the steps. This paper proposes a novel method to assist visually impaired individuals in finding available escalators by the use of smartphone cameras. Escalators are recognized by analyzing optical flows in video frames captured by the cameras, and auditory feedback is provided to the individuals. The proposed method was implemented on an Android smartphone and applied to actual escalator scenes. The experimental results demonstrate that the proposed method is promising for helping visually impaired individuals use escalators. PMID:28481270
Urban area change detection procedures with remote sensing data

NASA Technical Reports Server (NTRS)

Maxwell, E. L. (Principal Investigator); Riordan, C. J.

1980-01-01

The underlying factors affecting the detection and identification of nonurban to urban land cover change using satellite data were studied. Computer programs were developed to create a digital scene and to simulate the effect of the sensor point spread function (PSF) on the transfer of modulation from the scene to an image of the scene. The theory behind the development of a digital filter representing the PSF is given as well as an example of its application. Atmospheric effects on modulation transfer are also discussed. A user's guide and program listings are given.

The importance of individual frequencies of endogenous brain oscillations for auditory cognition - A short review.

PubMed

Baltus, Alina; Herrmann, Christoph Siegfried

2016-06-01

Oscillatory EEG activity in the human brain with frequencies in the gamma range (approx. 30-80Hz) is known to be relevant for a large number of cognitive processes. Interestingly, each subject reveals an individual frequency of the auditory gamma-band response (GBR) that coincides with the peak in the auditory steady state response (ASSR). A common resonance frequency of auditory cortex seems to underlie both the individual frequency of the GBR and the peak of the ASSR. This review sheds light on the functional role of oscillatory gamma activity for auditory processing. For successful processing, the auditory system has to track changes in auditory input over time and store information about past events in memory which allows the construction of auditory objects. Recent findings support the idea of gamma oscillations being involved in the partitioning of auditory input into discrete samples to facilitate higher order processing. We review experiments that seem to suggest that inter-individual differences in the resonance frequency are behaviorally relevant for gap detection and speech processing. A possible application of these resonance frequencies for brain computer interfaces is illustrated with regard to optimized individual presentation rates for auditory input to correspond with endogenous oscillatory activity. This article is part of a Special Issue entitled SI: Auditory working memory. Copyright © 2015 Elsevier B.V. All rights reserved.
Template construction grammar: from visual scene description to language comprehension and agrammatism.

PubMed

Barrès, Victor; Lee, Jinyong

2014-01-01

How does the language system coordinate with our visual system to yield flexible integration of linguistic, perceptual, and world-knowledge information when we communicate about the world we perceive? Schema theory is a computational framework that allows the simulation of perceptuo-motor coordination programs on the basis of known brain operating principles such as cooperative computation and distributed processing. We present first its application to a model of language production, SemRep/TCG, which combines a semantic representation of visual scenes (SemRep) with Template Construction Grammar (TCG) as a means to generate verbal descriptions of a scene from its associated SemRep graph. SemRep/TCG combines the neurocomputational framework of schema theory with the representational format of construction grammar in a model linking eye-tracking data to visual scene descriptions. We then offer a conceptual extension of TCG to include language comprehension and address data on the role of both world knowledge and grammatical semantics in the comprehension performances of agrammatic aphasic patients. This extension introduces a distinction between heavy and light semantics. The TCG model of language comprehension offers a computational framework to quantitatively analyze the distributed dynamics of language processes, focusing on the interactions between grammatical, world knowledge, and visual information. In particular, it reveals interesting implications for the understanding of the various patterns of comprehension performances of agrammatic aphasics measured using sentence-picture matching tasks. This new step in the life cycle of the model serves as a basis for exploring the specific challenges that neurolinguistic computational modeling poses to the neuroinformatics community.
Object tracking mask-based NLUT on GPUs for real-time generation of holographic videos of three-dimensional scenes.

PubMed

Kwon, M-W; Kim, S-C; Yoon, S-E; Ho, Y-S; Kim, E-S

2015-02-09

A new object tracking mask-based novel-look-up-table (OTM-NLUT) method is proposed and implemented on graphics-processing-units (GPUs) for real-time generation of holographic videos of three-dimensional (3-D) scenes. Since the proposed method is designed to be matched with software and memory structures of the GPU, the number of compute-unified-device-architecture (CUDA) kernel function calls and the computer-generated hologram (CGH) buffer size of the proposed method have been significantly reduced. It therefore results in a great increase of the computational speed of the proposed method and enables real-time generation of CGH patterns of 3-D scenes. Experimental results show that the proposed method can generate 31.1 frames of Fresnel CGH patterns with 1,920 × 1,080 pixels per second, on average, for three test 3-D video scenarios with 12,666 object points on three GPU boards of NVIDIA GTX TITAN, and confirm the feasibility of the proposed method in the practical application of electro-holographic 3-D displays.
Three-directional motion-compensation mask-based novel look-up table on graphics processing units for video-rate generation of digital holographic videos of three-dimensional scenes.

PubMed

Kwon, Min-Woo; Kim, Seung-Cheol; Kim, Eun-Soo

2016-01-20

A three-directional motion-compensation mask-based novel look-up table method is proposed and implemented on graphics processing units (GPUs) for video-rate generation of digital holographic videos of three-dimensional (3D) scenes. Since the proposed method is designed to be well matched with the software and memory structures of GPUs, the number of compute-unified-device-architecture kernel function calls can be significantly reduced. This results in a great increase of the computational speed of the proposed method, allowing video-rate generation of the computer-generated hologram (CGH) patterns of 3D scenes. Experimental results reveal that the proposed method can generate 39.8 frames of Fresnel CGH patterns with 1920×1080 pixels per second for the test 3D video scenario with 12,088 object points on dual GPU boards of NVIDIA GTX TITANs, and they confirm the feasibility of the proposed method in the practical application fields of electroholographic 3D displays.
Short-term Power Load Forecasting Based on Balanced KNN

NASA Astrophysics Data System (ADS)

Lv, Xianlong; Cheng, Xingong; YanShuang; Tang, Yan-mei

2018-03-01

To improve the accuracy of load forecasting, a short-term load forecasting model based on balanced KNN algorithm is proposed; According to the load characteristics, the historical data of massive power load are divided into scenes by the K-means algorithm; In view of unbalanced load scenes, the balanced KNN algorithm is proposed to classify the scene accurately; The local weighted linear regression algorithm is used to fitting and predict the load; Adopting the Apache Hadoop programming framework of cloud computing, the proposed algorithm model is parallelized and improved to enhance its ability of dealing with massive and high-dimension data. The analysis of the household electricity consumption data for a residential district is done by 23-nodes cloud computing cluster, and experimental results show that the load forecasting accuracy and execution time by the proposed model are the better than those of traditional forecasting algorithm.
Lifespan Differences in Nonlinear Dynamics during Rest and Auditory Oddball Performance

ERIC Educational Resources Information Center

Muller, Viktor; Lindenberger, Ulman

2012-01-01

Electroencephalographic recordings (EEG) were used to assess age-associated differences in nonlinear brain dynamics during both rest and auditory oddball performance in children aged 9.0-12.8 years, younger adults, and older adults. We computed nonlinear coupling dynamics and dimensional complexity, and also determined spectral alpha power as an…
Learning Pitch with STDP: A Computational Model of Place and Temporal Pitch Perception Using Spiking Neural Networks.

PubMed

Erfanian Saeedi, Nafise; Blamey, Peter J; Burkitt, Anthony N; Grayden, David B

2016-04-01

Pitch perception is important for understanding speech prosody, music perception, recognizing tones in tonal languages, and perceiving speech in noisy environments. The two principal pitch perception theories consider the place of maximum neural excitation along the auditory nerve and the temporal pattern of the auditory neurons' action potentials (spikes) as pitch cues. This paper describes a biophysical mechanism by which fine-structure temporal information can be extracted from the spikes generated at the auditory periphery. Deriving meaningful pitch-related information from spike times requires neural structures specialized in capturing synchronous or correlated activity from amongst neural events. The emergence of such pitch-processing neural mechanisms is described through a computational model of auditory processing. Simulation results show that a correlation-based, unsupervised, spike-based form of Hebbian learning can explain the development of neural structures required for recognizing the pitch of simple and complex tones, with or without the fundamental frequency. The temporal code is robust to variations in the spectral shape of the signal and thus can explain the phenomenon of pitch constancy.
Perceptual and academic patterns of learning-disabled/gifted students.

PubMed

Waldron, K A; Saphire, D G

1992-04-01

This research explored ways gifted children with learning disabilities perceive and recall auditory and visual input and apply this information to reading, mathematics, and spelling. 24 learning-disabled/gifted children and a matched control group of normally achieving gifted students were tested for oral reading, word recognition and analysis, listening comprehension, and spelling. In mathematics, they were tested for numeration, mental and written computation, word problems, and numerical reasoning. To explore perception and memory skills, students were administered formal tests of visual and auditory memory as well as auditory discrimination of sounds. Their responses to reading and to mathematical computations were further considered for evidence of problems in visual discrimination, visual sequencing, and visual spatial areas. Analyses indicated that these learning-disabled/gifted students were significantly weaker than controls in their decoding skills, in spelling, and in most areas of mathematics. They were also significantly weaker in auditory discrimination and memory, and in visual discrimination, sequencing, and spatial abilities. Conclusions are that these underlying perceptual and memory deficits may be related to students' academic problems.
Learning Pitch with STDP: A Computational Model of Place and Temporal Pitch Perception Using Spiking Neural Networks

PubMed Central

Erfanian Saeedi, Nafise; Blamey, Peter J.; Burkitt, Anthony N.; Grayden, David B.

2016-01-01

Pitch perception is important for understanding speech prosody, music perception, recognizing tones in tonal languages, and perceiving speech in noisy environments. The two principal pitch perception theories consider the place of maximum neural excitation along the auditory nerve and the temporal pattern of the auditory neurons’ action potentials (spikes) as pitch cues. This paper describes a biophysical mechanism by which fine-structure temporal information can be extracted from the spikes generated at the auditory periphery. Deriving meaningful pitch-related information from spike times requires neural structures specialized in capturing synchronous or correlated activity from amongst neural events. The emergence of such pitch-processing neural mechanisms is described through a computational model of auditory processing. Simulation results show that a correlation-based, unsupervised, spike-based form of Hebbian learning can explain the development of neural structures required for recognizing the pitch of simple and complex tones, with or without the fundamental frequency. The temporal code is robust to variations in the spectral shape of the signal and thus can explain the phenomenon of pitch constancy. PMID:27049657
Hierarchical Processing of Auditory Objects in Humans

PubMed Central

Kumar, Sukhbinder; Stephan, Klaas E; Warren, Jason D; Friston, Karl J; Griffiths, Timothy D

2007-01-01

This work examines the computational architecture used by the brain during the analysis of the spectral envelope of sounds, an important acoustic feature for defining auditory objects. Dynamic causal modelling and Bayesian model selection were used to evaluate a family of 16 network models explaining functional magnetic resonance imaging responses in the right temporal lobe during spectral envelope analysis. The models encode different hypotheses about the effective connectivity between Heschl's Gyrus (HG), containing the primary auditory cortex, planum temporale (PT), and superior temporal sulcus (STS), and the modulation of that coupling during spectral envelope analysis. In particular, we aimed to determine whether information processing during spectral envelope analysis takes place in a serial or parallel fashion. The analysis provides strong support for a serial architecture with connections from HG to PT and from PT to STS and an increase of the HG to PT connection during spectral envelope analysis. The work supports a computational model of auditory object processing, based on the abstraction of spectro-temporal “templates” in the PT before further analysis of the abstracted form in anterior temporal lobe areas. PMID:17542641
The Effect of Cumulus Cloud Field Anisotropy on Domain-Averaged Solar Fluxes and Atmospheric Heating Rates

NASA Technical Reports Server (NTRS)

Hinkelman, Laura M.; Evans, K. Franklin; Clothiaux, Eugene E.; Ackerman, Thomas P.; Stackhouse, Paul W., Jr.

2006-01-01

Cumulus clouds can become tilted or elongated in the presence of wind shear. Nevertheless, most studies of the interaction of cumulus clouds and radiation have assumed these clouds to be isotropic. This paper describes an investigation of the effect of fair-weather cumulus cloud field anisotropy on domain-averaged solar fluxes and atmospheric heating rate profiles. A stochastic field generation algorithm was used to produce twenty three-dimensional liquid water content fields based on the statistical properties of cloud scenes from a large eddy simulation. Progressively greater degrees of x-z plane tilting and horizontal stretching were imposed on each of these scenes, so that an ensemble of scenes was produced for each level of distortion. The resulting scenes were used as input to a three-dimensional Monte Carlo radiative transfer model. Domain-average transmission, reflection, and absorption of broadband solar radiation were computed for each scene along with the average heating rate profile. Both tilt and horizontal stretching were found to significantly affect calculated fluxes, with the amount and sign of flux differences depending strongly on sun position relative to cloud distortion geometry. The mechanisms by which anisotropy interacts with solar fluxes were investigated by comparisons to independent pixel approximation and tilted independent pixel approximation computations for the same scenes. Cumulus anisotropy was found to most strongly impact solar radiative transfer by changing the effective cloud fraction, i.e., the cloud fraction when the field is projected on a surface perpendicular to the direction of the incident solar beam.
Open and scalable analytics of large Earth observation datasets: From scenes to multidimensional arrays using SciDB and GDAL

NASA Astrophysics Data System (ADS)

Appel, Marius; Lahn, Florian; Buytaert, Wouter; Pebesma, Edzer

2018-04-01

Earth observation (EO) datasets are commonly provided as collection of scenes, where individual scenes represent a temporal snapshot and cover a particular region on the Earth's surface. Using these data in complex spatiotemporal modeling becomes difficult as soon as data volumes exceed a certain capacity or analyses include many scenes, which may spatially overlap and may have been recorded at different dates. In order to facilitate analytics on large EO datasets, we combine and extend the geospatial data abstraction library (GDAL) and the array-based data management and analytics system SciDB. We present an approach to automatically convert collections of scenes to multidimensional arrays and use SciDB to scale computationally intensive analytics. We evaluate the approach in three study cases on national scale land use change monitoring with Landsat imagery, global empirical orthogonal function analysis of daily precipitation, and combining historical climate model projections with satellite-based observations. Results indicate that the approach can be used to represent various EO datasets and that analyses in SciDB scale well with available computational resources. To simplify analyses of higher-dimensional datasets as from climate model output, however, a generalization of the GDAL data model might be needed. All parts of this work have been implemented as open-source software and we discuss how this may facilitate open and reproducible EO analyses.
Auditory-motor interactions in pediatric motor speech disorders: neurocomputational modeling of disordered development.

PubMed

Terband, H; Maassen, B; Guenther, F H; Brumberg, J

2014-01-01

Differentiating the symptom complex due to phonological-level disorders, speech delay and pediatric motor speech disorders is a controversial issue in the field of pediatric speech and language pathology. The present study investigated the developmental interaction between neurological deficits in auditory and motor processes using computational modeling with the DIVA model. In a series of computer simulations, we investigated the effect of a motor processing deficit alone (MPD), and the effect of a motor processing deficit in combination with an auditory processing deficit (MPD+APD) on the trajectory and endpoint of speech motor development in the DIVA model. Simulation results showed that a motor programming deficit predominantly leads to deterioration on the phonological level (phonemic mappings) when auditory self-monitoring is intact, and on the systemic level (systemic mapping) if auditory self-monitoring is impaired. These findings suggest a close relation between quality of auditory self-monitoring and the involvement of phonological vs. motor processes in children with pediatric motor speech disorders. It is suggested that MPD+APD might be involved in typically apraxic speech output disorders and MPD in pediatric motor speech disorders that also have a phonological component. Possibilities to verify these hypotheses using empirical data collected from human subjects are discussed. The reader will be able to: (1) identify the difficulties in studying disordered speech motor development; (2) describe the differences in speech motor characteristics between SSD and subtype CAS; (3) describe the different types of learning that occur in the sensory-motor system during babbling and early speech acquisition; (4) identify the neural control subsystems involved in speech production; (5) describe the potential role of auditory self-monitoring in developmental speech disorders. Copyright © 2014 Elsevier Inc. All rights reserved.
Abnormal Auditory Gain in Hyperacusis: Investigation with a Computational Model

PubMed Central

Diehl, Peter U.; Schaette, Roland

2015-01-01

Hyperacusis is a frequent auditory disorder that is characterized by abnormal loudness perception where sounds of relatively normal volume are perceived as too loud or even painfully loud. As hyperacusis patients show decreased loudness discomfort levels (LDLs) and steeper loudness growth functions, it has been hypothesized that hyperacusis might be caused by an increase in neuronal response gain in the auditory system. Moreover, since about 85% of hyperacusis patients also experience tinnitus, the conditions might be caused by a common mechanism. However, the mechanisms that give rise to hyperacusis have remained unclear. Here, we have used a computational model of the auditory system to investigate candidate mechanisms for hyperacusis. Assuming that perceived loudness is proportional to the summed activity of all auditory nerve (AN) fibers, the model was tuned to reproduce normal loudness perception. We then evaluated a variety of potential hyperacusis gain mechanisms by determining their effects on model equal-loudness contours and comparing the results to the LDLs of hyperacusis patients with normal hearing thresholds. Hyperacusis was best accounted for by an increase in non-linear gain in the central auditory system. Good fits to the average patient LDLs were obtained for a general increase in gain that affected all frequency channels to the same degree, and also for a frequency-specific gain increase in the high-frequency range. Moreover, the gain needed to be applied after subtraction of spontaneous activity of the AN, which is in contrast to current theories of tinnitus generation based on amplification of spontaneous activity. Hyperacusis and tinnitus might therefore be caused by different changes in neuronal processing in the central auditory system. PMID:26236277
Perceptual Learning and Auditory Training in Cochlear Implant Recipients

PubMed Central

Fu, Qian-Jie; Galvin, John J.

2007-01-01

Learning electrically stimulated speech patterns can be a new and difficult experience for cochlear implant (CI) recipients. Recent studies have shown that most implant recipients at least partially adapt to these new patterns via passive, daily-listening experiences. Gradually introducing a speech processor parameter (eg, the degree of spectral mismatch) may provide for more complete and less stressful adaptation. Although the implant device restores hearing sensation and the continued use of the implant provides some degree of adaptation, active auditory rehabilitation may be necessary to maximize the benefit of implantation for CI recipients. Currently, there are scant resources for auditory rehabilitation for adult, postlingually deafened CI recipients. We recently developed a computer-assisted speech-training program to provide the means to conduct auditory rehabilitation at home. The training software targets important acoustic contrasts among speech stimuli, provides auditory and visual feedback, and incorporates progressive training techniques, thereby maintaining recipients’ interest during the auditory training exercises. Our recent studies demonstrate the effectiveness of targeted auditory training in improving CI recipients’ speech and music perception. Provided with an inexpensive and effective auditory training program, CI recipients may find the motivation and momentum to get the most from the implant device. PMID:17709574
Communication and control by listening: toward optimal design of a two-class auditory streaming brain-computer interface.

PubMed

Hill, N Jeremy; Moinuddin, Aisha; Häuser, Ann-Katrin; Kienzle, Stephan; Schalk, Gerwin

2012-01-01

Most brain-computer interface (BCI) systems require users to modulate brain signals in response to visual stimuli. Thus, they may not be useful to people with limited vision, such as those with severe paralysis. One important approach for overcoming this issue is auditory streaming, an approach whereby a BCI system is driven by shifts of attention between two simultaneously presented auditory stimulus streams. Motivated by the long-term goal of translating such a system into a reliable, simple yes-no interface for clinical usage, we aim to answer two main questions. First, we asked which of two previously published variants provides superior performance: a fixed-phase (FP) design in which the streams have equal period and opposite phase, or a drifting-phase (DP) design where the periods are unequal. We found FP to be superior to DP (p = 0.002): average performance levels were 80 and 72% correct, respectively. We were also able to show, in a pilot with one subject, that auditory streaming can support continuous control and neurofeedback applications: by shifting attention between ongoing left and right auditory streams, the subject was able to control the position of a paddle in a computer game. Second, we examined whether the system is dependent on eye movements, since it is known that eye movements and auditory attention may influence each other, and any dependence on the ability to move one's eyes would be a barrier to translation to paralyzed users. We discovered that, despite instructions, some subjects did make eye movements that were indicative of the direction of attention. However, there was no correlation, across subjects, between the reliability of the eye movement signal and the reliability of the BCI system, indicating that our system was configured to work independently of eye movement. Together, these findings are an encouraging step forward toward BCIs that provide practical communication and control options for the most severely paralyzed users.
The visual light field in real scenes

PubMed Central

Xia, Ling; Pont, Sylvia C.; Heynderickx, Ingrid

2014-01-01

Human observers' ability to infer the light field in empty space is known as the “visual light field.” While most relevant studies were performed using images on computer screens, we investigate the visual light field in a real scene by using a novel experimental setup. A “probe” and a scene were mixed optically using a semitransparent mirror. Twenty participants were asked to judge whether the probe fitted the scene with regard to the illumination intensity, direction, and diffuseness. Both smooth and rough probes were used to test whether observers use the additional cues for the illumination direction and diffuseness provided by the 3D texture over the rough probe. The results confirmed that observers are sensitive to the intensity, direction, and diffuseness of the illumination also in real scenes. For some lighting combinations on scene and probe, the awareness of a mismatch between the probe and scene was found to depend on which lighting condition was on the scene and which on the probe, which we called the “swap effect.” For these cases, the observers judged the fit to be better if the average luminance of the visible parts of the probe was closer to the average luminance of the visible parts of the scene objects. The use of a rough instead of smooth probe was found to significantly improve observers' abilities to detect mismatches in lighting diffuseness and directions. PMID:25926970
Is sensorimotor BCI performance influenced differently by mono, stereo, or 3-D auditory feedback?

PubMed

McCreadie, Karl A; Coyle, Damien H; Prasad, Girijesh

2014-05-01

Imagination of movement can be used as a control method for a brain-computer interface (BCI) allowing communication for the physically impaired. Visual feedback within such a closed loop system excludes those with visual problems and hence there is a need for alternative sensory feedback pathways. In the context of substituting the visual channel for the auditory channel, this study aims to add to the limited evidence that it is possible to substitute visual feedback for its auditory equivalent and assess the impact this has on BCI performance. Secondly, the study aims to determine for the first time if the type of auditory feedback method influences motor imagery performance significantly. Auditory feedback is presented using a stepped approach of single (mono), double (stereo), and multiple (vector base amplitude panning as an audio game) loudspeaker arrangements. Visual feedback involves a ball-basket paradigm and a spaceship game. Each session consists of either auditory or visual feedback only with runs of each type of feedback presentation method applied in each session. Results from seven subjects across five sessions of each feedback type (visual, auditory) (10 sessions in total) show that auditory feedback is a suitable substitute for the visual equivalent and that there are no statistical differences in the type of auditory feedback presented across five sessions.
Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach.

PubMed

Liu, Mengyun; Chen, Ruizhi; Li, Deren; Chen, Yujin; Guo, Guangyi; Cao, Zhipeng; Pan, Yuanjin

2017-12-08

After decades of research, there is still no solution for indoor localization like the GNSS (Global Navigation Satellite System) solution for outdoor environments. The major reasons for this phenomenon are the complex spatial topology and RF transmission environment. To deal with these problems, an indoor scene constrained method for localization is proposed in this paper, which is inspired by the visual cognition ability of the human brain and the progress in the computer vision field regarding high-level image understanding. Furthermore, a multi-sensor fusion method is implemented on a commercial smartphone including cameras, WiFi and inertial sensors. Compared to former research, the camera on a smartphone is used to "see" which scene the user is in. With this information, a particle filter algorithm constrained by scene information is adopted to determine the final location. For indoor scene recognition, we take advantage of deep learning that has been proven to be highly effective in the computer vision community. For particle filter, both WiFi and magnetic field signals are used to update the weights of particles. Similar to other fingerprinting localization methods, there are two stages in the proposed system, offline training and online localization. In the offline stage, an indoor scene model is trained by Caffe (one of the most popular open source frameworks for deep learning) and a fingerprint database is constructed by user trajectories in different scenes. To reduce the volume requirement of training data for deep learning, a fine-tuned method is adopted for model training. In the online stage, a camera in a smartphone is used to recognize the initial scene. Then a particle filter algorithm is used to fuse the sensor data and determine the final location. To prove the effectiveness of the proposed method, an Android client and a web server are implemented. The Android client is used to collect data and locate a user. The web server is developed for indoor scene model training and communication with an Android client. To evaluate the performance, comparison experiments are conducted and the results demonstrate that a positioning accuracy of 1.32 m at 95% is achievable with the proposed solution. Both positioning accuracy and robustness are enhanced compared to approaches without scene constraint including commercial products such as IndoorAtlas.
Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach

PubMed Central

Chen, Ruizhi; Li, Deren; Chen, Yujin; Guo, Guangyi; Cao, Zhipeng

2017-01-01

After decades of research, there is still no solution for indoor localization like the GNSS (Global Navigation Satellite System) solution for outdoor environments. The major reasons for this phenomenon are the complex spatial topology and RF transmission environment. To deal with these problems, an indoor scene constrained method for localization is proposed in this paper, which is inspired by the visual cognition ability of the human brain and the progress in the computer vision field regarding high-level image understanding. Furthermore, a multi-sensor fusion method is implemented on a commercial smartphone including cameras, WiFi and inertial sensors. Compared to former research, the camera on a smartphone is used to “see” which scene the user is in. With this information, a particle filter algorithm constrained by scene information is adopted to determine the final location. For indoor scene recognition, we take advantage of deep learning that has been proven to be highly effective in the computer vision community. For particle filter, both WiFi and magnetic field signals are used to update the weights of particles. Similar to other fingerprinting localization methods, there are two stages in the proposed system, offline training and online localization. In the offline stage, an indoor scene model is trained by Caffe (one of the most popular open source frameworks for deep learning) and a fingerprint database is constructed by user trajectories in different scenes. To reduce the volume requirement of training data for deep learning, a fine-tuned method is adopted for model training. In the online stage, a camera in a smartphone is used to recognize the initial scene. Then a particle filter algorithm is used to fuse the sensor data and determine the final location. To prove the effectiveness of the proposed method, an Android client and a web server are implemented. The Android client is used to collect data and locate a user. The web server is developed for indoor scene model training and communication with an Android client. To evaluate the performance, comparison experiments are conducted and the results demonstrate that a positioning accuracy of 1.32 m at 95% is achievable with the proposed solution. Both positioning accuracy and robustness are enhanced compared to approaches without scene constraint including commercial products such as IndoorAtlas. PMID:29292761

A Vision-Based Wayfinding System for Visually Impaired People Using Situation Awareness and Activity-Based Instructions

PubMed Central

Kim, Eun Yi

2017-01-01

A significant challenge faced by visually impaired people is ‘wayfinding’, which is the ability to find one’s way to a destination in an unfamiliar environment. This study develops a novel wayfinding system for smartphones that can automatically recognize the situation and scene objects in real time. Through analyzing streaming images, the proposed system first classifies the current situation of a user in terms of their location. Next, based on the current situation, only the necessary context objects are found and interpreted using computer vision techniques. It estimates the motions of the user with two inertial sensors and records the trajectories of the user toward the destination, which are also used as a guide for the return route after reaching the destination. To efficiently convey the recognized results using an auditory interface, activity-based instructions are generated that guide the user in a series of movements along a route. To assess the effectiveness of the proposed system, experiments were conducted in several indoor environments: the sit in which the situation awareness accuracy was 90% and the object detection false alarm rate was 0.016. In addition, our field test results demonstrate that users can locate their paths with an accuracy of 97%. PMID:28813033
Presentation of 3D Scenes Through Video Example.

PubMed

Baldacci, Andrea; Ganovelli, Fabio; Corsini, Massimiliano; Scopigno, Roberto

2017-09-01

Using synthetic videos to present a 3D scene is a common requirement for architects, designers, engineers or Cultural Heritage professionals however it is usually time consuming and, in order to obtain high quality results, the support of a film maker/computer animation expert is necessary. We introduce an alternative approach that takes the 3D scene of interest and an example video as input, and automatically produces a video of the input scene that resembles the given video example. In other words, our algorithm allows the user to "replicate" an existing video, on a different 3D scene. We build on the intuition that a video sequence of a static environment is strongly characterized by its optical flow, or, in other words, that two videos are similar if their optical flows are similar. We therefore recast the problem as producing a video of the input scene whose optical flow is similar to the optical flow of the input video. Our intuition is supported by a user-study specifically designed to verify this statement. We have successfully tested our approach on several scenes and input videos, some of which are reported in the accompanying material of this paper.
Integrated framework for developing search and discrimination metrics

NASA Astrophysics Data System (ADS)

Copeland, Anthony C.; Trivedi, Mohan M.

1997-06-01

This paper presents an experimental framework for evaluating target signature metrics as models of human visual search and discrimination. This framework is based on a prototype eye tracking testbed, the Integrated Testbed for Eye Movement Studies (ITEMS). ITEMS determines an observer's visual fixation point while he studies a displayed image scene, by processing video of the observer's eye. The utility of this framework is illustrated with an experiment using gray-scale images of outdoor scenes that contain randomly placed targets. Each target is a square region of a specific size containing pixel values from another image of an outdoor scene. The real-world analogy of this experiment is that of a military observer looking upon the sensed image of a static scene to find camouflaged enemy targets that are reported to be in the area. ITEMS provides the data necessary to compute various statistics for each target to describe how easily the observers located it, including the likelihood the target was fixated or identified and the time required to do so. The computed values of several target signature metrics are compared to these statistics, and a second-order metric based on a model of image texture was found to be the most highly correlated.
[Preliminary construction of three-dimensional visual educational system for clinical dentistry based on world wide web webpage].

PubMed

Hu, Jian; Xu, Xiang-yang; Song, En-min; Tan, Hong-bao; Wang, Yi-ning

2009-09-01

To establish a new visual educational system of virtual reality for clinical dentistry based on world wide web (WWW) webpage in order to provide more three-dimensional multimedia resources to dental students and an online three-dimensional consulting system for patients. Based on computer graphics and three-dimensional webpage technologies, the software of 3Dsmax and Webmax were adopted in the system development. In the Windows environment, the architecture of whole system was established step by step, including three-dimensional model construction, three-dimensional scene setup, transplanting three-dimensional scene into webpage, reediting the virtual scene, realization of interactions within the webpage, initial test, and necessary adjustment. Five cases of three-dimensional interactive webpage for clinical dentistry were completed. The three-dimensional interactive webpage could be accessible through web browser on personal computer, and users could interact with the webpage through rotating, panning and zooming the virtual scene. It is technically feasible to implement the visual educational system of virtual reality for clinical dentistry based on WWW webpage. Information related to clinical dentistry can be transmitted properly, visually and interactively through three-dimensional webpage.
Integration of virtual and real scenes within an integral 3D imaging environment

NASA Astrophysics Data System (ADS)

Ren, Jinsong; Aggoun, Amar; McCormick, Malcolm

2002-11-01

The Imaging Technologies group at De Montfort University has developed an integral 3D imaging system, which is seen as the most likely vehicle for 3D television avoiding psychological effects. To create real fascinating three-dimensional television programs, a virtual studio that performs the task of generating, editing and integrating the 3D contents involving virtual and real scenes is required. The paper presents, for the first time, the procedures, factors and methods of integrating computer-generated virtual scenes with real objects captured using the 3D integral imaging camera system. The method of computer generation of 3D integral images, where the lens array is modelled instead of the physical camera is described. In the model each micro-lens that captures different elemental images of the virtual scene is treated as an extended pinhole camera. An integration process named integrated rendering is illustrated. Detailed discussion and deep investigation are focused on depth extraction from captured integral 3D images. The depth calculation method from the disparity and the multiple baseline method that is used to improve the precision of depth estimation are also presented. The concept of colour SSD and its further improvement in the precision is proposed and verified.
Tactile and bone-conduction auditory brain computer interface for vision and hearing impaired users.

PubMed

Rutkowski, Tomasz M; Mori, Hiromu

2015-04-15

The paper presents a report on the recently developed BCI alternative for users suffering from impaired vision (lack of focus or eye-movements) or from the so-called "ear-blocking-syndrome" (limited hearing). We report on our recent studies of the extents to which vibrotactile stimuli delivered to the head of a user can serve as a platform for a brain computer interface (BCI) paradigm. In the proposed tactile and bone-conduction auditory BCI novel multiple head positions are used to evoke combined somatosensory and auditory (via the bone conduction effect) P300 brain responses, in order to define a multimodal tactile and bone-conduction auditory brain computer interface (tbcaBCI). In order to further remove EEG interferences and to improve P300 response classification synchrosqueezing transform (SST) is applied. SST outperforms the classical time-frequency analysis methods of the non-linear and non-stationary signals such as EEG. The proposed method is also computationally more effective comparing to the empirical mode decomposition. The SST filtering allows for online EEG preprocessing application which is essential in the case of BCI. Experimental results with healthy BCI-naive users performing online tbcaBCI, validate the paradigm, while the feasibility of the concept is illuminated through information transfer rate case studies. We present a comparison of the proposed SST-based preprocessing method, combined with a logistic regression (LR) classifier, together with classical preprocessing and LDA-based classification BCI techniques. The proposed tbcaBCI paradigm together with data-driven preprocessing methods are a step forward in robust BCI applications research. Copyright © 2014 Elsevier B.V. All rights reserved.
A fuzzy measure approach to motion frame analysis for scene detection. M.S. Thesis - Houston Univ.

NASA Technical Reports Server (NTRS)

Leigh, Albert B.; Pal, Sankar K.

1992-01-01

This paper addresses a solution to the problem of scene estimation of motion video data in the fuzzy set theoretic framework. Using fuzzy image feature extractors, a new algorithm is developed to compute the change of information in each of two successive frames to classify scenes. This classification process of raw input visual data can be used to establish structure for correlation. The algorithm attempts to fulfill the need for nonlinear, frame-accurate access to video data for applications such as video editing and visual document archival/retrieval systems in multimedia environments.
Land Classification of South-central Iowa from Computer Enhanced Images

NASA Technical Reports Server (NTRS)

Lucas, J. R. (Principal Investigator); Taranik, J. V.; Billingsley, F. C.

1975-01-01

The author has identified the following significant results. Two enhanced false color negatives from multispectral scanner scenes, dated 15 April 1974 and 29 August 1972, were printed at a scale of 1:125,000 to form the basis for land use interpretations in the Wapello County, Iowa test site. The use of geomorphic principles proved valuable in the interpretation of the April scene to form valuable generalizations for planning purposes on soil associations, topography, alluvial valleys, and agricultural land use. The August scene was superior in providing information on urban extent, transportation networks, forest cover, and water bodies.
Frogs Exploit Statistical Regularities in Noisy Acoustic Scenes to Solve Cocktail-Party-like Problems.

PubMed

Lee, Norman; Ward, Jessica L; Vélez, Alejandro; Micheyl, Christophe; Bee, Mark A

2017-03-06

Noise is a ubiquitous source of errors in all forms of communication [1]. Noise-induced errors in speech communication, for example, make it difficult for humans to converse in noisy social settings, a challenge aptly named the "cocktail party problem" [2]. Many nonhuman animals also communicate acoustically in noisy social groups and thus face biologically analogous problems [3]. However, we know little about how the perceptual systems of receivers are evolutionarily adapted to avoid the costs of noise-induced errors in communication. In this study of Cope's gray treefrog (Hyla chrysoscelis; Hylidae), we investigated whether receivers exploit a potential statistical regularity present in noisy acoustic scenes to reduce errors in signal recognition and discrimination. We developed an anatomical/physiological model of the peripheral auditory system to show that temporal correlation in amplitude fluctuations across the frequency spectrum ("comodulation") [4-6] is a feature of the noise generated by large breeding choruses of sexually advertising males. In four psychophysical experiments, we investigated whether females exploit comodulation in background noise to mitigate noise-induced errors in evolutionarily critical mate-choice decisions. Subjects experienced fewer errors in recognizing conspecific calls and in selecting the calls of high-quality mates in the presence of simulated chorus noise that was comodulated. These data show unequivocally, and for the first time, that exploiting statistical regularities present in noisy acoustic scenes is an important biological strategy for solving cocktail-party-like problems in nonhuman animal communication. Copyright © 2017 Elsevier Ltd. All rights reserved.
The Reasoning behind the Scene: Why Do Early Childhood Educators Use Computers in Their Classrooms?

ERIC Educational Resources Information Center

Edwards, Suzy

2005-01-01

In recent times discussion surrounding the use of computers in early childhood education has emphasised the role computers play in children's everyday lives. This realisation has replaced early debate regarding the appropriateness or otherwise of computer use for young children in early childhood education. An important component of computer use…
EEG Responses to Auditory Stimuli for Automatic Affect Recognition

PubMed Central

Hettich, Dirk T.; Bolinger, Elaina; Matuz, Tamara; Birbaumer, Niels; Rosenstiel, Wolfgang; Spüler, Martin

2016-01-01

Brain state classification for communication and control has been well established in the area of brain-computer interfaces over the last decades. Recently, the passive and automatic extraction of additional information regarding the psychological state of users from neurophysiological signals has gained increased attention in the interdisciplinary field of affective computing. We investigated how well specific emotional reactions, induced by auditory stimuli, can be detected in EEG recordings. We introduce an auditory emotion induction paradigm based on the International Affective Digitized Sounds 2nd Edition (IADS-2) database also suitable for disabled individuals. Stimuli are grouped in three valence categories: unpleasant, neutral, and pleasant. Significant differences in time domain domain event-related potentials are found in the electroencephalogram (EEG) between unpleasant and neutral, as well as pleasant and neutral conditions over midline electrodes. Time domain data were classified in three binary classification problems using a linear support vector machine (SVM) classifier. We discuss three classification performance measures in the context of affective computing and outline some strategies for conducting and reporting affect classification studies. PMID:27375410
Hearing Loss

MedlinePlus

... test uses patches (also called electrodes) and a computer to check how the auditory nerve reacts to ... baby’s ear. The earphone is connected to a computer. Your baby’s provider sends a soft clicking sound ...
Factors influencing tests of auditory processing: a perspective on current issues and relevant concerns.

PubMed

Cacace, Anthony T; McFarland, Dennis J

2013-01-01

Tests of auditory perception, such as those used in the assessment of central auditory processing disorders ([C]APDs), represent a domain in audiological assessment where measurement of this theoretical construct is often confounded by nonauditory abilities due to methodological shortcomings. These confounds include the effects of cognitive variables such as memory and attention and suboptimal testing paradigms, including the use of verbal reproduction as a form of response selection. We argue that these factors need to be controlled more carefully and/or modified so that their impact on tests of auditory and visual perception is only minimal. To advocate for a stronger theoretical framework than currently exists and to suggest better methodological strategies to improve assessment of auditory processing disorders (APDs). Emphasis is placed on adaptive forced-choice psychophysical methods and the use of matched tasks in multiple sensory modalities to achieve these goals. Together, this approach has potential to improve the construct validity of the diagnosis, enhance and develop theory, and evolve into a preferred method of testing. Examination of methods commonly used in studies of APDs. Where possible, currently used methodology is compared to contemporary psychophysical methods that emphasize computer-controlled forced-choice paradigms. In many cases, the procedures used in studies of APD introduce confounding factors that could be minimized if computer-controlled forced-choice psychophysical methods were utilized. Ambiguities of interpretation, indeterminate diagnoses, and unwanted confounds can be avoided by minimizing memory and attentional demands on the input end and precluding the use of response-selection strategies that use complex motor processes on the output end. Advocated are the use of computer-controlled forced-choice psychophysical paradigms in combination with matched tasks in multiple sensory modalities to enhance the prospect of obtaining a valid diagnosis. American Academy of Audiology.
Recognizing Spoken Words: The Neighborhood Activation Model

PubMed Central

Luce, Paul A.; Pisoni, David B.

2012-01-01

Objective A fundamental problem in the study of human spoken word recognition concerns the structural relations among the sound patterns of words in memory and the effects these relations have on spoken word recognition. In the present investigation, computational and experimental methods were employed to address a number of fundamental issues related to the representation and structural organization of spoken words in the mental lexicon and to lay the groundwork for a model of spoken word recognition. Design Using a computerized lexicon consisting of transcriptions of 20,000 words, similarity neighborhoods for each of the transcriptions were computed. Among the variables of interest in the computation of the similarity neighborhoods were: 1) the number of words occurring in a neighborhood, 2) the degree of phonetic similarity among the words, and 3) the frequencies of occurrence of the words in the language. The effects of these variables on auditory word recognition were examined in a series of behavioral experiments employing three experimental paradigms: perceptual identification of words in noise, auditory lexical decision, and auditory word naming. Results The results of each of these experiments demonstrated that the number and nature of words in a similarity neighborhood affect the speed and accuracy of word recognition. A neighborhood probability rule was developed that adequately predicted identification performance. This rule, based on Luce's (1959) choice rule, combines stimulus word intelligibility, neighborhood confusability, and frequency into a single expression. Based on this rule, a model of auditory word recognition, the neighborhood activation model, was proposed. This model describes the effects of similarity neighborhood structure on the process of discriminating among the acoustic-phonetic representations of words in memory. The results of these experiments have important implications for current conceptions of auditory word recognition in normal and hearing impaired populations of children and adults. PMID:9504270
Self-Calibrated In-Process Photogrammetry for Large Raw Part Measurement and Alignment before Machining

PubMed Central

Mendikute, Alberto; Zatarain, Mikel; Bertelsen, Álvaro; Leizea, Ibai

2017-01-01

Photogrammetry methods are being used more and more as a 3D technique for large scale metrology applications in industry. Optical targets are placed on an object and images are taken around it, where measuring traceability is provided by precise off-process pre-calibrated digital cameras and scale bars. According to the 2D target image coordinates, target 3D coordinates and camera views are jointly computed. One of the applications of photogrammetry is the measurement of raw part surfaces prior to its machining. For this application, post-process bundle adjustment has usually been adopted for computing the 3D scene. With that approach, a high computation time is observed, leading in practice to time consuming and user dependent iterative review and re-processing procedures until an adequate set of images is taken, limiting its potential for fast, easy-to-use, and precise measurements. In this paper, a new efficient procedure is presented for solving the bundle adjustment problem in portable photogrammetry. In-process bundle computing capability is demonstrated on a consumer grade desktop PC, enabling quasi real time 2D image and 3D scene computing. Additionally, a method for the self-calibration of camera and lens distortion has been integrated into the in-process approach due to its potential for highest precision when using low cost non-specialized digital cameras. Measurement traceability is set only by scale bars available in the measuring scene, avoiding the uncertainty contribution of off-process camera calibration procedures or the use of special purpose calibration artifacts. The developed self-calibrated in-process photogrammetry has been evaluated both in a pilot case scenario and in industrial scenarios for raw part measurement, showing a total in-process computing time typically below 1 s per image up to a maximum of 2 s during the last stages of the computed industrial scenes, along with a relative precision of 1/10,000 (e.g., 0.1 mm error in 1 m) with an error RMS below 0.2 pixels at image plane, ranging at the same performance reported for portable photogrammetry with precise off-process pre-calibrated cameras. PMID:28891946
Self-Calibrated In-Process Photogrammetry for Large Raw Part Measurement and Alignment before Machining.

PubMed

Mendikute, Alberto; Yagüe-Fabra, José A; Zatarain, Mikel; Bertelsen, Álvaro; Leizea, Ibai

2017-09-09

Photogrammetry methods are being used more and more as a 3D technique for large scale metrology applications in industry. Optical targets are placed on an object and images are taken around it, where measuring traceability is provided by precise off-process pre-calibrated digital cameras and scale bars. According to the 2D target image coordinates, target 3D coordinates and camera views are jointly computed. One of the applications of photogrammetry is the measurement of raw part surfaces prior to its machining. For this application, post-process bundle adjustment has usually been adopted for computing the 3D scene. With that approach, a high computation time is observed, leading in practice to time consuming and user dependent iterative review and re-processing procedures until an adequate set of images is taken, limiting its potential for fast, easy-to-use, and precise measurements. In this paper, a new efficient procedure is presented for solving the bundle adjustment problem in portable photogrammetry. In-process bundle computing capability is demonstrated on a consumer grade desktop PC, enabling quasi real time 2D image and 3D scene computing. Additionally, a method for the self-calibration of camera and lens distortion has been integrated into the in-process approach due to its potential for highest precision when using low cost non-specialized digital cameras. Measurement traceability is set only by scale bars available in the measuring scene, avoiding the uncertainty contribution of off-process camera calibration procedures or the use of special purpose calibration artifacts. The developed self-calibrated in-process photogrammetry has been evaluated both in a pilot case scenario and in industrial scenarios for raw part measurement, showing a total in-process computing time typically below 1 s per image up to a maximum of 2 s during the last stages of the computed industrial scenes, along with a relative precision of 1/10,000 (e.g. 0.1 mm error in 1 m) with an error RMS below 0.2 pixels at image plane, ranging at the same performance reported for portable photogrammetry with precise off-process pre-calibrated cameras.
Addressing Small Computers in the First OS Course

ERIC Educational Resources Information Center

Nutt, Gary

2006-01-01

Small computers are emerging as important components of the contemporary computing scene. Their operating systems vary from specialized software for an embedded system to the same style of OS used on a generic desktop or server computer. This article describes a course in which systems are classified by their hardware capability and the…
Functional Architecture for Disparity in Macaque Inferior Temporal Cortex and Its Relationship to the Architecture for Faces, Color, Scenes, and Visual Field

PubMed Central

Verhoef, Bram-Ernst; Bohon, Kaitlin S.

2015-01-01

Binocular disparity is a powerful depth cue for object perception. The computations for object vision culminate in inferior temporal cortex (IT), but the functional organization for disparity in IT is unknown. Here we addressed this question by measuring fMRI responses in alert monkeys to stimuli that appeared in front of (near), behind (far), or at the fixation plane. We discovered three regions that showed preferential responses for near and far stimuli, relative to zero-disparity stimuli at the fixation plane. These “near/far” disparity-biased regions were located within dorsal IT, as predicted by microelectrode studies, and on the posterior inferotemporal gyrus. In a second analysis, we instead compared responses to near stimuli with responses to far stimuli and discovered a separate network of “near” disparity-biased regions that extended along the crest of the superior temporal sulcus. We also measured in the same animals fMRI responses to faces, scenes, color, and checkerboard annuli at different visual field eccentricities. Disparity-biased regions defined in either analysis did not show a color bias, suggesting that disparity and color contribute to different computations within IT. Scene-biased regions responded preferentially to near and far stimuli (compared with stimuli without disparity) and had a peripheral visual field bias, whereas face patches had a marked near bias and a central visual field bias. These results support the idea that IT is organized by a coarse eccentricity map, and show that disparity likely contributes to computations associated with both central (face processing) and peripheral (scene processing) visual field biases, but likely does not contribute much to computations within IT that are implicated in processing color. PMID:25926470
Scene-based nonuniformity correction with video sequences and registration.

PubMed

Hardie, R C; Hayat, M M; Armstrong, E; Yasuda, B

2000-03-10

We describe a new, to our knowledge, scene-based nonuniformity correction algorithm for array detectors. The algorithm relies on the ability to register a sequence of observed frames in the presence of the fixed-pattern noise caused by pixel-to-pixel nonuniformity. In low-to-moderate levels of nonuniformity, sufficiently accurate registration may be possible with standard scene-based registration techniques. If the registration is accurate, and motion exists between the frames, then groups of independent detectors can be identified that observe the same irradiance (or true scene value). These detector outputs are averaged to generate estimates of the true scene values. With these scene estimates, and the corresponding observed values through a given detector, a curve-fitting procedure is used to estimate the individual detector response parameters. These can then be used to correct for detector nonuniformity. The strength of the algorithm lies in its simplicity and low computational complexity. Experimental results, to illustrate the performance of the algorithm, include the use of visible-range imagery with simulated nonuniformity and infrared imagery with real nonuniformity.
Multi- and hyperspectral scene modeling

NASA Astrophysics Data System (ADS)

Borel, Christoph C.; Tuttle, Ronald F.

2011-06-01

This paper shows how to use a public domain raytracer POV-Ray (Persistence Of Vision Raytracer) to render multiand hyper-spectral scenes. The scripting environment allows automatic changing of the reflectance and transmittance parameters. The radiosity rendering mode allows accurate simulation of multiple-reflections between surfaces and also allows semi-transparent surfaces such as plant leaves. We show that POV-Ray computes occlusion accurately using a test scene with two blocks under a uniform sky. A complex scene representing a plant canopy is generated using a few lines of script. With appropriate rendering settings, shadows cast by leaves are rendered in many bands. Comparing single and multiple reflection renderings, the effect of multiple reflections is clearly visible and accounts for 25% of the overall apparent canopy reflectance in the near infrared.

Auditory cortex of newborn bats is prewired for echolocation.

PubMed

Kössl, Manfred; Voss, Cornelia; Mora, Emanuel C; Macias, Silvio; Foeller, Elisabeth; Vater, Marianne

2012-04-10

Neuronal computation of object distance from echo delay is an essential task that echolocating bats must master for spatial orientation and the capture of prey. In the dorsal auditory cortex of bats, neurons specifically respond to combinations of short frequency-modulated components of emitted call and delayed echo. These delay-tuned neurons are thought to serve in target range calculation. It is unknown whether neuronal correlates of active space perception are established by experience-dependent plasticity or by innate mechanisms. Here we demonstrate that in the first postnatal week, before onset of echolocation and flight, dorsal auditory cortex already contains functional circuits that calculate distance from the temporal separation of a simulated pulse and echo. This innate cortical implementation of a purely computational processing mechanism for sonar ranging should enhance survival of juvenile bats when they first engage in active echolocation behaviour and flight.
Individual differences in attentional modulation of cortical responses correlate with selective attention performance

PubMed Central

Choi, Inyong; Wang, Le; Bharadwaj, Hari; Shinn-Cunningham, Barbara

2014-01-01

Many studies have shown that attention modulates the cortical representation of an auditory scene, emphasizing an attended source while suppressing competing sources. Yet, individual differences in the strength of this attentional modulation and their relationship with selective attention ability are poorly understood. Here, we ask whether differences in how strongly attention modulates cortical responses reflect differences in normal-hearing listeners’ selective auditory attention ability. We asked listeners to attend to one of three competing melodies and identify its pitch contour while we measured cortical electroencephalographic responses. The three melodies were either from widely separated pitch ranges (“easy trials”), or from a narrow, overlapping pitch range (“hard trials”). The melodies started at slightly different times; listeners attended either the leading or lagging melody. Because of the timing of the onsets, the leading melody drew attention exogenously. In contrast, attending the lagging melody required listeners to direct top-down attention volitionally. We quantified how attention amplified auditory N1 response to the attended melody and found large individual differences in the N1 amplification, even though only correctly answered trials were used to quantify the ERP gain. Importantly, listeners with the strongest amplification of N1 response to the lagging melody in the easy trials were the best performers across other types of trials. Our results raise the possibility that individual differences in the strength of top-down gain control reflect inherent differences in the ability to control top-down attention. PMID:24821552
Speech target modulates speaking induced suppression in auditory cortex

PubMed Central

Ventura, Maria I; Nagarajan, Srikantan S; Houde, John F

2009-01-01

Background Previous magnetoencephalography (MEG) studies have demonstrated speaking-induced suppression (SIS) in the auditory cortex during vocalization tasks wherein the M100 response to a subject's own speaking is reduced compared to the response when they hear playback of their speech. Results The present MEG study investigated the effects of utterance rapidity and complexity on SIS: The greatest difference between speak and listen M100 amplitudes (i.e., most SIS) was found in the simple speech task. As the utterances became more rapid and complex, SIS was significantly reduced (p = 0.0003). Conclusion These findings are highly consistent with our model of how auditory feedback is processed during speaking, where incoming feedback is compared with an efference-copy derived prediction of expected feedback. Thus, the results provide further insights about how speech motor output is controlled, as well as the computational role of auditory cortex in transforming auditory feedback. PMID:19523234
External auditory canal atresia of probable congenital origin in a dog.

PubMed

Schmidt, K; Piaia, T; Bertolini, G; De Lorenzi, D

2007-04-01

A nine-month-old Labrador retriever was referred to the Clinica Veterinaria Privata San Marco because of frequent headshaking and downward turning of the right ear. Clinical examination revealed that there was no external acoustic meatus in the right ear. Computed tomography confirmed that the vertical part of the right auditory canal ended blindly, providing a diagnosis of external auditory canal atresia. Cytological examination and culture of fluid from the canal and the bulla revealed only aseptic cerumen; for this reason, it was assumed that the dog was probably affected by a congenital developmental deformity of the external auditory canal. Reconstructive surgery was performed using a "pull-through" technique. Four months after surgery the cosmetic and functional results were satisfactory.
Scientific Visualization and Computational Science: Natural Partners

NASA Technical Reports Server (NTRS)

Uselton, Samuel P.; Lasinski, T. A. (Technical Monitor)

1995-01-01

Scientific visualization is developing rapidly, stimulated by computational science, which is gaining acceptance as a third alternative to theory and experiment. Computational science is based on numerical simulations of mathematical models derived from theory. But each individual simulation is like a hypothetical experiment; initial conditions are specified, and the result is a record of the observed conditions. Experiments can be simulated for situations that can not really be created or controlled. Results impossible to measure can be computed.. Even for observable values, computed samples are typically much denser. Numerical simulations also extend scientific exploration where the mathematics is analytically intractable. Numerical simulations are used to study phenomena from subatomic to intergalactic scales and from abstract mathematical structures to pragmatic engineering of everyday objects. But computational science methods would be almost useless without visualization. The obvious reason is that the huge amounts of data produced require the high bandwidth of the human visual system, and interactivity adds to the power. Visualization systems also provide a single context for all the activities involved from debugging the simulations, to exploring the data, to communicating the results. Most of the presentations today have their roots in image processing, where the fundamental task is: Given an image, extract information about the scene. Visualization has developed from computer graphics, and the inverse task: Given a scene description, make an image. Visualization extends the graphics paradigm by expanding the possible input. The goal is still to produce images; the difficulty is that the input is not a scene description displayable by standard graphics methods. Visualization techniques must either transform the data into a scene description or extend graphics techniques to display this odd input. Computational science is a fertile field for visualization research because the results vary so widely and include things that have no known appearance. The amount of data creates additional challenges for both hardware and software systems. Evaluations of visualization should ultimately reflect the insight gained into the scientific phenomena. So making good visualizations requires consideration of characteristics of the user and the purpose of the visualization. Knowledge about human perception and graphic design is also relevant. It is this breadth of knowledge that stimulates proposals for multidisciplinary visualization teams and intelligent visualization assistant software. Visualization is an immature field, but computational science is stimulating research on a broad front.
Noise-induced hearing loss alters the temporal dynamics of auditory-nerve responses

PubMed Central

Scheidt, Ryan E.; Kale, Sushrut; Heinz, Michael G.

2010-01-01

Auditory-nerve fibers demonstrate dynamic response properties in that they adapt to rapid changes in sound level, both at the onset and offset of a sound. These dynamic response properties affect temporal coding of stimulus modulations that are perceptually relevant for many sounds such as speech and music. Temporal dynamics have been well characterized in auditory-nerve fibers from normal-hearing animals, but little is known about the effects of sensorineural hearing loss on these dynamics. This study examined the effects of noise-induced hearing loss on the temporal dynamics in auditory-nerve fiber responses from anesthetized chinchillas. Post-stimulus time histograms were computed from responses to 50-ms tones presented at characteristic frequency and 30 dB above fiber threshold. Several response metrics related to temporal dynamics were computed from post-stimulus-time histograms and were compared between normal-hearing and noise-exposed animals. Results indicate that noise-exposed auditory-nerve fibers show significantly reduced response latency, increased onset response and percent adaptation, faster adaptation after onset, and slower recovery after offset. The decrease in response latency only occurred in noise-exposed fibers with significantly reduced frequency selectivity. These changes in temporal dynamics have important implications for temporal envelope coding in hearing-impaired ears, as well as for the design of dynamic compression algorithms for hearing aids. PMID:20696230
Effects of training and motivation on auditory P300 brain-computer interface performance.

PubMed

Baykara, E; Ruf, C A; Fioravanti, C; Käthner, I; Simon, N; Kleih, S C; Kübler, A; Halder, S

2016-01-01

Brain-computer interface (BCI) technology aims at helping end-users with severe motor paralysis to communicate with their environment without using the natural output pathways of the brain. For end-users in complete paralysis, loss of gaze control may necessitate non-visual BCI systems. The present study investigated the effect of training on performance with an auditory P300 multi-class speller paradigm. For half of the participants, spatial cues were added to the auditory stimuli to see whether performance can be further optimized. The influence of motivation, mood and workload on performance and P300 component was also examined. In five sessions, 16 healthy participants were instructed to spell several words by attending to animal sounds representing the rows and columns of a 5 × 5 letter matrix. 81% of the participants achieved an average online accuracy of ⩾ 70%. From the first to the fifth session information transfer rates increased from 3.72 bits/min to 5.63 bits/min. Motivation significantly influenced P300 amplitude and online ITR. No significant facilitative effect of spatial cues on performance was observed. Training improves performance in an auditory BCI paradigm. Motivation influences performance and P300 amplitude. The described auditory BCI system may help end-users to communicate independently of gaze control with their environment. Copyright © 2015 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
An Evaluation of Training with an Auditory P300 Brain-Computer Interface for the Japanese Hiragana Syllabary

PubMed Central

Halder, Sebastian; Takano, Kouji; Ora, Hiroki; Onishi, Akinari; Utsumi, Kota; Kansaku, Kenji

2016-01-01

Gaze-independent brain-computer interfaces (BCIs) are a possible communication channel for persons with paralysis. We investigated if it is possible to use auditory stimuli to create a BCI for the Japanese Hiragana syllabary, which has 46 Hiragana characters. Additionally, we investigated if training has an effect on accuracy despite the high amount of different stimuli involved. Able-bodied participants (N = 6) were asked to select 25 syllables (out of fifty possible choices) using a two step procedure: First the consonant (ten choices) and then the vowel (five choices). This was repeated on 3 separate days. Additionally, a person with spinal cord injury (SCI) participated in the experiment. Four out of six healthy participants reached Hiragana syllable accuracies above 70% and the information transfer rate increased from 1.7 bits/min in the first session to 3.2 bits/min in the third session. The accuracy of the participant with SCI increased from 12% (0.2 bits/min) to 56% (2 bits/min) in session three. Reliable selections from a 10 × 5 matrix using auditory stimuli were possible and performance is increased by training. We were able to show that auditory P300 BCIs can be used for communication with up to fifty symbols. This enables the use of the technology of auditory P300 BCIs with a variety of applications. PMID:27746716
An Evaluation of Training with an Auditory P300 Brain-Computer Interface for the Japanese Hiragana Syllabary.

PubMed

Halder, Sebastian; Takano, Kouji; Ora, Hiroki; Onishi, Akinari; Utsumi, Kota; Kansaku, Kenji

2016-01-01

Gaze-independent brain-computer interfaces (BCIs) are a possible communication channel for persons with paralysis. We investigated if it is possible to use auditory stimuli to create a BCI for the Japanese Hiragana syllabary, which has 46 Hiragana characters. Additionally, we investigated if training has an effect on accuracy despite the high amount of different stimuli involved. Able-bodied participants ( N = 6) were asked to select 25 syllables (out of fifty possible choices) using a two step procedure: First the consonant (ten choices) and then the vowel (five choices). This was repeated on 3 separate days. Additionally, a person with spinal cord injury (SCI) participated in the experiment. Four out of six healthy participants reached Hiragana syllable accuracies above 70% and the information transfer rate increased from 1.7 bits/min in the first session to 3.2 bits/min in the third session. The accuracy of the participant with SCI increased from 12% (0.2 bits/min) to 56% (2 bits/min) in session three. Reliable selections from a 10 × 5 matrix using auditory stimuli were possible and performance is increased by training. We were able to show that auditory P300 BCIs can be used for communication with up to fifty symbols. This enables the use of the technology of auditory P300 BCIs with a variety of applications.
Information Processing Research.

DTIC Science & Technology

1986-09-01

Kuroe. The 3D MOSAIC Scene Understanding System. In Alan Bundy, Editor, Proceedings of the Eighth International Joint Conference on Artificial ... Artificial Jntelligencel7(1-3):409-460, August, 1981. Given a single picture which is a projection of a three-dimensional scene onto the two...values are detected as outliers by computing the distribution of values over a sliding 80 msec window. During the third pass (based on artificial
Fixations on objects in natural scenes: dissociating importance from salience

PubMed Central

't Hart, Bernard M.; Schmidt, Hannah C. E. F.; Roth, Christine; Einhäuser, Wolfgang

2013-01-01

The relation of selective attention to understanding of natural scenes has been subject to intense behavioral research and computational modeling, and gaze is often used as a proxy for such attention. The probability of an image region to be fixated typically correlates with its contrast. However, this relation does not imply a causal role of contrast. Rather, contrast may relate to an object's “importance” for a scene, which in turn drives attention. Here we operationalize importance by the probability that an observer names the object as characteristic for a scene. We modify luminance contrast of either a frequently named (“common”/“important”) or a rarely named (“rare”/“unimportant”) object, track the observers' eye movements during scene viewing and ask them to provide keywords describing the scene immediately after. When no object is modified relative to the background, important objects draw more fixations than unimportant ones. Increases of contrast make an object more likely to be fixated, irrespective of whether it was important for the original scene, while decreases in contrast have little effect on fixations. Any contrast modification makes originally unimportant objects more important for the scene. Finally, important objects are fixated more centrally than unimportant objects, irrespective of contrast. Our data suggest a dissociation between object importance (relevance for the scene) and salience (relevance for attention). If an object obeys natural scene statistics, important objects are also salient. However, when natural scene statistics are violated, importance and salience are differentially affected. Object salience is modulated by the expectation about object properties (e.g., formed by context or gist), and importance by the violation of such expectations. In addition, the dependence of fixated locations within an object on the object's importance suggests an analogy to the effects of word frequency on landing positions in reading. PMID:23882251
Auralization of CFD Vorticity Using an Auditory Illusion

NASA Astrophysics Data System (ADS)

Volpe, C. R.

2005-12-01

One way in which scientists and engineers interpret large quantities of data is through a process called visualization, i.e. generating graphical images that capture essential characteristics and highlight interesting relationships. Another approach, which has received far less attention, is to present complex information with sound. This approach, called ``auralization" or ``sonification", is the auditory analog of visualization. Early work in data auralization frequently involved directly mapping some variable in the data to a sound parameter, such as pitch or volume. Multi-variate data could be auralized by mapping several variables to several sound parameters simultaneously. A clear drawback of this approach is the limited practical range of sound parameters that can be presented to human listeners without exceeding their range of perception or comfort. A software auralization system built upon an existing visualization system is briefly described. This system incorporates an aural presentation synchronously and interactively with an animated scientific visualization, so that alternate auralization techniques can be investigated. One such alternate technique involves auditory illusions: sounds which trick the listener into perceiving something other than what is actually being presented. This software system will be used to present an auditory illusion, known for decades among cognitive psychologists, which produces a sound that seems to ascend or descend endlessly in pitch. The applicability of this illusion for presenting Computational Fluid Dynamics data will be demonstrated. CFD data is frequently visualized with thin stream-lines, but thicker stream-ribbons and stream-tubes can also be used, which rotate to convey fluid vorticity. But a purely graphical presentation can yield drawbacks of its own. Thicker stream-tubes can be self-obscuring, and can obscure other scene elements as well, thus motivating a different approach, such as using sound. Naturally, the simple approach of mapping clockwise and counterclockwise rotations to actual pitch increases and decreases, eventually results in sounds that the listener cannot hear. In this alternate presentation using an auditory illusion, repeated rotations of a stream-tube are replaced with continual increases or decreases in apparent pitch. These apparent pitch changes can continue without bound, yet never exceed the range of frequencies that the listener can hear. The effectiveness of this presentation technique has been studied, and empirical results, obtained through formal user testing and statistical analysis, are presented. These results demonstrate that an aural data presentation using an auditory illusion can improve performance in locating key data characteristics, a task that demonstrates a certain level of understanding of the data. The experiments show that this holds true even when the user expresses a subjective preference and greater confidence in a visual presentation. The CFD data used in the research comes from a number of different industrial domains, but the advantages of this technique could be equally applicable to the study of earth sciences involving fluid mechanics, such as atmospheric or ocean sciences. Furthermore, the approach is applicable not only to CFD data, but to any type of data in which a quantity that is cyclic in nature, such as orientation, needs to be presented. Although the techniques and tools were originally developed with scientists and engineers in mind, they can also be used to aid students, particularly those who are visually impaired or who have difficulty interpreting certain spatial relationships visually.
Real-time depth processing for embedded platforms

NASA Astrophysics Data System (ADS)

Rahnama, Oscar; Makarov, Aleksej; Torr, Philip

2017-05-01

Obtaining depth information of a scene is an important requirement in many computer-vision and robotics applications. For embedded platforms, passive stereo systems have many advantages over their active counterparts (i.e. LiDAR, Infrared). They are power efficient, cheap, robust to lighting conditions and inherently synchronized to the RGB images of the scene. However, stereo depth estimation is a computationally expensive task that operates over large amounts of data. For embedded applications which are often constrained by power consumption, obtaining accurate results in real-time is a challenge. We demonstrate a computationally and memory efficient implementation of a stereo block-matching algorithm in FPGA. The computational core achieves a throughput of 577 fps at standard VGA resolution whilst consuming less than 3 Watts of power. The data is processed using an in-stream approach that minimizes memory-access bottlenecks and best matches the raster scan readout of modern digital image sensors.
Effects of Background Music on Objective and Subjective Performance Measures in an Auditory BCI.

PubMed

Zhou, Sijie; Allison, Brendan Z; Kübler, Andrea; Cichocki, Andrzej; Wang, Xingyu; Jin, Jing

2016-01-01

Several studies have explored brain computer interface (BCI) systems based on auditory stimuli, which could help patients with visual impairments. Usability and user satisfaction are important considerations in any BCI. Although background music can influence emotion and performance in other task environments, and many users may wish to listen to music while using a BCI, auditory, and other BCIs are typically studied without background music. Some work has explored the possibility of using polyphonic music in auditory BCI systems. However, this approach requires users with good musical skills, and has not been explored in online experiments. Our hypothesis was that an auditory BCI with background music would be preferred by subjects over a similar BCI without background music, without any difference in BCI performance. We introduce a simple paradigm (which does not require musical skill) using percussion instrument sound stimuli and background music, and evaluated it in both offline and online experiments. The result showed that subjects preferred the auditory BCI with background music. Different performance measures did not reveal any significant performance effect when comparing background music vs. no background. Since the addition of background music does not impair BCI performance but is preferred by users, auditory (and perhaps other) BCIs should consider including it. Our study also indicates that auditory BCIs can be effective even if the auditory channel is simultaneously otherwise engaged.
Impairment of Auditory-Motor Timing and Compensatory Reorganization after Ventral Premotor Cortex Stimulation

PubMed Central

Kornysheva, Katja; Schubotz, Ricarda I.

2011-01-01

Integrating auditory and motor information often requires precise timing as in speech and music. In humans, the position of the ventral premotor cortex (PMv) in the dorsal auditory stream renders this area a node for auditory-motor integration. Yet, it remains unknown whether the PMv is critical for auditory-motor timing and which activity increases help to preserve task performance following its disruption. 16 healthy volunteers participated in two sessions with fMRI measured at baseline and following rTMS (rTMS) of either the left PMv or a control region. Subjects synchronized left or right finger tapping to sub-second beat rates of auditory rhythms in the experimental task, and produced self-paced tapping during spectrally matched auditory stimuli in the control task. Left PMv rTMS impaired auditory-motor synchronization accuracy in the first sub-block following stimulation (p<0.01, Bonferroni corrected), but spared motor timing and attention to task. Task-related activity increased in the homologue right PMv, but did not predict the behavioral effect of rTMS. In contrast, anterior midline cerebellum revealed most pronounced activity increase in less impaired subjects. The present findings suggest a critical role of the left PMv in feed-forward computations enabling accurate auditory-motor timing, which can be compensated by activity modulations in the cerebellum, but not in the homologue region contralateral to stimulation. PMID:21738657
A novel hybrid auditory BCI paradigm combining ASSR and P300.

PubMed

Kaongoen, Netiwit; Jo, Sungho

2017-03-01

Brain-computer interface (BCI) is a technology that provides an alternative way of communication by translating brain activities into digital commands. Due to the incapability of using the vision-dependent BCI for patients who have visual impairment, auditory stimuli have been used to substitute the conventional visual stimuli. This paper introduces a hybrid auditory BCI that utilizes and combines auditory steady state response (ASSR) and spatial-auditory P300 BCI to improve the performance for the auditory BCI system. The system works by simultaneously presenting auditory stimuli with different pitches and amplitude modulation (AM) frequencies to the user with beep sounds occurring randomly between all sound sources. Attention to different auditory stimuli yields different ASSR and beep sounds trigger the P300 response when they occur in the target channel, thus the system can utilize both features for classification. The proposed ASSR/P300-hybrid auditory BCI system achieves 85.33% accuracy with 9.11 bits/min information transfer rate (ITR) in binary classification problem. The proposed system outperformed the P300 BCI system (74.58% accuracy with 4.18 bits/min ITR) and the ASSR BCI system (66.68% accuracy with 2.01 bits/min ITR) in binary-class problem. The system is completely vision-independent. This work demonstrates that combining ASSR and P300 BCI into a hybrid system could result in a better performance and could help in the development of the future auditory BCI. Copyright © 2017 Elsevier B.V. All rights reserved.
Perceptual geometry of space and form: visual perception of natural scenes and their virtual representation

NASA Astrophysics Data System (ADS)

Assadi, Amir H.

2001-11-01

Perceptual geometry is an emerging field of interdisciplinary research whose objectives focus on study of geometry from the perspective of visual perception, and in turn, apply such geometric findings to the ecological study of vision. Perceptual geometry attempts to answer fundamental questions in perception of form and representation of space through synthesis of cognitive and biological theories of visual perception with geometric theories of the physical world. Perception of form and space are among fundamental problems in vision science. In recent cognitive and computational models of human perception, natural scenes are used systematically as preferred visual stimuli. Among key problems in perception of form and space, we have examined perception of geometry of natural surfaces and curves, e.g. as in the observer's environment. Besides a systematic mathematical foundation for a remarkably general framework, the advantages of the Gestalt theory of natural surfaces include a concrete computational approach to simulate or recreate images whose geometric invariants and quantities might be perceived and estimated by an observer. The latter is at the very foundation of understanding the nature of perception of space and form, and the (computer graphics) problem of rendering scenes to visually invoke virtual presence.
A HWIL test facility of infrared imaging laser radar using direct signal injection

NASA Astrophysics Data System (ADS)

Wang, Qian; Lu, Wei; Wang, Chunhui; Wang, Qi

2005-01-01

Laser radar has been widely used these years and the hardware-in-the-loop (HWIL) testing of laser radar become important because of its low cost and high fidelity compare with On-the-Fly testing and whole digital simulation separately. Scene generation and projection two key technologies of hardware-in-the-loop testing of laser radar and is a complicated problem because the 3D images result from time delay. The scene generation process begins with the definition of the target geometry and reflectivity and range. The real-time 3D scene generation computer is a PC based hardware and the 3D target models were modeled using 3dsMAX. The scene generation software was written in C and OpenGL and is executed to extract the Z-buffer from the bit planes to main memory as range image. These pixels contain each target position x, y, z and its respective intensity and range value. Expensive optical injection technologies of scene projection such as LDP array, VCSEL array, DMD and associated scene generation is ongoing. But the optical scene projection is complicated and always unaffordable. In this paper a cheaper test facility was described that uses direct electronic injection to provide rang images for laser radar testing. The electronic delay and pulse shaping circuits inject the scenes directly into the seeker's signal processing unit.
SDI Software Technology Program Plan Version 1.5

DTIC Science & Technology

1987-06-01

computer generation of auditory communication of meaningful speech. Most speech synthesizers are based on mathematical models of the human vocal tract, but...oral/ auditory and multimodal communications. Although such state-of-the-art interaction technology has not fully matured, user experience has...superior I pattern matching capabilities and the subliminal intuitive deduction capability. The error performance of humans can be helped by careful
Long-range synchrony of gamma oscillations and auditory hallucination symptoms in schizophrenia

PubMed Central

Mulert, C.; Kirsch; Pascual-Marqui, Roberto; McCarley, Robert W.; Spencer, Kevin M.

2010-01-01

Phase locking in the gamma-band range has been shown to be diminished in patients with schizophrenia. Moreover, there have been reports of positive correlations between phase locking in the gamma-band range and positive symptoms, especially hallucinations. The aim of the present study was to use a new methodological approach in order to investigate gamma-band phase synchronization between the left and right auditory cortex in patients with schizophrenia and its relationship to auditory hallucinations. Subjects were 18 patients with chronic schizophrenia (SZ) and 16 healthy control (HC) subjects. Auditory hallucination symptom scores were obtained using the Scale for the Assessment of Positive Symptoms. Stimuli were 40-Hz binaural click trains. The generators of the 40 Hz-ASSR were localized using eLORETA and based on the computed intracranial signals lagged interhemispheric phase locking between primary and secondary auditory cortices was analyzed. Current source density of the 40 ASSR response was significantly diminished in SZ in comparison to HC in the right superior and middle temporal gyrus (p<0.05). Interhemispheric phase locking was reduced in SZ in comparison to HC for the primary auditory cortices (p<0.05) but not in the secondary auditory cortices. A significant positive correlation was found between auditory hallucination symptom scores and phase synchronization between the primary auditory cortices (p<0.05, corrected for multiple testing) but not for the secondary auditory cortices. These results suggest that long-range synchrony of gamma oscillations is disturbed in schizophrenia and that this deficit is related to clinical symptoms such as auditory hallucinations. PMID:20713096

Robust Feature Matching in Terrestrial Image Sequences

NASA Astrophysics Data System (ADS)

Abbas, A.; Ghuffar, S.

2018-04-01

From the last decade, the feature detection, description and matching techniques are most commonly exploited in various photogrammetric and computer vision applications, which includes: 3D reconstruction of scenes, image stitching for panoramic creation, image classification, or object recognition etc. However, in terrestrial imagery of urban scenes contains various issues, which include duplicate and identical structures (i.e. repeated windows and doors) that cause the problem in feature matching phase and ultimately lead to failure of results specially in case of camera pose and scene structure estimation. In this paper, we will address the issue related to ambiguous feature matching in urban environment due to repeating patterns.
Do deep convolutional neural networks really need to be deep when applied for remote scene classification?

NASA Astrophysics Data System (ADS)

Luo, Chang; Wang, Jie; Feng, Gang; Xu, Suhui; Wang, Shiqiang

2017-10-01

Deep convolutional neural networks (CNNs) have been widely used to obtain high-level representation in various computer vision tasks. However, for remote scene classification, there are not sufficient images to train a very deep CNN from scratch. From two viewpoints of generalization power, we propose two promising kinds of deep CNNs for remote scenes and try to find whether deep CNNs need to be deep for remote scene classification. First, we transfer successful pretrained deep CNNs to remote scenes based on the theory that depth of CNNs brings the generalization power by learning available hypothesis for finite data samples. Second, according to the opposite viewpoint that generalization power of deep CNNs comes from massive memorization and shallow CNNs with enough neural nodes have perfect finite sample expressivity, we design a lightweight deep CNN (LDCNN) for remote scene classification. With five well-known pretrained deep CNNs, experimental results on two independent remote-sensing datasets demonstrate that transferred deep CNNs can achieve state-of-the-art results in an unsupervised setting. However, because of its shallow architecture, LDCNN cannot obtain satisfactory performance, regardless of whether in an unsupervised, semisupervised, or supervised setting. CNNs really need depth to obtain general features for remote scenes. This paper also provides baseline for applying deep CNNs to other remote sensing tasks.
Computer 3D site model generation based on aerial images

NASA Astrophysics Data System (ADS)

Zheltov, Sergey Y.; Blokhinov, Yuri B.; Stepanov, Alexander A.; Skryabin, Sergei V.; Sibiriakov, Alexandre V.

1997-07-01

The technology for 3D model design of real world scenes and its photorealistic rendering are current topics of investigation. Development of such technology is very attractive to implement in vast varieties of applications: military mission planning, crew training, civil engineering, architecture, virtual reality entertainments--just a few were mentioned. 3D photorealistic models of urban areas are often discussed now as upgrade from existing 2D geographic information systems. Possibility of site model generation with small details depends on two main factors: available source dataset and computer power resources. In this paper PC based technology is presented, so the scenes of middle resolution (scale of 1:1000) be constructed. Types of datasets are the gray level aerial stereo pairs of photographs (scale of 1:14000) and true color on ground photographs of buildings (scale ca.1:1000). True color terrestrial photographs are also necessary for photorealistic rendering, that in high extent improves human perception of the scene.
Krikalev on middeck with laptop computer

NASA Image and Video Library

1998-12-06

S88-E-5041 (12-06-98) --- Sergei Krikalev, mission specialist representing the Russian Space Agency (RSA), works on a laptop computer on Endeavour's middeck. The scene was photographed shortly after the successful mating of Unity with the shuttle's docking system.
Individual differences in visual motion perception and neurotransmitter concentrations in the human brain.

PubMed

Takeuchi, Tatsuto; Yoshimoto, Sanae; Shimada, Yasuhiro; Kochiyama, Takanori; Kondo, Hirohito M

2017-02-19

Recent studies have shown that interindividual variability can be a rich source of information regarding the mechanism of human visual perception. In this study, we examined the mechanisms underlying interindividual variability in the perception of visual motion, one of the fundamental components of visual scene analysis, by measuring neurotransmitter concentrations using magnetic resonance spectroscopy. First, by psychophysically examining two types of motion phenomena-motion assimilation and contrast-we found that, following the presentation of the same stimulus, some participants perceived motion assimilation, while others perceived motion contrast. Furthermore, we found that the concentration of the excitatory neurotransmitter glutamate-glutamine (Glx) in the dorsolateral prefrontal cortex (Brodmann area 46) was positively correlated with the participant's tendency to motion assimilation over motion contrast; however, this effect was not observed in the visual areas. The concentration of the inhibitory neurotransmitter γ-aminobutyric acid had only a weak effect compared with that of Glx. We conclude that excitatory process in the suprasensory area is important for an individual's tendency to determine antagonistically perceived visual motion phenomena.This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).
Good initialization model with constrained body structure for scene text recognition

NASA Astrophysics Data System (ADS)

Zhu, Anna; Wang, Guoyou; Dong, Yangbo

2016-09-01

Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the promise of text recognition and affect the overall performance to a large extent. We proposed a good initialization model for scene character recognition from cropped text regions. We use constrained character's body structures with deformable part-based models to detect and recognize characters in various backgrounds. The character's body structures are achieved by an unsupervised discriminative clustering approach followed by a statistical model and a self-build minimum spanning tree model. Our method utilizes part appearance and location information, and combines character detection and recognition in cropped text region together. The evaluation results on the benchmark datasets demonstrate that our proposed scheme outperforms the state-of-the-art methods both on scene character recognition and word recognition aspects.
Research on three-dimensional visualization based on virtual reality and Internet

NASA Astrophysics Data System (ADS)

Wang, Zongmin; Yang, Haibo; Zhao, Hongling; Li, Jiren; Zhu, Qiang; Zhang, Xiaohong; Sun, Kai

2007-06-01

To disclose and display water information, a three-dimensional visualization system based on Virtual Reality (VR) and Internet is researched for demonstrating "digital water conservancy" application and also for routine management of reservoir. To explore and mine in-depth information, after completion of modeling high resolution DEM with reliable quality, topographical analysis, visibility analysis and reservoir volume computation are studied. And also, some parameters including slope, water level and NDVI are selected to classify easy-landslide zone in water-level-fluctuating zone of reservoir area. To establish virtual reservoir scene, two kinds of methods are used respectively for experiencing immersion, interaction and imagination (3I). First virtual scene contains more detailed textures to increase reality on graphical workstation with virtual reality engine Open Scene Graph (OSG). Second virtual scene is for internet users with fewer details for assuring fluent speed.
Ray tracing a three dimensional scene using a grid

DOEpatents

Wald, Ingo; Ize, Santiago; Parker, Steven G; Knoll, Aaron

2013-02-26

Ray tracing a three-dimensional scene using a grid. One example embodiment is a method for ray tracing a three-dimensional scene using a grid. In this example method, the three-dimensional scene is made up of objects that are spatially partitioned into a plurality of cells that make up the grid. The method includes a first act of computing a bounding frustum of a packet of rays, and a second act of traversing the grid slice by slice along a major traversal axis. Each slice traversal includes a first act of determining one or more cells in the slice that are overlapped by the frustum and a second act of testing the rays in the packet for intersection with any objects at least partially bounded by the one or more cells overlapped by the frustum.
[The research on bidirectional reflectance computer simulation of forest canopy at pixel scale].

PubMed

Song, Jin-Ling; Wang, Jin-Di; Shuai, Yan-Min; Xiao, Zhi-Qiang

2009-08-01

Computer simulation is based on computer graphics to generate the realistic 3D structure scene of vegetation, and to simulate the canopy regime using radiosity method. In the present paper, the authors expand the computer simulation model to simulate forest canopy bidirectional reflectance at pixel scale. But usually, the trees are complex structures, which are tall and have many branches. So there is almost a need for hundreds of thousands or even millions of facets to built up the realistic structure scene for the forest It is difficult for the radiosity method to compute so many facets. In order to make the radiosity method to simulate the forest scene at pixel scale, in the authors' research, the authors proposed one idea to simplify the structure of forest crowns, and abstract the crowns to ellipsoids. And based on the optical characteristics of the tree component and the characteristics of the internal energy transmission of photon in real crown, the authors valued the optical characteristics of ellipsoid surface facets. In the computer simulation of the forest, with the idea of geometrical optics model, the gap model is considered to get the forest canopy bidirectional reflectance at pixel scale. Comparing the computer simulation results with the GOMS model, and Multi-angle Imaging SpectroRadiometer (MISR) multi-angle remote sensing data, the simulation results are in agreement with the GOMS simulation result and MISR BRF. But there are also some problems to be solved. So the authors can conclude that the study has important value for the application of multi-angle remote sensing and the inversion of vegetation canopy structure parameters.
Dataset Curation through Renders and Ontology Matching

DTIC Science & Technology

2015-09-01

government .001 services .246 furniture store .299 health salon .998 assoc./organization* .029 insurance agency .103 gastronomy .001 gas & automotive .219...Stanley Michael Bileschi. StreetScenes: Towards scene understanding in still images. PhD thesis, Massachusetts Institute of Technology , 2006. 3.5, 4.4...computer animation]. In Compcon’96.’ Technologies for the Information Superhigh- way’Digest of Papers. IEEE, 1996. 2.1 Aharon Bar Hillel and Daphna Weinshall
Real-Time Mapping Using Stereoscopic Vision Optimization

DTIC Science & Technology

2005-03-01

pinhole geometry . . . . . . . . . . . . . . 17 2.8. Artificially textured scenes . . . . . . . . . . . . . . . . . . . . 23 3.1. Bilbo the robot...geometry. 2.2.1 The Fundamental Matrix. The fundamental matrix (F) describes the relationship between a pair of 2D pictures of a 3D scene . This is...eight CCD cameras to compute a mesh model of the environment from a large number of overlapped 3D images. In [1,17], a range scanner is combined with a
An interactive display system for large-scale 3D models

NASA Astrophysics Data System (ADS)

Liu, Zijian; Sun, Kun; Tao, Wenbing; Liu, Liman

2018-04-01

With the improvement of 3D reconstruction theory and the rapid development of computer hardware technology, the reconstructed 3D models are enlarging in scale and increasing in complexity. Models with tens of thousands of 3D points or triangular meshes are common in practical applications. Due to storage and computing power limitation, it is difficult to achieve real-time display and interaction with large scale 3D models for some common 3D display software, such as MeshLab. In this paper, we propose a display system for large-scale 3D scene models. We construct the LOD (Levels of Detail) model of the reconstructed 3D scene in advance, and then use an out-of-core view-dependent multi-resolution rendering scheme to realize the real-time display of the large-scale 3D model. With the proposed method, our display system is able to render in real time while roaming in the reconstructed scene and 3D camera poses can also be displayed. Furthermore, the memory consumption can be significantly decreased via internal and external memory exchange mechanism, so that it is possible to display a large scale reconstructed scene with over millions of 3D points or triangular meshes in a regular PC with only 4GB RAM.
Satellite Image Mosaic Engine

NASA Technical Reports Server (NTRS)

Plesea, Lucian

2006-01-01

A computer program automatically builds large, full-resolution mosaics of multispectral images of Earth landmasses from images acquired by Landsat 7, complete with matching of colors and blending between adjacent scenes. While the code has been used extensively for Landsat, it could also be used for other data sources. A single mosaic of as many as 8,000 scenes, represented by more than 5 terabytes of data and the largest set produced in this work, demonstrated what the code could do to provide global coverage. The program first statistically analyzes input images to determine areas of coverage and data-value distributions. It then transforms the input images from their original universal transverse Mercator coordinates to other geographical coordinates, with scaling. It applies a first-order polynomial brightness correction to each band in each scene. It uses a data-mask image for selecting data and blending of input scenes. Under control by a user, the program can be made to operate on small parts of the output image space, with check-point and restart capabilities. The program runs on SGI IRIX computers. It is capable of parallel processing using shared-memory code, large memories, and tens of central processing units. It can retrieve input data and store output data at locations remote from the processors on which it is executed.
Finding and recognizing objects in natural scenes: complementary computations in the dorsal and ventral visual systems

PubMed Central

Rolls, Edmund T.; Webb, Tristan J.

2014-01-01

Searching for and recognizing objects in complex natural scenes is implemented by multiple saccades until the eyes reach within the reduced receptive field sizes of inferior temporal cortex (IT) neurons. We analyze and model how the dorsal and ventral visual streams both contribute to this. Saliency detection in the dorsal visual system including area LIP is modeled by graph-based visual saliency, and allows the eyes to fixate potential objects within several degrees. Visual information at the fixated location subtending approximately 9° corresponding to the receptive fields of IT neurons is then passed through a four layer hierarchical model of the ventral cortical visual system, VisNet. We show that VisNet can be trained using a synaptic modification rule with a short-term memory trace of recent neuronal activity to capture both the required view and translation invariances to allow in the model approximately 90% correct object recognition for 4 objects shown in any view across a range of 135° anywhere in a scene. The model was able to generalize correctly within the four trained views and the 25 trained translations. This approach analyses the principles by which complementary computations in the dorsal and ventral visual cortical streams enable objects to be located and recognized in complex natural scenes. PMID:25161619
Auditory attention strategy depends on target linguistic properties and spatial configurationa)

PubMed Central

McCloy, Daniel R.; Lee, Adrian K. C.

2015-01-01

Whether crossing a busy intersection or attending a large dinner party, listeners sometimes need to attend to multiple spatially distributed sound sources or streams concurrently. How they achieve this is not clear—some studies suggest that listeners cannot truly simultaneously attend to separate streams, but instead combine attention switching with short-term memory to achieve something resembling divided attention. This paper presents two oddball detection experiments designed to investigate whether directing attention to phonetic versus semantic properties of the attended speech impacts listeners' ability to divide their auditory attention across spatial locations. Each experiment uses four spatially distinct streams of monosyllabic words, variation in cue type (providing phonetic or semantic information), and requiring attention to one or two locations. A rapid button-press response paradigm is employed to minimize the role of short-term memory in performing the task. Results show that differences in the spatial configuration of attended and unattended streams interact with linguistic properties of the speech streams to impact performance. Additionally, listeners may leverage phonetic information to make oddball detection judgments even when oddballs are semantically defined. Both of these effects appear to be mediated by the overall complexity of the acoustic scene. PMID:26233011
Emotional pictures and sounds: a review of multimodal interactions of emotion cues in multiple domains

PubMed Central

Gerdes, Antje B. M.; Wieser, Matthias J.; Alpers, Georg W.

2014-01-01

In everyday life, multiple sensory channels jointly trigger emotional experiences and one channel may alter processing in another channel. For example, seeing an emotional facial expression and hearing the voice’s emotional tone will jointly create the emotional experience. This example, where auditory and visual input is related to social communication, has gained considerable attention by researchers. However, interactions of visual and auditory emotional information are not limited to social communication but can extend to much broader contexts including human, animal, and environmental cues. In this article, we review current research on audiovisual emotion processing beyond face-voice stimuli to develop a broader perspective on multimodal interactions in emotion processing. We argue that current concepts of multimodality should be extended in considering an ecologically valid variety of stimuli in audiovisual emotion processing. Therefore, we provide an overview of studies in which emotional sounds and interactions with complex pictures of scenes were investigated. In addition to behavioral studies, we focus on neuroimaging, electro- and peripher-physiological findings. Furthermore, we integrate these findings and identify similarities or differences. We conclude with suggestions for future research. PMID:25520679
Interaction of Object Binding Cues in Binaural Masking Pattern Experiments.

PubMed

Verhey, Jesko L; Lübken, Björn; van de Par, Steven

2016-01-01

Object binding cues such as binaural and across-frequency modulation cues are likely to be used by the auditory system to separate sounds from different sources in complex auditory scenes. The present study investigates the interaction of these cues in a binaural masking pattern paradigm where a sinusoidal target is masked by a narrowband noise. It was hypothesised that beating between signal and masker may contribute to signal detection when signal and masker do not spectrally overlap but that this cue could not be used in combination with interaural cues. To test this hypothesis an additional sinusoidal interferer was added to the noise masker with a lower frequency than the noise whereas the target had a higher frequency than the noise. Thresholds increase when the interferer is added. This effect is largest when the spectral interferer-masker and masker-target distances are equal. The result supports the hypothesis that modulation cues contribute to signal detection in the classical masking paradigm and that these are analysed with modulation bandpass filters. A monaural model including an across-frequency modulation process is presented that account for this effect. Interestingly, the interferer also affects dichotic thresholds indicating that modulation cues also play a role in binaural processing.
Amazon river dolphins (Inia geoffrensis) use a high-frequency short-range biosonar.

PubMed

Ladegaard, Michael; Jensen, Frants Havmand; de Freitas, Mafalda; Ferreira da Silva, Vera Maria; Madsen, Peter Teglberg

2015-10-01

Toothed whales produce echolocation clicks with source parameters related to body size; however, it may be equally important to consider the influence of habitat, as suggested by studies on echolocating bats. A few toothed whale species have fully adapted to river systems, where sonar operation is likely to result in higher clutter and reverberation levels than those experienced by most toothed whales at sea because of the shallow water and dense vegetation. To test the hypothesis that habitat shapes the evolution of toothed whale biosonar parameters by promoting simpler auditory scenes to interpret in acoustically complex habitats, echolocation clicks of wild Amazon river dolphins were recorded using a vertical seven-hydrophone array. We identified 404 on-axis biosonar clicks having a mean SLpp of 190.3 ± 6.1 dB re. 1 µPa, mean SLEFD of 132.1 ± 6.0 dB re. 1 µPa(2)s, mean Fc of 101.2 ± 10.5 kHz, mean BWRMS of 29.3 ± 4.3 kHz and mean ICI of 35.1 ± 17.9 ms. Piston fit modelling resulted in an estimated half-power beamwidth of 10.2 deg (95% CI: 9.6-10.5 deg) and directivity index of 25.2 dB (95% CI: 24.9-25.7 dB). These results support the hypothesis that river-dwelling toothed whales operate their biosonars at lower amplitude and higher sampling rates than similar-sized marine species without sacrificing high directivity, in order to provide high update rates in acoustically complex habitats and simplify auditory scenes through reduced clutter and reverberation levels. We conclude that habitat, along with body size, is an important evolutionary driver of source parameters in toothed whale biosonars. © 2015. Published by The Company of Biologists Ltd.
Utilization of DIRSIG in support of real-time infrared scene generation

NASA Astrophysics Data System (ADS)

Sanders, Jeffrey S.; Brown, Scott D.

2000-07-01

Real-time infrared scene generation for hardware-in-the-loop has been a traditionally difficult challenge. Infrared scenes are usually generated using commercial hardware that was not designed to properly handle the thermal and environmental physics involved. Real-time infrared scenes typically lack details that are included in scenes rendered in no-real- time by ray-tracing programs such as the Digital Imaging and Remote Sensing Scene Generation (DIRSIG) program. However, executing DIRSIG in real-time while retaining all the physics is beyond current computational capabilities for many applications. DIRSIG is a first principles-based synthetic image generation model that produces multi- or hyper-spectral images in the 0.3 to 20 micron region of the electromagnetic spectrum. The DIRSIG model is an integrated collection of independent first principles based on sub-models, each of which works in conjunction to produce radiance field images with high radiometric fidelity. DIRSIG uses the MODTRAN radiation propagation model for exo-atmospheric irradiance, emitted and scattered radiances (upwelled and downwelled) and path transmission predictions. This radiometry submodel utilizes bidirectional reflectance data, accounts for specular and diffuse background contributions, and features path length dependent extinction and emission for transmissive bodies (plumes, clouds, etc.) which may be present in any target, background or solar path. This detailed environmental modeling greatly enhances the number of rendered features and hence, the fidelity of a rendered scene. While DIRSIG itself cannot currently be executed in real-time, its outputs can be used to provide scene inputs for real-time scene generators. These inputs can incorporate significant features such as target to background thermal interactions, static background object thermal shadowing, and partially transmissive countermeasures. All of these features represent significant improvements over the current state of the art in real-time IR scene generation.
Real-time synchronized multiple-sensor IR/EO scene generation utilizing the SGI Onyx2

NASA Astrophysics Data System (ADS)

Makar, Robert J.; O'Toole, Brian E.

1998-07-01

An approach to utilize the symmetric multiprocessing environment of the Silicon Graphics Inc.R (SGI) Onyx2TM has been developed to support the generation of IR/EO scenes in real-time. This development, supported by the Naval Air Warfare Center Aircraft Division (NAWC/AD), focuses on high frame rate hardware-in-the-loop testing of multiple sensor avionics systems. In the past, real-time IR/EO scene generators have been developed as custom architectures that were often expensive and difficult to maintain. Previous COTS scene generation systems, designed and optimized for visual simulation, could not be adapted for accurate IR/EO sensor stimulation. The new Onyx2 connection mesh architecture made it possible to develop a more economical system while maintaining the fidelity needed to stimulate actual sensors. An SGI based Real-time IR/EO Scene Simulator (RISS) system was developed to utilize the Onyx2's fast multiprocessing hardware to perform real-time IR/EO scene radiance calculations. During real-time scene simulation, the multiprocessors are used to update polygon vertex locations and compute radiometrically accurate floating point radiance values. The output of this process can be utilized to drive a variety of scene rendering engines. Recent advancements in COTS graphics systems, such as the Silicon Graphics InfiniteRealityR make a total COTS solution possible for some classes of sensors. This paper will discuss the critical technologies that apply to infrared scene generation and hardware-in-the-loop testing using SGI compatible hardware. Specifically, the application of RISS high-fidelity real-time radiance algorithms on the SGI Onyx2's multiprocessing hardware will be discussed. Also, issues relating to external real-time control of multiple synchronized scene generation channels will be addressed.

Rank preserving sparse learning for Kinect based scene classification.

PubMed

Tao, Dapeng; Jin, Lianwen; Yang, Zhao; Li, Xuelong

2013-10-01

With the rapid development of the RGB-D sensors and the promptly growing population of the low-cost Microsoft Kinect sensor, scene classification, which is a hard, yet important, problem in computer vision, has gained a resurgence of interest recently. That is because the depth of information provided by the Kinect sensor opens an effective and innovative way for scene classification. In this paper, we propose a new scheme for scene classification, which applies locality-constrained linear coding (LLC) to local SIFT features for representing the RGB-D samples and classifies scenes through the cooperation between a new rank preserving sparse learning (RPSL) based dimension reduction and a simple classification method. RPSL considers four aspects: 1) it preserves the rank order information of the within-class samples in a local patch; 2) it maximizes the margin between the between-class samples on the local patch; 3) the L1-norm penalty is introduced to obtain the parsimony property; and 4) it models the classification error minimization by utilizing the least-squares error minimization. Experiments are conducted on the NYU Depth V1 dataset and demonstrate the robustness and effectiveness of RPSL for scene classification.
Auditory Support in Linguistically Diverse Classrooms: Factors Related to Bilingual Text-to-Speech Use

ERIC Educational Resources Information Center

Van Laere, E.; Braak, J.

2017-01-01

Text-to-speech technology can act as an important support tool in computer-based learning environments (CBLEs) as it provides auditory input, next to on-screen text. Particularly for students who use a language at home other than the language of instruction (LOI) applied at school, text-to-speech can be useful. The CBLE E-Validiv offers content in…
The Effects of Computerized Auditory Feedback on Electronic Article Surveillance Tag Placement in an Auto-Parts Distribution Center

ERIC Educational Resources Information Center

Goomas, David T.

2008-01-01

In this report from the field, computerized auditory feedback was used to inform order selectors and order selector auditors in a distribution center to add an electronic article surveillance (EAS) adhesive tag. This was done by programming handheld computers to emit a loud beep for high-priced items upon scanning the item's bar-coded Universal…
Web-based auditory self-training system for adult and elderly users of hearing aids.

PubMed

Vitti, Simone Virginia; Blasca, Wanderléia Quinhoneiro; Sigulem, Daniel; Torres Pisa, Ivan

2015-01-01

Adults and elderly users of hearing aids suffer psychosocial reactions as a result of hearing loss. Auditory rehabilitation is typically carried out with support from a speech therapist, usually in a clinical center. For these cases, there is a lack of computer-based self-training tools for minimizing the psychosocial impact of hearing deficiency. To develop and evaluate a web-based auditory self-training system for adult and elderly users of hearing aids. Two modules were developed for the web system: an information module based on guidelines for using hearing aids; and an auditory training module presenting a sequence of training exercises for auditory abilities along the lines of the auditory skill steps within auditory processing. We built aweb system using PHP programming language and a MySQL database .from requirements surveyed through focus groups that were conducted by healthcare information technology experts. The web system was evaluated by speech therapists and hearing aid users. An initial sample of 150 patients at DSA/HRAC/USP was defined to apply the system with the inclusion criteria that: the individuals should be over the age of 25 years, presently have hearing impairment, be a hearing aid user, have a computer and have internet experience. They were divided into two groups: a control group (G1) and an experimental group (G2). These patients were evaluated clinically using the HHIE for adults and HHIA for elderly people, before and after system implementation. A third web group was formed with users who were invited through social networks for their opinions on using the system. A questionnaire evaluating hearing complaints was given to all three groups. The study hypothesis considered that G2 would present greater auditory perception, higher satisfaction and fewer complaints than G1 after the auditory training. It was expected that G3 would have fewer complaints regarding use and acceptance of the system. The web system, which was named SisTHA portal, was finalized, rated by experts and hearing aid users and approved for use. The system comprised auditory skills training along five lines: discrimination; recognition; comprehension and temporal sequencing; auditory closure; and cognitive-linguistic and communication strategies. Users needed to undergo auditory training over a minimum period of 1 month: 5 times a week for 30 minutes a day. Comparisons were made between G1 and G2 and web system use by G3. The web system developed was approved for release to hearing aid users. It is expected that the self-training will help improve effective use of hearing aids, thereby decreasing their rejection.
The Advance of Computing from the Ground to the Cloud

ERIC Educational Resources Information Center

Breeding, Marshall

2009-01-01

A trend toward the abstraction of computing platforms that has been developing in the broader IT arena over the last few years is just beginning to make inroads into the library technology scene. Cloud computing offers for libraries many interesting possibilities that may help reduce technology costs and increase capacity, reliability, and…
Experiencing simultanagnosia through windowed viewing of complex social scenes.

PubMed

Dalrymple, Kirsten A; Birmingham, Elina; Bischof, Walter F; Barton, Jason J S; Kingstone, Alan

2011-01-07

Simultanagnosia is a disorder of visual attention, defined as an inability to see more than one object at once. It has been conceived as being due to a constriction of the visual "window" of attention, a metaphor that we examine in the present article. A simultanagnosic patient (SL) and two non-simultanagnosic control patients (KC and ES) described social scenes while their eye movements were monitored. These data were compared to a group of healthy subjects who described the same scenes under the same conditions as the patients, or through an aperture that restricted their vision to a small portion of the scene. Experiment 1 demonstrated that SL showed unusually low proportions of fixations to the eyes in social scenes, which contrasted with all other participants who demonstrated the standard preferential bias toward eyes. Experiments 2 and 3 revealed that when healthy participants viewed scenes through a window that was contingent on where they looked (Experiment 2) or where they moved a computer mouse (Experiment 3), their behavior closely mirrored that of patient SL. These findings suggest that a constricted window of visual processing has important consequences for how simultanagnosic patients explore their world. Our paradigm's capacity to mimic simultanagnosic behaviors while viewing complex scenes implies that it may be a valid way of modeling simultanagnosia in healthy individuals, providing a useful tool for future research. More broadly, our results support the thesis that people fixate the eyes in social scenes because they are informative to the meaning of the scene. Copyright © 2010 Elsevier B.V. All rights reserved.
Using Neuroplasticity-Based Auditory Training to Improve Verbal Memory in Schizophrenia

PubMed Central

Fisher, Melissa; Holland, Christine; Merzenich, Michael M.; Vinogradov, Sophia

2009-01-01

Objective Impaired verbal memory in schizophrenia is a key rate-limiting factor for functional outcome, does not respond to currently available medications, and shows only modest improvement after conventional behavioral remediation. The authors investigated an innovative approach to the remediation of verbal memory in schizophrenia, based on principles derived from the basic neuroscience of learning-induced neuroplasticity. The authors report interim findings in this ongoing study. Method Fifty-five clinically stable schizophrenia subjects were randomly assigned to either 50 hours of computerized auditory training or a control condition using computer games. Those receiving auditory training engaged in daily computerized exercises that placed implicit, increasing demands on auditory perception through progressively more difficult auditory-verbal working memory and verbal learning tasks. Results Relative to the control group, subjects who received active training showed significant gains in global cognition, verbal working memory, and verbal learning and memory. They also showed reliable and significant improvement in auditory psychophysical performance; this improvement was significantly correlated with gains in verbal working memory and global cognition. Conclusions Intensive training in early auditory processes and auditory-verbal learning results in substantial gains in verbal cognitive processes relevant to psychosocial functioning in schizophrenia. These gains may be due to a training method that addresses the early perceptual impairments in the illness, that exploits intact mechanisms of repetitive practice in schizophrenia, and that uses an intensive, adaptive training approach. PMID:19448187
Auditory Cortical Plasticity Drives Training-Induced Cognitive Changes in Schizophrenia

PubMed Central

Dale, Corby L.; Brown, Ethan G.; Fisher, Melissa; Herman, Alexander B.; Dowling, Anne F.; Hinkley, Leighton B.; Subramaniam, Karuna; Nagarajan, Srikantan S.; Vinogradov, Sophia

2016-01-01

Schizophrenia is characterized by dysfunction in basic auditory processing, as well as higher-order operations of verbal learning and executive functions. We investigated whether targeted cognitive training of auditory processing improves neural responses to speech stimuli, and how these changes relate to higher-order cognitive functions. Patients with schizophrenia performed an auditory syllable identification task during magnetoencephalography before and after 50 hours of either targeted cognitive training or a computer games control. Healthy comparison subjects were assessed at baseline and after a 10 week no-contact interval. Prior to training, patients (N = 34) showed reduced M100 response in primary auditory cortex relative to healthy participants (N = 13). At reassessment, only the targeted cognitive training patient group (N = 18) exhibited increased M100 responses. Additionally, this group showed increased induced high gamma band activity within left dorsolateral prefrontal cortex immediately after stimulus presentation, and later in bilateral temporal cortices. Training-related changes in neural activity correlated with changes in executive function scores but not verbal learning and memory. These data suggest that computerized cognitive training that targets auditory and verbal learning operations enhances both sensory responses in auditory cortex as well as engagement of prefrontal regions, as indexed during an auditory processing task with low demands on working memory. This neural circuit enhancement is in turn associated with better executive function but not verbal memory. PMID:26152668
Study on general design of dual-DMD based infrared two-band scene simulation system

NASA Astrophysics Data System (ADS)

Pan, Yue; Qiao, Yang; Xu, Xi-ping

2017-02-01

Mid-wave infrared(MWIR) and long-wave infrared(LWIR) two-band scene simulation system is a kind of testing equipment that used for infrared two-band imaging seeker. Not only it would be qualified for working waveband, but also realize the essence requests that infrared radiation characteristics should correspond to the real scene. Past single-digital micromirror device (DMD) based infrared scene simulation system does not take the huge difference between targets and background radiation into account, and it cannot realize the separated modulation to two-band light beam. Consequently, single-DMD based infrared scene simulation system cannot accurately express the thermal scene model that upper-computer built, and it is not that practical. To solve the problem, we design a dual-DMD based, dual-channel, co-aperture, compact-structure infrared two-band scene simulation system. The operating principle of the system is introduced in detail, and energy transfer process of the hardware-in-the-loop simulation experiment is analyzed as well. Also, it builds the equation about the signal-to-noise ratio of infrared detector in the seeker, directing the system overall design. The general design scheme of system is given, including the creation of infrared scene model, overall control, optical-mechanical structure design and image registration. By analyzing and comparing the past designs, we discuss the arrangement of optical engine framework in the system. According to the main content of working principle and overall design, we summarize each key techniques in the system.
Differential Effects of Music and Video Gaming During Breaks on Auditory and Visual Learning.

PubMed

Liu, Shuyan; Kuschpel, Maxim S; Schad, Daniel J; Heinz, Andreas; Rapp, Michael A

2015-11-01

The interruption of learning processes by breaks filled with diverse activities is common in everyday life. This study investigated the effects of active computer gaming and passive relaxation (rest and music) breaks on auditory versus visual memory performance. Young adults were exposed to breaks involving (a) open eyes resting, (b) listening to music, and (c) playing a video game, immediately after memorizing auditory versus visual stimuli. To assess learning performance, words were recalled directly after the break (an 8:30 minute delay) and were recalled and recognized again after 7 days. Based on linear mixed-effects modeling, it was found that playing the Angry Birds video game during a short learning break impaired long-term retrieval in auditory learning but enhanced long-term retrieval in visual learning compared with the music and rest conditions. These differential effects of video games on visual versus auditory learning suggest specific interference of common break activities on learning.
The Role of Inhibition in a Computational Model of an Auditory Cortical Neuron during the Encoding of Temporal Information

PubMed Central

Bendor, Daniel

2015-01-01

In auditory cortex, temporal information within a sound is represented by two complementary neural codes: a temporal representation based on stimulus-locked firing and a rate representation, where discharge rate co-varies with the timing between acoustic events but lacks a stimulus-synchronized response. Using a computational neuronal model, we find that stimulus-locked responses are generated when sound-evoked excitation is combined with strong, delayed inhibition. In contrast to this, a non-synchronized rate representation is generated when the net excitation evoked by the sound is weak, which occurs when excitation is coincident and balanced with inhibition. Using single-unit recordings from awake marmosets (Callithrix jacchus), we validate several model predictions, including differences in the temporal fidelity, discharge rates and temporal dynamics of stimulus-evoked responses between neurons with rate and temporal representations. Together these data suggest that feedforward inhibition provides a parsimonious explanation of the neural coding dichotomy observed in auditory cortex. PMID:25879843
A robust pseudo-inverse spectral filter applied to the Earth Radiation Budget Experiment (ERBE) scanning channels

NASA Technical Reports Server (NTRS)

Avis, L. M.; Green, R. N.; Suttles, J. T.; Gupta, S. K.

1984-01-01

Computer simulations of a least squares estimator operating on the ERBE scanning channels are discussed. The estimator is designed to minimize the errors produced by nonideal spectral response to spectrally varying and uncertain radiant input. The three ERBE scanning channels cover a shortwave band a longwave band and a ""total'' band from which the pseudo inverse spectral filter estimates the radiance components in the shortwave band and a longwave band. The radiance estimator draws on instantaneous field of view (IFOV) scene type information supplied by another algorithm of the ERBE software, and on a priori probabilistic models of the responses of the scanning channels to the IFOV scene types for given Sun scene spacecraft geometry. It is found that the pseudoinverse spectral filter is stable, tolerant of errors in scene identification and in channel response modeling, and, in the absence of such errors, yields minimum variance and essentially unbiased radiance estimates.
Flies and humans share a motion estimation strategy that exploits natural scene statistics

PubMed Central

Clark, Damon A.; Fitzgerald, James E.; Ales, Justin M.; Gohl, Daryl M.; Silies, Marion A.; Norcia, Anthony M.; Clandinin, Thomas R.

2014-01-01

Sighted animals extract motion information from visual scenes by processing spatiotemporal patterns of light falling on the retina. The dominant models for motion estimation exploit intensity correlations only between pairs of points in space and time. Moving natural scenes, however, contain more complex correlations. Here we show that fly and human visual systems encode the combined direction and contrast polarity of moving edges using triple correlations that enhance motion estimation in natural environments. Both species extract triple correlations with neural substrates tuned for light or dark edges, and sensitivity to specific triple correlations is retained even as light and dark edge motion signals are combined. Thus, both species separately process light and dark image contrasts to capture motion signatures that can improve estimation accuracy. This striking convergence argues that statistical structures in natural scenes have profoundly affected visual processing, driving a common computational strategy over 500 million years of evolution. PMID:24390225
Scene analysis for effective visual search in rough three-dimensional-modeling scenes

NASA Astrophysics Data System (ADS)

Wang, Qi; Hu, Xiaopeng

2016-11-01

Visual search is a fundamental technology in the computer vision community. It is difficult to find an object in complex scenes when there exist similar distracters in the background. We propose a target search method in rough three-dimensional-modeling scenes based on a vision salience theory and camera imaging model. We give the definition of salience of objects (or features) and explain the way that salience measurements of objects are calculated. Also, we present one type of search path that guides to the target through salience objects. Along the search path, when the previous objects are localized, the search region of each subsequent object decreases, which is calculated through imaging model and an optimization method. The experimental results indicate that the proposed method is capable of resolving the ambiguities resulting from distracters containing similar visual features with the target, leading to an improvement of search speed by over 50%.
Scene recognition based on integrating active learning with dictionary learning

NASA Astrophysics Data System (ADS)

Wang, Chengxi; Yin, Xueyan; Yang, Lin; Gong, Chengrong; Zheng, Caixia; Yi, Yugen

2018-04-01

Scene recognition is a significant topic in the field of computer vision. Most of the existing scene recognition models require a large amount of labeled training samples to achieve a good performance. However, labeling image manually is a time consuming task and often unrealistic in practice. In order to gain satisfying recognition results when labeled samples are insufficient, this paper proposed a scene recognition algorithm named Integrating Active Learning and Dictionary Leaning (IALDL). IALDL adopts projective dictionary pair learning (DPL) as classifier and introduces active learning mechanism into DPL for improving its performance. When constructing sampling criterion in active learning, IALDL considers both the uncertainty and representativeness as the sampling criteria to effectively select the useful unlabeled samples from a given sample set for expanding the training dataset. Experiment results on three standard databases demonstrate the feasibility and validity of the proposed IALDL.
Simulation as an Engine of Physical Scene Understanding

DTIC Science & Technology

2013-11-05

critical to the origins of intelligence : Researchers in developmental psychology, language, animal cognition, and artificial intelligence (2–6) con- sider...implemented computationally in classic artificial intelligence systems (18–20). However, these systems have not attempted to engage with physical scene un...N00014-09-0124, N00014-07-1-0937, and 1015GNA126; by Qualcomm; and by Intelligence Advanced Research Project Activity Grant D10PC20023. 1. Marr D (1982
Video Segmentation Descriptors for Event Recognition

DTIC Science & Technology

2014-12-08

Velastin, 3D Extended Histogram of Oriented Gradients (3DHOG) for Classification of Road Users in Urban Scenes , BMVC, 2009. [3] M.-Y. Chen and A. Hauptmann...computed on 3D volume outputted by the hierarchical segmentation . Each video is described as follows. Each supertube is temporally divided in n-frame...strength of these descriptors is their adaptability to the scene variations since they are grounded on a video segmentation . This makes them naturally robust
Modulation of Visually Evoked Postural Responses by Contextual Visual, Haptic and Auditory Information: A ‘Virtual Reality Check’

PubMed Central

Meyer, Georg F.; Shao, Fei; White, Mark D.; Hopkins, Carl; Robotham, Antony J.

2013-01-01

Externally generated visual motion signals can cause the illusion of self-motion in space (vection) and corresponding visually evoked postural responses (VEPR). These VEPRs are not simple responses to optokinetic stimulation, but are modulated by the configuration of the environment. The aim of this paper is to explore what factors modulate VEPRs in a high quality virtual reality (VR) environment where real and virtual foreground objects served as static visual, auditory and haptic reference points. Data from four experiments on visually evoked postural responses show that: 1) visually evoked postural sway in the lateral direction is modulated by the presence of static anchor points that can be haptic, visual and auditory reference signals; 2) real objects and their matching virtual reality representations as visual anchors have different effects on postural sway; 3) visual motion in the anterior-posterior plane induces robust postural responses that are not modulated by the presence of reference signals or the reality of objects that can serve as visual anchors in the scene. We conclude that automatic postural responses for laterally moving visual stimuli are strongly influenced by the configuration and interpretation of the environment and draw on multisensory representations. Different postural responses were observed for real and virtual visual reference objects. On the basis that automatic visually evoked postural responses in high fidelity virtual environments should mimic those seen in real situations we propose to use the observed effect as a robust objective test for presence and fidelity in VR. PMID:23840760
Stimulus change detection in phasic auditory units in the frog midbrain: frequency and ear specific adaptation.

PubMed

Ponnath, Abhilash; Hoke, Kim L; Farris, Hamilton E

2013-04-01

Neural adaptation, a reduction in the response to a maintained stimulus, is an important mechanism for detecting stimulus change. Contributing to change detection is the fact that adaptation is often stimulus specific: adaptation to a particular stimulus reduces excitability to a specific subset of stimuli, while the ability to respond to other stimuli is unaffected. Phasic cells (e.g., cells responding to stimulus onset) are good candidates for detecting the most rapid changes in natural auditory scenes, as they exhibit fast and complete adaptation to an initial stimulus presentation. We made recordings of single phasic auditory units in the frog midbrain to determine if adaptation was specific to stimulus frequency and ear of input. In response to an instantaneous frequency step in a tone, 28% of phasic cells exhibited frequency specific adaptation based on a relative frequency change (delta-f=±16%). Frequency specific adaptation was not limited to frequency steps, however, as adaptation was also overcome during continuous frequency modulated stimuli and in response to spectral transients interrupting tones. The results suggest that adaptation is separated for peripheral (e.g., frequency) channels. This was tested directly using dichotic stimuli. In 45% of binaural phasic units, adaptation was ear specific: adaptation to stimulation of one ear did not affect responses to stimulation of the other ear. Thus, adaptation exhibited specificity for stimulus frequency and lateralization at the level of the midbrain. This mechanism could be employed to detect rapid stimulus change within and between sound sources in complex acoustic environments.
Is there a hearing aid for the thinking person?

PubMed

Hafter, Ervin R

2010-10-01

The history of auditory prosthesis has generally concentrated on bottom-up processing, that is, on audibility. However, a growing interest in top-down processing has focused on correlations between success with a hearing aid and such higher order processing as the patient's intelligence, problem solving and language skills, and the perceived effort of day-to-day listening. Examination of two cases of cognitive effects in hearing that illustrate less-often-studied issues: (1) Individual subjects in a study use different listening strategies, a fact that, if not known to the experimenter, can lead to errors in interpretation; (2) A measure of shared attention can point to otherwise unknown functional effects of an algorithm used in hearing aids. In the two examples described above: (1) Patients with cochlear implants served in a study of the binaural precedence effect, that is, echo suppression. (2) Individuals identifying speech-in-noise benefit from noise reduction (NR) when the criterion was improved performance in simultaneous tests of verbal memory or visual reaction times. Studies of hearing impairment, either in the laboratory or in a fitting session, should include study of the complex stimuli that make up the natural environment, conditions where the thinking auditory brain adopts strategies for dealing with large amounts of input data. In addition to well-known factors that must be included in communication, such things as familiarity, syntax, and semantics, the work here shows that strategic listening can affect even how we deal with seemingly simpler requirements, localizing sounds in a reverberant auditory scene and listening for speech in noise when busy with other cognitive tasks. American Academy of Audiology.

Stimulus change detection in phasic auditory units in the frog midbrain: frequency and ear specific adaptation

PubMed Central

Ponnath, Abhilash; Hoke, Kim L.

2013-01-01

Neural adaptation, a reduction in the response to a maintained stimulus, is an important mechanism for detecting stimulus change. Contributing to change detection is the fact that adaptation is often stimulus specific: adaptation to a particular stimulus reduces excitability to a specific subset of stimuli, while the ability to respond to other stimuli is unaffected. Phasic cells (e.g., cells responding to stimulus onset) are good candidates for detecting the most rapid changes in natural auditory scenes, as they exhibit fast and complete adaptation to an initial stimulus presentation. We made recordings of single phasic auditory units in the frog midbrain to determine if adaptation was specific to stimulus frequency and ear of input. In response to an instantaneous frequency step in a tone, 28 % of phasic cells exhibited frequency specific adaptation based on a relative frequency change (delta-f = ±16 %). Frequency specific adaptation was not limited to frequency steps, however, as adaptation was also overcome during continuous frequency modulated stimuli and in response to spectral transients interrupting tones. The results suggest that adaptation is separated for peripheral (e.g., frequency) channels. This was tested directly using dichotic stimuli. In 45 % of binaural phasic units, adaptation was ear specific: adaptation to stimulation of one ear did not affect responses to stimulation of the other ear. Thus, adaptation exhibited specificity for stimulus frequency and lateralization at the level of the midbrain. This mechanism could be employed to detect rapid stimulus change within and between sound sources in complex acoustic environments. PMID:23344947
Perception of Scenes in Different Sensory Modalities: A Result of Modal Completion.

PubMed

Gruber, Ronald R; Block, Richard A

2017-01-01

Dynamic perception includes amodal and modal completion, along with apparent movement. It fills temporal gaps for single objects. In 2 experiments, using 6 stimulus presentation conditions involving 3 sensory modalities, participants experienced 8-10 sequential stimuli (200 ms each) with interstimulus intervals (ISIs) of 0.25-7.0 s. Experiments focused on spatiotemporal completion (walking), featural completion (object changing), auditory, completion (falling bomb), and haptic changes (insect crawling). After each trial, participants judged whether they experienced the process of "happening " or whether they simply knew that the process must have occurred. The phenomenon was frequency independent, being reported at short ISIs but not at long ISIs. The phenomenon involves dynamic modal completion and possibly also conceptual processes.
Unique digital imagery interface between a silicon graphics computer and the kinetic kill vehicle hardware-in-the-loop simulator (KHILS) wideband infrared scene projector (WISP)

NASA Astrophysics Data System (ADS)

Erickson, Ricky A.; Moren, Stephen E.; Skalka, Marion S.

1998-07-01

Providing a flexible and reliable source of IR target imagery is absolutely essential for operation of an IR Scene Projector in a hardware-in-the-loop simulation environment. The Kinetic Kill Vehicle Hardware-in-the-Loop Simulator (KHILS) at Eglin AFB provides the capability, and requisite interfaces, to supply target IR imagery to its Wideband IR Scene Projector (WISP) from three separate sources at frame rates ranging from 30 - 120 Hz. Video can be input from a VCR source at the conventional 30 Hz frame rate. Pre-canned digital imagery and test patterns can be downloaded into stored memory from the host processor and played back as individual still frames or movie sequences up to a 120 Hz frame rate. Dynamic real-time imagery to the KHILS WISP projector system, at a 120 Hz frame rate, can be provided from a Silicon Graphics Onyx computer system normally used for generation of digital IR imagery through a custom CSA-built interface which is available for either the SGI/DVP or SGI/DD02 interface port. The primary focus of this paper is to describe our technical approach and experience in the development of this unique SGI computer and WISP projector interface.
Edge co-occurrences can account for rapid categorization of natural versus animal images

NASA Astrophysics Data System (ADS)

Perrinet, Laurent U.; Bednar, James A.

2015-06-01

Making a judgment about the semantic category of a visual scene, such as whether it contains an animal, is typically assumed to involve high-level associative brain areas. Previous explanations require progressively analyzing the scene hierarchically at increasing levels of abstraction, from edge extraction to mid-level object recognition and then object categorization. Here we show that the statistics of edge co-occurrences alone are sufficient to perform a rough yet robust (translation, scale, and rotation invariant) scene categorization. We first extracted the edges from images using a scale-space analysis coupled with a sparse coding algorithm. We then computed the “association field” for different categories (natural, man-made, or containing an animal) by computing the statistics of edge co-occurrences. These differed strongly, with animal images having more curved configurations. We show that this geometry alone is sufficient for categorization, and that the pattern of errors made by humans is consistent with this procedure. Because these statistics could be measured as early as the primary visual cortex, the results challenge widely held assumptions about the flow of computations in the visual system. The results also suggest new algorithms for image classification and signal processing that exploit correlations between low-level structure and the underlying semantic category.
Towards a truly mobile auditory brain-computer interface: exploring the P300 to take away.

PubMed

De Vos, Maarten; Gandras, Katharina; Debener, Stefan

2014-01-01

In a previous study we presented a low-cost, small, and wireless 14-channel EEG system suitable for field recordings (Debener et al., 2012, psychophysiology). In the present follow-up study we investigated whether a single-trial P300 response can be reliably measured with this system, while subjects freely walk outdoors. Twenty healthy participants performed a three-class auditory oddball task, which included rare target and non-target distractor stimuli presented with equal probabilities of 16%. Data were recorded in a seated (control condition) and in a walking condition, both of which were realized outdoors. A significantly larger P300 event-related potential amplitude was evident for targets compared to distractors (p<.001), but no significant interaction with recording condition emerged. P300 single-trial analysis was performed with regularized stepwise linear discriminant analysis and revealed above chance-level classification accuracies for most participants (19 out of 20 for the seated, 16 out of 20 for the walking condition), with mean classification accuracies of 71% (seated) and 64% (walking). Moreover, the resulting information transfer rates for the seated and walking conditions were comparable to a recently published laboratory auditory brain-computer interface (BCI) study. This leads us to conclude that a truly mobile auditory BCI system is feasible. © 2013.
Initial progress in the recording of crime scene simulations using 3D laser structured light imagery techniques for law enforcement and forensic applications

NASA Astrophysics Data System (ADS)

Altschuler, Bruce R.; Monson, Keith L.

1998-03-01

Representation of crime scenes as virtual reality 3D computer displays promises to become a useful and important tool for law enforcement evaluation and analysis, forensic identification and pathological study and archival presentation during court proceedings. Use of these methods for assessment of evidentiary materials demands complete accuracy of reproduction of the original scene, both in data collection and in its eventual virtual reality representation. The recording of spatially accurate information as soon as possible after first arrival of law enforcement personnel is advantageous for unstable or hazardous crime scenes and reduces the possibility that either inadvertent measurement error or deliberate falsification may occur or be alleged concerning processing of a scene. Detailed measurements and multimedia archiving of critical surface topographical details in a calibrated, uniform, consistent and standardized quantitative 3D coordinate method are needed. These methods would afford professional personnel in initial contact with a crime scene the means for remote, non-contacting, immediate, thorough and unequivocal documentation of the contents of the scene. Measurements of the relative and absolute global positions of object sand victims, and their dispositions within the scene before their relocation and detailed examination, could be made. Resolution must be sufficient to map both small and large objects. Equipment must be able to map regions at varied resolution as collected from different perspectives. Progress is presented in devising methods for collecting and archiving 3D spatial numerical data from crime scenes, sufficient for law enforcement needs, by remote laser structured light and video imagery. Two types of simulation studies were done. One study evaluated the potential of 3D topographic mapping and 3D telepresence using a robotic platform for explosive ordnance disassembly. The second study involved using the laser mapping system on a fixed optical bench with simulated crime scene models of the people and furniture to assess feasibility, requirements and utility of such a system for crime scene documentation and analysis.
Photogrammetry and remote sensing for visualization of spatial data in a virtual reality environment

NASA Astrophysics Data System (ADS)

Bhagawati, Dwipen

2001-07-01

Researchers in many disciplines have started using the tool of Virtual Reality (VR) to gain new insights into problems in their respective disciplines. Recent advances in computer graphics, software and hardware technologies have created many opportunities for VR systems, advanced scientific and engineering applications being among them. In Geometronics, generally photogrammetry and remote sensing are used for management of spatial data inventory. VR technology can be suitably used for management of spatial data inventory. This research demonstrates usefulness of VR technology for inventory management by taking the roadside features as a case study. Management of roadside feature inventory involves positioning and visualization of the features. This research has developed a methodology to demonstrate how photogrammetric principles can be used to position the features using the video-logging images and GPS camera positioning and how image analysis can help produce appropriate texture for building the VR, which then can be visualized in a Cave Augmented Virtual Environment (CAVE). VR modeling was implemented in two stages to demonstrate the different approaches for modeling the VR scene. A simulated highway scene was implemented with the brute force approach, while modeling software was used to model the real world scene using feature positions produced in this research. The first approach demonstrates an implementation of the scene by writing C++ codes to include a multi-level wand menu for interaction with the scene that enables the user to interact with the scene. The interactions include editing the features inside the CAVE display, navigating inside the scene, and performing limited geographic analysis. The second approach demonstrates creation of a VR scene for a real roadway environment using feature positions determined in this research. The scene looks realistic with textures from the real site mapped on to the geometry of the scene. Remote sensing and digital image processing techniques were used for texturing the roadway features in this scene.
The New Film Technologies: Computerized Video-Assisted Film Production.

ERIC Educational Resources Information Center

Mott, Donald R.

Over the past few years, video technology has been used to assist film directors after they have shot a scene, to control costs, and to create special effects, especially computer assisted graphics. At present, a computer based editing system called "Film 5" combines computer technology and video tape with film to save as much as 50% of…
An overview of computer vision

NASA Technical Reports Server (NTRS)

Gevarter, W. B.

1982-01-01

An overview of computer vision is provided. Image understanding and scene analysis are emphasized, and pertinent aspects of pattern recognition are treated. The basic approach to computer vision systems, the techniques utilized, applications, the current existing systems and state-of-the-art issues and research requirements, who is doing it and who is funding it, and future trends and expectations are reviewed.
Computer Presentational Features for Young Children's Preferential Selection and Recall of Information.

ERIC Educational Resources Information Center

Calvert, Sandra L.; And Others

The purpose of this study was to examine the impact of visual and auditory presentational features on young children's selection and memory for verbally presented content. Assessed as a function of action and sound were preschool children's preferential selection and recall of words presented in a computer microworld. A computer microworld…
Learning Disabilities and the Auditory and Visual Matching Computer Program

ERIC Educational Resources Information Center

Tormanen, Minna R. K.; Takala, Marjatta; Sajaniemi, Nina

2008-01-01

This study examined whether audiovisual computer training without linguistic material had a remedial effect on different learning disabilities, like dyslexia and ADD (Attention Deficit Disorder). This study applied a pre-test-intervention-post-test design with students (N = 62) between the ages of 7 and 19. The computer training lasted eight weeks…
A self-synchronized high speed computational ghost imaging system: A leap towards dynamic capturing

NASA Astrophysics Data System (ADS)

Suo, Jinli; Bian, Liheng; Xiao, Yudong; Wang, Yongjin; Zhang, Lei; Dai, Qionghai

2015-11-01

High quality computational ghost imaging needs to acquire a large number of correlated measurements between the to-be-imaged scene and different reference patterns, thus ultra-high speed data acquisition is of crucial importance in real applications. To raise the acquisition efficiency, this paper reports a high speed computational ghost imaging system using a 20 kHz spatial light modulator together with a 2 MHz photodiode. Technically, the synchronization between such high frequency illumination and bucket detector needs nanosecond trigger precision, so the development of synchronization module is quite challenging. To handle this problem, we propose a simple and effective computational self-synchronization scheme by building a general mathematical model and introducing a high precision synchronization technique. The resulted efficiency is around 14 times faster than state-of-the-arts, and takes an important step towards ghost imaging of dynamic scenes. Besides, the proposed scheme is a general approach with high flexibility for readily incorporating other illuminators and detectors.
New HRCT-based measurement of the human outer ear canal as a basis for acoustical methods.

PubMed

Grewe, Johanna; Thiele, Cornelia; Mojallal, Hamidreza; Raab, Peter; Sankowsky-Rothe, Tobias; Lenarz, Thomas; Blau, Matthias; Teschner, Magnus

2013-06-01

As the form and size of the external auditory canal determine its transmitting function and hence the sound pressure in front of the eardrum, it is important to understand its anatomy in order to develop, optimize, and compare acoustical methods. High-resolution computed tomography (HRCT) data were measured retrospectively for 100 patients who had received a cochlear implant. In order to visualize the anatomy of the auditory canal, its length, radius, and the angle at which it runs were determined for the patients’ right and left ears. The canal’s volume was calculated, and a radius function was created. The determined length of the auditory canal averaged 23.6 mm for the right ear and 23.5 mm for the left ear. The calculated auditory canal volume (Vtotal) was 0.7 ml for the right ear and 0.69 ml for the left ear. The auditory canal was found to be significantly longer in men than in women, and the volume greater. The values obtained can be employed to develop a method that represents the shape of the auditory canal as accurately as possible to allow the best possible outcomes for hearing aid fitting.
Demonstrations of simple and complex auditory psychophysics for multiple platforms and environments

NASA Astrophysics Data System (ADS)

Horowitz, Seth S.; Simmons, Andrea M.; Blue, China

2005-09-01

Sound is arguably the most widely perceived and pervasive form of energy in our world, and among the least understood, in part due to the complexity of its underlying principles. A series of interactive displays has been developed which demonstrates that the nature of sound involves the propagation of energy through space, and illustrates the definition of psychoacoustics, which is how listeners map the physical aspects of sound and vibration onto their brains. These displays use auditory illusions and commonly experienced music and sound in novel presentations (using interactive computer algorithms) to show that what you hear is not always what you get. The areas covered in these demonstrations range from simple and complex auditory localization, which illustrate why humans are bad at echolocation but excellent at determining the contents of auditory space, to auditory illusions that manipulate fine phase information and make the listener think their head is changing size. Another demonstration shows how auditory and visual localization coincide and sound can be used to change visual tracking. These demonstrations are designed to run on a wide variety of student accessible platforms including web pages, stand-alone presentations, or even hardware-based systems for museum displays.
Modulation frequency as a cue for auditory speed perception.

PubMed

Senna, Irene; Parise, Cesare V; Ernst, Marc O

2017-07-12

Unlike vision, the mechanisms underlying auditory motion perception are poorly understood. Here we describe an auditory motion illusion revealing a novel cue to auditory speed perception: the temporal frequency of amplitude modulation (AM-frequency), typical for rattling sounds. Naturally, corrugated objects sliding across each other generate rattling sounds whose AM-frequency tends to directly correlate with speed. We found that AM-frequency modulates auditory speed perception in a highly systematic fashion: moving sounds with higher AM-frequency are perceived as moving faster than sounds with lower AM-frequency. Even more interestingly, sounds with higher AM-frequency also induce stronger motion aftereffects. This reveals the existence of specialized neural mechanisms for auditory motion perception, which are sensitive to AM-frequency. Thus, in spatial hearing, the brain successfully capitalizes on the AM-frequency of rattling sounds to estimate the speed of moving objects. This tightly parallels previous findings in motion vision, where spatio-temporal frequency of moving displays systematically affects both speed perception and the magnitude of the motion aftereffects. Such an analogy with vision suggests that motion detection may rely on canonical computations, with similar neural mechanisms shared across the different modalities. © 2017 The Author(s).
Statistics of high-level scene context.

PubMed

Greene, Michelle R

2013-01-01

CONTEXT IS CRITICAL FOR RECOGNIZING ENVIRONMENTS AND FOR SEARCHING FOR OBJECTS WITHIN THEM: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48, 167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed "things" in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information. Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition.
Relative size of auditory pathways in symmetrically and asymmetrically eared owls.

PubMed

Gutiérrez-Ibáñez, Cristián; Iwaniuk, Andrew N; Wylie, Douglas R

2011-01-01

Owls are highly efficient predators with a specialized auditory system designed to aid in the localization of prey. One of the most unique anatomical features of the owl auditory system is the evolution of vertically asymmetrical ears in some species, which improves their ability to localize the elevational component of a sound stimulus. In the asymmetrically eared barn owl, interaural time differences (ITD) are used to localize sounds in azimuth, whereas interaural level differences (ILD) are used to localize sounds in elevation. These two features are processed independently in two separate neural pathways that converge in the external nucleus of the inferior colliculus to form an auditory map of space. Here, we present a comparison of the relative volume of 11 auditory nuclei in both the ITD and the ILD pathways of 8 species of symmetrically and asymmetrically eared owls in order to investigate evolutionary changes in the auditory pathways in relation to ear asymmetry. Overall, our results indicate that asymmetrically eared owls have much larger auditory nuclei than owls with symmetrical ears. In asymmetrically eared owls we found that both the ITD and ILD pathways are equally enlarged, and other auditory nuclei, not directly involved in binaural comparisons, are also enlarged. We suggest that the hypertrophy of auditory nuclei in asymmetrically eared owls likely reflects both an improved ability to precisely locate sounds in space and an expansion of the hearing range. Additionally, our results suggest that the hypertrophy of nuclei that compute space may have preceded that of the expansion of the hearing range and evolutionary changes in the size of the auditory system occurred independently of phylogeny. Copyright © 2011 S. Karger AG, Basel.
Computer image generation: Reconfigurability as a strategy in high fidelity space applications

NASA Technical Reports Server (NTRS)

Bartholomew, Michael J.

1989-01-01

The demand for realistic, high fidelity, computer image generation systems to support space simulation is well established. However, as the number and diversity of space applications increase, the complexity and cost of computer image generation systems also increase. One strategy used to harmonize cost with varied requirements is establishment of a reconfigurable image generation system that can be adapted rapidly and easily to meet new and changing requirements. The reconfigurability strategy through the life cycle of system conception, specification, design, implementation, operation, and support for high fidelity computer image generation systems are discussed. The discussion is limited to those issues directly associated with reconfigurability and adaptability of a specialized scene generation system in a multi-faceted space applications environment. Examples and insights gained through the recent development and installation of the Improved Multi-function Scene Generation System at Johnson Space Center, Systems Engineering Simulator are reviewed and compared with current simulator industry practices. The results are clear; the strategy of reconfigurability applied to space simulation requirements provides a viable path to supporting diverse applications with an adaptable computer image generation system.
New light field camera based on physical based rendering tracing

NASA Astrophysics Data System (ADS)

Chung, Ming-Han; Chang, Shan-Ching; Lee, Chih-Kung

2014-03-01

Even though light field technology was first invented more than 50 years ago, it did not gain popularity due to the limitation imposed by the computation technology. With the rapid advancement of computer technology over the last decade, the limitation has been uplifted and the light field technology quickly returns to the spotlight of the research stage. In this paper, PBRT (Physical Based Rendering Tracing) was introduced to overcome the limitation of using traditional optical simulation approach to study the light field camera technology. More specifically, traditional optical simulation approach can only present light energy distribution but typically lack the capability to present the pictures in realistic scenes. By using PBRT, which was developed to create virtual scenes, 4D light field information was obtained to conduct initial data analysis and calculation. This PBRT approach was also used to explore the light field data calculation potential in creating realistic photos. Furthermore, we integrated the optical experimental measurement results with PBRT in order to place the real measurement results into the virtually created scenes. In other words, our approach provided us with a way to establish a link of virtual scene with the real measurement results. Several images developed based on the above-mentioned approaches were analyzed and discussed to verify the pros and cons of the newly developed PBRT based light field camera technology. It will be shown that this newly developed light field camera approach can circumvent the loss of spatial resolution associated with adopting a micro-lens array in front of the image sensors. Detailed operational constraint, performance metrics, computation resources needed, etc. associated with this newly developed light field camera technique were presented in detail.
Locally excitatory, globally inhibitory oscillator networks: theory and application to scene segmentation

NASA Astrophysics Data System (ADS)

Wang, DeLiang; Terman, David

1995-01-01

A novel class of locally excitatory, globally inhibitory oscillator networks (LEGION) is proposed and investigated analytically and by computer simulation. The model of each oscillator corresponds to a standard relaxation oscillator with two time scales. The network exhibits a mechanism of selective gating, whereby an oscillator jumping up to its active phase rapidly recruits the oscillators stimulated by the same pattern, while preventing other oscillators from jumping up. We show analytically that with the selective gating mechanism the network rapidly achieves both synchronization within blocks of oscillators that are stimulated by connected regions and desynchronization between different blocks. Computer simulations demonstrate LEGION's promising ability for segmenting multiple input patterns in real time. This model lays a physical foundation for the oscillatory correlation theory of feature binding, and may provide an effective computational framework for scene segmentation and figure/ground segregation.

StreetScenes: Towards Scene Understanding in Still Images

DTIC Science & Technology

2006-05-01

Riesenhuber and Poggio’s C1 features from their Standard Model [91], Berg and Malik’s geometric blur [7], Belongie et al.’s shape context [5], and Dalal and...images was performed by payed employees of the Center for Biological and Computational Learning (CBCL) during the period between January 2003 and July...single example by feature replacement. In CVPR, 2005. [5] S. Belongie , J. Malik, and J. Puzicha. Shape matching and object recognition using shape
Best-next-view algorithm for three-dimensional scene reconstruction using range images

NASA Astrophysics Data System (ADS)

Banta, J. E.; Zhien, Yu; Wang, X. Z.; Zhang, G.; Smith, M. T.; Abidi, Mongi A.

1995-10-01

The primary focus of the research detailed in this paper is to develop an intelligent sensing module capable of automatically determining the optimal next sensor position and orientation during scene reconstruction. To facilitate a solution to this problem, we have assembled a system for reconstructing a 3D model of an object or scene from a sequence of range images. Candidates for the best-next-view position are determined by detecting and measuring occlusions to the range camera's view in an image. Ultimately, the candidate which will reveal the greatest amount of unknown scene information is selected as the best-next-view position. Our algorithm uses ray tracing to determine how much new information a given sensor perspective will reveal. We have tested our algorithm successfully on several synthetic range data streams, and found the system's results to be consistent with an intuitive human search. The models recovered by our system from range data compared well with the ideal models. Essentially, we have proven that range information of physical objects can be employed to automatically reconstruct a satisfactory dynamic 3D computer model at a minimal computational expense. This has obvious implications in the contexts of robot navigation, manufacturing, and hazardous materials handling. The algorithm we developed takes advantage of no a priori information in finding the best-next-view position.
Scene-based nonuniformity correction for airborne point target detection systems.

PubMed

Zhou, Dabiao; Wang, Dejiang; Huo, Lijun; Liu, Rang; Jia, Ping

2017-06-26

Images acquired by airborne infrared search and track (IRST) systems are often characterized by nonuniform noise. In this paper, a scene-based nonuniformity correction method for infrared focal-plane arrays (FPAs) is proposed based on the constant statistics of the received radiation ratios of adjacent pixels. The gain of each pixel is computed recursively based on the ratios between adjacent pixels, which are estimated through a median operation. Then, an elaborate mathematical model describing the error propagation, derived from random noise and the recursive calculation procedure, is established. The proposed method maintains the characteristics of traditional methods in calibrating the whole electro-optics chain, in compensating for temporal drifts, and in not preserving the radiometric accuracy of the system. Moreover, the proposed method is robust since the frame number is the only variant, and is suitable for real-time applications owing to its low computational complexity and simplicity of implementation. The experimental results, on different scenes from a proof-of-concept point target detection system with a long-wave Sofradir FPA, demonstrate the compelling performance of the proposed method.
Scene-based nonuniformity correction for focal plane arrays by the method of the inverse covariance form.

PubMed

Torres, Sergio N; Pezoa, Jorge E; Hayat, Majeed M

2003-10-10

What is to our knowledge a new scene-based algorithm for nonuniformity correction in infrared focal-plane array sensors has been developed. The technique is based on the inverse covariance form of the Kalman filter (KF), which has been reported previously and used in estimating the gain and bias of each detector in the array from scene data. The gain and the bias of each detector in the focal-plane array are assumed constant within a given sequence of frames, corresponding to a certain time and operational conditions, but they are allowed to randomly drift from one sequence to another following a discrete-time Gauss-Markov process. The inverse covariance form filter estimates the gain and the bias of each detector in the focal-plane array and optimally updates them as they drift in time. The estimation is performed with considerably higher computational efficiency than the equivalent KF. The ability of the algorithm in compensating for fixed-pattern noise in infrared imagery and in reducing the computational complexity is demonstrated by use of both simulated and real data.
Salient contour extraction from complex natural scene in night vision image

NASA Astrophysics Data System (ADS)

Han, Jing; Yue, Jiang; Zhang, Yi; Bai, Lian-fa

2014-03-01

The theory of center-surround interaction in non-classical receptive field can be applied in night vision information processing. In this work, an optimized compound receptive field modulation method is proposed to extract salient contour from complex natural scene in low-light-level (LLL) and infrared images. The kernel idea is that multi-feature analysis can recognize the inhomogeneity in modulatory coverage more accurately and that center and surround with the grouping structure satisfying Gestalt rule deserves high connection-probability. Computationally, a multi-feature contrast weighted inhibition model is presented to suppress background and lower mutual inhibition among contour elements; a fuzzy connection facilitation model is proposed to achieve the enhancement of contour response, the connection of discontinuous contour and the further elimination of randomly distributed noise and texture; a multi-scale iterative attention method is designed to accomplish dynamic modulation process and extract contours of targets in multi-size. This work provides a series of biologically motivated computational visual models with high-performance for contour detection from cluttered scene in night vision images.
Unconscious analyses of visual scenes based on feature conjunctions.

PubMed

Tachibana, Ryosuke; Noguchi, Yasuki

2015-06-01

To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved).
Modeling Of Object- And Scene-Prototypes With Hierarchically Structured Classes

NASA Astrophysics Data System (ADS)

Ren, Z.; Jensch, P.; Ameling, W.

1989-03-01

The success of knowledge-based image analysis methodology and implementation tools depends largely on an appropriately and efficiently built model wherein the domain-specific context information about and the inherent structure of the observed image scene have been encoded. For identifying an object in an application environment a computer vision system needs to know firstly the description of the object to be found in an image or in an image sequence, secondly the corresponding relationships between object descriptions within the image sequence. This paper presents models of image objects scenes by means of hierarchically structured classes. Using the topovisual formalism of graph and higraph, we are currently studying principally the relational aspect and data abstraction of the modeling in order to visualize the structural nature resident in image objects and scenes, and to formalize. their descriptions. The goal is to expose the structure of image scene and the correspondence of image objects in the low level image interpretation. process. The object-based system design approach has been applied to build the model base. We utilize the object-oriented programming language C + + for designing, testing and implementing the abstracted entity classes and the operation structures which have been modeled topovisually. The reference images used for modeling prototypes of objects and scenes are from industrial environments as'well as medical applications.
Measuring the performance of visual to auditory information conversion.

PubMed

Tan, Shern Shiou; Maul, Tomás Henrique Bode; Mennie, Neil Russell

2013-01-01

Visual to auditory conversion systems have been in existence for several decades. Besides being among the front runners in providing visual capabilities to blind users, the auditory cues generated from image sonification systems are still easier to learn and adapt to compared to other similar techniques. Other advantages include low cost, easy customizability, and universality. However, every system developed so far has its own set of strengths and weaknesses. In order to improve these systems further, we propose an automated and quantitative method to measure the performance of such systems. With these quantitative measurements, it is possible to gauge the relative strengths and weaknesses of different systems and rank the systems accordingly. Performance is measured by both the interpretability and also the information preservation of visual to auditory conversions. Interpretability is measured by computing the correlation of inter image distance (IID) and inter sound distance (ISD) whereas the information preservation is computed by applying Information Theory to measure the entropy of both visual and corresponding auditory signals. These measurements provide a basis and some insights on how the systems work. With an automated interpretability measure as a standard, more image sonification systems can be developed, compared, and then improved. Even though the measure does not test systems as thoroughly as carefully designed psychological experiments, a quantitative measurement like the one proposed here can compare systems to a certain degree without incurring much cost. Underlying this research is the hope that a major breakthrough in image sonification systems will allow blind users to cost effectively regain enough visual functions to allow them to lead secure and productive lives.
Psychophysics and Neuronal Bases of Sound Localization in Humans

PubMed Central

Ahveninen, Jyrki; Kopco, Norbert; Jääskeläinen, Iiro P.

2013-01-01

Localization of sound sources is a considerable computational challenge for the human brain. Whereas the visual system can process basic spatial information in parallel, the auditory system lacks a straightforward correspondence between external spatial locations and sensory receptive fields. Consequently, the question how different acoustic features supporting spatial hearing are represented in the central nervous system is still open. Functional neuroimaging studies in humans have provided evidence for a posterior auditory “where” pathway that encompasses non-primary auditory cortex areas, including the planum temporale (PT) and posterior superior temporal gyrus (STG), which are strongly activated by horizontal sound direction changes, distance changes, and movement. However, these areas are also activated by a wide variety of other stimulus features, posing a challenge for the interpretation that the underlying areas are purely spatial. This review discusses behavioral and neuroimaging studies on sound localization, and some of the competing models of representation of auditory space in humans. PMID:23886698
Taming Crowded Visual Scenes

DTIC Science & Technology

2014-08-12

Nolan Warner, Mubarak Shah. Tracking in Dense Crowds Using Prominenceand Neighborhood Motion Concurrence, IEEE Transactions on Pattern Analysis...of computer vision, computer graphics and evacuation dynamics by providing a common platform, and provides...areas that includes Computer Vision, Computer Graphics , and Pedestrian Evacuation Dynamics. Despite the
The human auditory evoked response

NASA Technical Reports Server (NTRS)

Galambos, R.

1974-01-01

Figures are presented of computer-averaged auditory evoked responses (AERs) that point to the existence of a completely endogenous brain event. A series of regular clicks or tones was administered to the ear, and 'odd-balls' of different intensity or frequency respectively were included. Subjects were asked either to ignore the sounds (to read or do something else) or to attend to the stimuli. When they listened and counted the odd-balls, a P3 wave occurred at 300msec after stimulus. When the odd-balls consisted of omitted clicks or tone bursts, a similar response was observed. This could not have come from auditory nerve, but only from cortex. It is evidence of recognition, a conscious process.
A behavioral framework to guide research on central auditory development and plasticity

PubMed Central

Sanes, Dan H.; Woolley, Sarah M. N.

2011-01-01

The auditory CNS is influenced profoundly by sounds heard during development. Auditory deprivation and augmented sound exposure can each perturb the maturation of neural computations as well as their underlying synaptic properties. However, we have learned little about the emergence of perceptual skills in these same model systems, and especially how perception is influenced by early acoustic experience. Here, we argue that developmental studies must take greater advantage of behavioral benchmarks. We discuss quantitative measures of perceptual development, and suggest how they can play a much larger role in guiding experimental design. Most importantly, including behavioral measures will allow us to establish empirical connections among environment, neural development, and perception. PMID:22196328
Coding for parallel execution of hardware-in-the-loop millimeter-wave scene generation models on multicore SIMD processor architectures

NASA Astrophysics Data System (ADS)

Olson, Richard F.

2013-05-01

Rendering of point scatterer based radar scenes for millimeter wave (mmW) seeker tests in real-time hardware-in-the-loop (HWIL) scene generation requires efficient algorithms and vector-friendly computer architectures for complex signal synthesis. New processor technology from Intel implements an extended 256-bit vector SIMD instruction set (AVX, AVX2) in a multi-core CPU design providing peak execution rates of hundreds of GigaFLOPS (GFLOPS) on one chip. Real world mmW scene generation code can approach peak SIMD execution rates only after careful algorithm and source code design. An effective software design will maintain high computing intensity emphasizing register-to-register SIMD arithmetic operations over data movement between CPU caches or off-chip memories. Engineers at the U.S. Army Aviation and Missile Research, Development and Engineering Center (AMRDEC) applied two basic parallel coding methods to assess new 256-bit SIMD multi-core architectures for mmW scene generation in HWIL. These include use of POSIX threads built on vector library functions and more portable, highlevel parallel code based on compiler technology (e.g. OpenMP pragmas and SIMD autovectorization). Since CPU technology is rapidly advancing toward high processor core counts and TeraFLOPS peak SIMD execution rates, it is imperative that coding methods be identified which produce efficient and maintainable parallel code. This paper describes the algorithms used in point scatterer target model rendering, the parallelization of those algorithms, and the execution performance achieved on an AVX multi-core machine using the two basic parallel coding methods. The paper concludes with estimates for scale-up performance on upcoming multi-core technology.
Computer-based auditory phoneme discrimination training improves speech recognition in noise in experienced adult cochlear implant listeners.

PubMed

Schumann, Annette; Serman, Maja; Gefeller, Olaf; Hoppe, Ulrich

2015-03-01

Specific computer-based auditory training may be a useful completion in the rehabilitation process for cochlear implant (CI) listeners to achieve sufficient speech intelligibility. This study evaluated the effectiveness of a computerized, phoneme-discrimination training programme. The study employed a pretest-post-test design; participants were randomly assigned to the training or control group. Over a period of three weeks, the training group was instructed to train in phoneme discrimination via computer, twice a week. Sentence recognition in different noise conditions (moderate to difficult) was tested pre- and post-training, and six months after the training was completed. The control group was tested and retested within one month. Twenty-seven adult CI listeners who had been using cochlear implants for more than two years participated in the programme; 15 adults in the training group, 12 adults in the control group. Besides significant improvements for the trained phoneme-identification task, a generalized training effect was noted via significantly improved sentence recognition in moderate noise. No significant changes were noted in the difficult noise conditions. Improved performance was maintained over an extended period. Phoneme-discrimination training improves experienced CI listeners' speech perception in noise. Additional research is needed to optimize auditory training for individual benefit.
Generation and physical characteristics of the ERTS MSS system corrected computer compatible tapes

NASA Technical Reports Server (NTRS)

Thomas, V. L.

1973-01-01

The generation and format are discussed of the ERTS system corrected multispectral scanner computer compatible tapes. The discussion includes spacecraft sensors, scene characteristics, data transmission, and conversion of data to computer compatible tapes at the NASA Data Processing Facility. Geometeric and radiometric corrections, tape formats, and the physical characteristics of the tapes are also included.
Decision rules for unbiased inventory estimates

NASA Technical Reports Server (NTRS)

Argentiero, P. D.; Koch, D.

1979-01-01

An efficient and accurate procedure for estimating inventories from remote sensing scenes is presented. In place of the conventional and expensive full dimensional Bayes decision rule, a one-dimensional feature extraction and classification technique was employed. It is shown that this efficient decision rule can be used to develop unbiased inventory estimates and that for large sample sizes typical of satellite derived remote sensing scenes, resulting accuracies are comparable or superior to more expensive alternative procedures. Mathematical details of the procedure are provided in the body of the report and in the appendix. Results of a numerical simulation of the technique using statistics obtained from an observed LANDSAT scene are included. The simulation demonstrates the effectiveness of the technique in computing accurate inventory estimates.
Generation of binary holograms for deep scenes captured with a camera and a depth sensor

NASA Astrophysics Data System (ADS)

Leportier, Thibault; Park, Min-Chul

2017-01-01

This work presents binary hologram generation from images of a real object acquired from a Kinect sensor. Since hologram calculation from a point-cloud or polygon model presents a heavy computational burden, we adopted a depth-layer approach to generate the holograms. This method enables us to obtain holographic data of large scenes quickly. Our investigations focus on the performance of different methods, iterative and noniterative, to convert complex holograms into binary format. Comparisons were performed to examine the reconstruction of the binary holograms at different depths. We also propose to modify the direct binary search algorithm to take into account several reference image planes. Then, deep scenes featuring multiple planes of interest can be reconstructed with better efficiency.
Spatial image modulation to improve performance of computed tomography imaging spectrometer

NASA Technical Reports Server (NTRS)

Bearman, Gregory H. (Inventor); Wilson, Daniel W. (Inventor); Johnson, William R. (Inventor)

2010-01-01

Computed tomography imaging spectrometers ("CTIS"s) having patterns for imposing spatial structure are provided. The pattern may be imposed either directly on the object scene being imaged or at the field stop aperture. The use of the pattern improves the accuracy of the captured spatial and spectral information.
Multispectral Terrain Background Simulation Techniques For Use In Airborne Sensor Evaluation

NASA Astrophysics Data System (ADS)

Weinberg, Michael; Wohlers, Ronald; Conant, John; Powers, Edward

1988-08-01

A background simulation code developed at Aerodyne Research, Inc., called AERIE is designed to reflect the major sources of clutter that are of concern to staring and scanning sensors of the type being considered for various airborne threat warning (both aircraft and missiles) sensors. The code is a first principles model that could be used to produce a consistent image of the terrain for various spectral bands, i.e., provide the proper scene correlation both spectrally and spatially. The code utilizes both topographic and cultural features to model terrain, typically from DMA data, with a statistical overlay of the critical underlying surface properties (reflectance, emittance, and thermal factors) to simulate the resulting texture in the scene. Strong solar scattering from water surfaces is included with allowance for wind driven surface roughness. Clouds can be superimposed on the scene using physical cloud models and an analytical representation of the reflectivity obtained from scattering off spherical particles. The scene generator is augmented by collateral codes that allow for the generation of images at finer resolution. These codes provide interpolation of the basic DMA databases using fractal procedures that preserve the high frequency power spectral density behavior of the original scene. Scenes are presented illustrating variations in altitude, radiance, resolution, material, thermal factors, and emissivities. The basic models utilized for simulation of the various scene components and various "engineering level" approximations are incorporated to reduce the computational complexity of the simulation.
Significance of auditory and kinesthetic feedback to singers' pitch control.

PubMed

Mürbe, Dirk; Pabst, Friedemann; Hofmann, Gert; Sundberg, Johan

2002-03-01

An accurate control of fundamental frequency (F0) is required from singers. This control relies on auditory and kinesthetic feedback. However, a loud accompaniment may mask the auditory feedback, leaving the singers to rely on kinesthetic feedback. The object of the present study was to estimate the significance of auditory and kinesthetic feedback to pitch control in 28 students beginning a professional solo singing education. The singers sang an ascending and descending triad pattern covering their entire pitch range with and without masking noise in legato and staccato and in a slow and a fast tempo. F0 was measured by means of a computer program. The interval sizes between adjacent tones were determined and their departures from equally tempered tuning were calculated. The deviations from this tuning were used as a measure of the accuracy of intonation. Statistical analysis showed a significant effect of masking that amounted to a mean impairment of pitch accuracy by 14 cent across all subjects. Furthermore, significant effects were found of tempo as well as of the staccato/legato conditions. The results indicate that auditory feedback contributes significantly to singers' control of pitch.

A Computational Dual-Process Model of Social Interaction

DTIC Science & Technology

2014-01-30

that portray the model’s behaviors when OMAR was adapted to operate first with Neverwinter Night and then with OpenSim . 3.1 Overview of Human...the Neverwinter Night (NwN; http://nwn.wikia.com) environment and more recently adapted to run with OpenSim (http://opensimulator.org). Figure 1...provides a screenshot of a scene from the NwN world while Figure 2 provides a screenshot from the OpenSim world. The NwN and OpenSim avatars and scenes
Layered recognition networks that pre-process, classify, and describe

NASA Technical Reports Server (NTRS)

Uhr, L.

1971-01-01

A brief overview is presented of six types of pattern recognition programs that: (1) preprocess, then characterize; (2) preprocess and characterize together; (3) preprocess and characterize into a recognition cone; (4) describe as well as name; (5) compose interrelated descriptions; and (6) converse. A computer program (of types 3 through 6) is presented that transforms and characterizes the input scene through the successive layers of a recognition cone, and then engages in a stylized conversation to describe the scene.
Perceptual Load Affects Eyewitness Accuracy and Susceptibility to Leading Questions.

PubMed

Murphy, Gillian; Greene, Ciara M

2016-01-01

Load Theory (Lavie, 1995, 2005) states that the level of perceptual load in a task (i.e., the amount of information involved in processing task-relevant stimuli) determines the efficiency of selective attention. There is evidence that perceptual load affects distractor processing, with increased inattentional blindness under high load. Given that high load can result in individuals failing to report seeing obvious objects, it is conceivable that load may also impair memory for the scene. The current study is the first to assess the effect of perceptual load on eyewitness memory. Across three experiments (two video-based and one in a driving simulator), the effect of perceptual load on eyewitness memory was assessed. The results showed that eyewitnesses were less accurate under high load, in particular for peripheral details. For example, memory for the central character in the video was not affected by load but memory for a witness who passed by the window at the edge of the scene was significantly worse under high load. High load memories were also more open to suggestion, showing increased susceptibility to leading questions. High visual perceptual load also affected recall for auditory information, illustrating a possible cross-modal perceptual load effect on memory accuracy. These results have implications for eyewitness memory researchers and forensic professionals.
Perceptual Load Affects Eyewitness Accuracy and Susceptibility to Leading Questions

PubMed Central

Murphy, Gillian; Greene, Ciara M.

2016-01-01

Load Theory (Lavie, 1995, 2005) states that the level of perceptual load in a task (i.e., the amount of information involved in processing task-relevant stimuli) determines the efficiency of selective attention. There is evidence that perceptual load affects distractor processing, with increased inattentional blindness under high load. Given that high load can result in individuals failing to report seeing obvious objects, it is conceivable that load may also impair memory for the scene. The current study is the first to assess the effect of perceptual load on eyewitness memory. Across three experiments (two video-based and one in a driving simulator), the effect of perceptual load on eyewitness memory was assessed. The results showed that eyewitnesses were less accurate under high load, in particular for peripheral details. For example, memory for the central character in the video was not affected by load but memory for a witness who passed by the window at the edge of the scene was significantly worse under high load. High load memories were also more open to suggestion, showing increased susceptibility to leading questions. High visual perceptual load also affected recall for auditory information, illustrating a possible cross-modal perceptual load effect on memory accuracy. These results have implications for eyewitness memory researchers and forensic professionals. PMID:27625628
A view of Kanerva's sparse distributed memory

NASA Technical Reports Server (NTRS)

Denning, P. J.

1986-01-01

Pentti Kanerva is working on a new class of computers, which are called pattern computers. Pattern computers may close the gap between capabilities of biological organisms to recognize and act on patterns (visual, auditory, tactile, or olfactory) and capabilities of modern computers. Combinations of numeric, symbolic, and pattern computers may one day be capable of sustaining robots. The overview of the requirements for a pattern computer, a summary of Kanerva's Sparse Distributed Memory (SDM), and examples of tasks this computer can be expected to perform well are given.
Auditory cortex of bats and primates: managing species-specific calls for social communication

PubMed Central

Kanwal, Jagmeet S.; Rauschecker, Josef P.

2014-01-01

Individuals of many animal species communicate with each other using sounds or “calls” that are made up of basic acoustic patterns and their combinations. We are interested in questions about the processing of communication calls and their representation within the mammalian auditory cortex. Our studies compare in particular two species for which a large body of data has accumulated: the mustached bat and the rhesus monkey. We conclude that the brains of both species share a number of functional and organizational principles, which differ only in the extent to which and how they are implemented. For instance, neurons in both species use “combination-sensitivity” (nonlinear spectral and temporal integration of stimulus components) as a basic mechanism to enable exquisite sensitivity to and selectivity for particular call types. Whereas combination-sensitivity is already found abundantly at the primary auditory cortical and also at subcortical levels in bats, it becomes prevalent only at the level of the lateral belt in the secondary auditory cortex of monkeys. A parallel-hierarchical framework for processing complex sounds up to the level of the auditory cortex in bats and an organization into parallel-hierarchical, cortico-cortical auditory processing streams in monkeys is another common principle. Response specialization of neurons seems to be more pronounced in bats than in monkeys, whereas a functional specialization into “what” and “where” streams in the cerebral cortex is more pronounced in monkeys than in bats. These differences, in part, are due to the increased number and larger size of auditory areas in the parietal and frontal cortex in primates. Accordingly, the computational prowess of neural networks and the functional hierarchy resulting in specializations is established early and accelerated across brain regions in bats. The principles proposed here for the neural “management” of species-specific calls in bats and primates can be tested by studying the details of call processing in additional species. Also, computational modeling in conjunction with coordinated studies in bats and monkeys can help to clarify the fundamental question of perceptual invariance (or “constancy”) in call recognition, which has obvious relevance for understanding speech perception and its disorders in humans. PMID:17485400
Underwater Hearing in Turtles.

PubMed

Willis, Katie L

2016-01-01

The hearing of turtles is poorly understood compared with the other reptiles. Although the mechanism of transduction of sound into a neural signal via hair cells has been described in detail, the rest of the auditory system is largely a black box. What is known is that turtles have higher hearing thresholds than other reptiles, with best frequencies around 500 Hz. They also have lower underwater hearing thresholds than those in air, owing to resonance of the middle ear cavity. Further studies demonstrated that all families of turtles and tortoises share a common middle ear cavity morphology, with scaling best suited to underwater hearing. This supports an aquatic origin of the group. Because turtles hear best under water, it is important to examine their vulnerability to anthropogenic noise. However, the lack of basic data makes such experiments difficult because only a few species of turtles have published audiograms. There are also almost no behavioral data available (understandable due to training difficulties). Finally, few studies show what kinds of sounds are behaviorally relevant. One notable paper revealed that the Australian snake-necked turtle (Chelodina oblonga) has a vocal repertoire in air, at the interface, and under water. Findings like these suggest that there is more to the turtle aquatic auditory scene than previously thought.
Failure of the precedence effect with a noise-band vocoder

PubMed Central

Seeber, Bernhard U.; Hafter, Ervin R.

2011-01-01

The precedence effect (PE) describes the ability to localize a direct, leading sound correctly when its delayed copy (lag) is present, though not separately audible. The relative contribution of binaural cues in the temporal fine structure (TFS) of lead–lag signals was compared to that of interaural level differences (ILDs) and interaural time differences (ITDs) carried in the envelope. In a localization dominance paradigm participants indicated the spatial location of lead–lag stimuli processed with a binaural noise-band vocoder whose noise carriers introduced random TFS. The PE appeared for noise bursts of 10 ms duration, indicating dominance of envelope information. However, for three test words the PE often failed even at short lead–lag delays, producing two images, one toward the lead and one toward the lag. When interaural correlation in the carrier was increased, the images appeared more centered, but often remained split. Although previous studies suggest dominance of TFS cues, no image is lateralized in accord with the ITD in the TFS. An interpretation in the context of auditory scene analysis is proposed: By replacing the TFS with that of noise the auditory system loses the ability to fuse lead and lag into one object, and thus to show the PE. PMID:21428515
Computer-Based Auditory Training Programs for Children with Hearing Impairment - A Scoping Review.

PubMed

Nanjundaswamy, Manohar; Prabhu, Prashanth; Rajanna, Revathi Kittur; Ningegowda, Raghavendra Gulaganji; Sharma, Madhuri

2018-01-01

Introduction Communication breakdown, a consequence of hearing impairment (HI), is being fought by fitting amplification devices and providing auditory training since the inception of audiology. The advances in both audiology and rehabilitation programs have led to the advent of computer-based auditory training programs (CBATPs). Objective To review the existing literature documenting the evidence-based CBATPs for children with HIs. Since there was only one such article, we also chose to review the commercially available CBATPs for children with HI. The strengths and weaknesses of the existing literature were reviewed in order to improve further researches. Data Synthesis Google Scholar and PubMed databases were searched using various combinations of keywords. The participant, intervention, control, outcome and study design (PICOS) criteria were used for the inclusion of articles. Out of 124 article abstracts reviewed, 5 studies were shortlisted for detailed reading. One among them satisfied all the criteria, and was taken for review. The commercially available programs were chosen based on an extensive search in Google. The reviewed article was well-structured, with appropriate outcomes. The commercially available programs cover many aspects of the auditory training through a wide range of stimuli and activities. Conclusions There is a dire need for extensive research to be performed in the field of CBATPs to establish their efficacy, also to establish them as evidence-based practices.
Computer-Based Auditory Training Programs for Children with Hearing Impairment – A Scoping Review

PubMed Central

Nanjundaswamy, Manohar; Prabhu, Prashanth; Rajanna, Revathi Kittur; Ningegowda, Raghavendra Gulaganji; Sharma, Madhuri

2018-01-01

Introduction Communication breakdown, a consequence of hearing impairment (HI), is being fought by fitting amplification devices and providing auditory training since the inception of audiology. The advances in both audiology and rehabilitation programs have led to the advent of computer-based auditory training programs (CBATPs). Objective To review the existing literature documenting the evidence-based CBATPs for children with HIs. Since there was only one such article, we also chose to review the commercially available CBATPs for children with HI. The strengths and weaknesses of the existing literature were reviewed in order to improve further researches. Data Synthesis Google Scholar and PubMed databases were searched using various combinations of keywords. The participant, intervention, control, outcome and study design (PICOS) criteria were used for the inclusion of articles. Out of 124 article abstracts reviewed, 5 studies were shortlisted for detailed reading. One among them satisfied all the criteria, and was taken for review. The commercially available programs were chosen based on an extensive search in Google. The reviewed article was well-structured, with appropriate outcomes. The commercially available programs cover many aspects of the auditory training through a wide range of stimuli and activities. Conclusions There is a dire need for extensive research to be performed in the field of CBATPs to establish their efficacy, also to establish them as evidence-based practices. PMID:29371904
Words from spontaneous conversational speech can be recognized with human-like accuracy by an error-driven learning algorithm that discriminates between meanings straight from smart acoustic features, bypassing the phoneme as recognition unit.

PubMed

Arnold, Denis; Tomaschek, Fabian; Sering, Konstantin; Lopez, Florence; Baayen, R Harald

2017-01-01

Sound units play a pivotal role in cognitive models of auditory comprehension. The general consensus is that during perception listeners break down speech into auditory words and subsequently phones. Indeed, cognitive speech recognition is typically taken to be computationally intractable without phones. Here we present a computational model trained on 20 hours of conversational speech that recognizes word meanings within the range of human performance (model 25%, native speakers 20-44%), without making use of phone or word form representations. Our model also generates successfully predictions about the speed and accuracy of human auditory comprehension. At the heart of the model is a 'wide' yet sparse two-layer artificial neural network with some hundred thousand input units representing summaries of changes in acoustic frequency bands, and proxies for lexical meanings as output units. We believe that our model holds promise for resolving longstanding theoretical problems surrounding the notion of the phone in linguistic theory.
Spatial detection of tv channel logos as outliers from the content

NASA Astrophysics Data System (ADS)

Ekin, Ahmet; Braspenning, Ralph

2006-01-01

This paper proposes a purely image-based TV channel logo detection algorithm that can detect logos independently from their motion and transparency features. The proposed algorithm can robustly detect any type of logos, such as transparent and animated, without requiring any temporal constraints whereas known methods have to wait for the occurrence of large motion in the scene and assume stationary logos. The algorithm models logo pixels as outliers from the actual scene content that is represented by multiple 3-D histograms in the YC BC R space. We use four scene histograms corresponding to each of the four corners because the content characteristics change from one image corner to another. A further novelty of the proposed algorithm is that we define image corners and the areas where we compute the scene histograms by a cinematic technique called Golden Section Rule that is used by professionals. The robustness of the proposed algorithm is demonstrated over a dataset of representative TV content.
Significance of Crime Scene Visit by Forensic Pathologist in Cases of Atypical Firearm Injuries.

PubMed

Thejaswi, H T; Kumar, A; Krishna, K

2015-01-01

Deaths due to firearms are some of the interesting and contentious cases that a forensic pathologist/autopsy surgeon encounters in his practice. Whenever there is 'ambiguity' regarding the nature or sequence of events any unnatural deaths including those caused by firearms the practice of visiting crime scene should be encouraged especially in a country like India where autopsy surgeons often neglect it. Here we present a case report in which there were inconsistencies in the autopsy findings with the alleged history. The witnesses heard about four to six gunshot sounds, whereas only two spent cartridge cases were retrieved from the crime scene. Authors identified the atypical nature of firearm injuries sustained by the victims that were possible by just two bullets. Crime scene visit was undertaken where we discovered the possibility of the echo effect behind the production of four to six sounds. Further by using computer software program, positions of the gunman, victims and the bullet trajectory of the two bullets was created.
A novel scene-based non-uniformity correction method for SWIR push-broom hyperspectral sensors

NASA Astrophysics Data System (ADS)

Hu, Bin-Lin; Hao, Shi-Jing; Sun, De-Xin; Liu, Yin-Nian

2017-09-01

A novel scene-based non-uniformity correction (NUC) method for short-wavelength infrared (SWIR) push-broom hyperspectral sensors is proposed and evaluated. This method relies on the assumption that for each band there will be ground objects with similar reflectance to form uniform regions when a sufficient number of scanning lines are acquired. The uniform regions are extracted automatically through a sorting algorithm, and are used to compute the corresponding NUC coefficients. SWIR hyperspectral data from airborne experiment are used to verify and evaluate the proposed method, and results show that stripes in the scenes have been well corrected without any significant information loss, and the non-uniformity is less than 0.5%. In addition, the proposed method is compared to two other regular methods, and they are evaluated based on their adaptability to the various scenes, non-uniformity, roughness and spectral fidelity. It turns out that the proposed method shows strong adaptability, high accuracy and efficiency.
Navigable points estimation for mobile robots using binary image skeletonization

NASA Astrophysics Data System (ADS)

Martinez S., Fernando; Jacinto G., Edwar; Montiel A., Holman

2017-02-01

This paper describes the use of image skeletonization for the estimation of all the navigable points, inside a scene of mobile robots navigation. Those points are used for computing a valid navigation path, using standard methods. The main idea is to find the middle and the extreme points of the obstacles in the scene, taking into account the robot size, and create a map of navigable points, in order to reduce the amount of information for the planning algorithm. Those points are located by means of the skeletonization of a binary image of the obstacles and the scene background, along with some other digital image processing algorithms. The proposed algorithm automatically gives a variable number of navigable points per obstacle, depending on the complexity of its shape. As well as, the way how the algorithm can change some of their parameters in order to change the final number of the resultant key points is shown. The results shown here were obtained applying different kinds of digital image processing algorithms on static scenes.
The lawful imprecision of human surface tilt estimation in natural scenes

PubMed Central

2018-01-01

Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. PMID:29384477
The lawful imprecision of human surface tilt estimation in natural scenes.

PubMed

Kim, Seha; Burge, Johannes

2018-01-31

Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having groundtruth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Estimates are precise and unbiased with artificial stimuli and imprecise and strongly biased with natural stimuli. An image-computable Bayes optimal model grounded in natural scene statistics predicts human bias, precision, and trial-by-trial errors without fitting parameters to the human data. The similarities between human and model performance suggest that the complex human performance patterns with natural stimuli are lawful, and that human visual systems have internalized local image and scene statistics to optimally infer the three-dimensional structure of the environment. These results generalize our understanding of vision from the lab to the real world. © 2018, Kim et al.
Scene Segmentation For Autonomous Robotic Navigation Using Sequential Laser Projected Structured Light

NASA Astrophysics Data System (ADS)

Brown, C. David; Ih, Charles S.; Arce, Gonzalo R.; Fertell, David A.

1987-01-01

Vision systems for mobile robots or autonomous vehicles navigating in an unknown terrain environment must provide a rapid and accurate method of segmenting the scene ahead into regions of pathway and background. A major distinguishing feature between the pathway and background is the three dimensional texture of these two regions. Typical methods of textural image segmentation are very computationally intensive, often lack the required robustness, and are incapable of sensing the three dimensional texture of various regions of the scene. A method is presented where scanned laser projected lines of structured light, viewed by a stereoscopically located single video camera, resulted in an image in which the three dimensional characteristics of the scene were represented by the discontinuity of the projected lines. This image was conducive to processing with simple regional operators to classify regions as pathway or background. Design of some operators and application methods, and demonstration on sample images are presented. This method provides rapid and robust scene segmentation capability that has been implemented on a microcomputer in near real time, and should result in higher speed and more reliable robotic or autonomous navigation in unstructured environments.
TMS to object cortex affects both object and scene remote networks while TMS to scene cortex only affects scene networks.

PubMed

Rafique, Sara A; Solomon-Harris, Lily M; Steeves, Jennifer K E

2015-12-01

Viewing the world involves many computations across a great number of regions of the brain, all the while appearing seamless and effortless. We sought to determine the connectivity of object and scene processing regions of cortex through the influence of transient focal neural noise in discrete nodes within these networks. We consecutively paired repetitive transcranial magnetic stimulation (rTMS) with functional magnetic resonance-adaptation (fMR-A) to measure the effect of rTMS on functional response properties at the stimulation site and in remote regions. In separate sessions, rTMS was applied to the object preferential lateral occipital region (LO) and scene preferential transverse occipital sulcus (TOS). Pre- and post-stimulation responses were compared using fMR-A. In addition to modulating BOLD signal at the stimulation site, TMS affected remote regions revealing inter and intrahemispheric connections between LO, TOS, and the posterior parahippocampal place area (PPA). Moreover, we show remote effects from object preferential LO to outside the ventral perception network, in parietal and frontal areas, indicating an interaction of dorsal and ventral streams and possibly a shared common framework of perception and action. Copyright © 2015 Elsevier Ltd. All rights reserved.
Modelling Technology for Building Fire Scene with Virtual Geographic Environment

NASA Astrophysics Data System (ADS)

Song, Y.; Zhao, L.; Wei, M.; Zhang, H.; Liu, W.

2017-09-01

Building fire is a risky activity that can lead to disaster and massive destruction. The management and disposal of building fire has always attracted much interest from researchers. Integrated Virtual Geographic Environment (VGE) is a good choice for building fire safety management and emergency decisions, in which a more real and rich fire process can be computed and obtained dynamically, and the results of fire simulations and analyses can be much more accurate as well. To modelling building fire scene with VGE, the application requirements and modelling objective of building fire scene were analysed in this paper. Then, the four core elements of modelling building fire scene (the building space environment, the fire event, the indoor Fire Extinguishing System (FES) and the indoor crowd) were implemented, and the relationship between the elements was discussed also. Finally, with the theory and framework of VGE, the technology of building fire scene system with VGE was designed within the data environment, the model environment, the expression environment, and the collaborative environment as well. The functions and key techniques in each environment are also analysed, which may provide a reference for further development and other research on VGE.

A modern approach to storing of 3D geometry of objects in machine engineering industry

NASA Astrophysics Data System (ADS)

Sokolova, E. A.; Aslanov, G. A.; Sokolov, A. A.

2017-02-01

3D graphics is a kind of computer graphics which has absorbed a lot from the vector and raster computer graphics. It is used in interior design projects, architectural projects, advertising, while creating educational computer programs, movies, visual images of parts and products in engineering, etc. 3D computer graphics allows one to create 3D scenes along with simulation of light conditions and setting up standpoints.
Color constancy in natural scenes explained by global image statistics

PubMed Central

Foster, David H.; Amano, Kinjiro; Nascimento, Sérgio M. C.

2007-01-01

To what extent do observers' judgments of surface color with natural scenes depend on global image statistics? To address this question, a psychophysical experiment was performed in which images of natural scenes under two successive daylights were presented on a computer-controlled high-resolution color monitor. Observers reported whether there was a change in reflectance of a test surface in the scene. The scenes were obtained with a hyperspectral imaging system and included variously trees, shrubs, grasses, ferns, flowers, rocks, and buildings. Discrimination performance, quantified on a scale of 0 to 1 with a color-constancy index, varied from 0.69 to 0.97 over 21 scenes and two illuminant changes, from a correlated color temperature of 25,000 K to 6700 K and from 4000 K to 6700 K. The best account of these effects was provided by receptor-based rather than colorimetric properties of the images. Thus, in a linear regression, 43% of the variance in constancy index was explained by the log of the mean relative deviation in spatial cone-excitation ratios evaluated globally across the two images of a scene. A further 20% was explained by including the mean chroma of the first image and its difference from that of the second image and a further 7% by the mean difference in hue. Together, all four global color properties accounted for 70% of the variance and provided a good fit to the effects of scene and of illuminant change on color constancy, and, additionally, of changing test-surface position. By contrast, a spatial-frequency analysis of the images showed that the gradient of the luminance amplitude spectrum accounted for only 5% of the variance. PMID:16961965
Color constancy in natural scenes explained by global image statistics.

PubMed

Foster, David H; Amano, Kinjiro; Nascimento, Sérgio M C

2006-01-01

To what extent do observers' judgments of surface color with natural scenes depend on global image statistics? To address this question, a psychophysical experiment was performed in which images of natural scenes under two successive daylights were presented on a computer-controlled high-resolution color monitor. Observers reported whether there was a change in reflectance of a test surface in the scene. The scenes were obtained with a hyperspectral imaging system and included variously trees, shrubs, grasses, ferns, flowers, rocks, and buildings. Discrimination performance, quantified on a scale of 0 to 1 with a color-constancy index, varied from 0.69 to 0.97 over 21 scenes and two illuminant changes, from a correlated color temperature of 25,000 K to 6700 K and from 4000 K to 6700 K. The best account of these effects was provided by receptor-based rather than colorimetric properties of the images. Thus, in a linear regression, 43% of the variance in constancy index was explained by the log of the mean relative deviation in spatial cone-excitation ratios evaluated globally across the two images of a scene. A further 20% was explained by including the mean chroma of the first image and its difference from that of the second image and a further 7% by the mean difference in hue. Together, all four global color properties accounted for 70% of the variance and provided a good fit to the effects of scene and of illuminant change on color constancy, and, additionally, of changing test-surface position. By contrast, a spatial-frequency analysis of the images showed that the gradient of the luminance amplitude spectrum accounted for only 5% of the variance.
Statistical regularities of art images and natural scenes: spectra, sparseness and nonlinearities.

PubMed

Graham, Daniel J; Field, David J

2007-01-01

Paintings are the product of a process that begins with ordinary vision in the natural world and ends with manipulation of pigments on canvas. Because artists must produce images that can be seen by a visual system that is thought to take advantage of statistical regularities in natural scenes, artists are likely to replicate many of these regularities in their painted art. We have tested this notion by computing basic statistical properties and modeled cell response properties for a large set of digitized paintings and natural scenes. We find that both representational and non-representational (abstract) paintings from our sample (124 images) show basic similarities to a sample of natural scenes in terms of their spatial frequency amplitude spectra, but the paintings and natural scenes show significantly different mean amplitude spectrum slopes. We also find that the intensity distributions of paintings show a lower skewness and sparseness than natural scenes. We account for this by considering the range of luminances found in the environment compared to the range available in the medium of paint. A painting's range is limited by the reflective properties of its materials. We argue that artists do not simply scale the intensity range down but use a compressive nonlinearity. In our studies, modeled retinal and cortical filter responses to the images were less sparse for the paintings than for the natural scenes. But when a compressive nonlinearity was applied to the images, both the paintings' sparseness and the modeled responses to the paintings showed the same or greater sparseness compared to the natural scenes. This suggests that artists achieve some degree of nonlinear compression in their paintings. Because paintings have captivated humans for millennia, finding basic statistical regularities in paintings' spatial structure could grant insights into the range of spatial patterns that humans find compelling.
Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification

NASA Astrophysics Data System (ADS)

Anwer, Rao Muhammad; Khan, Fahad Shahbaz; van de Weijer, Joost; Molinier, Matthieu; Laaksonen, Jorma

2018-04-01

Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene classification.
Single Neurons in the Avian Auditory Cortex Encode Individual Identity and Propagation Distance in Naturally Degraded Communication Calls.

PubMed

Mouterde, Solveig C; Elie, Julie E; Mathevon, Nicolas; Theunissen, Frédéric E

2017-03-29

One of the most complex tasks performed by sensory systems is "scene analysis": the interpretation of complex signals as behaviorally relevant objects. The study of this problem, universal to species and sensory modalities, is particularly challenging in audition, where sounds from various sources and localizations, degraded by propagation through the environment, sum to form a single acoustical signal. Here we investigated in a songbird model, the zebra finch, the neural substrate for ranging and identifying a single source. We relied on ecologically and behaviorally relevant stimuli, contact calls, to investigate the neural discrimination of individual vocal signature as well as sound source distance when calls have been degraded through propagation in a natural environment. Performing electrophysiological recordings in anesthetized birds, we found neurons in the auditory forebrain that discriminate individual vocal signatures despite long-range degradation, as well as neurons discriminating propagation distance, with varying degrees of multiplexing between both information types. Moreover, the neural discrimination performance of individual identity was not affected by propagation-induced degradation beyond what was induced by the decreased intensity. For the first time, neurons with distance-invariant identity discrimination properties as well as distance-discriminant neurons are revealed in the avian auditory cortex. Because these neurons were recorded in animals that had prior experience neither with the vocalizers of the stimuli nor with long-range propagation of calls, we suggest that this neural population is part of a general-purpose system for vocalizer discrimination and ranging. SIGNIFICANCE STATEMENT Understanding how the brain makes sense of the multitude of stimuli that it continually receives in natural conditions is a challenge for scientists. Here we provide a new understanding of how the auditory system extracts behaviorally relevant information, the vocalizer identity and its distance to the listener, from acoustic signals that have been degraded by long-range propagation in natural conditions. We show, for the first time, that single neurons, in the auditory cortex of zebra finches, are capable of discriminating the individual identity and sound source distance in conspecific communication calls. The discrimination of identity in propagated calls relies on a neural coding that is robust to intensity changes, signals' quality, and decreases in the signal-to-noise ratio. Copyright © 2017 Mouterde et al.
Auditory Training Effects on the Listening Skills of Children With Auditory Processing Disorder.

PubMed

Loo, Jenny Hooi Yin; Rosen, Stuart; Bamiou, Doris-Eva

2016-01-01

Children with auditory processing disorder (APD) typically present with "listening difficulties,"' including problems understanding speech in noisy environments. The authors examined, in a group of such children, whether a 12-week computer-based auditory training program with speech material improved the perception of speech-in-noise test performance, and functional listening skills as assessed by parental and teacher listening and communication questionnaires. The authors hypothesized that after the intervention, (1) trained children would show greater improvements in speech-in-noise perception than untrained controls; (2) this improvement would correlate with improvements in observer-rated behaviors; and (3) the improvement would be maintained for at least 3 months after the end of training. This was a prospective randomized controlled trial of 39 children with normal nonverbal intelligence, ages 7 to 11 years, all diagnosed with APD. This diagnosis required a normal pure-tone audiogram and deficits in at least two clinical auditory processing tests. The APD children were randomly assigned to (1) a control group that received only the current standard treatment for children diagnosed with APD, employing various listening/educational strategies at school (N = 19); or (2) an intervention group that undertook a 3-month 5-day/week computer-based auditory training program at home, consisting of a wide variety of speech-based listening tasks with competing sounds, in addition to the current standard treatment. All 39 children were assessed for language and cognitive skills at baseline and on three outcome measures at baseline and immediate postintervention. Outcome measures were repeated 3 months postintervention in the intervention group only, to assess the sustainability of treatment effects. The outcome measures were (1) the mean speech reception threshold obtained from the four subtests of the listening in specialized noise test that assesses sentence perception in various configurations of masking speech, and in which the target speakers and test materials were unrelated to the training materials; (2) the Children's Auditory Performance Scale that assesses listening skills, completed by the children's teachers; and (3) the Clinical Evaluation of Language Fundamental-4 pragmatic profile that assesses pragmatic language use, completed by parents. All outcome measures significantly improved at immediate postintervention in the intervention group only, with effect sizes ranging from 0.76 to 1.7. Improvements in speech-in-noise performance correlated with improved scores in the Children's Auditory Performance Scale questionnaire in the trained group only. Baseline language and cognitive assessments did not predict better training outcome. Improvements in speech-in-noise performance were sustained 3 months postintervention. Broad speech-based auditory training led to improved auditory processing skills as reflected in speech-in-noise test performance and in better functional listening in real life. The observed correlation between improved functional listening with improved speech-in-noise perception in the trained group suggests that improved listening was a direct generalization of the auditory training.
Using Haptic and Auditory Interaction Tools to Engage Students with Visual Impairments in Robot Programming Activities

ERIC Educational Resources Information Center

Howard, A. M.; Park, Chung Hyuk; Remy, S.

2012-01-01

The robotics field represents the integration of multiple facets of computer science and engineering. Robotics-based activities have been shown to encourage K-12 students to consider careers in computing and have even been adopted as part of core computer-science curriculum at a number of universities. Unfortunately, for students with visual…
The Role of Age and Executive Function in Auditory Category Learning

PubMed Central

Reetzke, Rachel; Maddox, W. Todd; Chandrasekaran, Bharath

2015-01-01

Auditory categorization is a natural and adaptive process that allows for the organization of high-dimensional, continuous acoustic information into discrete representations. Studies in the visual domain have identified a rule-based learning system that learns and reasons via a hypothesis-testing process that requires working memory and executive attention. The rule-based learning system in vision shows a protracted development, reflecting the influence of maturing prefrontal function on visual categorization. The aim of the current study is two-fold: (a) to examine the developmental trajectory of rule-based auditory category learning from childhood through adolescence, into early adulthood; and (b) to examine the extent to which individual differences in rule-based category learning relate to individual differences in executive function. Sixty participants with normal hearing, 20 children (age range, 7–12), 21 adolescents (age range, 13–19), and 19 young adults (age range, 20–23), learned to categorize novel dynamic ripple sounds using trial-by-trial feedback. The spectrotemporally modulated ripple sounds are considered the auditory equivalent of the well-studied Gabor patches in the visual domain. Results revealed that auditory categorization accuracy improved with age, with young adults outperforming children and adolescents. Computational modeling analyses indicated that the use of the task-optimal strategy (i.e. a conjunctive rule-based learning strategy) improved with age. Notably, individual differences in executive flexibility significantly predicted auditory category learning success. The current findings demonstrate a protracted development of rule-based auditory categorization. The results further suggest that executive flexibility coupled with perceptual processes play important roles in successful rule-based auditory category learning. PMID:26491987
Modulation of Auditory Cortex Response to Pitch Variation Following Training with Microtonal Melodies

PubMed Central

Zatorre, Robert J.; Delhommeau, Karine; Zarate, Jean Mary

2012-01-01

We tested changes in cortical functional response to auditory patterns in a configural learning paradigm. We trained 10 human listeners to discriminate micromelodies (consisting of smaller pitch intervals than normally used in Western music) and measured covariation in blood oxygenation signal to increasing pitch interval size in order to dissociate global changes in activity from those specifically associated with the stimulus feature that was trained. A psychophysical staircase procedure with feedback was used for training over a 2-week period. Behavioral tests of discrimination ability performed before and after training showed significant learning on the trained stimuli, and generalization to other frequencies and tasks; no learning occurred in an untrained control group. Before training the functional MRI data showed the expected systematic increase in activity in auditory cortices as a function of increasing micromelody pitch interval size. This function became shallower after training, with the maximal change observed in the right posterior auditory cortex. Global decreases in activity in auditory regions, along with global increases in frontal cortices also occurred after training. Individual variation in learning rate was related to the hemodynamic slope to pitch interval size, such that those who had a higher sensitivity to pitch interval variation prior to learning achieved the fastest learning. We conclude that configural auditory learning entails modulation in the response of auditory cortex to the trained stimulus feature. Reduction in blood oxygenation response to increasing pitch interval size suggests that fewer computational resources, and hence lower neural recruitment, is associated with learning, in accord with models of auditory cortex function, and with data from other modalities. PMID:23227019
Large holographic displays for real-time applications

NASA Astrophysics Data System (ADS)

Schwerdtner, A.; Häussler, R.; Leister, N.

2008-02-01

Holography is generally accepted as the ultimate approach to display three-dimensional scenes or objects. Principally, the reconstruction of an object from a perfect hologram would appear indistinguishable from viewing the corresponding real-world object. Up to now two main obstacles have prevented large-screen Computer-Generated Holograms (CGH) from achieving a satisfactory laboratory prototype not to mention a marketable one. The reason is a small cell pitch CGH resulting in a huge number of hologram cells and a very high computational load for encoding the CGH. These seemingly inevitable technological hurdles for a long time have not been cleared limiting the use of holography to special applications, such as optical filtering, interference, beam forming, digital holography for capturing the 3-D shape of objects, and others. SeeReal Technologies has developed a new approach for real-time capable CGH using the socalled Tracked Viewing Windows technology to overcome these problems. The paper will show that today's state of the art reconfigurable Spatial Light Modulators (SLM), especially today's feasible LCD panels are suited for reconstructing large 3-D scenes which can be observed from large viewing angles. For this to achieve the original holographic concept of containing information from the entire scene in each part of the CGH has been abandoned. This substantially reduces the hologram resolution and thus the computational load by several orders of magnitude making thus real-time computation possible. A monochrome real-time prototype measuring 20 inches has been built and demonstrated at last year's SID conference and exhibition 2007 and at several other events.
Generalized algebraic scene-based nonuniformity correction algorithm.

PubMed

Ratliff, Bradley M; Hayat, Majeed M; Tyo, J Scott

2005-02-01

A generalization of a recently developed algebraic scene-based nonuniformity correction algorithm for focal plane array (FPA) sensors is presented. The new technique uses pairs of image frames exhibiting arbitrary one- or two-dimensional translational motion to compute compensator quantities that are then used to remove nonuniformity in the bias of the FPA response. Unlike its predecessor, the generalization does not require the use of either a blackbody calibration target or a shutter. The algorithm has a low computational overhead, lending itself to real-time hardware implementation. The high-quality correction ability of this technique is demonstrated through application to real IR data from both cooled and uncooled infrared FPAs. A theoretical and experimental error analysis is performed to study the accuracy of the bias compensator estimates in the presence of two main sources of error.
Idealized Computational Models for Auditory Receptive Fields

PubMed Central

Lindeberg, Tony; Friberg, Anders

2015-01-01

We present a theory by which idealized models of auditory receptive fields can be derived in a principled axiomatic manner, from a set of structural properties to (i) enable invariance of receptive field responses under natural sound transformations and (ii) ensure internal consistency between spectro-temporal receptive fields at different temporal and spectral scales. For defining a time-frequency transformation of a purely temporal sound signal, it is shown that the framework allows for a new way of deriving the Gabor and Gammatone filters as well as a novel family of generalized Gammatone filters, with additional degrees of freedom to obtain different trade-offs between the spectral selectivity and the temporal delay of time-causal temporal window functions. When applied to the definition of a second-layer of receptive fields from a spectrogram, it is shown that the framework leads to two canonical families of spectro-temporal receptive fields, in terms of spectro-temporal derivatives of either spectro-temporal Gaussian kernels for non-causal time or a cascade of time-causal first-order integrators over the temporal domain and a Gaussian filter over the logspectral domain. For each filter family, the spectro-temporal receptive fields can be either separable over the time-frequency domain or be adapted to local glissando transformations that represent variations in logarithmic frequencies over time. Within each domain of either non-causal or time-causal time, these receptive field families are derived by uniqueness from the assumptions. It is demonstrated how the presented framework allows for computation of basic auditory features for audio processing and that it leads to predictions about auditory receptive fields with good qualitative similarity to biological receptive fields measured in the inferior colliculus (ICC) and primary auditory cortex (A1) of mammals. PMID:25822973
Music and natural sounds in an auditory steady-state response based brain-computer interface to increase user acceptance.

PubMed

Heo, Jeong; Baek, Hyun Jae; Hong, Seunghyeok; Chang, Min Hye; Lee, Jeong Su; Park, Kwang Suk

2017-05-01

Patients with total locked-in syndrome are conscious; however, they cannot express themselves because most of their voluntary muscles are paralyzed, and many of these patients have lost their eyesight. To improve the quality of life of these patients, there is an increasing need for communication-supporting technologies that leverage the remaining senses of the patient along with physiological signals. The auditory steady-state response (ASSR) is an electro-physiologic response to auditory stimulation that is amplitude-modulated by a specific frequency. By leveraging the phenomenon whereby ASSR is modulated by mind concentration, a brain-computer interface paradigm was proposed to classify the selective attention of the patient. In this paper, we propose an auditory stimulation method to minimize auditory stress by replacing the monotone carrier with familiar music and natural sounds for an ergonomic system. Piano and violin instrumentals were employed in the music sessions; the sounds of water streaming and cicadas singing were used in the natural sound sessions. Six healthy subjects participated in the experiment. Electroencephalograms were recorded using four electrodes (Cz, Oz, T7 and T8). Seven sessions were performed using different stimuli. The spectral power at 38 and 42Hz and their ratio for each electrode were extracted as features. Linear discriminant analysis was utilized to classify the selections for each subject. In offline analysis, the average classification accuracies with a modulation index of 1.0 were 89.67% and 87.67% using music and natural sounds, respectively. In online experiments, the average classification accuracies were 88.3% and 80.0% using music and natural sounds, respectively. Using the proposed method, we obtained significantly higher user-acceptance scores, while maintaining a high average classification accuracy. Copyright © 2017 Elsevier Ltd. All rights reserved.
Volumetric segmentation of range images for printed circuit board inspection

NASA Astrophysics Data System (ADS)

Van Dop, Erik R.; Regtien, Paul P. L.

1996-10-01

Conventional computer vision approaches towards object recognition and pose estimation employ 2D grey-value or color imaging. As a consequence these images contain information about projections of a 3D scene only. The subsequent image processing will then be difficult, because the object coordinates are represented with just image coordinates. Only complicated low-level vision modules like depth from stereo or depth from shading can recover some of the surface geometry of the scene. Recent advances in fast range imaging have however paved the way towards 3D computer vision, since range data of the scene can now be obtained with sufficient accuracy and speed for object recognition and pose estimation purposes. This article proposes the coded-light range-imaging method together with superquadric segmentation to approach this task. Superquadric segments are volumetric primitives that describe global object properties with 5 parameters, which provide the main features for object recognition. Besides, the principle axes of a superquadric segment determine the phase of an object in the scene. The volumetric segmentation of a range image can be used to detect missing, false or badly placed components on assembled printed circuit boards. Furthermore, this approach will be useful to recognize and extract valuable or toxic electronic components on printed circuit boards scrap that currently burden the environment during electronic waste processing. Results on synthetic range images with errors constructed according to a verified noise model illustrate the capabilities of this approach.
Higher-order scene statistics of breast images

NASA Astrophysics Data System (ADS)

Abbey, Craig K.; Sohl-Dickstein, Jascha N.; Olshausen, Bruno A.; Eckstein, Miguel P.; Boone, John M.

2009-02-01

Researchers studying human and computer vision have found description and construction of these systems greatly aided by analysis of the statistical properties of naturally occurring scenes. More specifically, it has been found that receptive fields with directional selectivity and bandwidth properties similar to mammalian visual systems are more closely matched to the statistics of natural scenes. It is argued that this allows for sparse representation of the independent components of natural images [Olshausen and Field, Nature, 1996]. These theories have important implications for medical image perception. For example, will a system that is designed to represent the independent components of natural scenes, where objects occlude one another and illumination is typically reflected, be appropriate for X-ray imaging, where features superimpose on one another and illumination is transmissive? In this research we begin to examine these issues by evaluating higher-order statistical properties of breast images from X-ray projection mammography (PM) and dedicated breast computed tomography (bCT). We evaluate kurtosis in responses of octave bandwidth Gabor filters applied to PM and to coronal slices of bCT scans. We find that kurtosis in PM rises and quickly saturates for filter center frequencies with an average value above 0.95. By contrast, kurtosis in bCT peaks near 0.20 cyc/mm with kurtosis of approximately 2. Our findings suggest that the human visual system may be tuned to represent breast tissue more effectively in bCT over a specific range of spatial frequencies.
Computer-generated scenes depicting the HST capture and EVA repair mission

NASA Image and Video Library

1993-11-12

Computer generated scenes depicting the Hubble Space Telescope capture and a sequence of planned events on the planned extravehicular activity (EVA). Scenes include the Remote Manipulator System (RMS) arm assisting two astronauts changing out the Wide Field/Planetary Camera (WF/PC) (48699); RMS arm assisting in the temporary mating of the orbiting telescope to the flight support system in Endeavour's cargo bay (48700); Endeavour's RMS arm assisting in the "capture" of the orbiting telescope (48701); Two astronauts changing out the telescope's coprocessor (48702); RMS arm assistign two astronauts replacing one of the telescope's electronic control units (48703); RMS assisting two astronauts replacing the fuse plugs on the telescope's Power Distribution Unit (PDU) (48704); The telescope's High Resolution Spectrograph (HRS) kit is depicted in this scene (48705); Two astronauts during the removal of the high speed photometer and the installation of the COSTAR instrument (48706); Two astronauts, standing on the RMS, during installation of one of the Magnetic Sensing System (MSS) (48707); High angle view of the orbiting Space Shuttle Endeavour with its cargo bay doors open, revealing the bay's pre-capture configuration. Seen are, from the left, the Solar Array Carrier, the ORU Carrier and the flight support system (48708); Two astronauts performing the replacement of HST's Rate Sensor Units (RSU) (48709); The RMS arm assisting two astronauts with the replacement of the telescope's solar array panels (48710); Two astronauts replacing the telescope's Solar Array Drive Electronics (SADE) (48711).
Skeletal camera network embedded structure-from-motion for 3D scene reconstruction from UAV images

NASA Astrophysics Data System (ADS)

Xu, Zhihua; Wu, Lixin; Gerke, Markus; Wang, Ran; Yang, Huachao

2016-11-01

Structure-from-Motion (SfM) techniques have been widely used for 3D scene reconstruction from multi-view images. However, due to the large computational costs of SfM methods there is a major challenge in processing highly overlapping images, e.g. images from unmanned aerial vehicles (UAV). This paper embeds a novel skeletal camera network (SCN) into SfM to enable efficient 3D scene reconstruction from a large set of UAV images. First, the flight control data are used within a weighted graph to construct a topologically connected camera network (TCN) to determine the spatial connections between UAV images. Second, the TCN is refined using a novel hierarchical degree bounded maximum spanning tree to generate a SCN, which contains a subset of edges from the TCN and ensures that each image is involved in at least a 3-view configuration. Third, the SCN is embedded into the SfM to produce a novel SCN-SfM method, which allows performing tie-point matching only for the actually connected image pairs. The proposed method was applied in three experiments with images from two fixed-wing UAVs and an octocopter UAV, respectively. In addition, the SCN-SfM method was compared to three other methods for image connectivity determination. The comparison shows a significant reduction in the number of matched images if our method is used, which leads to less computational costs. At the same time the achieved scene completeness and geometric accuracy are comparable.
Effect of subliminal visual material on an auditory signal detection task.

PubMed

Moroney, E; Bross, M

1984-02-01

An experiment assessed the effect of subliminally embedded, visual material on an auditory detection task. 22 women and 19 men were presented tachistoscopically with words designated as "emotional" or "neutral" on the basis of prior GSRs and a Word Rating List under four conditions: (a) Unembedded Neutral, (b) Embedded Neutral, (c) Unembedded Emotional, and (d) Embedded Emotional. On each trial subjects made forced choices concerning the presence or absence of an auditory tone (1000 Hz) at threshold level; hits and false alarm rates were used to compute non-parametric indices for sensitivity (A') and response bias (B"). While over-all analyses of variance yielded no significant differences, further examination of the data suggests the presence of subliminally "receptive" and "non-receptive" subpopulations.
A Cellular Automata Approach to Computer Vision and Image Processing.

DTIC Science & Technology

1980-09-01

the ACM, vol. 15, no. 9, pp. 827-837. [ Duda and Hart] R. 0. Duda and P. E. Hart, Pattern Classification and Scene Analysis, Wiley, New York, 1973...Center TR-738, 1979. [Farley] Arthur M. Farley and Andrzej Proskurowski, "Gossiping in Grid Graphs", University of Oregon Computer Science Department CS-TR

Toward a Script Theory of Guidance in Computer-Supported Collaborative Learning

ERIC Educational Resources Information Center

Fischer, Frank; Kollar, Ingo; Stegmann, Karsten; Wecker, Christof

2013-01-01

This article presents an outline of a script theory of guidance for computer-supported collaborative learning (CSCL). With its 4 types of components of internal and external scripts (play, scene, role, and scriptlet) and 7 principles, this theory addresses the question of how CSCL practices are shaped by dynamically reconfigured internal…
a Study of the Reconstruction of Accidents and Crime Scenes Through Computational Experiments

NASA Astrophysics Data System (ADS)

Park, S. J.; Chae, S. W.; Kim, S. H.; Yang, K. M.; Chung, H. S.

Recently, with an increase in the number of studies of the safety of both pedestrians and passengers, computer software, such as MADYMO, Pam-crash, and LS-dyna, has been providing human models for computer simulation. Although such programs have been applied to make machines beneficial for humans, studies that analyze the reconstruction of accidents or crime scenes are rare. Therefore, through computational experiments, the present study presents reconstructions of two questionable accidents. In the first case, a car fell off the road and the driver was separated from it. The accident investigator was very confused because some circumstantial evidence suggested the possibility that the driver was murdered. In the second case, a woman died in her house and the police suspected foul play with her boyfriend as a suspect. These two cases were reconstructed using the human model in MADYMO software. The first case was eventually confirmed as a traffic accident in which the driver bounced out of the car when the car fell off, and the second case was proved to be suicide rather than homicide.
A computer simulation model to compute the radiation transfer of mountainous regions

NASA Astrophysics Data System (ADS)

Li, Yuguang; Zhao, Feng; Song, Rui

2011-11-01

In mountainous regions, the radiometric signal recorded at the sensor depends on a number of factors such as sun angle, atmospheric conditions, surface cover type, and topography. In this paper, a computer simulation model of radiation transfer is designed and evaluated. This model implements the Monte Carlo ray-tracing techniques and is specifically dedicated to the study of light propagation in mountainous regions. The radiative processes between sun light and the objects within the mountainous region are realized by using forward Monte Carlo ray-tracing methods. The performance of the model is evaluated through detailed comparisons with the well-established 3D computer simulation model: RGM (Radiosity-Graphics combined Model) based on the same scenes and identical spectral parameters, which shows good agreements between these two models' results. By using the newly developed computer model, series of typical mountainous scenes are generated to analyze the physical mechanism of mountainous radiation transfer. The results show that the effects of the adjacent slopes are important for deep valleys and they particularly affect shadowed pixels, and the topographic effect needs to be considered in mountainous terrain before accurate inferences from remotely sensed data can be made.
Statistics of high-level scene context

PubMed Central

Greene, Michelle R.

2013-01-01

Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48, 167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed “things” in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information. Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics rather than intuition. PMID:24194723
Adaptive foveated single-pixel imaging with dynamic supersampling

PubMed Central

Phillips, David B.; Sun, Ming-Jie; Taylor, Jonathan M.; Edgar, Matthew P.; Barnett, Stephen M.; Gibson, Graham M.; Padgett, Miles J.

2017-01-01

In contrast to conventional multipixel cameras, single-pixel cameras capture images using a single detector that measures the correlations between the scene and a set of patterns. However, these systems typically exhibit low frame rates, because to fully sample a scene in this way requires at least the same number of correlation measurements as the number of pixels in the reconstructed image. To mitigate this, a range of compressive sensing techniques have been developed which use a priori knowledge to reconstruct images from an undersampled measurement set. Here, we take a different approach and adopt a strategy inspired by the foveated vision found in the animal kingdom—a framework that exploits the spatiotemporal redundancy of many dynamic scenes. In our system, a high-resolution foveal region tracks motion within the scene, yet unlike a simple zoom, every frame delivers new spatial information from across the entire field of view. This strategy rapidly records the detail of quickly changing features in the scene while simultaneously accumulating detail of more slowly evolving regions over several consecutive frames. This architecture provides video streams in which both the resolution and exposure time spatially vary and adapt dynamically in response to the evolution of the scene. The degree of local frame rate enhancement is scene-dependent, but here, we demonstrate a factor of 4, thereby helping to mitigate one of the main drawbacks of single-pixel imaging techniques. The methods described here complement existing compressive sensing approaches and may be applied to enhance computational imagers that rely on sequential correlation measurements. PMID:28439538
Rotation-invariant features for multi-oriented text detection in natural images.

PubMed

Yao, Cong; Zhang, Xin; Bai, Xiang; Liu, Wenyu; Ma, Yi; Tu, Zhuowen

2013-01-01

Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes.
Optical system for object detection and delineation in space

NASA Astrophysics Data System (ADS)

Handelman, Amir; Shwartz, Shoam; Donitza, Liad; Chaplanov, Loran

2018-01-01

Object recognition and delineation is an important task in many environments, such as in crime scenes and operating rooms. Marking evidence or surgical tools and attracting the attention of the surrounding staff to the marked objects can affect people's lives. We present an optical system comprising a camera, computer, and small laser projector that can detect and delineate objects in the environment. To prove the optical system's concept, we show that it can operate in a hypothetical crime scene in which a pistol is present and automatically recognize and segment it by various computer-vision algorithms. Based on such segmentation, the laser projector illuminates the actual boundaries of the pistol and thus allows the persons in the scene to comfortably locate and measure the pistol without holding any intermediator device, such as an augmented reality handheld device, glasses, or screens. Using additional optical devices, such as diffraction grating and a cylinder lens, the pistol size can be estimated. The exact location of the pistol in space remains static, even after its removal. Our optical system can be fixed or dynamically moved, making it suitable for various applications that require marking of objects in space.
Advancing crime scene computer forensics techniques

NASA Astrophysics Data System (ADS)

Hosmer, Chet; Feldman, John; Giordano, Joe

1999-02-01

Computers and network technology have become inexpensive and powerful tools that can be applied to a wide range of criminal activity. Computers have changed the world's view of evidence because computers are used more and more as tools in committing `traditional crimes' such as embezzlements, thefts, extortion and murder. This paper will focus on reviewing the current state-of-the-art of the data recovery and evidence construction tools used in both the field and laboratory for prosection purposes.
Wavelet Algorithms for Illumination Computations

NASA Astrophysics Data System (ADS)

Schroder, Peter

One of the core problems of computer graphics is the computation of the equilibrium distribution of light in a scene. This distribution is given as the solution to a Fredholm integral equation of the second kind involving an integral over all surfaces in the scene. In the general case such solutions can only be numerically approximated, and are generally costly to compute, due to the geometric complexity of typical computer graphics scenes. For this computation both Monte Carlo and finite element techniques (or hybrid approaches) are typically used. A simplified version of the illumination problem is known as radiosity, which assumes that all surfaces are diffuse reflectors. For this case hierarchical techniques, first introduced by Hanrahan et al. (32), have recently gained prominence. The hierarchical approaches lead to an asymptotic improvement when only finite precision is required. The resulting algorithms have cost proportional to O(k^2 + n) versus the usual O(n^2) (k is the number of input surfaces, n the number of finite elements into which the input surfaces are meshed). Similarly a hierarchical technique has been introduced for the more general radiance problem (which allows glossy reflectors) by Aupperle et al. (6). In this dissertation we show the equivalence of these hierarchical techniques to the use of a Haar wavelet basis in a general Galerkin framework. By so doing, we come to a deeper understanding of the properties of the numerical approximations used and are able to extend the hierarchical techniques to higher orders. In particular, we show the correspondence of the geometric arguments underlying hierarchical methods to the theory of Calderon-Zygmund operators and their sparse realization in wavelet bases. The resulting wavelet algorithms for radiosity and radiance are analyzed and numerical results achieved with our implementation are reported. We find that the resulting algorithms achieve smaller and smoother errors at equivalent work.
The Role of Audio-Visual Feedback in a Thought-Based Control of a Humanoid Robot: A BCI Study in Healthy and Spinal Cord Injured People.

PubMed

Tidoni, Emmanuele; Gergondet, Pierre; Fusco, Gabriele; Kheddar, Abderrahmane; Aglioti, Salvatore M

2017-06-01

The efficient control of our body and successful interaction with the environment are possible through the integration of multisensory information. Brain-computer interface (BCI) may allow people with sensorimotor disorders to actively interact in the world. In this study, visual information was paired with auditory feedback to improve the BCI control of a humanoid surrogate. Healthy and spinal cord injured (SCI) people were asked to embody a humanoid robot and complete a pick-and-place task by means of a visual evoked potentials BCI system. Participants observed the remote environment from the robot's perspective through a head mounted display. Human-footsteps and computer-beep sounds were used as synchronous/asynchronous auditory feedback. Healthy participants achieved better placing accuracy when listening to human footstep sounds relative to a computer-generated sound. SCI people demonstrated more difficulty in steering the robot during asynchronous auditory feedback conditions. Importantly, subjective reports highlighted that the BCI mask overlaying the display did not limit the observation of the scenario and the feeling of being in control of the robot. Overall, the data seem to suggest that sensorimotor-related information may improve the control of external devices. Further studies are required to understand how the contribution of residual sensory channels could improve the reliability of BCI systems.
International Space Station (ISS)

NASA Image and Video Library

1995-04-17

This computer generated scene of the International Space Station (ISS) represents the first addition of hardware following the completion of Phase II. The 8-A Phase shows the addition of the S-9 truss.
Computational model of lightness perception in high dynamic range imaging

NASA Astrophysics Data System (ADS)

Krawczyk, Grzegorz; Myszkowski, Karol; Seidel, Hans-Peter

2006-02-01

An anchoring theory of lightness perception by Gilchrist et al. [1999] explains many characteristics of human visual system such as lightness constancy and its spectacular failures which are important in the perception of images. The principal concept of this theory is the perception of complex scenes in terms of groups of consistent areas (frameworks). Such areas, following the gestalt theorists, are defined by the regions of common illumination. The key aspect of the image perception is the estimation of lightness within each framework through the anchoring to the luminance perceived as white, followed by the computation of the global lightness. In this paper we provide a computational model for automatic decomposition of HDR images into frameworks. We derive a tone mapping operator which predicts lightness perception of the real world scenes and aims at its accurate reproduction on low dynamic range displays. Furthermore, such a decomposition into frameworks opens new grounds for local image analysis in view of human perception.
A three-layer model of natural image statistics.

PubMed

Gutmann, Michael U; Hyvärinen, Aapo

2013-11-01

An important property of visual systems is to be simultaneously both selective to specific patterns found in the sensory input and invariant to possible variations. Selectivity and invariance (tolerance) are opposing requirements. It has been suggested that they could be joined by iterating a sequence of elementary selectivity and tolerance computations. It is, however, unknown what should be selected or tolerated at each level of the hierarchy. We approach this issue by learning the computations from natural images. We propose and estimate a probabilistic model of natural images that consists of three processing layers. Two natural image data sets are considered: image patches, and complete visual scenes downsampled to the size of small patches. For both data sets, we find that in the first two layers, simple and complex cell-like computations are performed. In the third layer, we mainly find selectivity to longer contours; for patch data, we further find some selectivity to texture, while for the downsampled complete scenes, some selectivity to curvature is observed. Copyright © 2013 Elsevier Ltd. All rights reserved.
Three-camera stereo vision for intelligent transportation systems

NASA Astrophysics Data System (ADS)

Bergendahl, Jason; Masaki, Ichiro; Horn, Berthold K. P.

1997-02-01

A major obstacle in the application of stereo vision to intelligent transportation system is high computational cost. In this paper, a PC based three-camera stereo vision system constructed with off-the-shelf components is described. The system serves as a tool for developing and testing robust algorithms which approach real-time performance. We present an edge based, subpixel stereo algorithm which is adapted to permit accurate distance measurements to objects in the field of view using a compact camera assembly. Once computed, the 3D scene information may be directly applied to a number of in-vehicle applications, such as adaptive cruise control, obstacle detection, and lane tracking. Moreover, since the largest computational costs is incurred in generating the 3D scene information, multiple applications that leverage this information can be implemented in a single system with minimal cost. On-road applications, such as vehicle counting and incident detection, are also possible. Preliminary in-vehicle road trial results are presented.
Generation and physical characteristics of the LANDSAT-1, -2 and -3 MSS computer compatible tapes

NASA Technical Reports Server (NTRS)

Thomas, V. L.

1977-01-01

The generation and format of the LANDSAT 1, 2, and 3 system corrected multispectral scanner computer compatible tapes are discussed. Included in the discussion are the spacecraft sensors, scene characteristics, the transmission of data, and the conversion of the data to computer compatible tapes. Also included in the discussion are geometric and radiometric corrections, tape formats, and the physical characteristics of the tape.
Generation and physical characteristics of the Landsat 1 and 2 MSS computer compatible tapes

NASA Technical Reports Server (NTRS)

Thomas, V. L.

1975-01-01

The generation and format is discussed of the Landsat 1 and 2 system corrected multispectral scanner computer compatible tapes. Included in the discussion are the spacecraft sensors, scene characteristics, the transmission of data, and the conversion of the data to computer compatible tapes at the NASA Data Processing Facility. Geometric and radiometric corrections, tape formats, and the physical characteristics of the tape are also described.
The representation of visual depth perception based on the plenoptic function in the retina and its neural computation in visual cortex V1.

PubMed

Songnian, Zhao; Qi, Zou; Chang, Liu; Xuemin, Liu; Shousi, Sun; Jun, Qiu

2014-04-23

How it is possible to "faithfully" represent a three-dimensional stereoscopic scene using Cartesian coordinates on a plane, and how three-dimensional perceptions differ between an actual scene and an image of the same scene are questions that have not yet been explored in depth. They seem like commonplace phenomena, but in fact, they are important and difficult issues for visual information processing, neural computation, physics, psychology, cognitive psychology, and neuroscience. The results of this study show that the use of plenoptic (or all-optical) functions and their dual plane parameterizations can not only explain the nature of information processing from the retina to the primary visual cortex and, in particular, the characteristics of the visual pathway's optical system and its affine transformation, but they can also clarify the reason why the vanishing point and line exist in a visual image. In addition, they can better explain the reasons why a three-dimensional Cartesian coordinate system can be introduced into the two-dimensional plane to express a real three-dimensional scene. 1. We introduce two different mathematical expressions of the plenoptic functions, Pw and Pv that can describe the objective world. We also analyze the differences between these two functions when describing visual depth perception, that is, the difference between how these two functions obtain the depth information of an external scene.2. The main results include a basic method for introducing a three-dimensional Cartesian coordinate system into a two-dimensional plane to express the depth of a scene, its constraints, and algorithmic implementation. In particular, we include a method to separate the plenoptic function and proceed with the corresponding transformation in the retina and visual cortex.3. We propose that size constancy, the vanishing point, and vanishing line form the basis of visual perception of the outside world, and that the introduction of a three-dimensional Cartesian coordinate system into a two dimensional plane reveals a corresponding mapping between a retinal image and the vanishing point and line.
The representation of visual depth perception based on the plenoptic function in the retina and its neural computation in visual cortex V1

PubMed Central

2014-01-01

Background How it is possible to “faithfully” represent a three-dimensional stereoscopic scene using Cartesian coordinates on a plane, and how three-dimensional perceptions differ between an actual scene and an image of the same scene are questions that have not yet been explored in depth. They seem like commonplace phenomena, but in fact, they are important and difficult issues for visual information processing, neural computation, physics, psychology, cognitive psychology, and neuroscience. Results The results of this study show that the use of plenoptic (or all-optical) functions and their dual plane parameterizations can not only explain the nature of information processing from the retina to the primary visual cortex and, in particular, the characteristics of the visual pathway’s optical system and its affine transformation, but they can also clarify the reason why the vanishing point and line exist in a visual image. In addition, they can better explain the reasons why a three-dimensional Cartesian coordinate system can be introduced into the two-dimensional plane to express a real three-dimensional scene. Conclusions 1. We introduce two different mathematical expressions of the plenoptic functions, P w and P v that can describe the objective world. We also analyze the differences between these two functions when describing visual depth perception, that is, the difference between how these two functions obtain the depth information of an external scene. 2. The main results include a basic method for introducing a three-dimensional Cartesian coordinate system into a two-dimensional plane to express the depth of a scene, its constraints, and algorithmic implementation. In particular, we include a method to separate the plenoptic function and proceed with the corresponding transformation in the retina and visual cortex. 3. We propose that size constancy, the vanishing point, and vanishing line form the basis of visual perception of the outside world, and that the introduction of a three-dimensional Cartesian coordinate system into a two dimensional plane reveals a corresponding mapping between a retinal image and the vanishing point and line. PMID:24755246
Signature modelling and radiometric rendering equations in infrared scene simulation systems

NASA Astrophysics Data System (ADS)

Willers, Cornelius J.; Willers, Maria S.; Lapierre, Fabian

2011-11-01

The development and optimisation of modern infrared systems necessitates the use of simulation systems to create radiometrically realistic representations (e.g. images) of infrared scenes. Such simulation systems are used in signature prediction, the development of surveillance and missile sensors, signal/image processing algorithm development and aircraft self-protection countermeasure system development and evaluation. Even the most cursory investigation reveals a multitude of factors affecting the infrared signatures of realworld objects. Factors such as spectral emissivity, spatial/volumetric radiance distribution, specular reflection, reflected direct sunlight, reflected ambient light, atmospheric degradation and more, all affect the presentation of an object's instantaneous signature. The signature is furthermore dynamically varying as a result of internal and external influences on the object, resulting from the heat balance comprising insolation, internal heat sources, aerodynamic heating (airborne objects), conduction, convection and radiation. In order to accurately render the object's signature in a computer simulation, the rendering equations must therefore account for all the elements of the signature. In this overview paper, the signature models, rendering equations and application frameworks of three infrared simulation systems are reviewed and compared. The paper first considers the problem of infrared scene simulation in a framework for simulation validation. This approach provides concise definitions and a convenient context for considering signature models and subsequent computer implementation. The primary radiometric requirements for an infrared scene simulator are presented next. The signature models and rendering equations implemented in OSMOSIS (Belgian Royal Military Academy), DIRSIG (Rochester Institute of Technology) and OSSIM (CSIR & Denel Dynamics) are reviewed. In spite of these three simulation systems' different application focus areas, their underlying physics-based approach is similar. The commonalities and differences between the different systems are investigated, in the context of their somewhat different application areas. The application of an infrared scene simulation system towards the development of imaging missiles and missile countermeasures are briefly described. Flowing from the review of the available models and equations, recommendations are made to further enhance and improve the signature models and rendering equations in infrared scene simulators.
Nodular Fasciitis of External Auditory Canal

PubMed Central

Ahn, Jihyun; Kim, Sunyoung; Park, Youngsil

2016-01-01

Nodular fasciitis is a pseudosarcomatous reactive process composed of fibroblasts and myofibroblasts, and it is most common in the upper extremities. Nodular fasciitis of the external auditory canal is rare. To the best of our knowledge, less than 20 cases have been reported to date. We present a case of nodular fasciitis arising in the cartilaginous part of the external auditory canal. A 19-year-old man complained of an auricular mass with pruritus. Computed tomography showed a 1.7 cm sized soft tissue mass in the right external auditory canal, and total excision was performed. Histologic examination revealed spindle or stellate cells proliferation in a fascicular and storiform pattern. Lymphoid cells and erythrocytes were intermixed with tumor cells. The stroma was myxoid to hyalinized with a few microcysts. The tumor cells were immunoreactive for smooth muscle actin, but not for desmin, caldesmon, CD34, S-100, anaplastic lymphoma kinase, and cytokeratin. The patient has been doing well during the 1 year follow-up period. PMID:27304679

Some links on this page may take you to non-federal websites. Their policies may differ from this site.