Image Feature Types and Their Predictions of Aesthetic Preference and Naturalness
Ibarra, Frank F.; Kardan, Omid; Hunter, MaryCarol R.; Kotabe, Hiroki P.; Meyer, Francisco A. C.; Berman, Marc G.
2017-01-01
Previous research has investigated ways to quantify visual information of a scene in terms of a visual processing hierarchy, i.e., making sense of the visual environment by segmentation and integration of elementary sensory input. Guided by this research, studies have developed categories for low-level visual features (e.g., edges, colors) and high-level visual features (scene-level entities that convey semantic information such as objects), and have examined how models of those features predict aesthetic preference and naturalness. For example, in Kardan et al. (2015a), 52 participants provided aesthetic preference and naturalness ratings, which are used in the current study, for 307 images of mixed natural and urban content. Kardan et al. (2015a) then developed a model using low-level features to predict aesthetic preference and naturalness and could do so with high accuracy. What has yet to be explored is the ability of higher-level visual features (e.g., horizon line position relative to viewer, geometry of building distribution relative to visual access) to predict aesthetic preference and naturalness of scenes, and whether higher-level features mediate some of the association between the low-level features and aesthetic preference or naturalness. In this study we investigated these relationships and found that low- and high-level features explain 68.4% of the variance in aesthetic preference ratings and 88.7% of the variance in naturalness ratings. Additionally, several high-level features mediated the relationship between the low-level visual features and aesthetic preference. In a multiple mediation analysis, the high-level feature mediators accounted for over 50% of the variance in predicting aesthetic preference. These results show that high-level visual features play a prominent role in predicting aesthetic preference, but do not completely eliminate the predictive power of the low-level visual features.
These strong predictors provide powerful insights for future research relating to landscape and urban design with the aim of maximizing subjective well-being, which could lead to improved health outcomes on a larger scale. PMID:28503158
An integrative view of storage of low- and high-level visual dimensions in visual short-term memory.
Magen, Hagit
2017-03-01
Efficient performance in an environment filled with complex objects is often achieved through the temporal maintenance of conjunctions of features from multiple dimensions. The most striking finding in the study of binding in visual short-term memory (VSTM) is equal memory performance for single features and for integrated multi-feature objects, a finding that has been central to several theories of VSTM. Nevertheless, research on binding in VSTM focused almost exclusively on low-level features, and little is known about how items from low- and high-level visual dimensions (e.g., colored manmade objects) are maintained simultaneously in VSTM. The present study tested memory for combinations of low-level features and high-level representations. In agreement with previous findings, Experiments 1 and 2 showed decrements in memory performance when non-integrated low- and high-level stimuli were maintained simultaneously compared to maintaining each dimension in isolation. However, contrary to previous findings the results of Experiments 3 and 4 showed decrements in memory performance even when integrated objects of low- and high-level stimuli were maintained in memory, compared to maintaining single-dimension objects. Overall, the results demonstrate that low- and high-level visual dimensions compete for the same limited memory capacity, and offer a more comprehensive view of VSTM.
A Novel Image Retrieval Based on Visual Words Integration of SIFT and SURF
Ali, Nouman; Bajwa, Khalid Bashir; Sablatnig, Robert; Chatzichristofis, Savvas A.; Iqbal, Zeshan; Rashid, Muhammad; Habib, Hafiz Adnan
2016-01-01
With the recent evolution of technology, the number of image archives has increased exponentially. In Content-Based Image Retrieval (CBIR), high-level visual information is represented in the form of low-level features. The semantic gap between the low-level features and the high-level image concepts is an open research problem. In this paper, we present a novel visual words integration of Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF). These two local feature representations are selected for image retrieval because SIFT is more robust to changes in scale and rotation, while SURF is robust to changes in illumination. The visual words integration of SIFT and SURF adds the robustness of both features to image retrieval. The qualitative and quantitative comparisons conducted on the Corel-1000, Corel-1500, Corel-2000, Oliva and Torralba, and Ground Truth image benchmarks demonstrate the effectiveness of the proposed visual words integration. PMID:27315101
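As an illustration of what a "visual words integration" might look like, the following minimal sketch quantizes descriptors against two separate vocabularies and concatenates the resulting histograms. This is only one plausible reading: the abstract does not say how the two vocabularies are combined, and the function names, the toy 2-D descriptors, and the hand-picked vocabularies are all hypothetical (real SIFT descriptors are typically 128-dimensional, and vocabularies are normally learned by clustering).

```python
from collections import Counter
import math


def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))


def bow_histogram(descriptors, vocabulary):
    """Count how many descriptors are assigned to each visual word."""
    counts = Counter(nearest_word(d, vocabulary) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(vocabulary))]


def integrated_histogram(sift_desc, sift_vocab, surf_desc, surf_vocab):
    """Assumed integration: concatenate the SIFT and SURF word histograms."""
    return bow_histogram(sift_desc, sift_vocab) + bow_histogram(surf_desc, surf_vocab)


# Hypothetical toy data standing in for real descriptors and learned vocabularies.
sift_vocab = [(0.0, 0.0), (1.0, 1.0)]
surf_vocab = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]
sift_desc = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.8)]
surf_desc = [(0.1, 0.9), (0.5, 0.6)]

signature = integrated_histogram(sift_desc, sift_vocab, surf_desc, surf_vocab)
print(signature)  # -> [1, 2, 1, 0, 1]
```

Two images could then be compared by any histogram distance over their signatures; which distance or classifier the paper uses is not stated in the abstract.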
Visual search deficits in amblyopia.
Tsirlin, Inna; Colpa, Linda; Goltz, Herbert C; Wong, Agnes M F
2018-04-01
Amblyopia is a neurodevelopmental disorder defined as a reduction in visual acuity that cannot be corrected by optical means. It has been associated with low-level deficits. However, research has demonstrated a link between amblyopia and visual attention deficits in counting, tracking, and identifying objects. Visual search is a useful tool for assessing visual attention but has not been well studied in amblyopia. Here, we assessed the extent of visual search deficits in amblyopia using feature and conjunction search tasks. We compared the performance of participants with amblyopia (n = 10) to that of controls (n = 12) on both feature and conjunction search tasks using Gabor patch stimuli, varying spatial bandwidth and orientation. To account for the low-level deficits inherent in amblyopia, we measured individual contrast and crowding thresholds and monitored eye movements. The display elements were then presented at suprathreshold levels to ensure that visibility was equalized across groups. There was no performance difference between groups on feature search, indicating that our experimental design controlled successfully for low-level amblyopia deficits. In contrast, during conjunction search, median reaction times and reaction time slopes were significantly larger in participants with amblyopia compared with controls. Amblyopia differentially affects performance on conjunction visual search, a more difficult task that requires feature binding and possibly the involvement of higher-level attention processes. Deficits in visual search may affect day-to-day functioning in people with amblyopia.
Kotabe, Hiroki P; Kardan, Omid; Berman, Marc G
2017-08-01
Natural environments have powerful aesthetic appeal linked to their capacity for psychological restoration. In contrast, disorderly environments are aesthetically aversive, and have various detrimental psychological effects. But in our research, we have repeatedly found that natural environments are perceptually disorderly. What could explain this paradox? We present 3 competing hypotheses: the aesthetic preference for naturalness is more powerful than the aesthetic aversion to disorder (the nature-trumps-disorder hypothesis); disorder is trivial to aesthetic preference in natural contexts (the harmless-disorder hypothesis); and disorder is aesthetically preferred in natural contexts (the beneficial-disorder hypothesis). Utilizing novel methods of perceptual study and diverse stimuli, we rule in the nature-trumps-disorder hypothesis and rule out the harmless-disorder and beneficial-disorder hypotheses. In examining perceptual mechanisms, we find evidence that high-level scene semantics are both necessary and sufficient for the nature-trumps-disorder effect. Necessity is evidenced by the effect disappearing in experiments utilizing only low-level visual stimuli (i.e., where scene semantics have been removed) and experiments utilizing a rapid-scene-presentation procedure that obscures scene semantics. Sufficiency is evidenced by the effect reappearing in experiments utilizing noun stimuli which remove low-level visual features. Furthermore, we present evidence that the interaction of scene semantics with low-level visual features amplifies the nature-trumps-disorder effect: the effect is weaker both when statistically adjusting for quantified low-level visual features and when using noun stimuli which remove low-level visual features. These results have implications for psychological theories bearing on the joint influence of low- and high-level perceptual inputs on affect and cognition, as well as for aesthetic design.
(PsycINFO Database Record (c) 2017 APA, all rights reserved).
Modeling of pilot's visual behavior for low-level flight
NASA Astrophysics Data System (ADS)
Schulte, Axel; Onken, Reiner
1995-06-01
Developers of synthetic vision systems for low-level flight simulators face the problem of deciding which features to incorporate in order to achieve the most realistic training conditions. This paper supports an approach to this problem on the basis of modeling the pilot's visual behavior. This approach is founded upon the basic requirement that the pilot's mechanisms of visual perception should be identical in simulated and real low-level flight. Flight simulator experiments with pilots were conducted for knowledge acquisition. During the experiments, video material of a real low-level flight mission containing different situations was displayed to the pilot, who was acting under a realistic mission assignment in a laboratory environment. The pilot's eye movements were measured during the replay. The visual mechanisms were divided into rule-based strategies for visual navigation, based on the preflight planning process, as opposed to skill-based processes. The paper results in a model of the pilot's planning strategy of a visual fixing routine as part of the navigation task. The model is a knowledge-based system built upon the fuzzy evaluation of terrain features in order to determine the landmarks used by pilots. It can be shown that a computer implementation of the model selects the same features that trained pilots preferred.
Contini, Erika W; Wardle, Susan G; Carlson, Thomas A
2017-10-01
Visual object recognition is a complex, dynamic process. Multivariate pattern analysis methods, such as decoding, have begun to reveal how the brain processes complex visual information. Recently, temporal decoding methods for EEG and MEG have offered the potential to evaluate the temporal dynamics of object recognition. Here we review the contribution of M/EEG time-series decoding methods to understanding visual object recognition in the human brain. Consistent with the current understanding of the visual processing hierarchy, low-level visual features dominate decodable object representations early in the time-course, with more abstract representations related to object category emerging later. A key finding is that the time-course of object processing is highly dynamic and rapidly evolving, with limited temporal generalisation of decodable information. Several studies have examined the emergence of object category structure, and we consider to what degree category decoding can be explained by sensitivity to low-level visual features. Finally, we evaluate recent work attempting to link human behaviour to the neural time-course of object processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
Research on metallic material defect detection based on bionic sensing of human visual properties
NASA Astrophysics Data System (ADS)
Zhang, Pei Jiang; Cheng, Tao
2018-05-01
Because the human visual system can quickly lock onto areas of interest in a complex natural environment and focus on them, this paper proposes a bionic sensing visual inspection model, based on the human visual attention mechanism and simulating human visual imaging features, to detect defects of metallic materials in the mechanical field. First, according to the biologically salient low-level visual features, defect marks drawn from experience are used as the intermediate features of simulated visual perception. An SVM is then trained on the high-level features of visual defects of metallic material. According to the weight of each part, a detection model of metallic material defects that simulates human visual characteristics is obtained.
The perception of naturalness correlates with low-level visual features of environmental scenes.
Berman, Marc G; Hout, Michael C; Kardan, Omid; Hunter, MaryCarol R; Yourganov, Grigori; Henderson, John M; Hanayik, Taylor; Karimi, Hossein; Jonides, John
2014-01-01
Previous research has shown that interacting with natural environments vs. more urban or built environments can have salubrious psychological effects, such as improvements in attention and memory. Even viewing pictures of nature vs. pictures of built environments can produce similar effects. A major question is: What is it about natural environments that produces these benefits? Problematically, there are many differing qualities between natural and urban environments, making it difficult to narrow down the dimensions of nature that may lead to these benefits. In this study, we set out to uncover visual features that related to individuals' perceptions of naturalness in images. We quantified naturalness in two ways: first, implicitly using a multidimensional scaling analysis and second, explicitly with direct naturalness ratings. Features that seemed most related to perceptions of naturalness were related to the density of contrast changes in the scene, the density of straight lines in the scene, the average color saturation in the scene and the average hue diversity in the scene. We then trained a machine-learning algorithm to predict whether a scene was perceived as being natural or not based on these low-level visual features and we could do so with 81% accuracy. As such we were able to reliably predict subjective perceptions of naturalness with objective low-level visual features. Our results can be used in future studies to determine if these features, which are related to naturalness, may also lead to the benefits attained from interacting with nature.
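To make two of the listed features concrete, the sketch below computes an average color saturation and a simple hue-diversity score for a list of RGB pixels. It is a minimal illustration under stated assumptions: the abstract does not define either measure, so the use of HSV saturation and of Shannon entropy over a 12-bin hue histogram is a substitution chosen here, and the function names are hypothetical.

```python
import colorsys
import math


def mean_saturation(pixels):
    """Average HSV saturation over (r, g, b) pixels with channels in 0..255."""
    sats = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in pixels]
    return sum(sats) / len(sats)


def hue_entropy(pixels, bins=12):
    """Shannon entropy (bits) of the hue histogram; an assumed diversity proxy.

    Note: colorsys reports hue 0.0 for achromatic (gray) pixels, so they are
    counted in the same bin as pure red.
    """
    counts = [0] * bins
    for r, g, b in pixels:
        hue = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0]
        counts[min(int(hue * bins), bins - 1)] += 1
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


# Hypothetical four-pixel "image": pure red, green, blue, and a mid gray.
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (128, 128, 128)]
print(mean_saturation(pixels))  # -> 0.75
print(hue_entropy(pixels))      # -> 1.5
```

Scores like these could serve as inputs to a classifier of perceived naturalness; the specific learning algorithm behind the reported 81% accuracy is not named in the abstract.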
Generic decoding of seen and imagined objects using hierarchical visual features.
Horikawa, Tomoyasu; Kamitani, Yukiyasu
2017-05-22
Object recognition is a key function in both human and machine vision. While brain decoding of seen and imagined objects has been achieved, the prediction is limited to training examples. We present a decoding approach for arbitrary objects using the machine vision principle that an object category is represented by a set of features rendered invariant through hierarchical processing. We show that visual features, including those derived from a deep convolutional neural network, can be predicted from fMRI patterns, and that greater accuracy is achieved for low-/high-level features with lower-/higher-level visual areas, respectively. Predicted features are used to identify seen/imagined object categories (extending beyond decoder training) from a set of computed features for numerous object images. Furthermore, decoding of imagined objects reveals progressive recruitment of higher-to-lower visual representations. Our results demonstrate a homology between human and machine vision and its utility for brain-based information retrieval.
Sneve, Markus H; Sreenivasan, Kartik K; Alnæs, Dag; Endestad, Tor; Magnussen, Svein
2015-01-01
Retention of features in visual short-term memory (VSTM) involves maintenance of sensory traces in early visual cortex. However, the mechanism through which this is accomplished is not known. Here, we formulate specific hypotheses derived from studies on feature-based attention to test the prediction that visual cortex is recruited by attentional mechanisms during VSTM of low-level features. Functional magnetic resonance imaging (fMRI) of human visual areas revealed that neural populations coding for task-irrelevant feature information are suppressed during maintenance of detailed spatial frequency memory representations. The narrow spectral extent of this suppression agrees well with known effects of feature-based attention. Additionally, analyses of effective connectivity during maintenance between retinotopic areas in visual cortex show that the observed highlighting of task-relevant parts of the feature spectrum originates in V4, a visual area strongly connected with higher-level control regions and known to convey top-down influence to earlier visual areas during attentional tasks. In line with this property of V4 during attentional operations, we demonstrate that modulations of earlier visual areas during memory maintenance have behavioral consequences, and that these modulations are a result of influences from V4. Copyright © 2014 Elsevier Ltd. All rights reserved.
Visual affective classification by combining visual and text features
Liu, Ningning; Wang, Kai; Jin, Xin; Gao, Boyang; Dellandréa, Emmanuel; Chen, Liming
2017-01-01
Affective analysis of images in social networks has drawn much attention, and the texts surrounding images are proven to provide valuable semantic meanings about image content, which can hardly be represented by low-level visual features. In this paper, we propose a novel approach for visual affective classification (VAC) task. This approach combines visual representations along with novel text features through a fusion scheme based on Dempster-Shafer (D-S) Evidence Theory. Specifically, we not only investigate different types of visual features and fusion methods for VAC, but also propose textual features to effectively capture emotional semantics from the short text associated to images based on word similarity. Experiments are conducted on three public available databases: the International Affective Picture System (IAPS), the Artistic Photos and the MirFlickr Affect set. The results demonstrate that the proposed approach combining visual and textual features provides promising results for VAC task. PMID:28850566
Small numbers are sensed directly, high numbers constructed from size and density.
Zimmermann, Eckart
2018-04-01
Two theories compete to explain how we estimate the numerosity of visual object sets. The first suggests that the apparent numerosity is derived from an analysis of more low-level features like size and density of the set. The second theory suggests that numbers are sensed directly. Consistent with the latter claim is the existence of neurons in parietal cortex which are specialized for processing the numerosity of elements in the visual scene. However, recent evidence suggests that only low numbers can be sensed directly whereas the perception of high numbers is supported by the analysis of low-level features. Processing of low and high numbers, being located at different levels of the neural hierarchy, should involve different receptive field sizes. Here, I tested this idea with visual adaptation. I measured the spatial spread of number adaptation for low and high numerosities. A focused adaptation spread for high numerosities suggested the involvement of early neural levels where receptive fields are comparatively small, and the broad spread for low numerosities was consistent with processing by number neurons, which have larger receptive fields. These results provide evidence for the claim that different mechanisms exist for generating the perception of visual numerosity. Whereas low numbers are sensed directly as a primary visual attribute, the estimation of high numbers likely depends on the area size over which the objects are spread. Copyright © 2017 Elsevier B.V. All rights reserved.
Expectation and Surprise Determine Neural Population Responses in the Ventral Visual Stream
Egner, Tobias; Monti, Jim M.; Summerfield, Christopher
2014-01-01
Visual cortex is traditionally viewed as a hierarchy of neural feature detectors, with neural population responses being driven by bottom-up stimulus features. Conversely, “predictive coding” models propose that each stage of the visual hierarchy harbors two computationally distinct classes of processing unit: representational units that encode the conditional probability of a stimulus and provide predictions to the next lower level; and error units that encode the mismatch between predictions and bottom-up evidence, and forward prediction error to the next higher level. Predictive coding therefore suggests that neural population responses in category-selective visual regions, like the fusiform face area (FFA), reflect a summation of activity related to prediction (“face expectation”) and prediction error (“face surprise”), rather than a homogenous feature detection response. We tested the rival hypotheses of the feature detection and predictive coding models by collecting functional magnetic resonance imaging data from the FFA while independently varying both stimulus features (faces vs houses) and subjects’ perceptual expectations regarding those features (low vs medium vs high face expectation). The effects of stimulus and expectation factors interacted, whereby FFA activity elicited by face and house stimuli was indistinguishable under high face expectation and maximally differentiated under low face expectation. Using computational modeling, we show that these data can be explained by predictive coding but not by feature detection models, even when the latter are augmented with attentional mechanisms. Thus, population responses in the ventral visual stream appear to be determined by feature expectation and surprise rather than by stimulus features per se. PMID:21147999
New insights into ambient and focal visual fixations using an automatic classification algorithm
Follet, Brice; Le Meur, Olivier; Baccino, Thierry
2011-01-01
Overt visual attention is the act of directing the eyes toward a given area. These eye movements are characterised by saccades and fixations. A debate currently surrounds the role of visual fixations. Do they all have the same role in the free viewing of natural scenes? Recent studies suggest that at least two types of visual fixations exist: focal and ambient. The former is believed to be used to inspect local areas accurately, whereas the latter is used to obtain the context of the scene. We investigated the use of an automated system to cluster visual fixations in two groups using four types of natural scene images. We found new evidence to support a focal–ambient dichotomy. Our data indicate that the determining factor is the saccade amplitude. The dependence on the low-level visual features and the time course of these two kinds of visual fixations were examined. Our results demonstrate that there is an interplay between both fixation populations and that focal fixations are more dependent on low-level visual features than are ambient fixations. PMID:23145248
Classification of CT examinations for COPD visual severity analysis
NASA Astrophysics Data System (ADS)
Tan, Jun; Zheng, Bin; Wang, Xingwei; Pu, Jiantao; Gur, David; Sciurba, Frank C.; Leader, J. Ken
2012-03-01
In this study we present a computational method for classifying CT examinations by visually assessed emphysema severity. The visual severity categories ranged from 0 to 5 and were rated by an experienced radiologist. The six categories were none, trace, mild, moderate, severe and very severe. Lung segmentation was performed for every input image, and all image features were extracted from the segmented lung only. We adopted a two-level feature representation method for the classification. Five gray level distribution statistics, six gray level co-occurrence matrix (GLCM) features, and eleven gray level run-length (GLRL) features were computed from the segmented lung depicted in each CT image. Then we used wavelet decomposition to obtain the low- and high-frequency components of the input image, and again extracted six GLCM features and eleven GLRL features from the lung region. Therefore our feature vector length is 56. The CT examinations were classified using a support vector machine (SVM), k-nearest neighbors (KNN), and the traditional threshold (density mask) approach. The SVM classifier had the highest classification performance of all the methods, with an overall sensitivity of 54.4% and a 69.6% sensitivity to discriminate "no" and "trace" visually assessed emphysema. We believe this work may lead to an automated, objective method to categorically classify emphysema severity on CT exams.
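The feature count of 56 can be checked with a short sketch, which also shows what a single GLCM feature looks like in practice. Two points are assumptions rather than details from the abstract: that the wavelet step contributes exactly two components (one low- and one high-frequency), each yielding 6 GLCM and 11 GLRL features, and that contrast is among the six GLCM features, which the abstract does not list. The helper names and the toy 3x3 image are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised gray level co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def glcm_contrast(p):
    """Contrast statistic: sum over (i, j) of p[i][j] * (i - j)**2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Feature bookkeeping following the counts given in the abstract.
N_STATS, N_GLCM, N_GLRL = 5, 6, 11
N_WAVELET_COMPONENTS = 2  # assumption: one low- and one high-frequency component
vector_length = N_STATS + N_GLCM + N_GLRL + N_WAVELET_COMPONENTS * (N_GLCM + N_GLRL)
print(vector_length)  # -> 56

# Hypothetical 3x3 image quantised to three gray levels.
image = [[0, 0, 1],
         [1, 2, 2],
         [0, 1, 2]]
contrast = glcm_contrast(glcm(image, levels=3))
print(contrast)
```

Under these assumptions the arithmetic is 5 + 6 + 11 = 22 features from the original image plus 2 x 17 = 34 from the wavelet components, which matches the stated length of 56.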
Visual Saliency Detection Based on Multiscale Deep CNN Features.
Guanbin Li; Yizhou Yu
2016-11-01
Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision. In this paper, we discover that a high-quality visual saliency model can be learned from multiscale features extracted using deep convolutional neural networks (CNNs), which have had many successes in visual recognition tasks. For learning such saliency models, we introduce a neural network architecture, which has fully connected layers on top of CNNs responsible for feature extraction at three different scales. The penultimate layer of our neural network has been confirmed to be a discriminative high-level feature vector for saliency detection, which we call deep contrast feature. To generate a more robust feature, we integrate handcrafted low-level features with our deep contrast feature. To promote further research and evaluation of visual saliency models, we also construct a new large database of 4447 challenging images and their pixelwise saliency annotations. Experimental results demonstrate that our proposed method is capable of achieving the state-of-the-art performance on all public benchmarks, improving the F-measure by 6.12% and 10%, respectively, on the DUT-OMRON data set and our new data set (HKU-IS), and lowering the mean absolute error by 9% and 35.3%, respectively, on these two data sets.
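The two evaluation measures named above can be sketched as follows for a flattened saliency map and a binary ground-truth mask. This is a minimal illustration, not the authors' evaluation code: the fixed binarisation threshold of 0.5 and the weighting beta squared = 0.3 (a convention often used in saliency evaluation) are assumptions, since the abstract gives neither, and the function names are hypothetical.

```python
def mean_absolute_error(saliency, ground_truth):
    """Mean absolute difference between saliency values in [0, 1] and a 0/1 mask."""
    return sum(abs(s - g) for s, g in zip(saliency, ground_truth)) / len(saliency)


def f_measure(saliency, ground_truth, threshold=0.5, beta_sq=0.3):
    """Weighted F-measure after binarising the saliency map at `threshold`.

    Both `threshold` and `beta_sq` are assumed values, not taken from the abstract.
    """
    predicted = [s >= threshold for s in saliency]
    true_pos = sum(1 for p, g in zip(predicted, ground_truth) if p and g)
    n_predicted = sum(predicted)
    n_actual = sum(ground_truth)
    if true_pos == 0 or n_predicted == 0 or n_actual == 0:
        return 0.0
    precision = true_pos / n_predicted
    recall = true_pos / n_actual
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Hypothetical four-pixel example.
saliency = [0.9, 0.8, 0.3, 0.1]
ground_truth = [1, 0, 1, 0]
print(mean_absolute_error(saliency, ground_truth))
print(f_measure(saliency, ground_truth))
```

With beta squared below 1 the score weights precision more heavily than recall, so a map that highlights few but correct pixels scores better than one that covers the object along with much background.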
Art Expertise Reduces Influence of Visual Salience on Fixation in Viewing Abstract-Paintings
Koide, Naoko; Kubo, Takatomi; Nishida, Satoshi; Shibata, Tomohiro; Ikeda, Kazushi
2015-01-01
When viewing a painting, artists perceive more information from the painting on the basis of their experience and knowledge than art novices do. This difference can be reflected in eye scan paths during viewing of paintings. Distributions of scan paths of artists are different from those of novices even when the paintings contain no figurative object (i.e. abstract paintings). There are two possible explanations for this difference of scan paths. One is that artists have high sensitivity to high-level features such as textures and composition of colors and therefore their fixations are more driven by such features compared with novices. The other is that fixations of artists are more attracted by salient features than those of novices and the fixations are driven by low-level features. To test these, we measured eye fixations of artists and novices during the free viewing of various abstract paintings and compared the distribution of their fixations for each painting with a topological attentional map that quantifies the conspicuity of low-level features in the painting (i.e. saliency map). We found that the fixation distribution of artists was more distinguishable from the saliency map than that of novices. This difference indicates that fixations of artists are less driven by low-level features than those of novices. Our result suggests that artists may extract visual information from paintings based on high-level features. This ability of artists may be associated with artists’ deep aesthetic appreciation of paintings. PMID:25658327
Oculomotor guidance and capture by irrelevant faces.
Devue, Christel; Belopolsky, Artem V; Theeuwes, Jan
2012-01-01
Even though it is generally agreed that face stimuli constitute a special class of stimuli, which are treated preferentially by our visual system, it remains unclear whether faces can capture attention in a stimulus-driven manner. Moreover, there is a long-standing debate regarding the mechanism underlying the preferential bias of selecting faces. Some claim that faces constitute a set of special low-level features to which our visual system is tuned; others claim that the visual system is capable of extracting the meaning of faces very rapidly, driving attentional selection. Those debates continue because many studies contain methodological peculiarities and manipulations that prevent a definitive conclusion. Here, we present a new visual search task in which observers had to make a saccade to a uniquely colored circle while completely irrelevant objects were also present in the visual field. The results indicate that faces capture and guide the eyes more than other animated objects and that our visual system is not only tuned to the low-level features that make up a face but also to its meaning.
The effect of spatial attention on invisible stimuli.
Shin, Kilho; Stolte, Moritz; Chong, Sang Chul
2009-10-01
The influence of selective attention on visual processing is widespread. Recent studies have demonstrated that spatial attention can affect processing of invisible stimuli. However, it has been suggested that this effect is limited to low-level features, such as line orientations. The present experiments investigated whether spatial attention can influence both low-level (contrast threshold) and high-level (gender discrimination) adaptation, using the same method of attentional modulation for both types of stimuli. We found that spatial attention was able to increase the amount of adaptation to low- as well as to high-level invisible stimuli. These results suggest that attention can influence perceptual processes independent of visual awareness.
Fox, Olivia M.; Harel, Assaf; Bennett, Kevin B.
2017-01-01
The perception of a visual stimulus is dependent not only upon local features, but also on the arrangement of those features. When stimulus features are perceptually well organized (e.g., symmetric or parallel), a global configuration with a high degree of salience emerges from the interactions between these features, often referred to as emergent features. Emergent features can be demonstrated in the Configural Superiority Effect (CSE): presenting a stimulus within an organized context relative to its presentation in a disarranged one results in better performance. Prior neuroimaging work on the perception of emergent features regards the CSE as an “all or none” phenomenon, focusing on the contrast between configural and non-configural stimuli. However, it is still not clear how emergent features are processed between these two endpoints. The current study examined the extent to which behavioral and neuroimaging markers of emergent features are responsive to the degree of configurality in visual displays. Subjects were tasked with reporting the anomalous quadrant in a visual search task while being scanned. Degree of configurality was manipulated by incrementally varying the rotational angle of low-level features within the stimulus arrays. Behaviorally, we observed faster response times with increasing levels of configurality. These behavioral changes were accompanied by increases in response magnitude across multiple visual areas in occipito-temporal cortex, primarily early visual cortex and object-selective cortex. Our findings suggest that the neural correlates of emergent features can be observed even in response to stimuli that are not fully configural, and demonstrate that configural information is already present at early stages of the visual hierarchy. PMID:28167924
Groen, Iris I A; Silson, Edward H; Baker, Chris I
2017-02-19
Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).
2017-01-01
Visual scene analysis in humans has been characterized by the presence of regions in extrastriate cortex that are selectively responsive to scenes compared with objects or faces. While these regions have often been interpreted as representing high-level properties of scenes (e.g. category), they also exhibit substantial sensitivity to low-level (e.g. spatial frequency) and mid-level (e.g. spatial layout) properties, and it is unclear how these disparate findings can be united in a single framework. In this opinion piece, we suggest that this problem can be resolved by questioning the utility of the classical low- to high-level framework of visual perception for scene processing, and discuss why low- and mid-level properties may be particularly diagnostic for the behavioural goals specific to scene perception as compared to object recognition. In particular, we highlight the contributions of low-level vision to scene representation by reviewing (i) retinotopic biases and receptive field properties of scene-selective regions and (ii) the temporal dynamics of scene perception that demonstrate overlap of low- and mid-level feature representations with those of scene category. We discuss the relevance of these findings for scene perception and suggest a more expansive framework for visual scene analysis. This article is part of the themed issue ‘Auditory and visual scene analysis’. PMID:28044013
Eguchi, Akihiro; Isbister, James B; Ahmad, Nasir; Stringer, Simon
2018-07-01
We present a hierarchical neural network model, in which subpopulations of neurons develop fixed and regularly repeating temporal chains of spikes (polychronization), which respond specifically to randomized Poisson spike trains representing the input training images. The performance is improved by including top-down and lateral synaptic connections, as well as introducing multiple synaptic contacts between each pair of pre- and postsynaptic neurons, with different synaptic contacts having different axonal delays. Spike-timing-dependent plasticity thus allows the model to select the most effective axonal transmission delay between neurons. Furthermore, neurons representing the binding relationship between low-level and high-level visual features emerge through visually guided learning. This begins to provide a way forward to solving the classic feature binding problem in visual neuroscience and leads to a new hypothesis concerning how information about visual features at every spatial scale may be projected upward through successive neuronal layers. We name this hypothetical upward projection of information the "holographic principle." (PsycINFO Database Record (c) 2018 APA, all rights reserved).
The sensory components of high-capacity iconic memory and visual working memory.
Bradley, Claire; Pearson, Joel
2012-01-01
Early visual memory can be split into two primary components: a high-capacity, short-lived iconic memory followed by a limited-capacity visual working memory that can last many seconds. Whereas a large number of studies have investigated visual working memory for low-level sensory features, much research on iconic memory has used more "high-level" alphanumeric stimuli such as letters or numbers. These two forms of memory are typically examined separately, despite an intrinsic overlap in their characteristics. Here, we used a purely sensory paradigm to examine visual short-term memory for 10 homogeneous items of three different visual features (color, orientation and motion) across a range of durations from 0 to 6 s. We found that the amount of information stored in iconic memory is smaller for motion than for color or orientation. Performance declined exponentially with longer storage durations and reached chance levels after ∼2 s. Further experiments showed that performance for the 10 items at 1 s was contingent on unperturbed attentional resources. In addition, for orientation stimuli, performance was contingent on the location of stimuli in the visual field, especially for short cue delays. Overall, our results suggest a smooth transition between an automatic, high-capacity, feature-specific sensory-iconic memory, and an effortful "lower-capacity" visual working memory.
Dynamic interactions between visual working memory and saccade target selection
Schneegans, Sebastian; Spencer, John P.; Schöner, Gregor; Hwang, Seongmin; Hollingworth, Andrew
2014-01-01
Recent psychophysical experiments have shown that working memory for visual surface features interacts with saccadic motor planning, even in tasks where the saccade target is unambiguously specified by spatial cues. Specifically, a match between a memorized color and the color of either the designated target or a distractor stimulus influences saccade target selection, saccade amplitudes, and latencies in a systematic fashion. To elucidate these effects, we present a dynamic neural field model in combination with new experimental data. The model captures the neural processes underlying visual perception, working memory, and saccade planning relevant to the psychophysical experiment. It consists of a low-level visual sensory representation that interacts with two separate pathways: a spatial pathway implementing spatial attention and saccade generation, and a surface feature pathway implementing color working memory and feature attention. Due to bidirectional coupling between visual working memory and feature attention in the model, the working memory content can indirectly exert an effect on perceptual processing in the low-level sensory representation. This in turn biases saccadic movement planning in the spatial pathway, allowing the model to quantitatively reproduce the observed interaction effects. The continuous coupling between representations in the model also implies that modulation should be bidirectional, and model simulations provide specific predictions for complementary effects of saccade target selection on visual working memory. These predictions were empirically confirmed in a new experiment: Memory for a sample color was biased toward the color of a task-irrelevant saccade target object, demonstrating the bidirectional coupling between visual working memory and perceptual processing. PMID:25228628
Recovery of a crowded object by masking the flankers: Determining the locus of feature integration
Chakravarthi, Ramakrishna; Cavanagh, Patrick
2009-01-01
Object recognition is a central function of the visual system. As a first step, the features of an object are registered; these independently encoded features are then bound together to form a single representation. Here we investigate the locus of this “feature integration” by examining crowding, a striking breakdown of this process. Crowding, an inability to identify a peripheral target surrounded by flankers, results from “excessive integration” of target and flanker features. We presented a standard crowding display with a target C flanked by four flanker C's in the periphery. We then masked only the flankers (but not the target) with one of three kinds of masks—noise, metacontrast, and object substitution—each of which interferes at progressively higher levels of visual processing. With noise and metacontrast masks (low-level masking), the crowded target was recovered, whereas with object substitution masks (high-level masking), it was not. This places a clear upper bound on the locus of interference in crowding suggesting that crowding is not a low-level phenomenon. We conclude that feature integration, which underlies crowding, occurs prior to the locus of object substitution masking. Further, our results indicate that the integrity of the flankers, but not their identification, is crucial for crowding to occur. PMID:19810785
Awareness Becomes Necessary Between Adaptive Pattern Coding of Open and Closed Curvatures
Sweeny, Timothy D.; Grabowecky, Marcia; Suzuki, Satoru
2012-01-01
Visual pattern processing becomes increasingly complex along the ventral pathway, from the low-level coding of local orientation in the primary visual cortex to the high-level coding of face identity in temporal visual areas. Previous research using pattern aftereffects as a psychophysical tool to measure activation of adaptive feature coding has suggested that awareness is relatively unimportant for the coding of orientation, but awareness is crucial for the coding of face identity. We investigated where along the ventral visual pathway awareness becomes crucial for pattern coding. Monoptic masking, which interferes with neural spiking activity in low-level processing while preserving awareness of the adaptor, eliminated open-curvature aftereffects but preserved closed-curvature aftereffects. In contrast, dichoptic masking, which spares spiking activity in low-level processing while wiping out awareness, preserved open-curvature aftereffects but eliminated closed-curvature aftereffects. This double dissociation suggests that adaptive coding of open and closed curvatures straddles the divide between weakly and strongly awareness-dependent pattern coding. PMID:21690314
Gelbard-Sagiv, Hagar; Faivre, Nathan; Mudrik, Liad; Koch, Christof
2016-01-01
The scope and limits of unconscious processing are a matter of ongoing debate. Lately, continuous flash suppression (CFS), a technique for suppressing visual stimuli, has been widely used to demonstrate surprisingly high-level processing of invisible stimuli. Yet, recent studies showed that CFS might actually allow low-level features of the stimulus to escape suppression and be consciously perceived. The influence of such low-level awareness on high-level processing might easily go unnoticed, as studies usually only probe the visibility of the feature of interest, and not that of lower-level features. For instance, face identity is held to be processed unconsciously since subjects who fail to judge the identity of suppressed faces still show identity priming effects. Here we challenge these results, showing that such high-level priming effects are indeed induced by faces whose identity is invisible, but critically, only when a lower-level feature, such as color or location, is visible. No evidence for identity processing was found when subjects had no conscious access to any feature of the suppressed face. These results suggest that high-level processing of an image might be enabled by, or co-occur with, conscious access to some of its low-level features, even when these features are not relevant to the processed dimension. Accordingly, they call for further investigation of lower-level awareness during CFS, and reevaluation of other unconscious high-level processing findings.
Learning and Recognition of Clothing Genres From Full-Body Images.
Hidayati, Shintami C; You, Chuang-Wen; Cheng, Wen-Huang; Hua, Kai-Lung
2018-05-01
According to the theory of clothing design, the genres of clothes can be recognized based on a set of visually differentiable style elements, which exhibit salient features of visual appearance and reflect high-level fashion styles for better describing clothing genres. Instead of using less-discriminative low-level features or ambiguous keywords to identify clothing genres, we proposed a novel approach for automatically classifying clothing genres based on the visually differentiable style elements. A set of style elements that are crucial for recognizing specific visual styles of clothing genres were identified based on the clothing design theory. In addition, the corresponding salient visual features of each style element were identified and formulated with variables that can be computationally derived with various computer vision algorithms. To evaluate the performance of our algorithm, a dataset containing 3250 full-body shots crawled from popular online stores was built. Recognition results show that our proposed algorithms achieved promising overall precision, recall, and F-score of 88.76%, 88.53%, and 88.64% for recognizing upperwear genres, and 88.21%, 88.17%, and 88.19% for recognizing lowerwear genres, respectively. The effectiveness of each style element and its visual features on recognizing clothing genres was demonstrated through a set of experiments involving different sets of style elements or features. In summary, our experimental results demonstrate the effectiveness of the proposed method in clothing genre recognition.
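The three figures reported for each garment class can be checked against one another. The sketch below assumes the third figure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F1, an assumption here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
upper = f_score(88.76, 88.53)  # upperwear genres
lower = f_score(88.21, 88.17)  # lowerwear genres

print(f"upperwear: {upper:.2f}  lowerwear: {lower:.2f}")
```

Under that assumption, both values round to the third figure given for each class, so the reported triplets are internally consistent.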
A computational visual saliency model based on statistics and machine learning.
Lin, Ru-Je; Lin, Wei-Song
2014-08-01
Identifying the type of stimuli that attracts human visual attention has been an appealing topic for scientists for many years. In particular, marking the salient regions in images is useful for both psychologists and many computer vision applications. In this paper, we propose a computational approach for producing saliency maps using statistics and machine learning methods. Based on four assumptions, three properties (Feature-Prior, Position-Prior, and Feature-Distribution) can be derived and combined by a simple intersection operation to obtain a saliency map. These properties are implemented by a similarity computation, support vector regression (SVR) technique, statistical analysis of training samples, and information theory using low-level features. This technique is able to learn the preferences of human visual behavior while simultaneously considering feature uniqueness. Experimental results show that our approach performs better in predicting human visual attention regions than 12 other models in two test databases. © 2014 ARVO.
Advanced biologically plausible algorithms for low-level image processing
NASA Astrophysics Data System (ADS)
Gusakova, Valentina I.; Podladchikova, Lubov N.; Shaposhnikov, Dmitry G.; Markin, Sergey N.; Golovan, Alexander V.; Lee, Seong-Whan
1999-08-01
At present, in computer vision, the approach based on modeling the biological vision mechanisms is extensively developed. However, up to now, real-world image processing has no effective solution in the frameworks of either biologically inspired or conventional approaches. Evidently, new algorithms and system architectures based on advanced biological motivation should be developed to solve the computational problems related to this visual task. A basic problem that must be solved to create an effective artificial visual system for processing real-world images is the search for new low-level image processing algorithms, which to a great extent determine system performance. In the present paper, the results of psychophysical experiments and several advanced biologically motivated algorithms for low-level processing are presented. These algorithms are based on a local space-variant filter, context encoding of visual information presented in the center of the input window, and automatic detection of perceptually important image fragments. The core of the latter algorithm is the use of local feature conjunctions, such as noncollinear oriented segments, and composite feature map formation. The developed algorithms were integrated into a foveal active vision model, the MARR. The proposed algorithms are expected to significantly improve model performance in real-world image processing during memorizing, search, and recognition.
Observers' cognitive states modulate how visual inputs relate to gaze control.
Kardan, Omid; Henderson, John M; Yourganov, Grigori; Berman, Marc G
2016-09-01
Previous research has shown that eye-movements change depending on both the visual features of our environment, and the viewer's top-down knowledge. One important question that is unclear is the degree to which the visual goals of the viewer modulate how visual features of scenes guide eye-movements. Here, we propose a systematic framework to investigate this question. In our study, participants performed 3 different visual tasks on 135 scenes: search, memorization, and aesthetic judgment, while their eye-movements were tracked. Canonical correlation analyses showed that eye-movements were reliably more related to low-level visual features at fixations during the visual search task compared to the aesthetic judgment and scene memorization tasks. Different visual features also had different relevance to eye-movements between tasks. This modulation of the relationship between visual features and eye-movements by task was also demonstrated with classification analyses, where classifiers were trained to predict the viewing task based on eye movements and visual features at fixations. Feature loadings showed that the visual features at fixations could signal task differences independent of temporal and spatial properties of eye-movements. When classifying across participants, edge density and saliency at fixations were as important as eye-movements in the successful prediction of task, with entropy and hue also being significant, but with smaller effect sizes. When classifying within participants, brightness and saturation were also significant contributors. Canonical correlation and classification results, together with a test of moderation versus mediation, suggest that the cognitive state of the observer moderates the relationship between stimulus-driven visual features and eye-movements. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Minimizing the semantic gap in biomedical content-based image retrieval
NASA Astrophysics Data System (ADS)
Guan, Haiying; Antani, Sameer; Long, L. Rodney; Thoma, George R.
2010-03-01
A major challenge in biomedical Content-Based Image Retrieval (CBIR) is to achieve meaningful mappings that minimize the semantic gap between the high-level biomedical semantic concepts and the low-level visual features in images. This paper presents a comprehensive learning-based scheme toward meeting this challenge and improving retrieval quality. The article presents two algorithms: a learning-based feature selection and fusion algorithm and the Ranking Support Vector Machine (Ranking SVM) algorithm. The feature selection algorithm aims to select 'good' features and fuse them using different similarity measurements to provide a better representation of the high-level concepts with the low-level image features. Ranking SVM is applied to learn the retrieval rank function and associate the selected low-level features with query concepts, given the ground-truth ranking of the training samples. The proposed scheme addresses four major issues in CBIR to improve the retrieval accuracy: image feature extraction, selection and fusion, similarity measurements, the association of the low-level features with high-level concepts, and the generation of the rank function to support high-level semantic image retrieval. It models the relationship between semantic concepts and image features, and enables retrieval at the semantic level. We apply it to the problem of vertebra shape retrieval from a digitized spine x-ray image set collected by the second National Health and Nutrition Examination Survey (NHANES II). The experimental results show an improvement of up to 41.92% in the mean average precision (MAP) over conventional image similarity computation methods.
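For readers unfamiliar with the evaluation metric, mean average precision can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names and the toy relevance lists are hypothetical, and average precision is normalized here by the number of relevant items that appear in the ranked list, which is one common convention among several.

```python
from typing import Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """AP for one ranked list; relevance[i] is 1 if the item at rank i+1 is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    # Normalize by the relevant items retrieved (an assumed convention).
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[int]]) -> float:
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in runs) / len(runs)


# Two hypothetical queries with binary relevance judgments per rank.
toy_runs = [[1, 0, 1], [0, 1]]
print(mean_average_precision(toy_runs))
```

A ranking function that moves relevant images toward the top of each list raises the per-query precision sums, which is how a learned rank function can improve MAP over a fixed similarity measure.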
Corredor, Germán; Whitney, Jon; Arias, Viviana; Madabhushi, Anant; Romero, Eduardo
2017-01-01
Computational histomorphometric approaches typically use low-level image features for building machine learning classifiers. However, these approaches usually ignore high-level expert knowledge. A computational model (M_im) combines low-, mid-, and high-level image information to predict the likelihood of cancer in whole slide images. Handcrafted low- and mid-level features are computed from area, color, and spatial nuclei distributions. High-level information is implicitly captured from the recorded navigations of pathologists while exploring whole slide images during diagnostic tasks. This model was validated by predicting the presence of cancer in a set of unseen fields of view. The available database was composed of 24 cases of basal-cell carcinoma, from which 17 served to estimate the model parameters and the remaining 7 comprised the evaluation set. A total of 274 fields of view of size 1024×1024 pixels were extracted from the evaluation set. Then 176 patches from this set were used to train a support vector machine classifier to predict the presence of cancer on a patch-by-patch basis while the remaining 98 image patches were used for independent testing, ensuring that the training and test sets do not comprise patches from the same patient. A baseline model (M_ex) estimated the cancer likelihood for each of the image patches. M_ex uses the same visual features as M_im, but its weights are estimated from nuclei manually labeled as cancerous or noncancerous by a pathologist. M_im achieved an accuracy of 74.49% and an F-measure of 80.31%, while M_ex yielded corresponding accuracy and F-measures of 73.47% and 77.97%, respectively. PMID:28382314
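The reported accuracies can be related to the size of the test set with a little arithmetic. The sketch below is only a consistency check: the counts of correctly classified patches (73 and 72) are inferred from the percentages and are not reported in the abstract, and the helper name `accuracy_pct` is hypothetical.

```python
TRAIN_PATCHES = 176  # patches used to train the classifier (from the abstract)
TEST_PATCHES = 98    # patches held out for independent testing (from the abstract)


def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy as a percentage rounded to two decimals."""
    return round(100 * correct / total, 2)


# The split accounts for all 274 extracted fields of view.
print(TRAIN_PATCHES + TEST_PATCHES)

# Inferred, not reported: 73 and 72 correct patches reproduce the two accuracies.
print(accuracy_pct(73, TEST_PATCHES))  # M_im
print(accuracy_pct(72, TEST_PATCHES))  # M_ex
```

If that inference holds, the accuracy gap between the two models corresponds to a single test patch, which is worth keeping in mind when comparing them.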
The levels of perceptual processing and the neural correlates of increasing subjective visibility.
Binder, Marek; Gociewicz, Krzysztof; Windey, Bert; Koculak, Marcin; Finc, Karolina; Nikadon, Jan; Derda, Monika; Cleeremans, Axel
2017-10-01
According to the levels-of-processing hypothesis, transitions from unconscious to conscious perception may depend on stimulus processing level, with more gradual changes for low-level stimuli and more dichotomous changes for high-level stimuli. In an event-related fMRI study we explored this hypothesis using a visual backward masking procedure. Task requirements manipulated level of processing. Participants reported the magnitude of the target digit in the high-level task, its color in the low-level task, and rated subjective visibility of stimuli using the Perceptual Awareness Scale. Intermediate stimulus visibility was reported more frequently in the low-level task, confirming prior behavioral results. Visible targets recruited insulo-fronto-parietal regions in both tasks. Task effects were observed in visual areas, with higher activity in the low-level task across all visibility levels. Thus, the influence of level of processing on conscious perception may be mediated by attentional modulation of activity in regions representing features of consciously experienced stimuli. Copyright © 2017 Elsevier Inc. All rights reserved.
Internal curvature signal and noise in low- and high-level vision
Grabowecky, Marcia; Kim, Yee Joon; Suzuki, Satoru
2011-01-01
How does internal processing contribute to visual pattern perception? By modeling visual search performance, we estimated internal signal and noise relevant to perception of curvature, a basic feature important for encoding of three-dimensional surfaces and objects. We used isolated, sparse, crowded, and face contexts to determine how internal curvature signal and noise depended on image crowding, lateral feature interactions, and level of pattern processing. Observers reported the curvature of a briefly flashed segment, which was presented alone (without lateral interaction) or among multiple straight segments (with lateral interaction). Each segment was presented with no context (engaging low-to-intermediate-level curvature processing), embedded within a face context as the mouth (engaging high-level face processing), or embedded within an inverted-scrambled-face context as a control for crowding. Using a simple, biologically plausible model of curvature perception, we estimated internal curvature signal and noise as the mean and standard deviation, respectively, of the Gaussian-distributed population activity of local curvature-tuned channels that best simulated behavioral curvature responses. Internal noise was increased by crowding but not by face context (irrespective of lateral interactions), suggesting prevention of noise accumulation in high-level pattern processing. In contrast, internal curvature signal was unaffected by crowding but modulated by lateral interactions. Lateral interactions (with straight segments) increased curvature signal when no contextual elements were added, but equivalent interactions reduced curvature signal when each segment was presented within a face. These opposing effects of lateral interactions are consistent with the phenomena of local-feature contrast in low-level processing and global-feature averaging in high-level processing. PMID:21209356
Can responses to basic non-numerical visual features explain neural numerosity responses?
Harvey, Ben M; Dumoulin, Serge O
2017-04-01
Humans and many animals can distinguish between stimuli that differ in numerosity, the number of objects in a set. Human and macaque parietal lobes contain neurons that respond to changes in stimulus numerosity. However, basic non-numerical visual features can affect neural responses to and perception of numerosity, and visual features often co-vary with numerosity. Therefore, it is debated whether numerosity or co-varying low-level visual features underlie neural and behavioral responses to numerosity. To test the hypothesis that non-numerical visual features underlie neural numerosity responses in a human parietal numerosity map, we analyze responses to a group of numerosity stimulus configurations that have the same numerosity progression but vary considerably in their non-numerical visual features. Using ultra-high-field (7T) fMRI, we measure responses to these stimulus configurations in an area of posterior parietal cortex whose responses are believed to reflect numerosity-selective activity. We describe an fMRI analysis method to distinguish between alternative models of neural response functions, following a population receptive field (pRF) modeling approach. For each stimulus configuration, we first quantify the relationships between numerosity and several non-numerical visual features that have been proposed to underlie performance in numerosity discrimination tasks. We then determine how well responses to these non-numerical visual features predict the observed fMRI responses, and compare this to the predictions of responses to numerosity. We demonstrate that a numerosity response model predicts observed responses more accurately than models of responses to simple non-numerical visual features. As such, neural responses in cognitive processing need not reflect simpler properties of early sensory inputs. Copyright © 2017 Elsevier Inc. All rights reserved.
The Sensory Components of High-Capacity Iconic Memory and Visual Working Memory
Bradley, Claire; Pearson, Joel
2012-01-01
Early visual memory can be split into two primary components: a high-capacity, short-lived iconic memory followed by a limited-capacity visual working memory that can last many seconds. Whereas a large number of studies have investigated visual working memory for low-level sensory features, much research on iconic memory has used more “high-level” alphanumeric stimuli such as letters or numbers. These two forms of memory are typically examined separately, despite an intrinsic overlap in their characteristics. Here, we used a purely sensory paradigm to examine visual short-term memory for 10 homogeneous items of three different visual features (color, orientation and motion) across a range of durations from 0 to 6 s. We found that the amount of information stored in iconic memory is smaller for motion than for color or orientation. Performance declined exponentially with longer storage durations and reached chance levels after ∼2 s. Further experiments showed that performance for the 10 items at 1 s was contingent on unperturbed attentional resources. In addition, for orientation stimuli, performance was contingent on the location of stimuli in the visual field, especially for short cue delays. Overall, our results suggest a smooth transition between an automatic, high-capacity, feature-specific sensory-iconic memory, and an effortful “lower-capacity” visual working memory. PMID:23055993
A computer-generated animated face stimulus set for psychophysiological research
Naples, Adam; Nguyen-Phuc, Alyssa; Coffman, Marika; Kresse, Anna; Faja, Susan; Bernier, Raphael; McPartland, James
2014-01-01
Human faces are fundamentally dynamic, but experimental investigations of face perception traditionally rely on static images of faces. While naturalistic videos of actors have been used with success in some contexts, much research in neuroscience and psychophysics demands carefully controlled stimuli. In this paper, we describe a novel set of computer generated, dynamic, face stimuli. These grayscale faces are tightly controlled for low- and high-level visual properties. All faces are standardized in terms of size, luminance, and location and size of facial features. Each face begins with a neutral pose and transitions to an expression over the course of 30 frames. Altogether there are 222 stimuli spanning 3 different categories of movement: (1) an affective movement (fearful face); (2) a neutral movement (close-lipped, puffed cheeks with open eyes); and (3) a biologically impossible movement (upward dislocation of eyes and mouth). To determine whether early brain responses sensitive to low-level visual features differed between expressions, we measured the occipital P100 event related potential (ERP), which is known to reflect differences in early stages of visual processing and the N170, which reflects structural encoding of faces. We found no differences between faces at the P100, indicating that different face categories were well matched on low-level image properties. This database provides researchers with a well-controlled set of dynamic faces controlled on low-level image characteristics that are applicable to a range of research questions in social perception. PMID:25028164
Decoding natural images from evoked brain activities using encoding models with invertible mapping.
Li, Chao; Xu, Junhai; Liu, Baolin
2018-05-21
Recent studies have built encoding models in the early visual cortex, and reliable mappings have been made between the low-level visual features of stimuli and brain activities. However, these mappings are irreversible, so that the features cannot be directly decoded. To solve this problem, we designed a sparse framework-based encoding model that predicted brain activities from a complete feature representation. Moreover, according to the distribution and activation rules of neurons in the primary visual cortex (V1), three key transformations were introduced into the basic feature to improve the model performance. In this setting, the mapping was simple enough that it could be inverted using a closed-form formula. Using this mapping, we designed a hybrid identification method based on the support vector machine (SVM), and tested it on a published functional magnetic resonance imaging (fMRI) dataset. The experiments confirmed the rationality of our encoding model, and the identification accuracies for 2 subjects increased from 92% and 72% to 98% and 92% with the chance level only 0.8%. Copyright © 2018 Elsevier Ltd. All rights reserved.
Dima, Diana C; Perry, Gavin; Singh, Krish D
2018-06-11
In navigating our environment, we rapidly process and extract meaning from visual cues. However, the relationship between visual features and categorical representations in natural scene perception is still not well understood. Here, we used natural scene stimuli from different categories and filtered at different spatial frequencies to address this question in a passive viewing paradigm. Using representational similarity analysis (RSA) and cross-decoding of magnetoencephalography (MEG) data, we show that categorical representations emerge in human visual cortex at ∼180 ms and are linked to spatial frequency processing. Furthermore, dorsal and ventral stream areas reveal temporally and spatially overlapping representations of low and high-level layer activations extracted from a feedforward neural network. Our results suggest that neural patterns from extrastriate visual cortex switch from low-level to categorical representations within 200 ms, highlighting the rapid cascade of processing stages essential in human visual perception. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Infrared and visible image fusion scheme based on NSCT and low-level visual features
NASA Astrophysics Data System (ADS)
Li, Huafeng; Qiu, Hongmei; Yu, Zhengtao; Zhang, Yafei
2016-05-01
Multi-scale transform (MST) is an efficient tool for image fusion. Recently, many fusion methods have been developed based on different MSTs, and they have shown potential application in many fields. In this paper, we propose an effective infrared and visible image fusion scheme in the nonsubsampled contourlet transform (NSCT) domain, in which the NSCT is first employed to decompose each of the source images into a series of high frequency subbands and one low frequency subband. To improve the fusion performance we designed two new activity measures for fusion of the lowpass subbands and the highpass subbands. These measures are developed based on the fact that the human visual system (HVS) perceives image quality mainly according to some of its low-level features. Then, the selection principles of different subbands are presented based on the corresponding activity measures. Finally, the merged subbands are constructed according to the selection principles, and the final fused image is produced by applying the inverse NSCT on these merged subbands. Experimental results demonstrate the effectiveness and superiority of the proposed method over the state-of-the-art fusion methods in terms of both visual effect and objective evaluation results.
Temporal resolution of orientation-defined texture segregation: a VEP study.
Lachapelle, Julie; McKerral, Michelle; Jauffret, Colin; Bach, Michael
2008-09-01
Orientation is one of the visual dimensions that subserve figure-ground discrimination. A spatial gradient in orientation leads to "texture segregation", which is thought to be concurrent parallel processing across the visual field, without scanning. In the visual-evoked potential (VEP) a component can be isolated which is related to texture segregation ("tsVEP"). Our objective was to evaluate the temporal frequency dependence of the tsVEP to compare processing speed of low-level features (e.g., orientation, using the VEP, here denoted llVEP) with texture segregation because of a recent literature controversy in that regard. Visual-evoked potentials (VEPs) were recorded in seven normal adults. Oriented line segments of 0.1 degrees x 0.8 degrees at 100% contrast were presented in four different arrangements: either oriented in parallel for two homogeneous stimuli (from which the low-level VEP (llVEP) was obtained) or with a 90 degrees orientation gradient for two textured ones (from which the texture VEP was obtained). The orientation texture condition was presented at eight different temporal frequencies ranging from 7.5 to 45 Hz. Fourier analysis was used to isolate low-level components at the pattern-change frequency and texture-segregation components at half that frequency. For all subjects, there was a lower high-cutoff frequency for tsVEP than for llVEPs, on average 12 Hz vs. 17 Hz (P = 0.017). The results suggest that the processing of feature gradients to extract texture segregation requires additional processing time, resulting in a lower fusion frequency.
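The Fourier step in the abstract above (reading out one component at the pattern-change frequency and another at half that frequency) can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the synthetic signal, the 1000 Hz sampling rate, the 20 Hz pattern-change frequency, the amplitudes, and the helper name `amplitude_at` are all assumptions made for illustration.

```python
import math

def amplitude_at(signal, sample_rate, freq):
    """Amplitude of the sinusoidal component of `signal` at `freq` (Hz).

    Single-bin discrete Fourier transform; exact when the recording
    contains a whole number of cycles of `freq`.
    """
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / sample_rate)
             for i, x in enumerate(signal))
    return 2 * math.hypot(re, im) / n

# Hypothetical recording: 1 s sampled at 1000 Hz (assumed values).
sample_rate = 1000
change_freq = 20.0               # assumed pattern-change frequency
texture_freq = change_freq / 2   # texture-segregation component at half that

# Synthetic signal: a "low-level" response at the change frequency plus a
# weaker "texture" response at half that frequency.
signal = [
    1.0 * math.sin(2 * math.pi * change_freq * i / sample_rate)
    + 0.4 * math.sin(2 * math.pi * texture_freq * i / sample_rate)
    for i in range(sample_rate)
]

low_level_amp = amplitude_at(signal, sample_rate, change_freq)
texture_amp = amplitude_at(signal, sample_rate, texture_freq)
```

Because the two frequencies each complete a whole number of cycles in the window, the two read-outs do not leak into each other; real recordings would additionally need noise estimation, which this sketch omits.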
Reduced Perceptual Exclusivity during Object and Grating Rivalry in Autism
Freyberg, J.; Robertson, C.E.; Baron-Cohen, S.
2015-01-01
Background The dynamics of binocular rivalry may be a behavioural footprint of excitatory and inhibitory neural transmission in visual cortex. Given the presence of atypical visual features in Autism Spectrum Conditions (ASC), and evidence in support of the idea of an imbalance in excitatory/inhibitory neural transmission in ASC, we hypothesized that binocular rivalry might prove a simple behavioural marker of such a transmission imbalance in the autistic brain. In support of this hypothesis, we previously reported a slower rate of rivalry in ASC, driven by reduced perceptual exclusivity. Methods We tested whether atypical dynamics of binocular rivalry in ASC are specific to certain stimulus features. 53 participants (26 with ASC, matched for age, sex and IQ) participated in binocular rivalry experiments in which the dynamics of rivalry were measured at two levels of stimulus complexity, low (grayscale gratings) and high (coloured objects). Results Individuals with ASC experienced a slower rate of rivalry, driven by longer transitional states between dominant percepts. These exaggerated transitional states were present at both low and high levels of stimulus complexity, suggesting that atypical rivalry dynamics in autism are robust with respect to stimulus choice. Interactions between stimulus properties and rivalry dynamics in autism indicate that achromatic grating stimuli produce stronger group differences. Conclusion These results confirm the finding of atypical dynamics of binocular rivalry in ASC. These dynamics were present for stimuli of both low and high levels of visual complexity, suggesting an imbalance in competitive interactions throughout the visual system of individuals with ASC. PMID:26382002
Electrophysiological evidence for biased competition in V1 for fear expressions.
West, Greg L; Anderson, Adam A K; Ferber, Susanne; Pratt, Jay
2011-11-01
When multiple stimuli are concurrently displayed in the visual field, they must compete for neural representation at the processing expense of their contemporaries. This biased competition is thought to begin as early as primary visual cortex, and can be driven by salient low-level stimulus features. Stimuli important for an organism's survival, such as facial expressions signaling environmental threat, might be similarly prioritized at this early stage of visual processing. In the present study, we used ERP recordings from striate cortex to examine whether fear expressions can bias the competition for neural representation at the earliest stage of retinotopic visuo-cortical processing when in direct competition with concurrently presented visual information of neutral valence. We found that within 50 msec after stimulus onset, information processing in primary visual cortex is biased in favor of perceptual representations of fear at the expense of competing visual information (Experiment 1). Additional experiments confirmed that the facial display's emotional content rather than low-level features is responsible for this prioritization in V1 (Experiment 2), and that this competition is reliant on a face's upright canonical orientation (Experiment 3). These results suggest that complex stimuli important for an organism's survival can indeed be prioritized at the earliest stage of cortical processing at the expense of competing information, with competition possibly beginning before encoding in V1.
Familiarity enhances visual working memory for faces.
Jackson, Margaret C; Raymond, Jane E
2008-06-01
Although it is intuitive that familiarity with complex visual objects should aid their preservation in visual working memory (WM), empirical evidence for this is lacking. This study used a conventional change-detection procedure to assess visual WM for unfamiliar and famous faces in healthy adults. Across experiments, faces were upright or inverted and a low- or high-load concurrent verbal WM task was administered to suppress contribution from verbal WM. Even with a high verbal memory load, visual WM performance was significantly better and capacity estimated as significantly greater for famous versus unfamiliar faces. Face inversion abolished this effect. Thus, neither strategic, explicit support from verbal WM nor low-level feature processing easily accounts for the observed benefit of high familiarity for visual WM. These results demonstrate that storage of items in visual WM can be enhanced if robust visual representations of them already exist in long-term memory.
Visual motherese? Signal-to-noise ratios in toddler-directed television
Wass, Sam V; Smith, Tim J
2015-01-01
Younger brains are noisier information processing systems; this means that information for younger individuals has to allow clearer differentiation between those aspects that are required for the processing task in hand (the ‘signal’) and those that are not (the ‘noise’). We compared toddler-directed and adult-directed TV programmes (TotTV/ATV). We examined how low-level visual features (that previous research has suggested influence gaze allocation) relate to semantic information, namely the location of the character speaking in each frame. We show that this relationship differs between TotTV and ATV. First, we conducted Receiver Operator Characteristics analyses and found that feature congestion predicted speaking character location in TotTV but not ATV. Second, we used multiple analytical strategies to show that luminance differentials (flicker) predict face location more strongly in TotTV than ATV. Our results suggest that TotTV designers have intuited techniques for controlling toddler attention using low-level visual cues. The implications of these findings for structuring childhood learning experiences away from a screen are discussed. PMID:24702791
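A Receiver Operator Characteristics analysis of the kind mentioned above asks how well a feature value separates one class of locations from another. The sketch below computes the area under the ROC curve by pairwise comparison. It is an illustration only: the function name `roc_auc` and the feature values are hypothetical, and the abstract does not describe how the authors implemented their analysis.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count as half).
    0.5 means chance level, 1.0 means perfect separation.
    """
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical feature values (e.g. a congestion measure) sampled at the
# speaking character's location versus at control locations.
at_speaker = [0.9, 0.8, 0.7, 0.4]
at_control = [0.5, 0.3, 0.2, 0.1]

auc = roc_auc(at_speaker, at_control)
```

The pairwise form is quadratic in the number of samples; it is used here because it makes the meaning of the statistic explicit, not because it is efficient.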
ViA: a perceptual visualization assistant
NASA Astrophysics Data System (ADS)
Healey, Chris G.; St. Amant, Robert; Elhaddad, Mahmoud S.
2000-05-01
This paper describes an automated visualization assistant called ViA. ViA is designed to help users construct perceptually optimal visualizations to represent, explore, and analyze large, complex, multidimensional datasets. We have approached this problem by studying what is known about the control of human visual attention. By harnessing the low-level human visual system, we can support our dual goals of rapid and accurate visualization. Perceptual guidelines that we have built using psychophysical experiments form the basis for ViA. ViA uses modified mixed-initiative planning algorithms from artificial intelligence to search for perceptually optimal data attribute to visual feature mappings. Our perceptual guidelines are integrated into evaluation engines that provide evaluation weights for a given data-feature mapping, and hints on how that mapping might be improved. ViA begins by asking users a set of simple questions about their dataset and the analysis tasks they want to perform. Answers to these questions are used in combination with the evaluation engines to identify and intelligently pursue promising data-feature mappings. The result is an automatically-generated set of mappings that are perceptually salient, but that also respect the context of the dataset and users' preferences about how they want to visualize their data.
Saliency Detection of Stereoscopic 3D Images with Application to Visual Discomfort Prediction
NASA Astrophysics Data System (ADS)
Li, Hong; Luo, Ting; Xu, Haiyong
2017-06-01
Visual saliency detection is potentially useful for a wide range of applications in image processing and computer vision fields. This paper proposes a novel bottom-up saliency detection approach for stereoscopic 3D (S3D) images based on regional covariance matrix. As for S3D saliency detection, besides the traditional 2D low-level visual features, additional 3D depth features should also be considered. However, only limited efforts have been made to investigate how different features (e.g. 2D and 3D features) contribute to the overall saliency of S3D images. The main contribution of this paper is that we introduce a nonlinear feature integration descriptor, i.e., regional covariance matrix, to fuse both 2D and 3D features for S3D saliency detection. The regional covariance matrix is shown to be effective for nonlinear feature integration by modelling the inter-correlation of different feature dimensions. Experimental results demonstrate that the proposed approach outperforms several existing relevant models including 2D extended and pure 3D saliency models. In addition, we also experimentally verified that the proposed S3D saliency map can significantly improve the prediction accuracy of experienced visual discomfort when viewing S3D images.
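A regional covariance descriptor summarizes a region by the covariance of its per-pixel feature vectors, so correlations between feature dimensions (for example, a 2D feature and a depth feature) are captured in the off-diagonal entries. The sketch below shows only that core computation. The function name `region_covariance` and the toy two-dimensional features are assumptions; the abstract does not specify which features the authors used or how regions were defined.

```python
def region_covariance(features):
    """Sample covariance matrix of the feature vectors in one region.

    `features` is a list of equal-length vectors, one per pixel.
    Returns a d x d nested list, where d is the feature dimension.
    """
    n = len(features)
    d = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    return [
        [
            sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in features) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]

# Toy region of three pixels, each with a hypothetical
# (intensity, depth) feature vector.
region = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = region_covariance(region)
```

In this toy region depth is exactly twice intensity, so the off-diagonal entry is as large as perfect correlation allows; comparing such matrices between a region and its surroundings is one way a saliency score could be derived, though that step is outside this sketch.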
NASA Astrophysics Data System (ADS)
Khaustova, Dar'ya; Fournier, Jérôme; Wyckens, Emmanuel; Le Meur, Olivier
2014-02-01
The aim of this research is to understand the difference in visual attention to 2D and 3D content depending on texture and amount of depth. Two experiments were conducted using an eye-tracker and a 3DTV display. Collected fixation data were used to build saliency maps and to analyze the differences between 2D and 3D conditions. In the first experiment 51 observers participated in the test. Using scenes that contained objects with crossed disparity, it was discovered that such objects are the most salient, even if observers experience discomfort due to the high level of disparity. The goal of the second experiment was to decide whether depth is a determinative factor for visual attention. During the experiment, 28 observers watched scenes that contained objects with crossed and uncrossed disparities. We evaluated features influencing the saliency of the objects in stereoscopic conditions by using contents with low-level visual features. With univariate tests of significance (MANOVA), it was detected that texture is more important than depth for selection of objects. Objects with crossed disparity are significantly more important for selection processes when compared to 2D. However, objects with uncrossed disparity have the same influence on visual attention as 2D objects. Analysis of eye movements indicated that there is no difference in saccade length. Fixation durations were significantly higher in stereoscopic conditions for low-level stimuli than in 2D. We believe that these experiments can help to refine existing models of visual attention for 3D content.
Cognitive and artificial representations in handwriting recognition
NASA Astrophysics Data System (ADS)
Lenaghan, Andrew P.; Malyan, Ron
1996-03-01
Both cognitive processes and artificial recognition systems may be characterized by the forms of representation they build and manipulate. This paper looks at how handwriting is represented in current recognition systems and the psychological evidence for its representation in the cognitive processes responsible for reading. Empirical psychological work on feature extraction in early visual processing is surveyed to show that a sound psychological basis for feature extraction exists and to describe the features this approach leads to. The first stage of the development of an architecture for a handwriting recognition system which has been strongly influenced by the psychological evidence for the cognitive processes and representations used in early visual processing, is reported. This architecture builds a number of parallel low level feature maps from raw data. These feature maps are thresholded and a region labeling algorithm is used to generate sets of features. Fuzzy logic is used to quantify the uncertainty in the presence of individual features.
Visual Categorization of Natural Movies by Rats
Vinken, Kasper; Vermaercke, Ben
2014-01-01
Visual categorization of complex, natural stimuli has been studied for some time in human and nonhuman primates. Recent interest in the rodent as a model for visual perception, including higher-level functional specialization, leads to the question of how rodents would perform on a categorization task using natural stimuli. To answer this question, rats were trained in a two-alternative forced choice task to discriminate movies containing rats from movies containing other objects and from scrambled movies (ordinate-level categorization). Subsequently, transfer to novel, previously unseen stimuli was tested, followed by a series of control probes. The results show that the animals are capable of acquiring a decision rule by abstracting common features from natural movies to generalize categorization to new stimuli. Control probes demonstrate that they did not use single low-level features, such as motion energy or (local) luminance. Significant generalization was even present with stationary snapshots from untrained movies. The variability within and between training and test stimuli, the complexity of natural movies, and the control experiments and analyses all suggest that a more high-level rule based on more complex stimulus features than local luminance-based cues was used to classify the novel stimuli. In conclusion, natural stimuli can be used to probe ordinate-level categorization in rats. PMID:25100598
Human Occipital and Parietal GABA Selectively Influence Visual Perception of Orientation and Size.
Song, Chen; Sandberg, Kristian; Andersen, Lau Møller; Blicher, Jakob Udby; Rees, Geraint
2017-09-13
GABA is the primary inhibitory neurotransmitter in human brain. The level of GABA varies substantially across individuals, and this variability is associated with interindividual differences in visual perception. However, it remains unclear whether the association between GABA level and visual perception reflects a general influence of visual inhibition or whether the GABA levels of different cortical regions selectively influence perception of different visual features. To address this, we studied how the GABA levels of parietal and occipital cortices related to interindividual differences in size, orientation, and brightness perception. We used visual contextual illusion as a perceptual assay since the illusion dissociates perceptual content from stimulus content and the magnitude of the illusion reflects the effect of visual inhibition. Across individuals, we observed selective correlations between the level of GABA and the magnitude of contextual illusion. Specifically, parietal GABA level correlated with size illusion magnitude but not with orientation or brightness illusion magnitude; in contrast, occipital GABA level correlated with orientation illusion magnitude but not with size or brightness illusion magnitude. Our findings reveal a region- and feature-dependent influence of GABA level on human visual perception. Parietal and occipital cortices contain, respectively, topographic maps of size and orientation preference in which neural responses to stimulus sizes and stimulus orientations are modulated by intraregional lateral connections. We propose that these lateral connections may underlie the selective influence of GABA on visual perception. SIGNIFICANCE STATEMENT GABA, the primary inhibitory neurotransmitter in human visual system, varies substantially across individuals. This interindividual variability in GABA level is linked to interindividual differences in many aspects of visual perception. 
However, the widespread influence of GABA raises the question of whether interindividual variability in GABA reflects an overall variability in visual inhibition and has a general influence on visual perception or whether the GABA levels of different cortical regions have selective influence on perception of different visual features. Here we report a region- and feature-dependent influence of GABA level on human visual perception. Our findings suggest that GABA level of a cortical region selectively influences perception of visual features that are topographically mapped in this region through intraregional lateral connections. Copyright © 2017 Song, Sandberg et al.
Ma, Wei Ji; Zhou, Xiang; Ross, Lars A; Foxe, John J; Parra, Lucas C
2009-01-01
Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.
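For orientation, the simplest form of Bayesian cue integration is the one-dimensional Gaussian case, where two independent cues are combined by weighting each with its reliability (inverse variance). The sketch below shows that textbook case only. It is not the authors' model, which treats words as points in a multidimensional feature space; the function name `combine_cues` and the numbers are assumptions for illustration.

```python
def combine_cues(mu_a, var_a, mu_v, var_v):
    """Optimal fusion of two independent Gaussian cues about one quantity.

    Each cue is weighted by its reliability (1 / variance). Returns the
    mean and variance of the combined (posterior) estimate under a flat prior.
    """
    w_a = 1.0 / var_a
    w_v = 1.0 / var_v
    mean = (w_a * mu_a + w_v * mu_v) / (w_a + w_v)
    variance = 1.0 / (w_a + w_v)
    return mean, variance

# Hypothetical auditory and visual estimates of the same quantity.
# Raising var_a mimics increasing auditory noise: the combined estimate
# then leans more heavily on the visual cue.
clean_mean, clean_var = combine_cues(mu_a=0.0, var_a=1.0, mu_v=2.0, var_v=1.0)
noisy_mean, noisy_var = combine_cues(mu_a=0.0, var_a=4.0, mu_v=2.0, var_v=1.0)
```

The combined variance is never larger than that of the better cue, which is the sense in which this rule is called optimal. How the benefit depends on noise level in a high-dimensional word space is the question the abstract's model addresses and is not reproduced here.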
Behavioral model of visual perception and recognition
NASA Astrophysics Data System (ADS)
Rybak, Ilya A.; Golovan, Alexander V.; Gusakova, Valentina I.
1993-09-01
In the processes of visual perception and recognition human eyes actively select essential information by way of successive fixations at the most informative points of the image. A behavioral program defining a scanpath of the image is formed at the stage of learning (object memorizing) and consists of sequential motor actions, which are shifts of attention from one to another point of fixation, and sensory signals expected to arrive in response to each shift of attention. In the modern view of the problem, invariant object recognition is provided by the following: (1) separated processing of 'what' (object features) and 'where' (spatial features) information at high levels of the visual system; (2) mechanisms of visual attention using 'where' information; (3) representation of 'what' information in an object-based frame of reference (OFR). However, most recent models of vision based on OFR have demonstrated the ability of invariant recognition of only simple objects like letters or binary objects without background, i.e. objects to which a frame of reference is easily attached. In contrast, we use not OFR, but a feature-based frame of reference (FFR), connected with the basic feature (edge) at the fixation point. This gives our model the ability to represent complex objects invariantly in gray-level images, but demands realization of the behavioral aspects of vision described above. The developed model contains a neural network subsystem of low-level vision which extracts a set of primary features (edges) in each fixation, and a high-level subsystem consisting of 'what' (Sensory Memory) and 'where' (Motor Memory) modules. The resolution of primary feature extraction decreases with distance from the point of fixation. FFR provides both the invariant representation of object features in Sensory Memory and shifts of attention in Motor Memory. 
Object recognition consists in successive recall (from Motor Memory) and execution of shifts of attention and successive verification of the expected sets of features (stored in Sensory Memory). The model shows the ability of recognition of complex objects (such as faces) in gray-level images invariant with respect to shift, rotation, and scale.
Kathryn E. Schertz; Sonya Sachdeva; Omid Kardan; Hiroki P. Kotabe; Kathleen L. Wolf; Marc G. Berman
2018-01-01
Prior research has shown that the physical characteristics of one's environment have wide ranging effects on affect and cognition. Other research has demonstrated that one's thoughts have impacts on mood and behavior, and in this three-part research program we investigated how physical features of the environment can alter thought content. In one study, we...
Visual perception as retrospective Bayesian decoding from high- to low-level features
Ding, Stephanie; Cueva, Christopher J.; Tsodyks, Misha; Qian, Ning
2017-01-01
When a stimulus is presented, its encoding is known to progress from low- to high-level features. How these features are decoded to produce perception is less clear, and most models assume that decoding follows the same low- to high-level hierarchy of encoding. There are also theories arguing for global precedence, reversed hierarchy, or bidirectional processing, but they are descriptive without quantitative comparison with human perception. Moreover, observers often inspect different parts of a scene sequentially to form overall perception, suggesting that perceptual decoding requires working memory, yet few models consider how working-memory properties may affect decoding hierarchy. We probed decoding hierarchy by comparing absolute judgments of single orientations and relative/ordinal judgments between two sequentially presented orientations. We found that lower-level, absolute judgments failed to account for higher-level, relative/ordinal judgments. However, when ordinal judgment was used to retrospectively decode memory representations of absolute orientations, striking aspects of absolute judgments, including the correlation and forward/backward aftereffects between two reported orientations in a trial, were explained. We propose that the brain prioritizes decoding of higher-level features because they are more behaviorally relevant, and more invariant and categorical, and thus easier to specify and maintain in noisy working memory, and that more reliable higher-level decoding constrains less reliable lower-level decoding. PMID:29073108
Top-down influences on visual attention during listening are modulated by observer sex.
Shen, John; Itti, Laurent
2012-07-15
In conversation, women have a small advantage in decoding non-verbal communication compared to men. In light of these findings, we sought to determine whether sex differences also existed in visual attention during a related listening task, and if so, if the differences existed among attention to high-level aspects of the scene or to conspicuous visual features. Using eye-tracking and computational techniques, we present direct evidence that men and women orient attention differently during conversational listening. We tracked the eyes of 15 men and 19 women who watched and listened to 84 clips featuring 12 different speakers in various outdoor settings. At the fixation following each saccadic eye movement, we analyzed the type of object that was fixated. Men gazed more often at the mouth and women at the eyes of the speaker. Women more often exhibited "distracted" saccades directed away from the speaker and towards a background scene element. Examining the multi-scale center-surround variation in low-level visual features (static: color, intensity, orientation, and dynamic: motion energy), we found that men consistently selected regions which expressed more variation in dynamic features, which can be attributed to a male preference for motion and a female preference for areas that may contain nonverbal information about the speaker. In sum, significant differences were observed, which we speculate arise from different integration strategies of visual cues in selecting the final target of attention. Our findings have implications for studies of sex in nonverbal communication, as well as for more predictive models of visual attention. Published by Elsevier Ltd.
Cohen, Michael A; Rhee, Juliana Y; Alvarez, George A
2016-01-01
Human cognition has a limited capacity that is often attributed to the brain having finite cognitive resources, but the nature of these resources is usually not specified. Here, we show evidence that perceptual interference between items can be predicted by known receptive field properties of the visual cortex, suggesting that competition within representational maps is an important source of the capacity limitations of visual processing. Across the visual hierarchy, receptive fields get larger and represent more complex, high-level features. Thus, when presented simultaneously, high-level items (e.g., faces) will often land within the same receptive fields, while low-level items (e.g., color patches) will often not. Using a perceptual task, we found long-range interference between high-level items, but only short-range interference for low-level items, with both types of interference being weaker across hemifields. Finally, we show that long-range interference between items appears to occur primarily during perceptual encoding and not during working memory maintenance. These results are naturally explained by the distribution of receptive fields and establish a link between perceptual capacity limits and the underlying neural architecture. (c) 2015 APA, all rights reserved.
Relevance feedback-based building recognition
NASA Astrophysics Data System (ADS)
Li, Jing; Allinson, Nigel M.
2010-07-01
Building recognition is a nontrivial task in computer vision research which can be utilized in robot localization, mobile navigation, etc. However, existing building recognition systems usually encounter the following two problems: 1) extracted low level features cannot reveal the true semantic concepts; and 2) they usually involve high dimensional data which require heavy computational costs and memory. Relevance feedback (RF), widely applied in multimedia information retrieval, is able to bridge the gap between the low level visual features and high level concepts; while dimensionality reduction methods can mitigate the high-dimensional problem. In this paper, we propose a building recognition scheme which integrates the RF and subspace learning algorithms. Experimental results undertaken on our own building database show that the newly proposed scheme appreciably enhances the recognition accuracy.
Bankson, B B; Hebart, M N; Groen, I I A; Baker, C I
2018-05-17
Visual object representations are commonly thought to emerge rapidly, yet it has remained unclear to what extent early brain responses reflect purely low-level visual features of these objects and how strongly those features contribute to later categorical or conceptual representations. Here, we aimed to estimate a lower temporal bound for the emergence of conceptual representations by defining two criteria that characterize such representations: 1) conceptual object representations should generalize across different exemplars of the same object, and 2) these representations should reflect high-level behavioral judgments. To test these criteria, we compared magnetoencephalography (MEG) recordings between two groups of participants (n = 16 per group) exposed to different exemplar images of the same object concepts. Further, we disentangled low-level from high-level MEG responses by estimating the unique and shared contribution of models of behavioral judgments, semantics, and different layers of deep neural networks of visual object processing. We find that 1) both generalization across exemplars as well as generalization of object-related signals across time increase after 150 ms, peaking around 230 ms; 2) representations specific to behavioral judgments emerged rapidly, peaking around 160 ms. Collectively, these results suggest a lower bound for the emergence of conceptual object representations around 150 ms following stimulus onset. Copyright © 2018 Elsevier Inc. All rights reserved.
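As an illustration of what estimating unique and shared contributions of models can mean, the following is a minimal sketch of two-model variance partitioning (commonality analysis). It is an assumption about the general approach, not the authors' exact procedure; the function name `partition_variance` and its R^2 inputs are hypothetical.

```python
def partition_variance(r2_a, r2_b, r2_full):
    """Split explained variance between two predictor sets A and B.

    r2_a, r2_b: R^2 of models fit with A alone and with B alone.
    r2_full:    R^2 of the model fit with A and B together.
    All three values are assumed to come from the same data set.
    """
    unique_a = r2_full - r2_b        # variance only A explains
    unique_b = r2_full - r2_a        # variance only B explains
    shared = r2_a + r2_b - r2_full   # variance either one can explain
    return {"unique_a": unique_a, "unique_b": unique_b, "shared": shared}


if __name__ == "__main__":
    # Hypothetical R^2 values, chosen only for illustration.
    print(partition_variance(0.5, 0.4, 0.7))
```

The three parts always sum to the full-model R^2, which is a quick sanity check on any such decomposition.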
Visual short-term memory: activity supporting encoding and maintenance in retinotopic visual cortex.
Sneve, Markus H; Alnæs, Dag; Endestad, Tor; Greenlee, Mark W; Magnussen, Svein
2012-10-15
Recent studies have demonstrated that retinotopic cortex maintains information about visual stimuli during retention intervals. However, the process by which transient stimulus-evoked sensory responses are transformed into enduring memory representations is unknown. Here, using fMRI and short-term visual memory tasks optimized for univariate and multivariate analysis approaches, we report differential involvement of human retinotopic areas during memory encoding of the low-level visual feature orientation. All visual areas show weaker responses when memory encoding processes are interrupted, possibly due to effects in orientation-sensitive primary visual cortex (V1) propagating across extrastriate areas. Furthermore, intermediate areas in both dorsal (V3a/b) and ventral (LO1/2) streams are significantly more active during memory encoding compared with non-memory (active and passive) processing of the same stimulus material. These effects in intermediate visual cortex are also observed during memory encoding of a different stimulus feature (spatial frequency), suggesting that these areas are involved in encoding processes on a higher level of representation. Using pattern-classification techniques to probe the representational content in visual cortex during delay periods, we further demonstrate that simply initiating memory encoding is not sufficient to produce long-lasting memory traces. Rather, active maintenance appears to underlie the observed memory-specific patterns of information in retinotopic cortex. Copyright © 2012 Elsevier Inc. All rights reserved.
Visual categorization of natural movies by rats.
Vinken, Kasper; Vermaercke, Ben; Op de Beeck, Hans P
2014-08-06
Visual categorization of complex, natural stimuli has been studied for some time in human and nonhuman primates. Recent interest in the rodent as a model for visual perception, including higher-level functional specialization, leads to the question of how rodents would perform on a categorization task using natural stimuli. To answer this question, rats were trained in a two-alternative forced choice task to discriminate movies containing rats from movies containing other objects and from scrambled movies (ordinate-level categorization). Subsequently, transfer to novel, previously unseen stimuli was tested, followed by a series of control probes. The results show that the animals are capable of acquiring a decision rule by abstracting common features from natural movies to generalize categorization to new stimuli. Control probes demonstrate that they did not use single low-level features, such as motion energy or (local) luminance. Significant generalization was even present with stationary snapshots from untrained movies. The variability within and between training and test stimuli, the complexity of natural movies, and the control experiments and analyses all suggest that a more high-level rule based on more complex stimulus features than local luminance-based cues was used to classify the novel stimuli. In conclusion, natural stimuli can be used to probe ordinate-level categorization in rats. Copyright © 2014 the authors.
NASA Astrophysics Data System (ADS)
Frikha, Mayssa; Fendri, Emna; Hammami, Mohamed
2017-09-01
Using semantic attributes such as gender, clothes, and accessories to describe people's appearance is an appealing modeling method for video surveillance applications. We propose a midlevel appearance signature based on extracting a list of nameable semantic attributes describing the body in uncontrolled acquisition conditions. Conventional approaches extract the same set of low-level features to learn the semantic classifiers uniformly. Their critical limitation is the inability to capture the dominant visual characteristics for each trait separately. The proposed approach consists of extracting low-level features in an attribute-adaptive way by automatically selecting the most relevant features for each attribute separately. Furthermore, relying on a small training dataset would easily lead to poor performance due to the large intraclass and interclass variations. We annotated a large-scale set of images of people collected from different person reidentification benchmarks covering a large attribute sample and reflecting the challenges of uncontrolled acquisition conditions. These annotations were gathered into an appearance semantic attribute dataset that contains 3590 images annotated with 14 attributes. Various experiments show that carefully designed features for learning the visual characteristics of an attribute improve the correct classification accuracy and reduce both spatial and temporal complexities compared with state-of-the-art approaches.
Basic level category structure emerges gradually across human ventral visual cortex.
Iordan, Marius Cătălin; Greene, Michelle R; Beck, Diane M; Fei-Fei, Li
2015-07-01
Objects can be simultaneously categorized at multiple levels of specificity ranging from very broad ("natural object") to very distinct ("Mr. Woof"), with a mid-level of generality (basic level: "dog") often providing the most cognitively useful distinction between categories. It is unknown, however, how this hierarchical representation is achieved in the brain. Using multivoxel pattern analyses, we examined how well each taxonomic level (superordinate, basic, and subordinate) of real-world object categories is represented across occipitotemporal cortex. We found that, although in early visual cortex objects are best represented at the subordinate level (an effect mostly driven by low-level feature overlap between objects in the same category), this advantage diminishes compared to the basic level as we move up the visual hierarchy, disappearing in object-selective regions of occipitotemporal cortex. This pattern stems from a combined increase in within-category similarity (category cohesion) and between-category dissimilarity (category distinctiveness) of neural activity patterns at the basic level, relative to both subordinate and superordinate levels, suggesting that successive visual areas may be optimizing basic level representations.
Synergistic Instance-Level Subspace Alignment for Fine-Grained Sketch-Based Image Retrieval.
Li, Ke; Pang, Kaiyue; Song, Yi-Zhe; Hospedales, Timothy M; Xiang, Tao; Zhang, Honggang
2017-08-25
We study the problem of fine-grained sketch-based image retrieval. By performing instance-level (rather than category-level) retrieval, it embodies a timely and practical application, particularly with the ubiquitous availability of touchscreens. Three factors contribute to the challenging nature of the problem: (i) free-hand sketches are inherently abstract and iconic, making visual comparisons with photos difficult, (ii) sketches and photos are in two different visual domains, i.e. black and white lines vs. color pixels, and (iii) fine-grained distinctions are especially challenging when executed across domain and abstraction-level. To address these challenges, we propose to bridge the image-sketch gap both at the high-level via parts and attributes, as well as at the low-level, via introducing a new domain alignment method. More specifically, (i) we contribute a dataset with 304 photos and 912 sketches, where each sketch and image is annotated with its semantic parts and associated part-level attributes. With the help of this dataset, we investigate (ii) how strongly-supervised deformable part-based models can be learned that subsequently enable automatic detection of part-level attributes, and provide pose-aligned sketch-image comparisons. To reduce the sketch-image gap when comparing low-level features, we also (iii) propose a novel method for instance-level domain-alignment, that exploits both subspace and instance-level cues to better align the domains. Finally (iv) these are combined in a matching framework integrating aligned low-level features, mid-level geometric structure and high-level semantic attributes. Extensive experiments conducted on our new dataset demonstrate effectiveness of the proposed method.
Decoding visual object categories from temporal correlations of ECoG signals.
Majima, Kei; Matsuo, Takeshi; Kawasaki, Keisuke; Kawai, Kensuke; Saito, Nobuhito; Hasegawa, Isao; Kamitani, Yukiyasu
2014-04-15
How visual object categories are represented in the brain is one of the key questions in neuroscience. Studies on low-level visual features have shown that relative timings or phases of neural activity between multiple brain locations encode information. However, whether such temporal patterns of neural activity are used in the representation of visual objects is unknown. Here, we examined whether and how visual object categories could be predicted (or decoded) from temporal patterns of electrocorticographic (ECoG) signals from the temporal cortex in five patients with epilepsy. We used temporal correlations between electrodes as input features, and compared the decoding performance with features defined by spectral power and phase from individual electrodes. While decoding accuracy using power or phase alone was significantly better than chance, correlations alone or combined with power outperformed the other features. Decoding performance with correlations was degraded by shuffling the order of trials of the same category in each electrode, indicating that the relative time series between electrodes in each trial is critical. Analysis using a sliding time window revealed that decoding performance with correlations began to rise earlier than that with power. This earlier increase in performance was replicated by a model using phase differences to encode categories. These results suggest that activity patterns arising from interactions between multiple neuronal units carry additional information on visual object categories. Copyright © 2013 Elsevier Inc. All rights reserved.
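To make the idea of using temporal correlations between electrodes as input features concrete, here is a minimal sketch that turns one trial of multichannel data into a vector of pairwise Pearson correlations. The data layout (one list of samples per electrode) and the names `pearson` and `correlation_features` are assumptions for illustration; the abstract does not specify the authors' preprocessing or exact correlation measure.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # convention chosen here for flat signals
    return cov / sqrt(var_x * var_y)


def correlation_features(trial):
    """Return one correlation per electrode pair (i < j) for a single trial.

    trial: list of channels, each a list of samples of equal length.
    """
    pairs = combinations(range(len(trial)), 2)
    return [pearson(trial[i], trial[j]) for i, j in pairs]


if __name__ == "__main__":
    # Three hypothetical electrodes with four samples each.
    trial = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
    print(correlation_features(trial))
```

A feature vector built this way has n(n-1)/2 entries for n electrodes, and it depends on the within-trial alignment of the channels, which is why shuffling trials independently per electrode would be expected to destroy the information it carries.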
Natural image statistics and low-complexity feature selection.
Vasconcelos, Manuela; Vasconcelos, Nuno
2009-02-01
Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.
Visual perception as retrospective Bayesian decoding from high- to low-level features.
Ding, Stephanie; Cueva, Christopher J; Tsodyks, Misha; Qian, Ning
2017-10-24
When a stimulus is presented, its encoding is known to progress from low- to high-level features. How these features are decoded to produce perception is less clear, and most models assume that decoding follows the same low- to high-level hierarchy of encoding. There are also theories arguing for global precedence, reversed hierarchy, or bidirectional processing, but they are descriptive without quantitative comparison with human perception. Moreover, observers often inspect different parts of a scene sequentially to form overall perception, suggesting that perceptual decoding requires working memory, yet few models consider how working-memory properties may affect decoding hierarchy. We probed decoding hierarchy by comparing absolute judgments of single orientations and relative/ordinal judgments between two sequentially presented orientations. We found that lower-level, absolute judgments failed to account for higher-level, relative/ordinal judgments. However, when ordinal judgment was used to retrospectively decode memory representations of absolute orientations, striking aspects of absolute judgments, including the correlation and forward/backward aftereffects between two reported orientations in a trial, were explained. We propose that the brain prioritizes decoding of higher-level features because they are more behaviorally relevant, and more invariant and categorical, and thus easier to specify and maintain in noisy working memory, and that more reliable higher-level decoding constrains less reliable lower-level decoding. Published under the PNAS license.
Visual Attention Modeling for Stereoscopic Video: A Benchmark and Computational Model.
Fang, Yuming; Zhang, Chi; Li, Jing; Lei, Jianjun; Perreira Da Silva, Matthieu; Le Callet, Patrick
2017-10-01
In this paper, we investigate the visual attention modeling for stereoscopic video from the following two aspects. First, we build one large-scale eye tracking database as the benchmark of visual attention modeling for stereoscopic video. The database includes 47 video sequences and their corresponding eye fixation data. Second, we propose a novel computational model of visual attention for stereoscopic video based on Gestalt theory. In the proposed model, we extract the low-level features, including luminance, color, texture, and depth, from discrete cosine transform coefficients, which are used to calculate feature contrast for the spatial saliency computation. The temporal saliency is calculated by the motion contrast from the planar and depth motion features in the stereoscopic video sequences. The final saliency is estimated by fusing the spatial and temporal saliency with uncertainty weighting, which is estimated by the laws of proximity, continuity, and common fate in Gestalt theory. Experimental results show that the proposed method outperforms the state-of-the-art stereoscopic video saliency detection models on our built large-scale eye tracking database and one other database (DML-ITRACK-3D).
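The fusion step described above can be sketched as follows, under a stated assumption: each saliency map is weighted by the inverse of its own uncertainty, so the more certain cue dominates. The abstract does not give the fusion formula or how uncertainty is derived from the Gestalt laws, so the weighting rule, the function name `fuse_saliency`, and the nested-list map layout are all hypothetical.

```python
def fuse_saliency(spatial, temporal, u_spatial, u_temporal):
    """Fuse two saliency maps pixel by pixel using inverse-uncertainty weights.

    All four arguments are 2D lists of the same shape. Uncertainties are
    assumed to be non-negative; a larger value means a less reliable cue.
    """
    fused = []
    for s_row, t_row, us_row, ut_row in zip(spatial, temporal, u_spatial, u_temporal):
        row = []
        for s, t, us, ut in zip(s_row, t_row, us_row, ut_row):
            total = us + ut
            if total == 0:
                # Both cues are fully certain: fall back to a plain average.
                row.append((s + t) / 2)
            else:
                # Equivalent to weights 1/us and 1/ut after normalization.
                row.append((ut * s + us * t) / total)
        fused.append(row)
    return fused


if __name__ == "__main__":
    # Single-pixel maps with hypothetical values.
    print(fuse_saliency([[1.0]], [[0.0]], [[1.0]], [[3.0]]))
```

With equal uncertainties the rule reduces to a simple mean of the two maps, which makes it easy to compare against an unweighted baseline.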
Serial dependence in the perception of attractiveness.
Xia, Ye; Leib, Allison Yamanashi; Whitney, David
2016-12-01
The perception of attractiveness is essential for choices of food, object, and mate preference. Like perception of other visual features, perception of attractiveness is stable despite constant changes of image properties due to factors like occlusion, visual noise, and eye movements. Recent results demonstrate that perception of low-level stimulus features and even more complex attributes like human identity are biased towards recent percepts. This effect is often called serial dependence. Some recent studies have suggested that serial dependence also exists for perceived facial attractiveness, though there is also concern that the reported effects are due to response bias. Here we used an attractiveness-rating task to test the existence of serial dependence in perceived facial attractiveness. Our results demonstrate that perceived face attractiveness was pulled by the attractiveness level of facial images encountered up to 6 s prior. This effect was not due to response bias and did not rely on the previous motor response. This perceptual pull increased as the difference in attractiveness between previous and current stimuli increased. Our results reconcile previously conflicting findings and extend previous work, demonstrating that sequential dependence in perception operates across different levels of visual analysis, even at the highest levels of perceptual interpretation.
Jung, Wonmo; Bülthoff, Isabelle; Armann, Regine G M
2017-11-01
The brain can only attend to a fraction of all the information that is entering the visual system at any given moment. One way of overcoming the so-called bottleneck of selective attention (e.g., J. M. Wolfe, Võ, Evans, & Greene, 2011) is to make use of redundant visual information and extract summarized statistical information of the whole visual scene. Such ensemble representation occurs for low-level features of textures or simple objects, but it has also been reported for complex high-level properties. While the visual system has, for example, been shown to compute summary representations of facial expression, gender, or identity, it is less clear whether perceptual input from all parts of the visual field contributes equally to the ensemble percept. Here we extend the line of ensemble-representation research into the realm of race and look at the possibility that ensemble perception relies on weighting visual information differently depending on its origin from either the fovea or the visual periphery. We find that observers can judge the mean race of a set of faces, similar to judgments of mean emotion from faces and ensemble representations in low-level domains of visual processing. We also find that while peripheral faces seem to be taken into account for the ensemble percept, far more weight is given to stimuli presented foveally than peripherally. Whether this precision weighting of information stems from differences in the accuracy with which the visual system processes information across the visual field or from statistical inferences about the world needs to be determined by further research.
SemVisM: semantic visualizer for medical image
NASA Astrophysics Data System (ADS)
Landaeta, Luis; La Cruz, Alexandra; Baranya, Alexander; Vidal, María-Esther
2015-01-01
SemVisM is a toolbox that combines medical informatics and computer graphics tools for reducing the semantic gap between low-level features and high-level semantic concepts/terms in the images. This paper presents a novel strategy for visualizing semantically annotated medical data, combining rendering techniques and segmentation algorithms. SemVisM comprises two main components: i) AMORE (A Modest vOlume REgister) to handle input data (RAW, DAT or DICOM) and to initially annotate the images using terms defined on medical ontologies (e.g., MeSH, FMA or RadLex), and ii) VOLPROB (VOlume PRObability Builder) for generating the annotated volumetric data containing the classified voxels that belong to a particular tissue. SemVisM is built on top of the semantic visualizer ANISE.
What you see is what you expect: rapid scene understanding benefits from prior experience.
Greene, Michelle R; Botros, Abraham P; Beck, Diane M; Fei-Fei, Li
2015-05-01
Although we are able to rapidly understand novel scene images, little is known about the mechanisms that support this ability. Theories of optimal coding assert that prior visual experience can be used to ease the computational burden of visual processing. A consequence of this idea is that more probable visual inputs should be facilitated relative to more unlikely stimuli. In three experiments, we compared the perceptions of highly improbable real-world scenes (e.g., an underwater press conference) with common images matched for visual and semantic features. Although the two groups of images could not be distinguished by their low-level visual features, we found profound deficits related to the improbable images: Observers wrote poorer descriptions of these images (Exp. 1), had difficulties classifying the images as unusual (Exp. 2), and even had lower sensitivity to detect these images in noise than to detect their more probable counterparts (Exp. 3). Taken together, these results place a limit on our abilities for rapid scene perception and suggest that perception is facilitated by prior visual experience.
Mishra, Jyoti; Zanto, Theodore; Nilakantan, Aneesha; Gazzaley, Adam
2013-01-01
Intrasensory interference during visual working memory (WM) maintenance by object stimuli (such as faces and scenes) has been shown to negatively impact WM performance, with greater detrimental impacts of interference observed in aging. Here we assessed age-related impacts by intrasensory WM interference from lower-level stimulus features such as visual and auditory motion stimuli. We consistently found that interference in the form of ignored distractions and secondary task interruptions presented during a WM maintenance period degraded memory accuracy in both the visual and auditory domain. However, in contrast to prior studies assessing WM for visual object stimuli, feature-based interference effects were not observed to be significantly greater in older adults. Analyses of neural oscillations in the alpha frequency band further revealed preserved mechanisms of interference processing in terms of post-stimulus alpha suppression, which was observed maximally for secondary task interruptions in visual and auditory modalities in both younger and older adults. These results suggest that age-related sensitivity of WM to interference may be limited to complex object stimuli, at least at low WM loads. PMID:23791629
Visual scan-path analysis with feature space transient fixation moments
NASA Astrophysics Data System (ADS)
Dempere-Marco, Laura; Hu, Xiao-Peng; Yang, Guang-Zhong
2003-05-01
The study of eye movements provides useful insight into the cognitive processes underlying visual search tasks. The analysis of the dynamics of eye movements has often been approached from a purely spatial perspective. In many cases, however, it may not be possible to define meaningful or consistent dynamics without considering the features underlying the scan paths. In this paper, the definition of the feature space has been attempted through the concept of visual similarity and non-linear low dimensional embedding, which defines a mapping from the image space into a low dimensional feature manifold that preserves the intrinsic similarity of image patterns. This has enabled the definition of perceptually meaningful features without the use of domain specific knowledge. Based on this, this paper introduces a new concept called Feature Space Transient Fixation Moments (TFM). The approach presented tackles the problem of feature space representation of visual search through the use of TFM. We demonstrate the practical values of this concept for characterizing the dynamics of eye movements in goal directed visual search tasks. We also illustrate how this model can be used to elucidate the fundamental steps involved in skilled search tasks through the evolution of transient fixation moments.
Lu, Kun-Han; Hung, Shao-Chin; Wen, Haiguang; Marussich, Lauren; Liu, Zhongming
2016-01-01
Complex, sustained, dynamic, and naturalistic visual stimulation can evoke distributed brain activities that are highly reproducible within and across individuals. However, the precise origins of such reproducible responses remain incompletely understood. Here, we employed concurrent functional magnetic resonance imaging (fMRI) and eye tracking to investigate the experimental and behavioral factors that influence fMRI activity and its intra- and inter-subject reproducibility during repeated movie stimuli. We found that widely distributed and highly reproducible fMRI responses were attributed primarily to the high-level natural content in the movie. In the absence of such natural content, low-level visual features alone in a spatiotemporally scrambled control stimulus evoked significantly reduced degree and extent of reproducible responses, which were mostly confined to the primary visual cortex (V1). We also found that the varying gaze behavior affected the cortical response at the peripheral part of V1 and in the oculomotor network, with minor effects on the response reproducibility over the extrastriate visual areas. Lastly, scene transitions in the movie stimulus due to film editing partly caused the reproducible fMRI responses at widespread cortical areas, especially along the ventral visual pathway. Therefore, the naturalistic nature of a movie stimulus is necessary for driving highly reliable visual activations. In a movie-stimulation paradigm, scene transitions and individuals’ gaze behavior should be taken as potential confounding factors in order to properly interpret cortical activity that supports natural vision. PMID:27564573
Feature diagnosticity and task context shape activity in human scene-selective cortex.
Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S
2016-01-15
Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.
Learning Computational Models of Video Memorability from fMRI Brain Imaging.
Han, Junwei; Chen, Changyuan; Shao, Ling; Hu, Xintao; Han, Jungong; Liu, Tianming
2015-08-01
Generally, visual media are not equally memorable to the human brain. This paper looks into a new direction of modeling the memorability of video clips and automatically predicting how memorable they are by learning from brain functional magnetic resonance imaging (fMRI). We propose a novel computational framework by integrating the power of low-level audiovisual features and brain activity decoding via fMRI. Initially, a user study experiment is performed to create a ground truth database for measuring video memorability and a set of effective low-level audiovisual features is examined in this database. Then, human subjects' brain fMRI data are obtained when they are watching the video clips. The fMRI-derived features that convey the brain activity of memorizing videos are extracted using a universal brain reference system. Finally, because fMRI scanning is expensive and time-consuming, a computational model is learned on our benchmark dataset with the objective of maximizing the correlation between the low-level audiovisual features and the fMRI-derived features using joint subspace learning. The learned model can then automatically predict the memorability of videos without fMRI scans. Evaluations on publicly available image and video databases demonstrate the effectiveness of the proposed framework.
The singular nature of auditory and visual scene analysis in autism
Lin, I.-Fan; Shirama, Aya; Kato, Nobumasa
2017-01-01
Individuals with autism spectrum disorder often have difficulty acquiring relevant auditory and visual information in daily environments, despite not being diagnosed as hearing impaired or having low vision. Recent psychophysical and neurophysiological studies have shown that autistic individuals have highly specific individual differences at various levels of information processing, including feature extraction, automatic grouping and top-down modulation in auditory and visual scene analysis. Comparison of the characteristics of scene analysis between auditory and visual modalities reveals some essential commonalities, which could provide clues about the underlying neural mechanisms. Further progress in this line of research may suggest effective methods for diagnosing and supporting autistic individuals. This article is part of the themed issue 'Auditory and visual scene analysis'. PMID:28044025
Semantic guidance of eye movements in real-world scenes
Hwang, Alex D.; Wang, Hsueh-Cheng; Pomplun, Marc
2011-01-01
The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying Latent Semantic Analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. PMID:21426914
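A minimal sketch of the semantic saliency idea described above: each scene object gets a score equal to the cosine similarity between its label vector and the vector of the currently fixated object (or search target). The vectors below are made-up two-dimensional stand-ins for LSA representations, and the names `cosine` and `semantic_saliency` are hypothetical; the study's actual LSA space and map construction are not reproduced here.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def semantic_saliency(object_vectors, reference_label):
    """Score every other object by similarity to the reference object.

    object_vectors: dict mapping an object label to its semantic vector.
    reference_label: label of the fixated object or the search target.
    """
    reference = object_vectors[reference_label]
    return {
        label: cosine(vector, reference)
        for label, vector in object_vectors.items()
        if label != reference_label
    }


if __name__ == "__main__":
    # Hypothetical label vectors; real ones would come from an LSA model.
    vectors = {"fork": [1.0, 0.0], "knife": [0.8, 0.6], "tree": [0.0, 1.0]}
    print(semantic_saliency(vectors, "fork"))
```

Assigning each object's score to the image region it occupies would give a map in which semantically related objects stand out, which is the kind of predictor that can then be evaluated against observed gaze transitions.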
Do Rats Use Shape to Solve "Shape Discriminations"?
ERIC Educational Resources Information Center
Minini, Loredana; Jeffery, Kathryn J.
2006-01-01
Visual discrimination tasks are increasingly used to explore the neurobiology of vision in rodents, but it remains unclear how the animals solve these tasks: Do they process shapes holistically, or by using low-level features such as luminance and angle acuity? In the present study we found that when discriminating triangles from squares, rats did…
Figure–ground organization and the emergence of proto-objects in the visual cortex
von der Heydt, Rüdiger
2015-01-01
A long history of studies of perception has shown that the visual system organizes the incoming information early on, interpreting the 2D image in terms of a 3D world and producing a structure that provides perceptual continuity and enables object-based attention. Recordings from monkey visual cortex show that many neurons, especially in area V2, are selective for border ownership. These neurons are edge selective and have ordinary classical receptive fields (CRF), but in addition their responses are modulated (enhanced or suppressed) depending on the location of a ‘figure’ relative to the edge in their receptive field. Each neuron has a fixed preference for location on one side or the other. This selectivity is derived from the image context far beyond the CRF. This paper reviews evidence indicating that border ownership selectivity reflects the formation of early object representations (‘proto-objects’). The evidence includes experiments showing (1) reversal of border ownership signals with change of perceived object structure, (2) border ownership specific enhancement of responses in object-based selective attention, (3) persistence of border ownership signals in accordance with continuity of object perception, and (4) remapping of border ownership signals across saccades and object movements. Findings 1 and 2 can be explained by hypothetical grouping circuits that sum contour feature signals in search of objectness, and, via recurrent projections, enhance the corresponding low-level feature signals. Findings 3 and 4 might be explained by assuming that the activity of grouping circuits persists and can be remapped. Grouping, persistence, and remapping are fundamental operations of vision. Finding these operations manifest in low-level visual areas challenges traditional views of visual processing. New computational models need to be developed for a comprehensive understanding of the function of the visual cortex. PMID:26579062
Usage of Accessibility Options for the iPhone and iPad in a Visually Impaired Population.
Robinson, Joshua L; Braimah Avery, Vanessa; Chun, Rob; Pusateri, Gregg; Jay, Walter M
2017-01-01
The iPad and iPhone have a number of low-vision accessibility features including Siri Voice Assistant, Large Text, Zoom Magnification, Invert Colors, Voice Over, and Speech Selection. We studied their usage within a low-vision population. Patients were recruited to participate in an IRB-approved survey regarding their usage of the iPad and/or iPhone. Participants met one of the following criteria: best corrected visual acuity (BCVA) of 20/60 or worse, or significant peripheral visual field defects. Thirty-three low-vision patients agreed to participate (mean age 54.3 years). There were 18 different diagnoses represented and the average visual acuity of respondents was 20/119 in the right eye and 20/133 in the left eye. The most commonly used vision accessibility features were Zoom Magnification and Large Text. Although many patients are using the low-vision accessibility features, few are receiving training or recommendations from their eye care specialist.
Bordier, Cecile; Puja, Francesco; Macaluso, Emiliano
2013-01-01
The investigation of brain activity using naturalistic, ecologically-valid stimuli is becoming an important challenge for neuroscience research. Several approaches have been proposed, primarily relying on data-driven methods (e.g. independent component analysis, ICA). However, data-driven methods often require some post-hoc interpretation of the imaging results to draw inferences about the underlying sensory, motor or cognitive functions. Here, we propose using a biologically-plausible computational model to extract (multi-)sensory stimulus statistics that can be used for standard hypothesis-driven analyses (general linear model, GLM). We ran two separate fMRI experiments, which both involved subjects watching an episode of a TV-series. In Exp 1, we manipulated the presentation by switching on-and-off color, motion and/or sound at variable intervals, whereas in Exp 2, the video was played in the original version, with all the consequent continuous changes of the different sensory features intact. Both for vision and audition, we extracted stimulus statistics corresponding to spatial and temporal discontinuities of low-level features, as well as a combined measure related to the overall stimulus saliency. Results showed that activity in occipital visual cortex and the superior temporal auditory cortex co-varied with changes of low-level features. Visual saliency was found to further boost activity in extra-striate visual cortex plus posterior parietal cortex, while auditory saliency was found to enhance activity in the superior temporal cortex. Data-driven ICA analyses of the same datasets also identified “sensory” networks comprising visual and auditory areas, but without providing specific information about the possible underlying processes, e.g., these processes could relate to modality, stimulus features and/or saliency. 
We conclude that the combination of computational modeling and GLM enables the tracking of the impact of bottom–up signals on brain activity during viewing of complex and dynamic multisensory stimuli, beyond the capability of purely data-driven approaches. PMID:23202431
Acquiring skill at medical image inspection: learning localized in early visual processes
NASA Astrophysics Data System (ADS)
Sowden, Paul T.; Davies, Ian R. L.; Roling, Penny; Watt, Simon J.
1997-04-01
Acquisition of the skill of medical image inspection could be due to changes in visual search processes, 'low-level' sensory learning, and higher-level 'conceptual learning.' Here, we report two studies that investigate the extent to which learning in medical image inspection involves low-level learning. Early in the visual processing pathway, cells are selective for direction of luminance contrast. We exploit this in the present studies by using transfer across direction of contrast as a 'marker' to indicate the level of processing at which learning occurs. In both studies, twelve observers trained for four days at detecting features in x-ray images (experiment one: discs in the Nijmegen phantom; experiment two: micro-calcification clusters in digitized mammograms). Half the observers examined negative luminance contrast versions of the images and the remainder examined positive contrast versions. On the fifth day, observers swapped to inspect their respective opposite contrast images. In both experiments, learning occurred across sessions. In experiment one, learning did not transfer across direction of luminance contrast, while in experiment two there was only partial transfer. These findings are consistent with the contention that some of the learning was localized early in the visual processing pathway. The implications of these results for current medical image inspection training schedules are discussed.
Comparing visual representations across human fMRI and computational vision
Leeds, Daniel D.; Seibert, Darren A.; Pyles, John A.; Tarr, Michael J.
2013-01-01
Feedforward visual object perception recruits a cortical network that is assumed to be hierarchical, progressing from basic visual features to complete object representations. However, the nature of the intermediate features related to this transformation remains poorly understood. Here, we explore how well different computer vision recognition models account for neural object encoding across the human cortical visual pathway as measured using fMRI. These neural data, collected during the viewing of 60 images of real-world objects, were analyzed with a searchlight procedure as in Kriegeskorte, Goebel, and Bandettini (2006): Within each searchlight sphere, the obtained patterns of neural activity for all 60 objects were compared to model responses for each computer recognition algorithm using representational dissimilarity analysis (Kriegeskorte et al., 2008). Although each of the computer vision methods significantly accounted for some of the neural data, among the different models, the scale invariant feature transform (Lowe, 2004), encoding local visual properties gathered from “interest points,” was best able to accurately and consistently account for stimulus representations within the ventral pathway. More generally, when present, significance was observed in regions of the ventral-temporal cortex associated with intermediate-level object perception. Differences in model effectiveness and the neural location of significant matches may be attributable to the fact that each model implements a different featural basis for representing objects (e.g., more holistic or more parts-based). Overall, we conclude that well-known computer vision recognition systems may serve as viable proxies for theories of intermediate visual object representation. PMID:24273227
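A minimal sketch of representational dissimilarity analysis, as named in the abstract above: build one dissimilarity matrix per representation (here 1 minus the Pearson correlation between response patterns), then compare the matrices through their upper triangles. All function names and the toy patterns are hypothetical. Using Pearson correlation to compare the two matrices is a simplifying assumption of this sketch (rank correlations are also common), and the searchlight step is omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rdm(patterns):
    """Representational dissimilarity matrix: 1 - r for every stimulus pair."""
    n = len(patterns)
    return [[1.0 - pearson(patterns[i], patterns[j]) for j in range(n)]
            for i in range(n)]


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def rdm_similarity(rdm_a, rdm_b):
    """Agreement between two RDMs (assumption: Pearson on upper triangles)."""
    return pearson(upper_triangle(rdm_a), upper_triangle(rdm_b))


# Toy response patterns for four stimuli (illustrative only).
neural = [[1, 2, 3, 4], [2, 1, 4, 3], [4, 3, 2, 1], [1, 3, 2, 4]]
# A hypothetical model whose responses are a linear rescaling of the neural ones.
model = [[2 * v + 1 for v in row] for row in neural]
similarity = rdm_similarity(rdm(neural), rdm(model))
print(round(similarity, 6))
```

Because the comparison runs on dissimilarity structure rather than raw responses, a model and a brain region can be compared even when their response spaces have different dimensionality.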
Saliency predicts change detection in pictures of natural scenes.
Wright, Michael J
2005-01-01
It has been proposed that the visual system encodes the salience of objects in the visual field in an explicit two-dimensional map that guides visual selective attention. Experiments were conducted to determine whether salience measurements applied to regions of pictures of outdoor scenes could predict the detection of changes in those regions. To obtain a quantitative measure of change detection, observers located changes in pairs of colour pictures presented across an interstimulus interval (ISI). Salience measurements were then obtained from different observers for image change regions using three independent methods, and all were positively correlated with change detection. Factor analysis extracted a single saliency factor that accounted for 62% of the variance contained in the four measures. Finally, estimates of the magnitude of the image change in each picture pair were obtained, using nine separate visual filters representing low-level vision features (luminance, colour, spatial frequency, orientation, edge density). None of the feature outputs was significantly associated with change detection or saliency. On the other hand it was shown that high-level (structural) properties of the changed region were related to saliency and to change detection: objects were more salient than shadows and more detectable when changed.
Schertz, Kathryn E; Sachdeva, Sonya; Kardan, Omid; Kotabe, Hiroki P; Wolf, Kathleen L; Berman, Marc G
2018-05-01
Prior research has shown that the physical characteristics of one's environment have wide ranging effects on affect and cognition. Other research has demonstrated that one's thoughts have impacts on mood and behavior, and in this three-part research program we investigated how physical features of the environment can alter thought content. In one study, we analyzed thousands of journal entries written by park visitors to examine how low-level and semantic visual features of the parks correlate with different thought topics. In a second study, we validated our ecological results by conducting an online study where participants were asked to write journal entries while imagining they were visiting a park, to ensure that results from Study 1 were not due to selection bias of park visitors. In the third study, we experimentally manipulated exposure to specific visual features to determine if they induced thinking about the same thought topics under more generalized conditions. Results from Study 3 demonstrated a potential causal role for perceived naturalness and high non-straight edges on thinking about "Nature", with a significant positive interaction. Results also showed a potential causal effect of naturalness and non-straight edges on thinking about topics related to "Spiritual & Life Journey", with perceived naturalness having a negative relationship and non-straight edges having a positive relationship. We also observed a significant positive interaction between non-straight edge density and naturalness in relation to "Spiritual & Life Journey". These results have implications for the design of the built environment to influence human reflection and well-being. Copyright © 2018 Elsevier B.V. All rights reserved.
In search of the emotional face: anger versus happiness superiority in visual search.
Savage, Ruth A; Lipp, Ottmar V; Craig, Belinda M; Becker, Stefanie I; Horstmann, Gernot
2013-08-01
Previous research has provided inconsistent results regarding visual search for emotional faces, yielding evidence for either anger superiority (i.e., more efficient search for angry faces) or happiness superiority effects (i.e., more efficient search for happy faces), suggesting that these results do not reflect on emotional expression, but on emotion (un-)related low-level perceptual features. The present study investigated possible factors mediating anger/happiness superiority effects; specifically search strategy (fixed vs. variable target search; Experiment 1), stimulus choice (Nimstim database vs. Ekman & Friesen database; Experiments 1 and 2), and emotional intensity (Experiment 3 and 3a). Angry faces were found faster than happy faces regardless of search strategy using faces from the Nimstim database (Experiment 1). By contrast, a happiness superiority effect was evident in Experiment 2 when using faces from the Ekman and Friesen database. Experiment 3 employed angry, happy, and exuberant expressions (Nimstim database) and yielded anger and happiness superiority effects, respectively, highlighting the importance of the choice of stimulus materials. Ratings of the stimulus materials collected in Experiment 3a indicate that differences in perceived emotional intensity, pleasantness, or arousal do not account for differences in search efficiency. Across three studies, the current investigation indicates that prior reports of anger or happiness superiority effects in visual search are likely to reflect on low-level visual features associated with the stimulus materials used, rather than on emotion. PsycINFO Database Record (c) 2013 APA, all rights reserved.
When Art Moves the Eyes: A Behavioral and Eye-Tracking Study
Massaro, Davide; Savazzi, Federica; Di Dio, Cinzia; Freedberg, David; Gallese, Vittorio; Gilli, Gabriella; Marchetti, Antonella
2012-01-01
The aim of this study was to investigate, using eye-tracking technique, the influence of bottom-up and top-down processes on visual behavior while subjects, naïve to art criticism, were presented with representational paintings. Forty-two subjects viewed color and black and white paintings (Color) categorized as dynamic or static (Dynamism) (bottom-up processes). Half of the images represented natural environments and half human subjects (Content); all stimuli were displayed under aesthetic and movement judgment conditions (Task) (top-down processes). Results on gazing behavior showed that content-related top-down processes prevailed over low-level visually-driven bottom-up processes when a human subject is represented in the painting. On the contrary, bottom-up processes, mediated by low-level visual features, particularly affected gazing behavior when looking at nature-content images. We discuss our results proposing a reconsideration of the definition of content-related top-down processes in accordance with the concept of embodied simulation in art perception. PMID:22624007
Visual Cortical Entrainment to Motion and Categorical Speech Features during Silent Lipreading
O’Sullivan, Aisling E.; Crosse, Michael J.; Di Liberto, Giovanni M.; Lalor, Edmund C.
2017-01-01
Speech is a multisensory percept, comprising an auditory and visual component. While the content and processing pathways of audio speech have been well characterized, the visual component is less well understood. In this work, we expand current methodologies using system identification to introduce a framework that facilitates the study of visual speech in its natural, continuous form. Specifically, we use models based on the unheard acoustic envelope (E), the motion signal (M) and categorical visual speech features (V) to predict EEG activity during silent lipreading. Our results show that each of these models performs similarly at predicting EEG in visual regions and that respective combinations of the individual models (EV, MV, EM and EMV) provide an improved prediction of the neural activity over their constituent models. In comparing these different combinations, we find that the model incorporating all three types of features (EMV) outperforms the individual models, as well as both the EV and MV models, while it performs similarly to the EM model. Importantly, EM does not outperform EV and MV, which, considering the higher dimensionality of the V model, suggests that more data is needed to clarify this finding. Nevertheless, the performance of EMV, and comparisons of the subject performances for the three individual models, provides further evidence to suggest that visual regions are involved in both low-level processing of stimulus dynamics and categorical speech perception. This framework may prove useful for investigating modality-specific processing of visual speech under naturalistic conditions. PMID:28123363
NASA Astrophysics Data System (ADS)
Kuvychko, Igor
2001-10-01
Vision is a part of a larger information system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, that is, an interpretation of visual information in terms of such knowledge models. A computer vision system based on such principles requires a unifying representation of perceptual and conceptual information. Computer simulation models are built on the basis of graphs/networks. The ability of the human brain to emulate similar graph/network models is found. That means a very important shift of paradigm in our knowledge about the brain, from neural networks to cortical software. Starting from the primary visual areas, the brain analyzes an image as a graph-type spatial structure. Primary areas provide active fusion of image features on a spatial grid-like structure, where nodes are cortical columns. The spatial combination of different neighboring features cannot be described as a statistical/integral characteristic of the analyzed region, but uniquely characterizes the region itself. Spatial logic and topology are naturally present in such structures. Mid-level vision processes like clustering, perceptual grouping, multilevel hierarchical compression, separation of figure from ground, etc. are special kinds of graph/network transformations. They convert low-level image structure into a set of more abstract ones, which represent objects and the visual scene, making them easy to analyze by higher-level knowledge structures. Higher-level vision phenomena like shape from shading, occlusion, etc. are results of such analysis. Such an approach gives the opportunity not only to explain frequently unexplainable results of cognitive science, but also to create intelligent computer vision systems that simulate perceptual processes in both the what and where visual pathways. Such systems can open new horizons for the robotics and computer vision industries.
Single Trial EEG Patterns for the Prediction of Individual Differences in Fluid Intelligence.
Qazi, Emad-Ul-Haq; Hussain, Muhammad; Aboalsamh, Hatim; Malik, Aamir Saeed; Amin, Hafeez Ullah; Bamatraf, Saeed
2016-01-01
Assessing a person's intelligence level is required in many situations, such as career counseling and clinical applications. EEG evoked potentials in oddball task and fluid intelligence score are correlated because both reflect the cognitive processing and attention. A system for prediction of an individual's fluid intelligence level using single trial Electroencephalography (EEG) signals has been proposed. For this purpose, we employed 2D and 3D contents and 34 subjects each for 2D and 3D, which were divided into low-ability (LA) and high-ability (HA) groups using Raven's Advanced Progressive Matrices (RAPM) test. Using visual oddball cognitive task, neural activity of each group was measured and analyzed over three midline electrodes (Fz, Cz, and Pz). To predict whether an individual belongs to LA or HA group, features were extracted using wavelet decomposition of EEG signals recorded in visual oddball task and support vector machine (SVM) was used as a classifier. Two different types of Haar wavelet transform based features have been extracted from the band (0.3 to 30 Hz) of EEG signals. Statistical wavelet features and wavelet coefficient features from the frequency bands 0.0-1.875 Hz (delta low) and 1.875-3.75 Hz (delta high), resulted in the 100 and 98% prediction accuracies, respectively, both for 2D and 3D contents. The analysis of these frequency bands showed clear difference between LA and HA groups. Further, discriminative values of the features have been validated using statistical significance tests and inter-class and intra-class variation analysis. Also, statistical test showed that there was no effect of 2D and 3D content on the assessment of fluid intelligence level. Comparisons with state-of-the-art techniques showed the superiority of the proposed system.
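The Haar wavelet decomposition named in the abstract above can be sketched as repeated pairwise averaging and differencing. Each level splits the current band in half, so deeper levels isolate progressively lower frequency bands. The function names and the summary statistics in `band_features` are hypothetical stand-ins; the abstract does not specify which statistics were used, and the SVM classification stage is omitted here.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail) coefficients, each half as long as
    the input. The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Multi-level decomposition: returns the final approximation and a
    list of detail bands, ordered from finest to coarsest."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def band_features(coeffs):
    """Hypothetical per-band statistics (assumption, not from the paper)."""
    n = len(coeffs)
    mean = sum(coeffs) / n
    var = sum((c - mean) ** 2 for c in coeffs) / n
    return {"mean": mean, "std": math.sqrt(var),
            "energy": sum(c * c for c in coeffs)}


# Toy signal (illustrative only, not EEG data).
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(signal, levels=2)
features = [band_features(band) for band in details + [approx]]
```

Because the transform is orthonormal, the total energy of the coefficients equals the energy of the input signal, which gives a quick sanity check on any implementation.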
Modeling global scene factors in attention
NASA Astrophysics Data System (ADS)
Torralba, Antonio
2003-07-01
Models of visual attention have focused predominantly on bottom-up approaches that ignored structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition. 2003 Optical Society of America
Visual feature discrimination versus compression ratio for polygonal shape descriptors
NASA Astrophysics Data System (ADS)
Heuer, Joerg; Sanahuja, Francesc; Kaup, Andre
2000-10-01
In the last decade several methods for low level indexing of visual features appeared. Most often these were evaluated with respect to their discrimination power using measures like precision and recall. Accordingly, the targeted application was indexing of visual data within databases. During the standardization process of MPEG-7 the view on indexing of visual data changed, taking also communication aspects into account where coding efficiency is important. Even if the descriptors used for indexing are small compared to the size of images, it is recognized that there can be several descriptors linked to an image, characterizing different features and regions. Beside the importance of a small memory footprint for the transmission of the descriptor and the memory footprint in a database, eventually the search and filtering can be sped up by reducing the dimensionality of the descriptor if the metric of the matching can be adjusted. Based on a polygon shape descriptor presented for MPEG-7 this paper compares the discrimination power versus memory consumption of the descriptor. Different methods based on quantization are presented and their effect on the retrieval performance are measured. Finally an optimized computation of the descriptor is presented.
Asymmetries in visual search for conjunctive targets.
Cohen, A
1993-08-01
Asymmetry is demonstrated between conjunctive targets in visual search with no detectable asymmetries between the individual features that compose these targets. Experiment 1 demonstrated this phenomenon for targets composed of color and shape. Experiments 2 and 4 demonstrate this asymmetry for targets composed of size and orientation and for targets composed of contrast level and orientation, respectively. Experiment 3 demonstrates that search rate of individual features cannot predict search rate for conjunctive targets. These results demonstrate the need for 2 levels of representations: one of features and one of conjunction of features. A model related to the modified feature integration theory is proposed to account for these results. The proposed model and other models of visual search are discussed.
Perceptual learning in visual search: fast, enduring, but non-specific.
Sireteanu, R; Rettenbach, R
1995-07-01
Visual search has been suggested as a tool for isolating visual primitives. Elementary "features" were proposed to involve parallel search, while serial search is necessary for items without a "feature" status, or, in some cases, for conjunctions of "features". In this study, we investigated the role of practice in visual search tasks. We found that, under some circumstances, initially serial tasks can become parallel after a few hundred trials. Learning in visual search is far less specific than learning of visual discriminations and hyperacuity, suggesting that it takes place at another level in the central visual pathway, involving different neural circuits.
GAFFE: a gaze-attentive fixation finding engine.
Rajashekar, U; van der Linde, I; Bovik, A C; Cormack, L K
2008-04-01
The ability to automatically detect visually interesting regions in images has many practical applications, especially in the design of active machine vision and automatic visual surveillance systems. Analysis of the statistics of image features at observers' gaze can provide insights into the mechanisms of fixation selection in humans. Using a foveated analysis framework, we studied the statistics of four low-level local image features: luminance, contrast, and bandpass outputs of both luminance and contrast, and discovered that image patches around human fixations had, on average, higher values of each of these features than image patches selected at random. Contrast-bandpass showed the greatest difference between human and random fixations, followed by luminance-bandpass, RMS contrast, and luminance. Using these measurements, we present a new algorithm that selects image regions as likely candidates for fixation. These regions are shown to correlate well with fixations recorded from human observers.
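Two of the patch statistics named in the abstract above, luminance and RMS contrast, can be sketched as follows. The sketch assumes RMS contrast is the standard deviation of patch luminance divided by its mean, which is one common definition and may differ from the paper's exact formulation. The function names and patch values are hypothetical, and the foveated weighting and bandpass filtering described in the abstract are not modelled.

```python
import math


def mean_luminance(patch):
    """Average pixel value of a 2D patch given as a list of rows."""
    pixels = [v for row in patch for v in row]
    return sum(pixels) / len(pixels)


def rms_contrast(patch):
    """RMS contrast, assumed here to be std(luminance) / mean(luminance).

    Returns 0.0 for an all-black patch to avoid division by zero.
    """
    pixels = [v for row in patch for v in row]
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        return 0.0
    var = sum((v - mean) ** 2 for v in pixels) / len(pixels)
    return math.sqrt(var) / mean


# Toy patches (illustrative only): a textured patch and a flat patch.
textured_patch = [[1.0, 3.0], [1.0, 3.0]]
flat_patch = [[2.0, 2.0], [2.0, 2.0]]
print(rms_contrast(textured_patch), rms_contrast(flat_patch))
```

Comparing such statistics between patches centred on recorded fixations and patches drawn at random is the kind of contrast the abstract reports.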
Atoms of recognition in human and computer vision.
Ullman, Shimon; Assif, Liav; Fetaya, Ethan; Harari, Daniel
2016-03-08
Discovering the visual features and representations used by the brain to recognize objects is a central problem in the study of vision. Recently, neural network models of visual object recognition, including biological and deep network models, have shown remarkable progress and have begun to rival human performance in some challenging tasks. These models are trained on image examples and learn to extract features and representations and to use them for categorization. It remains unclear, however, whether the representations and learning processes discovered by current models are similar to those used by the human visual system. Here we show, by introducing and using minimal recognizable images, that the human visual system uses features and processes that are not used by current models and that are critical for recognition. We found by psychophysical studies that at the level of minimal recognizable images a minute change in the image can have a drastic effect on recognition, thus identifying features that are critical for the task. Simulations then showed that current models cannot explain this sensitivity to precise feature configurations and, more generally, do not learn to recognize minimal images at a human level. The role of the features shown here is revealed uniquely at the minimal level, where the contribution of each feature is essential. A full understanding of the learning and use of such features will extend our understanding of visual recognition and its cortical mechanisms and will enhance the capacity of computational models to learn from visual experience and to deal with recognition and detailed image interpretation.
Wen, Haiguang; Shi, Junxing; Chen, Wei; Liu, Zhongming
2018-02-28
The brain represents visual objects with topographic cortical patterns. To address how distributed visual representations enable object categorization, we established predictive encoding models based on a deep residual network, and trained them to predict cortical responses to natural movies. Using this predictive model, we mapped human cortical representations to 64,000 visual objects from 80 categories with high throughput and accuracy. Such representations covered both the ventral and dorsal pathways, reflected multiple levels of object features, and preserved semantic relationships between categories. In the entire visual cortex, object representations were organized into three clusters of categories: biological objects, non-biological objects, and background scenes. In a finer scale specific to each cluster, object representations revealed sub-clusters for further categorization. Such hierarchical clustering of category representations was mostly contributed by cortical representations of object features from middle to high levels. In summary, this study demonstrates a useful computational strategy to characterize the cortical organization and representations of visual features for rapid categorization.
A model of attention-guided visual perception and recognition.
Rybak, I A; Gusakova, V I; Golovan, A V; Podladchikova, L N; Shevtsova, N A
1998-08-01
A model of visual perception and recognition is described. The model contains: (i) a low-level subsystem which performs both a fovea-like transformation and detection of primary features (edges), and (ii) a high-level subsystem which includes separated 'what' (sensory memory) and 'where' (motor memory) structures. Image recognition occurs during the execution of a 'behavioral recognition program' formed during the primary viewing of the image. The recognition program contains both programmed attention window movements (stored in the motor memory) and predicted image fragments (stored in the sensory memory) for each consecutive fixation. The model shows the ability to recognize complex images (e.g. faces) invariantly with respect to shift, rotation and scale.
Liu, Jianli; Lughofer, Edwin; Zeng, Xianyi
2015-01-01
Modeling human aesthetic perception of visual textures is important and valuable in numerous industrial domains, such as product design, architectural design, and decoration. Based on results from a semantic differential rating experiment, we modeled the relationship between low-level basic texture features and aesthetic properties involved in human aesthetic texture perception. First, we compute basic texture features from textural images using four classical methods. These features are neutral, objective, and independent of the socio-cultural context of the visual textures. Then, we conduct a semantic differential rating experiment to collect from evaluators their aesthetic perceptions of selected textural stimuli. In the semantic differential rating experiment, eight pairs of aesthetic properties are chosen, which are strongly related to the socio-cultural context of the selected textures and to human emotions. They are easily understood and connected to everyday life. We propose a hierarchical feed-forward layer model of aesthetic texture perception and assign the eight pairs of aesthetic properties to different layers. Finally, we describe the generation of multiple linear and non-linear regression models for aesthetic prediction by taking dimensionality-reduced texture features and aesthetic properties of visual textures as dependent and independent variables, respectively. Our experimental results indicate that the relationships between each layer and its neighbors in the hierarchical feed-forward layer model of aesthetic texture perception can be fitted well by linear functions, and the models thus generated can successfully bridge the gap between computational texture features and aesthetic texture properties.
Kovalenko, Lyudmyla Y; Chaumon, Maximilien; Busch, Niko A
2012-07-01
Semantic processing of verbal and visual stimuli has been investigated in semantic violation or semantic priming paradigms in which a stimulus is either related or unrelated to a previously established semantic context. A hallmark of semantic priming is the N400 event-related potential (ERP)--a deflection of the ERP that is more negative for semantically unrelated target stimuli. The majority of studies investigating the N400 and semantic integration have used verbal material (words or sentences), and standardized stimulus sets with norms for semantic relatedness have been published for verbal but not for visual material. However, semantic processing of visual objects (as opposed to words) is an important issue in research on visual cognition. In this study, we present a set of 800 pairs of semantically related and unrelated visual objects. The images were rated for semantic relatedness by a sample of 132 participants. Furthermore, we analyzed low-level image properties and matched the two semantic categories according to these features. An ERP study confirmed the suitability of this image set for evoking a robust N400 effect of semantic integration. Additionally, using a general linear modeling approach of single-trial data, we also demonstrate that low-level visual image properties and semantic relatedness are in fact only minimally overlapping. The image set is available for download from the authors' website. We expect that the image set will facilitate studies investigating mechanisms of semantic and contextual processing of visual stimuli.
Low-Visibility Visual Simulation with Real Fog
NASA Technical Reports Server (NTRS)
Chase, Wendell D.
1982-01-01
An environmental fog simulation (EFS) attachment was developed to aid in the study of natural low-visibility visual cues and subsequently used to examine the realism effect upon the aircraft simulator visual scene. A review of the basic fog equations indicated that two major factors must be accounted for in the simulation of low visibility: one due to atmospheric attenuation and one due to veiling luminance. These factors are compared systematically by comparing actual measurements to those computed from the fog equations, and comparing runway-visual-range-related visual-scene contrast values with the calculated values. These values are also compared with the simulated equivalent equations and with contrast measurements obtained from a current electronic fog synthesizer to help identify areas in which improvements are needed. These differences in technique, the measured values, the features of both systems, a pilot opinion survey of the EFS fog, and improvements (by combining features of both systems) that are expected to significantly increase the potential as well as flexibility for producing a very high-fidelity, low-visibility visual simulation are discussed.
Low-visibility visual simulation with real fog
NASA Technical Reports Server (NTRS)
Chase, W. D.
1981-01-01
An environmental fog simulation (EFS) attachment was developed to aid in the study of natural low-visibility visual cues and subsequently used to examine the realism effect upon the aircraft simulator visual scene. A review of the basic fog equations indicated that two major factors must be accounted for in the simulation of low visibility - one due to atmospheric attenuation and one due to veiling luminance. These factors are compared systematically by (1) comparing actual measurements to those computed from the fog equations, and (2) comparing runway-visual-range-related visual-scene contrast values with the calculated values. These values are also compared with the simulated equivalent equations and with contrast measurements obtained from a current electronic fog synthesizer to help identify areas in which improvements are needed. These differences in technique, the measured values, the features of both systems, a pilot opinion survey of the EFS fog, and improvements (by combining features of both systems) that are expected to significantly increase the potential as well as flexibility for producing a very high-fidelity low-visibility visual simulation are discussed.
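The abstract names two factors, atmospheric attenuation and veiling luminance, without stating the fog equations themselves. A minimal sketch follows, assuming the standard Koschmieder form, in which the object's own luminance is attenuated exponentially with distance while scattered light adds a veiling term that approaches the horizon luminance. Whether the report uses exactly this form is an assumption; the function and parameter names are hypothetical.

```python
import math

def apparent_luminance(l_object, l_horizon, sigma, distance):
    """Koschmieder form: attenuated object light plus veiling (scattered) light.

    sigma is the extinction coefficient in 1/m; distance is in m.
    """
    transmittance = math.exp(-sigma * distance)
    attenuated = l_object * transmittance          # atmospheric attenuation
    veiling = l_horizon * (1.0 - transmittance)    # veiling luminance
    return attenuated + veiling

def apparent_contrast(inherent_contrast, sigma, distance):
    """Contrast against the horizon sky decays as exp(-sigma * distance)."""
    return inherent_contrast * math.exp(-sigma * distance)

def visual_range(sigma, threshold=0.02):
    """Distance at which a black object's contrast falls to the threshold."""
    return -math.log(threshold) / sigma

# Illustrative values only: a dark runway marking seen through dense fog.
sigma = 0.01            # extinction coefficient, 1/m (assumed)
l_obj, l_hor = 20.0, 100.0
c0 = (l_obj - l_hor) / l_hor
seen = apparent_luminance(l_obj, l_hor, sigma, 200.0)
c_seen = (seen - l_hor) / l_hor
```

The two functions agree by construction: the contrast computed from `apparent_luminance` equals `apparent_contrast(c0, sigma, distance)`, which is why visual-scene contrast values can be checked against a single exponential.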
Ratner, Kyle G; Kaul, Christian; Van Bavel, Jay J
2013-10-01
Several theories suggest that people do not represent race when it does not signify group boundaries. However, race is often associated with visually salient differences in skin tone and facial features. In this study, we investigated whether race could be decoded from distributed patterns of neural activity in the fusiform gyri and early visual cortex when visual features that often covary with race were orthogonal to group membership. To this end, we used multivariate pattern analysis to examine an fMRI dataset that was collected while participants assigned to mixed-race groups categorized own-race and other-race faces as belonging to their newly assigned group. Whereas conventional univariate analyses provided no evidence of race-based responses in the fusiform gyri or early visual cortex, multivariate pattern analysis suggested that race was represented within these regions. Moreover, race was represented in the fusiform gyri to a greater extent than early visual cortex, suggesting that the fusiform gyri results do not merely reflect low-level perceptual information (e.g. color, contrast) from early visual cortex. These findings indicate that patterns of activation within specific regions of the visual cortex may represent race even when overall activation in these regions is not driven by racial information.
Behavior analysis for elderly care using a network of low-resolution visual sensors
NASA Astrophysics Data System (ADS)
Eldib, Mohamed; Deboeverie, Francis; Philips, Wilfried; Aghajan, Hamid
2016-07-01
Recent advancements in visual sensor technologies have made behavior analysis practical for in-home monitoring systems. The current in-home monitoring systems face several challenges: (1) visual sensor calibration is a difficult task and not practical in real life because of the need for recalibration when the visual sensors are moved accidentally by a caregiver or the senior citizen, (2) privacy concerns, and (3) the high hardware installation cost. We propose to use a network of cheap low-resolution visual sensors (30×30 pixels) for long-term behavior analysis. The behavior analysis starts by visual feature selection based on foreground/background detection to track the motion level in each visual sensor. Then a hidden Markov model (HMM) is used to estimate the user's locations without calibration. Finally, an activity discovery approach is proposed using spatial and temporal contexts. We performed experiments on 10 months of real-life data. We show that the HMM approach outperforms the k-nearest neighbor classifier against ground truth for 30 days. Our framework is able to discover 13 activities of daily living (ADL parameters). More specifically, we analyze mobility patterns and some of the key ADL parameters to detect increasing or decreasing health conditions.
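The HMM step above, estimating a location sequence from per-sensor motion levels, can be sketched with Viterbi decoding. The abstract does not say which decoding algorithm or model parameters the authors use, so everything below is an illustrative assumption: the two rooms, the observation symbols (which sensor currently reports the most motion), and all probabilities are hypothetical.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for a discrete HMM."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            layer[s] = max(
                (prev[p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(layer)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path

# Hypothetical two-room home; the observation is the sensor with the most motion.
states = ["kitchen", "bedroom"]
start_p = {"kitchen": 0.5, "bedroom": 0.5}
trans_p = {
    "kitchen": {"kitchen": 0.8, "bedroom": 0.2},
    "bedroom": {"kitchen": 0.2, "bedroom": 0.8},
}
emit_p = {
    "kitchen": {"cam_k": 0.9, "cam_b": 0.1},
    "bedroom": {"cam_k": 0.2, "cam_b": 0.8},
}
observed = ["cam_k", "cam_k", "cam_b", "cam_b"]
locations = viterbi(observed, states, start_p, trans_p, emit_p)
```

Because the states are rooms rather than image coordinates, a model of this kind needs no camera calibration, which matches the motivation given in the abstract. For long sequences a practical version would work in log probabilities to avoid underflow.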
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias J.; Petro, Nathan M.; Bradley, Margaret M.; Keil, Andreas
2015-01-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene’s physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. PMID:25640949
Miskovic, Vladimir; Martinovic, Jasna; Wieser, Matthias J; Petro, Nathan M; Bradley, Margaret M; Keil, Andreas
2015-03-01
Emotionally arousing scenes readily capture visual attention, prompting amplified neural activity in sensory regions of the brain. The physical stimulus features and related information channels in the human visual system that contribute to this modulation, however, are not known. Here, we manipulated low-level physical parameters of complex scenes varying in hedonic valence and emotional arousal in order to target the relative contributions of luminance based versus chromatic visual channels to emotional perception. Stimulus-evoked brain electrical activity was measured during picture viewing and used to quantify neural responses sensitive to lower-tier visual cortical involvement (steady-state visual evoked potentials) as well as the late positive potential, reflecting a more distributed cortical event. Results showed that the enhancement for emotional content was stimulus-selective when examining the steady-state segments of the evoked visual potentials. Response amplification was present only for low spatial frequency, grayscale stimuli, and not for high spatial frequency, red/green stimuli. In contrast, the late positive potential was modulated by emotion regardless of the scene's physical properties. Our findings are discussed in relation to neurophysiologically plausible constraints operating at distinct stages of the cortical processing stream. Copyright © 2015 Elsevier B.V. All rights reserved.
Kurtz, Camille; Depeursinge, Adrien; Napel, Sandy; Beaulieu, Christopher F.; Rubin, Daniel L.
2014-01-01
Computer-assisted image retrieval applications can assist radiologists by identifying similar images in archives as a means to providing decision support. In the classical case, images are described using low-level features extracted from their contents, and an appropriate distance is used to find the best matches in the feature space. However, using low-level image features to fully capture the visual appearance of diseases is challenging and the semantic gap between these features and the high-level visual concepts in radiology may impair the system performance. To deal with this issue, the use of semantic terms to provide high-level descriptions of radiological image contents has recently been advocated. Nevertheless, most of the existing semantic image retrieval strategies are limited by two factors: they require manual annotation of the images using semantic terms and they ignore the intrinsic visual and semantic relationships between these annotations during the comparison of the images. Based on these considerations, we propose an image retrieval framework based on semantic features that relies on two main strategies: (1) automatic “soft” prediction of ontological terms that describe the image contents from multi-scale Riesz wavelets and (2) retrieval of similar images by evaluating the similarity between their annotations using a new term dissimilarity measure, which takes into account both image-based and ontological term relations. The combination of these strategies provides a means of accurately retrieving similar images in databases based on image annotations and can be considered as a potential solution to the semantic gap problem. We validated this approach in the context of the retrieval of liver lesions from computed tomographic (CT) images and annotated with semantic terms of the RadLex ontology. 
The relevance of the retrieval results was assessed using two protocols: evaluation relative to a dissimilarity reference standard defined for pairs of images on a 25-image dataset, and evaluation relative to the diagnoses of the retrieved images on a 72-image dataset. A normalized discounted cumulative gain (NDCG) score of more than 0.92 was obtained with the first protocol, while AUC scores of more than 0.77 were obtained with the second protocol. This automatic approach could provide real-time decision support to radiologists by showing them similar images with associated diagnoses and, where available, responses to therapies. PMID:25036769
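The NDCG score reported above rewards rankings that place the most relevant images first. A minimal sketch of the metric follows, assuming the common linear-gain, log2-discount form; the abstract does not state which NDCG variant or relevance scale the authors used, and the relevance grades below are made up for illustration.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: relevance at rank r is divided by log2(r + 1)."""
    # enumerate() is 0-based, so rank r = index + 1 and the discount is log2(index + 2).
    return sum(rel / math.log2(index + 2) for index, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical graded relevance of five retrieved images, in ranked order.
good_ranking = [3, 3, 2, 1, 0]
poor_ranking = [0, 1, 2, 3, 3]
score_good = ndcg(good_ranking)
score_poor = ndcg(poor_ranking)
```

A score of 1.0 means the retrieved order matches the ideal order, so a value above 0.92 indicates rankings close to the reference standard.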
Basic Visual Merchandising. Second Edition. [Student's Manual and] Answer Book/Teacher's Guide.
ERIC Educational Resources Information Center
Luter, Robert R.
This student's manual that features content needed to do tasks related to visual merchandising is intended for students in co-op training stations and entry-level, master employee, and supervisory-level employees. It contains 13 assignments. Each assignment has questions covering specific information and also features activities in which students…
Residual attention guidance in blindsight monkeys watching complex natural scenes.
Yoshida, Masatoshi; Itti, Laurent; Berg, David J; Ikeda, Takuro; Kato, Rikako; Takaura, Kana; White, Brian J; Munoz, Douglas P; Isa, Tadashi
2012-08-07
Patients with damage to primary visual cortex (V1) demonstrate residual performance on laboratory visual tasks despite denial of conscious seeing (blindsight) [1]. After a period of recovery, which suggests a role for plasticity [2], visual sensitivity higher than chance is observed in humans and monkeys for simple luminance-defined stimuli, grating stimuli, moving gratings, and other stimuli [3-7]. Some residual cognitive processes including bottom-up attention and spatial memory have also been demonstrated [8-10]. To date, little is known about blindsight with natural stimuli and spontaneous visual behavior. In particular, is orienting attention toward salient stimuli during free viewing still possible? We used a computational saliency map model to analyze spontaneous eye movements of monkeys with blindsight from unilateral ablation of V1. Despite general deficits in gaze allocation, monkeys were significantly attracted to salient stimuli. The contribution of orientation features to salience was nearly abolished, whereas contributions of motion, intensity, and color features were preserved. Control experiments employing laboratory stimuli confirmed the free-viewing finding that lesioned monkeys retained color sensitivity. Our results show that attention guidance over complex natural scenes is preserved in the absence of V1, thereby directly challenging theories and models that crucially depend on V1 to compute the low-level visual features that guide attention. Copyright © 2012 Elsevier Ltd. All rights reserved.
What you fear will appear: detection of schematic spiders in spider fear.
Peira, Nathalie; Golkar, Armita; Larsson, Maria; Wiens, Stefan
2010-01-01
Various experimental tasks suggest that fear guides attention. However, because these tasks often lack ecological validity, it is unclear to what extent results from these tasks can be generalized to real-life situations. In change detection tasks, a brief interruption of the visual input (i.e., a blank interval or a scene cut) often results in undetected changes in the scene. This setup resembles real-life viewing behavior and is used here to increase ecological validity of the attentional task without compromising control over the stimuli presented. Spider-fearful and nonfearful women detected schematic spiders and flowers that were added to one of two identical background pictures that alternated with a brief blank in between them (i.e., flicker paradigm). Results showed that spider-fearful women detected spiders (but not flowers) faster than did nonfearful women. Because spiders and flowers had similar low-level features, these findings suggest that fear guides attention on the basis of object features rather than simple low-level features.
Image annotation based on positive-negative instances learning
NASA Astrophysics Data System (ADS)
Zhang, Kai; Hu, Jiwei; Liu, Quan; Lou, Ping
2017-07-01
Automatic image annotation is a difficult task in computer vision; its main purpose is to help manage the massive number of images on the Internet and to assist intelligent retrieval. This paper designs a new image annotation model based on a visual bag of words, using low-level features such as color and texture information as well as mid-level features such as SIFT, and combining pic2pic, label2pic, and label2label correlations to measure the degree of correlation between labels and images. We aim to prune the specific features for each single label and formalize the annotation task as a learning process based on Positive-Negative Instances Learning. Experiments performed on the Corel5K dataset give promising results compared with other existing methods.
Mid-level perceptual features distinguish objects of different real-world sizes.
Long, Bria; Konkle, Talia; Cohen, Michael A; Alvarez, George A
2016-01-01
Understanding how perceptual and conceptual representations are connected is a fundamental goal of cognitive science. Here, we focus on a broad conceptual distinction that constrains how we interact with objects--real-world size. Although there appear to be clear perceptual correlates for basic-level categories (apples look like other apples, oranges look like other oranges), the perceptual correlates of broader categorical distinctions are largely unexplored, i.e., do small objects look like other small objects? Because there are many kinds of small objects (e.g., cups, keys), there may be no reliable perceptual features that distinguish them from big objects (e.g., cars, tables). Contrary to this intuition, we demonstrated that big and small objects have reliable perceptual differences that can be extracted by early stages of visual processing. In a series of visual search studies, participants found target objects faster when the distractor objects differed in real-world size. These results held when we broadly sampled big and small objects, when we controlled for low-level features and image statistics, and when we reduced objects to texforms--unrecognizable textures that loosely preserve an object's form. However, this effect was absent when we used more basic textures. These results demonstrate that big and small objects have reliably different mid-level perceptual features, and suggest that early perceptual information about broad-category membership may influence downstream object perception, recognition, and categorization processes. (c) 2015 APA, all rights reserved.
Neural pathways for visual speech perception
Bernstein, Lynne E.; Liebenthal, Einat
2014-01-01
This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA) has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA. PMID:25520611
Line drawing extraction from gray level images by feature integration
NASA Astrophysics Data System (ADS)
Yoo, Hoi J.; Crevier, Daniel; Lepage, Richard; Myler, Harley R.
1994-10-01
We describe procedures that extract line drawings from digitized gray level images, without use of domain knowledge, by modeling preattentive and perceptual organization functions of the human visual system. First, edge points are identified by standard low-level processing, based on the Canny edge operator. Edge points are then linked into single-pixel thick straight-line segments and circular arcs: this operation serves to both filter out isolated and highly irregular segments, and to lump the remaining points into a smaller number of structures for manipulation by later stages of processing. The next stages consist of linking the segments into a set of closed boundaries, which is the system's definition of a line drawing. According to the principles of Gestalt psychology, closure allows us to organize the world by filling in the gaps in a visual stimulation so as to perceive whole objects instead of disjoint parts. To achieve such closure, the system selects particular features or combinations of features by methods akin to those of preattentive processing in humans: features include gaps, pairs of straight or curved parallel lines, L- and T-junctions, pairs of symmetrical lines, and the orientation and length of single lines. These preattentive features are grouped into higher-level structures according to the principles of proximity, similarity, closure, symmetry, and feature conjunction. Achieving closure may require supplying missing segments linking contour concavities. Choices are made between competing structures on the basis of their overall compliance with the principles of closure and symmetry. Results include clean line drawings of curvilinear manufactured objects. The procedures described are part of a system called VITREO (viewpoint-independent 3-D recognition and extraction of objects).
Comparing object recognition from binary and bipolar edge images for visual prostheses.
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2016-11-01
Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition.
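The difference between binary and bipolar edge images can be shown on a single row of luminance values. This is a sketch under stated assumptions: the abstract does not specify the edge filter, so a 1-D second difference (a discrete Laplacian) stands in for it, and the threshold and function names are hypothetical. A binary image keeps only the magnitude of the filter response, while a bipolar image also keeps its sign and therefore needs three display levels (black, gray, white).

```python
def laplacian_1d(row):
    """Second difference of a luminance row (interior samples only)."""
    return [row[i - 1] - 2 * row[i] + row[i + 1] for i in range(1, len(row) - 1)]

def binary_edges(row, threshold):
    """1 where an edge is present, 0 elsewhere; polarity is discarded."""
    return [1 if abs(v) > threshold else 0 for v in laplacian_1d(row)]

def bipolar_edges(row, threshold):
    """+1 (white), -1 (black), or 0 (gray background); polarity is kept."""
    return [1 if v > threshold else -1 if v < -threshold else 0
            for v in laplacian_1d(row)]

dark_to_bright = [10, 10, 10, 50, 50, 50]
bright_to_dark = [50, 50, 50, 10, 10, 10]

binary_a = binary_edges(dark_to_bright, 20)
binary_b = binary_edges(bright_to_dark, 20)
bipolar_a = bipolar_edges(dark_to_bright, 20)
bipolar_b = bipolar_edges(bright_to_dark, 20)
```

The two steps give identical binary edge rows but mirrored bipolar rows, so only the bipolar form tells the observer which side of the edge is brighter. This is the extra information that the study links to shape-from-shading cues.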
Topic Transition in Educational Videos Using Visually Salient Words
ERIC Educational Resources Information Center
Gandhi, Ankit; Biswas, Arijit; Deshmukh, Om
2015-01-01
In this paper, we propose a visual saliency algorithm for automatically finding the topic transition points in an educational video. First, we propose a method for assigning a saliency score to each word extracted from an educational video. We design several mid-level features that are indicative of visual saliency. The optimal feature combination…
Visualizing Dynamic Weather and Ocean Data in Google Earth
NASA Astrophysics Data System (ADS)
Castello, C.; Giencke, P.
2008-12-01
Katrina. Climate change. Rising sea levels. Low lake levels. These headliners, and countless others like them, underscore the need to better understand our changing oceans and lakes. Over the past decade, efforts such as the Global Ocean Observing System (GOOS) have added to this understanding, through the creation of interoperable ocean observing systems. These systems, including buoy networks, gliders, UAVs, etc., have resulted in a dramatic increase in the amount of Earth observation data available to the public. Unfortunately, these data tend to be restrictive to mass consumption, owing to large file sizes, incompatible formats, and/or a dearth of user-friendly visualization software. Google Earth offers a flexible way to visualize Earth observation data. Marrying high-resolution orthoimagery, user-friendly query and navigation tools, and the power of OGC's KML standard, Google Earth can make observation data universally understandable and accessible. This presentation will feature examples of meteorological and oceanographic data visualized using KML and Google Earth, along with tools and tips for integrating other such environmental datasets.
Exploration of complex visual feature spaces for object perception
Leeds, Daniel D.; Pyles, John A.; Tarr, Michael J.
2014-01-01
The mid- and high-level visual properties supporting object perception in the ventral visual pathway are poorly understood. In the absence of well-specified theory, many groups have adopted a data-driven approach in which they progressively interrogate neural units to establish each unit's selectivity. Such methods are challenging in that they require search through a wide space of feature models and stimuli using a limited number of samples. To more rapidly identify higher-level features underlying human cortical object perception, we implemented a novel functional magnetic resonance imaging method in which visual stimuli are selected in real-time based on BOLD responses to recently shown stimuli. This work was inspired by earlier primate physiology work, in which neural selectivity for mid-level features in IT was characterized using a simple parametric approach (Hung et al., 2012). To extend such work to human neuroimaging, we used natural and synthetic object stimuli embedded in feature spaces constructed on the basis of the complex visual properties of the objects themselves. During fMRI scanning, we employed a real-time search method to control continuous stimulus selection within each image space. This search was designed to maximize neural responses across a pre-determined 1 cm³ brain region within ventral cortex. To assess the value of this method for understanding object encoding, we examined both the behavior of the method itself and the complex visual properties the method identified as reliably activating selected brain regions. We observed: (1) Regions selective for both holistic and component object features and for a variety of surface properties; (2) Object stimulus pairs near one another in feature space that produce responses at the opposite extremes of the measured activity range.
Together, these results suggest that real-time fMRI methods may yield more widely informative measures of selectivity within the broad classes of visual features associated with cortical object representation. PMID:25309408
Kotchoubey, Boris; Pavlov, Yuri G.; Kleber, Boris
2015-01-01
According to a prevailing view, the visual system works by dissecting stimuli into primitives, whereas the auditory system processes simple and complex stimuli with their corresponding features in parallel. This makes musical stimulation particularly suitable for patients with disorders of consciousness (DoC), because the processing pathways related to complex stimulus features can be preserved even when those related to simple features are no longer available. An additional factor speaking in favor of musical stimulation in DoC is the low efficiency of visual stimulation due to prevalent maladies of vision or gaze fixation in DoC patients. Hearing disorders, in contrast, are much less frequent in DoC, which allows us to use auditory stimulation at various levels of complexity. The current paper overviews empirical data concerning the four main domains of brain functioning in DoC patients that musical stimulation can address: perception (e.g., pitch, timbre, and harmony), cognition (e.g., musical syntax and meaning), emotions, and motor functions. Music can approach basic levels of patients’ self-consciousness, which may even exist when all higher-level cognitions are lost, whereas music induced emotions and rhythmic stimulation can affect the dopaminergic reward-system and activity in the motor system respectively, thus serving as a starting point for rehabilitation. PMID:26640445
Models of Speed Discrimination
NASA Technical Reports Server (NTRS)
1997-01-01
The prime purpose of this project was to investigate various theoretical issues concerning the integration of information across visual space. To date, most of the research efforts in the study of the visual system seem to have been focused in two almost non-overlapping directions. One research focus has been low-level perception as studied by psychophysics. The other focus has been the study of high-level vision, exemplified by the study of object perception. Most of the effort in psychophysics has been devoted to the search for the fundamental "features" of perception. The general idea is that the most peripheral processes of the visual system decompose the input into features that are then used for classification and recognition. The experimental and theoretical focus has been on finding and describing these analyzers that decompose images into useful components. Various models are then compared to the physiological measurements performed on neurons in the sensory systems. In the study of higher-level perception, the work has been focused on the representation of objects and on the connections between various physical effects and object perception. In this category we find the perception of 3D from a variety of physical measurements including motion, shading and other physical phenomena. With few exceptions, there seems to have been very limited development of theories describing how the visual system might combine the output of the analyzers to form the representation of visual objects. Therefore, the processes underlying the integration of information over space represent critical aspects of the visual system. The understanding of these processes will have implications for our expectations for the underlying physiological mechanisms, as well as for our models of the internal representation for visual percepts. In this project, we explored several mechanisms related to spatial summation, attention, and eye movements. The project comprised three components: 1. Modeling visual search for the detection of speed deviation. 2. Perception of moving objects. 3. Exploring the role of eye movements in various visual tasks.
NASA Astrophysics Data System (ADS)
Graham, James; Ternovskiy, Igor V.
2013-06-01
We applied a two-stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyber space monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and understanding the human-autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determining the scene based on the objects present. This paper presents a framework within which low-level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
Regional Principal Color Based Saliency Detection
Lou, Jing; Ren, Mingwu; Wang, Huan
2014-01-01
Saliency detection is widely used in many visual applications like image segmentation, object recognition and classification. In this paper, we introduce a new method to detect salient objects in natural images. The approach is based on a regional principal color contrast model, which incorporates low-level and medium-level visual cues. The method allows a simple computation of color features and two categories of spatial relationships to a saliency map, achieving higher F-measure rates. At the same time, we present an interpolation approach to evaluate resulting curves, and analyze parameter selection. Our method enables the effective computation of arbitrary-resolution images. Experimental results on a saliency database show that our approach produces high quality saliency maps and performs favorably against ten saliency detection algorithms. PMID:25379960
Trial-by-trial adjustments in control triggered by incidentally encoded semantic cues.
Blais, Chris; Harris, Michael B; Sinanian, Michael H; Bunge, Silvia A
2015-01-01
Cognitive control mechanisms provide the flexibility to rapidly adapt to contextual demands. These contexts can be defined by top-down goals, but also by bottom-up perceptual factors, such as the location at which a visual stimulus appears. There are now several experiments reporting contextual control effects. Such experiments establish that contexts defined by low-level perceptual cues such as the location of a visual stimulus can lead to context-specific control, suggesting a relatively early focus for cognitive control. The current set of experiments involved a word-word interference task designed to assess whether a high-level cue, the semantic category to which a word belongs, can also facilitate contextual control. Indeed, participants exhibit a larger Flanker effect to items pertaining to a semantic category in which 75% of stimuli are incongruent than in response to items pertaining to a category in which 25% of stimuli are incongruent. Thus, both low-level and high-level stimulus features can affect the bottom-up engagement of cognitive control. The implications for current models of cognitive control are discussed.
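The Flanker effect compared across categories above is a difference of mean response times. A minimal sketch of that arithmetic follows, assuming the effect is computed as mean incongruent RT minus mean congruent RT within each semantic category. The trial records, category names, and RT values are invented for illustration and are not data from the study.

```python
from statistics import mean

def flanker_effect(trials, category):
    """Mean incongruent RT minus mean congruent RT (ms) for one category."""
    congruent = [t["rt"] for t in trials
                 if t["category"] == category and t["congruent"]]
    incongruent = [t["rt"] for t in trials
                   if t["category"] == category and not t["congruent"]]
    return mean(incongruent) - mean(congruent)

# Hypothetical trials: "animals" is 75% incongruent, "tools" is 25% incongruent.
trials = [
    {"category": "animals", "congruent": True,  "rt": 500},
    {"category": "animals", "congruent": False, "rt": 590},
    {"category": "animals", "congruent": False, "rt": 600},
    {"category": "animals", "congruent": False, "rt": 610},
    {"category": "tools",   "congruent": True,  "rt": 500},
    {"category": "tools",   "congruent": True,  "rt": 510},
    {"category": "tools",   "congruent": True,  "rt": 520},
    {"category": "tools",   "congruent": False, "rt": 560},
]

effect_animals = flanker_effect(trials, "animals")
effect_tools = flanker_effect(trials, "tools")
print(effect_animals, effect_tools)
```

Comparing the two per-category effects is what indicates whether control differs by semantic context. A real analysis would do this per participant and test the difference statistically.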
A ganglion-cell-based primary image representation method and its contribution to object recognition
NASA Astrophysics Data System (ADS)
Wei, Hui; Dai, Zhi-Long; Zuo, Qing-Song
2016-10-01
A visual stimulus is represented by the biological visual system at several levels, which are, in order from low to high: photoreceptor cells, ganglion cells (GCs), lateral geniculate nucleus cells and visual cortical neurons. Retinal GCs at the early level need to represent raw data only once, but meet a wide range of diverse requests from different vision-based tasks. This means the information representation at this level is general and not task-specific. Neurobiological findings have attributed this universal adaptation to GCs' receptive field (RF) mechanisms. For the purposes of developing a highly efficient image representation method that can facilitate information processing and interpretation at later stages, here we design a computational model to simulate the GC's non-classical RF. This new image representation method can extract major structural features from raw data, and is consistent with other statistical measures of the image. Based on the new representation, the performances of other state-of-the-art algorithms in contour detection and segmentation can be upgraded remarkably. This work concludes that applying a sophisticated representation schema at an early stage is an efficient and promising strategy in visual information processing.
Reavis, Eric A; Frank, Sebastian M; Tse, Peter U
2018-04-12
Visual search is often slow and difficult for complex stimuli such as feature conjunctions. Search efficiency, however, can improve with training. Search for stimuli that can be identified by the spatial configuration of two elements (e.g., the relative position of two colored shapes) improves dramatically within a few hundred trials of practice. Several recent imaging studies have identified neural correlates of this learning, but it remains unclear what stimulus properties participants learn to use to search efficiently. Influential models, such as reverse hierarchy theory, propose two major possibilities: learning to use information contained in low-level image statistics (e.g., single features at particular retinotopic locations) or in high-level characteristics (e.g., feature conjunctions) of the task-relevant stimuli. In a series of experiments, we tested these two hypotheses, which make different predictions about the effect of various stimulus manipulations after training. We find relatively small effects of manipulating low-level properties of the stimuli (e.g., changing their retinotopic location) and some conjunctive properties (e.g., color-position), whereas the effects of manipulating other conjunctive properties (e.g., color-shape) are larger. Overall, the findings suggest conjunction learning involving such stimuli might be an emergent phenomenon that reflects multiple different learning processes, each of which capitalizes on different types of information contained in the stimuli. We also show that both targets and distractors are learned, and that reversing learned target and distractor identities impairs performance. This suggests that participants do not merely learn to discriminate target and distractor stimuli, they also learn stimulus identity mappings that contribute to performance improvements.
Sadeghi, Zahra; Testolin, Alberto
2017-08-01
In humans, efficient recognition of written symbols is thought to rely on a hierarchical processing system, where simple features are progressively combined into more abstract, high-level representations. Here, we present a computational model of Persian character recognition based on deep belief networks, where increasingly more complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data. Crucially, high-level internal representations emerging from unsupervised deep learning can be easily read out by a linear classifier, achieving state-of-the-art recognition accuracy. Furthermore, we tested the hypothesis that handwritten digits and letters share many common visual features: A generative model that captures the statistical structure of the letters distribution should therefore also support the recognition of written digits. To this aim, deep networks trained on Persian letters were used to build high-level representations of Persian digits, which were indeed read out with high accuracy. Our simulations show that complex visual features, such as those mediating the identification of Persian symbols, can emerge from unsupervised learning in multilayered neural networks and can support knowledge transfer across related domains.
Beyond the search surface: visual search and attentional engagement.
Duncan, J; Humphreys, G
1992-05-01
Treisman (1991) described a series of visual search studies testing feature integration theory against an alternative (Duncan & Humphreys, 1989) in which feature and conjunction search are basically similar. Here the latter account is noted to have 2 distinct levels: (a) a summary of search findings in terms of stimulus similarities, and (b) a theory of how visual attention is brought to bear on relevant objects. Working at the 1st level, Treisman found that even when similarities were calibrated and controlled, conjunction search was much harder than feature search. The theory, however, can only really be tested at the 2nd level, because the 1st is an approximation. An account of the findings is developed at the 2nd level, based on the 2 processes of input-template matching and spreading suppression. New data show that, when both of these factors are controlled, feature and conjunction search are equally difficult. Possibilities for unification of the alternative views are considered.
Mental Imagery: Functional Mechanisms and Clinical Applications
Pearson, Joel; Naselaris, Thomas; Holmes, Emily A.; Kosslyn, Stephen M.
2015-01-01
Mental imagery research has weathered both disbelief of the phenomenon and inherent methodological limitations. Here we review recent behavioral, brain imaging, and clinical research that has reshaped our understanding of mental imagery. Research supports the claim that visual mental imagery is a depictive internal representation that functions like a weak form of perception. Brain imaging work has demonstrated that neural representations of mental and perceptual images resemble one another as early as the primary visual cortex (V1). Activity patterns in V1 encode mental images and perceptual images via a common set of low-level depictive visual features. Recent translational and clinical research reveals the pivotal role that imagery plays in many mental disorders and suggests how clinicians can utilize imagery in treatment. PMID:26412097
Psychophysical and perceptual performance in a simulated-scotoma model of human eye injury
NASA Astrophysics Data System (ADS)
Brandeis, R.; Egoz, I.; Peri, D.; Sapiens, N.; Turetz, J.
2008-02-01
Macular scotomas, affecting visual functioning, characterize many eye and neurological diseases like AMD, diabetes mellitus, multiple sclerosis, and macular hole. In this work, foveal visual field defects were modeled, and their effects were evaluated on spatial contrast sensitivity and a task of stimulus detection and aiming. The modeled occluding scotomas, of different sizes, were superimposed on the stimuli presented on the computer display, and were stabilized on the retina using a mono Purkinje Eye-Tracker. Spatial contrast sensitivity was evaluated using square-wave grating stimuli, whose contrast thresholds were measured using the method of constant stimuli with "catch trials". The detection task consisted of a triple conjunctive visual search display of: size (in visual angle), contrast and background (simple, low-level features vs. complex, high-level features). Search/aiming accuracy as well as R.T. measures were used for performance evaluation. Artificially generated scotomas suppressed spatial contrast sensitivity in a size-dependent manner, similar to previous studies. The deprivation effect was dependent on spatial frequency, consistent with retinal inhomogeneity models. Stimulus detection time was slowed more in the complex-background search situation than in the simple-background one. Detection speed was dependent on scotoma size and size of stimulus. In contrast, visually guided aiming was more sensitive to the scotoma effect in the simple-background search situation than in the complex-background one. Both stimulus aiming R.T. and accuracy (precision targeting) were impaired, as a function of scotoma size and size of stimulus. The data can be explained by models distinguishing between saliency-based, parallel and serial search processes, guiding visual attention, which are supported by underlying retinal as well as neural mechanisms.
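For readers unfamiliar with the method of constant stimuli mentioned above, the following sketch shows one simple way a contrast threshold could be read off such data. It assumes a fixed set of contrast levels, each shown the same number of times, and estimates the threshold by linear interpolation at a 50% detection criterion. The interpolation rule, the criterion, and all numbers are illustrative assumptions, not the procedure or data of the study, and catch trials are not modeled here.

```python
import numpy as np

def threshold_from_constant_stimuli(contrasts, n_detected, n_trials, criterion=0.5):
    """Interpolate the contrast at which the detection proportion reaches `criterion`.

    Assumes the detection proportions increase strictly with contrast.
    """
    p = np.asarray(n_detected, dtype=float) / float(n_trials)
    if np.any(np.diff(p) <= 0):
        raise ValueError("detection proportions must increase with contrast")
    # np.interp(x, xp, fp): xp (proportions) must be increasing.
    return float(np.interp(criterion, p, contrasts))

# Hypothetical data: four contrast levels, 20 presentations each.
contrasts = [0.01, 0.02, 0.04, 0.08]
n_detected = [2, 8, 16, 20]

threshold = threshold_from_constant_stimuli(contrasts, n_detected, n_trials=20)
sensitivity = 1.0 / threshold  # contrast sensitivity is the reciprocal of threshold
print(threshold, sensitivity)
```

In practice a psychometric function (for example a cumulative Gaussian or Weibull) is usually fitted instead of interpolating between two points. The interpolation is used here only to keep the sketch short.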
Vernon, Richard J W; Gouws, André D; Lawrence, Samuel J D; Wade, Alex R; Morland, Antony B
2016-05-25
Representations in early visual areas are organized on the basis of retinotopy, but this organizational principle appears to lose prominence in the extrastriate cortex. Nevertheless, an extrastriate region, such as the shape-selective lateral occipital cortex (LO), must still base its activation on the responses from earlier retinotopic visual areas, implying that a transition from retinotopic to "functional" organizations should exist. We hypothesized that such a transition may lie in LO-1 or LO-2, two visual areas lying between retinotopically defined V3d and functionally defined LO. Using a rapid event-related fMRI paradigm, we measured neural similarity in 12 human participants between pairs of stimuli differing along dimensions of shape exemplar and shape complexity within both retinotopically and functionally defined visual areas. These neural similarity measures were then compared with low-level and more abstract (curvature-based) measures of stimulus similarity. We found that low-level, but not abstract, stimulus measures predicted V1-V3 responses, whereas the converse was true for LO, a double dissociation. Critically, abstract stimulus measures were most predictive of responses within LO-2, akin to LO, whereas both low-level and abstract measures were predictive for responses within LO-1, perhaps indicating a transitional point between those two organizational principles. Similar transitions to abstract representations were not observed in the more ventral stream passing through V4 and VO-1/2. The transition we observed in LO-1 and LO-2 demonstrates that a more "abstracted" representation, typically considered the preserve of "category-selective" extrastriate cortex, can nevertheless emerge in retinotopic regions. Visual areas are typically identified either through retinotopy (e.g., V1-V3) or from functional selectivity [e.g., shape-selective lateral occipital complex (LOC)]. 
We combined these approaches to explore the nature of shape representations through the visual hierarchy. Two different representations emerged: the first reflected low-level shape properties (dependent on the spatial layout of the shape outline), whereas the second captured more abstract curvature-related shape features. Critically, early visual cortex represented low-level information but this diminished in the extrastriate cortex (LO-1/LO-2/LOC), in which the abstract representation emerged. Therefore, this work further elucidates the nature of shape representations in the LOC, provides insight into how those representations emerge from early retinotopic cortex, and crucially demonstrates that retinotopically tuned regions (LO-1/LO-2) are not necessarily constrained to retinotopic representations. Copyright © 2016 Vernon et al.
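The comparison of neural similarity with stimulus similarity measures described in this abstract can be sketched as a correlation between dissimilarity matrices. The example below assumes Pearson correlation over the unique stimulus pairs. The matrices, the helper names `from_pairs` and `model_fit`, and the choice of statistic are hypothetical and are not taken from the paper.

```python
import numpy as np

def from_pairs(values, n):
    """Build a symmetric n x n dissimilarity matrix from its upper-triangle values."""
    m = np.zeros((n, n))
    m[np.triu_indices(n, k=1)] = values
    return m + m.T

def model_fit(neural, model):
    """Pearson correlation between two dissimilarity matrices over unique pairs."""
    iu = np.triu_indices(neural.shape[0], k=1)
    return float(np.corrcoef(neural[iu], model[iu])[0, 1])

# Hypothetical dissimilarities among 4 stimuli (6 unique pairs).
low_level = from_pairs([1, 2, 3, 4, 5, 6], 4)   # e.g. outline-overlap measure
abstract = from_pairs([6, 1, 5, 2, 4, 3], 4)    # e.g. curvature-based measure

# Toy "V1-like" region whose dissimilarities track the low-level model exactly.
neural_v1 = 2.0 * low_level + 1.0

r_low = model_fit(neural_v1, low_level)
r_abstract = model_fit(neural_v1, abstract)
print(r_low, r_abstract)
```

A region would be described as following the low-level model when `r_low` exceeds `r_abstract`, and the reverse pattern in another region would give the kind of double dissociation the abstract reports.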
Tran, Truyet T.; Craven, Ashley P.; Leung, Tsz-Wing; Chat, Sandy W.; Levi, Dennis M.
2016-01-01
Neurons in the early visual cortex are finely tuned to different low-level visual features, forming a multi-channel system analysing the visual image formed on the retina in a parallel manner. However, little is known about the potential ‘cross-talk’ among these channels. Here, we systematically investigated whether stereoacuity, over a large range of target spatial frequencies, can be enhanced by perceptual learning. Using narrow-band visual stimuli, we found that practice with coarse (low spatial frequency) targets substantially improves performance, and that the improvement spreads from coarse to fine (high spatial frequency) three-dimensional perception, generalizing broadly across untrained spatial frequencies and orientations. Notably, we observed an asymmetric transfer of learning across the spatial frequency spectrum. The bandwidth of transfer was broader when training was at a high spatial frequency than at a low spatial frequency. Stereoacuity training is most beneficial when trained with fine targets. This broad transfer of stereoacuity learning contrasts with the highly specific learning reported for other basic visual functions. We also revealed strategies to boost learning outcomes ‘beyond-the-plateau’. Our investigations contribute to understanding the functional properties of the network subserving stereovision. The ability to generalize may provide a key principle for restoring impaired binocular vision in clinical situations. PMID:26909178
Visual representation of spatiotemporal structure
NASA Astrophysics Data System (ADS)
Schill, Kerstin; Zetzsche, Christoph; Brauer, Wilfried; Eisenkolb, A.; Musto, A.
1998-07-01
The processing and representation of motion information is addressed from an integrated perspective comprising low-level signal processing properties as well as higher-level cognitive aspects. For the low-level processing of motion information we argue that a fundamental requirement is the existence of a spatio-temporal memory. Its key feature, the provision of an orthogonal relation between external time and its internal representation, is achieved by a mapping of temporal structure into a locally distributed activity distribution accessible in parallel by higher-level processing stages. This leads to a reinterpretation of the classical concept of 'iconic memory' and resolves inconsistencies on ultra-short-time processing and visual masking. The spatio-temporal memory is further investigated by experiments on the perception of spatio-temporal patterns. Results on the direction discrimination of motion paths provide evidence that information about direction and location are not processed and represented independently of each other. This suggests a unified representation on an early level, in the sense that motion information is internally available in the form of a spatio-temporal compound. For the higher-level representation we have developed a formal framework for the qualitative description of courses of motion that may occur with moving objects.
Deconstructing continuous flash suppression
Yang, Eunice; Blake, Randolph
2012-01-01
In this paper, we asked to what extent the depth of interocular suppression engendered by continuous flash suppression (CFS) varies depending on spatiotemporal properties of the suppressed stimulus and CFS suppressor. An answer to this question could have implications for interpreting the results in which CFS influences the processing of different categories of stimuli to different extents. In a series of experiments, we measured the selectivity and depth of suppression (i.e., elevation in contrast detection thresholds) as a function of the visual features of the stimulus being suppressed and the stimulus evoking suppression, namely, the popular “Mondrian” CFS stimulus (N. Tsuchiya & C. Koch, 2005). First, we found that CFS differentially suppresses the spatial components of the suppressed stimulus: Observers' sensitivity for stimuli of relatively low spatial frequency or cardinally oriented features was more strongly impaired in comparison to high spatial frequency or obliquely oriented stimuli. Second, we discovered that this feature-selective bias primarily arises from the spatiotemporal structure of the CFS stimulus, particularly within information residing in the low spatial frequency range and within the smooth rather than abrupt luminance changes over time. These results imply that this CFS stimulus operates by selectively attenuating certain classes of low-level signals while leaving others to be potentially encoded during suppression. These findings underscore the importance of considering the contribution of low-level features in stimulus-driven effects that are reported under CFS. PMID:22408039
Iconic memory requires attention
Persuh, Marjan; Genzer, Boris; Melara, Robert D.
2012-01-01
Two experiments investigated whether attention plays a role in iconic memory, employing either a change detection paradigm (Experiment 1) or a partial-report paradigm (Experiment 2). In each experiment, attention was taxed during initial display presentation, focusing the manipulation on consolidation of information into iconic memory, prior to transfer into working memory. Observers were able to maintain high levels of performance (accuracy of change detection or categorization) even when concurrently performing an easy visual search task (low load). However, when the concurrent search was made difficult (high load), observers' performance dropped to almost chance levels, while search accuracy held at single-task levels. The effects of attentional load remained the same across paradigms. The results suggest that, without attention, participants consolidate in iconic memory only gross representations of the visual scene, information too impoverished for successful detection of perceptual change or categorization of features. PMID:22586389
Selection-for-action in visual search.
Hannus, Aave; Cornelissen, Frans W; Lindemann, Oliver; Bekkering, Harold
2005-01-01
Grasping an object rather than pointing to it enhances processing of its orientation but not its color. Apparently, visual discrimination is selectively enhanced for a behaviorally relevant feature. In two experiments we investigated the limitations and targets of this bias. Specifically, in Experiment 1 we were interested to find out whether the effect is capacity demanding, therefore we manipulated the set-size of the display. The results indicated a clear cognitive processing capacity requirement, i.e. the magnitude of the effect decreased for a larger set size. Consequently, in Experiment 2, we investigated if the enhancement effect occurs only at the level of behaviorally relevant feature or at a level common to different features. Therefore we manipulated the discriminability of the behaviorally neutral feature (color). Again, results showed that this manipulation influenced the action enhancement of the behaviorally relevant feature. Particularly, the effect of the color manipulation on the action enhancement suggests that the action effect is more likely to bias the competition between different visual features rather than to enhance the processing of the relevant feature. We offer a theoretical account that integrates the action-intention effect within the biased competition model of visual selective attention.
Size-Sensitive Perceptual Representations Underlie Visual and Haptic Object Recognition
Craddock, Matt; Lawson, Rebecca
2009-01-01
A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations. PMID:19956685
Heinen, Klaartje; Feredoes, Eva; Weiskopf, Nikolaus; Ruff, Christian C; Driver, Jon
2014-11-01
Voluntary selective attention can prioritize different features in a visual scene. The frontal eye-fields (FEF) are one potential source of such feature-specific top-down signals, but causal evidence for influences on visual cortex (as was shown for "spatial" attention) has remained elusive. Here, we show that transcranial magnetic stimulation (TMS) applied to right FEF increased the blood oxygen level-dependent (BOLD) signals in visual areas processing the "target feature" but not in "distracter feature"-processing regions. TMS-induced BOLD signals increased in motion-responsive visual cortex (MT+) when motion was attended in a display with moving dots superimposed on face stimuli, but in face-responsive fusiform area (FFA) when faces were attended to. These TMS effects on BOLD signal in both regions were negatively related to performance (on the motion task), supporting the behavioral relevance of this pathway. Our findings provide new causal evidence for the human FEF in the control of nonspatial "feature"-based attention, mediated by dynamic influences on feature-specific visual cortex that vary with the currently attended property. © The Author 2013. Published by Oxford University Press.
Position Information Encoded by Population Activity in Hierarchical Visual Areas
Majima, Kei; Horikawa, Tomoyasu
2017-01-01
Neurons in high-level visual areas respond to more complex visual features with broader receptive fields (RFs) compared to those in low-level visual areas. Thus, high-level visual areas are generally considered to carry less information regarding the position of seen objects in the visual field. However, larger RFs may not imply loss of position information at the population level. Here, we evaluated how accurately the position of a seen object could be predicted (decoded) from activity patterns in each of six representative visual areas with different RF sizes [V1–V4, lateral occipital complex (LOC), and fusiform face area (FFA)]. We collected functional magnetic resonance imaging (fMRI) responses while human subjects viewed a ball randomly moving in a two-dimensional field. To estimate population RF sizes of individual fMRI voxels, RF models were fitted for individual voxels in each brain area. The voxels in higher visual areas showed larger estimated RFs than those in lower visual areas. Then, the ball’s position in a separate session was predicted by maximum likelihood estimation using the RF models of individual voxels. We also tested a model-free multivoxel regression (support vector regression, SVR) to predict the position. We found that regardless of the difference in RF size, all visual areas showed similar prediction accuracies, especially on the horizontal dimension. Higher areas showed slightly lower accuracies on the vertical dimension, which appears to be attributed to the narrower spatial distributions of the RF centers. The results suggest that much position information is preserved in population activity through the hierarchical visual pathway regardless of RF sizes and is potentially available in later processing for recognition and behavior. PMID:28451634
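The decoding step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the isotropic Gaussian RF shape, the noise level, the voxel count, the field size, and all function names (`rf_response`, `decode_position`, `make_voxels`) are assumptions made up for illustration. Under independent, equal-variance Gaussian noise, the maximum likelihood position is the candidate that minimizes the squared error between observed and predicted voxel responses.

```python
import math
import random

def rf_response(pos, center, sigma):
    """Response of one voxel with an isotropic 2-D Gaussian RF (assumed shape)."""
    dx, dy = pos[0] - center[0], pos[1] - center[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))

def decode_position(responses, voxels, grid):
    """Grid-search maximum likelihood decoding.

    With i.i.d. Gaussian noise of equal variance, maximizing the likelihood
    is equivalent to minimizing the summed squared prediction error.
    """
    best_pos, best_err = None, float("inf")
    for pos in grid:
        err = sum((r - rf_response(pos, c, s)) ** 2
                  for r, (c, s) in zip(responses, voxels))
        if err < best_err:
            best_pos, best_err = pos, err
    return best_pos

rng = random.Random(0)

def make_voxels(n, sigma):
    """Hypothetical voxel population: RF centers scattered over a [-5, 5] field."""
    return [((rng.uniform(-5, 5), rng.uniform(-5, 5)), sigma) for _ in range(n)]

# Candidate positions on a 0.5-unit grid covering the field.
grid = [(x * 0.5, y * 0.5) for x in range(-10, 11) for y in range(-10, 11)]
true_pos = (2.0, -1.5)

for label, sigma in (("small RFs", 1.0), ("large RFs", 4.0)):
    voxels = make_voxels(200, sigma)
    # Simulated noisy responses (noise SD of 0.05 is an arbitrary choice).
    responses = [rf_response(true_pos, c, s) + rng.gauss(0, 0.05) for c, s in voxels]
    print(label, "decoded:", decode_position(responses, voxels, grid))
```

Running the loop with different `sigma` values gives an informal feel for the abstract's point that broad RFs need not destroy position information at the population level, although this toy simulation says nothing about real fMRI data.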
A Task-Dependent Causal Role for Low-Level Visual Processes in Spoken Word Comprehension
ERIC Educational Resources Information Center
Ostarek, Markus; Huettig, Falk
2017-01-01
It is well established that the comprehension of spoken words referring to object concepts relies on high-level visual areas in the ventral stream that build increasingly abstract representations. It is much less clear whether basic low-level visual representations are also involved. Here we asked in what task situations low-level visual…
Tebartz van Elst, Ludger; Bach, Michael; Blessing, Julia; Riedel, Andreas; Bubl, Emanuel
2015-01-01
A common neurodevelopmental disorder, autism spectrum disorder (ASD), is defined by specific patterns in social perception, social competence, communication, highly circumscribed interests, and a strong subjective need for behavioral routines. Furthermore, distinctive features of visual perception, such as markedly reduced eye contact and a tendency to focus more on small, visual items than on holistic perception, have long been recognized as typical ASD characteristics. Recent debate in the scientific community concerns whether the physiology of low-level visual perception might explain such higher visual abnormalities. While reports of enhanced, "eagle-like" visual acuity contained methodological errors and could not be substantiated, several authors have reported alterations in even earlier stages of visual processing, such as contrast perception and motion perception at the occipital cortex level. Therefore, in this project, we investigated the electrophysiology of very early visual processing by analyzing the pattern electroretinogram-based contrast gain, the background noise amplitude, and the psychophysical visual acuities of participants with high-functioning ASD and controls with equal education. Based on earlier findings, we hypothesized that alterations in early vision would be present in ASD participants. This study included 33 individuals with ASD (11 female) and 33 control individuals (12 female). The groups were matched in terms of age, gender, and education level. We found no evidence of altered electrophysiological retinal contrast processing or psychophysically measured visual acuities. There appears to be no evidence for abnormalities in retinal visual processing in ASD patients, at least with respect to contrast detection.
Dorsal hippocampus is necessary for visual categorization in rats.
Kim, Jangjin; Castro, Leyre; Wasserman, Edward A; Freeman, John H
2018-02-23
The hippocampus may play a role in categorization because of the need to differentiate stimulus categories (pattern separation) and to recognize category membership of stimuli from partial information (pattern completion). We hypothesized that the hippocampus would be more crucial for categorization of low-density (few relevant features) stimuli, owing to the higher demand on pattern separation and pattern completion, than for categorization of high-density (many relevant features) stimuli. Using a touchscreen apparatus, rats were trained to categorize multiple abstract stimuli into two different categories. Each stimulus was a pentagonal configuration of five visual features; some of the visual features were relevant for defining the category whereas others were irrelevant. Two groups of rats were trained with either a high (dense, n = 8) or low (sparse, n = 8) number of category-relevant features. Upon reaching criterion discrimination (≥75% correct, on 2 consecutive days), bilateral cannulas were implanted in the dorsal hippocampus. The rats were then given either vehicle or muscimol infusions into the hippocampus just prior to various testing sessions. They were tested with: the previously trained stimuli (trained), novel stimuli involving new irrelevant features (novel), stimuli involving relocated features (relocation), and a single relevant feature (singleton). In training, the dense group reached criterion faster than the sparse group, indicating that the sparse task was more difficult than the dense task. In testing, accuracy of both groups was equally high for trained and novel stimuli. However, both groups showed impaired accuracy in the relocation and singleton conditions, with a greater deficit in the sparse group. The testing data indicate that rats encode both the relevant features and the spatial locations of the features. Hippocampal inactivation impaired visual categorization regardless of the density of the category-relevant features for the trained, novel, relocation, and singleton stimuli. Hippocampus-mediated pattern completion and pattern separation mechanisms may be necessary for visual categorization involving overlapping irrelevant features. © 2018 Wiley Periodicals, Inc.
Hardman, Kyle; Cowan, Nelson
2014-01-01
Visual working memory stores stimuli from our environment as representations that can be accessed by high-level control processes. This study addresses a longstanding debate in the literature about whether storage limits in visual working memory include a limit to the complexity of discrete items. We examined the issue with a number of change-detection experiments that used complex stimuli which possessed multiple features per stimulus item. We manipulated the number of relevant features of the stimulus objects in order to vary feature load. In all of our experiments, we found that increased feature load led to a reduction in change-detection accuracy. However, we found that feature load alone could not account for the results, but that a consideration of the number of relevant objects was also required. This study supports capacity limits for both feature and object storage in visual working memory. PMID:25089739
Impaired recognition of facial emotions from low-spatial frequencies in Asperger syndrome.
Kätsyri, Jari; Saalasti, Satu; Tiippana, Kaisa; von Wendt, Lennart; Sams, Mikko
2008-01-01
The theory of 'weak central coherence' [Happe, F., & Frith, U. (2006). The weak coherence account: Detail-focused cognitive style in autism spectrum disorders. Journal of Autism and Developmental Disorders, 36(1), 5-25] implies that persons with autism spectrum disorders (ASDs) have a perceptual bias for local but not for global stimulus features. The recognition of emotional facial expressions representing various different levels of detail has not been studied previously in ASDs. We analyzed the recognition of four basic emotional facial expressions (anger, disgust, fear and happiness) from low-spatial frequencies (overall global shapes without local features) in adults with an ASD. A group of 20 participants with Asperger syndrome (AS) was compared to a group of non-autistic age- and sex-matched controls. Emotion recognition was tested from static and dynamic facial expressions whose spatial frequency contents had been manipulated by low-pass filtering at two levels. The two groups recognized emotions similarly from non-filtered faces and from dynamic vs. static facial expressions. In contrast, the participants with AS were less accurate than controls in recognizing facial emotions from very low-spatial frequencies. The results suggest intact recognition of basic facial emotions and dynamic facial information, but impaired visual processing of global features in ASDs.
Learning-based saliency model with depth information.
Ma, Chih-Yao; Hang, Hsueh-Ming
2015-01-01
Most previous studies on visual saliency focused on two-dimensional (2D) scenes. Due to the rapidly growing three-dimensional (3D) video applications, it is very desirable to know how depth information affects human visual attention. In this study, we first conducted eye-fixation experiments on 3D images. Our fixation data set comprises 475 3D images and 16 subjects. We used a Tobii TX300 eye tracker (Tobii, Stockholm, Sweden) to track the eye movement of each subject. In addition, this database contains 475 computed depth maps. Due to the scarcity of public-domain 3D fixation data, this data set should be useful to the 3D visual attention research community. Then, a learning-based visual attention model was designed to predict human attention. In addition to the popular 2D features, we included the depth map and its derived features. The results indicate that the extra depth information can enhance the saliency estimation accuracy specifically for close-up objects hidden in a complex-texture background. In addition, we examined the effectiveness of various low-, mid-, and high-level features on saliency prediction. Compared with both 2D and 3D state-of-the-art saliency estimation models, our methods show better performance on the 3D test images. The eye-tracking database and the MATLAB source codes for the proposed saliency model and evaluation methods are available on our website.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thomas, Mathew; Marshall, Matthew J.; Miller, Erin A.
2014-08-26
Understanding the interactions of structured communities known as “biofilms” and other complex matrices is possible through X-ray micro tomography imaging of the biofilms. Feature detection and image processing for this type of data focus on efficiently identifying and segmenting biofilms and bacteria in the datasets. The datasets are very large and often require manual intervention due to low contrast between objects and high noise levels. Thus new software is required for the effective interpretation and analysis of the data. This work describes the development and application of the ability to analyze and visualize high resolution X-ray micro tomography datasets.
Audiovisual associations alter the perception of low-level visual motion
Kafaligonul, Hulusi; Oluk, Can
2015-01-01
Motion perception is a pervasive aspect of vision and is affected by both the immediate pattern of sensory inputs and prior experiences acquired through associations. Recently, several studies reported that an association can be established quickly between directions of visual motion and static sounds of distinct frequencies. After the association is formed, sounds are able to change the perceived direction of visual motion. To determine whether such rapidly acquired audiovisual associations and their subsequent influences on visual motion perception depend on the involvement of higher-order attentive tracking mechanisms, we designed psychophysical experiments using regular and reverse-phi random dot motions that isolate low-level pre-attentive motion processing. Our results show that an association between the directions of low-level visual motion and static sounds can be formed and that this audiovisual association alters the subsequent perception of low-level visual motion. These findings support the view that audiovisual associations are not restricted to the high-level attention-based motion system and that early-level visual motion processing also plays a role. PMID:25873869
Image segmentation via foreground and background semantic descriptors
NASA Astrophysics Data System (ADS)
Yuan, Ding; Qiang, Jingjing; Yin, Jihao
2017-09-01
In the field of image processing, it has been a challenging task to obtain a complete foreground that is not uniform in color or texture. Unlike other methods, which segment the image by only using low-level features, we present a segmentation framework, in which high-level visual features, such as semantic information, are used. First, the initial semantic labels were obtained by using the nonparametric method. Then, a subset of the training images, with a similar foreground to the input image, was selected. Consequently, the semantic labels could be further refined according to the subset. Finally, the input image was segmented by integrating the object affinity and refined semantic labels. State-of-the-art performance was achieved in experiments with the challenging MSRC 21 dataset.
Decontaminate feature for tracking: adaptive tracking via evolutionary feature subset
NASA Astrophysics Data System (ADS)
Liu, Qiaoyuan; Wang, Yuru; Yin, Minghao; Ren, Jinchang; Li, Ruizhi
2017-11-01
Although various visual tracking algorithms have been proposed in the last 2-3 decades, it remains a challenging problem for effective tracking with fast motion, deformation, occlusion, etc. Under complex tracking conditions, most tracking models are not discriminative and adaptive enough. When the combined feature vectors are inputted to the visual models, this may lead to redundancy causing low efficiency and ambiguity causing poor performance. An effective tracking algorithm is proposed to decontaminate features for each video sequence adaptively, where the visual modeling is treated as an optimization problem from the perspective of evolution. Every feature vector is compared to a biological individual and then decontaminated via classical evolutionary algorithms. With the optimized subsets of features, the "curse of dimensionality" has been avoided while the accuracy of the visual model has been improved. The proposed algorithm has been tested on several publicly available datasets with various tracking challenges and benchmarked with a number of state-of-the-art approaches. The comprehensive experiments have demonstrated the efficacy of the proposed methodology.
A Multiple-Label Guided Clustering Algorithm for Historical Document Dating and Localization.
He, Sheng; Samara, Petros; Burgers, Jan; Schomaker, Lambert
2016-11-01
It is of essential importance for historians to know the date and place of origin of the documents they study. It would be a huge advancement for historical scholars if it would be possible to automatically estimate the geographical and temporal provenance of a handwritten document by inferring them from the handwriting style of such a document. We propose a multiple-label guided clustering algorithm to discover the correlations between the concrete low-level visual elements in historical documents and abstract labels, such as date and location. First, a novel descriptor, called histogram of orientations of handwritten strokes, is proposed to extract and describe the visual elements, which is built on a scale-invariant polar-feature space. In addition, the multi-label self-organizing map (MLSOM) is proposed to discover the correlations between the low-level visual elements and their labels in a single framework. Our proposed MLSOM can be used to predict the labels directly. Moreover, the MLSOM can also be considered as a pre-structured clustering method to build a codebook, which contains more discriminative information on date and geography. The experimental results on the medieval paleographic scale data set demonstrate that our method achieves state-of-the-art results.
A new approach to modeling the influence of image features on fixation selection in scenes
Nuthmann, Antje; Einhäuser, Wolfgang
2015-01-01
Which image characteristics predict where people fixate when memorizing natural images? To answer this question, we introduce a new analysis approach that combines a novel scene-patch analysis with generalized linear mixed models (GLMMs). Our method allows for (1) directly describing the relationship between continuous feature value and fixation probability, and (2) assessing each feature's unique contribution to fixation selection. To demonstrate this method, we estimated the relative contribution of various image features to fixation selection: luminance and luminance contrast (low-level features); edge density (a mid-level feature); visual clutter and image segmentation to approximate local object density in the scene (higher-level features). An additional predictor captured the central bias of fixation. The GLMM results revealed that edge density, clutter, and the number of homogenous segments in a patch can independently predict whether image patches are fixated or not. Importantly, neither luminance nor contrast had an independent effect above and beyond what could be accounted for by the other predictors. Since the parcellation of the scene and the selection of features can be tailored to the specific research question, our approach allows for assessing the interplay of various factors relevant for fixation selection in scenes in a powerful and flexible manner. PMID:25752239
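The core idea of relating a continuous feature value to fixation probability can be sketched as follows. This is deliberately a simplification and not the authors' method: it fits a plain logistic regression with a single fixed effect, whereas a GLMM additionally includes random effects (for example, for subjects and scenes). The data are synthetic, and the names `fit_logistic`, `xs`, and `ys`, along with the "true" coefficients, are made up for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit P(fixated) = sigmoid(b0 + b1 * x) by batch gradient ascent
    on the mean log-likelihood. No random effects are modeled."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # gradient of the log-likelihood
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

rng = random.Random(1)
# Synthetic scene patches: a standardized feature value (think edge density)
# and a binary "fixated" outcome generated with an arbitrary positive slope.
xs = [rng.gauss(0, 1) for _ in range(500)]
ys = [1 if rng.random() < sigmoid(-0.5 + 1.5 * x) else 0 for x in xs]

b0, b1 = fit_logistic(xs, ys)
print("intercept:", round(b0, 2), "slope:", round(b1, 2))
```

A positive fitted slope means that patches with higher feature values are more likely to be fixated. Assessing each feature's unique contribution, as the abstract describes, would require entering several predictors at once together with the random-effects structure, which is better left to a dedicated mixed-model package.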
Comparing object recognition from binary and bipolar edge images for visual prostheses
Jung, Jae-Hyun; Pu, Tian; Peli, Eli
2017-01-01
Visual prostheses require an effective representation method due to the limited display condition which has only 2 or 3 levels of grayscale in low resolution. Edges derived from abrupt luminance changes in images carry essential information for object recognition. Typical binary (black and white) edge images have been used to represent features to convey essential information. However, in scenes with a complex cluttered background, the recognition rate of the binary edge images by human observers is limited and additional information is required. The polarity of edges and cusps (black or white features on a gray background) carries important additional information; the polarity may provide shape from shading information missing in the binary edge image. This depth information may be restored by using bipolar edges. We compared object recognition rates from 16 binary edge images and bipolar edge images by 26 subjects to determine the possible impact of bipolar filtering in visual prostheses with 3 or more levels of grayscale. Recognition rates were higher with bipolar edge images and the improvement was significant in scenes with complex backgrounds. The results also suggest that erroneous shape from shading interpretation of bipolar edges resulting from pigment rather than boundaries of shape may confound the recognition. PMID:28458481
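The difference between the two representations can be made concrete with a toy sketch. The abstract does not specify the edge filter, so the 4-neighbour Laplacian, the threshold, and the function names below are assumptions for illustration only. A binary edge image keeps only whether an edge is present; a bipolar edge image also keeps the sign of the filter response, which indicates on which side of the luminance step a pixel lies.

```python
def laplacian(img):
    """4-neighbour discrete Laplacian; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y - 1][x] + img[y + 1][x] +
                         img[y][x - 1] + img[y][x + 1] - 4 * img[y][x])
    return out

def binary_edges(img, thresh):
    """Two levels: 1 where the response magnitude exceeds thresh, else 0."""
    return [[1 if abs(v) > thresh else 0 for v in row] for row in laplacian(img)]

def bipolar_edges(img, thresh):
    """Three levels: +1 or -1 keep the response sign, 0 is the neutral background."""
    return [[1 if v > thresh else -1 if v < -thresh else 0 for v in row]
            for row in laplacian(img)]

# Toy image: dark left half (0.0) next to a bright right half (1.0).
img = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]

print(binary_edges(img, 0.5)[2])   # [0, 0, 1, 1, 0, 0]
print(bipolar_edges(img, 0.5)[2])  # [0, 0, 1, -1, 0, 0]
```

In the binary row both sides of the step look identical, while the bipolar row distinguishes the dark side (+1) from the bright side (-1). How these three levels would be mapped onto the grayscale levels of an actual prosthesis display is not covered by this sketch.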
Eye movement assessment of selective attentional capture by emotional pictures.
Nummenmaa, Lauri; Hyönä, Jukka; Calvo, Manuel G
2006-05-01
The eye-tracking method was used to assess attentional orienting to and engagement on emotional visual scenes. In Experiment 1, unpleasant, neutral, or pleasant target pictures were presented simultaneously with neutral control pictures in peripheral vision under instruction to compare pleasantness of the pictures. The probability of first fixating an emotional picture, and the frequency of subsequent fixations, were greater than those for neutral pictures. In Experiment 2, participants were instructed to avoid looking at the emotional pictures, but these were still more likely to be fixated first and gazed at longer during the first-pass viewing than neutral pictures. Low-level visual features cannot explain the results. It is concluded that overt visual attention is captured by both unpleasant and pleasant emotional content. (c) 2006 APA, all rights reserved.
The effects of alcohol intoxication on attention and memory for visual scenes.
Harvey, Alistair J; Kneller, Wendy; Campbell, Alison C
2013-01-01
This study tests the claim that alcohol intoxication narrows the focus of visual attention on to the more salient features of a visual scene. A group of alcohol intoxicated and sober participants had their eye movements recorded as they encoded a photographic image featuring a central event of either high or low salience. All participants then recalled the details of the image the following day when sober. We sought to determine whether the alcohol group would pay less attention to the peripheral features of the encoded scene than their sober counterparts, whether this effect of attentional narrowing was stronger for the high-salience event than for the low-salience event, and whether it would lead to a corresponding deficit in peripheral recall. Alcohol was found to narrow the focus of foveal attention to the central features of both images but did not facilitate recall from this region. It also reduced the overall amount of information accurately recalled from each scene. These findings demonstrate that the concept of alcohol myopia originally posited to explain the social consequences of intoxication (Steele & Josephs, 1990) may be extended to explain the relative neglect of peripheral information during the processing of visual scenes.
Non-conscious processing of motion coherence can boost conscious access.
Kaunitz, Lisandro; Fracasso, Alessio; Lingnau, Angelika; Melcher, David
2013-01-01
Research on the scope and limits of non-conscious vision can advance our understanding of the functional and neural underpinnings of visual awareness. Here we investigated whether distributed local features can be bound, outside of awareness, into coherent patterns. We used continuous flash suppression (CFS) to create interocular suppression, and thus lack of awareness, for a moving dot stimulus that varied in terms of coherence with an overall pattern (radial flow). Our results demonstrate that for radial motion, coherence favors the detection of patterns of moving dots even under interocular suppression. Coherence caused dots to break through the masks more often: this indicates that the visual system was able to integrate low-level motion signals into a coherent pattern outside of visual awareness. In contrast, in an experiment using meaningful or scrambled biological motion we did not observe any increase in the sensitivity of detection for meaningful patterns. Overall, our results are in agreement with previous studies on face processing and with the hypothesis that certain features are spatiotemporally bound into coherent patterns even outside of attention or awareness.
Is fear perception special? Evidence at the level of decision-making and subjective confidence.
Koizumi, Ai; Mobbs, Dean; Lau, Hakwan
2016-11-01
Fearful faces are believed to be prioritized in visual perception. However, it is unclear whether the processing of low-level facial features alone can facilitate such prioritization or whether higher-level mechanisms also contribute. We examined potential biases for fearful face perception at the levels of perceptual decision-making and perceptual confidence. We controlled for lower-level visual processing capacity by titrating luminance contrasts of backward masks, and the emotional intensity of fearful, angry and happy faces. Under these conditions, participants showed liberal biases in perceiving a fearful face, in both detection and discrimination tasks. This effect was stronger among individuals with reduced density in dorsolateral prefrontal cortex, a region linked to perceptual decision-making. Moreover, participants reported higher confidence when they accurately perceived a fearful face, suggesting that fearful faces may have privileged access to consciousness. Together, the results suggest that mechanisms in the prefrontal cortex contribute to making fearful face perception special. © The Author (2016). Published by Oxford University Press.
Aslan, Ummuhan Bas; Calik, Bilge Basakcı; Kitiş, Ali
2012-01-01
This study was designed to determine the physical activity levels of visually impaired children and adolescents and to investigate the effect of gender and level of vision on physical activity level in this population. A total of 30 visually impaired children and adolescents (16 low vision and 14 blind) aged between 8 and 16 years participated in the study. The physical activity level of participants was evaluated with a physical activity diary (PAD) and one-mile run/walk test (OMR-WT). No difference was found between the PAD and the OMR-WT results of low vision and blind children and adolescents. The visually impaired children and adolescents were found not to participate in vigorous physical activity. A difference was found in favor of low vision boys in terms of mild and moderate activities and OMR-WT durations. However, no difference was found between physical activity levels of blind girls and boys. The results of our study suggested that the physical activity level of visually impaired children and adolescents was low, and that gender affected physical activity in low vision children and adolescents. Copyright © 2012 Elsevier Ltd. All rights reserved.
Dual Low-Rank Pursuit: Learning Salient Features for Saliency Detection.
Lang, Congyan; Feng, Jiashi; Feng, Songhe; Wang, Jingdong; Yan, Shuicheng
2016-06-01
Saliency detection is an important procedure for machines to understand visual world as humans do. In this paper, we consider a specific saliency detection problem of predicting human eye fixations when they freely view natural images, and propose a novel dual low-rank pursuit (DLRP) method. DLRP learns saliency-aware feature transformations by utilizing available supervision information and constructs discriminative bases for effectively detecting human fixation points under the popular low-rank and sparsity-pursuit framework. Benefiting from the embedded high-level information in the supervised learning process, DLRP is able to predict fixations accurately without performing the expensive object segmentation as in the previous works. Comprehensive experiments clearly show the superiority of the proposed DLRP method over the established state-of-the-art methods. We also empirically demonstrate that DLRP provides stronger generalization performance across different data sets and inherits the advantages of both the bottom-up- and top-down-based saliency detection methods.
Exposure to Organic Solvents Used in Dry Cleaning Reduces Low and High Level Visual Function
Jiménez Barbosa, Ingrid Astrid
2015-01-01
Purpose To investigate whether exposure to occupational levels of organic solvents in the dry cleaning industry is associated with neurotoxic symptoms and visual deficits in the perception of basic visual features such as luminance contrast and colour, higher level processing of global motion and form (Experiment 1), and cognitive function as measured in a visual search task (Experiment 2). Methods The Q16 neurotoxic questionnaire, a commonly used measure of neurotoxicity (by the World Health Organization), was administered to assess the neurotoxic status of a group of 33 dry cleaners exposed to occupational levels of organic solvents (OS) and 35 age-matched non dry-cleaners who had never worked in the dry cleaning industry. In Experiment 1, to assess visual function, contrast sensitivity, colour/hue discrimination (Munsell Hue 100 test), global motion and form thresholds were assessed using computerised psychophysical tests. Sensitivity to global motion or form structure was quantified by varying the pattern coherence of global dot motion (GDM) and Glass pattern (oriented dot pairs) respectively (i.e., the percentage of dots/dot pairs that contribute to the perception of global structure). In Experiment 2, a letter visual-search task was used to measure reaction times (as a function of the number of elements: 4, 8, 16, 32, 64 and 100) in both parallel and serial search conditions. Results Dry cleaners exposed to organic solvents had significantly higher scores on the Q16 compared to non dry-cleaners indicating that dry cleaners experienced more neurotoxic symptoms on average. The contrast sensitivity function for dry cleaners was significantly lower at all spatial frequencies relative to non dry-cleaners, which is consistent with previous studies. Poorer colour discrimination performance was also noted in dry cleaners than non dry-cleaners, particularly along the blue/yellow axis. 
In a new finding, we report that global form and motion thresholds for dry cleaners were also significantly higher, almost double those obtained from non dry-cleaners. However, reaction time performance on both parallel and serial visual search did not differ between dry cleaners and non dry-cleaners. Conclusions Exposure to occupational levels of organic solvents is associated with neurotoxicity, which is in turn associated with both low level deficits (such as the perception of contrast and discrimination of colour) and high level visual deficits such as the perception of global form and motion, but not with visual search performance. The latter finding indicates that the deficits in visual function are unlikely to be due to changes in general cognitive performance. PMID:25933026
The role of lightness, hue and saturation in feature-based visual attention.
Stuart, Geoffrey W; Barsdell, Wendy N; Day, Ross H
2014-03-01
Visual attention is used to select part of the visual array for higher-level processing. Visual selection can be based on spatial location, but it has also been demonstrated that multiple locations can be selected simultaneously on the basis of a visual feature such as color. One task that has been used to demonstrate feature-based attention is the judgement of the symmetry of simple four-color displays. In a typical task, when symmetry is violated, four squares on either side of the display do not match. When four colors are involved, symmetry judgements are made more quickly than when only two of the four colors are involved. This indicates that symmetry judgements are made one color at a time. Previous studies have confounded lightness, hue, and saturation when defining the colors used in such displays. In three experiments, symmetry was defined by lightness alone, lightness plus hue, or by hue or saturation alone, with lightness levels randomised. The difference between judgements of two- and four-color asymmetry was maintained, showing that hue and saturation can provide the sole basis for feature-based attentional selection. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.
Content Based Image Retrieval by Using Color Descriptor and Discrete Wavelet Transform.
Ashraf, Rehan; Ahmed, Mudassar; Jabbar, Sohail; Khalid, Shehzad; Ahmad, Awais; Din, Sadia; Jeon, Gwangil
2018-01-25
Due to recent developments in technology, the complexity of multimedia has increased significantly, and the retrieval of similar multimedia content is an open research problem. Content-Based Image Retrieval (CBIR) provides a framework for image search in which low-level visual features are commonly used to retrieve images from an image database. The basic requirement in any image retrieval process is to sort images by how closely they match in visual appearance. Color, shape, and texture are examples of low-level image features. Features play a significant role in image processing. A feature vector is a compact representation of an image, and feature extraction techniques are applied to obtain features that are useful for classifying and recognizing images. Because features define the behavior of an image, they affect storage requirements, classification efficiency, and processing time. In this paper, we discuss various types of features and feature extraction techniques, and explain which technique is better suited to which scenario. The effectiveness of a CBIR approach depends fundamentally on feature extraction. In image processing tasks such as object recognition and image retrieval, computing the feature descriptor is among the most essential steps. The main idea of CBIR is to search a dataset for images related to a query image by using distance metrics. The proposed image retrieval method is built on the YCbCr color space with a Canny edge histogram and the discrete wavelet transform. Combining the edge histogram with the discrete wavelet transform increases the performance of the image retrieval framework for content-based search. The performance of different wavelets is also compared to determine the suitability of specific wavelet functions for image retrieval.
The proposed algorithm is trained and tested on the Wang image database. For image retrieval, an Artificial Neural Network (ANN) is applied to this standard CBIR dataset. The performance of the proposed descriptors is assessed by computing precision and recall values and is compared with other proposed methods to demonstrate the advantage of our method. The proposed approach outperforms existing research in terms of average precision and recall.
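The retrieval and evaluation steps named in the abstract above (ranking by a distance metric, then scoring with precision and recall) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Euclidean distance, the function names, and the toy image ids are all assumptions, and the feature vectors stand in for whatever YCbCr, edge histogram, and wavelet descriptors the paper actually computes.

```python
import math


def rank_by_distance(query_vector, database):
    """Return image ids sorted from most to least similar to the query.

    database: dict mapping a (hypothetical) image id to its feature vector.
    Euclidean distance is an assumption; the abstract only says "distance metrics".
    """
    return sorted(database, key=lambda image_id: math.dist(query_vector, database[image_id]))


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: ids returned by the system; relevant: ids of the true matches.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

For example, with the top two results of a ranking passed as `retrieved`, precision is the fraction of those two that are true matches and recall is the fraction of all true matches that were found.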
Multilevel depth and image fusion for human activity detection.
Ni, Bingbing; Pei, Yong; Moulin, Pierre; Yan, Shuicheng
2013-10-01
Recognizing complex human activities usually requires the detection and modeling of individual visual features and the interactions between them. Current methods only rely on the visual features extracted from 2-D images, and therefore often lead to unreliable salient visual feature detection and inaccurate modeling of the interaction context between individual features. In this paper, we show that these problems can be addressed by combining data from a conventional camera and a depth sensor (e.g., Microsoft Kinect). We propose a novel complex activity recognition and localization framework that effectively fuses information from both grayscale and depth image channels at multiple levels of the video processing pipeline. In the individual visual feature detection level, depth-based filters are applied to the detected human/object rectangles to remove false detections. In the next level of interaction modeling, 3-D spatial and temporal contexts among human subjects or objects are extracted by integrating information from both grayscale and depth images. Depth information is also utilized to distinguish different types of indoor scenes. Finally, a latent structural model is developed to integrate the information from multiple levels of video processing for an activity detection. Extensive experiments on two activity recognition benchmarks (one with depth information) and a challenging grayscale + depth human activity database that contains complex interactions between human-human, human-object, and human-surroundings demonstrate the effectiveness of the proposed multilevel grayscale + depth fusion scheme. Higher recognition and localization accuracies are obtained relative to the previous methods.
Hardman, Kyle O; Cowan, Nelson
2015-03-01
Visual working memory stores stimuli from our environment as representations that can be accessed by high-level control processes. This study addresses a longstanding debate in the literature about whether storage limits in visual working memory include a limit to the complexity of discrete items. We examined the issue with a number of change-detection experiments that used complex stimuli that possessed multiple features per stimulus item. We manipulated the number of relevant features of the stimulus objects in order to vary feature load. In all of our experiments, we found that increased feature load led to a reduction in change-detection accuracy. However, we found that feature load alone could not account for the results but that a consideration of the number of relevant objects was also required. This study supports capacity limits for both feature and object storage in visual working memory. PsycINFO Database Record (c) 2015 APA, all rights reserved.
Chromatic information and feature detection in fast visual analysis
Del Viva, Maria M.; Punzi, Giovanni; Shevell, Steven K.; ...
2016-08-01
The visual system is able to recognize a scene based on a sketch made of very simple features. This ability is likely crucial for survival, when fast image recognition is necessary, and it is believed that a primal sketch is extracted very early in the visual processing. Such highly simplified representations can be sufficient for accurate object discrimination, but an open question is the role played by color in this process. Rich color information is available in natural scenes, yet artist's sketches are usually monochromatic; and, black-and-white movies provide compelling representations of real world scenes. Also, the contrast sensitivity of color is low at fine spatial scales. We approach the question from the perspective of optimal information processing by a system endowed with limited computational resources. We show that when such limitations are taken into account, the intrinsic statistical properties of natural scenes imply that the most effective strategy is to ignore fine-scale color features and devote most of the bandwidth to gray-scale information. We find confirmation of these information-based predictions from psychophysics measurements of fast-viewing discrimination of natural scenes. As a result, we conclude that the lack of colored features in our visual representation, and our overall low sensitivity to high-frequency color components, are a consequence of an adaptation process, optimizing the size and power consumption of our brain for the visual world we live in.
Finding regions of interest in pathological images: an attentional model approach
NASA Astrophysics Data System (ADS)
Gómez, Francisco; Villalón, Julio; Gutierrez, Ricardo; Romero, Eduardo
2009-02-01
This paper introduces an automated method for finding diagnostic regions-of-interest (RoIs) in histopathological images. This method is based on the cognitive process of visual selective attention that arises during a pathologist's image examination. Specifically, it emulates the first examination phase, which consists of a coarse search for tissue structures at a "low zoom" to separate the image into relevant regions. The pathologist's cognitive performance depends on inherent image visual cues (bottom-up information) and on acquired clinical medicine knowledge (top-down mechanisms). Our model of the pathologist's visual attention integrates these two components. The selected bottom-up information includes local low level features such as intensity, color, orientation and texture information. Top-down information is related to the anatomical and pathological structures known by the expert. A coarse approximation to these structures is achieved by an oversegmentation algorithm, inspired by psychological grouping theories. The algorithm parameters are learned from an expert pathologist's segmentation. Top-down and bottom-up integration is achieved by calculating a unique index for each of the low level characteristics inside the region. Relevancy is estimated as a simple average of these indexes. Finally, a binary decision rule defines whether or not a region is interesting. The method was evaluated on a set of 49 images using a perceptually-weighted evaluation criterion, finding a quality gain of 3 dB compared with a classical bottom-up model of attention.
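The last two steps of the abstract above (relevancy as a simple average of per-feature indexes, followed by a binary decision) can be sketched as below. The feature names follow the abstract, but the index values, the threshold of 0.5, and the function names are hypothetical; the abstract does not state how the decision rule is parameterized.

```python
from statistics import mean


def region_relevance(indexes):
    """Average the per-feature indexes computed inside one region.

    indexes: dict such as {"intensity": 0.8, "color": 0.6, ...}; values are illustrative.
    """
    return mean(indexes.values())


def is_region_of_interest(indexes, threshold=0.5):
    """Binary decision rule; the threshold value is an assumption, not from the paper."""
    return region_relevance(indexes) >= threshold
```

A region with indexes 0.8, 0.6, 0.4 and 0.6 has a relevance of about 0.6 and would be kept under the assumed threshold.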
NASA Astrophysics Data System (ADS)
Dostal, P.; Krasula, L.; Klima, M.
2012-06-01
Various image processing techniques in multimedia technology are optimized using the visual attention feature of the human visual system (HVS). Because of spatial non-uniformity, different locations in an image differ in importance for the perception of the image. In other words, the perceived image quality depends mainly on the quality of the important locations, known as regions of interest (ROI). The performance of such techniques is measured by subjective evaluation or by objective image quality criteria. Many state-of-the-art objective metrics are based on HVS properties: SSIM and MS-SSIM are based on image structural information, VIF is based on the information that the human brain can ideally gain from the reference image, and FSIM uses low-level features to assign a different importance to each location in the image. However, none of these objective metrics uses an analysis of regions of interest. We address the question of whether these objective metrics can be used for effective evaluation of images reconstructed by processing techniques based on ROI analysis using high-level features. In this paper, the authors show that the state-of-the-art objective metrics do not correlate well with subjective evaluation when demosaicing based on ROI analysis is used for reconstruction. The ROIs were computed from "ground truth" visual attention data. An algorithm that combines two known demosaicing techniques on the basis of ROI location is proposed, reconstructing the ROI in fine quality while the rest of the image is reconstructed in low quality. The color image reconstructed by this ROI approach was compared with selected demosaicing techniques using objective criteria and subjective testing. The qualitative comparison of the objective and subjective results indicates that the state-of-the-art objective metrics are still not suitable for evaluating image processing techniques based on ROI analysis and that new criteria are needed.
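The ROI-based combination described in the abstract above can be pictured as a per-pixel selection between two reconstructions. The sketch below assumes the two demosaiced images and a boolean ROI mask are already available as equally sized nested lists; the function name and data layout are hypothetical, and the actual demosaicing techniques are not reproduced here.

```python
def combine_by_roi(fine, coarse, roi_mask):
    """Take pixels from `fine` inside the ROI and from `coarse` elsewhere.

    fine, coarse: 2-D lists of pixel values from two reconstructions (assumed inputs).
    roi_mask: 2-D list of booleans, True where the pixel lies in a region of interest.
    """
    return [
        [f if inside else c for f, c, inside in zip(fine_row, coarse_row, mask_row)]
        for fine_row, coarse_row, mask_row in zip(fine, coarse, roi_mask)
    ]
```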
Visual Prediction Error Spreads Across Object Features in Human Visual Cortex
Summerfield, Christopher; Egner, Tobias
2016-01-01
Visual cognition is thought to rely heavily on contextual expectations. Accordingly, previous studies have revealed distinct neural signatures for expected versus unexpected stimuli in visual cortex. However, it is presently unknown how the brain combines multiple concurrent stimulus expectations such as those we have for different features of a familiar object. To understand how an unexpected object feature affects the simultaneous processing of other expected feature(s), we combined human fMRI with a task that independently manipulated expectations for color and motion features of moving-dot stimuli. Behavioral data and neural signals from visual cortex were then interrogated to adjudicate between three possible ways in which prediction error (surprise) in the processing of one feature might affect the concurrent processing of another, expected feature: (1) feature processing may be independent; (2) surprise might “spread” from the unexpected to the expected feature, rendering the entire object unexpected; or (3) pairing a surprising feature with an expected feature might promote the inference that the two features are not in fact part of the same object. To formalize these rival hypotheses, we implemented them in a simple computational model of multifeature expectations. Across a range of analyses, behavior and visual neural signals consistently supported a model that assumes a mixing of prediction error signals across features: surprise in one object feature spreads to its other feature(s), thus rendering the entire object unexpected. These results reveal neurocomputational principles of multifeature expectations and indicate that objects are the unit of selection for predictive vision. SIGNIFICANCE STATEMENT We address a key question in predictive visual cognition: how does the brain combine multiple concurrent expectations for different features of a single object such as its color and motion trajectory? 
By combining a behavioral protocol that independently varies expectation of (and attention to) multiple object features with computational modeling and fMRI, we demonstrate that behavior and fMRI activity patterns in visual cortex are best accounted for by a model in which prediction error in one object feature spreads to other object features. These results demonstrate how predictive vision forms object-level expectations out of multiple independent features. PMID:27810936
Chien, Yu-Tai; Chen, Yu-Jen; Hsiung, Hsiao-Fang; Chen, Hsiao-Jung; Hsieh, Meng-Hua; Wu, Wen-Jie
2017-01-01
Background Physical activity is important for middle-agers to maintain health both in middle age and in old age. Although thousands of exercise-promotion mobile phone apps are available for download, the current literature offers little understanding of which design features can enhance middle-aged adults’ quality perception of exercise-promotion apps and which factors may influence that perception. Objectives The aims of this study were to understand (1) which design features of exercise-promotion apps can enhance the quality perception of middle-agers, (2) whether their needs are matched by the functions currently offered in app stores, and (3) whether physical activity (PA) and mobile phone self-efficacy (MPSE) influence quality perception. Methods A total of 105 middle-agers participated and filled out three scales: the International Physical Activity Questionnaire (IPAQ), the MPSE scale, and the need for design features questionnaire. The design features were developed based on the Coventry, Aberdeen, and London—Refined (CALO-RE) taxonomy. Following the Kano quality model, the need for design features questionnaire asked participants to classify design features into five categories: attractive, one-dimensional, must-be, indifferent, and reverse. The quality categorization was conducted using a voting approach, and the categorization results were compared with the findings of a prevalence study to determine whether needs match current availability. In total, 52 multinomial logistic regression models were analyzed to evaluate the effects of PA level and MPSE on quality perception of design features. Results The Kano analysis on the total sample revealed that visual demonstration of exercise instructions is the only attractive design feature, whereas the other 51 design features were perceived with indifference. When quality perception is examined by PA level, 21 features are recommended for low-level PA and 6 for medium-level PA, but none for high-level PA.
In contrast, 14 design features are recommended for high-level MPSE and 6 for medium-level MPSE, whereas 1 feature is recommended for low-level MPSE. The analysis suggests that the implementation of demanded features could be low, as the average prevalence of demanded design features is 20% (4.3/21). Surprisingly, social comparison and social support, the most implemented features in current apps, were categorized as indifferent. The magnitude of effect is larger for MPSE because it affects quality perception of more design features than PA does. Delving into the 52 regression models revealed that high MPSE is more likely to induce attractive or one-dimensional categorization, suggesting the importance of technological self-efficacy for eHealth care promotion. Conclusions This study is the first to propose middle-agers’ needs in relation to mobile phone exercise-promotion. In addition to the tailor-made recommendations, suggestions are offered to app designers to enhance the performance of persuasive features. An interesting finding on the change of quality perception attributed to MPSE is proposed as a topic for future research. PMID:28546140
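The voting-based Kano categorization mentioned in the abstract above can be sketched as a plurality vote over the five categories. The abstract does not describe how ties were resolved, so the tie behavior below (first category encountered wins) is an assumption, as are the function name and the sample votes.

```python
from collections import Counter

# The five Kano categories listed in the abstract.
KANO_CATEGORIES = ("attractive", "one-dimensional", "must-be", "indifferent", "reverse")


def categorize_by_vote(votes):
    """Return the category chosen by the most participants for one design feature.

    votes: list of category names, one per participant (illustrative data).
    Ties go to the category that appears first in `votes`; this is an assumption.
    """
    unknown = set(votes) - set(KANO_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown Kano categories: {sorted(unknown)}")
    return Counter(votes).most_common(1)[0][0]


# Worked arithmetic from the abstract: 4.3 of 21 demanded features are implemented on average.
average_prevalence = 4.3 / 21  # about 0.205, reported as 20%
```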
Fabric defect detection based on visual saliency using deep feature and low-rank recovery
NASA Astrophysics Data System (ADS)
Liu, Zhoufeng; Wang, Baorui; Li, Chunlei; Li, Bicao; Dong, Yan
2018-04-01
Fabric defect detection plays an important role in improving the quality of fabric products. In this paper, a novel fabric defect detection method based on visual saliency using deep features and low-rank recovery is proposed. First, the initial network parameters are obtained by unsupervised training on the large MNIST dataset. A Convolutional Neural Network (CNN) is then fine-tuned with supervision on a fabric image library to produce a more accurate deep neural network model. Second, the fabric images are uniformly divided into image blocks of the same size, and their multi-layer deep features are extracted using the trained deep network. Thereafter, all the extracted features are concatenated into a feature matrix. Third, low-rank matrix recovery is adopted to decompose the feature matrix into a low-rank matrix, which represents the background, and a sparse matrix, which represents the salient defects. Finally, the iterative optimal threshold segmentation algorithm is used to segment the saliency maps generated from the sparse matrix and locate the fabric defect area. Experimental results demonstrate that the features extracted by the CNN are better suited to characterizing fabric texture than traditional LBP, HOG, and other hand-crafted feature extraction methods, and that the proposed method can accurately detect the defect regions of various fabric defects, even in images with complex texture.
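The final segmentation step in the abstract above is named only as an "iterative optimal threshold" algorithm. One common reading is the classic iterative intermeans scheme, in which the threshold is repeatedly set to the midpoint of the mean values below and above it. The sketch below follows that reading as an assumption; it is not taken from the paper, and the saliency values are illustrative.

```python
def iterative_threshold(values, tol=1e-6, max_iter=100):
    """Estimate a threshold by iterating the midpoint of the two class means.

    values: flat list of saliency values (illustrative input).
    Starts from the global mean and stops when the threshold changes by less than `tol`.
    """
    threshold = sum(values) / len(values)
    for _ in range(max_iter):
        low = [v for v in values if v <= threshold]
        high = [v for v in values if v > threshold]
        if not low or not high:
            break  # all values fall on one side; keep the current threshold
        new_threshold = 0.5 * (sum(low) / len(low) + sum(high) / len(high))
        converged = abs(new_threshold - threshold) < tol
        threshold = new_threshold
        if converged:
            break
    return threshold


def segment(values, threshold):
    """Mark values above the threshold as candidate defect locations."""
    return [v > threshold for v in values]
```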
Toward semantic-based retrieval of visual information: a model-based approach
NASA Astrophysics Data System (ADS)
Park, Youngchoon; Golshani, Forouzan; Panchanathan, Sethuraman
2002-07-01
This paper centers on the problem of automated visual content classification. To enable classification-based image or visual object retrieval, we propose a new image representation scheme called the visual context descriptor (VCD), a multidimensional vector in which each element represents the frequency of a unique visual property of an image or a region. VCD uses predetermined quality dimensions (i.e., types of features and quantization levels) and semantic model templates mined a priori. Not only observed visual cues but also contextually relevant visual features are proportionally incorporated into VCD. The contextual relevance of a visual cue to a semantic class is determined by correlation analysis of ground truth samples. Such co-occurrence analysis of visual cues requires transforming a real-valued visual feature vector (e.g., color histogram, Gabor texture) into a discrete event (e.g., terms in text). Good features to track, the rule of thirds, iterative k-means clustering, and TSVQ are used to transform feature vectors into unified symbolic representations called visual terms. Similarity-based visual cue frequency estimation is also proposed and used to ensure the correctness of model learning and matching, since sparseness of sample data makes frequency estimates of visual cues unstable. The proposed method naturally allows heterogeneous visual, temporal, or spatial cues to be integrated in a single classification or matching framework, and can be easily integrated into a semantic knowledge base such as a thesaurus or an ontology. Robust semantic visual model template creation and object-based image retrieval are demonstrated using the proposed content description scheme.
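The transformation of real-valued feature vectors into discrete visual terms, and the frequency vector built from them, can be sketched as nearest-codeword quantization followed by counting. The codebook below is assumed to come from a prior clustering step (for example k-means); the function names and the toy vectors are hypothetical, and the contextual weighting described in the abstract is not modeled.

```python
import math
from collections import Counter


def nearest_term(vector, codebook):
    """Return the index of the codeword closest to `vector` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda i: math.dist(vector, codebook[i]))


def visual_term_frequencies(vectors, codebook):
    """Count how often each visual term occurs among the region's feature vectors.

    The result has one entry per codeword, in codebook order.
    """
    counts = Counter(nearest_term(v, codebook) for v in vectors)
    return [counts.get(i, 0) for i in range(len(codebook))]
```

With a two-word codebook, three feature vectors of which two fall nearest the second codeword yield the frequency vector `[1, 2]`.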
Population Response Profiles in Early Visual Cortex Are Biased in Favor of More Valuable Stimuli
Saproo, Sameer
2010-01-01
Voluntary and stimulus-driven shifts of attention can modulate the representation of behaviorally relevant stimuli in early areas of visual cortex. In turn, attended items are processed faster and more accurately, facilitating the selection of appropriate behavioral responses. Information processing is also strongly influenced by past experience and recent studies indicate that the learned value of a stimulus can influence relatively late stages of decision making such as the process of selecting a motor response. However, the learned value of a stimulus can also influence the magnitude of cortical responses in early sensory areas such as V1 and S1. These early effects of stimulus value are presumed to improve the quality of sensory representations; however, the nature of these modulations is not clear. They could reflect nonspecific changes in response amplitude associated with changes in general arousal or they could reflect a bias in population responses so that high-value features are represented more robustly. To examine this issue, subjects performed a two-alternative forced choice paradigm with a variable-interval payoff schedule to dynamically manipulate the relative value of two stimuli defined by their orientation (one was rotated clockwise from vertical, the other counterclockwise). Activation levels in visual cortex were monitored using functional MRI and feature-selective voxel tuning functions while subjects performed the behavioral task. The results suggest that value not only modulates the relative amplitude of responses in early areas of human visual cortex, but also sharpens the response profile across the populations of feature-selective neurons that encode the critical stimulus feature (orientation). Moreover, changes in space- or feature-based attention cannot easily explain the results because representations of both the selected and the unselected stimuli underwent a similar feature-selective modulation. 
This sharpening in the population response profile could theoretically improve the probability of correctly discriminating high-value stimuli from low-value alternatives. PMID:20410360
Residual perception of biological motion in cortical blindness.
Ruffieux, Nicolas; Ramon, Meike; Lao, Junpeng; Colombo, Françoise; Stacchi, Lisa; Borruat, François-Xavier; Accolla, Ettore; Annoni, Jean-Marie; Caldara, Roberto
2016-12-01
From birth, the human visual system shows a remarkable sensitivity for perceiving biological motion. This visual ability relies on a distributed network of brain regions and can be preserved even after damage of high-level ventral visual areas. However, it remains unknown whether this critical biological skill can withstand the loss of vision following bilateral striate damage. To address this question, we tested the categorization of human and animal biological motion in BC, a rare case of cortical blindness after anoxia-induced bilateral striate damage. The severity of his impairment, encompassing various aspects of vision (i.e., color, shape, face, and object recognition) and causing blind-like behavior, contrasts with a residual ability to process motion. We presented BC with static or dynamic point-light displays (PLDs) of human or animal walkers. These stimuli were presented either individually, or in pairs in two alternative forced choice (2AFC) tasks. When confronted with individual PLDs, the patient was unable to categorize the stimuli, irrespective of whether they were static or dynamic. In the 2AFC task, BC exhibited appropriate eye movements towards diagnostic information, but performed at chance level with static PLDs, in stark contrast to his ability to efficiently categorize dynamic biological agents. This striking ability to categorize biological motion provided top-down information is important for at least two reasons. Firstly, it emphasizes the importance of assessing patients' (visual) abilities across a range of task constraints, which can reveal potential residual abilities that may in turn represent a key feature for patient rehabilitation. Finally, our findings reinforce the view that the neural network processing biological motion can efficiently operate despite severely impaired low-level vision, positing our natural predisposition for processing dynamicity in biological agents as a robust feature of human vision. 
Copyright © 2016 Elsevier Ltd. All rights reserved.
Wavelet processing techniques for digital mammography
NASA Astrophysics Data System (ADS)
Laine, Andrew F.; Song, Shuwu
1992-09-01
This paper introduces a novel approach for accomplishing mammographic feature analysis through multiresolution representations. We show that efficient (nonredundant) representations may be identified from digital mammography and used to enhance specific mammographic features within a continuum of scale space. The multiresolution decomposition of wavelet transforms provides a natural hierarchy in which to embed an interactive paradigm for accomplishing scale space feature analysis. Similar to traditional coarse to fine matching strategies, the radiologist may first choose to look for coarse features (e.g., dominant mass) within low frequency levels of a wavelet transform and later examine finer features (e.g., microcalcifications) at higher frequency levels. In addition, features may be extracted by applying geometric constraints within each level of the transform. Choosing wavelets (or analyzing functions) that are simultaneously localized in both space and frequency, results in a powerful methodology for image analysis. Multiresolution and orientation selectivity, known biological mechanisms in primate vision, are ingrained in wavelet representations and inspire the techniques presented in this paper. Our approach includes local analysis of complete multiscale representations. Mammograms are reconstructed from wavelet representations, enhanced by linear, exponential and constant weight functions through scale space. By improving the visualization of breast pathology we can improve the chances of early detection of breast cancers (improve quality) while requiring less time to evaluate mammograms for most patients (lower costs).
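The idea of enhancing features by weighting wavelet coefficients before reconstruction can be illustrated in one dimension with a single-level Haar transform. This is only a sketch under stated assumptions: the paper's wavelets, number of levels, and weight functions are not given in the abstract, so the Haar basis, the constant gain, and the function names are all illustrative choices.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_forward(signal):
    """One-level Haar transform of an even-length sequence: (approximation, detail)."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) * _S for a, b in pairs]
    detail = [(a - b) * _S for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _S, (a - d) * _S])
    return out


def enhance(signal, gain):
    """Scale the detail (fine-scale) coefficients by a constant gain and reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, [gain * d for d in detail])
```

A gain of 1 reproduces the input; a gain above 1 amplifies local differences while leaving each pair's mean unchanged, which is the one-dimensional analogue of sharpening fine features at a given scale.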
NASA Technical Reports Server (NTRS)
Smith, R. A.
1979-01-01
Operational and physical requirements were investigated for a low-light-level viewing device to be used as a window-mounted optical sight for crew use in the pointing, navigating, stationkeeping, and docking of space vehicles to support space station operations and the assembly of large structures in space. A suitable prototype, obtained from a commercial vendor, was subjected to limited tests to determine the potential effectiveness of a proximity optical device in spacecraft operations. The constructional features of the device are discussed as well as concepts for its use. Test results show that a proximity optical device is capable of performing low-light-level viewing services and will enhance manned spacecraft operations.
Satellite Imagery Assisted Road-Based Visual Navigation System
NASA Astrophysics Data System (ADS)
Volkova, A.; Gibbens, P. W.
2016-06-01
There is a growing demand for unmanned aerial systems as autonomous surveillance, exploration and remote sensing solutions. Among the key concerns for robust operation of these systems is the need to reliably navigate the environment without reliance on global navigation satellite system (GNSS). This is of particular concern in Defence circles, but is also a major safety issue for commercial operations. In these circumstances, the aircraft needs to navigate relying only on information from on-board passive sensors such as digital cameras. An autonomous feature-based visual system presented in this work offers a novel integral approach to the modelling and registration of visual features that responds to the specific needs of the navigation system. It detects visual features from Google Earth* imagery to build a feature database. The same algorithm then detects features in an on-board camera's video stream. On one level this serves to localise the vehicle relative to the environment using Simultaneous Localisation and Mapping (SLAM). On a second level it correlates them with the database to localise the vehicle with respect to the inertial frame. The performance of the presented visual navigation system was compared using the satellite imagery from different years. Based on comparison results, an analysis of the effects of seasonal, structural and qualitative changes of the imagery source on the performance of the navigation algorithm is presented. * The algorithm is independent of the source of satellite imagery and another provider can be used
Metal artifact reduction using a patch-based reconstruction for digital breast tomosynthesis
NASA Astrophysics Data System (ADS)
Borges, Lucas R.; Bakic, Predrag R.; Maidment, Andrew D. A.; Vieira, Marcelo A. C.
2017-03-01
Digital breast tomosynthesis (DBT) is rapidly emerging as the main clinical tool for breast cancer screening. Although several reconstruction methods for DBT are described in the literature, one common issue is the interplane artifacts caused by out-of-focus features. For breasts containing highly attenuating features, such as surgical clips and large calcifications, the artifacts are even more apparent and can limit the detection and characterization of lesions by the radiologist. In this work, we propose a novel method of combining backprojected data into tomographic slices using a patch-based approach, commonly used in denoising. Preliminary tests were performed on a geometry phantom and on an anthropomorphic phantom containing metal inserts. The reconstructed images were compared to a commercial reconstruction solution. Qualitative assessment of the reconstructed images provides evidence that the proposed method reduces artifacts while maintaining low noise levels. Objective assessment supports the visual findings. The artifact spread function shows that the proposed method is capable of suppressing artifacts generated by highly attenuating features. The signal difference to noise ratio shows that the noise levels of the proposed and commercial methods are comparable, even though the commercial method applies post-processing filtering steps, which were not implemented in the proposed method. Thus, the proposed method can produce tomosynthesis reconstructions with reduced artifacts and low noise levels.
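As a rough illustration of the noise metric mentioned above, the signal difference to noise ratio (SDNR) is commonly computed as the absolute difference between the mean of a signal region and the mean of a background region, divided by the standard deviation of the background. The abstract does not give the exact definition or regions used, so the formula, the function name `sdnr`, and the pixel values below are assumptions.

```python
from statistics import mean, pstdev

def sdnr(signal_roi, background_roi):
    """Signal difference to noise ratio under the assumed definition:
    |mean(signal) - mean(background)| / std(background)."""
    return abs(mean(signal_roi) - mean(background_roi)) / pstdev(background_roi)

# Hypothetical pixel values sampled from two regions of a reconstructed slice.
lesion_pixels = [10, 12, 11, 11]
background_pixels = [4, 6, 4, 6]
value = sdnr(lesion_pixels, background_pixels)
```

Comparing this value across two reconstructions of the same phantom, using the same regions, gives a simple check that one method is not trading artifact suppression for extra noise.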
Capacity for visual features in mental rotation
Xu, Yangqing; Franconeri, Steven L.
2015-01-01
Although mental rotation is a core component of scientific reasoning, we still know little about its underlying mechanism. For instance - how much visual information can we rotate at once? Participants rotated a simple multi-part shape, requiring them to maintain attachments between features and moving parts. The capacity of this aspect of mental rotation was strikingly low – only one feature could remain attached to one part. Behavioral and eyetracking data showed that this single feature remained ‘glued’ via a singular focus of attention, typically on the object’s top. We argue that the architecture of the human visual system is not suited for keeping multiple features attached to multiple parts during mental rotation. Such measurement of the capacity limits may prove to be a critical step in dissecting the suite of visuospatial tools involved in mental rotation, leading to insights for improvement of pedagogy in science education contexts. PMID:26174781
Capacity for Visual Features in Mental Rotation.
Xu, Yangqing; Franconeri, Steven L
2015-08-01
Although mental rotation is a core component of scientific reasoning, little is known about its underlying mechanisms. For instance, how much visual information can someone rotate at once? We asked participants to rotate a simple multipart shape, requiring them to maintain attachments between features and moving parts. The capacity of this aspect of mental rotation was strikingly low: Only one feature could remain attached to one part. Behavioral and eye-tracking data showed that this single feature remained "glued" via a singular focus of attention, typically on the object's top. We argue that the architecture of the human visual system is not suited for keeping multiple features attached to multiple parts during mental rotation. Such measurement of capacity limits may prove to be a critical step in dissecting the suite of visuospatial tools involved in mental rotation, leading to insights for improvement of pedagogy in science-education contexts. © The Author(s) 2015.
Tablet and Smartphone Accessibility Features in the Low Vision Rehabilitation
Irvine, Danielle; Zemke, Alex; Pusateri, Gregg; Gerlach, Leah; Chun, Rob; Jay, Walter M.
2014-01-01
Tablet and smartphone use is rapidly increasing in developed countries. With this upsurge in popularity, the devices themselves are becoming more user-friendly for all consumers, including the visually impaired. Traditionally, visually impaired patients have received optical rehabilitation in the forms of microscopes, stand magnifiers, handheld magnifiers, telemicroscopes, and electronic magnification such as closed circuit televisions (CCTVs). In addition to the optical and financial limitations of traditional devices, patients do not always view them as being socially acceptable. For this reason, devices are often underutilised by patients due to lack of use in public forums or when among peers. By incorporating smartphones and tablets into a patient’s low vision rehabilitation, in addition to traditional devices, one provides versatile and mainstream options, which may also be less expensive. This article explains exactly what the accessibility features of tablets and smartphones are for the blind and visually impaired, how to access them, and provides an introduction on usage of the features. PMID:27928274
Designing sound and visual components for enhancement of urban soundscapes.
Hong, Joo Young; Jeon, Jin Yong
2013-09-01
The aim of this study is to investigate the effect of audio-visual components on environmental quality to improve soundscape. Natural sounds with road traffic noise and visual components in urban streets were evaluated through laboratory experiments. Waterfall and stream water sounds, as well as bird sounds, were selected to enhance the soundscape. Sixteen photomontages of a streetscape were constructed in combination with two types of water features and three types of vegetation which were chosen as positive visual components. The experiments consisted of audio-only, visual-only, and audio-visual conditions. The preferences and environmental qualities of the stimuli were evaluated by a numerical scale and 12 pairs of adjectives, respectively. The results showed that bird sounds were the most preferred among the natural sounds, while the sound of falling water was found to degrade the soundscape quality when the road traffic noise level was high. The visual effects of vegetation on aesthetic preference were significant, but those of water features were relatively small. It was revealed that the perceptual dimensions of the environment were different from the noise levels. Particularly, the acoustic comfort factor related to soundscape quality considerably influenced preference for the overall environment at a higher level of road traffic noise.
A task-dependent causal role for low-level visual processes in spoken word comprehension.
Ostarek, Markus; Huettig, Falk
2017-08-01
It is well established that the comprehension of spoken words referring to object concepts relies on high-level visual areas in the ventral stream that build increasingly abstract representations. It is much less clear whether basic low-level visual representations are also involved. Here we asked in what task situations low-level visual representations contribute functionally to concrete word comprehension using an interference paradigm. We interfered with basic visual processing while participants performed a concreteness task (Experiment 1), a lexical-decision task (Experiment 2), and a word class judgment task (Experiment 3). We found that visual noise interfered more with concrete versus abstract word processing, but only when the task required visual information to be accessed. This suggests that basic visual processes can be causally involved in language comprehension, but that their recruitment is not automatic and rather depends on the type of information that is required in a given task situation. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Störmer, Viola S; Li, Shu-Chen; Heekeren, Hauke R; Lindenberger, Ulman
2011-02-01
The ability to attend to multiple objects that move in the visual field is important for many aspects of daily functioning. The attentional capacity for such dynamic tracking, however, is highly limited and undergoes age-related decline. Several aspects of the tracking process can influence performance. Here, we investigated effects of feature-based interference from distractor objects that appear in unattended regions of the visual field with a hemifield-tracking task. Younger and older participants performed an attentional tracking task in one hemifield while distractor objects were concurrently presented in the unattended hemifield. Feature similarity between objects in the attended and unattended hemifields as well as motion speed and the number of to-be-tracked objects were parametrically manipulated. The results show that increasing feature overlap leads to greater interference from the unattended visual field. This effect of feature-based interference was only present in the slow speed condition, indicating that the interference is mainly modulated by perceptual demands. High-performing older adults showed a similar interference effect as younger adults, whereas low-performing adults showed poor tracking performance overall.
Unconscious analyses of visual scenes based on feature conjunctions.
Tachibana, Ryosuke; Noguchi, Yasuki
2015-06-01
To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved.
Visual cues to geographical orientation during low-level flight
NASA Technical Reports Server (NTRS)
Battiste, Vernol; Delzell, Suzanne
1991-01-01
A field study of an operational Emergency Medical Service (EMS) unit was conducted to investigate the relationships among geographical orientation, pilot decision making, and workload in EMS flights. The map data collected during this study were compared to protocols gathered in the laboratory, where pilots viewed a simulated flight over different types of unfamiliar terrain and verbally identified the features utilized to maintain geographical orientation. The EMS pilot's questionnaire data were compared with data from non-EMS helicopter pilots with comparable flight experience.
A recurrent neural model for proto-object based contour integration and figure-ground segregation.
Hu, Brian; Niebur, Ernst
2017-12-01
Visual processing of objects makes use of both feedforward and feedback streams of information. However, the nature of feedback signals is largely unknown, as is the identity of the neuronal populations in lower visual areas that receive them. Here, we develop a recurrent neural model to address these questions in the context of contour integration and figure-ground segregation. A key feature of our model is the use of grouping neurons whose activity represents tentative objects ("proto-objects") based on the integration of local feature information. Grouping neurons receive input from an organized set of local feature neurons, and project modulatory feedback to those same neurons. Additionally, inhibition at both the local feature level and the object representation level biases the interpretation of the visual scene in agreement with principles from Gestalt psychology. Our model explains several sets of neurophysiological results (Zhou et al. Journal of Neuroscience, 20(17), 6594-6611 2000; Qiu et al. Nature Neuroscience, 10(11), 1492-1499 2007; Chen et al. Neuron, 82(3), 682-694 2014), and makes testable predictions about the influence of neuronal feedback and attentional selection on neural responses across different visual areas. Our model also provides a framework for understanding how object-based attention is able to select both objects and the features associated with them.
Shared Neural Substrates of Emotionally Enhanced Perceptual and Mnemonic Vividness
Todd, Rebecca M.; Schmitz, Taylor W.; Susskind, Josh; Anderson, Adam K.
2013-01-01
It is well-known that emotionally salient events are remembered more vividly than mundane ones. Our recent research has demonstrated that such memory vividness (Mviv) is due in part to the subjective experience of emotional events as more perceptually vivid, an effect we call emotionally enhanced vividness (EEV). The present study built on previously reported research in which fMRI data were collected while participants rated relative levels of visual noise overlaid on emotionally salient and neutral images. Ratings of greater EEV were associated with greater activation in the amygdala and visual cortex. In the present study, we measured BOLD activation that predicted recognition Mviv for these same images 1 week later. Results showed that, after controlling for differences between scenes in low-level objective features, hippocampus activation uniquely predicted subsequent Mviv. In contrast, amygdala and visual cortex regions that were sensitive to EEV were also modulated by subsequent ratings of Mviv. These findings suggest shared neural substrates for the influence of emotional salience on perceptual and mnemonic vividness, with amygdala and visual cortex activation at encoding contributing to the experience of both perception and subsequent memory. PMID:23653601
Cross-Domain Shoe Retrieval with a Semantic Hierarchy of Attribute Classification Network.
Zhan, Huijing; Shi, Boxin; Kot, Alex C
2017-08-04
Cross-domain shoe image retrieval is a challenging problem, because the query photo from the street domain (daily life scenario) and the reference photo in the online domain (online shop images) have significant visual differences due to the viewpoint and scale variation, self-occlusion, and cluttered background. This paper proposes the Semantic Hierarchy Of attributE Convolutional Neural Network (SHOE-CNN) with a three-level feature representation for discriminative shoe feature expression and efficient retrieval. The SHOE-CNN with its newly designed loss function systematically merges semantic attributes of closer visual appearances to prevent shoe images with obvious visual differences from being confused with each other; the features extracted from image, region, and part levels effectively match the shoe images across different domains. We collect a large-scale shoe dataset composed of 14341 street domain and 12652 corresponding online domain images with fine-grained attributes to train our network and evaluate our system. The top-20 retrieval accuracy improves significantly over the solution with the pre-trained CNN features.
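The top-20 retrieval accuracy reported above can be read as the fraction of street-domain queries whose matching online-domain item appears among the first 20 retrieved results. That reading is an assumption, since the abstract does not spell out the definition; the function name `top_k_accuracy` and the toy identifiers below are hypothetical.

```python
def top_k_accuracy(ranked_results, ground_truth, k=20):
    """Fraction of queries whose true match is within the first k results.

    ranked_results: dict mapping query id -> list of item ids, best match first.
    ground_truth:   dict mapping query id -> id of the correct item.
    """
    hits = sum(
        1 for query, ranking in ranked_results.items()
        if ground_truth[query] in ranking[:k]
    )
    return hits / len(ranked_results)

# Toy example with k=2: two of the three queries find their match in the top 2.
ranked = {
    "street_1": ["shop_7", "shop_3", "shop_9"],
    "street_2": ["shop_4", "shop_1", "shop_2"],
    "street_3": ["shop_5", "shop_8", "shop_6"],
}
truth = {"street_1": "shop_3", "street_2": "shop_2", "street_3": "shop_5"}
accuracy_at_2 = top_k_accuracy(ranked, truth, k=2)
```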
Salient region detection by fusing bottom-up and top-down features extracted from a single image.
Tian, Huawei; Fang, Yuming; Zhao, Yao; Lin, Weisi; Ni, Rongrong; Zhu, Zhenfeng
2014-10-01
Recently, some global contrast-based salient region detection models have been proposed based on only the low-level feature of color. It is necessary to consider both color and orientation features to overcome their limitations, and thus improve the performance of salient region detection for images with low contrast in color and high contrast in orientation. In addition, the existing fusion methods for different feature maps, like the simple averaging method and the selective method, are not sufficiently effective. To overcome these limitations of existing salient region detection models, we propose a novel salient region model based on the bottom-up and top-down mechanisms: the color contrast and orientation contrast are adopted to calculate the bottom-up feature maps, while the top-down cue of depth-from-focus from the same single image is used to guide the generation of final salient regions, since depth-from-focus reflects the photographer's preference and knowledge of the task. A more general and effective fusion method is designed to combine the bottom-up feature maps. According to the degree-of-scattering and eccentricities of feature maps, the proposed fusion method can assign adaptive weights to different feature maps to reflect the confidence level of each feature map. The depth-from-focus of the image as a significant top-down feature for visual attention in the image is used to guide the salient regions during the fusion process; with its aid, the proposed fusion method can filter out the background and highlight salient regions for the image. Experimental results show that the proposed model outperforms the state-of-the-art models on three publicly available data sets.
Coding visual features extracted from video sequences.
Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2014-05-01
Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
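A coding mode decision based on rate-distortion optimization is usually expressed as picking the mode that minimises a Lagrangian cost J = D + λR, where D is the distortion, R is the rate in bits, and λ sets the trade-off. The sketch below assumes that standard formulation; the abstract does not give the cost function, and the mode names, distortion values, and bit counts are made up for illustration.

```python
def choose_coding_mode(candidates, lam):
    """Return the mode with the lowest Lagrangian cost J = D + lam * R.

    candidates: dict mapping mode name -> (distortion, rate_in_bits).
    lam:        Lagrange multiplier; larger values favour cheaper modes.
    """
    def cost(mode):
        distortion, rate = candidates[mode]
        return distortion + lam * rate
    return min(candidates, key=cost)

# Hypothetical costs for coding one local feature descriptor.
modes = {
    "intra": (1.0, 100),  # coded on its own: low distortion, many bits
    "inter": (2.0, 40),   # predicted from the previous frame: fewer bits
}
mode_when_bits_are_cheap = choose_coding_mode(modes, lam=0.01)
mode_when_bits_are_costly = choose_coding_mode(modes, lam=0.05)
```

With a small λ the extra bits of intraframe coding are worth the lower distortion; as λ grows, the decision flips to interframe coding, which is how temporal redundancy translates into bit savings.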
Multidimensional brain activity dictated by winner-take-all mechanisms.
Tozzi, Arturo; Peters, James F
2018-06-21
A novel demon-based architecture is introduced to elucidate brain functions such as pattern recognition during human perception and mental interpretation of visual scenes. Starting from the topological concepts of invariance and persistence, we introduce a Selfridge pandemonium variant of brain activity that takes into account a novel feature, namely, demons that recognize short straight-line segments, curved lines and scene shapes, such as shape interior, density and texture. Low-level representations of objects can be mapped to higher-level views (our mental interpretations): a series of transformations can be gradually applied to a pattern in a visual scene, without affecting its invariant properties. This makes it possible to construct a symbolic multi-dimensional representation of the environment. These representations can be projected continuously to an object that we have seen and continue to see, thanks to the mapping from shapes in our memory to shapes in Euclidean space. Although perceived shapes are 3-dimensional (plus time), the evaluation of shape features (volume, color, contour, closeness, texture, and so on) leads to n-dimensional brain landscapes. Here we discuss the advantages of our parallel, hierarchical model in pattern recognition, computer vision and biological nervous system's evolution. Copyright © 2018 Elsevier B.V. All rights reserved.
Research on image complexity evaluation method based on color information
NASA Astrophysics Data System (ADS)
Wang, Hao; Duan, Jin; Han, Xue-hui; Xiao, Bo
2017-11-01
In order to evaluate the complexity of a color image more effectively and to find the connection between image complexity and image information, this paper presents a method for computing image complexity based on color information. The theoretical analysis first divides complexity, at the subjective level, into three levels: low, medium, and high complexity. It then performs image feature extraction, and finally establishes a function relating the complexity value to the color feature model. The experimental results show that this evaluation method can objectively reconstruct the complexity of an image from its features. The results obtained by the method are in good agreement with human visual perception of complexity, so the color-based image complexity measure has a certain reference value.
Coding Local and Global Binary Visual Features Extracted From Video Sequences.
Baroffio, Luca; Canclini, Antonio; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2015-11-01
Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the bag-of-visual word model. Several applications, including, for example, visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget while attaining a target level of efficiency. In this paper, we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can conveniently be adopted to support the analyze-then-compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs the visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the compress-then-analyze (CTA) paradigm. In this paper, we experimentally compare the ATC and the CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: 1) homography estimation and 2) content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with the CTA, especially in bandwidth limited scenarios.
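Part of the reason binary descriptors are cheap to work with is that two descriptors can be compared with the Hamming distance, that is, an XOR followed by a bit count. The sketch below shows nearest-neighbour matching on that basis; storing descriptors as Python integers, the function names, and the 8-bit toy values are assumptions for illustration and are not taken from the paper.

```python
def hamming_distance(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def nearest_descriptor(query, database):
    """Index of the database descriptor closest to the query in Hamming distance."""
    return min(range(len(database)), key=lambda i: hamming_distance(query, database[i]))

# Toy 8-bit descriptors; practical binary descriptors are typically much longer.
database = [0b00000000, 0b11101111, 0b00110011]
query = 0b11111111
best_index = nearest_descriptor(query, database)
```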
Coding Local and Global Binary Visual Features Extracted From Video Sequences
NASA Astrophysics Data System (ADS)
Baroffio, Luca; Canclini, Antonio; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2015-11-01
Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks, while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the Bag-of-Visual-Word (BoVW) model. Several applications, including for example visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget, while attaining a target level of efficiency. In this paper we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can be conveniently adopted to support the Analyze-Then-Compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the Compress-Then-Analyze (CTA) paradigm. In this paper we experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: homography estimation and content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth limited scenarios.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features.
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images.
Fuzzy Classification of High Resolution Remote Sensing Scenes Using Visual Attention Features
Li, Linyi; Xu, Tingbao; Chen, Yun
2017-01-01
In recent years the spatial resolutions of remote sensing images have been improved greatly. However, a higher spatial resolution image does not always lead to a better result of automatic scene classification. Visual attention is an important characteristic of the human visual system, which can effectively help to classify remote sensing scenes. In this study, a novel visual attention feature extraction algorithm was proposed, which extracted visual attention features through a multiscale process. And a fuzzy classification method using visual attention features (FC-VAF) was developed to perform high resolution remote sensing scene classification. FC-VAF was evaluated by using remote sensing scenes from widely used high resolution remote sensing images, including IKONOS, QuickBird, and ZY-3 images. FC-VAF achieved more accurate classification results than the others according to the quantitative accuracy evaluation indices. We also discussed the role and impacts of different decomposition levels and different wavelets on the classification accuracy. FC-VAF improves the accuracy of high resolution scene classification and therefore advances the research of digital image analysis and the applications of high resolution remote sensing images. PMID:28761440
Image quality classification for DR screening using deep learning.
FengLi Yu; Jing Sun; Annan Li; Jun Cheng; Cheng Wan; Jiang Liu
2017-07-01
The quality of input images significantly affects the outcome of automated diabetic retinopathy (DR) screening systems. Unlike the previous methods that only consider simple low-level features such as hand-crafted geometric and structural features, in this paper we propose a novel method for retinal image quality classification (IQC) that performs computational algorithms imitating the working of the human visual system. The proposed algorithm combines unsupervised features from saliency map and supervised features coming from convolutional neural networks (CNN), which are fed to an SVM to automatically detect high quality vs poor quality retinal fundus images. We demonstrate the superior performance of our proposed algorithm on a large retinal fundus image dataset and the method could achieve higher accuracy than other methods. Although retinal images are used in this study, the methodology is applicable to the image quality assessment and enhancement of other types of medical images.
Dictionary Pruning with Visual Word Significance for Medical Image Retrieval
Zhang, Fan; Song, Yang; Cai, Weidong; Hauptmann, Alexander G.; Liu, Sidong; Pujol, Sonia; Kikinis, Ron; Fulham, Michael J; Feng, David Dagan; Chen, Mei
2016-01-01
Content-based medical image retrieval (CBMIR) is an active research area for disease diagnosis and treatment but it can be problematic given the small visual variations between anatomical structures. We propose a retrieval method based on a bag-of-visual-words (BoVW) to identify discriminative characteristics between different medical images with Pruned Dictionary based on Latent Semantic Topic description. We refer to this as the PD-LST retrieval. Our method has two main components. First, we calculate a topic-word significance value for each visual word given a certain latent topic to evaluate how the word is connected to this latent topic. The latent topics are learnt, based on the relationship between the images and words, and are employed to bridge the gap between low-level visual features and high-level semantics. These latent topics describe the images and words semantically and can thus facilitate more meaningful comparisons between the words. Second, we compute an overall-word significance value to evaluate the significance of a visual word within the entire dictionary. We designed an iterative ranking method to measure overall-word significance by considering the relationship between all latent topics and words. The words with higher values are considered meaningful with more significant discriminative power in differentiating medical images. We evaluated our method on two public medical imaging datasets and it showed improved retrieval accuracy and efficiency. PMID:27688597
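The pruning step described above can be sketched as follows, assuming an overall-word significance score is already available for each visual word. The scores, the `keep_ratio` parameter, and both function names are hypothetical; the iterative ranking that produces the scores in PD-LST is not reproduced here.

```python
from typing import Dict, List


def prune_dictionary(significance: Dict[int, float], keep_ratio: float) -> List[int]:
    """Return the IDs of the most significant visual words, in ascending ID order."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    # Keep at least one word; the ratio is an illustrative assumption.
    n_keep = max(1, round(len(significance) * keep_ratio))
    ranked = sorted(significance, key=lambda w: significance[w], reverse=True)
    return sorted(ranked[:n_keep])


def pruned_histogram(word_ids: List[int], kept: List[int]) -> Dict[int, int]:
    """Count occurrences of kept words only; pruned words are ignored."""
    counts = {w: 0 for w in kept}
    for w in word_ids:
        if w in counts:
            counts[w] += 1
    return counts
```

An image would then be described by `pruned_histogram` over its visual-word assignments, so that retrieval compares only the retained words.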
Dictionary Pruning with Visual Word Significance for Medical Image Retrieval.
Zhang, Fan; Song, Yang; Cai, Weidong; Hauptmann, Alexander G; Liu, Sidong; Pujol, Sonia; Kikinis, Ron; Fulham, Michael J; Feng, David Dagan; Chen, Mei
2016-02-12
Content-based medical image retrieval (CBMIR) is an active research area for disease diagnosis and treatment but it can be problematic given the small visual variations between anatomical structures. We propose a retrieval method based on a bag-of-visual-words (BoVW) to identify discriminative characteristics between different medical images with Pruned Dictionary based on Latent Semantic Topic description. We refer to this as the PD-LST retrieval. Our method has two main components. First, we calculate a topic-word significance value for each visual word given a certain latent topic to evaluate how the word is connected to this latent topic. The latent topics are learnt, based on the relationship between the images and words, and are employed to bridge the gap between low-level visual features and high-level semantics. These latent topics describe the images and words semantically and can thus facilitate more meaningful comparisons between the words. Second, we compute an overall-word significance value to evaluate the significance of a visual word within the entire dictionary. We designed an iterative ranking method to measure overall-word significance by considering the relationship between all latent topics and words. The words with higher values are considered meaningful with more significant discriminative power in differentiating medical images. We evaluated our method on two public medical imaging datasets and it showed improved retrieval accuracy and efficiency.
Invisibility and interpretation.
Herzog, Michael H; Hermens, Frouke; Oğmen, Haluk
2014-01-01
Invisibility is often thought to occur because of the low-level limitations of the visual system. For example, it is often assumed that backward masking renders a target invisible because the visual system is simply too slow to resolve the target and the mask separately. Here, we propose an alternative explanation in which invisibility is a goal rather than a limitation and occurs naturally when making sense out of the plethora of incoming information. For example, we present evidence that (in)visibility of an element can strongly depend on how it groups with other elements. Changing grouping changes visibility. In addition, we will show that features often just appear to be invisible but are in fact visible in a way the experimenter is not aware of.
Invisibility and interpretation
Herzog, Michael H.; Hermens, Frouke; Öğmen, Haluk
2014-01-01
Invisibility is often thought to occur because of the low-level limitations of the visual system. For example, it is often assumed that backward masking renders a target invisible because the visual system is simply too slow to resolve the target and the mask separately. Here, we propose an alternative explanation in which invisibility is a goal rather than a limitation and occurs naturally when making sense out of the plethora of incoming information. For example, we present evidence that (in)visibility of an element can strongly depend on how it groups with other elements. Changing grouping changes visibility. In addition, we will show that features often just appear to be invisible but are in fact visible in a way the experimenter is not aware of. PMID:25278910
Open-field arena boundary is a primary object of exploration for Drosophila
Soibam, Benjamin; Mann, Monica; Liu, Lingzhi; Tran, Jessica; Lobaina, Milena; Kang, Yuan Yuan; Gunaratne, Gemunu H; Pletcher, Scott; Roman, Gregg
2012-01-01
Drosophila adults, when placed into a novel open-field arena, initially exhibit an elevated level of activity followed by a reduced stable level of spontaneous activity and spend a majority of time near the arena edge, executing motions along the walls. In order to determine the environmental features that are responsible for the initial high activity and wall-following behavior exhibited during exploration, we examined wild-type and visually impaired mutants in arenas with different vertical surfaces. These experiments support the conclusion that the wall-following behavior of Drosophila is best characterized by a preference for the arena boundary, and not thigmotaxis or centrophobicity. In circular arenas, Drosophila mostly move in trajectories with low turn angles. Since the boundary preference could derive from highly linear trajectories, we further developed a simulation program to model the effects of turn angle on the boundary preference. In an hourglass-shaped arena with convex-angled walls that forced a straight versus wall-following choice, the simulation with constrained turn angles predicted general movement across a central gap, whereas Drosophila tend to follow the wall. Hence, low turn angled movement does not drive the boundary preference. Lastly, visually impaired Drosophila demonstrate a defect in attenuation of the elevated initial activity. Interestingly, the visually impaired w1118 activity decay defect can be rescued by increasing the contrast of the arena's edge, suggesting that the activity decay relies on visual detection of the boundary. The arena boundary is, therefore, a primary object of exploration for Drosophila. PMID:22574279
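A minimal sketch of a turn-angle-constrained walker in a circular arena, in the spirit of the simulation mentioned above. It is not the authors' program: the step length, edge band, wall-handling rule (stay in place and pick a new random heading), and all parameter names are assumptions made for illustration.

```python
import math
import random


def simulate_boundary_fraction(max_turn_deg: float, steps: int = 20000,
                               radius: float = 1.0, step_len: float = 0.02,
                               edge_band: float = 0.1, seed: int = 0) -> float:
    """Fraction of steps a random walker spends within `edge_band` of the wall."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    heading = rng.uniform(0.0, 2.0 * math.pi)
    max_turn = math.radians(max_turn_deg)
    near_edge = 0
    for _ in range(steps):
        # Each step changes heading by at most `max_turn` in either direction.
        heading += rng.uniform(-max_turn, max_turn)
        nx = x + step_len * math.cos(heading)
        ny = y + step_len * math.sin(heading)
        if math.hypot(nx, ny) > radius:
            # Assumed wall rule: stay in place and pick a fresh random heading.
            heading = rng.uniform(0.0, 2.0 * math.pi)
        else:
            x, y = nx, ny
        if math.hypot(x, y) >= radius - edge_band:
            near_edge += 1
    return near_edge / steps
```

Comparing the returned fraction across values of `max_turn_deg` gives a rough sense of how much boundary occupancy linear trajectories alone would produce under these assumptions.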
Liao, Gen-Yih; Chien, Yu-Tai; Chen, Yu-Jen; Hsiung, Hsiao-Fang; Chen, Hsiao-Jung; Hsieh, Meng-Hua; Wu, Wen-Jie
2017-05-25
Physical activity is important for middle-agers to maintain health both in middle age and in old age. Although thousands of exercise-promotion mobile phone apps are available for download, current literature offers little understanding regarding which design features can enhance middle-aged adults' quality perception toward exercise-promotion apps and which factors may influence such perception. The aims of this study were to understand (1) which design features of exercise-promotion apps can enhance quality perception of middle-agers, (2) whether their needs are matched by current functions offered in app stores, and (3) whether physical activity (PA) and mobile phone self-efficacy (MPSE) influence quality perception. A total of 105 middle-agers participated and filled out three scales: the International Physical Activity Questionnaire (IPAQ), the MPSE scale, and the need for design features questionnaire. The design features were developed based on the Coventry, Aberdeen, and London-Refined (CALO-RE) taxonomy. Following the Kano quality model, the need for design features questionnaire asked participants to classify design features into five categories: attractive, one-dimensional, must-be, indifferent, and reverse. The quality categorization was conducted based on a voting approach and the categorization results were compared with the findings of a prevalence study to determine whether needs match current availability. In total, 52 multinomial logistic regression models were analyzed to evaluate the effects of PA level and MPSE on quality perception of design features. The Kano analysis on the total sample revealed that visual demonstration of exercise instructions is the only attractive design feature, whereas the other 51 design features were perceived with indifference. When quality perception was examined by PA level, 21 features were recommended for the low level and 6 features for the medium level, but none for high-level PA. 
In contrast, 14 design features are recommended for high-level MPSE and 6 features for the medium level, whereas 1 feature is recommended for low-level participants. The analysis suggests that the implementation of demanded features could be low, as the average prevalence of demanded design features is 20% (4.3/21). Surprisingly, social comparison and social support, the most implemented features in current apps, were categorized into the indifferent category. The magnitude of effect is larger for MPSE because it affects quality perception of more design features than PA does. Delving into the 52 regression models revealed that high MPSE more likely induces attractive or one-dimensional categorization, suggesting the importance of technological self-efficacy on eHealth care promotion. This study is the first to propose middle-agers' needs in relation to mobile phone exercise-promotion. In addition to the tailor-made recommendations, suggestions are offered to app designers to enhance the performance of persuasive features. An interesting finding on change of quality perception attributed to MPSE is proposed as future research. ©Gen-Yih Liao, Yu-Tai Chien, Yu-Jen Chen, Hsiao-Fang Hsiung, Hsiao-Jung Chen, Meng-Hua Hsieh, Wen-Jie Wu. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 25.05.2017.
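The voting approach to Kano categorization mentioned above can be sketched as a simple plurality vote over participants' classifications of one design feature. This is an illustration only: the function name and the tie-breaking rule (the category listed first wins) are assumptions, not details taken from the study.

```python
from collections import Counter
from typing import Iterable

# The five Kano categories named in the abstract.
KANO_CATEGORIES = ("attractive", "one-dimensional", "must-be", "indifferent", "reverse")


def categorize_by_vote(votes: Iterable[str]) -> str:
    """Return the category chosen by the most participants for one design feature."""
    counts = Counter(votes)
    if not counts:
        raise ValueError("at least one vote is required")
    unknown = set(counts) - set(KANO_CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    # Assumed tie-break: the category listed first in KANO_CATEGORIES wins.
    return max(KANO_CATEGORIES, key=lambda c: counts[c])
```

Applying the function once per design feature yields one category per feature, which can then be compared against how often that feature appears in existing apps.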
Computational mechanisms underlying cortical responses to the affordance properties of visual scenes
Epstein, Russell A.
2018-01-01
Biologically inspired deep convolutional neural networks (CNNs), trained for computer vision tasks, have been found to predict cortical responses with remarkable accuracy. However, the internal operations of these models remain poorly understood, and the factors that account for their success are unknown. Here we develop a set of techniques for using CNNs to gain insights into the computational mechanisms underlying cortical responses. We focused on responses in the occipital place area (OPA), a scene-selective region of dorsal occipitoparietal cortex. In a previous study, we showed that fMRI activation patterns in the OPA contain information about the navigational affordances of scenes; that is, information about where one can and cannot move within the immediate environment. We hypothesized that this affordance information could be extracted using a set of purely feedforward computations. To test this idea, we examined a deep CNN with a feedforward architecture that had been previously trained for scene classification. We found that responses in the CNN to scene images were highly predictive of fMRI responses in the OPA. Moreover the CNN accounted for the portion of OPA variance relating to the navigational affordances of scenes. The CNN could thus serve as an image-computable candidate model of affordance-related responses in the OPA. We then ran a series of in silico experiments on this model to gain insights into its internal operations. These analyses showed that the computation of affordance-related features relied heavily on visual information at high-spatial frequencies and cardinal orientations, both of which have previously been identified as low-level stimulus preferences of scene-selective visual cortex. These computations also exhibited a strong preference for information in the lower visual field, which is consistent with known retinotopic biases in the OPA. 
Visualizations of feature selectivity within the CNN suggested that affordance-based responses encoded features that define the layout of the spatial environment, such as boundary-defining junctions and large extended surfaces. Together, these results map the sensory functions of the OPA onto a fully quantitative model that provides insights into its visual computations. More broadly, they advance integrative techniques for understanding visual cortex across multiple levels of analysis: from the identification of cortical sensory functions to the modeling of their underlying algorithms. PMID:29684011
Horikawa, Tomoyasu; Kamitani, Yukiyasu
2017-01-01
Dreaming is generally thought to be generated by spontaneous brain activity during sleep with patterns common to waking experience. This view is supported by a recent study demonstrating that dreamed objects can be predicted from brain activity during sleep using statistical decoders trained with stimulus-induced brain activity. However, it remains unclear whether and how visual image features associated with dreamed objects are represented in the brain. In this study, we used a deep neural network (DNN) model for object recognition as a proxy for hierarchical visual feature representation, and DNN features for dreamed objects were analyzed with brain decoding of fMRI data collected during dreaming. The decoders were first trained with stimulus-induced brain activity labeled with the feature values of the stimulus image from multiple DNN layers. The decoders were then used to decode DNN features from the dream fMRI data, and the decoded features were compared with the averaged features of each object category calculated from a large-scale image database. We found that the feature values decoded from the dream fMRI data positively correlated with those associated with dreamed object categories at mid- to high-level DNN layers. Using the decoded features, the dreamed object category could be identified at above-chance levels by matching them to the averaged features for candidate categories. The results suggest that dreaming recruits hierarchical visual feature representations associated with objects, which may support phenomenal aspects of dream experience.
Wang, Yuanye; Luo, Huan
2017-01-01
In order to deal with the external world efficiently, the brain constantly generates predictions about incoming sensory inputs, a process known as "predictive coding." Our recent studies, by employing visual priming paradigms in combination with a time-resolved behavioral measurement, reveal that perceptual predictions about simple features (e.g., left or right orientation) return to low sensory areas not continuously but recurrently in a theta-band (3-4Hz) rhythm. However, it remains unknown whether high-level object processing is also mediated by the oscillatory mechanism and, if so, at which rhythm the mechanism works. In the present study, we employed a morph-face priming paradigm and time-resolved behavioral measurements to examine the fine temporal dynamics of face identity priming performance. First, we reveal classical priming effects and a rhythmic trend within the prime-to-probe SOA of 600ms (Experiment 1). Next, we densely sampled face priming behavioral performance within this SOA range (Experiment 2). Our results demonstrate a significant ~5Hz oscillatory component in face priming behavioral performance, suggesting that a rhythmic process also coordinates object-level prediction (i.e., face identity here). In comparison to our previous studies, the results suggest that the rhythm for high-level objects is faster than that for simple features. We propose that the seemingly distinctive priming rhythms might arise because the object-level and simple feature-level predictions return to different stages along the visual pathway (e.g., FFA for face priming and V1 for simple feature priming). In summary, the findings support a general theta-band (3-6Hz) temporal organization mechanism in predictive coding, and such a waxing-and-waning pattern in predictive coding may help the brain to be more readily updated for new inputs. © 2017 Elsevier B.V. All rights reserved.
Yashar, Amit; Denison, Rachel N
2017-12-01
Training can modify the visual system to produce a substantial improvement on perceptual tasks and therefore has applications for treating visual deficits. Visual perceptual learning (VPL) is often specific to the trained feature, which gives insight into processes underlying brain plasticity, but limits VPL's effectiveness in rehabilitation. Under what circumstances VPL transfers to untrained stimuli is poorly understood. Here we report a qualitatively new phenomenon: intrinsic variation in the representation of features determines the transfer of VPL. Orientations around cardinal are represented more reliably than orientations around oblique in V1, which has been linked to behavioral consequences such as visual search asymmetries. We studied VPL for visual search of near-cardinal or oblique targets among distractors of the other orientation while controlling for other display and task attributes, including task precision, task difficulty, and stimulus exposure. Learning was the same in all training conditions; however, transfer depended on the orientation of the target, with full transfer of learning from near-cardinal to oblique targets but not the reverse. To evaluate the idea that representational reliability was the key difference between the orientations in determining VPL transfer, we created a model that combined orientation-dependent reliability, improvement of reliability with learning, and an optimal search strategy. Modeling suggested that not only search asymmetries but also the asymmetric transfer of VPL depended on preexisting differences between the reliability of near-cardinal and oblique representations. Transfer asymmetries in model behavior also depended on having different learning rates for targets and distractors, such that greater learning for low-reliability distractors facilitated transfer. 
These findings suggest that training on sensory features with intrinsically low reliability may maximize the generalizability of learning in complex visual environments.
Feature reliability determines specificity and transfer of perceptual learning in orientation search
2017-01-01
Training can modify the visual system to produce a substantial improvement on perceptual tasks and therefore has applications for treating visual deficits. Visual perceptual learning (VPL) is often specific to the trained feature, which gives insight into processes underlying brain plasticity, but limits VPL’s effectiveness in rehabilitation. Under what circumstances VPL transfers to untrained stimuli is poorly understood. Here we report a qualitatively new phenomenon: intrinsic variation in the representation of features determines the transfer of VPL. Orientations around cardinal are represented more reliably than orientations around oblique in V1, which has been linked to behavioral consequences such as visual search asymmetries. We studied VPL for visual search of near-cardinal or oblique targets among distractors of the other orientation while controlling for other display and task attributes, including task precision, task difficulty, and stimulus exposure. Learning was the same in all training conditions; however, transfer depended on the orientation of the target, with full transfer of learning from near-cardinal to oblique targets but not the reverse. To evaluate the idea that representational reliability was the key difference between the orientations in determining VPL transfer, we created a model that combined orientation-dependent reliability, improvement of reliability with learning, and an optimal search strategy. Modeling suggested that not only search asymmetries but also the asymmetric transfer of VPL depended on preexisting differences between the reliability of near-cardinal and oblique representations. Transfer asymmetries in model behavior also depended on having different learning rates for targets and distractors, such that greater learning for low-reliability distractors facilitated transfer. 
These findings suggest that training on sensory features with intrinsically low reliability may maximize the generalizability of learning in complex visual environments. PMID:29240813
A Force-Visualized Silicone Retractor Attachable to Surgical Suction Pipes.
Watanabe, Tetsuyou; Koyama, Toshio; Yoneyama, Takeshi; Nakada, Mitsutoshi
2017-04-05
This paper presents a force-visually-observable silicone retractor, which is an extension of a previously developed system that had the same functions of retracting, suction, and force sensing. These features provide not only high usability by reducing the number of tool changes, but also a safe choice of retracting by visualized force information. Suction is achieved by attaching the retractor to a suction pipe. The retractor has a deformable sensing component including a hole filled with a liquid. The hole is connected to an outer tube, and the liquid level is displaced in proportion to the extent of deformation resulting from the retracting load. The liquid level can be observed around the surgeon's fingertips, which enhances usability. The new hybrid structure of soft sensing and hard retracting allows the miniaturization of the retractor as well as a resolution of less than 0.05 N and a range of 0.1-0.7 N. The overall structure is made of silicone, which has the advantages of disposability, low cost, and easy sterilization/disinfection. This system was validated by conducting experiments.
Forced to remember: when memory is biased by salient information.
Santangelo, Valerio
2015-04-15
The last decades have seen rapid growth in attempts to understand the key factors involved in the internal memory representation of the external world. Visual salience has been found to make a major contribution to predicting the probability that an item/object embedded in a complex setting (i.e., a natural scene) is encoded and then remembered later on. Here I review the existing literature highlighting the impact of perceptual- (based on low-level sensory features) and semantics-related salience (based on high-level knowledge) on short-term memory representation, along with the neural mechanisms underpinning the interplay between these factors. The available evidence reveals that both perceptual- and semantics-related factors affect attention selection mechanisms during the encoding of natural scenes. By biasing internal memory representation, both perceptual and semantic factors increase the probability of remembering high-saliency items to the detriment of low-saliency items. The available evidence also highlights an interplay between these factors, with a reduced impact of perceptual-related salience in biasing memory representation as a function of the increasing availability of semantics-related salient information. The neural mechanisms underpinning this interplay involve the activation of different portions of the frontoparietal attention control network. Ventral regions support the assignment of selection/encoding priorities based on high-level semantics, while the involvement of dorsal regions reflects priority assignment based on low-level sensory features. Copyright © 2015 Elsevier B.V. All rights reserved.
Feature-selective attention enhances color signals in early visual areas of the human brain.
Müller, M M; Andersen, S; Trujillo, N J; Valdés-Sosa, P; Malinowski, P; Hillyard, S A
2006-09-19
We used an electrophysiological measure of selective stimulus processing (the steady-state visual evoked potential, SSVEP) to investigate feature-specific attention to color cues. Subjects viewed a display consisting of spatially intermingled red and blue dots that continually shifted their positions at random. The red and blue dots flickered at different frequencies and thereby elicited distinguishable SSVEP signals in the visual cortex. Paying attention selectively to either the red or blue dot population produced an enhanced amplitude of its frequency-tagged SSVEP, which was localized by source modeling to early levels of the visual cortex. A control experiment showed that this selection was based on color rather than flicker frequency cues. This signal amplification of attended color items provides an empirical basis for the rapid identification of feature conjunctions during visual search, as proposed by "guided search" models.
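Frequency tagging, as described above, relies on reading out the response amplitude at each flicker frequency separately. A minimal sketch using a single-frequency discrete Fourier transform is shown below; the function name is hypothetical, and the sampling rate and flicker frequencies in the usage note are illustrative values that the abstract does not state.

```python
import math
from typing import Sequence


def amplitude_at(signal: Sequence[float], sample_rate: float, freq: float) -> float:
    """Amplitude of the sinusoidal component at `freq` via a single-frequency DFT."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    # Project the signal onto cosine and sine at the tagged frequency.
    re = sum(s * math.cos(2.0 * math.pi * freq * i / sample_rate)
             for i, s in enumerate(signal))
    im = sum(s * math.sin(2.0 * math.pi * freq * i / sample_rate)
             for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n
```

For a recording window that contains a whole number of cycles of each tagged frequency (for example 2 s at 200 Hz with hypothetical tags of 8 Hz and 12 Hz), calling `amplitude_at` once per frequency separates the two stimulus populations, and an attention effect would appear as a larger amplitude at the attended tag.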
Eyes Matched to the Prize: The State of Matched Filters in Insect Visual Circuits.
Kohn, Jessica R; Heath, Sarah L; Behnia, Rudy
2018-01-01
Confronted with an ever-changing visual landscape, animals must be able to detect relevant stimuli and translate this information into behavioral output. A visual scene contains an abundance of information: to interpret the entirety of it would be uneconomical. To optimally perform this task, neural mechanisms exist to enhance the detection of important features of the sensory environment while simultaneously filtering out irrelevant information. This can be accomplished by using a circuit design that implements specific "matched filters" that are tuned to relevant stimuli. Following this rule, the well-characterized visual systems of insects have evolved to streamline feature extraction on both a structural and functional level. Here, we review examples of specialized visual microcircuits for vital behaviors across insect species, including feature detection, escape, and estimation of self-motion. Additionally, we discuss how these microcircuits are modulated to weigh relevant input with respect to different internal and behavioral states.
Foulsham, Tom; Barton, Jason J S; Kingstone, Alan; Dewhurst, Richard; Underwood, Geoffrey
2011-08-01
Two recent papers (Foulsham, Barton, Kingstone, Dewhurst, & Underwood, 2009; Mannan, Kennard, & Husain, 2009) report that neuropsychological patients with a profound object recognition problem (visual agnosic subjects) show differences from healthy observers in the way their eye movements are controlled when looking at images. The interpretation of these papers is that eye movements can be modeled as the selection of points on a saliency map, and that agnosic subjects show an increased reliance on visual saliency, i.e., brightness and contrast in low-level stimulus features. Here we review this approach and present new data from our own experiments with an agnosic patient that quantify the relationship between saliency and fixation location. In addition, we consider whether the perceptual difficulties of individual patients might be modeled by selectively weighting the different features involved in a saliency map. Our data indicate that saliency is not always a good predictor of fixation in agnosia: even for our agnosic subject, as for normal observers, the saliency-fixation relationship varied as a function of the task. This means that top-down processes still have a significant effect on the earliest stages of scanning in the setting of visual agnosia, indicating severe limitations for the saliency map model. Top-down, active strategies, which are the hallmark of our human visual system, play a vital role in eye movement control, whether we know what we are looking at or not. Copyright © 2011 Elsevier Ltd. All rights reserved.
Severtson, Dolores J.
2015-01-01
Barriers to communicating the uncertainty of environmental health risks include preferences for certain information and low numeracy. Map features designed to communicate the magnitude and uncertainty of estimated cancer risk from air pollution were tested among 826 participants to assess how map features influenced judgments of adequacy and the intended communication goals. An uncertain versus certain visual feature was judged as less adequate but met both communication goals and addressed numeracy barriers. Expressing relative risk using words communicated uncertainty and addressed numeracy barriers but was judged as highly inadequate. Risk communication and visual cognition concepts were applied to explain findings. PMID:26412960
Severtson, Dolores J
2015-02-01
Barriers to communicating the uncertainty of environmental health risks include preferences for certain information and low numeracy. Map features designed to communicate the magnitude and uncertainty of estimated cancer risk from air pollution were tested among 826 participants to assess how map features influenced judgments of adequacy and the intended communication goals. An uncertain versus certain visual feature was judged as less adequate but met both communication goals and addressed numeracy barriers. Expressing relative risk using words communicated uncertainty and addressed numeracy barriers but was judged as highly inadequate. Risk communication and visual cognition concepts were applied to explain findings.
Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-01-01
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information. PMID:29513219
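The variance partitioning described above can be illustrated with the usual definition of a model's unique contribution: the R² of the full combination minus the R² obtained without that model. The sketch below assumes the R² values have already been estimated elsewhere; the function name and the numbers in the usage note are made up for illustration and are not results from the study.

```python
from typing import Dict, FrozenSet


def unique_variance(r2: Dict[FrozenSet[str], float],
                    models: FrozenSet[str]) -> Dict[str, float]:
    """Unique variance per model: R^2(all models) - R^2(all models except that one)."""
    full = r2[models]
    return {m: full - r2[models - {m}] for m in sorted(models)}
```

For example, with hypothetical values `R²(functional, dnn, objects) = 0.50`, `R²(dnn, objects) = 0.35`, `R²(functional, objects) = 0.40`, and `R²(functional, dnn) = 0.50`, the unique contributions would be 0.15 for the functional model, 0.10 for the DNN model, and 0.00 for the object label model.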
Groen, Iris Ia; Greene, Michelle R; Baldassano, Christopher; Fei-Fei, Li; Beck, Diane M; Baker, Chris I
2018-03-07
Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information.
Advancing Bag-of-Visual-Words Representations for Lesion Classification in Retinal Images
Pires, Ramon; Jelinek, Herbert F.; Wainer, Jacques; Valle, Eduardo; Rocha, Anderson
2014-01-01
Diabetic Retinopathy (DR) is a complication of diabetes that can lead to blindness if not readily discovered. Automated screening algorithms have the potential to improve identification of patients who need further medical attention. However, the identification of lesions must be accurate to be useful for clinical application. The bag-of-visual-words (BoVW) algorithm employs a maximum-margin classifier in a flexible framework that is able to detect the most common DR-related lesions such as microaneurysms, cotton-wool spots and hard exudates. BoVW bypasses the need for pre- and post-processing of the retinographic images, as well as the need for specific ad hoc techniques for identifying each type of lesion. An extensive evaluation of the BoVW model was performed using three large retinograph datasets (DR1, DR2 and Messidor) with different resolutions, collected by different healthcare personnel. The results demonstrate that the BoVW classification approach can identify different lesions within an image without having to use a different algorithm for each lesion, reducing processing time and providing a more flexible diagnostic system. Our BoVW scheme is based on sparse low-level feature detection with a Speeded-Up Robust Features (SURF) local descriptor, and mid-level features based on semi-soft coding with max pooling. The best BoVW representation for retinal image classification achieved an area under the receiver operating characteristic curve (AUC-ROC) of 97.8% (exudates) and 93.5% (red lesions), applying a cross-dataset validation protocol. To assess the accuracy for detecting cases that require referral within one year, the sparse extraction technique associated with semi-soft coding and max pooling obtained an AUC of 94.2 ± 2.0%, outperforming current methods. These results indicate that, for retinal image classification tasks in clinical practice, BoVW equals and, in some instances, surpasses results obtained using dense detection (widely believed to be the best choice in many vision problems) for the low-level descriptors. PMID:24886780
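The mid-level step named in this abstract (coding local descriptors against a visual codebook, then max pooling) can be sketched as follows. The abstract does not define semi-soft coding, so this sketch assumes one plausible reading: a Gaussian soft assignment restricted to the k nearest codewords. The function names and the parameters `k` and `sigma` are hypothetical, and the toy descriptors stand in for SURF vectors.

```python
import math


def semi_soft_codes(descriptors, codebook, k=2, sigma=1.0):
    """Code each descriptor against the codebook (assumed semi-soft scheme).

    Only the k nearest codewords receive a non-zero, Gaussian-weighted
    activation; all other codewords stay at zero.
    """
    codes = []
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dist2 = [sum((a - b) ** 2 for a, b in zip(d, w)) for w in codebook]
        nearest = sorted(range(len(codebook)), key=lambda j: dist2[j])[:k]
        row = [0.0] * len(codebook)
        for j in nearest:
            row[j] = math.exp(-dist2[j] / (2 * sigma ** 2))
        codes.append(row)
    return codes


def max_pool(codes):
    """Image-level vector: strongest activation of each codeword."""
    return [max(column) for column in zip(*codes)]


# Toy 2-D example; real descriptors would be much higher dimensional.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
descriptors = [[0.0, 0.0], [0.0, 1.0]]
pooled = max_pool(semi_soft_codes(descriptors, codebook, k=1))
```

The pooled vector has one entry per codeword and would be the input to the maximum-margin classifier mentioned in the abstract.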
Consumers' Perceptions of Patient-Accessible Electronic Medical Records
Vaughon, Wendy L; Czaja, Sara J; Levy, Joslyn; Rockoff, Maxine L
2013-01-01
Background: Electronic health information (eHealth) tools for patients, including patient-accessible electronic medical records (patient portals), are proliferating in health care delivery systems nationally. However, there has been very limited study of the perceived utility and functionality of portals, as well as limited assessment of these systems by vulnerable (low education level, racial/ethnic minority) consumers. Objective: The objective of the study was to identify vulnerable consumers’ response to patient portals, their perceived utility and value, as well as their reactions to specific portal functions. Methods: This qualitative study used 4 focus groups with 28 low education level, English-speaking consumers in June and July 2010, in New York City. Results: Participants included 10 males and 18 females, ranging in age from 21 to 63 years; 19 non-Hispanic black, 7 Hispanic, 1 non-Hispanic White and 1 Other. None of the participants had higher than a high school level education, and 13 had less than a high school education. All participants had experience with computers and 26 used the Internet. Major themes were enhanced consumer engagement/patient empowerment, extending the doctor’s visit/enhancing communication with health care providers, literacy and health literacy factors, improved prevention and health maintenance, and privacy and security concerns. Consumers were also asked to comment on a number of key portal features. Consumers were most positive about features that increased convenience, such as making appointments and refilling prescriptions. Consumers raised concerns about a number of potential barriers to usage, such as complex language, complex visual layouts, and poor usability features. Conclusions: Most consumers were enthusiastic about patient portals and perceived that they had great utility and value. Study findings suggest that for patient portals to be effective for all consumers, portals must be designed to be easy to read, visually engaging, and have user-friendly navigation. PMID:23978618
Recursive feature elimination for biomarker discovery in resting-state functional connectivity.
Ravishankar, Hariharan; Madhavan, Radhika; Mullick, Rakesh; Shetty, Teena; Marinelli, Luca; Joel, Suresh E
2016-08-01
Biomarker discovery involves finding correlations between features and clinical symptoms to aid clinical decisions. This task is especially difficult in resting state functional magnetic resonance imaging (rs-fMRI) data due to low SNR, high dimensionality of images, inter-subject and intra-subject variability and small numbers of subjects compared to the number of derived features. Traditional univariate analysis suffers from the problem of multiple comparisons. Here, we adopt an alternative data-driven method for identifying population differences in functional connectivity. We propose a machine-learning approach to down-select functional connectivity features associated with symptom severity in mild traumatic brain injury (mTBI). Using this approach, we identified functional regions with altered connectivity in mTBI, including the executive control, visual and precuneus networks. We compared functional connections at multiple resolutions to determine which scale would be more sensitive to changes related to patient recovery. These modular network-level features can be used as diagnostic tools for predicting disease severity and recovery profiles.
Beyond Correlation: Do Color Features Influence Attention in Rainforest?
Frey, Hans-Peter; Wirz, Kerstin; Willenbockel, Verena; Betz, Torsten; Schreiber, Cornell; Troscianko, Tomasz; König, Peter
2011-01-01
Recent research indicates a direct relationship between low-level color features and visual attention under natural conditions. However, the design of these studies allows only correlational observations and no inference about mechanisms. Here we go a step further to examine the nature of the influence of color features on overt attention in an environment in which trichromatic color vision is advantageous. We recorded eye-movements of color-normal and deuteranope human participants freely viewing original and modified rainforest images. Eliminating red–green color information dramatically alters fixation behavior in color-normal participants. Changes in feature correlations and variability over subjects and conditions provide evidence for a causal effect of red–green color-contrast. The effects of blue–yellow contrast are much smaller. However, globally rotating hue in color space in these images reveals a mechanism analyzing color-contrast invariant of a specific axis in color space. Surprisingly, in deuteranope participants we find significantly elevated red–green contrast at fixation points, comparable to color-normal participants. Temporal analysis indicates that this is due to compensatory mechanisms acting on a slower time scale. Taken together, our results suggest that under natural conditions red–green color information contributes to overt attention at a low-level (bottom-up). Nevertheless, the results of the image modifications and deuteranope participants indicate that evaluation of color information is done in a hue-invariant fashion. PMID:21519395
Facial recognition using multisensor images based on localized kernel eigen spaces.
Gundimada, Satyanadh; Asari, Vijayan K
2009-06-01
A feature selection technique along with an information fusion procedure for improving the recognition accuracy of a visual and thermal image-based facial recognition system is presented in this paper. A novel modular kernel eigenspaces approach is developed and implemented on the phase congruency feature maps extracted from the visual and thermal images individually. Smaller sub-regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are then projected into higher dimensional spaces using kernel methods. The proposed localized nonlinear feature selection procedure helps to overcome the bottlenecks of illumination variations, partial occlusions, expression variations and variations due to temperature changes that affect the visual and thermal face recognition techniques. AR and Equinox databases are used for experimentation and evaluation of the proposed technique. The proposed feature selection procedure has greatly improved the recognition accuracy for both the visual and thermal images when compared to conventional techniques. Also, a decision level fusion methodology is presented which along with the feature selection procedure has outperformed various other face recognition techniques in terms of recognition accuracy.
Contextual effects on perceived contrast: figure-ground assignment and orientation contrast.
Self, Matthew W; Mookhoek, Aart; Tjalma, Nienke; Roelfsema, Pieter R
2015-02-02
Figure-ground segregation is an important step in the path leading to object recognition. The visual system segregates objects ('figures') in the visual scene from their backgrounds ('ground'). Electrophysiological studies in awake-behaving monkeys have demonstrated that neurons in early visual areas increase their firing rate when responding to a figure compared to responding to the background. We hypothesized that similar changes in neural firing would take place in early visual areas of the human visual system, leading to changes in the perception of low-level visual features. In this study, we investigated whether contrast perception is affected by figure-ground assignment using stimuli similar to those in the electrophysiological studies in monkeys. We measured contrast discrimination thresholds and perceived contrast for Gabor probes placed on figures or the background and found that the perceived contrast of the probe was increased when it was placed on a figure. Furthermore, we tested how this effect compared with the well-known effect of orientation contrast on perceived contrast. We found that figure-ground assignment and orientation contrast produced changes in perceived contrast of a similar magnitude, and that they interacted. Our results demonstrate that figure-ground assignment influences perceived contrast, consistent with an effect of figure-ground assignment on activity in early visual areas of the human visual system. © 2015 ARVO.
The influence of attention levels on psychophysiological responses.
Chang, Yu-Chieh; Huang, Shwu-Lih
2012-10-01
This study aimed to examine which brain oscillatory activities and peripheral physiological measures were influenced by attention levels. A new experimental procedure was designed. Participants were asked to count the number of target events while viewing eight moving white circles. An event occurred when two of the circles changed from white to red or blue. In the low-attention task, similar to a feature search, the target events were defined by color only. In the high-attention task, similar to a conjunction search, the target events were defined by both color and size. In the control task, participants were asked to passively watch the series of events while remembering a number. Based on Feature Integration Theory, our high-attention task would demand more attentional investment than the low-attention task. Given the identical visual stimuli and requirement of keeping a number in working memory for all three tasks, the changes in brain oscillatory activities can be attributed to attention level rather than to perceptual content or memory processes. Peripheral measures such as heart rate, heart rate variability (HRV), respiration rate, eye blinks, and skin conductance level were also evaluated. In comparing the high-attention task with the low-attention task, theta synchronization at the Fz, Cz, and Pz electrodes as a group, alpha2 desynchronization at the Fz, Cz, Pz, and Oz electrodes as a group, and a decrease in the low-frequency component and ratio measure of HRV were evident. These measures are considered to be promising indices for discriminating between attention levels. Copyright © 2012 Elsevier B.V. All rights reserved.
Rajaei, Karim; Khaligh-Razavi, Seyed-Mahdi; Ghodrati, Masoud; Ebrahimpour, Reza; Shiri Ahmad Abadi, Mohammad Ebrahim
2012-01-01
The brain mechanism of extracting visual features for recognizing various objects has consistently been a controversial issue in computational models of object recognition. To extract visual features, we introduce a new, biologically motivated model for facial categorization, which is an extension of the Hubel and Wiesel simple-to-complex cell hierarchy. To address the synaptic stability versus plasticity dilemma, we apply the Adaptive Resonance Theory (ART) for extracting informative intermediate level visual features during the learning process, which also makes this model stable against the destruction of previously learned information while learning new information. Such a mechanism has been suggested to be embedded within known laminar microcircuits of the cerebral cortex. To reveal the strength of the proposed visual feature learning mechanism, we show that when we use this mechanism in the training process of a well-known biologically motivated object recognition model (the HMAX model), it performs better than the HMAX model in face/non-face classification tasks. Furthermore, we demonstrate that our proposed mechanism is capable of following similar trends in performance as humans in a psychophysical experiment using a face versus non-face rapid categorization task.
Feature saliency and feedback information interactively impact visual category learning
Hammer, Rubi; Sloutsky, Vladimir; Grill-Spector, Kalanit
2015-01-01
Visual category learning (VCL) involves detecting which features are most relevant for categorization. VCL relies on attentional learning, which enables effectively redirecting attention to object’s features most relevant for categorization, while ‘filtering out’ irrelevant features. When features relevant for categorization are not salient, VCL relies also on perceptual learning, which enables becoming more sensitive to subtle yet important differences between objects. Little is known about how attentional learning and perceptual learning interact when VCL relies on both processes at the same time. Here we tested this interaction. Participants performed VCL tasks in which they learned to categorize novel stimuli by detecting the feature dimension relevant for categorization. Tasks varied both in feature saliency (low-saliency tasks that required perceptual learning vs. high-saliency tasks), and in feedback information (tasks with mid-information, moderately ambiguous feedback that increased attentional load, vs. tasks with high-information non-ambiguous feedback). We found that mid-information and high-information feedback were similarly effective for VCL in high-saliency tasks. This suggests that an increased attentional load, associated with the processing of moderately ambiguous feedback, has little effect on VCL when features are salient. In low-saliency tasks, VCL relied on slower perceptual learning; but when the feedback was highly informative participants were able to ultimately attain the same performance as during the high-saliency VCL tasks. However, VCL was significantly compromised in the low-saliency mid-information feedback task. We suggest that such low-saliency mid-information learning scenarios are characterized by a ‘cognitive loop paradox’ where two interdependent learning processes have to take place simultaneously. PMID:25745404
User-assisted video segmentation system for visual communication
NASA Astrophysics Data System (ADS)
Wu, Zhengping; Chen, Chun
2002-01-01
Video segmentation plays an important role in efficient storage and transmission in visual communication. In this paper, we introduce a novel video segmentation system using point tracking and contour formation techniques. Inspired by results from the study of the human visual system, we split the video segmentation problem into three separate phases: user-assisted feature point selection, automatic feature point tracking, and contour formation. This split relieves the computer of ill-posed automatic segmentation problems and allows a higher level of flexibility in the method. First, precise feature points are found using a combination of user assistance and an eigenvalue-based adjustment. Second, the feature points in the remaining frames are obtained using motion estimation and point refinement. Finally, contour formation is used to extract the object, and a point insertion process provides the feature points for tracking in the next frame.
Sherman, Aleksandra; Grabowecky, Marcia; Suzuki, Satoru
2015-08-01
What shapes art appreciation? Much research has focused on the importance of visual features themselves (e.g., symmetry, natural scene statistics) and of the viewer's experience and expertise with specific artworks. However, even after taking these factors into account, there are considerable individual differences in art preferences. Our new result suggests that art preference is also influenced by the compatibility between visual properties and the characteristics of the viewer's visual system. Specifically, we have demonstrated, using 120 artworks from diverse periods, cultures, genres, and styles, that art appreciation is increased when the level of visual complexity within an artwork is compatible with the viewer's visual working memory capacity. The result highlights the importance of the interaction between visual features and the beholder's general visual capacity in shaping art appreciation. (c) 2015 APA, all rights reserved.
Reward associations impact both iconic and visual working memory.
Infanti, Elisa; Hickey, Clayton; Turatto, Massimo
2015-02-01
Reward plays a fundamental role in human behavior. A growing number of studies have shown that stimuli associated with reward become salient and attract attention. The aim of the present study was to extend these results into the investigation of iconic memory and visual working memory. In two experiments we asked participants to perform a visual-search task where different colors of the target stimuli were paired with high or low reward. We then tested whether the pre-established feature-reward association affected performance on a subsequent visual memory task, in which no reward was provided. In this test phase participants viewed arrays of 8 objects, one of which had unique color that could match the color associated with reward during the previous visual-search task. A probe appeared at varying intervals after stimulus offset to identify the to-be-reported item. Our results suggest that reward biases the encoding of visual information such that items characterized by a reward-associated feature interfere with mnemonic representations of other items in the test display. These results extend current knowledge regarding the influence of reward on early cognitive processes, suggesting that feature-reward associations automatically interact with the encoding and storage of visual information, both in iconic memory and visual working memory. Copyright © 2014 Elsevier Ltd. All rights reserved.
Vukusic, Svjetlana; Ciorciari, Joseph; Crewther, David P.
2017-01-01
People with Autism spectrum disorder (ASD) show difficulty in social communication, especially in the rapid assessment of emotion in faces. This study examined the processing of emotional faces in typically developing adults with high and low levels of autistic traits (measured using the Autism Spectrum Quotient—AQ). Event-related potentials (ERPs) were recorded during viewing of backward-masked neutral, fearful and happy faces presented under two conditions: subliminal (16 ms, below the level of visual conscious awareness) and supraliminal (166 ms, above the time required for visual conscious awareness). Individuals with low and high AQ differed in the processing of subliminal faces, with the low AQ group showing an enhanced N2 amplitude for subliminal happy faces. Some group differences were found in the condition effects, with the Low AQ showing shorter frontal P3b and N4 latencies for subliminal vs. supraliminal condition. Although results did not show any group differences on the face-specific N170 component, there were shorter N170 latencies for supraliminal vs. subliminal conditions across groups. The results observed on the N2, showing group differences in subliminal emotion processing, suggest that decreased sensitivity to the reward value of social stimuli is a common feature both of people with ASD as well as people with high autistic traits from the normal population. PMID:28588465
Biometric recognition via texture features of eye movement trajectories in a visual searching task.
Li, Chunyong; Xue, Jiguo; Quan, Cheng; Yue, Jingwei; Zhang, Chenggang
2018-01-01
Biometric recognition technology based on eye-movement dynamics has been in development for more than ten years. Different visual tasks, feature extraction and feature recognition methods are proposed to improve the performance of eye movement biometric system. However, the correct identification and verification rates, especially in long-term experiments, as well as the effects of visual tasks and eye trackers’ temporal and spatial resolution are still the foremost considerations in eye movement biometrics. With a focus on these issues, we proposed a new visual searching task for eye movement data collection and a new class of eye movement features for biometric recognition. In order to demonstrate the improvement of this visual searching task being used in eye movement biometrics, three other eye movement feature extraction methods were also tested on our eye movement datasets. Compared with the original results, all three methods yielded better results as expected. In addition, the biometric performance of these four feature extraction methods was also compared using the equal error rate (EER) and Rank-1 identification rate (Rank-1 IR), and the texture features introduced in this paper were ultimately shown to offer some advantages with regard to long-term stability and robustness over time and spatial precision. Finally, the results of different combinations of these methods with a score-level fusion method indicated that multi-biometric methods perform better in most cases. PMID:29617383
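The equal error rate (EER) used in this abstract as a performance measure can be illustrated with a minimal sketch. It assumes that higher match scores mean a more likely genuine match and approximates the EER at the candidate threshold where the false acceptance rate (FAR) and false rejection rate (FRR) are closest. The function names and the scores are hypothetical and do not come from the paper.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR when scores at or above the threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by scanning every observed score as a threshold.

    Returns (eer, threshold), where eer is the mean of FAR and FRR at the
    threshold that minimises their absolute difference.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical match scores, for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
```

With these toy scores, one impostor is accepted and one genuine attempt is rejected at the chosen threshold, so both error rates are 25%. A lower EER indicates better separation between genuine and impostor scores.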
Kriete, A; Schäffer, R; Harms, H; Aus, H M
1987-06-01
Nuclei of the cells from the thyroid gland were analyzed in a transmission electron microscope by direct TV scanning and on-line image processing. The method uses the advantages of a visual-perception model to detect structures in noisy and low-contrast images. The features analyzed include area, a form factor and texture parameters from the second derivative stage. Three tumor-free thyroid tissues, three follicular adenomas, three follicular carcinomas and three papillary carcinomas were studied. The computer-aided cytophotometric method showed that the most significant differences were the statistics of the chromatin texture features of homogeneity and regularity. These findings document the possibility of an automated differentiation of tumors at the ultrastructural level.
Graewe, Britta; De Weerd, Peter; Farivar, Reza; Castelo-Branco, Miguel
2012-01-01
Many studies have linked the processing of different object categories to specific event-related potentials (ERPs) such as the face-specific N170. Despite reports showing that object-related ERPs are influenced by visual stimulus features, there is consensus that these components primarily reflect categorical aspects of the stimuli. Here, we re-investigated this idea by systematically measuring the effects of visual feature manipulations on ERP responses elicited by both structure-from-motion (SFM)-defined and luminance-defined object stimuli. SFM objects elicited a novel component at 200–250 ms (N250) over parietal and posterior temporal sites. We found, however, that the N250 amplitude was unaffected by restructuring SFM stimuli into meaningless objects based on identical visual cues. This suggests that this N250 peak was not uniquely linked to categorical aspects of the objects, but is strongly determined by visual stimulus features. We provide strong support for this hypothesis by parametrically manipulating the depth range of both SFM- and luminance-defined object stimuli and showing that the N250 evoked by SFM stimuli as well as the well-known N170 to static faces were sensitive to this manipulation. Importantly, this effect could not be attributed to compromised object categorization in low depth stimuli, confirming a strong impact of visual stimulus features on object-related ERP signals. As ERP components linked with visual categorical object perception are likely determined by multiple stimulus features, this creates an interesting inverse problem when deriving specific perceptual processes from variations in ERP components. PMID:22363479
ERIC Educational Resources Information Center
Aslan, Ummuhan Bas; Calik, Bilge Basakci; Kitis, Ali
2012-01-01
This study was planned in order to determine physical activity levels of visually impaired children and adolescents and to investigate the effect of gender and level of vision on physical activity level in visually impaired children and adolescents. A total of 30 visually impaired children and adolescents (16 low vision and 14 blind) aged between…
Máthé, Koppány; Buşoniu, Lucian
2015-01-01
Unmanned aerial vehicles (UAVs) have gained significant attention in recent years. Low-cost platforms using inexpensive sensor payloads have been shown to provide satisfactory flight and navigation capabilities. In this report, we survey vision and control methods that can be applied to low-cost UAVs, and we list some popular inexpensive platforms and application fields where they are useful. We also highlight the sensor suites used where this information is available. We overview, among others, feature detection and tracking, optical flow and visual servoing, low-level stabilization and high-level planning methods. We then list popular low-cost UAVs, selecting mainly quadrotors. We discuss applications, restricting our focus to the field of infrastructure inspection. Finally, as an example, we formulate two use-cases for railway inspection, a less explored application field, and illustrate the usage of the vision and control techniques reviewed by selecting appropriate ones to tackle these use-cases. To select vision methods, we run a thorough set of experimental evaluations. PMID:26121608
Coding of visual object features and feature conjunctions in the human brain.
Martinovic, Jasna; Gruber, Thomas; Müller, Matthias M
2008-01-01
Object recognition is achieved through neural mechanisms reliant on the activity of distributed coordinated neural assemblies. In the initial steps of this process, an object's features are thought to be coded very rapidly in distinct neural assemblies. These features play different functional roles in the recognition process--while colour facilitates recognition, additional contours and edges delay it. Here, we selectively varied the amount and role of object features in an entry-level categorization paradigm and related them to the electrical activity of the human brain. We found that early synchronizations (approx. 100 ms) increased quantitatively when more image features had to be coded, without reflecting their qualitative contribution to the recognition process. Later activity (approx. 200-400 ms) was modulated by the representational role of object features. These findings demonstrate that although early synchronizations may be sufficient for relatively crude discrimination of objects in visual scenes, they cannot support entry-level categorization. This was subserved by later processes of object model selection, which utilized the representational value of object features such as colour or edges to select the appropriate model and achieve identification.
Gaze-independent brain-computer interfaces based on covert attention and feature attention
NASA Astrophysics Data System (ADS)
Treder, M. S.; Schmidt, N. M.; Blankertz, B.
2011-10-01
There is evidence that conventional visual brain-computer interfaces (BCIs) based on event-related potentials cannot be operated efficiently when eye movements are not allowed. To overcome this limitation, the aim of this study was to develop a visual speller that does not require eye movements. Three different variants of a two-stage visual speller based on covert spatial attention and non-spatial feature attention (i.e. attention to colour and form) were tested in an online experiment with 13 healthy participants. All participants achieved highly accurate BCI control. They could select one out of thirty symbols (chance level 3.3%) with mean accuracies of 88%-97% for the different spellers. The best results were obtained for a speller that was operated using non-spatial feature attention only. These results show that, using feature attention, it is possible to realize high-accuracy, fast-paced visual spellers that have a large vocabulary and are independent of eye gaze.
Coupled binary embedding for large-scale image retrieval.
Zheng, Liang; Wang, Shengjin; Tian, Qi
2014-08-01
Visual matching is a crucial step in image retrieval based on the bag-of-words (BoW) model. In the baseline method, two keypoints are considered as a matching pair if their SIFT descriptors are quantized to the same visual word. However, the SIFT visual word has two limitations. First, it loses most of its discriminative power during quantization. Second, SIFT only describes the local texture feature. Both drawbacks impair the discriminative power of the BoW model and lead to false positive matches. To tackle this problem, this paper proposes to embed multiple binary features at indexing level. To model correlation between features, a multi-IDF scheme is introduced, through which different binary features are coupled into the inverted file. We show that matching verification methods based on binary features, such as Hamming embedding, can be effectively incorporated in our framework. As an extension, we explore the fusion of binary color feature into image retrieval. The joint integration of the SIFT visual word and binary features greatly enhances the precision of visual matching, reducing the impact of false positive matches. Our method is evaluated through extensive experiments on four benchmark datasets (Ukbench, Holidays, DupImage, and MIR Flickr 1M). We show that our method significantly improves the baseline approach. In addition, large-scale experiments indicate that the proposed method requires acceptable memory usage and query time compared with other approaches. Further, when global color feature is integrated, our method yields performance competitive with the state of the art.
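The verification idea in this abstract can be illustrated with a minimal sketch: two keypoints count as a match only if they share a visual word and their binary signatures are close in Hamming distance. This is not the authors' implementation; the function names, the 8-bit signatures, and the threshold of 2 are assumptions chosen for illustration.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary signatures stored as ints."""
    return bin(a ^ b).count("1")


def is_match(word_a: int, sig_a: int, word_b: int, sig_b: int, max_dist: int = 2) -> bool:
    """Baseline BoW test (same visual word) plus a binary-signature check."""
    return word_a == word_b and hamming_distance(sig_a, sig_b) <= max_dist


# Hypothetical keypoints: (visual word id, 8-bit binary signature).
query = (17, 0b10110010)
candidates = [(17, 0b10110011), (17, 0b01001101), (42, 0b10110010)]

# Only the first candidate shares the word AND has a close signature.
matches = [c for c in candidates if is_match(*query, *c)]
```

The second candidate would be accepted by the baseline rule (same visual word) but is rejected here, which is the kind of false positive the binary check is meant to remove.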
Universal and adapted vocabularies for generic visual categorization.
Perronnin, Florent
2008-07-01
Generic Visual Categorization (GVC) is the pattern classification problem which consists in assigning labels to an image based on its semantic content. This is a challenging task as one has to deal with inherent object/scene variations as well as changes in viewpoint, lighting and occlusion. Several state-of-the-art GVC systems use a vocabulary of visual terms to characterize images with a histogram of visual word counts. We propose a novel practical approach to GVC based on a universal vocabulary, which describes the content of all the considered classes of images, and class vocabularies obtained through the adaptation of the universal vocabulary using class-specific data. The main novelty is that an image is characterized by a set of histograms - one per class - where each histogram describes whether the image content is best modeled by the universal vocabulary or the corresponding class vocabulary. This framework is applied to two types of local image features: low-level descriptors such as the popular SIFT and high-level histograms of word co-occurrences in a spatial neighborhood. It is shown experimentally on two challenging datasets (an in-house database of 19 categories and the PASCAL VOC 2006 dataset) that the proposed approach exhibits state-of-the-art performance at a modest computational cost.
Effect of unilateral exercises on low back pain in an urban driver
Yoo, Won-gyu
2016-01-01
[Purpose] This study aimed to develop unilateral exercises for urban drivers and investigate the effect of these exercises on low back pain (LBP). [Subject and Methods] A 40-year-old male driver, who complained of LBP on the left side at L3–5 levels, participated in this study. A two-session program was conducted, and LBP, pelvic tilt angle, and trunk range of motion were measured after each session. [Results] After the unilateral exercises, the anterior pelvic tilt angle was improved and the visual analog scale score of back pain decreased. [Conclusion] Analyzing car features and performing individual approaches are necessary in providing treatment for urban drivers with LBP. PMID:27942161
Computerized image analysis: estimation of breast density on mammograms
NASA Astrophysics Data System (ADS)
Zhou, Chuan; Chan, Heang-Ping; Petrick, Nicholas; Sahiner, Berkman; Helvie, Mark A.; Roubidoux, Marilyn A.; Hadjiiski, Lubomir M.; Goodsitt, Mitchell M.
2000-06-01
An automated image analysis tool is being developed for estimation of mammographic breast density, which may be useful for risk estimation or for monitoring breast density change in a prevention or intervention program. A mammogram is digitized using a laser scanner and the resolution is reduced to a pixel size of 0.8 mm × 0.8 mm. Breast density analysis is performed in three stages. First, the breast region is segmented from the surrounding background by an automated breast boundary-tracking algorithm. Second, an adaptive dynamic range compression technique is applied to the breast image to reduce the range of the gray level distribution in the low frequency background and to enhance the differences in the characteristic features of the gray level histogram for breasts of different densities. Third, rule-based classification is used to classify the breast images into several classes according to the characteristic features of their gray level histogram. For each image, a gray level threshold is automatically determined to segment the dense tissue from the breast region. The area of segmented dense tissue as a percentage of the breast area is then estimated. In this preliminary study, we analyzed the interobserver variation of breast density estimation by two experienced radiologists using the BI-RADS lexicon. The radiologists' visually estimated percent breast densities were compared with the computer's calculation. The results demonstrate the feasibility of estimating mammographic breast density using computer vision techniques and its potential to improve the accuracy and reproducibility in comparison with the subjective visual assessment by radiologists.
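The final step described above, dense area as a percentage of breast area, can be sketched as follows. This is a simplified illustration under stated assumptions: the threshold is passed in rather than derived from the histogram, dense tissue is assumed to be the brighter pixels, and the names `percent_density` and `breast_pixels` are hypothetical.

```python
def percent_density(breast_pixels, threshold):
    """Dense-tissue area as a percentage of the segmented breast area.

    breast_pixels: gray levels of pixels inside the breast region only
                   (background already removed by segmentation).
    threshold:     gray level at or above which a pixel is called dense
                   (assumption: dense tissue appears brighter).
    """
    if not breast_pixels:
        raise ValueError("breast region is empty")
    dense = sum(1 for g in breast_pixels if g >= threshold)
    return 100.0 * dense / len(breast_pixels)


# Hypothetical gray levels for a tiny 8-pixel breast region.
region = [30, 45, 60, 200, 210, 55, 190, 40]
density = percent_density(region, threshold=128)
```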
NASA Astrophysics Data System (ADS)
Asiedu, Mercy Nyamewaa; Simhal, Anish; Lam, Christopher T.; Mueller, Jenna; Chaudhary, Usamah; Schmitt, John W.; Sapiro, Guillermo; Ramanujam, Nimmi
2018-02-01
The World Health Organization recommends visual inspection with acetic acid (VIA) and/or Lugol's Iodine (VILI) for cervical cancer screening in low-resource settings. Human interpretation of diagnostic indicators for visual inspection is qualitative, subjective, and has high inter-observer discordance, which could lead both to adverse outcomes for the patient and unnecessary follow-ups. In this work, we present a simple method for automatic feature extraction and classification for Lugol's Iodine cervigrams acquired with a low-cost, miniature, digital colposcope. Algorithms to preprocess expert physician-labelled cervigrams and to extract simple but powerful color-based features are introduced. The features are used to train a support vector machine model to classify cervigrams based on expert physician labels. The selected framework achieved a sensitivity, specificity, and accuracy of 89.2%, 66.7% and 80.6% with majority diagnosis of the expert physicians in discriminating cervical intraepithelial neoplasia (CIN +) relative to normal tissues. The proposed classifier also achieved an area under the curve of 84 when trained with majority diagnosis of the expert physicians. The results suggest that utilizing simple color-based features may enable unbiased automation of VILI cervigrams, opening the door to a full system of low-cost data acquisition complemented with automatic interpretation.
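For readers comparing the three figures reported above, sensitivity, specificity, and accuracy all derive from the same confusion-matrix counts. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data, and the function name is an assumption.

```python
def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),              # positives correctly flagged
        "specificity": tn / (tn + fp),              # negatives correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for illustration only.
m = screening_metrics(tp=8, fn=2, tn=6, fp=4)
```

Because accuracy mixes both classes, it always lies between sensitivity and specificity, weighted by how many positive and negative cases there are.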
Assessment of rural soundscapes with high-speed train noise.
Lee, Pyoung Jik; Hong, Joo Young; Jeon, Jin Yong
2014-06-01
In the present study, rural soundscapes with high-speed train noise were assessed through laboratory experiments. A total of ten sites with varying landscape metrics were chosen for audio-visual recording. The acoustical characteristics of the high-speed train noise were analyzed using various noise level indices. Landscape metrics such as the percentage of natural features (NF) and Shannon's diversity index (SHDI) were adopted to evaluate the landscape features of the ten sites. Laboratory experiments were then performed with 20 well-trained listeners to investigate the perception of high-speed train noise in rural areas. The experiments consisted of three parts: 1) visual-only condition, 2) audio-only condition, and 3) combined audio-visual condition. The results showed that subjects' preference for visual images was significantly related to NF, the number of land types, and the A-weighted equivalent sound pressure level (LAeq). In addition, the visual images significantly influenced the noise annoyance, and LAeq and NF were the dominant factors affecting the annoyance from high-speed train noise in the combined audio-visual condition. In addition, Zwicker's loudness (N) was highly correlated with the annoyance from high-speed train noise in both the audio-only and audio-visual conditions. © 2013.
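Shannon's diversity index, one of the landscape metrics named above, is conventionally computed as the negative sum of p_i ln p_i over the land-type proportions p_i. The sketch below uses that standard formula; the proportions are hypothetical, and whether the study used natural logarithms is an assumption.

```python
import math


def shannon_diversity(proportions):
    """SHDI = -sum(p_i * ln(p_i)) over land-type area proportions."""
    return -sum(p * math.log(p) for p in proportions if p > 0)


# Hypothetical sites with four land types each.
even = shannon_diversity([0.25, 0.25, 0.25, 0.25])       # evenly mixed
dominated = shannon_diversity([0.85, 0.05, 0.05, 0.05])  # one type dominates
```

The evenly mixed site reaches the maximum for four types, ln 4, while the dominated site scores lower.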
Buchs, Galit; Maidenbaum, Shachar; Levy-Tzedek, Shelly; Amedi, Amir
2015-01-01
Purpose: To visually perceive our surroundings we constantly move our eyes and focus on particular details, and then integrate them into a combined whole. Current visual rehabilitation methods, both invasive, like bionic eyes, and non-invasive, like Sensory Substitution Devices (SSDs), down-sample visual stimuli into low-resolution images. Zooming-in to sub-parts of the scene could potentially improve detail perception. Can congenitally blind individuals integrate a ‘visual’ scene when offered this information via different sensory modalities, such as audition? Can they integrate visual information, perceived in parts, into larger percepts despite never having had any visual experience? Methods: We explored these questions using a zooming-in functionality embedded in the EyeMusic visual-to-auditory SSD. Eight blind participants were tasked with identifying cartoon faces by integrating their individual components recognized via the EyeMusic’s zooming mechanism. Results: After specialized training of just 6–10 hours, blind participants successfully and actively integrated facial features into cartooned identities in 79±18% of the trials, significantly above chance (chance level 10%; rank-sum P < 1.55E-04). Conclusions: These findings show that even users who lacked any previous visual experience whatsoever can indeed integrate this visual information with increased resolution. This potentially has important practical visual rehabilitation implications for both invasive and non-invasive methods. PMID:26518671
NASA Technical Reports Server (NTRS)
Sweet, Barbara T.; Kaiser, Mary K.
2013-01-01
Although current-technology simulator visual systems can achieve extremely realistic levels, they do not completely replicate the experience of a pilot sitting in the cockpit, looking at the outside world. Some differences in experience are due to visual artifacts, or perceptual features that would not be present in a naturally viewed scene. Others are due to features that are missing from the simulated scene. In this paper, these differences will be defined and discussed. The significance of these differences will be examined as a function of several particular operational tasks. A framework to facilitate the choice of visual system characteristics based on operational task requirements will be proposed.
Werner, Sebastian; Noppeney, Uta
2010-08-01
Merging information from multiple senses provides a more reliable percept of our environment. Yet, little is known about where and how various sensory features are combined within the cortical hierarchy. Combining functional magnetic resonance imaging and psychophysics, we investigated the neural mechanisms underlying integration of audiovisual object features. Subjects categorized or passively perceived audiovisual object stimuli with the informativeness (i.e., degradation) of the auditory and visual modalities being manipulated factorially. Controlling for low-level integration processes, we show higher level audiovisual integration selectively in the superior temporal sulci (STS) bilaterally. The multisensory interactions were primarily subadditive and even suppressive for intact stimuli but turned into additive effects for degraded stimuli. Consistent with the inverse effectiveness principle, auditory and visual informativeness determine the profile of audiovisual integration in STS similarly to the influence of physical stimulus intensity in the superior colliculus. Importantly, when holding stimulus degradation constant, subjects' audiovisual behavioral benefit predicts their multisensory integration profile in STS: only subjects that benefit from multisensory integration exhibit superadditive interactions, while those that do not benefit show suppressive interactions. In conclusion, superadditive and subadditive integration profiles in STS are functionally relevant and related to behavioral indices of multisensory integration with superadditive interactions mediating successful audiovisual object categorization.
Seeing the invisible: The scope and limits of unconscious processing in binocular rivalry
Lin, Zhicheng; He, Sheng
2009-01-01
When an image is presented to one eye and a very different image is presented to the corresponding location of the other eye, they compete for perceptual dominance, such that only one image is visible at a time while the other is suppressed. Called binocular rivalry, this phenomenon and its deviants have been extensively exploited to study the mechanism and neural correlates of consciousness. In this paper, we propose a framework, the unconscious binding hypothesis, to distinguish unconscious and conscious processing. According to this framework, the unconscious mind not only encodes individual features but also temporally binds distributed features to give rise to cortical representation, but unlike conscious binding, such unconscious binding is fragile. Under this framework, we review evidence from psychophysical and neuroimaging studies, which suggests that: (1) for invisible low level features, prolonged exposure to visual pattern and simple translational motion can alter the appearance of subsequent visible features (i.e. adaptation); for invisible high level features, although complex spiral motion cannot produce adaptation, nor can objects/words enhance subsequent processing of related stimuli (i.e. priming), images of objects such as tools can nevertheless activate the dorsal pathway; and (2) although invisible central cues cannot orient attention, invisible erotic pictures in the periphery can nevertheless guide attention, likely through emotional arousal; reciprocally, the processing of invisible information can be modulated by attention at perceptual and neural levels. PMID:18824061
Visual Search Across the Life Span
ERIC Educational Resources Information Center
Hommel, Bernhard; Li, Karen Z. H.; Li, Shu-Chen
2004-01-01
Gains and losses in visual search were studied across the life span in a representative sample of 298 individuals from 6 to 89 years of age. Participants searched for single-feature and conjunction targets of high or low eccentricity. Search was substantially slowed early and late in life, age gradients were more pronounced in conjunction than in…
Bag of Lines (BoL) for Improved Aerial Scene Representation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sridharan, Harini; Cheriyadat, Anil M.
2014-09-22
Feature representation is a key step in automated visual content interpretation. In this letter, we present a robust feature representation technique, referred to as bag of lines (BoL), for high-resolution aerial scenes. The proposed technique involves extracting and compactly representing low-level line primitives from the scene. The compact scene representation is generated by counting the different types of lines representing various linear structures in the scene. Through extensive experiments, we show that the proposed scene representation is invariant to scale changes and scene conditions and can discriminate urban scene categories accurately. We compare the BoL representation with the popular scale invariant feature transform (SIFT) and Gabor wavelets for their classification and clustering performance on an aerial scene database consisting of images acquired by sensors with different spatial resolutions. The proposed BoL representation outperforms the SIFT- and Gabor-based representations.
Example-Based Image Colorization Using Locality Consistent Sparse Representation.
Bo Li; Fuchen Zhao; Zhuo Su; Xiangguo Liang; Yu-Kun Lai; Rosin, Paul L
2017-11-01
Image colorization aims to produce a natural looking color image from a given gray-scale image, which remains a challenging problem. In this paper, we propose a novel example-based image colorization method exploiting a new locality consistent sparse representation. Given a single reference color image, our method automatically colorizes the target gray-scale image by sparse pursuit. For efficiency and robustness, our method operates at the superpixel level. We extract low-level intensity features, mid-level texture features, and high-level semantic features for each superpixel, which are then concatenated to form its descriptor. The collection of feature vectors for all the superpixels from the reference image composes the dictionary. We formulate colorization of target superpixels as a dictionary-based sparse reconstruction problem. Inspired by the observation that superpixels with similar spatial location and/or feature representation are likely to match spatially close regions from the reference image, we further introduce a locality promoting regularization term into the energy formulation, which substantially improves the matching consistency and subsequent colorization results. Target superpixels are colorized based on the chrominance information from the dominant reference superpixels. Finally, to further improve coherence while preserving sharpness, we develop a new edge-preserving filter for chrominance channels with the guidance from the target gray-scale image. To the best of our knowledge, this is the first work on sparse pursuit image colorization from single reference images. Experimental results demonstrate that our colorization method outperforms the state-of-the-art methods, both visually and quantitatively using a user study.
Face features and face configurations both contribute to visual crowding.
Sun, Hsin-Mei; Balas, Benjamin
2015-02-01
Crowding refers to the inability to recognize an object in peripheral vision when other objects are presented nearby (Whitney & Levi Trends in Cognitive Sciences, 15, 160-168, 2011). A popular explanation of crowding is that features of the target and flankers are combined inappropriately when they are located within an integration field, thus impairing target recognition (Pelli, Palomares, & Majaj Journal of Vision, 4(12), 12:1136-1169, 2004). However, it remains unclear which features of the target and flankers are combined inappropriately to cause crowding (Levi Vision Research, 48, 635-654, 2008). For example, in a complex stimulus (e.g., a face), to what extent does crowding result from the integration of features at a part-based level or at the level of global processing of the configural appearance? In this study, we used a face categorization task and different types of flankers to examine how much the magnitude of visual crowding depends on the similarity of face parts or of global configurations. We created flankers with face-like features (e.g., the eyes, nose, and mouth) in typical and scrambled configurations to examine the impacts of part appearance and global configuration on the visual crowding of faces. Additionally, we used "electrical socket" flankers that mimicked first-order face configuration but had only schematic features, to examine the extent to which global face geometry impacted crowding. Our results indicated that both face parts and configurations contribute to visual crowding, suggesting that face similarity as realized under crowded conditions includes both aspects of facial appearance.
Hantke, Simone; Weninger, Felix; Kurle, Richard; Ringeval, Fabien; Batliner, Anton; Mousa, Amr El-Desoky; Schuller, Björn
2016-01-01
We propose a new recognition task in the area of computational paralinguistics: automatic recognition of eating conditions in speech, i. e., whether people are eating while speaking, and what they are eating. To this end, we introduce the audio-visual iHEARu-EAT database featuring 1.6 k utterances of 30 subjects (mean age: 26.1 years, standard deviation: 2.66 years, gender balanced, German speakers), six types of food (Apple, Nectarine, Banana, Haribo Smurfs, Biscuit, and Crisps), and read as well as spontaneous speech, which is made publicly available for research purposes. We start with demonstrating that for automatic speech recognition (ASR), it pays off to know whether speakers are eating or not. We also propose automatic classification both by brute-forcing of low-level acoustic features as well as higher-level features related to intelligibility, obtained from an Automatic Speech Recogniser. Prediction of the eating condition was performed with a Support Vector Machine (SVM) classifier employed in a leave-one-speaker-out evaluation framework. Results show that the binary prediction of eating condition (i. e., eating or not eating) can be easily solved independently of the speaking condition; the obtained average recalls are all above 90%. Low-level acoustic features provide the best performance on spontaneous speech, which reaches up to 62.3% average recall for multi-way classification of the eating condition, i. e., discriminating the six types of food, as well as not eating. The early fusion of features related to intelligibility with the brute-forced acoustic feature set improves the performance on read speech, reaching a 66.4% average recall for the multi-way classification task. Analysing features and classifier errors leads to a suitable ordinal scale for eating conditions, on which automatic regression can be performed with up to 56.2% determination coefficient. PMID:27176486
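The "average recall" figures above are easier to interpret with a small worked example. The sketch below assumes the measure is the unweighted mean of per-class recalls, which is common in computational paralinguistics but is not spelled out in the abstract; the labels and the function name are hypothetical.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, each class weighted equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical, imbalanced labels: four "none" utterances, two "apple".
y_true = ["none", "none", "none", "none", "apple", "apple"]
y_pred = ["none", "none", "none", "none", "apple", "banana"]
uar = unweighted_average_recall(y_true, y_pred)
```

Here plain accuracy would be 5/6, while the unweighted average recall is (1.0 + 0.5) / 2, so the measure does not reward a classifier for favouring the larger class.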
LRRTM1 underlies synaptic convergence in visual thalamus
Monavarfeshani, Aboozar; Stanton, Gail; Van Name, Jonathan; Su, Kaiwen; Mills, William A; Swilling, Kenya; Kerr, Alicia; Huebschman, Natalie A; Su, Jianmin
2018-01-01
It has long been thought that the mammalian visual system is organized into parallel pathways, with incoming visual signals being parsed in the retina based on feature (e.g. color, contrast and motion) and then transmitted to the brain in unmixed, feature-specific channels. To faithfully convey feature-specific information from retina to cortex, thalamic relay cells must receive inputs from only a small number of functionally similar retinal ganglion cells. However, recent studies challenged this by revealing substantial levels of retinal convergence onto relay cells. Here, we sought to identify mechanisms responsible for the assembly of such convergence. Using an unbiased transcriptomics approach and targeted mutant mice, we discovered a critical role for the synaptic adhesion molecule Leucine Rich Repeat Transmembrane Neuronal 1 (LRRTM1) in the emergence of retinothalamic convergence. Importantly, LRRTM1 mutant mice display impairment in visual behaviors, suggesting a functional role of retinothalamic convergence in vision. PMID:29424692
ERIC Educational Resources Information Center
Kelly, Resa M.
2014-01-01
Molecular visualizations have been widely endorsed by many chemical educators as an efficient way to convey the dynamic and atomic-level details of chemistry events. Research indicates that students who use molecular visualizations are able to incorporate most of the intended features of the animations into their explanations. However, studies…
The Forest, the Trees, and the Leaves: Differences of Processing across Development
ERIC Educational Resources Information Center
Krakowski, Claire-Sara; Poirel, Nicolas; Vidal, Julie; Roëll, Margot; Pineau, Arlette; Borst, Grégoire; Houdé, Olivier
2016-01-01
To act and think, children and adults are continually required to ignore irrelevant visual information to focus on task-relevant items. As real-world visual information is organized into structures, we designed a feature visual search task containing 3-level hierarchical stimuli (i.e., local shapes that constituted intermediate shapes that formed…
Fault Diagnosis for Rolling Bearings under Variable Conditions Based on Visual Cognition
Cheng, Yujie; Zhou, Bo; Lu, Chen; Yang, Chao
2017-01-01
Fault diagnosis for rolling bearings has attracted increasing attention in recent years. However, few studies have focused on fault diagnosis for rolling bearings under variable conditions. This paper introduces a fault diagnosis method for rolling bearings under variable conditions based on visual cognition. The proposed method includes the following steps. First, the vibration signal data are transformed into a recurrence plot (RP), which is a two-dimensional image. Then, inspired by the visual invariance characteristic of the human visual system (HVS), we utilize speeded-up robust features (SURF) to extract fault features from the two-dimensional RP and generate a 64-dimensional feature vector, which is invariant to image translation, rotation, scaling variation, etc. Third, based on the manifold perception characteristic of HVS, isometric mapping, a manifold learning method that can reflect the intrinsic manifold embedded in the high-dimensional space, is employed to obtain a low-dimensional feature vector. Finally, a classical classification method, support vector machine, is utilized to realize fault diagnosis. Verification data were collected from Case Western Reserve University Bearing Data Center, and the experimental result indicates that the proposed fault diagnosis method based on visual cognition is highly effective for rolling bearings under variable conditions, thus providing a promising approach from the cognitive computing field. PMID:28772943
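The first step of the pipeline above, turning a one-dimensional signal into a two-dimensional recurrence plot, can be sketched in a few lines. This is a simplified illustration, not the paper's procedure: it compares raw samples directly without phase-space embedding, and the tolerance `eps` and the sample values are assumptions.

```python
def recurrence_plot(signal, eps):
    """Binary matrix R where R[i][j] = 1 if samples i and j are within eps."""
    n = len(signal)
    return [
        [1 if abs(signal[i] - signal[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical signal alternating between two levels.
rp = recurrence_plot([0.0, 1.0, 0.1, 1.1], eps=0.2)
```

The resulting matrix is symmetric with ones on the diagonal, and the alternating signal produces a checkerboard pattern; it is this kind of image texture that an image feature extractor would then describe.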
Mid-level perceptual features contain early cues to animacy.
Long, Bria; Störmer, Viola S; Alvarez, George A
2017-06-01
While substantial work has focused on how the visual system achieves basic-level recognition, less work has asked about how it supports large-scale distinctions between objects, such as animacy and real-world size. Previous work has shown that these dimensions are reflected in our neural object representations (Konkle & Caramazza, 2013), and that objects of different real-world sizes have different mid-level perceptual features (Long, Konkle, Cohen, & Alvarez, 2016). Here, we test the hypothesis that animates and manmade objects also differ in mid-level perceptual features. To do so, we generated synthetic images of animals and objects that preserve some texture and form information ("texforms"), but are not identifiable at the basic level. We used visual search efficiency as an index of perceptual similarity, as search is slower when targets are perceptually similar to distractors. Across three experiments, we find that observers can find animals faster among objects than among other animals, and vice versa, and that these results hold when stimuli are reduced to unrecognizable texforms. Electrophysiological evidence revealed that this mixed-animacy search advantage emerges during early stages of target individuation, and not during later stages associated with semantic processing. Lastly, we find that perceived curvature explains part of the mixed-animacy search advantage and that observers use perceived curvature to classify texforms as animate/inanimate. Taken together, these findings suggest that mid-level perceptual features, including curvature, contain cues to whether an object may be animate versus manmade. We propose that the visual system capitalizes on these early cues to facilitate object detection, recognition, and classification.
Application of multispectral photography to mineral and land resources of South Carolina
NASA Technical Reports Server (NTRS)
Olson, N. K. (Principal Investigator)
1975-01-01
The author has identified the following significant results. Good results were obtained from using Skylab photography in conjunction with LANDSAT imagery for visual interpretation of various geologic features, particularly lineaments. It was concluded that visual interpretation alone of Skylab photographs was quite limited, and much of this was because of the low contrast, heavily vegetated terrain in southeastern United States. Lineaments of major structural features are detectable but subtle. An intimate knowledge of the geologic field relationships is needed before a meaningful analysis is feasible using current satellite photography alone.
Content-Aware Video Adaptation under Low-Bitrate Constraint
NASA Astrophysics Data System (ADS)
Hsiao, Ming-Ho; Chen, Yi-Wen; Chen, Hua-Tsung; Chou, Kuan-Hung; Lee, Suh-Yin
2007-12-01
With the development of wireless network and the improvement of mobile device capability, video streaming is more and more widespread in such an environment. Under the condition of limited resource and inherent constraints, appropriate video adaptations have become one of the most important and challenging issues in wireless multimedia applications. In this paper, we propose a novel content-aware video adaptation in order to effectively utilize resource and improve visual perceptual quality. First, the attention model is derived from analyzing the characteristics of brightness, location, motion vector, and energy features in compressed domain to reduce computation complexity. Then, through the integration of attention model, capability of client device and correlational statistic model, attractive regions of video scenes are derived. The information object- (IOB-) weighted rate distortion model is used for adjusting the bit allocation. Finally, the video adaptation scheme dynamically adjusts video bitstream in frame level and object level. Experimental results validate that the proposed scheme achieves better visual quality effectively and efficiently.
Experience improves feature extraction in Drosophila.
Peng, Yueqing; Xi, Wang; Zhang, Wei; Zhang, Ke; Guo, Aike
2007-05-09
Previous exposure to a pattern in the visual scene can enhance subsequent recognition of that pattern in many species from honeybees to humans. However, whether previous experience with a visual feature of an object, such as color or shape, can also facilitate later recognition of that particular feature from multiple visual features is largely unknown. Visual feature extraction is the ability to select the key component from multiple visual features. Using a visual flight simulator, we designed a novel protocol for visual feature extraction to investigate the effects of previous experience on visual reinforcement learning in Drosophila. We found that, after conditioning with a visual feature of objects among combinatorial shape-color features, wild-type flies exhibited poor ability to extract the correct visual feature. However, the ability for visual feature extraction was greatly enhanced in flies trained previously with that visual feature alone. Moreover, we demonstrated that flies might possess the ability to extract the abstract category of "shape" but not a particular shape. Finally, this experience-dependent feature extraction is absent in flies with defective mushroom bodies (MBs), one of the central brain structures in Drosophila. Our results indicate that previous experience can enhance visual feature extraction in Drosophila and that MBs are required for this experience-dependent visual cognition.
Bartsch, Mandy V; Loewe, Kristian; Merkel, Christian; Heinze, Hans-Jochen; Schoenfeld, Mircea A; Tsotsos, John K; Hopf, Jens-Max
2017-10-25
Attention can facilitate the selection of elementary object features such as color, orientation, or motion. This is referred to as feature-based attention and it is commonly attributed to a modulation of the gain and tuning of feature-selective units in visual cortex. Although gain mechanisms are well characterized, little is known about the cortical processes underlying the sharpening of feature selectivity. Here, we show with high-resolution magnetoencephalography in human observers (men and women) that sharpened selectivity for a particular color arises from feedback processing in the human visual cortex hierarchy. To assess color selectivity, we analyze the response to a color probe that varies in color distance from an attended color target. We find that attention causes an initial gain enhancement in anterior ventral extrastriate cortex that is coarsely selective for the target color and transitions within ∼100 ms into a sharper tuned profile in more posterior ventral occipital cortex. We conclude that attention sharpens selectivity over time by attenuating the response at lower levels of the cortical hierarchy to color values neighboring the target in color space. These observations support computational models proposing that attention tunes feature selectivity in visual cortex through backward-propagating attenuation of units less tuned to the target. SIGNIFICANCE STATEMENT Whether searching for your car, a particular item of clothing, or just obeying traffic lights, in everyday life, we must select items based on color. But how does attention allow us to select a specific color? Here, we use high spatiotemporal resolution neuromagnetic recordings to examine how color selectivity emerges in the human brain. We find that color selectivity evolves as a coarse to fine process from higher to lower levels within the visual cortex hierarchy. 
Our observations support computational models proposing that feature selectivity increases over time by attenuating the responses of less-selective cells in lower-level brain areas. These data emphasize that color perception involves multiple areas across a hierarchy of regions, interacting with each other in a complex, recursive manner.
Meijer, Guido T; Montijn, Jorrit S; Pennartz, Cyriel M A; Lansink, Carien S
2017-09-06
The sensory neocortex is a highly connected associative network that integrates information from multiple senses, even at the level of the primary sensory areas. Although a growing body of empirical evidence supports this view, the neural mechanisms of cross-modal integration in primary sensory areas, such as the primary visual cortex (V1), are still largely unknown. Using two-photon calcium imaging in awake mice, we show that the encoding of audiovisual stimuli in V1 neuronal populations is highly dependent on the features of the stimulus constituents. When the visual and auditory stimulus features were modulated at the same rate (i.e., temporally congruent), neurons responded with either an enhancement or suppression compared with unisensory visual stimuli, and their prevalence was balanced. Temporally incongruent tones or white-noise bursts included in audiovisual stimulus pairs resulted in predominant response suppression across the neuronal population. Visual contrast did not influence multisensory processing when the audiovisual stimulus pairs were congruent; however, when white-noise bursts were used, neurons generally showed response suppression when the visual stimulus contrast was high whereas this effect was absent when the visual contrast was low. Furthermore, a small fraction of V1 neurons, predominantly those located near the lateral border of V1, responded to sound alone. These results show that V1 is involved in the encoding of cross-modal interactions in a more versatile way than previously thought. SIGNIFICANCE STATEMENT The neural substrate of cross-modal integration is not limited to specialized cortical association areas but extends to primary sensory areas. Using two-photon imaging of large groups of neurons, we show that multisensory modulation of V1 populations is strongly determined by the individual and shared features of cross-modal stimulus constituents, such as contrast, frequency, congruency, and temporal structure. 
Congruent audiovisual stimulation resulted in a balanced pattern of response enhancement and suppression compared with unisensory visual stimuli, whereas incongruent or dissimilar stimuli at full contrast gave rise to a population dominated by response-suppressing neurons. Our results indicate that V1 dynamically integrates nonvisual sources of information while still attributing most of its resources to coding visual information.
Cocchi, Luca; Sale, Martin V; L Gollo, Leonardo; Bell, Peter T; Nguyen, Vinh T; Zalesky, Andrew; Breakspear, Michael; Mattingley, Jason B
2016-01-01
Within the primate visual system, areas at lower levels of the cortical hierarchy process basic visual features, whereas those at higher levels, such as the frontal eye fields (FEF), are thought to modulate sensory processes via feedback connections. Despite these functional exchanges during perception, there is little shared activity between early and late visual regions at rest. How interactions emerge between regions encompassing distinct levels of the visual hierarchy remains unknown. Here we combined neuroimaging, non-invasive cortical stimulation and computational modelling to characterize changes in functional interactions across widespread neural networks before and after local inhibition of primary visual cortex or FEF. We found that stimulation of early visual cortex selectively increased feedforward interactions with FEF and extrastriate visual areas, whereas identical stimulation of the FEF decreased feedback interactions with early visual areas. Computational modelling suggests that these opposing effects reflect a fast-slow timescale hierarchy from sensory to association areas. DOI: http://dx.doi.org/10.7554/eLife.15252.001 PMID:27596931
A unified selection signal for attention and reward in primary visual cortex.
Stănişor, Liviu; van der Togt, Chris; Pennartz, Cyriel M A; Roelfsema, Pieter R
2013-05-28
Stimuli associated with high rewards evoke stronger neuronal activity than stimuli associated with lower rewards in many brain regions. It is not well understood how these reward effects influence activity in sensory cortices that represent low-level stimulus features. Here, we investigated the effects of reward information in the primary visual cortex (area V1) of monkeys. We found that the reward value of a stimulus relative to the value of other stimuli is a good predictor of V1 activity. Relative value biases the competition between stimuli, just as has been shown for selective attention. The neuronal latency of this reward value effect in V1 was similar to the latency of attentional influences. Moreover, V1 neurons with a strong value effect also exhibited a strong attention effect, which implies that relative value and top-down attention engage overlapping, if not identical, neuronal selection mechanisms. Our findings demonstrate that the effects of reward value reach down to the earliest sensory processing levels of the cerebral cortex and imply that theories about the effects of reward coding and top-down attention on visual representations should be unified.
Global Sensory Qualities and Aesthetic Experience in Music.
Brattico, Pauli; Brattico, Elvira; Vuust, Peter
2017-01-01
A well-known tradition in the study of visual aesthetics holds that the experience of visual beauty is grounded in global computational or statistical properties of the stimulus, for example, scale-invariant Fourier spectrum or self-similarity. Some approaches rely on neural mechanisms, such as efficient computation, processing fluency, or the responsiveness of the cells in the primary visual cortex. These proposals are united by the fact that the contributing factors are hypothesized to be global (i.e., they concern the percept as a whole), formal or non-conceptual (i.e., they concern form instead of content), computational and/or statistical, and based on relatively low-level sensory properties. Here we consider that the study of aesthetic responses to music could benefit from the same approach. Thus, along with local features such as pitch, tuning, consonance/dissonance, harmony, timbre, or beat, also global sonic properties could be viewed as contributing toward creating an aesthetic musical experience. Several such properties are discussed and their neural implementation is reviewed in the light of recent advances in neuroaesthetics.
Klink, P Christiaan; Dagnino, Bruno; Gariel-Mathis, Marie-Alice; Roelfsema, Pieter R
2017-07-05
The visual cortex is hierarchically organized, with low-level areas coding for simple features and higher areas for complex ones. Feedforward and feedback connections propagate information between areas in opposite directions, but their functional roles are only partially understood. We used electrical microstimulation to perturb the propagation of neuronal activity between areas V1 and V4 in monkeys performing a texture-segregation task. In both areas, microstimulation locally caused a brief phase of excitation, followed by inhibition. Both these effects propagated faithfully in the feedforward direction from V1 to V4. Stimulation of V4, however, caused little V1 excitation, but it did yield a delayed suppression during the late phase of visually driven activity. This suppression was pronounced for the V1 figure representation and weaker for background representations. Our results reveal functional differences between feedforward and feedback processing in texture segregation and suggest a specific modulating role for feedback connections in perceptual organization.
Hamker, Fred H
2008-07-15
Feature inheritance provides evidence that properties of an invisible target stimulus can be attached to a following mask. We apply a systems-level model of attention and decision making to explore the influence of memory and feedback connections in feature inheritance. We find that the presence of feedback loops alone is sufficient to account for feature inheritance. Although our simulations do not cover all experimental variations and focus only on the general principle, our result appears of specific interest since the model was designed for a completely different purpose than to explain feature inheritance. We suggest that feedback is an important property in visual perception and provide a description of its mechanism and its role in perception.
De Weerd, Peter; Reithler, Joel; van de Ven, Vincent; Been, Marin; Jacobs, Christianne; Sack, Alexander T
2012-02-08
Practice-induced improvements in skilled performance reflect "offline" consolidation processes extending beyond daily training sessions. According to visual learning theories, an early, fast learning phase driven by high-level areas is followed by a late, asymptotic learning phase driven by low-level, retinotopic areas when higher resolution is required. Thus, low-level areas would not contribute to learning and offline consolidation until late learning. Recent studies have challenged this notion, demonstrating modified responses to trained stimuli in primary visual cortex (V1) and offline activity after very limited training. However, the behavioral relevance of modified V1 activity for offline consolidation of visual skill memory in V1 after early training sessions remains unclear. Here, we used neuronavigated transcranial magnetic stimulation (TMS) directed to a trained retinotopic V1 location to test for behaviorally relevant consolidation in human low-level visual cortex. Applying TMS to the trained V1 location within 45 min of the first or second training session strongly interfered with learning, as measured by impaired performance the next day. The interference was conditional on task context and occurred only when training in the location targeted by TMS was followed by training in a second location before TMS. In this condition, high-level areas may become coupled to the second location and uncoupled from the previously trained low-level representation, thereby rendering consolidation vulnerable to interference. Our data show that, during the earliest phases of skill learning in the lowest-level visual areas, a behaviorally relevant form of consolidation exists of which the robustness is controlled by high-level, contextual factors.
Visualization of Multi-mission Astronomical Data with ESASky
NASA Astrophysics Data System (ADS)
Baines, Deborah; Giordano, Fabrizio; Racero, Elena; Salgado, Jesús; López Martí, Belén; Merín, Bruno; Sarmiento, María-Henar; Gutiérrez, Raúl; Ortiz de Landaluce, Iñaki; León, Ignacio; de Teodoro, Pilar; González, Juan; Nieto, Sara; Segovia, Juan Carlos; Pollock, Andy; Rosa, Michael; Arviset, Christophe; Lennon, Daniel; O'Mullane, William; de Marchi, Guido
2017-02-01
ESASky is a science-driven discovery portal to explore the multi-wavelength sky and visualize and access multiple astronomical archive holdings. The tool is a web application that requires no prior knowledge of any of the missions involved and gives users world-wide simplified access to the highest-level science data products from multiple space-based astronomy missions plus a number of ESA source catalogs. The first public release of ESASky features interfaces for the visualization of the sky in multiple wavelengths, the visualization of query results summaries, and the visualization of observations and catalog sources for single and multiple targets. This paper describes these features within ESASky, developed to address use cases from the scientific community. The decisions regarding the visualization of large amounts of data and the technologies used were made to maximize the responsiveness of the application and to keep the tool as useful and intuitive as possible.
Are V1 Simple Cells Optimized for Visual Occlusions? A Comparative Study
Bornschein, Jörg; Henniges, Marc; Lücke, Jörg
2013-01-01
Simple cells in primary visual cortex were famously found to respond to low-level image components such as edges. Sparse coding and independent component analysis (ICA) emerged as the standard computational models for simple cell coding because they linked their receptive fields to the statistics of visual stimuli. However, a salient feature of image statistics, occlusions of image components, is not considered by these models. Here we ask if occlusions have an effect on the predicted shapes of simple cell receptive fields. We use a comparative approach to answer this question and investigate two models for simple cells: a standard linear model and an occlusive model. For both models we simultaneously estimate optimal receptive fields, sparsity and stimulus noise. The two models are identical except for their component superposition assumption. We find the image encoding and receptive fields predicted by the models to differ significantly. While both models predict many Gabor-like fields, the occlusive model predicts a much sparser encoding and high percentages of ‘globular’ receptive fields. This relatively new center-surround type of simple cell response has been observed since reverse correlation came into use in experimental studies. While high percentages of ‘globular’ fields can be obtained using specific choices of sparsity and overcompleteness in linear sparse coding, no or only low proportions are reported in the vast majority of studies on linear models (including all ICA models). Likewise, for the linear model investigated here and optimal sparsity, only low proportions of ‘globular’ fields are observed. In comparison, the occlusive model robustly infers high proportions and can match the experimentally observed high proportions of ‘globular’ fields well. Our computational study, therefore, suggests that ‘globular’ fields may be evidence for an optimal encoding of visual occlusions in primary visual cortex. PMID:23754938
The fate of task-irrelevant visual motion: perceptual load versus feature-based attention.
Taya, Shuichiro; Adams, Wendy J; Graf, Erich W; Lavie, Nilli
2009-11-18
We tested contrasting predictions derived from perceptual load theory and from recent feature-based selection accounts. Observers viewed moving, colored stimuli and performed low or high load tasks associated with one stimulus feature, either color or motion. The resultant motion aftereffect (MAE) was used to evaluate attentional allocation. We found that task-irrelevant visual features received less attention than co-localized task-relevant features of the same objects. Moreover, when color and motion features were co-localized yet perceived to belong to two distinct surfaces, feature-based selection was further increased at the expense of object-based co-selection. Load theory predicts that the MAE for task-irrelevant motion would be reduced with a higher load color task. However, this was not seen for co-localized features; perceptual load only modulated the MAE for task-irrelevant motion when this was spatially separated from the attended color location. Our results suggest that perceptual load effects are mediated by spatial selection and do not generalize to the feature domain. Feature-based selection operates to suppress processing of task-irrelevant, co-localized features, irrespective of perceptual load.
Robot Acting on Moving Bodies (RAMBO): Interaction with tumbling objects
NASA Technical Reports Server (NTRS)
Davis, Larry S.; Dementhon, Daniel; Bestul, Thor; Ziavras, Sotirios; Srinivasan, H. V.; Siddalingaiah, Madhu; Harwood, David
1989-01-01
Interaction with tumbling objects will become more common as human activities in space expand. Attempting to interact with a large complex object translating and rotating in space, a human operator using only his visual and mental capacities may not be able to estimate the object motion, plan actions or control those actions. A robot system (RAMBO) equipped with a camera, which, given a sequence of simple tasks, can perform these tasks on a tumbling object, is being developed. RAMBO is given a complete geometric model of the object. A low level vision module extracts and groups characteristic features in images of the object. The positions of the object are determined in a sequence of images, and a motion estimate of the object is obtained. This motion estimate is used to plan trajectories of the robot tool to relative locations nearby the object sufficient for achieving the tasks. More specifically, low level vision uses parallel algorithms for image enhancement by symmetric nearest neighbor filtering, edge detection by local gradient operators, and corner extraction by sector filtering. The object pose estimation is a Hough transform method accumulating position hypotheses obtained by matching triples of image features (corners) to triples of model features. To maximize computing speed, the estimate of the position in space of a triple of features is obtained by decomposing its perspective view into a product of rotations and a scaled orthographic projection. This allows use of 2-D lookup tables at each stage of the decomposition. The position hypotheses for each possible match of model feature triples and image feature triples are calculated in parallel. Trajectory planning combines heuristic and dynamic programming techniques. Then trajectories are created using dynamic interpolations between initial and goal trajectories. All the parallel algorithms run on a Connection Machine CM-2 with 16K processors.
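The image-enhancement step named in the abstract above, symmetric nearest neighbor (SNN) filtering, can be illustrated with a short sketch. This is a generic serial NumPy version of the textbook SNN filter, not the parallel Connection Machine implementation the authors describe; the function name `snn_filter`, the square window, and the edge-replicating padding are all assumptions made for illustration.

```python
import numpy as np

def snn_filter(image, radius=1):
    """Symmetric nearest neighbor smoothing of a 2-D grayscale array.

    For every pair of window pixels that mirror each other about the
    centre, keep the one whose value is closer to the centre pixel,
    then average the kept values. Illustrative sketch only.
    """
    img = np.asarray(image, dtype=float)
    padded = np.pad(img, radius, mode="edge")  # assumption: replicate borders
    h, w = img.shape
    total = np.zeros_like(img)
    pairs = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Visit each mirrored pair once: skip the centre and one half-window.
            if (dy, dx) <= (0, 0):
                continue
            a = padded[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            b = padded[radius - dy:radius - dy + h, radius - dx:radius - dx + w]
            # Keep whichever member of the pair is closer in value to the centre.
            total += np.where(np.abs(a - img) <= np.abs(b - img), a, b)
            pairs += 1
    return total / pairs

# Usage: an ideal vertical step edge passes through unchanged.
step = np.zeros((6, 6))
step[:, 3:] = 10.0
smoothed = snn_filter(step)
```

Because each mirrored pair contributes the member closer in value to the centre pixel, pixels on either side of an edge are averaged only with their own side, which is why this kind of filter is typically used ahead of gradient-based edge detection.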
ERIC Educational Resources Information Center
Lev, Maria; Gilaie-Dotan, Sharon; Gotthilf-Nezri, Dana; Yehezkel, Oren; Brooks, Joseph L.; Perry, Anat; Bentin, Shlomo; Bonneh, Yoram; Polat, Uri
2015-01-01
Long-term deprivation of normal visual inputs can cause perceptual impairments at various levels of visual function, from basic visual acuity deficits, through mid-level deficits such as contour integration and motion coherence, to high-level face and object agnosia. Yet it is unclear whether training during adulthood, at a post-developmental…
Vaidya, Avinash R; Fellows, Lesley K
2015-09-16
Adaptively interacting with our environment requires extracting information that will allow us to successfully predict reward. This can be a challenge, particularly when there are many candidate cues, and when rewards are probabilistic. Recent work has demonstrated that visual attention is allocated to stimulus features that have been associated with reward on previous trials. The ventromedial frontal lobe (VMF) has been implicated in learning in dynamic environments of this kind, but the mechanism by which this region influences this process is not clear. Here, we hypothesized that the VMF plays a critical role in guiding attention to reward-predictive stimulus features based on feedback. We tested the effects of VMF damage in human subjects on a visual search task in which subjects were primed to attend to task-irrelevant colors associated with different levels of reward, incidental to the search task. Consistent with previous work, we found that distractors had a greater influence on reaction time when they appeared in colors associated with high reward in the previous trial compared with colors associated with low reward in healthy control subjects and patients with prefrontal damage sparing the VMF. However, this reward modulation of attentional priming was absent in patients with VMF damage. Thus, an intact VMF is necessary for directing attention based on experience with cue-reward associations. We suggest that this region plays a role in selecting reward-predictive cues to facilitate future learning. There has been a swell of interest recently in the ventromedial frontal cortex (VMF), a brain region critical to associative learning. However, the underlying mechanism by which this region guides learning is not well understood. Here, we tested the effects of damage to this region in humans on a task in which rewards were linked incidentally to visual features, resulting in trial-by-trial attentional priming. 
Controls and subjects with prefrontal damage sparing the VMF showed normal reward priming, but VMF-damaged patients did not. This work sheds light on a potential mechanism through which this region influences behavior. We suggest that the VMF is necessary for directing attention to reward-predictive visual features based on feedback, facilitating future learning and decision-making.
Evidence for unlimited capacity processing of simple features in visual cortex
White, Alex L.; Runeson, Erik; Palmer, John; Ernst, Zachary R.; Boynton, Geoffrey M.
2017-01-01
Performance in many visual tasks is impaired when observers attempt to divide spatial attention across multiple visual field locations. Correspondingly, neuronal response magnitudes in visual cortex are often reduced during divided compared with focused spatial attention. This suggests that early visual cortex is the site of capacity limits, where finite processing resources must be divided among attended stimuli. However, behavioral research demonstrates that not all visual tasks suffer such capacity limits: The costs of divided attention are minimal when the task and stimulus are simple, such as when searching for a target defined by orientation or contrast. To date, however, every neuroimaging study of divided attention has used more complex tasks and found large reductions in response magnitude. We bridged that gap by using functional magnetic resonance imaging to measure responses in the human visual cortex during simple feature detection. The first experiment used a visual search task: Observers detected a low-contrast Gabor patch within one or four potentially relevant locations. The second experiment used a dual-task design, in which observers made independent judgments of Gabor presence in patches of dynamic noise at two locations. In both experiments, blood-oxygen level–dependent (BOLD) signals in the retinotopic cortex were significantly lower for ignored than attended stimuli. However, when observers divided attention between multiple stimuli, BOLD signals were not reliably reduced and behavioral performance was unimpaired. These results suggest that processing of simple features in early visual cortex has unlimited capacity. PMID:28654964
Hierarchical streamline bundles.
Yu, Hongfeng; Wang, Chaoli; Shene, Ching-Kuang; Chen, Jacqueline H
2012-08-01
Effective 3D streamline placement and visualization play an essential role in many science and engineering disciplines. The main challenge for effective streamline visualization lies in seed placement, i.e., where to drop seeds and how many seeds should be placed. Seeding too many or too few streamlines may not reveal flow features and patterns either because it easily leads to visual clutter in rendering or it conveys little information about the flow field. Not only does the number of streamlines placed matter, their spatial relationships also play a key role in understanding the flow field. Therefore, effective flow visualization requires the streamlines to be placed in the right place and in the right amount. This paper introduces hierarchical streamline bundles, a novel approach to simplifying and visualizing 3D flow fields defined on regular grids. By placing seeds and generating streamlines according to flow saliency, we produce a set of streamlines that captures important flow features near critical points without enforcing the dense seeding condition. We group spatially neighboring and geometrically similar streamlines to construct a hierarchy from which we extract streamline bundles at different levels of detail. Streamline bundles highlight multiscale flow features and patterns through clustered yet not cluttered display. This selective visualization strategy effectively reduces visual clutter while accentuating visual foci, and therefore is able to convey the desired insight into the flow data.
Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods.
Martínez, María Jimena; Ponzoni, Ignacio; Díaz, Mónica F; Vazquez, Gustavo E; Soto, Axel J
2015-01-01
The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step are concentrated on statistical associations among descriptors and target properties, whereas the chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach for integrating domain expert's knowledge in the selection process is needed to increase confidence in the final set of descriptors. In this paper a software tool, which we named Visual and Interactive DEscriptor ANalysis (VIDEAN), that combines statistical methods with interactive visualizations for choosing a set of descriptors for predicting a target property is proposed. Domain expertise can be added to the feature selection process by means of an interactive visual exploration of data, and aided by statistical tools and metrics based on information theory. Coordinated visual representations are presented for capturing different relationships and interactions among descriptors, target properties and candidate subsets of descriptors. The competencies of the proposed software were assessed through different scenarios. These scenarios reveal how an expert can use this tool to choose one subset of descriptors from a group of candidate subsets or how to modify existing descriptor subsets and even incorporate new descriptors according to his or her own knowledge of the target property. The reported experiences showed the suitability of our software for selecting sets of descriptors with low cardinality, high interpretability, low redundancy and high statistical performance in a visual exploratory way.
Therefore, it is possible to conclude that the resulting tool allows the integration of a chemist's expertise in the descriptor selection process with a low cognitive effort in contrast with the alternative of using an ad-hoc manual analysis of the selected descriptors. Graphical abstract: VIDEAN allows the visual analysis of candidate subsets of descriptors for QSAR/QSPR. In the two panels on the top, users can interactively explore numerical correlations as well as co-occurrences in the candidate subsets through two interactive graphs.
Modulation of microsaccades by spatial frequency during object categorization.
Craddock, Matt; Oppermann, Frank; Müller, Matthias M; Martinovic, Jasna
2017-01-01
The organization of visual processing into a coarse-to-fine information processing based on the spatial frequency properties of the input forms an important facet of the object recognition process. During visual object categorization tasks, microsaccades occur frequently. One potential functional role of these eye movements is to resolve high spatial frequency information. To assess this hypothesis, we examined the rate, amplitude and speed of microsaccades in an object categorization task in which participants viewed object and non-object images and classified them as showing either natural objects, man-made objects or non-objects. Images were presented unfiltered (broadband; BB) or filtered to contain only low (LSF) or high spatial frequency (HSF) information. This allowed us to examine whether microsaccades were modulated independently by the presence of a high-level feature - the presence of an object - and by low-level stimulus characteristics - spatial frequency. We found a bimodal distribution of saccades based on their amplitude, with a split between smaller and larger microsaccades at 0.4° of visual angle. The rate of larger saccades (⩾0.4°) was higher for objects than non-objects, and higher for objects with high spatial frequency content (HSF and BB objects) than for LSF objects. No effects were observed for smaller microsaccades (<0.4°). This is consistent with a role for larger microsaccades in resolving HSF information for object identification, and previous evidence that more microsaccades are directed towards informative image regions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Lewis, James W.; Talkington, William J.; Tallaksen, Katherine C.; Frum, Chris A.
2012-01-01
Whether viewed or heard, an object in action can be segmented as a distinct salient event based on a number of different sensory cues. In the visual system, several low-level attributes of an image are processed along parallel hierarchies, involving intermediate stages wherein gross-level object form and/or motion features are extracted prior to stages that show greater specificity for different object categories (e.g., people, buildings, or tools). In the auditory system, though relying on a rather different set of low-level signal attributes, meaningful real-world acoustic events and “auditory objects” can also be readily distinguished from background scenes. However, the nature of the acoustic signal attributes or gross-level perceptual features that may be explicitly processed along intermediate cortical processing stages remain poorly understood. Examining mechanical and environmental action sounds, representing two distinct non-biological categories of action sources, we had participants assess the degree to which each sound was perceived as object-like versus scene-like. We re-analyzed data from two of our earlier functional magnetic resonance imaging (fMRI) task paradigms (Engel et al., 2009) and found that scene-like action sounds preferentially led to activation along several midline cortical structures, but with strong dependence on listening task demands. In contrast, bilateral foci along the superior temporal gyri (STG) showed parametrically increasing activation to action sounds rated as more “object-like,” independent of sound category or task demands. Moreover, these STG regions also showed parametric sensitivity to spectral structure variations (SSVs) of the action sounds—a quantitative measure of change in entropy of the acoustic signals over time—and the right STG additionally showed parametric sensitivity to measures of mean entropy and harmonic content of the environmental sounds. 
Analogous to the visual system, intermediate stages of the auditory system appear to process or extract a number of quantifiable low-order signal attributes that are characteristic of action events perceived as being object-like, representing stages that may begin to dissociate different perceptual dimensions and categories of every-day, real-world action sounds. PMID:22582038
A neural model of the temporal dynamics of figure-ground segregation in motion perception.
Raudies, Florian; Neumann, Heiko
2010-03-01
How does the visual system manage to segment a visual scene into surfaces and objects and manage to attend to a target object? Based on psychological and physiological investigations, it has been proposed that the perceptual organization and segmentation of a scene is achieved by the processing at different levels of the visual cortical hierarchy. According to this, motion onset detection, motion-defined shape segregation, and target selection are accomplished by processes which bind together simple features into fragments of increasingly complex configurations at different levels in the processing hierarchy. As an alternative to this hierarchical processing hypothesis, it has been proposed that the processing stages for feature detection and segregation are reflected in different temporal episodes in the response patterns of individual neurons. Such temporal epochs have been observed in the activation pattern of neurons as low as in area V1. Here, we present a neural network model of motion detection, figure-ground segregation and attentive selection which explains these response patterns in a unifying framework. Based on known principles of functional architecture of the visual cortex, we propose that initial motion and motion boundaries are detected at different and hierarchically organized stages in the dorsal pathway. Visual shapes that are defined by boundaries, which were generated from juxtaposed opponent motions, are represented at different stages in the ventral pathway. Model areas in the different pathways interact through feedforward and modulating feedback, while mutual interactions enable the communication between motion and form representations. Selective attention is devoted to shape representations by sending modulating feedback signals from higher levels (working memory) to intermediate levels to enhance their responses. Areas in the motion and form pathway are coupled through top-down feedback with V1 cells at the bottom end of the hierarchy. 
We propose that the different temporal episodes in the response pattern of V1 cells, as recorded in recent experiments, reflect the strength of modulating feedback signals. This feedback results from the consolidated shape representations from coherent motion patterns and the attentive modulation of responses along the cortical hierarchy. The model makes testable predictions concerning the duration and delay of the temporal episodes of V1 cell responses as well as their response variations that were caused by modulating feedback signals. Copyright 2009 Elsevier Ltd. All rights reserved.
Image wavelet decomposition and applications
NASA Technical Reports Server (NTRS)
Treil, N.; Mallat, S.; Bajcsy, R.
1989-01-01
The general problem of computer vision has been investigated for more than 20 years and is still one of the most challenging fields in artificial intelligence. Indeed, taking a look at the human visual system can give us an idea of the complexity of any solution to the problem of visual recognition. This general task can be decomposed into a whole hierarchy of problems ranging from pixel processing to high level segmentation and complex objects recognition. Contrasting an image at different representations provides useful information such as edges. An example of low level signal and image processing using the theory of wavelets is introduced which provides the basis for multiresolution representation. Like the human brain, we use a multiorientation process which detects features independently in different orientation sectors. So, images of the same orientation but of different resolutions are contrasted to gather information about an image. An interesting image representation using energy zero crossings is developed. This representation is shown to be experimentally complete and leads to some higher level applications such as edge and corner finding, which in turn provides two basic steps to image segmentation. The possibilities of feedback between different levels of processing are also discussed.
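As a rough illustration of the multiresolution, orientation-selective idea described in this abstract, the sketch below performs one level of a 2D Haar decomposition in plain Python. The Haar wavelet is an assumption chosen for brevity; the report does not state which wavelet family or zero-crossing representation it uses, and the function names `haar_step` and `haar_2d` are hypothetical.

```python
def haar_step(signal):
    """One level of a 1D Haar transform: pairwise averages and half-differences.

    Assumes len(signal) is even.
    """
    avg = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    diff = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return avg, diff


def _along_columns(block):
    """Apply haar_step down every column and return (low, high) as row-major lists."""
    lows, highs = [], []
    for col in zip(*block):
        a, d = haar_step(list(col))
        lows.append(a)
        highs.append(d)
    # Transpose back so that results are lists of rows again.
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]


def haar_2d(image):
    """One level of a separable 2D Haar decomposition (illustrative sketch).

    Returns four half-resolution sub-bands:
      ll: low-pass in both directions (coarse approximation)
      lh: low-pass along rows, high-pass along columns (horizontal edges)
      hl: high-pass along rows, low-pass along columns (vertical edges)
      hh: high-pass in both directions (diagonal detail)
    """
    low_rows, high_rows = [], []
    for row in image:
        a, d = haar_step(row)
        low_rows.append(a)
        high_rows.append(d)
    ll, lh = _along_columns(low_rows)
    hl, hh = _along_columns(high_rows)
    return ll, lh, hl, hh


# Hypothetical 4x4 image with a vertical edge in the last column.
image = [[1, 1, 1, 9] for _ in range(4)]
ll, lh, hl, hh = haar_2d(image)
```

In this toy image only the `hl` band is non-zero, which matches the intuition that a vertical edge is picked up by the sub-band that is high-pass along rows. Repeating `haar_2d` on `ll` would give the next coarser resolution level.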
Quantification and Visualization of Variation in Anatomical Trees
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amenta, Nina; Datar, Manasi; Dirksen, Asger
This paper presents two approaches to quantifying and visualizing variation in datasets of trees. The first approach localizes subtrees in which significant population differences are found through hypothesis testing and sparse classifiers on subtree features. The second approach visualizes the global metric structure of datasets through low-distortion embedding into hyperbolic planes in the style of multidimensional scaling. A case study is made on a dataset of airway trees in relation to Chronic Obstructive Pulmonary Disease.
Wu, Yu-Tzu; Nash, Paul; Barnes, Linda E; Minett, Thais; Matthews, Fiona E; Jones, Andy; Brayne, Carol
2014-10-22
An association between depressive symptoms and features of built environment has been reported in the literature. A remaining research challenge is the development of methods to efficiently capture pertinent environmental features in relevant study settings. Visual streetscape images have been used to replace traditional physical audits and directly observe the built environment of communities. The aim of this work is to examine the inter-method reliability of the two audit methods for assessing community environments with a specific focus on physical features related to mental health. Forty-eight postcodes in urban and rural areas of Cambridgeshire, England were randomly selected from an alphabetical list of streets hosted on a UK property website. The assessment was conducted in July and August 2012 by both physical and visual image audits based on the items in Residential Environment Assessment Tool (REAT), an observational instrument targeting the micro-scale environmental features related to mental health in UK postcodes. The assessor used the images of Google Street View and virtually "walked through" the streets to conduct the property and street level assessments. Gwet's AC1 coefficients and Bland-Altman plots were used to compare the concordance of two audits. The results of conducting the REAT by visual image audits generally correspond to direct observations. More variations were found in property level items regarding physical incivilities, with broad limits of agreement which importantly lead to most of the variation in the overall REAT score. Postcodes in urban areas had lower consistency between the two methods than rural areas. Google Street View has the potential to assess environmental features related to mental health with fair reliability and provide a less resource intense method of assessing community environments than physical audits.
Qiao, Hong; Li, Yinlin; Li, Fengfu; Xi, Xuanyang; Wu, Wei
2016-10-01
Recently, many biologically inspired visual computational models have been proposed. The design of these models follows the related biological mechanisms and structures, and these models provide new solutions for visual recognition tasks. In this paper, based on the recent biological evidence, we propose a framework to mimic the active and dynamic learning and recognition process of the primate visual cortex. From principle point of view, the main contributions are that the framework can achieve unsupervised learning of episodic features (including key components and their spatial relations) and semantic features (semantic descriptions of the key components), which support higher level cognition of an object. From performance point of view, the advantages of the framework are as follows: 1) learning episodic features without supervision-for a class of objects without a prior knowledge, the key components, their spatial relations and cover regions can be learned automatically through a deep neural network (DNN); 2) learning semantic features based on episodic features-within the cover regions of the key components, the semantic geometrical values of these components can be computed based on contour detection; 3) forming the general knowledge of a class of objects-the general knowledge of a class of objects can be formed, mainly including the key components, their spatial relations and average semantic values, which is a concise description of the class; and 4) achieving higher level cognition and dynamic updating-for a test image, the model can achieve classification and subclass semantic descriptions. And the test samples with high confidence are selected to dynamically update the whole model. Experiments are conducted on face images, and a good performance is achieved in each layer of the DNN and the semantic description learning process. Furthermore, the model can be generalized to recognition tasks of other objects with learning ability.
The redshift evolution of major merger triggering of luminous AGNs: a slight enhancement at z ~ 2
NASA Astrophysics Data System (ADS)
Hewlett, Timothy; Villforth, Carolin; Wild, Vivienne; Mendez-Abreu, Jairo; Pawlik, Milena; Rowlands, Kate
2017-09-01
Active galactic nuclei (AGNs), particularly the most luminous AGNs, are commonly assumed to be triggered through major mergers; however, observational evidence for this scenario is mixed. To investigate any influence of galaxy mergers on AGN triggering and luminosities through cosmic time, we present a sample of 106 luminous X-ray-selected type 1 AGNs from the COSMOS survey. These AGNs occupy a large redshift range (0.5 < z < 2.2) and two orders of magnitude in X-ray luminosity (~10^43-10^45 erg s^-1). AGN hosts are carefully mass and redshift matched to 486 control galaxies. A novel technique for identifying and quantifying merger features in galaxies is developed, subtracting galfit galaxy models and quantifying the residuals. Comparison to visual classification confirms this measure reliably picks out disturbance features in galaxies. No enhancement of merger features with increasing AGN luminosity is found with this metric, or by visual inspection. We analyse the redshift evolution of AGNs associated with galaxy mergers and find no merger enhancement in lower redshift bins. Contrarily, in the highest redshift bin (z ~ 2) AGNs are ~4 times more likely to be in galaxies exhibiting evidence of morphological disturbance compared to control galaxies, at 99 per cent confidence level (~2.4σ) from visual inspection. Since only ~15 per cent of these AGNs are found to be in morphologically disturbed galaxies, it is implied that major mergers at high redshift make a noticeable but subdominant contribution to AGN fuelling. At low redshifts, other processes dominate and mergers become a less significant triggering mechanism.
Visual Environments for CFD Research
NASA Technical Reports Server (NTRS)
Watson, Val; George, Michael W. (Technical Monitor)
1994-01-01
This viewgraph presentation gives an overview of the visual environments for computational fluid dynamics (CFD) research. It includes details on critical needs from the future computer environment, features needed to attain this environment, prospects for changes in and the impact of the visualization revolution on the human-computer interface, human processing capabilities, limits of personal environment and the extension of that environment with computers. Information is given on the need for more 'visual' thinking (including instances of visual thinking), an evaluation of the alternate approaches for and levels of interactive computer graphics, a visual analysis of computational fluid dynamics, and an analysis of visualization software.
Visual search in Alzheimer's disease: a deficiency in processing conjunctions of features.
Tales, A; Butler, S R; Fossey, J; Gilchrist, I D; Jones, R W; Troscianko, T
2002-01-01
Human vision often needs to encode multiple characteristics of many elements of the visual field, for example their lightness and orientation. The paradigm of visual search allows a quantitative assessment of the function of the underlying mechanisms. It measures the ability to detect a target element among a set of distractor elements. We asked whether Alzheimer's disease (AD) patients are particularly affected in one type of search, where the target is defined by a conjunction of features (orientation and lightness) and where performance depends on some shifting of attention. Two non-conjunction control conditions were employed. The first was a pre-attentive, single-feature, "pop-out" task, detecting a vertical target among horizontal distractors. The second was a single-feature, partly attentive task in which the target element was slightly larger than the distractors-a "size" task. This was chosen to have a similar level of attentional load as the conjunction task (for the control group), but lacked the conjunction of two features. In an experiment, 15 AD patients were compared to age-matched controls. The results suggested that AD patients have a particular impairment in the conjunction task but not in the single-feature size or pre-attentive tasks. This may imply that AD particularly affects those mechanisms which compare across more than one feature type, and spares the other systems and is not therefore simply an 'attention-related' impairment. Additionally, these findings show a double dissociation with previous data on visual search in Parkinson's disease (PD), suggesting a different effect of these diseases on the visual pathway.
Jabeen, Safia; Mehmood, Zahid; Mahmood, Toqeer; Saba, Tanzila; Rehman, Amjad; Mahmood, Muhammad Tariq
2018-01-01
For the last three decades, content-based image retrieval (CBIR) has been an active research area, representing a viable solution for retrieving similar images from an image repository. In this article, we propose a novel CBIR technique based on the visual words fusion of speeded-up robust features (SURF) and fast retina keypoint (FREAK) feature descriptors. SURF is a sparse descriptor whereas FREAK is a dense descriptor. Moreover, SURF is a scale and rotation-invariant descriptor that performs better in the case of repeatability, distinctiveness, and robustness. It is robust to noise, detection errors, geometric, and photometric deformations. It also performs better at low illumination within an image as compared to the FREAK descriptor. In contrast, FREAK is a retina-inspired speedy descriptor that performs better for classification-based problems as compared to the SURF descriptor. Experimental results show that the proposed technique based on the visual words fusion of SURF-FREAK descriptors combines the features of both descriptors and resolves the aforementioned issues. The qualitative and quantitative analysis performed on three image collections, namely Corel-1000, Corel-1500, and Caltech-256, shows that proposed technique based on visual words fusion significantly improved the performance of the CBIR as compared to the feature fusion of both descriptors and state-of-the-art image retrieval techniques. PMID:29694429
Looking away from faces: influence of high-level visual processes on saccade programming.
Morand, Stéphanie M; Grosbras, Marie-Hélène; Caldara, Roberto; Harvey, Monika
2010-03-30
Human faces capture attention more than other visual stimuli. Here we investigated whether such face-specific biases rely on automatic (involuntary) or voluntary orienting responses. To this end, we used an anti-saccade paradigm, which requires the ability to inhibit a reflexive automatic response and to generate a voluntary saccade in the opposite direction of the stimulus. To control for potential low-level confounds in the eye-movement data, we manipulated the high-level visual properties of the stimuli while normalizing their global low-level visual properties. Eye movements were recorded in 21 participants who performed either pro- or anti-saccades to a face, car, or noise pattern, randomly presented to the left or right of a fixation point. For each trial, a symbolic cue instructed the observer to generate either a pro-saccade or an anti-saccade. We report a significant increase in anti-saccade error rates for faces compared to cars and noise patterns, as well as faster pro-saccades to faces and cars in comparison to noise patterns. These results indicate that human faces induce stronger involuntary orienting responses than other visual objects, i.e., responses that are beyond the control of the observer. Importantly, this involuntary processing cannot be accounted for by global low-level visual factors.
Combining heterogenous features for 3D hand-held object recognition
NASA Astrophysics Data System (ADS)
Lv, Xiong; Wang, Shuang; Li, Xiangyang; Jiang, Shuqiang
2014-10-01
Object recognition has wide applications in the area of human-machine interaction and multimedia retrieval. However, due to the problem of visual polysemous and concept polymorphism, it is still a great challenge to obtain reliable recognition result for the 2D images. Recently, with the emergence and easy availability of RGB-D equipment such as Kinect, this challenge could be relieved because the depth channel could bring more information. A very special and important case of object recognition is hand-held object recognition, as hand is a straight and natural way for both human-human interaction and human-machine interaction. In this paper, we study the problem of 3D object recognition by combining heterogenous features with different modalities and extraction techniques. For hand-craft feature, although it reserves the low-level information such as shape and color, it has shown weakness in representing high-level semantic information compared with the automatic learned feature, especially deep feature. Deep feature has shown its great advantages in large scale dataset recognition but is not always robust to rotation or scale variance compared with hand-craft feature. In this paper, we propose a method to combine hand-craft point cloud features and deep learned features in RGB and depth channel. First, hand-held object segmentation is implemented by using depth cues and human skeleton information. Second, we combine the extracted heterogenous 3D features in different stages using linear concatenation and multiple kernel learning (MKL). Then a training model is used to recognize 3D handheld objects. Experimental results validate the effectiveness and generalization ability of the proposed method.
Kurtz, Camille; Beaulieu, Christopher F.; Napel, Sandy; Rubin, Daniel L.
2014-01-01
Computer-assisted image retrieval applications could assist radiologist interpretations by identifying similar images in large archives as a means to providing decision support. However, the semantic gap between low-level image features and their high level semantics may impair the system performances. Indeed, it can be challenging to comprehensively characterize the images using low-level imaging features to fully capture the visual appearance of diseases on images, and recently the use of semantic terms has been advocated to provide semantic descriptions of the visual contents of images. However, most of the existing image retrieval strategies do not consider the intrinsic properties of these terms during the comparison of the images beyond treating them as simple binary (presence/absence) features. We propose a new framework that includes semantic features in images and that enables retrieval of similar images in large databases based on their semantic relations. It is based on two main steps: (1) annotation of the images with semantic terms extracted from an ontology, and (2) evaluation of the similarity of image pairs by computing the similarity between the terms using the Hierarchical Semantic-Based Distance (HSBD) coupled to an ontological measure. The combination of these two steps provides a means of capturing the semantic correlations among the terms used to characterize the images that can be considered as a potential solution to deal with the semantic gap problem. We validate this approach in the context of the retrieval and the classification of 2D regions of interest (ROIs) extracted from computed tomographic (CT) images of the liver. Under this framework, retrieval accuracy of more than 0.96 was obtained on a 30-images dataset using the Normalized Discounted Cumulative Gain (NDCG) index that is a standard technique used to measure the effectiveness of information retrieval algorithms when a separate reference standard is available. 
Classification results of more than 95% were obtained on a 77-images dataset. For comparison, the use of the Earth Mover's Distance (EMD), which is an alternative distance metric that considers all the existing relations among the terms, led to a retrieval accuracy of 0.95 and classification results of 93% with a higher computational cost. The results provided by the presented framework are competitive with the state-of-the-art and emphasize the usefulness of the proposed methodology for radiology image retrieval and classification. PMID:24632078
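For readers unfamiliar with the NDCG index used to report retrieval accuracy above, here is a minimal sketch. It assumes the common formulation with linear gain and a log2 rank discount; the abstract does not say which NDCG variant the authors used, and the relevance grades below are hypothetical rather than taken from the liver CT dataset.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of graded relevances.

    The item at rank r (1-based) contributes relevance / log2(r + 1).
    """
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering.

    Returns 0.0 when no item has positive relevance.
    """
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical reference-standard relevance of five retrieved ROIs, in rank order.
ranked = [3, 2, 3, 0, 1]
score = ndcg(ranked)
```

A score of 1.0 means the system returned items in exactly the order the reference standard prefers; lower values penalize relevant items that appear further down the ranking.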
The Effect of Visual Representation Style in Problem-Solving: A Perspective from Cognitive Processes
Nyamsuren, Enkhbold; Taatgen, Niels A.
2013-01-01
Using results from a controlled experiment and simulations based on cognitive models, we show that visual presentation style can have a significant impact on performance in a complex problem-solving task. We compared subject performances in two isomorphic, but visually different, tasks based on a card game of SET. Although subjects used the same strategy in both tasks, the difference in presentation style resulted in radically different reaction times and significant deviations in scanpath patterns in the two tasks. Results from our study indicate that low-level subconscious visual processes, such as differential acuity in peripheral vision and low-level iconic memory, can have indirect, but significant effects on decision making during a problem-solving task. We have developed two ACT-R models that employ the same basic strategy but deal with different presentations styles. Our ACT-R models confirm that changes in low-level visual processes triggered by changes in presentation style can propagate to higher-level cognitive processes. Such a domino effect can significantly affect reaction times and eye movements, without affecting the overall strategy of problem solving. PMID:24260415
NASA Astrophysics Data System (ADS)
Maes, Alfons
2017-04-01
Climate change is a playground for visualization. Yet research and technological innovations in visual communication and data visualization do not account for a substantial part of the world's population: vulnerable audiences with low levels of literacy.
Curveslam: Utilizing Higher Level Structure In Stereo Vision-Based Navigation
2012-01-01
consider their application to SLAM. The work of [31] [32] develops a spline-based SLAM framework, but this is only for application to LIDAR-based SLAM ...Existing approaches to visual Simultaneous Localization and Mapping (SLAM) typically utilize points as visual feature primitives to represent landmarks...regions of interest. Further, previous SLAM techniques that propose the use of higher level structures often place constraints on the environment, such as
Helo, Andrea; van Ommen, Sandrien; Pannasch, Sebastian; Danteny-Dordoigne, Lucile; Rämä, Pia
2017-11-01
Conceptual representations of everyday scenes are built in interaction with visual environment and these representations guide our visual attention. Perceptual features and object-scene semantic consistency have been found to attract our attention during scene exploration. The present study examined how visual attention in 24-month-old toddlers is attracted by semantic violations and how perceptual features (i. e. saliency, centre distance, clutter and object size) and linguistic properties (i. e. object label frequency and label length) affect gaze distribution. We compared eye movements of 24-month-old toddlers and adults while exploring everyday scenes which either contained an inconsistent (e.g., soap on a breakfast table) or consistent (e.g., soap in a bathroom) object. Perceptual features such as saliency, centre distance and clutter of the scene affected looking times in the toddler group during the whole viewing time whereas looking times in adults were affected only by centre distance during the early viewing time. Adults looked longer to inconsistent than consistent objects either if the objects had a high or a low saliency. In contrast, toddlers presented semantic consistency effect only when objects were highly salient. Additionally, toddlers with lower vocabulary skills looked longer to inconsistent objects while toddlers with higher vocabulary skills look equally long to both consistent and inconsistent objects. Our results indicate that 24-month-old children use scene context to guide visual attention when exploring the visual environment. However, perceptual features have a stronger influence in eye movement guidance in toddlers than in adults. Our results also indicate that language skills influence cognitive but not perceptual guidance of eye movements during scene perception in toddlers. Copyright © 2017 Elsevier Inc. All rights reserved.
Visual search for features and conjunctions in development.
Lobaugh, N J; Cole, S; Rovet, J F
1998-12-01
Visual search performance was examined in three groups of children 7 to 12 years of age and in young adults. Colour and orientation feature searches and a conjunction search were conducted. Reaction time (RT) showed expected improvements in processing speed with age. Comparisons of RT's on target-present and target-absent trials were consistent with parallel search on the two feature conditions and with serial search in the conjunction condition. The RT results indicated searches for feature and conjunctions were treated similarly for children and adults. However, the youngest children missed more targets at the largest array sizes, most strikingly in conjunction search. Based on an analysis of speed/accuracy trade-offs, we suggest that low target-distractor discriminability leads to an undersampling of array elements, and is responsible for the high number of misses in the youngest children.
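The parallel versus serial distinction in the abstract above is usually read off the slope of RT against array size. The following is a minimal sketch of that computation, assuming an ordinary least-squares fit; the function name `fit_line` and all RT values are hypothetical and invented for illustration, not data from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical mean RTs in ms (illustrative values only, not from the paper).
set_sizes = [4, 8, 16]
feature_rts = [502, 504, 508]      # nearly flat: consistent with parallel search
conjunction_rts = [600, 700, 900]  # rises with array size: consistent with serial search

feature_slope, feature_intercept = fit_line(set_sizes, feature_rts)
conjunction_slope, conjunction_intercept = fit_line(set_sizes, conjunction_rts)

print(f"feature search:     {feature_slope:.1f} ms/item")
print(f"conjunction search: {conjunction_slope:.1f} ms/item")
```

A slope close to zero is the conventional signature of parallel search, while a slope of tens of milliseconds per item is typically taken as evidence of serial search.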
Panoramic Night Vision Goggle Testing For Diagnosis and Repair
2000-01-01
Visual Acuity Visual Acuity [Marasco & Task, 1999] measures how well a human observer can see high contrast targets at specified light levels through...grid through the PNVG in-board and out-board channels simultaneously and comparing the defects to the size of grid features (Marasco & Task, 1999). The
ERIC Educational Resources Information Center
Chandramouli, Magesh; Chittamuru, Siva-Teja
2016-01-01
This paper explains the design of a graphics-based virtual environment for instructing computer hardware concepts to students, especially those at the beginner level. Photorealistic visualizations and simulations are designed and programmed with interactive features allowing students to practice, explore, and test themselves on computer hardware…
A neural theory of visual attention and short-term memory (NTVA).
Bundesen, Claus; Habekost, Thomas; Kyllingsbæk, Søren
2011-05-01
The neural theory of visual attention and short-term memory (NTVA) proposed by Bundesen, Habekost, and Kyllingsbæk (2005) is reviewed. In NTVA, filtering (selection of objects) changes the number of cortical neurons in which an object is represented so that this number increases with the behavioural importance of the object. Another mechanism of selection, pigeonholing (selection of features), scales the level of activation in neurons coding for a particular feature. By these mechanisms, behaviourally important objects and features are likely to win the competition to become encoded into visual short-term memory (VSTM). The VSTM system is conceived as a feedback mechanism that sustains activity in the neurons that have won the attentional competition. NTVA accounts both for a wide range of attentional effects in human performance (reaction times and error rates) and a wide range of effects observed in firing rates of single cells in the primate visual system. Copyright © 2010 Elsevier Ltd. All rights reserved.
Chang, Yongjun; Paul, Anjan Kumar; Kim, Namkug; Baek, Jung Hwan; Choi, Young Jun; Ha, Eun Ju; Lee, Kang Dae; Lee, Hyoung Shin; Shin, DaeSeock; Kim, Nakyoung
2016-01-01
To develop a semiautomated computer-aided diagnosis (CAD) system for thyroid cancer using two-dimensional ultrasound images that can be used to yield a second opinion in the clinic to differentiate malignant and benign lesions. A total of 118 ultrasound images that included axial and longitudinal images from patients with biopsy-confirmed malignant (n = 30) and benign (n = 29) nodules were collected. Thyroid CAD software was developed to extract quantitative features from these images based on thyroid nodule segmentation in which adaptive diffusion flow for active contours was used. Various features, including histogram, intensity differences, elliptical fit, gray-level co-occurrence matrixes, and gray-level run-length matrixes, were evaluated for each region imaged. Based on these imaging features, a support vector machine (SVM) classifier was used to differentiate benign and malignant nodules. Leave-one-out cross-validation with sequential forward feature selection was performed to evaluate the overall accuracy of this method. Additionally, analyses with contingency tables and receiver operating characteristic (ROC) curves were performed to compare the performance of CAD with visual inspection by expert radiologists based on established gold standards. Most univariate features for this proposed CAD system attained accuracies that ranged from 78.0% to 83.1%. When optimal SVM parameters that were established using a grid search method with features that radiologists use for visual inspection were employed, the authors could attain rates of accuracy that ranged from 72.9% to 84.7%. Using leave-one-out cross-validation results in a multivariate analysis of various features, the highest accuracy achieved using the proposed CAD system was 98.3%, whereas visual inspection by radiologists reached 94.9% accuracy.
To obtain the highest accuracies, "axial ratio" and "max probability" in axial images were most frequently included in the optimal feature sets for the authors' proposed CAD system, while "shape" and "calcification" in longitudinal images were most frequently included in the optimal feature sets for visual inspection by radiologists. The computed areas under curves in the ROC analysis were 0.986 and 0.979 for the proposed CAD system and visual inspection by radiologists, respectively; no significant difference was detected between these groups. The use of thyroid CAD to differentiate malignant from benign lesions shows accuracy similar to that obtained via visual inspection by radiologists. Thyroid CAD might be considered a viable way to generate a second opinion for radiologists in clinical practice.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Yongjun; Paul, Anjan Kumar; Kim, Namkug, E-mail: namkugkim@gmail.com
Purpose: To develop a semiautomated computer-aided diagnosis (CAD) system for thyroid cancer using two-dimensional ultrasound images that can be used to yield a second opinion in the clinic to differentiate malignant and benign lesions. Methods: A total of 118 ultrasound images that included axial and longitudinal images from patients with biopsy-confirmed malignant (n = 30) and benign (n = 29) nodules were collected. Thyroid CAD software was developed to extract quantitative features from these images based on thyroid nodule segmentation in which adaptive diffusion flow for active contours was used. Various features, including histogram, intensity differences, elliptical fit, gray-level co-occurrence matrixes, and gray-level run-length matrixes, were evaluated for each region imaged. Based on these imaging features, a support vector machine (SVM) classifier was used to differentiate benign and malignant nodules. Leave-one-out cross-validation with sequential forward feature selection was performed to evaluate the overall accuracy of this method. Additionally, analyses with contingency tables and receiver operating characteristic (ROC) curves were performed to compare the performance of CAD with visual inspection by expert radiologists based on established gold standards. Results: Most univariate features for this proposed CAD system attained accuracies that ranged from 78.0% to 83.1%. When optimal SVM parameters that were established using a grid search method with features that radiologists use for visual inspection were employed, the authors could attain rates of accuracy that ranged from 72.9% to 84.7%. Using leave-one-out cross-validation results in a multivariate analysis of various features, the highest accuracy achieved using the proposed CAD system was 98.3%, whereas visual inspection by radiologists reached 94.9% accuracy.
To obtain the highest accuracies, “axial ratio” and “max probability” in axial images were most frequently included in the optimal feature sets for the authors’ proposed CAD system, while “shape” and “calcification” in longitudinal images were most frequently included in the optimal feature sets for visual inspection by radiologists. The computed areas under curves in the ROC analysis were 0.986 and 0.979 for the proposed CAD system and visual inspection by radiologists, respectively; no significant difference was detected between these groups. Conclusions: The use of thyroid CAD to differentiate malignant from benign lesions shows accuracy similar to that obtained via visual inspection by radiologists. Thyroid CAD might be considered a viable way to generate a second opinion for radiologists in clinical practice.
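The area under the ROC curve reported above can be illustrated with a short sketch. It assumes the standard rank-based (Mann-Whitney) definition: the probability that a randomly chosen malignant case receives a higher classifier score than a randomly chosen benign case, with ties counted as one half. The function name `roc_auc` and the scores are hypothetical; this is not the authors' software or data.

```python
def roc_auc(positive_scores, negative_scores):
    """Rank-based AUC: P(score_pos > score_neg), ties counted as 0.5."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical classifier scores (illustrative only).
malignant_scores = [0.9, 0.8, 0.4]
benign_scores = [0.5, 0.3, 0.2]

auc = roc_auc(malignant_scores, benign_scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of cases, which is acceptable for a dataset of the size described in the abstract; a sort-based rank sum would be the usual choice for larger samples.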
Manchester visual query language
NASA Astrophysics Data System (ADS)
Oakley, John P.; Davis, Darryl N.; Shann, Richard T.
1993-04-01
We report a database language for visual retrieval which allows queries on image feature information which has been computed and stored along with images. The language is novel in that it provides facilities for dealing with feature data which has actually been obtained from image analysis. Each line in the Manchester Visual Query Language (MVQL) takes a set of objects as input and produces another, usually smaller, set as output. The MVQL constructs are mainly based on proven operators from the field of digital image analysis. An example is the Hough-group operator which takes as input a specification for the objects to be grouped, a specification for the relevant Hough space, and a definition of the voting rule. The output is a ranked list of high scoring bins. The query could be directed towards one particular image or an entire image database, in the latter case the bins in the output list would in general be associated with different images. We have implemented MVQL in two layers. The command interpreter is a Lisp program which maps each MVQL line to a sequence of commands which are used to control a specialized database engine. The latter is a hybrid graph/relational system which provides low-level support for inheritance and schema evolution. In the paper we outline the language and provide examples of useful queries. We also describe our solution to the engineering problems associated with the implementation of MVQL.
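The Hough-group operator described above takes a set of objects, a Hough space, and a voting rule, and returns a ranked list of high-scoring bins. The sketch below illustrates that idea in Python; it is not MVQL syntax and not the authors' implementation. The names `hough_group` and `line_voting_rule`, and the choice of a (theta index, rho index) line space, are assumptions made for illustration.

```python
import math
from collections import Counter

def hough_group(objects, voting_rule, top_k=3):
    """Accumulate votes from every object and return the top_k bins as (bin, votes)."""
    votes = Counter()
    for obj in objects:
        for bin_id in voting_rule(obj):
            votes[bin_id] += 1
    return votes.most_common(top_k)

def line_voting_rule(n_theta=4, rho_step=1.0):
    """Voting rule for straight lines: each point votes once per theta bin."""
    def rule(point):
        x, y = point
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            yield (t, round(rho / rho_step))
    return rule

# Four points on the vertical line x = 5 (hypothetical feature locations).
points = [(5, 0), (5, 1), (5, 2), (5, 3)]
ranked_bins = hough_group(points, line_voting_rule(n_theta=4), top_k=2)
print(ranked_bins)
```

Passing the voting rule as a callable mirrors the abstract's description of the rule as a separate input, so the same accumulator could in principle rank bins for other parameter spaces.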
A fast and automatic fusion algorithm for unregistered multi-exposure image sequence
NASA Astrophysics Data System (ADS)
Liu, Yan; Yu, Feihong
2014-09-01
The human visual system (HVS) can visualize all the brightness levels of a scene through visual adaptation. However, the dynamic range of most commercial digital cameras and display devices is smaller than the dynamic range of the human eye. This implies that low dynamic range (LDR) images captured by a normal digital camera may lose image details. We propose an efficient approach to high dynamic range (HDR) image fusion that copes with image displacement and image blur degradation in a computationally efficient manner, which is suitable for implementation on mobile devices. The various image registration algorithms proposed in the previous literature are unable to meet the efficiency and performance requirements of mobile device applications. In this paper, we selected the Oriented FAST and Rotated BRIEF (ORB) detector to extract local image structures. The descriptor selected in a multi-exposure image fusion algorithm has to be fast and robust to illumination variations and geometric deformations. The ORB descriptor is the best candidate in our algorithm. Further, we perform an improved RANdom SAmple Consensus (RANSAC) algorithm to reject incorrect matches. For the fusion of images, a new approach based on the Stationary Wavelet Transform (SWT) is used. The experimental results demonstrate that the proposed algorithm generates high quality images at low computational cost. Comparisons with a number of other feature matching methods show that our method gets better performance.
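The match-rejection step can be illustrated with a plain RANSAC loop. This is a minimal sketch only: it assumes a pure translation between the two exposures and already-matched keypoint pairs, whereas the abstract describes an improved RANSAC whose details are not given here. The function name `ransac_translation`, the tolerance, and the sample matches are hypothetical.

```python
import random

def ransac_translation(matches, tol=1.0, iterations=50, seed=0):
    """Estimate a 2-D shift from non-empty keypoint matches, rejecting outliers.

    matches: list of ((x0, y0), (x1, y1)) pairs from two images.
    Returns (shift, inliers), where shift is the mean inlier displacement.
    """
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        # One match is enough to hypothesise a translation.
        (x0, y0), (x1, y1) = rng.choice(matches)
        dx, dy = x1 - x0, y1 - y0
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - dx) <= tol
                   and abs(m[1][1] - m[0][1] - dy) <= tol]
        if len(inliers) > len(best):
            best = inliers
    n = len(best)
    shift = (sum(b[0] - a[0] for a, b in best) / n,
             sum(b[1] - a[1] for a, b in best) / n)
    return shift, best

# Hypothetical matches: eight consistent with a shift of (3, -2), three incorrect.
good = [((i, 2 * i), (i + 3, 2 * i - 2)) for i in range(8)]
bad = [((0, 0), (10, 10)), ((1, 1), (-5, 4)), ((2, 2), (7, -9))]
shift, inliers = ransac_translation(good + bad)
print(shift, len(inliers))
```

A real registration pipeline would typically fit a homography or affine model from several matches per hypothesis; the single-match translation model is used here only to keep the consensus idea visible.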
Geometric quantification of features in large flow fields.
Kendall, Wesley; Huang, Jian; Peterka, Tom
2012-01-01
Interactive exploration of flow features in large-scale 3D unsteady-flow data is one of the most challenging visualization problems today. To comprehensively explore the complex feature spaces in these datasets, a proposed system employs a scalable framework for investigating a multitude of characteristics from traced field lines. This capability supports the examination of various neighborhood-based geometric attributes in concert with other scalar quantities. Such an analysis wasn't previously possible because of the large computational overhead and I/O requirements. The system integrates visual analytics methods by letting users procedurally and interactively describe and extract high-level flow features. An exploration of various phenomena in a large global ocean-modeling simulation demonstrates the approach's generality and expressiveness as well as its efficacy.
Night vision in barn owls: visual acuity and contrast sensitivity under dark adaptation.
Orlowski, Julius; Harmening, Wolf; Wagner, Hermann
2012-12-06
Barn owls are effective nocturnal predators. We tested their visual performance at low light levels and determined visual acuity and contrast sensitivity of three barn owls by their behavior at stimulus luminances ranging from photopic to fully scotopic levels (23.5 to 1.5 × 10⁻⁶ cd/m²). Contrast sensitivity and visual acuity decreased only slightly from photopic to scotopic conditions. Peak grating acuity was at mesopic (4 × 10⁻² cd/m²) conditions. Barn owls retained a quarter of their maximal acuity when luminance decreased by 5.5 log units. We argue that the visual system of barn owls is designed to yield as much visual acuity under low light conditions as possible, thereby sacrificing resolution at photopic conditions.
Enhanced HMAX model with feedforward feature learning for multiclass categorization.
Li, Yinlin; Wu, Wei; Zhang, Bo; Li, Fengfu
2015-01-01
In recent years, interdisciplinary research between neuroscience and computer vision has promoted development in both fields. Many biologically inspired visual models have been proposed, and among them, the Hierarchical Max-pooling model (HMAX) is a feedforward model mimicking the structures and functions of V1 to posterior inferotemporal (PIT) layer of the primate visual cortex, which could generate a series of position- and scale-invariant features. However, it could be improved with attention modulation and memory processing, which are two important properties of the primate visual cortex. Thus, in this paper, based on recent biological research on the primate visual cortex, we continue to mimic the first 100-150 ms of visual cognition to enhance the HMAX model, focusing mainly on the unsupervised feedforward feature learning process. The main modifications are as follows: (1) To mimic the attention modulation mechanism of V1 layer, a bottom-up saliency map is computed in the S1 layer of the HMAX model, which can support the initial feature extraction for memory processing; (2) To mimic the learning, clustering and short-term memory to long-term memory conversion abilities of V2 and IT, an unsupervised iterative clustering method is used to learn clusters with multiscale middle level patches, which are taken as long-term memory; (3) Inspired by the multiple feature encoding mode of the primate visual cortex, information including color, orientation, and spatial position is encoded in different layers of the HMAX model progressively. By adding a softmax layer at the top of the model, multiclass categorization experiments can be conducted, and the results on Caltech101 show that the enhanced model with a smaller memory size exhibits higher accuracy than the original HMAX model, and could also achieve better accuracy than other unsupervised feature learning methods in multiclass categorization task.
Creating visual explanations improves learning.
Bobek, Eliza; Tversky, Barbara
2016-01-01
Many topics in science are notoriously difficult for students to learn. Mechanisms and processes outside student experience present particular challenges. While instruction typically involves visualizations, students usually explain in words. Because visual explanations can show parts and processes of complex systems directly, creating them should have benefits beyond creating verbal explanations. We compared learning from creating visual or verbal explanations for two STEM domains, a mechanical system (bicycle pump) and a chemical system (bonding). Both kinds of explanations were analyzed for content, and learning was assessed by a post-test. For the mechanical system, creating a visual explanation increased understanding particularly for participants of low spatial ability. For the chemical system, creating both visual and verbal explanations improved learning without new teaching. Creating a visual explanation was superior and benefitted participants of both high and low spatial ability. Visual explanations often included crucial yet invisible features. The greater effectiveness of visual explanations appears attributable to the checks they provide for completeness and coherence as well as to their roles as platforms for inference. The benefits should generalize to other domains like the social sciences, history, and archeology where important information can be visualized. Together, the findings provide support for the use of learner-generated visual explanations as a powerful learning tool.
Aiello, Marilena; Merola, Sheila; Lasaponara, Stefano; Pinto, Mario; Tomaiuolo, Francesco; Doricchi, Fabrizio
2018-01-31
The possibility of allocating attentional resources to the "global" shape or to the "local" details of pictorial stimuli helps visual processing. Investigations with hierarchical Navon letters, that are large "global" letters made up of small "local" ones, consistently demonstrate a right hemisphere advantage for global processing and a left hemisphere advantage for local processing. Here we investigated how the visual and phonological features of the global and local components of Navon letters influence these hemispheric advantages. In a first study in healthy participants, we contrasted the hemispheric processing of hierarchical letters with global and local items competing for response selection, to the processing of hierarchical letters in which a letter, a false-letter conveying no phonological information or a geometrical shape presented at the unattended level did not compete for response selection. In a second study, we investigated the hemispheric processing of hierarchical stimuli in which global and local letters were both visually and phonologically congruent (e.g. large uppercase G made of smaller uppercase G), visually incongruent and phonologically congruent (e.g. large uppercase G made of small lowercase g) or visually incongruent and phonologically incongruent (e.g. large uppercase G made of small lowercase or uppercase M). In a third study, we administered the same tasks to a right brain damaged patient with a lesion involving pre-striate areas engaged by global processing. The results of the first two experiments showed that the global abilities of the left hemisphere are limited because of its strong susceptibility to interference from local letters even when these are irrelevant to the task. Phonological features played a crucial role in this interference because the interference was entirely maintained also when letters at the global and local level were presented in different uppercase vs. lowercase formats. 
In contrast, when local features conveyed no phonological information, the left hemisphere showed preserved global processing abilities. These findings were supported by the study of the right brain damaged patient. These results offer a new look at the hemispheric dominance in the attentional processing of the global and local levels of hierarchical stimuli. Copyright © 2017 Elsevier Ltd. All rights reserved.
Spering, Miriam; Carrasco, Marisa
2012-01-01
Feature-based attention enhances visual processing and improves perception, even for visual features that we are not aware of. Does feature-based attention also modulate motor behavior in response to visual information that does or does not reach awareness? Here we compare the effect of feature-based attention on motion perception and smooth pursuit eye movements in response to moving dichoptic plaids–stimuli composed of two orthogonally-drifting gratings, presented separately to each eye–in human observers. Monocular adaptation to one grating prior to the presentation of both gratings renders the adapted grating perceptually weaker than the unadapted grating and decreases the level of awareness. Feature-based attention was directed to either the adapted or the unadapted grating’s motion direction or to both (neutral condition). We show that observers were better in detecting a speed change in the attended than the unattended motion direction, indicating that they had successfully attended to one grating. Speed change detection was also better when the change occurred in the unadapted than the adapted grating, indicating that the adapted grating was perceptually weaker. In neutral conditions, perception and pursuit in response to plaid motion were dissociated: While perception followed one grating’s motion direction almost exclusively (component motion), the eyes tracked the average of both gratings (pattern motion). In attention conditions, perception and pursuit were shifted towards the attended component. These results suggest that attention affects perception and pursuit similarly even though only the former reflects awareness. The eyes can track an attended feature even if observers do not perceive it. PMID:22649238
Spering, Miriam; Carrasco, Marisa
2012-05-30
Feature-based attention enhances visual processing and improves perception, even for visual features that we are not aware of. Does feature-based attention also modulate motor behavior in response to visual information that does or does not reach awareness? Here we compare the effect of feature-based attention on motion perception and smooth-pursuit eye movements in response to moving dichoptic plaids--stimuli composed of two orthogonally drifting gratings, presented separately to each eye--in human observers. Monocular adaptation to one grating before the presentation of both gratings renders the adapted grating perceptually weaker than the unadapted grating and decreases the level of awareness. Feature-based attention was directed to either the adapted or the unadapted grating's motion direction or to both (neutral condition). We show that observers were better at detecting a speed change in the attended than the unattended motion direction, indicating that they had successfully attended to one grating. Speed change detection was also better when the change occurred in the unadapted than the adapted grating, indicating that the adapted grating was perceptually weaker. In neutral conditions, perception and pursuit in response to plaid motion were dissociated: While perception followed one grating's motion direction almost exclusively (component motion), the eyes tracked the average of both gratings (pattern motion). In attention conditions, perception and pursuit were shifted toward the attended component. These results suggest that attention affects perception and pursuit similarly even though only the former reflects awareness. The eyes can track an attended feature even if observers do not perceive it.
NASA Astrophysics Data System (ADS)
Starkey, Eleanor; Barnes, Mhari; Quinn, Paul; Large, Andy
2016-04-01
Pressures associated with flooding and climate change have significantly increased over recent years. Natural Flood Risk Management (NFRM) is now seen as being a more appropriate and favourable approach in some locations. At the same time, catchment managers are also encouraged to adopt a more integrated, evidence-based and bottom-up approach. This includes engaging with local communities. Although NFRM features are being more readily installed, there is still limited evidence associated with their ability to reduce flood risk and offer multiple benefits. In particular, local communities and land owners are still uncertain about what the features entail and how they will perform, which is a huge barrier affecting widespread uptake. Traditional hydrometric monitoring techniques are well established but they still struggle to successfully monitor and capture NFRM performance spatially and temporally in a visual and more meaningful way for those directly affected on the ground. Two UK-based case studies are presented here where unique NFRM features have been carefully designed and installed in rural headwater catchments. This includes a 1 km² sub-catchment of the Haltwhistle Burn (northern England) and a 2 km² sub-catchment of Eddleston Water (southern Scotland). Both of these pilot sites are subject to prolonged flooding in winter and flash flooding in summer. This exacerbates sediment, debris and water quality issues downstream. Examples of NFRM features include ponds, woody debris and a log feature inspired by the children's game 'Kerplunk'. They have been tested and monitored over the 2015-2016 winter storms using low-cost techniques by both researchers and members of the community ('citizen scientists'). Results show that monitoring techniques such as regular consumer specification time-lapse cameras, photographs, videos and 'kite-cams' are suitable for long-term and low-cost monitoring of a variety of NFRM features.
These techniques have been compared against traditional hydrometric monitoring equipment. It is clear that traditional techniques are expensive, require specialist skills and outputs are complicated to the untrained eye. These alternative methods tested are visually more meaningful, can be interpreted by all stakeholders and techniques can be easily utilised by citizen scientists, land owners or flood groups. Such techniques therefore offer a before, during and after NFRM monitoring solution which can be more realistically and readily implemented, supports engagement and subsequent uptake and maintenance of NFRM features on a local level. Although monitoring techniques presented are relatively simple, they are regarded as being essential given that many schemes are not monitored at all.
Prestimulus EEG Power Predicts Conscious Awareness But Not Objective Visual Performance
Veniero, Domenica
2017-01-01
Prestimulus oscillatory neural activity has been linked to perceptual outcomes during performance of psychophysical detection and discrimination tasks. Specifically, the power and phase of low frequency oscillations have been found to predict whether an upcoming weak visual target will be detected or not. However, the mechanisms by which baseline oscillatory activity influences perception remain unclear. Recent studies suggest that the frequently reported negative relationship between α power and stimulus detection may be explained by changes in detection criterion (i.e., increased target present responses regardless of whether the target was present/absent) driven by the state of neural excitability, rather than changes in visual sensitivity (i.e., more veridical percepts). Here, we recorded EEG while human participants performed a luminance discrimination task on perithreshold stimuli in combination with single-trial ratings of perceptual awareness. Our aim was to investigate whether the power and/or phase of prestimulus oscillatory activity predict discrimination accuracy and/or perceptual awareness on a trial-by-trial basis. Prestimulus power (3–28 Hz) was inversely related to perceptual awareness ratings (i.e., higher ratings in states of low prestimulus power/high excitability) but did not predict discrimination accuracy. In contrast, prestimulus oscillatory phase did not predict awareness ratings or accuracy in any frequency band. These results provide evidence that prestimulus α power influences the level of subjective awareness of threshold visual stimuli but does not influence visual sensitivity when a decision has to be made regarding stimulus features. Hence, we find a clear dissociation between the influence of ongoing neural activity on conscious awareness and objective performance. PMID:29255794
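The distinction the abstract draws between detection criterion and visual sensitivity is commonly quantified with the signal detection measures d′ and c. The sketch below assumes the standard equal-variance Gaussian model, with d′ = z(H) − z(FA) and c = −(z(H) + z(FA))/2; the function name `sdt_measures` and the hit and false-alarm rates are hypothetical and are not taken from the study.

```python
from statistics import NormalDist

def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Rates must lie strictly between 0 and 1; inv_cdf raises otherwise.
    """
    z = NormalDist().inv_cdf
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion = -0.5 * (z_hit + z_fa)
    return d_prime, criterion

# Hypothetical rates (illustrative only): an unbiased observer, and a state in
# which "target present" is reported more often whether or not a target appeared.
d_neutral, c_neutral = sdt_measures(0.8, 0.2)
d_liberal, c_liberal = sdt_measures(0.9, 0.4)

print(f"neutral: d' = {d_neutral:.2f}, c = {c_neutral:.2f}")
print(f"liberal: d' = {d_liberal:.2f}, c = {c_liberal:.2f}")
```

A negative c marks a liberal criterion, that is, more "present" responses overall, and this can occur without any gain in d′, which is the pattern the criterion account attributes to states of low α power.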
Efficient receptive field tiling in primate V1
Nauhaus, Ian; Nielsen, Kristina J.; Callaway, Edward M.
2017-01-01
The primary visual cortex (V1) encodes a diverse set of visual features, including orientation, ocular dominance (OD) and spatial frequency (SF), whose joint organization must be precisely structured to optimize coverage within the retinotopic map. Prior experiments have only identified efficient coverage based on orthogonal maps. Here, we used two-photon calcium imaging to reveal an alternative arrangement for OD and SF maps in macaque V1; their gradients run parallel but with unique spatial periods, whereby low SF regions coincide with monocular regions. Next, we mapped receptive fields and find surprisingly precise micro-retinotopy that yields a smaller point-image and requires more efficient inter-map geometry, thus underscoring the significance of map relationships. While smooth retinotopy is constraining, studies suggest that it improves both wiring economy and the V1 population code read downstream. Altogether, these data indicate that connectivity within V1 is finely tuned and precise at the level of individual neurons. PMID:27499086
Schettino, Antonio; Keil, Andreas; Porcu, Emanuele; Müller, Matthias M
2016-06-01
The rapid extraction of affective cues from the visual environment is crucial for flexible behavior. Previous studies have reported emotion-dependent amplitude modulations of two event-related potential (ERP) components - the N1 and EPN - reflecting sensory gain control mechanisms in extrastriate visual areas. However, it is unclear whether both components are selective electrophysiological markers of attentional orienting toward emotional material or are also influenced by physical features of the visual stimuli. To address this question, electrical brain activity was recorded from seventeen male participants while viewing original and bright versions of neutral and erotic pictures. Bright neutral scenes were rated as more pleasant compared to their original counterpart, whereas erotic scenes were judged more positively when presented in their original version. Classical and mass univariate ERP analysis showed larger N1 amplitude for original relative to bright erotic pictures, with no differences for original and bright neutral scenes. Conversely, the EPN was only modulated by picture content and not by brightness, substantiating the idea that this component is a unique electrophysiological marker of attention allocation toward emotional material. Complementary topographic analysis revealed the early selective expression of a centro-parietal positivity following the presentation of original erotic scenes only, reflecting the recruitment of neural networks associated with sustained attention and facilitated memory encoding for motivationally relevant material. Overall, these results indicate that neural networks subtending the extraction of emotional information are differentially recruited depending on low-level perceptual features, which ultimately influence affective evaluations. Copyright © 2016 Elsevier Inc. All rights reserved.
Taylor, Kirsten I; Devereux, Barry J; Acres, Kadia; Randall, Billi; Tyler, Lorraine K
2012-03-01
Conceptual representations are at the heart of our mental lives, involved in every aspect of cognitive functioning. Despite their centrality, a long-standing debate persists as to how the meanings of concepts are represented and processed. Many accounts agree that the meanings of concrete concepts are represented by their individual features, but disagree about the importance of different feature-based variables: some views stress the importance of the information carried by distinctive features in conceptual processing, others the features which are shared over many concepts, and still others the extent to which features co-occur. We suggest that previously disparate theoretical positions and experimental findings can be unified by an account which claims that task demands determine how concepts are processed in addition to the effects of feature distinctiveness and co-occurrence. We tested these predictions in a basic-level naming task which relies on distinctive feature information (Experiment 1) and a domain decision task which relies on shared feature information (Experiment 2). Both used large-scale regression designs with the same visual objects, and mixed-effects models incorporating participant, session, stimulus-related and feature statistic variables to model the performance. We found that concepts with relatively more distinctive and more highly correlated distinctive relative to shared features facilitated basic-level naming latencies, while concepts with relatively more shared and more highly correlated shared relative to distinctive features speeded domain decisions. These findings demonstrate that the feature statistics of distinctiveness (shared vs. distinctive) and correlational strength, as well as the task demands, determine how concept meaning is processed in the conceptual system. Copyright © 2011 Elsevier B.V. All rights reserved.
StreamExplorer: A Multi-Stage System for Visually Exploring Events in Social Streams.
Wu, Yingcai; Chen, Zhutian; Sun, Guodao; Xie, Xiao; Cao, Nan; Liu, Shixia; Cui, Weiwei
2017-10-18
Analyzing social streams is important for many applications, such as crisis management. However, the considerable diversity, increasing volume, and high dynamics of social streams of large events continue to be significant challenges that must be overcome to ensure effective exploration. We propose a novel framework by which to handle complex social streams on a budget PC. This framework features two components: 1) an online method to detect important time periods (i.e., subevents), and 2) a tailored GPU-assisted Self-Organizing Map (SOM) method, which clusters the tweets of subevents stably and efficiently. Based on the framework, we present StreamExplorer to facilitate the visual analysis, tracking, and comparison of a social stream at three levels. At a macroscopic level, StreamExplorer uses a new glyph-based timeline visualization, which presents a quick multi-faceted overview of the ebb and flow of a social stream. At a mesoscopic level, a map visualization is employed to visually summarize the social stream from either a topical or geographical aspect. At a microscopic level, users can employ interactive lenses to visually examine and explore the social stream from different perspectives. Two case studies and a task-based evaluation are used to demonstrate the effectiveness and usefulness of StreamExplorer.
The change in critical technologies for computational physics
NASA Technical Reports Server (NTRS)
Watson, Val
1990-01-01
It is noted that the types of technology required for computational physics are changing as the field matures. Emphasis has shifted from computer technology to algorithm technology and, finally, to visual analysis technology as areas of critical research for this field. High-performance graphical workstations tied to a supercomputer with high-speed communications, along with the development of specially tailored visualization software, have enabled analysis of highly complex fluid-dynamics simulations. Particular reference is made here to the development of visual analysis tools at NASA's Numerical Aerodynamics Simulation Facility. The next technology which this field requires is one that would eliminate visual clutter by extracting key features of simulations of physics and technology in order to create displays that clearly portray these key features. Research in the tuning of visual displays to human cognitive abilities is proposed. The immediate transfer of technology to all levels of computers, specifically the inclusion of visualization primitives in basic software developments for all work stations and PCs, is recommended.
ERIC Educational Resources Information Center
Smith, Philip A.; Webb, Geoffrey I.
2000-01-01
Describes "Glass-box Interpreter" a low-level program visualization tool called Bradman designed to provide a conceptual model of C program execution for novice programmers and makes visible aspects of the programming process normally hidden from the user. Presents an experiment that tests the efficacy of Bradman, and provides…
High-resolution Self-Organizing Maps for advanced visualization and dimension reduction.
Saraswati, Ayu; Nguyen, Van Tuc; Hagenbuchner, Markus; Tsoi, Ah Chung
2018-05-04
Kohonen's Self Organizing feature Map (SOM) provides an effective way to project high dimensional input features onto a low dimensional display space while preserving the topological relationships among the input features. Recent advances in algorithms that take advantage of modern computing hardware introduced the concept of high resolution SOMs (HRSOMs). This paper investigates the capabilities and applicability of the HRSOM as a visualization tool for cluster analysis and its suitability to serve as a pre-processor in ensemble learning models. The evaluation is conducted on a number of established benchmarks and real-world learning problems, namely, the policeman benchmark, two web spam detection problems, a network intrusion detection problem, and a malware detection problem. It is found that the visualization resulting from an HRSOM provides new insights concerning these learning problems. It is furthermore shown empirically that broad benefits from the use of HRSOMs in both clustering and classification problems can be expected. Copyright © 2018 Elsevier Ltd. All rights reserved.
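To make the projection described in the abstract above concrete, the following is a minimal sketch of classic online SOM training on a small rectangular grid. It is not the HRSOM implementation from the paper; the function names (`train_som`, `best_matching_unit`), the linear decay schedules, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose prototype is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda node: sum((wk - xk) ** 2 for wk, xk in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a tiny rectangular SOM; returns {(row, col): prototype vector}.

    Hypothetical helper: learning rate and neighbourhood width decay linearly,
    which is one common choice, not necessarily the one used for HRSOMs.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Initialise each prototype as a copy of a randomly chosen training sample.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull the prototype toward the sample, weighted by grid distance to the BMU.
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Toy usage: two loose groups of 2-D points mapped onto a 1 x 2 grid.
toy_data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
prototypes = train_som(toy_data, rows=1, cols=2, epochs=30, seed=1)
print(prototypes)
```

Because every update is a convex combination of the old prototype and a sample, prototypes stay inside the range spanned by the data, and neighbouring grid nodes are pulled toward similar inputs, which is what preserves topology on the map.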
Katwal, Santosh B; Gore, John C; Marois, Rene; Rogers, Baxter P
2013-09-01
We present novel graph-based visualizations of self-organizing maps for unsupervised functional magnetic resonance imaging (fMRI) analysis. A self-organizing map is an artificial neural network model that transforms high-dimensional data into a low-dimensional (often a 2-D) map using unsupervised learning. However, a postprocessing scheme is necessary to correctly interpret similarity between neighboring node prototypes (feature vectors) on the output map and delineate clusters and features of interest in the data. In this paper, we used graph-based visualizations to capture fMRI data features based upon 1) the distribution of data across the receptive fields of the prototypes (density-based connectivity); and 2) temporal similarities (correlations) between the prototypes (correlation-based connectivity). We applied this approach to identify task-related brain areas in an fMRI reaction time experiment involving a visuo-manual response task, and we correlated the time-to-peak of the fMRI responses in these areas with reaction time. Visualization of self-organizing maps outperformed independent component analysis and voxelwise univariate linear regression analysis in identifying and classifying relevant brain regions. We conclude that the graph-based visualizations of self-organizing maps help in advanced visualization of cluster boundaries in fMRI data enabling the separation of regions with small differences in the timings of their brain responses.
Design and evaluation of a kitchen for persons with visual impairments.
Kutintara, Benjamas; Somboon, Pornpun; Buasri, Virajada; Srettananurak, Metinee; Jedeeyod, Piyanooch; Pornpratoom, Kittikan; Iam-cham, Veraya
2013-03-01
Visually impaired people need skills on daily living, such as cooking, and Ratchasuda College offers independent living training for them. In order to fulfill their needs, a suitable kitchen should be designed with the consideration of their limitations. The objective of this study was to design and evaluate a kitchen for persons with visual impairments. Before designing the kitchen, interviews and an observation were carried out to obtain information on the needs of blind and low vision persons. Consequently, a kitchen model was developed, and it was evaluated by 10 persons with visual impairments. After the design improvement, the kitchen was built and has been routinely used for training persons with visual impairments to prepare meals. Finally, a post-occupancy evaluation of the kitchen was conducted by observing and interviewing both trainers and those with visual impairments during the food preparation training. The results of the study indicated that kitchens for persons with visual impairments should have safety and usability features. The results of the post-occupancy evaluation showed that those who attended cooking courses were able to cook safely in the kitchen. However, the kitchen still had limitations in some features.
Feature-Specific Organization of Feedback Pathways in Mouse Visual Cortex.
Huh, Carey Y L; Peach, John P; Bennett, Corbett; Vega, Roxana M; Hestrin, Shaul
2018-01-08
Higher and lower cortical areas in the visual hierarchy are reciprocally connected [1]. Although much is known about how feedforward pathways shape receptive field properties of visual neurons, relatively little is known about the role of feedback pathways in visual processing. Feedback pathways are thought to carry top-down signals, including information about context (e.g., figure-ground segmentation and surround suppression) [2-5], and feedback has been demonstrated to sharpen orientation tuning of neurons in the primary visual cortex (V1) [6, 7]. However, the response characteristics of feedback neurons themselves and how feedback shapes V1 neurons' tuning for other features, such as spatial frequency (SF), remain largely unknown. Here, using a retrograde virus, targeted electrophysiological recordings, and optogenetic manipulations, we show that putatively feedback neurons in layer 5 (hereafter "L5 feedback") in higher visual areas, AL (anterolateral area) and PM (posteromedial area), display distinct visual properties in awake head-fixed mice. AL L5 feedback neurons prefer significantly lower SF (mean: 0.04 cycles per degree [cpd]) compared to PM L5 feedback neurons (0.15 cpd). Importantly, silencing AL L5 feedback reduced visual responses of V1 neurons preferring low SF (mean change in firing rate: -8.0%), whereas silencing PM L5 feedback suppressed responses of high-SF-preferring V1 neurons (-20.4%). These findings suggest that feedback connections from higher visual areas convey distinctly tuned visual inputs to V1 that serve to boost V1 neurons' responses to SF. Such like-to-like functional organization may represent an important feature of feedback pathways in sensory systems and in the nervous system in general. Copyright © 2017 Elsevier Ltd. All rights reserved.
Preparatory attention in visual cortex.
Battistoni, Elisa; Stein, Timo; Peelen, Marius V
2017-05-01
Top-down attention is the mechanism that allows us to selectively process goal-relevant aspects of a scene while ignoring irrelevant aspects. A large body of research has characterized the effects of attention on neural activity evoked by a visual stimulus. However, attention also includes a preparatory phase before stimulus onset in which the attended dimension is internally represented. Here, we review neurophysiological, functional magnetic resonance imaging, magnetoencephalography, electroencephalography, and transcranial magnetic stimulation (TMS) studies investigating the neural basis of preparatory attention, both when attention is directed to a location in space and when it is directed to nonspatial stimulus attributes (content-based attention) ranging from low-level features to object categories. Results show that both spatial and content-based attention lead to increased baseline activity in neural populations that selectively code for the attended attribute. TMS studies provide evidence that this preparatory activity is causally related to subsequent attentional selection and behavioral performance. Attention thus acts by preactivating selective neurons in the visual cortex before stimulus onset. This appears to be a general mechanism that can operate on multiple levels of representation. We discuss the functional relevance of this mechanism, its limitations, and its relation to working memory, imagery, and expectation. We conclude by outlining open questions and future directions. © 2017 New York Academy of Sciences.
Feedforward object-vision models only tolerate small image variations compared to human
Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi
2014-01-01
Invariant object recognition is a remarkable ability of the primate visual system whose underlying mechanism has long been under intense investigation. Computational modeling is a valuable tool toward understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performances on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that making sparse representation of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performances. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex. PMID:25100986
Zsido, Andras N; Deak, Anita; Losonci, Adrienn; Stecina, Diana; Arato, Akos; Bernath, Laszlo
2018-04-01
Numerous objects and animals could be threatening, and thus, children learn to avoid them early. Spiders and syringes are among the most common targets of fears and phobias of the modern world. However, they are of different origins: while the former is evolutionarily relevant, the latter is not. We sought to investigate the underlying mechanisms that make the quick detection of such stimuli possible and enable the impulse to avoid them in the future. The respective categories of threatening and non-threatening targets were similar in shape, while low-level visual features were controlled. Our results showed that children found threatening cues faster, irrespective of the evolutionary age of the cues. However, they detected non-threatening evolutionary targets faster than non-evolutionary ones. We suggest that the underlying mechanism may be different: general feature detection can account for finding evolutionary threatening cues quickly, while specific feature detection is more appropriate for modern threatening stimuli. Copyright © 2018 Elsevier B.V. All rights reserved.
Chasing vs. Stalking: Interrupting the Perception of Animacy
ERIC Educational Resources Information Center
Gao, Tao; Scholl, Brian J.
2011-01-01
Visual experience involves not only physical features such as color and shape, but also higher-level properties such as animacy and goal-directedness. Perceiving animacy is an inherently dynamic experience, in part because agents' goal-directed behavior may be frequently in flux--unlike many of their physical properties. How does the visual system…
Medical image classification based on multi-scale non-negative sparse coding.
Zhang, Ruijie; Shen, Jian; Wei, Fushan; Li, Xiong; Sangaiah, Arun Kumar
2017-11-01
With the rapid development of modern medical imaging technology, medical image classification has become more and more important in medical diagnosis and clinical practice. Conventional medical image classification algorithms usually neglect the semantic gap between low-level features and high-level image semantics, which largely degrades the classification performance. To solve this problem, we propose a multi-scale non-negative sparse coding based medical image classification algorithm. Firstly, medical images are decomposed into multiple scale layers, so that diverse visual details can be extracted from different scale layers. Secondly, for each scale layer, the non-negative sparse coding model with Fisher discriminative analysis is constructed to obtain the discriminative sparse representation of medical images. Then, the obtained multi-scale non-negative sparse coding features are combined to form a multi-scale feature histogram as the final representation for a medical image. Finally, an SVM classifier is used to conduct medical image classification. The experimental results demonstrate that our proposed algorithm can effectively utilize multi-scale and contextual spatial information of medical images, reduce the semantic gap to a large degree and improve medical image classification performance. Copyright © 2017 Elsevier B.V. All rights reserved.
Landmark Image Retrieval by Jointing Feature Refinement and Multimodal Classifier Learning.
Zhang, Xiaoming; Wang, Senzhang; Li, Zhoujun; Ma, Shuai
2018-06-01
Landmark retrieval is to return a set of images with their landmarks similar to those of the query images. Existing studies on landmark retrieval focus on exploiting the geometries of landmarks for visual similarity matches. However, the visual content of social images is of large diversity in many landmarks, and also some images share common patterns over different landmarks. On the other side, it has been observed that social images usually contain multimodal contents, i.e., visual content and text tags, and each landmark has the unique characteristic of both visual content and text content. Therefore, the approaches based on similarity matching may not be effective in this environment. In this paper, we investigate whether the geographical correlation among the visual content and the text content could be exploited for landmark retrieval. In particular, we propose an effective multimodal landmark classification paradigm to leverage the multimodal contents of social image for landmark retrieval, which integrates feature refinement and landmark classifier with multimodal contents by a joint model. The geo-tagged images are automatically labeled for classifier learning. Visual features are refined based on low rank matrix recovery, and multimodal classification combined with group sparse is learned from the automatically labeled images. Finally, candidate images are ranked by combining classification result and semantic consistence measuring between the visual content and text content. Experiments on real-world datasets demonstrate the superiority of the proposed approach as compared to existing methods.
The correlation study of parallel feature extractor and noise reduction approaches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dewi, Deshinta Arrova; Sundararajan, Elankovan; Prabuwono, Anton Satria
2015-05-15
This paper presents literature reviews that show variety of techniques to develop parallel feature extractor and finding its correlation with noise reduction approaches for low light intensity images. Low light intensity images are normally displayed as darker images and low contrast. Without proper handling techniques, those images regularly become evidences of misperception of objects and textures, the incapability to section them. The visual illusions regularly clues to disorientation, user fatigue, poor detection and classification performance of humans and computer algorithms. Noise reduction approaches (NR) therefore is an essential step for other image processing steps such as edge detection, image segmentation, image compression, etc. Parallel Feature Extractor (PFE) meant to capture visual contents of images involves partitioning images into segments, detecting image overlaps if any, and controlling distributed and redistributed segments to extract the features. Working on low light intensity images make the PFE face challenges and closely depend on the quality of its pre-processing steps. Some papers have suggested many well established NR as well as PFE strategies however only few resources have suggested or mentioned the correlation between them. This paper reviews best approaches of the NR and the PFE with detailed explanation on the suggested correlation. This finding may suggest relevant strategies of the PFE development. With the help of knowledge based reasoning, computational approaches and algorithms, we present the correlation study between the NR and the PFE that can be useful for the development and enhancement of other existing PFE.
The functional impact of mental imagery on conscious perception
Pearson, Joel; Clifford, Colin; Tong, Frank
2008-01-01
Mental imagery has been proposed to contribute to a variety of high-level cognitive functions, including memory encoding and retrieval, navigation and spatial planning, and even social communication and language comprehension [1–5]. However, it is debated whether mental imagery relies on the same sensory representations as perception [1, 6–10], and if so, what functional consequences such an overlap might have on perception itself. We report novel evidence that single instances of imagery can have a pronounced facilitatory influence on subsequent conscious perception. Either seeing or imagining a specific pattern could strongly bias which of two competing stimuli reach awareness during binocular rivalry. Effects of imagery and perception were location- and orientation-specific, accumulated in strength over time, and survived an intervening visual task lasting several seconds prior to presentation of the rivalry display. Interestingly, effects of imagery differed from those of feature-based attention. The results demonstrate that imagery, in the absence of any incoming visual signals, leads to the formation of a short-term sensory trace that can bias future perception, suggesting a means by which high-level processes that support imagination and memory retrieval may shape low-level sensory representations. PMID:18583132
Multiple feature fusion via covariance matrix for visual tracking
NASA Astrophysics Data System (ADS)
Jin, Zefenfen; Hou, Zhiqiang; Yu, Wangsheng; Wang, Xin; Sun, Hui
2018-04-01
Aiming at the problem of complicated dynamic scenes in visual target tracking, a multi-feature fusion tracking algorithm based on covariance matrix is proposed to improve the robustness of the tracking algorithm. In the framework of quantum genetic algorithm, this paper uses the region covariance descriptor to fuse the color, edge and texture features. It also uses a fast covariance intersection algorithm to update the model. The low dimension of region covariance descriptor, the fast convergence speed and strong global optimization ability of quantum genetic algorithm, and the fast computation of fast covariance intersection algorithm are used to improve the computational efficiency of fusion, matching, and updating process, so that the algorithm achieves a fast and effective multi-feature fusion tracking. The experiments prove that the proposed algorithm can not only achieve fast and robust tracking but also effectively handle interference of occlusion, rotation, deformation, motion blur and so on.
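As an illustration of why a region covariance descriptor is low-dimensional, the sketch below computes one for a small grayscale patch. The per-pixel feature vector used here (position, intensity, absolute gradients) is an assumption for the example; the paper fuses color, edge and texture features, and the exact feature set, the quantum genetic search, and the covariance intersection update are not reproduced. The function name `region_covariance` is hypothetical.

```python
def region_covariance(image):
    """Covariance of per-pixel features [x, y, I, |dI/dx|, |dI/dy|] over a patch.

    `image` is a list of rows of grayscale values. The result is a d x d matrix
    (d = 5 here) regardless of how many pixels the patch contains.
    """
    height, width = len(image), len(image[0])
    feats = []
    for y in range(height):
        for x in range(width):
            # Central differences, with indices clamped at the patch border.
            gx = (image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]) / 2.0
            gy = (image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]) / 2.0
            feats.append([float(x), float(y), float(image[y][x]), abs(gx), abs(gy)])
    n, d = len(feats), len(feats[0])
    mean = [sum(f[k] for f in feats) / n for k in range(d)]
    # Unbiased sample covariance between every pair of feature channels.
    return [
        [
            sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in feats) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


# Toy usage on a 3 x 3 intensity ramp.
patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
descriptor = region_covariance(patch)
for row in descriptor:
    print(row)
```

The descriptor size depends only on the number of feature channels, so adding an edge or texture channel grows the matrix from d x d to (d+1) x (d+1) while the cost of comparing two regions stays independent of region size.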
Multilevel analysis of sports video sequences
NASA Astrophysics Data System (ADS)
Han, Jungong; Farin, Dirk; de With, Peter H. N.
2006-01-01
We propose a fully automatic and flexible framework for analysis and summarization of tennis broadcast video sequences, using visual features and specific game-context knowledge. Our framework can analyze a tennis video sequence at three levels, which provides a broad range of different analysis results. The proposed framework includes novel pixel-level and object-level tennis video processing algorithms, such as a moving-player detection taking both the color and the court (playing-field) information into account, and a player-position tracking algorithm based on a 3-D camera model. Additionally, we employ scene-level models for detecting events, like service, base-line rally and net-approach, based on a number of real-world visual features. The system can summarize three forms of information: (1) all court-view playing frames in a game, (2) the moving trajectory and real-speed of each player, as well as relative position between the player and the court, (3) the semantic event segments in a game. The proposed framework is flexible in choosing the level of analysis that is desired. It is effective because the framework makes use of several visual cues obtained from the real-world domain to model important events like service, thereby increasing the accuracy of the scene-level analysis. The paper presents attractive experimental results highlighting the system efficiency and analysis capabilities.
Potts, Geoffrey F; Wood, Susan M; Kothmann, Delia; Martin, Laura E
2008-10-21
Attention directs limited-capacity information processing resources to a subset of available perceptual representations. The mechanisms by which attention selects task-relevant representations for preferential processing are not fully known. Treisman and Gelade's [Treisman, A., Gelade, G., 1980. A feature integration theory of attention. Cognit. Psychol. 12, 97-136.] influential attention model posits that simple features are processed preattentively, in parallel, but that attention is required to serially conjoin multiple features into an object representation. Event-related potentials have provided evidence for this model showing parallel processing of perceptual features in the posterior Selection Negativity (SN) and serial, hierarchic processing of feature conjunctions in the Frontal Selection Positivity (FSP). Most prior studies have been done on conjunctions within one sensory modality while many real-world objects have multimodal features. It is not known if the same neural systems of posterior parallel processing of simple features and frontal serial processing of feature conjunctions seen within a sensory modality also operate on conjunctions between modalities. The current study used ERPs and simultaneously presented auditory and visual stimuli in three task conditions: Attend Auditory (auditory feature determines the target, visual features are irrelevant), Attend Visual (visual features relevant, auditory irrelevant), and Attend Conjunction (target defined by the co-occurrence of an auditory and a visual feature). In the Attend Conjunction condition when the auditory but not the visual feature was a target there was an SN over auditory cortex, when the visual but not auditory stimulus was a target there was an SN over visual cortex, and when both auditory and visual stimuli were targets (i.e. conjunction target) there were SNs over both auditory and visual cortex, indicating parallel processing of the simple features within each modality. 
In contrast, an FSP was present when either the visual only or both auditory and visual features were targets, but not when only the auditory stimulus was a target, indicating that the conjunction target determination was evaluated serially and hierarchically with visual information taking precedence. This indicates that the detection of a target defined by audio-visual conjunction is achieved via the same mechanism as within a single perceptual modality, through separate, parallel processing of the auditory and visual features and serial processing of the feature conjunction elements, rather than by evaluation of a fused multimodal percept.
Visual short-term memory for oriented, colored objects.
Shin, Hongsup; Ma, Wei Ji
2017-08-01
A central question in the study of visual short-term memory (VSTM) has been whether its basic units are objects or features. Most studies addressing this question have used change detection tasks in which the feature value before the change is highly discriminable from the feature value after the change. This approach assumes that memory noise is negligible, which recent work has shown not to be the case. Here, we investigate VSTM for orientation and color within a noisy-memory framework, using change localization with a variable magnitude of change. A specific consequence of the noise is that it is necessary to model the inference (decision) stage. We find that (a) orientation and color have independent pools of memory resource (consistent with classic results); (b) an irrelevant feature dimension is either encoded but ignored during decision-making, or encoded with low precision and taken into account during decision-making; and (c) total resource available in a given feature dimension is lower in the presence of task-relevant stimuli that are neutral in that feature dimension. We propose a framework in which feature resource comes both in packaged and in targeted form.
Solaimani, K; Amri, M A Hadian
2008-08-01
The aim of this study was to assess the capability of Indian Remote Sensing (IRS) 1D data to detect erosion features created by run-off. The ability of PAN digital data from the IRS-1D satellite to extract erosion features was evaluated in the Nour-roud catchment, located in Mazandaran province, Iran, using GIS techniques. The research method was based on supervised digital classification using the MLC algorithm and on visual interpretation using PMU analysis; the two were then evaluated and compared. Results indicated that, in contrast to digital classification, which reached an overall accuracy of 40.02% and a kappa coefficient of 31.35% owing to low spectral resolution, visual interpretation and classification benefited from the high spatial resolution (5.8 m) and made it possible to classify erosion features from these data. These features corresponded so closely with the lithology, slope and hydrograph lines in the GIS that their boundaries can be considered to overlap. Field control also showed that these data are relatively well suited to investigating erosion features with this method and, in particular, can be applied to identify large erosion features.
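The overall accuracy and kappa coefficient quoted in the abstract above are standard summaries of a classification confusion matrix. The sketch below shows how both are conventionally computed (Cohen's kappa); the function name `accuracy_and_kappa` and the 2 x 2 matrix are invented for illustration and are not data from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example (not taken from the paper).
example = [[20, 5], [10, 15]]
acc, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {acc:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance from the class totals, which is why it is reported alongside, and is typically lower than, the raw overall accuracy.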
Global Sensory Qualities and Aesthetic Experience in Music
Brattico, Pauli; Brattico, Elvira; Vuust, Peter
2017-01-01
A well-known tradition in the study of visual aesthetics holds that the experience of visual beauty is grounded in global computational or statistical properties of the stimulus, for example, scale-invariant Fourier spectrum or self-similarity. Some approaches rely on neural mechanisms, such as efficient computation, processing fluency, or the responsiveness of the cells in the primary visual cortex. These proposals are united by the fact that the contributing factors are hypothesized to be global (i.e., they concern the percept as a whole), formal or non-conceptual (i.e., they concern form instead of content), computational and/or statistical, and based on relatively low-level sensory properties. Here we consider that the study of aesthetic responses to music could benefit from the same approach. Thus, along with local features such as pitch, tuning, consonance/dissonance, harmony, timbre, or beat, also global sonic properties could be viewed as contributing toward creating an aesthetic musical experience. Several such properties are discussed and their neural implementation is reviewed in the light of recent advances in neuroaesthetics. PMID:28424573
Nurmohamadi, Maryam; Pourghassem, Hossein
2014-05-01
The utilization of antibiotics produced by Clavulanic acid (CA) is an increasing need in medicine and industry. Usually, CA is produced by fermentation of Streptomyces clavuligerus (SC) bacteria. Analysis of the visual and morphological features of SC bacteria is an appropriate way to estimate CA production. In this paper, an automatic and fast algorithm is proposed for estimating the CA production level from visual and structural features of SC bacteria, replacing statistical methods and experimental evaluation by a microbiologist. In this algorithm, structural features such as the number of newborn branches, hyphal thickness and bacterial density, as well as color features such as acceptance color levels, are extracted from the SC bacteria. Moreover, the pH and biomass of the medium, provided by microbiologists, are considered as specified features. The level of CA production is estimated using a new application of the Self-Organizing Map (SOM) and a hybrid model of a genetic algorithm with a back propagation network (GA-BPN). The proposed algorithm is evaluated on four carbon sources, namely malt, starch, wheat flour and glycerol, that had been used as different media for bacterial growth. The results are then compared with the observations of a specialist. The Relative Error (RE) values obtained for the SOM and GA-BPN are 14.97% and 16.63%, respectively. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Alor-Hernández, Giner; Pérez-Gallardo, Yuliana; Posada-Gómez, Rubén; Cortes-Robles, Guillermo; Rodríguez-González, Alejandro; Aguilar-Laserre, Alberto A
2012-09-01
Nowadays, traditional search engines such as Google, Yahoo and Bing facilitate the retrieval of information in the format of images, but the results are not always useful for the users. This is mainly due to two problems: (1) the semantic keywords are not taken into consideration and (2) it is not always possible to establish a query using the image features. This issue has been covered in different domains in order to develop content-based image retrieval (CBIR) systems. The expert community has focussed their attention on the healthcare domain, where a lot of visual information for medical analysis is available. This paper provides a solution called iPixel Visual Search Engine, which involves semantics and content issues in order to search for digitized mammograms. iPixel offers the possibility of retrieving mammogram features using collective intelligence and implementing a CBIR algorithm. Our proposal compares not only features with similar semantic meaning, but also visual features. In this sense, the comparisons are made in different ways: by the number of regions per image, by maximum and minimum size of regions per image and by average intensity level of each region. iPixel Visual Search Engine supports the medical community in differential diagnoses related to the diseases of the breast. The iPixel Visual Search Engine has been validated by experts in the healthcare domain, such as radiologists, in addition to experts in digital image analysis.
Stronger Neural Modulation by Visual Motion Intensity in Autism Spectrum Disorders
Peiker, Ina; Schneider, Till R.; Milne, Elizabeth; Schöttle, Daniel; Vogeley, Kai; Münchau, Alexander; Schunke, Odette; Siegel, Markus; Engel, Andreas K.; David, Nicole
2015-01-01
Theories of autism spectrum disorders (ASD) have focused on altered perceptual integration of sensory features as a possible core deficit. Yet, there is little understanding of the neuronal processing of elementary sensory features in ASD. For typically developed individuals, we previously established a direct link between frequency-specific neural activity and the intensity of a specific sensory feature: Gamma-band activity in the visual cortex increased approximately linearly with the strength of visual motion. Using magnetoencephalography (MEG), we investigated whether in individuals with ASD neural activity reflects the coherence, and thus intensity, of visual motion in a similar fashion. Thirteen adult participants with ASD and 14 control participants performed a motion direction discrimination task with increasing levels of motion coherence. A polynomial regression analysis revealed that gamma-band power increased significantly more strongly with motion coherence in ASD than in controls, suggesting excessive visual activation with increasing stimulus intensity originating from motion-responsive visual areas V3, V6 and hMT/V5. Enhanced neural responses with increasing stimulus intensity suggest an enhanced response gain in ASD. Response gain is controlled by excitatory-inhibitory interactions, which also drive high-frequency oscillations in the gamma-band. Thus, our data suggest that a disturbed excitatory-inhibitory balance underlies enhanced neural responses to coherent motion in ASD. PMID:26147342
Miconi, Thomas; Groomes, Laura; Kreiman, Gabriel
2016-01-01
When searching for an object in a scene, how does the brain decide where to look next? Visual search theories suggest the existence of a global “priority map” that integrates bottom-up visual information with top-down, target-specific signals. We propose a mechanistic model of visual search that is consistent with recent neurophysiological evidence, can localize targets in cluttered images, and predicts single-trial behavior in a search task. This model posits that a high-level retinotopic area selective for shape features receives global, target-specific modulation and implements local normalization through divisive inhibition. The normalization step is critical to prevent highly salient bottom-up features from monopolizing attention. The resulting activity pattern constitutes a priority map that tracks the correlation between local input and target features. The maximum of this priority map is selected as the locus of attention. The visual input is then spatially enhanced around the selected location, allowing object-selective visual areas to determine whether the target is present at this location. This model can localize objects both in array images and when objects are pasted in natural scenes. The model can also predict single-trial human fixations, including those in error and target-absent trials, in a search task involving complex objects. PMID:26092221
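The chain described above (target-specific modulation, divisive normalization, then selection of the map maximum) can be illustrated with a toy sketch. This is not the authors' model: the grid of non-negative feature vectors, the pooling rule, the constant sigma and all function names are assumptions made here purely for illustration.

```python
def priority_map(features, target, sigma=1.0):
    """Toy priority map over a 2-D grid of non-negative feature vectors.

    Each location's target-modulated drive (dot product with the target)
    is divided by sigma plus the location's total bottom-up activity,
    a simple stand-in for divisive normalization.
    """
    pmap = []
    for row in features:
        out_row = []
        for vec in row:
            drive = sum(f * t for f, t in zip(vec, target))  # top-down match
            pooled = sum(vec)                                # bottom-up activity
            out_row.append(drive / (sigma + pooled))
        pmap.append(out_row)
    return pmap


def argmax_2d(grid):
    """Return the (row, column) index of the largest value in a 2-D grid."""
    return max(
        (value, (r, c))
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
    )[1]


# Hypothetical 2x2 scene: location (1, 0) is highly salient but a poor match,
# location (0, 1) is a clean match to the target.
features = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[5.0, 5.0], [0.2, 0.9]],
]
target = [0.0, 1.0]

raw_drive = [
    [sum(f * t for f, t in zip(vec, target)) for vec in row]
    for row in features
]
raw_winner = argmax_2d(raw_drive)                      # without normalization
attended = argmax_2d(priority_map(features, target))   # with normalization
print("without normalization:", raw_winner)
print("with normalization:", attended)
```

In this toy scene the salient location wins when only the raw drive is used, while the normalized map selects the location that actually matches the target, which is the role the abstract assigns to the normalization step.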
Tiled vector data model for the geographical features of symbolized maps.
Li, Lin; Hu, Wei; Zhu, Haihong; Li, You; Zhang, Hang
2017-01-01
Electronic maps (E-maps) provide people with convenience in real-world space. Although web map services can display maps on screens, a more important function is their ability to access geographical features. An E-map that is based on raster tiles is inferior to vector tiles in terms of interactive ability because vector maps provide a convenient and effective method to access and manipulate web map features. However, the critical issue regarding rendering tiled vector maps is that geographical features that are rendered in the form of map symbols via vector tiles may cause visual discontinuities, such as graphic conflicts and losses of data around the borders of tiles, which likely represent the main obstacles to exploring vector map tiles on the web. This paper proposes a tiled vector data model for geographical features in symbolized maps that considers the relationships among geographical features, symbol representations and map renderings. This model presents a method to tailor geographical features in terms of map symbols and 'addition' (join) operations on the following two levels: geographical features and map features. Thus, these maps can resolve the visual discontinuity problem based on the proposed model without weakening the interactivity of vector maps. The proposed model is validated by two map data sets, and the results demonstrate that the rendered (symbolized) web maps present smooth visual continuity.
Bullying in German Adolescents: Attending Special School for Students with Visual Impairment
ERIC Educational Resources Information Center
Pinquart, Martin; Pfeiffer, Jens P.
2011-01-01
The present study analysed bullying in German adolescents with and without visual impairment. Ninety-eight adolescents with vision loss from schools for students with visual impairment, of whom 31 were blind and 67 had low vision, were compared with 98 sighted peers using a matched-pair design. Students with low vision reported higher levels of…
NASA Astrophysics Data System (ADS)
Zhang, Bin; Liu, Yueyan; Zhang, Zuyu; Shen, Yonglin
2017-10-01
A multifeature soft-probability cascading scheme is proposed to solve the problem of land use and land cover (LULC) classification using high-spatial-resolution images to map rural residential areas in China. The proposed method is used to build midlevel LULC features. Local features are frequently considered as low-level feature descriptors in a midlevel feature learning method. However, spectral and textural features, which are very effective low-level features, are neglected. In addition, the dictionary for sparse coding is acquired without supervision, which reduces the discriminative power of the midlevel feature. Thus, we propose to learn supervised features based on sparse coding, a support vector machine (SVM) classifier, and a conditional random field (CRF) model to utilize the different effective low-level features and improve the discriminability of midlevel feature descriptors. First, three kinds of typical low-level features, namely, dense scale-invariant feature transform, gray-level co-occurrence matrix, and spectral features, are extracted separately. Second, combined with sparse coding and the SVM classifier, the probabilities of the different LULC classes are inferred to build supervised feature descriptors. Finally, the CRF model, which consists of a unary potential and a pairwise potential, is employed to construct an LULC classification map. Experimental results show that the proposed classification scheme achieves strong performance, with a total accuracy of about 87%.
Research progress on Drosophila visual cognition in China.
Guo, AiKe; Zhang, Ke; Peng, YueQin; Xi, Wang
2010-03-01
Visual cognition, as one of the fundamental aspects of cognitive neuroscience, is generally associated with high-order brain functions in animals and humans. Drosophila, as a model organism, shares certain features of visual cognition in common with mammals at the genetic, molecular, cellular, and even higher behavioral levels. From learning and memory to decision making, Drosophila covers a broad spectrum of higher cognitive behaviors beyond what we had expected. Armed with powerful tools of genetic manipulation in Drosophila, an increasing number of studies have been conducted in order to elucidate the neural circuit mechanisms underlying these cognitive behaviors from a genes-brain-behavior perspective. The goal of this review is to integrate the most important studies on visual cognition in Drosophila carried out in mainland China during the last decade into a body of knowledge encompassing both the basic neural operations and circuitry of higher brain function in Drosophila. Here, we consider a series of higher cognitive behaviors beyond learning and memory, such as visual pattern recognition, feature and context generalization, different feature memory traces, salience-based decision, attention-like behavior, and cross-modal learning and memory. We discuss the possible general gain-gating mechanism implemented by the dopamine-mushroom body circuit in the fly's visual cognition. We hope that our brief review on this aspect will inspire further study on visual cognition in flies, or even beyond.
Template optimization and transfer in perceptual learning.
Kurki, Ilmari; Hyvärinen, Aapo; Saarinen, Jussi
2016-08-01
We studied how learning changes the processing of a low-level Gabor stimulus, using a classification-image method (psychophysical reverse correlation) and a task where observers discriminated between slight differences in the phase (relative alignment) of a target Gabor in visual noise. The method estimates the internal "template" that describes how the visual system weights the input information for decisions. One popular idea has been that learning makes the template more like an ideal Bayesian weighting; however, the evidence has been indirect. We used a new regression technique to directly estimate the template weight change and to test whether the direction of reweighting is significantly different from an optimal learning strategy. The subjects trained on the task for six daily sessions, and we tested the transfer of training to a target in an orthogonal orientation. Strong learning and partial transfer were observed. We tested whether task precision (difficulty) had an effect on template change and transfer: Observers trained in either a high-precision (small, 60° phase difference) or a low-precision task (180°). Task precision did not have an effect on the amount of template change or transfer, suggesting that task precision per se does not determine whether learning generalizes. Classification images show that training made observers use more task-relevant features and unlearn some irrelevant features. The transfer templates resembled partially optimized versions of templates in training sessions. The template change direction resembles ideal learning significantly but not completely. The amount of template change was highly correlated with the amount of learning.
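As a rough illustration of how reverse correlation recovers an internal template, the sketch below simulates a linear observer and estimates its weights as the difference between the mean noise fields on the two response categories. This is the textbook mean-difference estimator, not the regression technique introduced in the study; the 8-element template, the noise model and the trial count are all hypothetical.

```python
import random


def simulate_classification_image(template, n_trials=4000, seed=0):
    """Estimate a linear observer's template by reverse correlation.

    On each trial the simulated observer sees pure Gaussian noise and
    responds "yes" when the noise correlates positively with its template.
    The classification image is mean(noise | yes) - mean(noise | no).
    """
    rng = random.Random(seed)
    n = len(template)
    sums = {True: [0.0] * n, False: [0.0] * n}
    counts = {True: 0, False: 0}
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        said_yes = sum(w * x for w, x in zip(template, noise)) > 0
        counts[said_yes] += 1
        for i, x in enumerate(noise):
            sums[said_yes][i] += x
    return [
        sums[True][i] / counts[True] - sums[False][i] / counts[False]
        for i in range(n)
    ]


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical template: four informative elements, four irrelevant ones.
true_template = [0.5, -0.5, 0.5, -0.5, 0.0, 0.0, 0.0, 0.0]
estimated = simulate_classification_image(true_template)
similarity = correlation(true_template, estimated)
print("correlation between true and estimated template:", round(similarity, 3))
```

With enough trials the estimated image approaches a scaled copy of the observer's weights, which is what allows template changes across training sessions to be compared.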
What's in a "face file"? Feature binding with facial identity, emotion, and gaze direction.
Fitousi, Daniel
2017-07-01
A series of four experiments investigated the binding of facial (i.e., facial identity, emotion, and gaze direction) and non-facial (i.e., spatial location and response location) attributes. Evidence for the creation and retrieval of temporary memory face structures across perception and action has been adduced. These episodic structures, dubbed herein "face files", consisted of both visuo-visuo and visuo-motor bindings. Feature binding was indicated by partial-repetition costs. That is, repeating a combination of facial features or altering them altogether led to faster responses than repeating or alternating only one of the features. Taken together, the results indicate that: (a) "face files" affect both action and perception mechanisms, (b) binding can take place with facial dimensions and is not restricted to low-level features (Hommel, Visual Cognition 5:183-216, 1998), and (c) the binding of facial and non-facial attributes is facilitated if the dimensions share common spatial or motor codes. The theoretical contributions of these results to "person construal" theories (Freeman, & Ambady, Psychological Science, 20(10), 1183-1188, 2011), as well as to face recognition models (Haxby, Hoffman, & Gobbini, Biological Psychiatry, 51(1), 59-67, 2000) are discussed.
Myers, Jeffrey D.
2012-01-01
Maps are often used to convey information generated by models, for example, modeled cancer risk from air pollution. The concrete nature of images, such as maps, may convey more certainty than warranted for modeled information. Three map features were selected to communicate the uncertainty of modeled cancer risk: (a) map contours appeared in or out of focus, (b) one or three colors were used, and (c) a verbal-relative or numeric risk expression was used in the legend. Study aims were to assess how these features influenced risk beliefs and the ambiguity of risk beliefs at four assigned map locations that varied by risk level. We applied an integrated conceptual framework to conduct this full factorial experiment with 32 maps that varied by the three dichotomous features and four risk levels; 826 university students participated. Data was analyzed using structural equation modeling. Unfocused contours and the verbal-relative risk expression generated more ambiguity than their counterparts. Focused contours generated stronger risk beliefs for higher risk levels and weaker beliefs for lower risk levels. Number of colors had minimal influence. The magnitude of risk level, conveyed using incrementally darker shading, had a substantial dose-response influence on the strength of risk beliefs. Personal characteristics of prior beliefs and numeracy also had substantial influences. Bottom-up and top-down information processing suggest why iconic visual features of incremental shading and contour focus had the strongest visual influences on risk beliefs and ambiguity. Variations in contour focus and risk expression show promise for fostering appropriate levels of ambiguity. PMID:22985196
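The count of 32 maps follows from crossing three dichotomous map features with four risk levels (2 x 2 x 2 x 4). A short sketch that enumerates such a full factorial design is given below; the factor labels are paraphrased from the abstract, and the numeric codes 1 to 4 for the risk levels are placeholders, since the abstract does not name them.

```python
from itertools import product

# Three dichotomous map features described in the abstract.
contour_focus = ["focused", "unfocused"]
color_count = [1, 3]
risk_expression = ["verbal-relative", "numeric"]

# Four risk levels; the codes 1-4 are placeholders, not the study's labels.
risk_level = [1, 2, 3, 4]

# Full factorial design: every combination of every factor level.
conditions = [
    {"contours": c, "colors": n, "expression": e, "risk_level": r}
    for c, n, e, r in product(contour_focus, color_count, risk_expression, risk_level)
]

print(len(conditions))   # number of distinct maps in the design
print(conditions[0])     # first condition as an example
```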
Hierarchical layered and semantic-based image segmentation using ergodicity map
NASA Astrophysics Data System (ADS)
Yadegar, Jacob; Liu, Xiaoqing
2010-04-01
Image segmentation plays a foundational role in image understanding and computer vision. Although great strides have been made and progress achieved on automatic/semi-automatic image segmentation algorithms, designing a generic, robust, and efficient image segmentation algorithm is still challenging. Human vision is still far superior compared to computer vision, especially in interpreting semantic meanings/objects in images. We present a hierarchical/layered semantic image segmentation algorithm that can automatically and efficiently segment images into hierarchical layered/multi-scaled semantic regions/objects with contextual topological relationships. The proposed algorithm bridges the gap between high-level semantics and low-level visual features/cues (such as color, intensity, edge, etc.) through utilizing a layered/hierarchical ergodicity map, where ergodicity is computed based on a space filling fractal concept and used as a region dissimilarity measurement. The algorithm applies a highly scalable, efficient, and adaptive Peano-Cesaro triangulation/tiling technique to decompose the given image into a set of similar/homogenous regions based on low-level visual cues in a top-down manner. The layered/hierarchical ergodicity map is built through a bottom-up region dissimilarity analysis. The recursive fractal sweep associated with the Peano-Cesaro triangulation provides efficient local multi-resolution refinement to any level of detail. The generated binary decomposition tree also provides efficient neighbor retrieval mechanisms for contextual topological object/region relationship generation. Experiments have been conducted within the maritime image environment where the segmented layered semantic objects include the basic level objects (i.e. sky/land/water) and deeper level objects in the sky/land/water surfaces. 
Experimental results demonstrate the proposed algorithm has the capability to robustly and efficiently segment images into layered semantic objects/regions with contextual topological relationships.
Eagle-eyed visual acuity: an experimental investigation of enhanced perception in autism.
Ashwin, Emma; Ashwin, Chris; Rhydderch, Danielle; Howells, Jessica; Baron-Cohen, Simon
2009-01-01
Anecdotal accounts of sensory hypersensitivity in individuals with autism spectrum conditions (ASC) have been noted since the first reports of the condition. Over time, empirical evidence has supported the notion that those with ASC have superior visual abilities compared with control subjects. However, it remains unclear whether these abilities are specifically the result of differences in sensory thresholds (low-level processing), rather than higher-level cognitive processes. This study investigates visual threshold in n = 15 individuals with ASC and n = 15 individuals without ASC, using a standardized optometric test, the Freiburg Visual Acuity and Contrast Test, to investigate basic low-level visual acuity. Individuals with ASC have significantly better visual acuity (20:7) compared with control subjects (20:13), acuity so superior that it lies in the region reported for birds of prey. The results of this study suggest that inclusion of sensory hypersensitivity in the diagnostic criteria for ASC may be warranted and that basic standardized tests of sensory thresholds may inform causal theories of ASC.
Sketchy Rendering for Information Visualization.
Wood, J; Isenberg, P; Isenberg, T; Dykes, J; Boukhelifa, N; Slingsby, A
2012-12-01
We present and evaluate a framework for constructing sketchy style information visualizations that mimic data graphics drawn by hand. We provide an alternative renderer for the Processing graphics environment that redefines core drawing primitives including line, polygon and ellipse rendering. These primitives allow higher-level graphical features such as bar charts, line charts, treemaps and node-link diagrams to be drawn in a sketchy style with a specified degree of sketchiness. The framework is designed to be easily integrated into existing visualization implementations with minimal programming modification or design effort. We show examples of use for statistical graphics, conveying spatial imprecision and for enhancing aesthetic and narrative qualities of visualization. We evaluate user perception of sketchiness of areal features through a series of stimulus-response tests in order to assess users' ability to place sketchiness on a ratio scale, and to estimate area. Results suggest relative area judgment is compromised by sketchy rendering and that its influence is dependent on the shape being rendered. They show that degree of sketchiness may be judged on an ordinal scale but that its judgement varies strongly between individuals. We evaluate higher-level impacts of sketchiness through user testing of scenarios that encourage user engagement with data visualization and willingness to critique visualization design. Results suggest that where a visualization is clearly sketchy, engagement may be increased and that attitudes to participating in visualization annotation are more positive. The results of our work have implications for effective information visualization design that go beyond the traditional role of sketching as a tool for prototyping or its use for an indication of general uncertainty.
NASA Astrophysics Data System (ADS)
D'Elia, V.; Campana, S.; Covino, S.; D'Avanzo, P.; Piranomonte, S.; Tagliaferri, G.
2011-11-01
We aim at studying the environment of the gamma-ray burst (GRB) GRB 081008 by analysing the spectra of its optical afterglow. Ultraviolet and Visual Echelle Spectrograph/Very Large Telescope (UVES/VLT) high-resolution spectroscopy of GRB 081008 was secured $\sim$5 h after the Swift-BAT trigger. Our data set also comprises three VLT/FORS2 nearly simultaneous spectra of the same source. The availability of nearly simultaneous high- and low-resolution spectra for a GRB afterglow is an extremely rare event. The GRB-damped Lyman α system at $z = 1.9683$ shows that the interstellar medium (ISM) of the host galaxy is constituted by at least three components which contribute to the line profiles. Component I is the redmost one, and is 20 and 78 km s$^{-1}$ redward of components II and III, respectively. We detect several ground state and excited absorption features in components I and II. These features have been used to compute the distances between the GRB and the absorbers. Component I is found to be $52 \pm 6$ pc away from the GRB, while component II presents few excited transitions and its distance is $200^{+60}_{-80}$ pc. Component III only features a few, low-ionization and saturated lines, suggesting that it is even farther from the GRB. Component I represents the closest absorber ever detected near a GRB. This (relatively) low distance can possibly be a consequence of a dense GRB environment, which prevents the GRB prompt/afterglow emission from strongly affecting the ISM up to higher distances. The hydrogen column density associated with GRB 081008 is $\log(N_{\rm H}/{\rm cm}^{-2}) = 21.11 \pm 0.10$, and the metallicity of the host galaxy is in the range of [X/H] $= -1.29$ to $-0.52$. In particular, we found [Fe/H] $= -1.19 \pm 0.11$ and [Zn/H] $= -0.52 \pm 0.11$ with respect to solar values. This discrepancy can be explained by the presence of dust in the GRB ISM, given the opposite refractory properties of iron and zinc. 
By deriving the depletion pattern for GRB 081008, we find the optical extinction in the visual band to be $A_V \sim 0.19$ mag. The curve-of-growth analysis applied to the FORS2 spectra yields column densities consistent at the 3σ level with those evaluated from the UVES data using the line-fitting procedure. This reflects the low saturation of the detected GRB 081008 absorption features. Based on observations collected at the European Southern Observatory, ESO, the VLT/Kueyen telescope, Paranal, Chile, in the framework of the programme 082-0755.
Visual cortical areas of the mouse: comparison of parcellation and network structure with primates
Laramée, Marie-Eve; Boire, Denis
2015-01-01
Brains have evolved to optimize sensory processing. In primates, complex cognitive tasks must be executed and evolution led to the development of large brains with many cortical areas. Rodents do not accomplish cognitive tasks of the same level of complexity as primates and remain with small brains both in relative and absolute terms. But is a small brain necessarily a simple brain? In this review, several aspects of the visual cortical networks have been compared between rodents and primates. The visual system has been used as a model to evaluate the level of complexity of the cortical circuits at the anatomical and functional levels. The evolutionary constraints are first presented in order to appreciate the rules for the development of the brain and its underlying circuits. The organization of sensory pathways, with their parallel and cross-modal circuits, is also examined. Other features of brain networks, often considered as imposing constraints on the development of underlying circuitry, are also discussed and their effect on the complexity of the mouse and primate brain are inspected. In this review, we discuss the common features of cortical circuits in mice and primates and see how these can be useful in understanding visual processing in these animals. PMID:25620914
A color fusion method of infrared and low-light-level images based on visual perception
NASA Astrophysics Data System (ADS)
Han, Jing; Yan, Minmin; Zhang, Yi; Bai, Lianfa
2014-11-01
Color fusion images can be obtained by fusing infrared and low-light-level images, and they contain information from both. Fusion images help observers understand the multichannel images comprehensively. However, simple fusion may lose target information because targets are inconspicuous in long-distance infrared and low-light-level images, and if target extraction is adopted blindly, the perception of scene information is seriously affected. To solve this problem, a new fusion method based on visual perception is proposed in this paper. The extraction of visual targets ("what" information) and a parallel processing mechanism are applied to traditional color fusion methods. The infrared and low-light-level color fusion images are achieved based on efficient learning of typical targets. Experimental results show the effectiveness of the proposed method. The fusion images produced by our algorithm not only improve the detection rate of targets but also retain rich natural information about the scenes.
Barbosa Porcellis da Silva, Rafael; Marques, Alexandre Carriconde; Reichert, Felipe Fossati
2017-05-19
A low level of physical activity is a serious health issue in individuals with visual impairment. Few studies have objectively measured physical activity in this population group, particularly outside high-income countries. The aim of this study was to describe physical activity measured by accelerometry and its associated factors in Brazilian adults with visual impairment. In a cross-sectional design, 90 adults (18-95 years old) answered a questionnaire and wore an accelerometer for at least 3 days (including one weekend day) to measure physical activity (min/day). Sixty percent of the individuals practiced at least 30 min/day of moderate-to-vigorous physical activity. Individuals who were blind were less active, spent more time in sedentary activities and spent less time in moderate and vigorous activities than those with low vision. Individuals who walked mainly without any assistance were more active, spent less time in sedentary activities and spent more time in light and moderate activities than those who walked with a long cane or sighted guide. Our data highlight factors associated with lower levels of physical activity in people with visual impairment. These factors, such as being blind and walking with assistance, should be tackled in interventions to increase physical activity levels among individuals with visual impairment. Implications for Rehabilitation: Physical inactivity worldwide is a serious health issue in people with visual impairments, and specialized institutions and public policies must work to increase the physical activity level of this population. Those with lower visual acuity and those walking with any aid are at a higher risk of having low levels of physical activity. 
The association between visual response profile, living for less than 11 years with visual impairment and PA levels deserves further investigation. Findings of the present study provide reliable data to support rehabilitation programs, noting the need to pay special attention to the subgroups that are even more likely to be inactive.
Feature-Selective Attentional Modulations in Human Frontoparietal Cortex.
Ester, Edward F; Sutterer, David W; Serences, John T; Awh, Edward
2016-08-03
Control over visual selection has long been framed in terms of a dichotomy between "source" and "site," where top-down feedback signals originating in frontoparietal cortical areas modulate or bias sensory processing in posterior visual areas. This distinction is motivated in part by observations that frontoparietal cortical areas encode task-level variables (e.g., what stimulus is currently relevant or what motor outputs are appropriate), while posterior sensory areas encode continuous or analog feature representations. Here, we present evidence that challenges this distinction. We used fMRI, a roving searchlight analysis, and an inverted encoding model to examine representations of an elementary feature property (orientation) across the entire human cortical sheet while participants attended either the orientation or luminance of a peripheral grating. Orientation-selective representations were present in a multitude of visual, parietal, and prefrontal cortical areas, including portions of the medial occipital cortex, the lateral parietal cortex, and the superior precentral sulcus (thought to contain the human homolog of the macaque frontal eye fields). Additionally, representations in many-but not all-of these regions were stronger when participants were instructed to attend orientation relative to luminance. Collectively, these findings challenge models that posit a strict segregation between sources and sites of attentional control on the basis of representational properties by demonstrating that simple feature values are encoded by cortical regions throughout the visual processing hierarchy, and that representations in many of these areas are modulated by attention. Influential models of visual attention posit a distinction between top-down control and bottom-up sensory processing networks. 
These models are motivated in part by demonstrations showing that frontoparietal cortical areas associated with top-down control represent abstract or categorical stimulus information, while visual areas encode parametric feature information. Here, we show that multivariate activity in human visual, parietal, and frontal cortical areas encodes representations of a simple feature property (orientation). Moreover, representations in several (though not all) of these areas were modulated by feature-based attention in a similar fashion. These results provide an important challenge to models that posit dissociable top-down control and sensory processing networks on the basis of representational properties.
NASA Astrophysics Data System (ADS)
Jin, Xin; Jiang, Qian; Yao, Shaowen; Zhou, Dongming; Nie, Rencan; Lee, Shin-Jye; He, Kangjian
2018-01-01
To improve the performance of infrared and visual image fusion and provide better visual effects, this paper proposes a hybrid fusion method for infrared and visual images that combines the discrete stationary wavelet transform (DSWT), the discrete cosine transform (DCT), and local spatial frequency (LSF). The proposed method has three key processing steps. First, DSWT is employed to decompose the important features of the source image into a series of sub-images with different levels and spatial frequencies. Second, DCT is used to separate the significant details of the sub-images according to the energy of different frequencies. Third, LSF is applied to enhance the regional features of the DCT coefficients, which aids image feature extraction. Several frequently used image fusion methods and evaluation metrics are employed to evaluate the validity of the proposed method. The experiments indicate that the proposed method achieves a good fusion effect and is more efficient than other conventional image fusion methods.
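The local spatial frequency step can be illustrated with a minimal sketch. It assumes the commonly used definition of spatial frequency, SF = sqrt(RF^2 + CF^2), where RF and CF are built from horizontal and vertical first differences; the abstract does not give the authors' exact formula. The function `fuse_by_local_sf` and its block size are hypothetical and show only one simple way an SF measure could drive a block-wise choice between two coefficient maps.

```python
import numpy as np

def spatial_frequency(block):
    """Spatial frequency of a 2-D block: sqrt(RF^2 + CF^2)."""
    block = np.asarray(block, dtype=float)
    # Row frequency term: squared horizontal first differences, normalised by block size.
    rf_sq = np.sum(np.diff(block, axis=1) ** 2) / block.size
    # Column frequency term: squared vertical first differences, normalised by block size.
    cf_sq = np.sum(np.diff(block, axis=0) ** 2) / block.size
    return float(np.sqrt(rf_sq + cf_sq))

def fuse_by_local_sf(coeff_a, coeff_b, size=8):
    """Hypothetical rule: for each block, keep the source with the larger SF."""
    a = np.asarray(coeff_a, dtype=float)
    b = np.asarray(coeff_b, dtype=float)
    fused = a.copy()
    for r in range(0, a.shape[0], size):
        for c in range(0, a.shape[1], size):
            window = (slice(r, r + size), slice(c, c + size))
            if spatial_frequency(b[window]) > spatial_frequency(a[window]):
                fused[window] = b[window]
    return fused
```

A flat block has a spatial frequency of zero, so under this rule any textured block from the other source replaces it.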
Modelling Subjectivity in Visual Perception of Orientation for Image Retrieval.
ERIC Educational Resources Information Center
Sanchez, D.; Chamorro-Martinez, J.; Vila, M. A.
2003-01-01
Discussion of multimedia libraries and the need for storage, indexing, and retrieval techniques focuses on the combination of computer vision and data mining techniques to model high-level concepts for image retrieval based on perceptual features of the human visual system. Uses fuzzy set theory to measure users' assessments and to capture users'…
Can Dynamic Visualizations with Variable Control Enhance the Acquisition of Intuitive Knowledge?
ERIC Educational Resources Information Center
Wichmann, Astrid; Timpe, Sebastian
2015-01-01
An important feature of inquiry learning is to take part in science practices including exploring variables and testing hypotheses. Computer-based dynamic visualizations have the potential to open up various exploration possibilities depending on the level of learner control. It is assumed that variable control, e.g., by changing parameters of a…
Bacteriorhodopsin-based photochromic pigments for optical security applications
NASA Astrophysics Data System (ADS)
Hampp, Norbert A.; Fischer, Thorsten; Neebe, Martin
2002-04-01
Bacteriorhodopsin is a two-dimensional crystalline photochromic protein which is astonishingly stable towards chemical and thermal degradation. This is one of the reasons why it is one of the very few proteins which may be used as a biological pigment in printing inks. Variants of the naturally occurring bacteriorhodopsin have been developed which show a distinct color change even at low light intensities and without the requirement of UV light. Several pigments with different color changes are currently available. In addition to this visually detectable feature, photochromism, the protein's amino acid sequence can be genetically altered in order to code and identify specific production lots. For advanced applications the data storage capability of bacteriorhodopsin will be useful. Write-once-read-many (WORM) recording of digital data is accomplished by laser excitation of printed bacteriorhodopsin inks. A density of 1 MBit per square inch is currently achieved. Several application examples for this biological molecule are described where low- and high-level features are used in combination. Bacteriorhodopsin-based inks are a new class of optical security pigments.
Mechanisms underlying the perceived angular velocity of a rigidly rotating object.
Caplovitz, G P; Hsieh, P-J; Tse, P U
2006-09-01
The perceived angular velocity of an ellipse undergoing a constant rate of rotation will vary as its aspect ratio is changed. Specifically, a "fat" ellipse with a low aspect ratio will in general be perceived to rotate more slowly than a "thin" ellipse with a higher aspect ratio. Here we investigate this illusory underestimation of angular velocity in the domain where ellipses appear to be rotating rigidly. We characterize the relationship between aspect ratio and perceived angular velocity under luminance and non-luminance-defined conditions. The data are consistent with two hypotheses concerning the construction of rotational motion percepts. The first hypothesis is that perceived angular velocity is determined by low-level component-motion (i.e., motion-energy) signals computed along the ellipse's contour. The second hypothesis is that relative maxima of positive contour curvature are treated as non-component, form-based "trackable features" (TFs) that contribute to the visual system's construction of the motion percept. Our data suggest that perceived angular velocity is driven largely by component signals, but is modulated by the motion signals of trackable features, such as corners and regions of high contour curvature.
Using Visual Odometry to Estimate Position and Attitude
NASA Technical Reports Server (NTRS)
Maimone, Mark; Cheng, Yang; Matthies, Larry; Schoppers, Marcel; Olson, Clark
2007-01-01
A computer program in the guidance system of a mobile robot generates estimates of the position and attitude of the robot, using features of the terrain on which the robot is moving, by processing digitized images acquired by a stereoscopic pair of electronic cameras mounted rigidly on the robot. Developed for use in localizing the Mars Exploration Rover (MER) vehicles on Martian terrain, the program can also be used for similar purposes on terrestrial robots moving in sufficiently visually textured environments: examples include low-flying robotic aircraft and wheeled robots moving on rocky terrain or inside buildings. In simplified terms, the program automatically detects visual features and tracks them across stereoscopic pairs of images acquired by the cameras. The 3D locations of the tracked features are then robustly processed into an estimate of overall vehicle motion. Testing has shown that by use of this software, the error in the estimate of the position of the robot can be limited to no more than 2 percent of the distance traveled, provided that the terrain is sufficiently rich in features. This software has proven extremely useful on the MER vehicles during driving on sandy and highly sloped terrains on Mars.
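The step from tracked 3D feature locations to a motion estimate can be sketched as a least-squares rigid alignment. This is an illustration only: it uses the standard SVD-based (Kabsch) solution and assumes NumPy, it is not the flight software's method, and it omits the outlier rejection that the abstract's word "robustly" implies. The function name `estimate_rigid_motion` is hypothetical.

```python
import numpy as np

def estimate_rigid_motion(points_before, points_after):
    """Least-squares R, t such that points_after is approximately R @ p + t."""
    p = np.asarray(points_before, dtype=float)
    q = np.asarray(points_after, dtype=float)
    p_mean, q_mean = p.mean(axis=0), q.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    h = (p - p_mean).T @ (q - q_mean)
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = q_mean - rotation @ p_mean
    return rotation, translation
```

In a real pipeline the inputs would be triangulated stereo points matched across two time steps, and a robust wrapper (for example, repeated fitting on inlier subsets) would be needed before the result could be trusted.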
The perceptual saliency of fearful eyes and smiles: A signal detection study
Saban, Muhammet Ikbal; Rotshtein, Pia
2017-01-01
Facial features differ in the amount of expressive information they convey. Specifically, eyes are argued to be essential for fear recognition, while smiles are crucial for recognising happy expressions. In three experiments, we tested whether expression modulates the perceptual saliency of diagnostic facial features and whether the feature’s saliency depends on the face configuration. Participants were presented with masked facial features or noise at perceptual conscious threshold. The task was to indicate whether eyes (experiments 1-3A) or a mouth (experiment 3B) was present. The expression of the face and its configuration (i.e. spatial arrangement of the features) were manipulated. Experiment 1 compared fearful with neutral expressions, experiments 2 and 3 compared fearful versus happy expressions. The detection accuracy data was analysed using Signal Detection Theory (SDT), to examine the effects of expression and configuration on perceptual precision (d’) and response bias (c), separately. Across all three experiments, fearful eyes were detected better (higher d’) than neutral and happy eyes. Eyes were more precisely detected than mouths, whereas smiles were detected better than fearful mouths. The configuration of the features had no consistent effects across the experiments on the ability to detect expressive features. However, facial configuration consistently affected the response bias. Participants used a more liberal criterion for detecting the eyes in canonical configuration and fearful expression. Finally, the power in low spatial frequency of a feature predicted its discriminability index. The results suggest that expressive features are perceptually more salient with a higher d’ due to changes at the low-level visual properties, with emotions and configuration affecting perception through top-down processes, as reflected by the response bias. PMID:28267761
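The two signal detection measures named in this abstract can be computed from a hit rate and a false-alarm rate. The sketch below assumes the standard equal-variance Gaussian model, d' = z(H) - z(FA) and c = -(z(H) + z(FA)) / 2; the abstract does not state which variant or rate correction the authors used, and the function name is hypothetical.

```python
from statistics import NormalDist

def dprime_and_criterion(hit_rate, false_alarm_rate):
    """Return (d', c) under the equal-variance Gaussian model."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(false_alarm_rate)
    d_prime = z_hit - z_fa              # sensitivity
    criterion = -0.5 * (z_hit + z_fa)   # negative values mean a liberal bias
    return d_prime, criterion
```

Rates of exactly 0 or 1 have no finite z-score, so `inv_cdf` raises an error for them; in practice such rates are adjusted before this step.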
Scene and human face recognition in the central vision of patients with glaucoma
Aptel, Florent; Attye, Arnaud; Guyader, Nathalie; Boucart, Muriel; Chiquet, Christophe; Peyrin, Carole
2018-01-01
Primary open-angle glaucoma (POAG) initially affects mainly peripheral vision. Current behavioral studies support the idea that visual defects of patients with POAG extend into parts of the central visual field classified as normal by static automated perimetry analysis. This is particularly true for visual tasks involving processes of a higher level than mere detection. The purpose of this study was to assess visual abilities of POAG patients in central vision. Patients were assigned to two groups following a visual field examination (Humphrey 24–2 SITA-Standard test). Patients with both peripheral and central defects and patients with peripheral but no central defect, as well as age-matched controls, participated in the experiment. All participants had to perform two visual tasks where low-contrast stimuli were presented in the central 6° of the visual field. A categorization task of scene images and human face images assessed high-level visual recognition abilities. In contrast, a detection task using the same stimuli assessed low-level visual function. The difference in performance between detection and categorization revealed the cost of high-level visual processing. Compared to controls, patients with a central visual defect showed a deficit in both detection and categorization of all low-contrast images. This is consistent with the abnormal retinal sensitivity as assessed by perimetry. However, the deficit was greater for categorization than detection. Patients without a central defect showed similar performances to the controls concerning the detection and categorization of faces. However, while the detection of scene images was well-maintained, these patients showed a deficit in their categorization. This suggests that the simple loss of peripheral vision could be detrimental to scene recognition, even when the information is displayed in central vision.
This study revealed subtle defects in the central visual field of POAG patients that cannot be predicted by static automated perimetry assessment using Humphrey 24–2 SITA-Standard test. PMID:29481572
Fusion of Deep Learning and Compressed Domain features for Content Based Image Retrieval.
Liu, Peizhong; Guo, Jing-Ming; Wu, Chi-Yi; Cai, Danlin
2017-08-29
This paper presents an effective image retrieval method that combines high-level features from a Convolutional Neural Network (CNN) model with low-level features from Dot-Diffused Block Truncation Coding (DDBTC). The low-level features, e.g., texture and color, are constructed from VQ-indexed histograms of the DDBTC bitmap and the maximum and minimum quantizers. The high-level features from the CNN, in contrast, can effectively capture human perception. By fusing the DDBTC and CNN features, the extended deep learning two-layer codebook features (DL-TLCF) are generated using the proposed two-layer codebook, dimension reduction, and similarity reweighting to improve the overall retrieval rate. Two metrics, average precision rate (APR) and average recall rate (ARR), are employed to examine various datasets. As documented in the experimental results, the proposed schemes achieve a higher retrieval rate than state-of-the-art methods that use either low- or high-level features alone. Thus, the method can be a strong candidate for various image retrieval applications.
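The two evaluation metrics can be sketched as follows. This assumes the usual per-query definitions (precision as relevant retrieved over retrieved, recall as relevant retrieved over all relevant) averaged over queries; the paper's exact APR and ARR formulas are not given in the abstract and may differ. The function name and argument layout are hypothetical.

```python
def average_precision_and_recall(retrieved_lists, relevant_sets):
    """Mean precision and mean recall over a collection of queries."""
    precisions, recalls = [], []
    for retrieved, relevant in zip(retrieved_lists, relevant_sets):
        # Count retrieved items that are actually relevant to this query.
        hits = sum(1 for item in retrieved if item in relevant)
        precisions.append(hits / len(retrieved))
        recalls.append(hits / len(relevant))
    n = len(precisions)
    return sum(precisions) / n, sum(recalls) / n
```

For two queries that each return two images, with one hit out of two relevant images and two hits out of four relevant images respectively, this yields a mean precision of 0.75 and a mean recall of 0.5.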
High-level intuitive features (HLIFs) for intuitive skin lesion description.
Amelard, Robert; Glaister, Jeffrey; Wong, Alexander; Clausi, David A
2015-03-01
A set of high-level intuitive features (HLIFs) is proposed to quantitatively describe melanoma in standard camera images. Melanoma is the deadliest form of skin cancer. With rising incidence rates and subjectivity in current clinical detection methods, there is a need for melanoma decision support systems. Feature extraction is a critical step in melanoma decision support systems. Existing feature sets for analyzing standard camera images are comprised of low-level features, which exist in high-dimensional feature spaces and limit the system's ability to convey intuitive diagnostic rationale. The proposed HLIFs were designed to model the ABCD criteria commonly used by dermatologists such that each HLIF represents a human-observable characteristic. As such, intuitive diagnostic rationale can be conveyed to the user. Experimental results show that concatenating the proposed HLIFs with a full low-level feature set increased classification accuracy, and that HLIFs were able to separate the data better than low-level features with statistical significance. An example of a graphical interface for providing intuitive rationale is given.
Rosselli, Federica B.; Alemi, Alireza; Ansuini, Alessio; Zoccolan, Davide
2015-01-01
In recent years, a number of studies have explored the possible use of rats as models of high-level visual functions. One central question at the root of such an investigation is to understand whether rat object vision relies on the processing of visual shape features or, rather, on lower-order image properties (e.g., overall brightness). In a recent study, we have shown that rats are capable of extracting multiple features of an object that are diagnostic of its identity, at least when those features are, structure-wise, distinct enough to be parsed by the rat visual system. In the present study, we have assessed the impact of object structure on rat perceptual strategy. We trained rats to discriminate between two structurally similar objects, and compared their recognition strategies with those reported in our previous study. We found that, under conditions of lower stimulus discriminability, rat visual discrimination strategy becomes more view-dependent and subject-dependent. Rats were still able to recognize the target objects, in a way that was largely tolerant (i.e., invariant) to object transformation; however, the larger structural and pixel-wise similarity affected the way objects were processed. Compared to the findings of our previous study, the patterns of diagnostic features were: (i) smaller and more scattered; (ii) only partially preserved across object views; and (iii) only partially reproducible across rats. On the other hand, rats were still found to adopt a multi-featural processing strategy and to make use of part of the optimal discriminatory information afforded by the two objects. Our findings suggest that, as in humans, rat invariant recognition can flexibly rely on either view-invariant representations of distinctive object features or view-specific object representations, acquired through learning. PMID:25814936
High-level, but not low-level, motion perception is impaired in patients with schizophrenia.
Kandil, Farid I; Pedersen, Anya; Wehnes, Jana; Ohrmann, Patricia
2013-01-01
Smooth pursuit eye movements are compromised in patients with schizophrenia and their first-degree relatives. Although research has demonstrated that the motor components of smooth pursuit eye movements are intact, motion perception has been shown to be impaired. In particular, studies have consistently revealed deficits in performance on tasks specific to the high-order motion area V5 (middle temporal area, MT) in patients with schizophrenia. In contrast, data from low-level motion detectors in the primary visual cortex (V1) have been inconsistent. To differentiate between low-level and high-level visual motion processing, we applied a temporal-order judgment task for motion events and a motion-defined figure-ground segregation task in patients with schizophrenia and healthy controls. Successful judgments in both tasks rely on the same low-level motion detectors in V1; however, the first task is further processed in the higher-order motion area MT in the magnocellular (dorsal) pathway, whereas the second task requires subsequent computations in the parvocellular (ventral) pathway in visual area V4 and the inferotemporal cortex (IT). These latter structures are supposed to be intact in schizophrenia. Patients with schizophrenia revealed a significantly impaired temporal resolution on the motion-based temporal-order judgment task but only mild impairment in the motion-based segregation task. These results imply that low-level motion detection in V1 is not, or is only slightly, compromised; furthermore, our data constrain the locus of the well-known deficit in motion detection to areas beyond the primary visual cortex.
Møller, Cecilie; Højlund, Andreas; Bærentsen, Klaus B; Hansen, Niels Chr; Skewes, Joshua C; Vuust, Peter
2018-05-01
Perception is fundamentally a multisensory experience. The principle of inverse effectiveness (PoIE) states that multisensory gain is maximal when responses to the unisensory constituents of the stimuli are weak. It is one of the basic principles underlying multisensory processing of spatiotemporally corresponding crossmodal stimuli that are well established at behavioral as well as neural levels. It is not yet clear, however, how modality-specific stimulus features influence discrimination of subtle changes in a crossmodally corresponding feature belonging to another modality. Here, we tested the hypothesis that reliance on visual cues to pitch discrimination follows the PoIE at the interindividual level (i.e., varies with varying levels of auditory-only pitch discrimination abilities). Using an oddball pitch discrimination task, we measured the effect of varying visually perceived vertical position in participants exhibiting a wide range of pitch discrimination abilities (i.e., musicians and nonmusicians). Visual cues significantly enhanced pitch discrimination as measured by the sensitivity index d', and more so in the crossmodally congruent than incongruent condition. The magnitude of gain caused by compatible visual cues was associated with individual pitch discrimination thresholds, as predicted by the PoIE. This was not the case for the magnitude of the congruence effect, which was unrelated to individual pitch discrimination thresholds, indicating that the pitch-height association is robust to variations in auditory skills. Our findings shed light on individual differences in multisensory processing by suggesting that relevant multisensory information that crucially aids some perceivers' performance may be of less importance to others, depending on their unisensory abilities.
Identifying and characterising cerebral visual impairment in children: a review.
Philip, Swetha Sara; Dutton, Gordon N
2014-05-01
Cerebral visual impairment (CVI) comprises visual malfunction due to retro-chiasmal visual and visual association pathway pathology. This can be isolated or accompany anterior visual pathway dysfunction. It is a major cause of low vision in children in the developed and developing world due to increasing survival in paediatric and neonatal care. CVI can present in many combinations and degrees. There are multiple causes and it is common in children with cerebral palsy. CVI can be identified easily, if a structured approach to history-taking is employed. This review describes the features of CVI and describes practical management strategies aimed at helping affected children. A literature review was undertaken using 'Medline' and 'Pubmed'. Search terms included cerebral visual impairment, cortical visual impairment, dorsal stream dysfunction and visual function in cerebral palsy.
Tanaka, Hideaki
2016-01-01
Cosmetic makeup significantly influences facial perception. Because faces consist of similar physical structures, cosmetic makeup is typically used to highlight individual features, particularly those of the eyes (i.e., eye shadow) and mouth (i.e., lipstick). Though event-related potentials have been utilized to study various aspects of facial processing, the influence of cosmetics on specific ERP components remains unclear. The present study aimed to investigate the relationship between the application of cosmetic makeup and the amplitudes of the P1 and N170 event-related potential components during facial perception tasks. Moreover, the influence of visual perception on N170 amplitude was evaluated under three makeup conditions: Eye Shadow, Lipstick, and No Makeup. Electroencephalography was used to monitor 17 participants who were exposed to visual stimuli under each of these three makeup conditions. The results demonstrated that the Lipstick condition elicited a significantly greater N170 amplitude than the No Makeup condition, while P1 amplitude was unaffected by any of the conditions. Such findings indicate that the application of cosmetic makeup alters general facial perception but exerts no influence on the perception of low-level visual features. Collectively, these results support the notion that the application of makeup induces subtle alterations in the processing of facial stimuli, with a particular effect on the processing of specific facial components (i.e., the mouth), as reflected by changes in N170 amplitude. PMID:27656161
Parametric classification of handvein patterns based on texture features
NASA Astrophysics Data System (ADS)
Al Mahafzah, Harbi; Imran, Mohammad; Supreetha Gowda H., D.
2018-04-01
In this paper, we develop a biometric recognition system based on the handvein, a hand-based modality whose pattern is unique to each individual and which, being an internal feature, is impossible to counterfeit or fabricate. For feature extraction we chose LBP (a visual descriptor), LPQ (a blur-insensitive texture operator), and Log-Gabor (a texture descriptor). For classification we chose the well-known KNN and SVM classifiers. We experimented with, and tabulated, the recognition rate of each single algorithm for handvein images under different distance measures and kernel options. Feature-level fusion was then carried out, which increased the performance level.
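Of the descriptors listed, LBP is the simplest to illustrate. The sketch below assumes the basic 3x3 local binary pattern with eight neighbours and a 256-bin normalised histogram as the feature vector, and it assumes NumPy; the abstract does not say which LBP variant, radius, or histogram layout the authors used, and the function names are hypothetical.

```python
import numpy as np

def lbp_codes(image):
    """Basic 3x3 LBP code for every interior pixel of a grey-level image."""
    img = np.asarray(image, dtype=float)
    rows, cols = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.uint8)
    # Eight neighbours, clockwise from the top-left; each contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dr, dc) in enumerate(offsets):
        neighbour = img[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]
        # Set the bit where the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) << bit
    return codes

def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes, usable as a feature vector."""
    hist = np.bincount(lbp_codes(image).ravel(), minlength=256).astype(float)
    return hist / hist.sum()
```

Histograms produced this way could then be compared with a distance measure for KNN, or passed to an SVM, which is where the distance and kernel options mentioned in the abstract would come in.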
The effects of alphabet and expertise on letter perception
Wiley, Robert W.; Wilson, Colin; Rapp, Brenda
2016-01-01
Long-standing questions in human perception concern the nature of the visual features that underlie letter recognition and the extent to which the visual processing of letters is affected by differences in alphabets and levels of viewer expertise. We examined these issues in a novel approach using a same-different judgment task on pairs of letters from the Arabic alphabet with two participant groups—one with no prior exposure to Arabic and one with reading proficiency. Hierarchical clustering and linear mixed-effects modeling of reaction times and accuracy provide evidence that both the specific characteristics of the alphabet and observers’ previous experience with it affect how letters are perceived and visually processed. The findings of this research further our understanding of the multiple factors that affect letter perception and support the view of a visual system that dynamically adjusts its weighting of visual features as expert readers come to more efficiently and effectively discriminate the letters of the specific alphabet they are viewing. PMID:26913778
Gori, Monica; Cappagli, Giulia; Tonelli, Alessia; Baud-Bovy, Gabriel; Finocchietti, Sara
2016-10-01
Considering that cortical plasticity is maximal in the child, why are the majority of technological devices available for visually impaired users meant for adults and not for children? Moreover, despite high technological advancements in recent years, why is there still no full user acceptance of existing sensory substitution devices? The goal of this review is to create a link between neuroscientists and engineers by opening a discussion about the direction that the development of technological devices for visually impaired people is taking. Firstly, we review works on spatial and social skills in children with visual impairments, showing that lack of vision is associated with other sensory and motor delays. Secondly, we present some of the technological solutions developed to date for visually impaired people. Doing this, we highlight the core features of these systems and discuss their limits. We also discuss the possible reasons behind the low adaptability in children.
Semantically induced distortions of visual awareness in a patient with Balint's syndrome.
Soto, David; Humphreys, Glyn W
2009-02-01
We present data indicating that visual awareness for a basic perceptual feature (colour) can be influenced by the relation between the feature and the semantic properties of the stimulus. We examined semantic interference from the meaning of a colour word ("RED") on simple colour (ink-related) detection responses in a patient with simultanagnosia due to bilateral parietal lesions. We found that colour detection was influenced by the congruency between the meaning of the word and the relevant ink colour, with impaired performance when the word and the colour mismatched (on incongruent trials). This result held even when remote associations between meaning and colour were used (i.e. the word "PEA" influenced detection of the ink colour red). The results are consistent with a late locus of conscious visual experience that is derived at post-semantic levels. The implications for the understanding of the role of parietal cortex in object binding and visual awareness are discussed.
Waese, Jamie; Fan, Jim; Yu, Hans; Fucile, Geoffrey; Shi, Ruian; Cumming, Matthew; Town, Chris; Stuerzlinger, Wolfgang
2017-01-01
A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust research platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring multiple levels of Arabidopsis thaliana data through a zoomable user interface. ePlant connects to several publicly available web services to download genome, proteome, interactome, transcriptome, and 3D molecular structure data for one or more genes or gene products of interest. Data are displayed with a set of visualization tools that are presented using a conceptual hierarchy from big to small, and many of the tools combine information from more than one data type. We describe the development of ePlant in this article and present several examples illustrating its integrative features for hypothesis generation. We also describe the process of deploying ePlant as an “app” on Araport. Because it builds on readily available web services, the code for ePlant is freely available for research on any other biological species. PMID:28808136
Papera, Massimiliano; Richards, Anne
2016-05-01
Exogenous allocation of attentional resources allows the visual system to encode and maintain representations of stimuli in visual working memory (VWM). However, limits in the processing capacity to allocate resources can prevent unexpected visual stimuli from gaining access to VWM and thereby to consciousness. Using a novel approach to create unbiased stimuli of increasing saliency, we investigated visual processing during a visual search task in individuals who show a high or low propensity to neglect unexpected stimuli. When propensity to inattention is high, ERP recordings show a diminished amplification concomitantly with a decrease in theta band power during the N1 latency, followed by a poor target enhancement during the N2 latency. Furthermore, a later modulation in the P3 latency was also found in individuals showing propensity to visual neglect, suggesting that more effort is required for conscious maintenance of visual information in VWM. Effects during early stages of processing (N80 and P1) were also observed, suggesting that sensitivity to contrasts and medium-to-high spatial frequencies may be modulated by low-level saliency (although no statistical group differences were found). In accordance with the Global Workspace Model, our data indicate that a lack of resources in low-level processors and visual attention may be responsible for the failure to "ignite" a state of high-level activity spread across several brain areas that is necessary for stimuli to access awareness. These findings may aid in the development of diagnostic tests and intervention to detect/reduce inattention propensity to visual neglect of unexpected stimuli.
Finger vein recognition based on the hyperinformation feature
NASA Astrophysics Data System (ADS)
Xi, Xiaoming; Yang, Gongping; Yin, Yilong; Yang, Lu
2014-01-01
The finger vein is a promising biometric pattern for personal identification due to its advantages over other existing biometrics. In finger vein recognition, feature extraction is a critical step, and many feature extraction methods have been proposed to extract the gray, texture, or shape of the finger vein. We treat them as low-level features and present a high-level feature extraction framework. Under this framework, a base attribute is first defined to represent the characteristics of a certain subcategory of a subject. Then, for an image, the correlation coefficient is used to construct the high-level feature, which reflects the correlation between this image and all base attributes. Since the high-level feature can reveal characteristics of more subcategories and contain more discriminative information, we call it the hyperinformation feature (HIF). Compared with low-level features, which only represent the characteristics of one subcategory, HIF is more powerful and robust. In order to demonstrate the potential of the proposed framework, we provide a case study to extract HIF. We conduct comprehensive experiments on our databases to show the generality of the proposed framework and the efficiency of HIF. Experimental results show that HIF significantly outperforms the low-level features.
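The construction described above, one correlation coefficient per base attribute, can be sketched as follows. The sketch assumes that the image and each base attribute are already represented as equal-length low-level feature vectors and that the Pearson coefficient is the one intended; the abstract specifies neither, and the function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def hyperinformation_feature(image_vector, base_attributes):
    """One correlation per base attribute; the list is the high-level feature."""
    return [pearson(image_vector, attribute) for attribute in base_attributes]
```

The resulting vector has as many entries as there are base attributes, so its length is independent of the dimensionality of the underlying low-level features.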
Visual temporal processing in dyslexia and the magnocellular deficit theory: the need for speed?
McLean, Gregor M T; Stuart, Geoffrey W; Coltheart, Veronika; Castles, Anne
2011-12-01
A controversial question in reading research is whether dyslexia is associated with impairments in the magnocellular system and, if so, how these low-level visual impairments might affect reading acquisition. This study used a novel chromatic flicker perception task to specifically explore temporal aspects of magnocellular functioning in 40 children with dyslexia and 42 age-matched controls (aged 7-11). The relationship between magnocellular temporal resolution and higher-level aspects of visual temporal processing including inspection time, single and dual-target (attentional blink) RSVP performance, go/no-go reaction time, and rapid naming was also assessed. The Dyslexia group exhibited significant deficits in magnocellular temporal resolution compared with controls, but the two groups did not differ in parvocellular temporal resolution. Despite the significant group differences, associations between magnocellular temporal resolution and reading ability were relatively weak, and links between low-level temporal resolution and reading ability did not appear specific to the magnocellular system. Factor analyses revealed that a collective Perceptual Speed factor, involving both low-level and higher-level visual temporal processing measures, accounted for unique variance in reading ability independently of phonological processing, rapid naming, and general ability.
Arjunan, Sridhar P; Kumar, Dinesh K; Naik, Ganesh R
2010-01-01
This research paper reports an experimental study on identifying changes in the fractal properties of the surface electromyogram (sEMG) with changes in force level during low-level finger flexions. In a previous study, the authors identified a novel fractal feature, maximum fractal length (MFL), as a measure of the strength of low-level contractions and used this feature to identify various wrist and finger movements. This study tested the relationship between MFL and force of contraction. The results suggest that changes in MFL are correlated with changes in contraction level (20%, 50% and 80% of maximum voluntary contraction (MVC)) during low-level muscle activation such as finger flexions. From the statistical analysis and from visualisation using box plots, it is observed that MFL (p ≈ 0.001) is more strongly correlated with force of contraction than RMS (p ≈ 0.05), even when the muscle contraction is less than 50% MVC during low-level finger flexions. This work has established that this fractal feature will be useful in providing information about changes in force level during low-level finger movements for prosthetic control or human-computer interfaces.
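The abstract compares MFL with RMS but does not define either feature. The sketch below uses the standard root-mean-square definition, and for MFL it assumes a formulation that appears in later EMG feature surveys (the base-10 logarithm of the root of the summed squared first differences). That MFL formula is an assumption here and may differ from the authors' own computation.

```python
import math


def rms(signal):
    """Root mean square of one window of sEMG samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def maximum_fractal_length(signal):
    """MFL of one window, under the assumed formulation
    log10(sqrt(sum of squared differences between consecutive samples)).
    """
    total = sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))
    if total == 0:
        raise ValueError("MFL is undefined for a constant window")
    return math.log10(math.sqrt(total))
```

Both functions would be applied to short windows of the recording, so that each contraction level yields a distribution of feature values that can be compared across levels.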
Tracking and Classification of In-Air Hand Gesture Based on Thermal Guided Joint Filter.
Kim, Seongwan; Ban, Yuseok; Lee, Sangyoun
2017-01-17
The research on hand gestures has attracted many image processing-related studies, as it intuitively conveys the intention of a human as it pertains to motional meaning. Various sensors have been used to exploit the advantages of different modalities for the extraction of important information conveyed by the hand gesture of a user. Although many works have focused on learning the benefits of thermal information from thermal cameras, most have focused on face recognition or human body detection, rather than hand gesture recognition. Additionally, the majority of the works that take advantage of multiple modalities (e.g., the combination of a thermal sensor and a visual sensor) usually adopt simple fusion approaches between the two modalities. As both thermal sensors and visual sensors have their own shortcomings and strengths, we propose a novel joint filter-based hand gesture recognition method to simultaneously exploit the strengths and compensate for the shortcomings of each. Our study is motivated by the investigation of the mutual supplementation between thermal and visual information at the low feature level for the consistent representation of a hand in the presence of varying lighting conditions. Accordingly, our proposed method leverages the thermal sensor's stability against luminance and the visual sensor's textural detail, while complementing the low resolution and halo effect of thermal sensors and the weakness against illumination of visual sensors. A conventional region tracking method and a deep convolutional neural network have been leveraged to track the trajectory of a hand gesture and to recognize the hand gesture, respectively. Our experimental results show stability in recognizing a hand gesture against varying lighting conditions based on the contribution of the joint kernels of spatial adjacency and thermal range similarity.
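The "joint kernels of spatial adjacency and thermal range similarity" mentioned in this abstract resemble a joint (cross) bilateral filter, in which the visual image is smoothed with weights taken from a registered thermal image. The sketch below is a generic filter of that kind, not the authors' implementation. It assumes two same-sized grayscale images stored as nested lists, and the parameter names `radius`, `sigma_s` and `sigma_r` are illustrative.

```python
import math


def joint_bilateral_filter(visual, thermal, radius=1, sigma_s=1.0, sigma_r=10.0):
    """Smooth `visual` using weights guided by `thermal`.

    Each neighbor's weight is the product of a spatial Gaussian (adjacency)
    and a Gaussian on the thermal difference (range similarity), so pixels
    across a thermal edge contribute very little.
    """
    height, width = len(visual), len(visual[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            num = 0.0
            den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < height and 0 <= xx < width):
                        continue  # skip neighbors outside the image
                    w_space = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    dt = thermal[yy][xx] - thermal[y][x]
                    w_range = math.exp(-(dt * dt) / (2 * sigma_r ** 2))
                    num += w_space * w_range * visual[yy][xx]
                    den += w_space * w_range
            out[y][x] = num / den  # den >= 1 because the center weight is 1
    return out
```

With a small `sigma_r`, a warm hand region and a cooler background are averaged separately, which is one way the thermal channel could stabilize the visual representation under changing illumination.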
Tracking and Classification of In-Air Hand Gesture Based on Thermal Guided Joint Filter
Kim, Seongwan; Ban, Yuseok; Lee, Sangyoun
2017-01-01
The research on hand gestures has attracted many image processing-related studies, as it intuitively conveys the intention of a human as it pertains to motional meaning. Various sensors have been used to exploit the advantages of different modalities for the extraction of important information conveyed by the hand gesture of a user. Although many works have focused on learning the benefits of thermal information from thermal cameras, most have focused on face recognition or human body detection, rather than hand gesture recognition. Additionally, the majority of the works that take advantage of multiple modalities (e.g., the combination of a thermal sensor and a visual sensor) usually adopt simple fusion approaches between the two modalities. As both thermal sensors and visual sensors have their own shortcomings and strengths, we propose a novel joint filter-based hand gesture recognition method to simultaneously exploit the strengths and compensate for the shortcomings of each. Our study is motivated by the investigation of the mutual supplementation between thermal and visual information at the low feature level for the consistent representation of a hand in the presence of varying lighting conditions. Accordingly, our proposed method leverages the thermal sensor’s stability against luminance and the visual sensor’s textural detail, while complementing the low resolution and halo effect of thermal sensors and the weakness against illumination of visual sensors. A conventional region tracking method and a deep convolutional neural network have been leveraged to track the trajectory of a hand gesture and to recognize the hand gesture, respectively. Our experimental results show stability in recognizing a hand gesture against varying lighting conditions based on the contribution of the joint kernels of spatial adjacency and thermal range similarity. PMID:28106716
Taylor, Kirsten I.; Devereux, Barry J.; Acres, Kadia; Randall, Billi; Tyler, Lorraine K.
2013-01-01
Conceptual representations are at the heart of our mental lives, involved in every aspect of cognitive functioning. Despite their centrality, a long-standing debate persists as to how the meanings of concepts are represented and processed. Many accounts agree that the meanings of concrete concepts are represented by their individual features, but disagree about the importance of different feature-based variables: some views stress the importance of the information carried by distinctive features in conceptual processing, others the features which are shared over many concepts, and still others the extent to which features co-occur. We suggest that previously disparate theoretical positions and experimental findings can be unified by an account which claims that task demands determine how concepts are processed in addition to the effects of feature distinctiveness and co-occurrence. We tested these predictions in a basic-level naming task which relies on distinctive feature information (Experiment 1) and a domain decision task which relies on shared feature information (Experiment 2). Both used large-scale regression designs with the same visual objects, and mixed-effects models incorporating participant, session, stimulus-related and feature statistic variables to model the performance. We found that concepts with relatively more distinctive and more highly correlated distinctive relative to shared features facilitated basic-level naming latencies, while concepts with relatively more shared and more highly correlated shared relative to distinctive features speeded domain decisions. These findings demonstrate that the feature statistics of distinctiveness (shared vs. distinctive) and correlational strength, as well as the task demands, determine how concept meaning is processed in the conceptual system. PMID:22137770
I can see what you are saying: Auditory labels reduce visual search times.
Cho, Kit W
2016-10-01
The present study explored the self-directed-speech effect, the finding that relative to silent reading of a label (e.g., DOG), saying it aloud reduces visual search reaction times (RTs) for locating a target picture among distractors. Experiment 1 examined whether this effect is due to a confound in the differences in the number of cues in self-directed speech (two) vs. silent reading (one) and tested whether self-articulation is required for the effect. The results showed that self-articulation is not required and that merely hearing the auditory label reduces visual search RTs relative to silent reading. This finding also rules out the number of cues confound. Experiment 2 examined whether hearing an auditory label activates more prototypical features of the label's referent and whether the auditory-label benefit is moderated by the target's imagery concordance (the degree to which the target picture matches the mental picture that is activated by a written label for the target). When the target imagery concordance was high, RTs following the presentation of a high prototypicality picture or auditory cue were comparable and shorter than RTs following a visual label or low prototypicality picture cue. However, when the target imagery concordance was low, RTs following an auditory cue were shorter than the comparable RTs following the picture cues and visual-label cue. The results suggest that an auditory label activates both prototypical and atypical features of a concept and can facilitate visual search RTs even when compared to picture primes. Copyright © 2016 Elsevier B.V. All rights reserved.
Visual Stimuli Induce Waves of Electrical Activity in Turtle Cortex
NASA Astrophysics Data System (ADS)
Prechtl, J. C.; Cohen, L. B.; Pesaran, B.; Mitra, P. P.; Kleinfeld, D.
1997-07-01
The computations involved in the processing of a visual scene invariably involve the interactions among neurons throughout all of visual cortex. One hypothesis is that the timing of neuronal activity, as well as the amplitude of activity, provides a means to encode features of objects. The experimental data from studies on cat [Gray, C. M., Konig, P., Engel, A. K. & Singer, W. (1989) Nature (London) 338, 334-337] support a view in which only synchronous (no phase lags) activity carries information about the visual scene. In contrast, theoretical studies suggest, on the one hand, the utility of multiple phases within a population of neurons as a means to encode independent visual features and, on the other hand, the likely existence of timing differences solely on the basis of network dynamics. Here we use widefield imaging in conjunction with voltage-sensitive dyes to record electrical activity from the virtually intact, unanesthetized turtle brain. Our data consist of single-trial measurements. We analyze our data in the frequency domain to isolate coherent events that lie in different frequency bands. Low frequency oscillations (<5 Hz) are seen in both ongoing activity and activity induced by visual stimuli. These oscillations propagate parallel to the afferent input. Higher frequency activity, with spectral peaks near 10 and 20 Hz, is seen solely in response to stimulation. This activity consists of plane waves and spiral-like waves, as well as more complex patterns. The plane waves have an average phase gradient of ≈ π /2 radians/mm and propagate orthogonally to the low frequency waves. Our results show that large-scale differences in neuronal timing are present and persistent during visual processing.
Visual stimuli induce waves of electrical activity in turtle cortex
Prechtl, J. C.; Cohen, L. B.; Pesaran, B.; Mitra, P. P.; Kleinfeld, D.
1997-01-01
The computations involved in the processing of a visual scene invariably involve the interactions among neurons throughout all of visual cortex. One hypothesis is that the timing of neuronal activity, as well as the amplitude of activity, provides a means to encode features of objects. The experimental data from studies on cat [Gray, C. M., Konig, P., Engel, A. K. & Singer, W. (1989) Nature (London) 338, 334–337] support a view in which only synchronous (no phase lags) activity carries information about the visual scene. In contrast, theoretical studies suggest, on the one hand, the utility of multiple phases within a population of neurons as a means to encode independent visual features and, on the other hand, the likely existence of timing differences solely on the basis of network dynamics. Here we use widefield imaging in conjunction with voltage-sensitive dyes to record electrical activity from the virtually intact, unanesthetized turtle brain. Our data consist of single-trial measurements. We analyze our data in the frequency domain to isolate coherent events that lie in different frequency bands. Low frequency oscillations (<5 Hz) are seen in both ongoing activity and activity induced by visual stimuli. These oscillations propagate parallel to the afferent input. Higher frequency activity, with spectral peaks near 10 and 20 Hz, is seen solely in response to stimulation. This activity consists of plane waves and spiral-like waves, as well as more complex patterns. The plane waves have an average phase gradient of ≈π/2 radians/mm and propagate orthogonally to the low frequency waves. Our results show that large-scale differences in neuronal timing are present and persistent during visual processing. PMID:9207142
Grossberg, Stephen; Markowitz, Jeffrey; Cao, Yongqiang
2011-12-01
Visual object recognition is an essential accomplishment of advanced brains. Object recognition needs to be tolerant, or invariant, with respect to changes in object position, size, and view. In monkeys and humans, a key area for recognition is the anterior inferotemporal cortex (ITa). Recent neurophysiological data show that ITa cells with high object selectivity often have low position tolerance. We propose a neural model whose cells learn to simulate this tradeoff, as well as ITa responses to image morphs, while explaining how invariant recognition properties may arise in stages due to processes across multiple cortical areas. These processes include the cortical magnification factor, multiple receptive field sizes, and top-down attentive matching and learning properties that may be tuned by task requirements to attend to either concrete or abstract visual features with different levels of vigilance. The model predicts that data from the tradeoff and image morph tasks emerge from different levels of vigilance in the animals performing them. This result illustrates how different vigilance requirements of a task may change the course of category learning, notably the critical features that are attended and incorporated into learned category prototypes. The model outlines a path for developing an animal model of how defective vigilance control can lead to symptoms of various mental disorders, such as autism and amnesia. Copyright © 2011 Elsevier Ltd. All rights reserved.
Crosse, Michael J; Lalor, Edmund C
2014-04-01
Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by >10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information.
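A linear forward mapping from the speech envelope to an EEG channel can be sketched as a regularized regression on time-lagged copies of the envelope. The code below is a minimal single-channel illustration under stated assumptions: ridge regularization, causal lags only, and hypothetical function names. It is not the authors' analysis pipeline.

```python
def lagged_design(envelope, n_lags):
    """Rows are time points; column k holds the envelope delayed by k samples."""
    return [[envelope[t - k] if t - k >= 0 else 0.0 for k in range(n_lags)]
            for t in range(len(envelope))]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_forward_model(envelope, eeg, n_lags, ridge=1e-3):
    """Ridge estimate of the lag weights: (X'X + ridge * I) w = X'y."""
    X = lagged_design(envelope, n_lags)
    xtx = [[sum(row[i] * row[j] for row in X) + (ridge if i == j else 0.0)
            for j in range(n_lags)] for i in range(n_lags)]
    xty = [sum(row[i] * y for row, y in zip(X, eeg)) for i in range(n_lags)]
    return solve(xtx, xty)


def peak_lag(weights):
    """Lag (in samples) at which the fitted response is largest in magnitude."""
    return max(range(len(weights)), key=lambda k: abs(weights[k]))
```

Comparing `peak_lag` between models fitted to audio-only and audiovisual recordings is one simple way a latency shift of the kind reported here could be read off; dividing the lag by the sampling rate converts it to seconds.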
Automatic detection of multi-level acetowhite regions in RGB color images of the uterine cervix
NASA Astrophysics Data System (ADS)
Lange, Holger
2005-04-01
Uterine cervical cancer is the second most common cancer among women worldwide. Colposcopy is a diagnostic method used to detect cancer precursors and cancer of the uterine cervix, whereby a physician (colposcopist) visually inspects the metaplastic epithelium on the cervix for certain distinctly abnormal morphologic features. A contrast agent, a 3-5% acetic acid solution, is used, causing abnormal and metaplastic epithelia to turn white. The colposcopist considers diagnostic features such as the acetowhite, blood vessel structure, and lesion margin to derive a clinical diagnosis. STI Medical Systems is developing a Computer-Aided-Diagnosis (CAD) system for colposcopy -- ColpoCAD, a complex image analysis system that at its core assesses the same visual features as used by colposcopists. The acetowhite feature has been identified as one of the most important individual predictors of lesion severity. Here, we present the details and preliminary results of a multi-level acetowhite region detection algorithm for RGB color images of the cervix, including the detection of the anatomic features: cervix, os and columnar region, which are used for the acetowhite region detection. The RGB images are assumed to be glare free, either obtained by cross-polarized image acquisition or glare removal pre-processing. The basic approach of the algorithm is to extract a feature image from the RGB image that provides a good acetowhite to cervix background ratio, to segment the feature image using novel pixel grouping and multi-stage region-growing algorithms that provide region segmentations with different levels of detail, to extract the acetowhite regions from the region segmentations using a novel region selection algorithm, and then finally to extract the multi-levels from the acetowhite regions using multiple thresholds. The performance of the algorithm is demonstrated using human subject data.
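The final step of the algorithm, extracting multiple levels from the detected acetowhite regions using multiple thresholds, can be sketched as follows. The sketch assumes a feature image and a boolean region mask of the same size, both stored as nested lists; the threshold values and the function name are illustrative and do not come from the paper.

```python
from bisect import bisect_right


def multilevel_labels(feature_image, region_mask, thresholds):
    """Assign an integer level to every pixel.

    Pixels outside the detected region get level 0. Inside the region, the
    level is 1 plus the number of thresholds the feature value has reached,
    so k thresholds produce levels 1 through k + 1.
    """
    ordered = sorted(thresholds)
    return [
        [1 + bisect_right(ordered, value) if inside else 0
         for value, inside in zip(row, mask_row)]
        for row, mask_row in zip(feature_image, region_mask)
    ]
```

For example, with thresholds 0.3 and 0.7, an in-region pixel with feature value 0.5 would receive level 2, while any pixel outside the region stays at level 0.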
Comparison of Object Recognition Behavior in Human and Monkey
Rajalingham, Rishi; Schmidt, Kailyn
2015-01-01
Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks on a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize “pooled human” object recognition behavior, as well as 33 separate Mechanical Turk subjects to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception. SIGNIFICANCE STATEMENT To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. 
In light of this diversity, making tight comparisons between nonhuman and human primates is particularly critical in determining the best use of nonhuman primates to further the goal of the field of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt at comparing a high-level visual behavior of humans and macaque monkeys. PMID:26338324
Implicit Binding of Facial Features During Change Blindness
Lyyra, Pessi; Mäkelä, Hanna; Hietanen, Jari K.; Astikainen, Piia
2014-01-01
Change blindness refers to the inability to detect visual changes if introduced together with an eye-movement, blink, flash of light, or with distracting stimuli. Evidence of implicit detection of changed visual features during change blindness has been reported in a number of studies using both behavioral and neurophysiological measurements. However, it is not known whether implicit detection occurs only at the level of single features or whether complex organizations of features can be implicitly detected as well. We tested this in adult humans using intact and scrambled versions of schematic faces as stimuli in a change blindness paradigm while recording event-related potentials (ERPs). An enlargement of the face-sensitive N170 ERP component was observed at the right temporal electrode site to changes from scrambled to intact faces, even if the participants were not consciously able to report such changes (change blindness). Similarly, the disintegration of an intact face to scrambled features resulted in attenuated N170 responses during change blindness. Other ERP deflections were modulated by changes, but unlike the N170 component, they were indifferent to the direction of the change. The bidirectional modulation of the N170 component during change blindness suggests that implicit change detection can also occur at the level of complex features in the case of facial stimuli. PMID:24498165
Gomez-Ramirez, Manuel; Trzcinski, Natalie K.; Mihalas, Stefan; Niebur, Ernst
2014-01-01
Studies in vision show that attention enhances the firing rates of cells when it is directed towards their preferred stimulus feature. However, it is unknown whether other sensory systems employ this mechanism to mediate feature selection within their modalities. Moreover, whether feature-based attention modulates the correlated activity of a population is unclear. Indeed, temporal correlation codes such as spike-synchrony and spike-count correlations (rsc) are believed to play a role in stimulus selection by increasing the signal and reducing the noise in a population, respectively. Here, we investigate (1) whether feature-based attention biases the correlated activity between neurons when attention is directed towards their common preferred feature, (2) the interplay between spike-synchrony and rsc during feature selection, and (3) whether feature attention effects are common across the visual and tactile systems. Single-unit recordings were made in secondary somatosensory cortex of three non-human primates while animals engaged in tactile feature (orientation and frequency) and visual discrimination tasks. We found that both firing rate and spike-synchrony between neurons with similar feature selectivity were enhanced when attention was directed towards their preferred feature. However, attention effects on spike-synchrony were twice as large as those on firing rate, and had a tighter relationship with behavioral performance. Further, we observed increased rsc when attention was directed towards the visual modality (i.e., away from touch). These data suggest that similar feature selection mechanisms are employed in vision and touch, and that temporal correlation codes such as spike-synchrony play a role in mediating feature selection. 
We posit that feature-based selection operates by implementing multiple mechanisms that reduce the overall noise levels in the neural population and synchronize activity across subpopulations that encode the relevant features of sensory stimuli. PMID:25423284
Attention is required for maintenance of feature binding in visual working memory
Zokaei, Nahid; Heider, Maike; Husain, Masud
2013-01-01
Working memory and attention are intimately connected. However, understanding the relationship between the two is challenging. Currently, there is an important controversy about whether objects in working memory are maintained automatically or require resources that are also deployed for visual or auditory attention. Here we investigated the effects of loading attention resources on precision of visual working memory, specifically on correct maintenance of feature-bound objects, using a dual-task paradigm. Participants were presented with a memory array and were asked to remember either direction of motion of random dot kinematograms of different colour, or orientation of coloured bars. During the maintenance period, they performed a secondary visual or auditory task, with varying levels of load. Following a retention period, they adjusted a coloured probe to match either the motion direction or orientation of stimuli with the same colour in the memory array. This allowed us to examine the effects of an attention-demanding task performed during maintenance on precision of recall on the concurrent working memory task. Systematic increase in attention load during maintenance resulted in a significant decrease in overall working memory performance. Changes in overall performance were specifically accompanied by an increase in feature misbinding errors: erroneous reporting of nontarget motion or orientation. Thus in trials where attention resources were taxed, participants were more likely to respond with nontarget values rather than simply making random responses. Our findings suggest that resources used during attention-demanding visual or auditory tasks also contribute to maintaining feature-bound representations in visual working memory—but not necessarily other aspects of working memory. PMID:24266343
Attention is required for maintenance of feature binding in visual working memory.
Zokaei, Nahid; Heider, Maike; Husain, Masud
2014-01-01
Working memory and attention are intimately connected. However, understanding the relationship between the two is challenging. Currently, there is an important controversy about whether objects in working memory are maintained automatically or require resources that are also deployed for visual or auditory attention. Here we investigated the effects of loading attention resources on precision of visual working memory, specifically on correct maintenance of feature-bound objects, using a dual-task paradigm. Participants were presented with a memory array and were asked to remember either direction of motion of random dot kinematograms of different colour, or orientation of coloured bars. During the maintenance period, they performed a secondary visual or auditory task, with varying levels of load. Following a retention period, they adjusted a coloured probe to match either the motion direction or orientation of stimuli with the same colour in the memory array. This allowed us to examine the effects of an attention-demanding task performed during maintenance on precision of recall on the concurrent working memory task. Systematic increase in attention load during maintenance resulted in a significant decrease in overall working memory performance. Changes in overall performance were specifically accompanied by an increase in feature misbinding errors: erroneous reporting of nontarget motion or orientation. Thus in trials where attention resources were taxed, participants were more likely to respond with nontarget values rather than simply making random responses. Our findings suggest that resources used during attention-demanding visual or auditory tasks also contribute to maintaining feature-bound representations in visual working memory, but not necessarily other aspects of working memory.
Dissociation between perceptual processing and priming in long-term lorazepam users.
Giersch, Anne; Vidailhet, Pierre
2006-12-01
Acute effects of lorazepam on visual information processing, perceptual priming and explicit memory are well established. However, visual processing and perceptual priming have rarely been explored in long-term lorazepam users. By exploring these functions it was possible to test the hypothesis that difficulty in processing visual information may lead to deficiencies in perceptual priming. Using a simple blind procedure, we tested explicit memory, perceptual priming and visual perception in 15 long-term lorazepam users and 15 control subjects individually matched according to sex, age and education level. Explicit memory, perceptual priming, and the identification of fragmented pictures were found to be preserved in long-term lorazepam users, contrary to what is usually observed after an acute drug intake. The processing of visual contour, on the other hand, was still significantly impaired. These results suggest that the effects observed on low-level visual perception are independent of the acute deleterious effects of lorazepam on perceptual priming. A comparison of perceptual priming in subjects with low- vs. high-level identification of new fragmented pictures further suggests that the ability to identify fragmented pictures has no influence on priming. Despite the fact that they were treated with relatively low doses and far from peak plasma concentration, it is noteworthy that in long-term users memory was preserved.
Steady-state visual evoked potentials as a research tool in social affective neuroscience
Wieser, Matthias J.; Miskovic, Vladimir; Keil, Andreas
2017-01-01
Like many other primates, humans place a high premium on social information transmission and processing. One important aspect of this information concerns the emotional state of other individuals, conveyed by distinct visual cues such as facial expressions, overt actions, or by cues extracted from the situational context. A rich body of theoretical and empirical work has demonstrated that these socio-emotional cues are processed by the human visual system in a prioritized fashion, in the service of optimizing social behavior. Furthermore, socio-emotional perception is highly dependent on situational contexts and previous experience. Here, we review current issues in this area of research and discuss the utility of the steady-state visual evoked potential (ssVEP) technique for addressing key empirical questions. Methodological advantages and caveats are discussed with particular regard to quantifying time-varying competition among multiple perceptual objects, trial-by-trial analysis of visual cortical activation, functional connectivity, and the control of low-level stimulus features. Studies on facial expression and emotional scene processing are summarized, with an emphasis on viewing faces and other social cues in emotional contexts, or when competing with each other. Further, because the ssVEP technique can be readily accommodated to studying the viewing of complex scenes with multiple elements, it enables researchers to advance theoretical models of socio-emotional perception, based on complex, quasi-naturalistic viewing situations. PMID:27699794
Cognitive workload modulation through degraded visual stimuli: a single-trial EEG study
NASA Astrophysics Data System (ADS)
Yu, K.; Prasad, I.; Mir, H.; Thakor, N.; Al-Nashash, H.
2015-08-01
Objective. Our experiments explored the effect of visual stimuli degradation on cognitive workload. Approach. We investigated the subjective assessment, event-related potentials (ERPs) as well as electroencephalogram (EEG) as measures of cognitive workload. Main results. These experiments confirm that degradation of visual stimuli increases cognitive workload as assessed by subjective NASA task load index and confirmed by the observed P300 amplitude attenuation. Furthermore, the single-trial multi-level classification using features extracted from ERPs and EEG is found to be promising. Specifically, the adopted single-trial oscillatory EEG/ERP detection method achieved an average accuracy of 85% for discriminating 4 workload levels. Additionally, we found from the spatial patterns obtained from EEG signals that the frontal parts carry information that can be used for differentiating workload levels. Significance. Our results show that visual stimuli can modulate cognitive workload, and the modulation can be measured by the single trial EEG/ERP detection method.
Choi, Jeungok; Bakken, Suzanne
2010-01-01
Purpose: Low health literacy has been associated with poor health-related outcomes. The purposes are to report the development of a website for low-literate parents in the Neonatal Intensive Care Unit (NICU), and the findings of a heuristic evaluation and usability testing of this website. Methods: To address low literacy of NICU parents, a multimedia educational Website using visual aids (e.g., pictographs, photographs) and voice-recorded text messages in addition to simplified text was developed. The text was created at the 5th grade readability level. The heuristic evaluation was conducted by three usability experts using 10 heuristics. End-users’ performance was measured by counting the time spent completing tasks and number of errors, as well as recording users’ perception of ease of use and usefulness (PEUU) in a sample of 10 NICU parents. Results: Three evaluators identified 82 violations across the 10 heuristics. All violations, however, received scores <2, indicating minor usability problems. Participants’ time to complete a task varied from 81.2 seconds (SD=30.9) to 2.2 seconds (SD=1.3). Participants rated the Website as easy to use and useful (PEUU Mean= 4.52, SD=0.53). Based on the participants’ comments, appropriate modifications were made. Discussion and Conclusions: Different types of visuals on the Website were well accepted by low-literate users and agreement of visuals with text improved understanding of the educational materials over that with text alone. The findings suggest that using concrete and realistic pictures and pictographs with clear captions would maximize the benefit of visuals. One emerging theme was “simplicity” in design (e.g., limited use of colors, one font type and size), content (e.g., avoid lengthy text), and technical features (e.g., limited use of pop-ups).
The heuristic evaluation by usability experts and the usability test with actual users provided complementary expertise, which can give a richer assessment of a design for low literacy Website. These results facilitated design modification and implementation of solutions by categorizing and prioritizing the usability problems. PMID:20617546
Choi, Jeungok; Bakken, Suzanne
2010-08-01
Low health literacy has been associated with poor health-related outcomes. The purposes are to report the development of a website for low-literate parents in the Neonatal Intensive Care Unit (NICU), and the findings of a heuristic evaluation and usability testing of this website. To address low literacy of NICU parents, a multimedia educational Website using visual aids (e.g., pictographs, photographs) and voice-recorded text messages in addition to simplified text was developed. The text was created at the 5th grade readability level. The heuristic evaluation was conducted by three usability experts using 10 heuristics. End-users' performance was measured by counting the time spent completing tasks and number of errors, as well as recording users' perception of ease of use and usefulness (PEUU) in a sample of 10 NICU parents. Three evaluators identified 82 violations across the 10 heuristics. All violations, however, received scores <2, indicating minor usability problems. Participants' time to complete a task varied from 81.2 s (SD = 30.9) to 2.2 s (SD = 1.3). Participants rated the Website as easy to use and useful (PEUU mean = 4.52, SD = 0.53). Based on the participants' comments, appropriate modifications were made. Different types of visuals on the Website were well accepted by low-literate users and agreement of visuals with text improved understanding of the educational materials over that with text alone. The findings suggest that using concrete and realistic pictures and pictographs with clear captions would maximize the benefit of visuals. One emerging theme was "simplicity" in design (e.g., limited use of colors, one font type and size), content (e.g., avoid lengthy text), and technical features (e.g., limited use of pop-ups). The heuristic evaluation by usability experts and the usability test with actual users provided complementary expertise, which can give a richer assessment of a design for low literacy Website.
These results facilitated design modification and implementation of solutions by categorizing and prioritizing the usability problems.
The Limits of Shape Recognition following Late Emergence from Blindness.
McKyton, Ayelet; Ben-Zion, Itay; Doron, Ravid; Zohary, Ehud
2015-09-21
Visual object recognition develops during the first years of life. But what if one is deprived of vision during early post-natal development? Shape information is extracted using both low-level cues (e.g., intensity- or color-based contours) and more complex algorithms that are largely based on inference assumptions (e.g., illumination is from above, objects are often partially occluded). Previous studies, testing visual acuity using a 2D shape-identification task (Lea symbols), indicate that contour-based shape recognition can improve with visual experience, even after years of visual deprivation from birth. We hypothesized that this may generalize to other low-level cues (shape, size, and color), but not to mid-level functions (e.g., 3D shape from shading) that might require prior visual knowledge. To that end, we studied a unique group of subjects in Ethiopia that suffered from an early manifestation of dense bilateral cataracts and were surgically treated only years later. Our results suggest that the newly sighted rapidly acquire the ability to recognize an odd element within an array, on the basis of color, size, or shape differences. However, they are generally unable to find the odd shape on the basis of illusory contours, shading, or occlusion relationships. Little recovery of these mid-level functions is seen within 1 year post-operation. We find that visual performance using low-level cues is relatively robust to prolonged deprivation from birth. However, the use of pictorial depth cues to infer 3D structure from the 2D retinal image is highly susceptible to early and prolonged visual deprivation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Prevalence and causes of low vision and blindness in Baotou
Zhang, Guisen; Li, Yan; Teng, Xuelong; Wu, Qiang; Gong, Hui; Ren, Fengmei; Guo, Yuxia; Liu, Lei; Zhang, Han
2016-01-01
The aim of this study was to investigate the prevalence and causes of low vision and blindness in Baotou, Inner Mongolia. A cross-sectional study was carried out. Multistage sampling was used to select samples. The visual acuity was estimated using LogMAR and corrected by pinhole as best-corrected visual acuity. There were 7000 samples selected and 5770 subjects included in this investigation. The overall bilateral prevalence rates of low vision and blindness were 3.66% (95% CI: 3.17–4.14) and 0.99% (95% CI: 0.73–1.24), respectively. The prevalence of bilateral low vision, blindness, and visual impairment increased with age and decreased with education level. The leading cause of low vision and blindness was cataract. Diabetic retinopathy and age-related macular degeneration were found to be the second leading causes of blindness in Baotou. Low vision and blindness were more prevalent in elderly people and subjects with low education level in Baotou. Cataract was the main cause of visual impairment and more attention should be paid to fundus diseases. In order to prevent blindness, many more eye care programs should be established. PMID:27631267
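The reported confidence intervals can be roughly reproduced with a normal-approximation (Wald) interval for a proportion. The abstract does not say which interval method the authors used, so the choice of the Wald formula, the function name `wald_ci`, and the use of z = 1.96 are assumptions made here for illustration only.

```python
import math

def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    p: observed proportion (0..1); n: sample size; z: normal quantile
    (1.96 for a 95% interval). This is an assumed method, not one stated
    in the abstract.
    """
    se = math.sqrt(p * (1.0 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se

# Reported figures: 5770 subjects, 3.66% low vision, 0.99% blindness.
low_vision_ci = wald_ci(0.0366, 5770)
blindness_ci = wald_ci(0.0099, 5770)
print([round(100 * x, 2) for x in low_vision_ci])
print([round(100 * x, 2) for x in blindness_ci])
```

Under these assumptions the intervals land within about 0.01 to 0.02 percentage points of the published bounds, which is consistent with rounding or a slightly different interval method.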
Prevalence and causes of low vision and blindness in Baotou: A cross-sectional study.
Zhang, Guisen; Li, Yan; Teng, Xuelong; Wu, Qiang; Gong, Hui; Ren, Fengmei; Guo, Yuxia; Liu, Lei; Zhang, Han
2016-09-01
The aim of this study was to investigate the prevalence and causes of low vision and blindness in Baotou, Inner Mongolia. A cross-sectional study was carried out. Multistage sampling was used to select samples. The visual acuity was estimated using LogMAR and corrected by pinhole as best-corrected visual acuity. There were 7000 samples selected and 5770 subjects included in this investigation. The overall bilateral prevalence rates of low vision and blindness were 3.66% (95% CI: 3.17-4.14) and 0.99% (95% CI: 0.73-1.24), respectively. The prevalence of bilateral low vision, blindness, and visual impairment increased with age and decreased with education level. The leading cause of low vision and blindness was cataract. Diabetic retinopathy and age-related macular degeneration were found to be the second leading causes of blindness in Baotou. Low vision and blindness were more prevalent in elderly people and subjects with low education level in Baotou. Cataract was the main cause of visual impairment and more attention should be paid to fundus diseases. In order to prevent blindness, many more eye care programs should be established.
Nagai, Takehiro; Matsushima, Toshiki; Koida, Kowa; Tani, Yusuke; Kitazaki, Michiteru; Nakauchi, Shigeki
2015-10-01
Humans can visually recognize material categories of objects, such as glass, stone, and plastic, easily. However, little is known about the kinds of surface quality features that contribute to such material class recognition. In this paper, we examine the relationship between perceptual surface features and material category discrimination performance for pictures of materials, focusing on temporal aspects, including reaction time and effects of stimulus duration. The stimuli were pictures of objects with an identical shape but made of different materials that could be categorized into seven classes (glass, plastic, metal, stone, wood, leather, and fabric). In a pre-experiment, observers rated the pictures on nine surface features, including visual (e.g., glossiness and transparency) and non-visual features (e.g., heaviness and warmness), on a 7-point scale. In the main experiments, observers judged whether two simultaneously presented pictures were classified as the same or different material category. Reaction times and effects of stimulus duration were measured. The results showed that visual feature ratings were correlated with material discrimination performance for short reaction times or short stimulus durations, while non-visual feature ratings were correlated only with performance for long reaction times or long stimulus durations. These results suggest that the mechanisms underlying visual and non-visual feature processing may differ in terms of processing time, although the cause is unclear. Visual surface features may mainly contribute to material recognition in daily life, while non-visual features may contribute only weakly, if at all. Copyright © 2014 Elsevier Ltd. All rights reserved.
Saturation in Phosphene Size with Increasing Current Levels Delivered to Human Visual Cortex.
Bosking, William H; Sun, Ping; Ozker, Muge; Pei, Xiaomei; Foster, Brett L; Beauchamp, Michael S; Yoshor, Daniel
2017-07-26
Electrically stimulating early visual cortex results in a visual percept known as a phosphene. Although phosphenes can be evoked by a wide range of electrode sizes and current amplitudes, they are invariably described as small. To better understand this observation, we electrically stimulated 93 electrodes implanted in the visual cortex of 13 human subjects who reported phosphene size while stimulation current was varied. Phosphene size increased as the stimulation current was initially raised above threshold, but then rapidly reached saturation. Phosphene size also depended on the location of the stimulated site, with size increasing with distance from the foveal representation. We developed a model relating phosphene size to the amount of activated cortex and its location within the retinotopic map. First, a sigmoidal curve was used to predict the amount of activated cortex at a given current. Second, the amount of active cortex was converted to degrees of visual angle by multiplying by the inverse cortical magnification factor for that retinotopic location. This simple model accurately predicted phosphene size for a broad range of stimulation currents and cortical locations. The unexpected saturation in phosphene sizes suggests that the functional architecture of cerebral cortex may impose fundamental restrictions on the spread of artificially evoked activity and this may be an important consideration in the design of cortical prosthetic devices. SIGNIFICANCE STATEMENT Understanding the neural basis for phosphenes, the visual percepts created by electrical stimulation of visual cortex, is fundamental to the development of a visual cortical prosthetic. Our experiments in human subjects implanted with electrodes over visual cortex show that it is the activity of a large population of cells spread out across several millimeters of tissue that supports the perception of a phosphene. 
In addition, we describe an important feature of the production of phosphenes by electrical stimulation: phosphene size saturates at a relatively low current level. This finding implies that, with current methods, visual prosthetics will have a limited dynamic range available to control the production of spatial forms and that more advanced stimulation methods may be required. Copyright © 2017 the authors 0270-6474/17/377188-10$15.00/0.
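The two-step model described in the abstract above (a sigmoidal curve for activated cortex, then conversion to visual angle through the inverse cortical magnification factor) can be sketched as follows. All parameter values (`max_extent_mm`, `midpoint_ma`, `slope`) and the function names are hypothetical placeholders chosen for illustration; the abstract does not report the fitted values or the exact sigmoid used.

```python
import math

def activated_cortex_mm(current_ma, max_extent_mm=5.0, midpoint_ma=1.0, slope=4.0):
    """Step 1: logistic (sigmoidal) estimate of activated cortex in mm.

    Parameter defaults are illustrative assumptions, not fitted values.
    """
    return max_extent_mm / (1.0 + math.exp(-slope * (current_ma - midpoint_ma)))

def phosphene_size_deg(current_ma, magnification_mm_per_deg):
    """Step 2: convert mm of cortex to degrees of visual angle.

    Multiplies by the inverse cortical magnification factor (deg per mm)
    for the stimulated retinotopic location.
    """
    inverse_magnification = 1.0 / magnification_mm_per_deg
    return activated_cortex_mm(current_ma) * inverse_magnification

# Size rises near threshold, then saturates; sites with lower magnification
# (farther from the foveal representation) give larger phosphenes.
for current in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(current, round(phosphene_size_deg(current, 10.0), 3),
          round(phosphene_size_deg(current, 1.0), 3))
```

The sketch reproduces the two qualitative findings reported in the abstract: saturation with increasing current, and larger phosphenes at greater eccentricity, where cortical magnification is smaller.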
Sensitivity to synchronicity of biological motion in normal and amblyopic vision
Luu, Jennifer Y.; Levi, Dennis M.
2017-01-01
Amblyopia is a developmental disorder of spatial vision that results from abnormal early visual experience usually due to the presence of strabismus, anisometropia, or both strabismus and anisometropia. Amblyopia results in a range of visual deficits that cannot be corrected by optics because the deficits reflect neural abnormalities. Biological motion refers to the motion patterns of living organisms, and is normally displayed as points of lights positioned at the major joints of the body. In this experiment, our goal was twofold. We wished to examine whether the human visual system in people with amblyopia retained the higher-level processing capabilities to extract visual information from the synchronized actions of others, therefore retaining the ability to detect biological motion. Specifically, we wanted to determine if the synchronized interaction of two agents performing a dancing routine allowed the amblyopic observer to use the actions of one agent to predict the expected actions of a second agent. We also wished to establish whether synchronicity sensitivity (detection of synchronized versus desynchronized interactions) is impaired in amblyopic observers relative to normal observers. The two aims are differentiated in that the first aim looks at whether synchronized actions result in improved expected action predictions while the second aim quantitatively compares synchronicity sensitivity, or the ratio of desynchronized to synchronized detection sensitivities, to determine if there is a difference between normal and amblyopic observers. Our results show that the ability to detect biological motion requires more samples in both eyes of amblyopes than in normal control observers. The increased sample threshold is not the result of low-level losses but may reflect losses in feature integration due to undersampling in the amblyopic visual system. 
However, like normal observers, amblyopes are more sensitive to synchronized versus desynchronized interactions, indicating that higher-level processing of biological motion remains intact. We also found no impairment in synchronicity sensitivity in the amblyopic visual system relative to the normal visual system. Since there is no impairment in synchronicity sensitivity in either the nonamblyopic or amblyopic eye of amblyopes, our results suggest that the higher order processing of biological motion is intact. PMID:23474301
NASA Technical Reports Server (NTRS)
Watson, Andrew B.
1990-01-01
All vision systems, both human and machine, transform the spatial image into a coded representation. Particular codes may be optimized for efficiency or to extract useful image features. Researchers explored image codes based on primary visual cortex in man and other primates. Understanding these codes will advance the art in image coding, autonomous vision, and computational human factors. In cortex, imagery is coded by features that vary in size, orientation, and position. Researchers have devised a mathematical model of this transformation, called the Hexagonal oriented Orthogonal quadrature Pyramid (HOP). In a pyramid code, features are segregated by size into layers, with fewer features in the layers devoted to large features. Pyramid schemes provide scale invariance, and are useful for coarse-to-fine searching and for progressive transmission of images. The HOP Pyramid is novel in three respects: (1) it uses a hexagonal pixel lattice, (2) it uses oriented features, and (3) it accurately models most of the prominent aspects of primary visual cortex. The transform uses seven basic features (kernels), which may be regarded as three oriented edges, three oriented bars, and one non-oriented blob. Application of these kernels to non-overlapping seven-pixel neighborhoods yields six oriented, high-pass pyramid layers, and one low-pass (blob) layer.
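The bookkeeping of such a pyramid can be illustrated with a short sketch. It assumes that the seven kernels are applied recursively to the low-pass layer, as in a conventional pyramid, and that the pixel count is a power of seven; the function name `hop_layer_sizes` is hypothetical and the code counts coefficients only, without implementing the hexagonal kernels themselves.

```python
def hop_layer_sizes(n_pixels):
    """Count coefficients per pyramid level for a seven-kernel transform.

    Each non-overlapping seven-pixel neighborhood yields six oriented
    high-pass coefficients and one low-pass coefficient. The low-pass
    layer is then (by assumption) transformed again at the next level.
    Returns a list of (level, high_pass_count, low_pass_count).
    """
    layers = []
    remaining = n_pixels
    level = 0
    while remaining >= 7 and remaining % 7 == 0:
        neighborhoods = remaining // 7
        layers.append((level, 6 * neighborhoods, neighborhoods))
        remaining = neighborhoods  # recurse on the low-pass layer
        level += 1
    return layers

layers = hop_layer_sizes(7 ** 3)
for level, high, low in layers:
    print(level, high, low)
# All high-pass coefficients plus the final low-pass layer equal the pixel count.
print(sum(high for _, high, _ in layers) + layers[-1][2])
```

Under these assumptions the code is critically sampled: the number of coefficients equals the number of input pixels, and coarser levels (larger features) hold fewer coefficients, as the abstract describes.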
Abbas, Qaisar; Fondon, Irene; Sarmiento, Auxiliadora; Jiménez, Soledad; Alemany, Pedro
2017-11-01
Diabetic retinopathy (DR) is a leading cause of blindness among diabetic patients. Ophthalmologists need to recognize its severity level in order to detect and diagnose DR early. However, this is a challenging task for both medical experts and computer-aided diagnosis systems because it requires extensive domain expertise. In this article, a novel automatic recognition system for the five severity levels of diabetic retinopathy (SLDR) is developed without performing any pre- and post-processing steps on retinal fundus images through learning of deep visual features (DVFs). These DVFs are extracted from each image by using color dense in scale-invariant and gradient location-orientation histogram techniques. To learn these DVFs, a semi-supervised multilayer deep-learning algorithm is utilized along with a new compressed layer and fine-tuning steps. This SLDR system was evaluated and compared with state-of-the-art techniques using the measures of sensitivity (SE), specificity (SP) and area under the receiver operating characteristic curve (AUC). On 750 fundus images (150 per category), an SE of 92.18%, an SP of 94.50% and an AUC of 0.924 were obtained on average. These results demonstrate that the SLDR system is appropriate for early detection of DR and provide an effective treatment for prediction type of diabetes.
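For reference, sensitivity and specificity for one severity class can be computed one-vs-rest from confusion counts, as in the minimal sketch below. The counts are hypothetical values picked only so that one class has 150 images and the rest 600, matching the dataset layout in the abstract; they are not the study's data, and the function name is an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """One-vs-rest sensitivity (SE) and specificity (SP) from counts.

    tp/fn: images of the target class that were / were not recognized;
    tn/fp: images of other classes correctly rejected / wrongly accepted.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Hypothetical counts for a single class: 150 positives, 600 negatives.
se, sp = sensitivity_specificity(tp=138, fn=12, tn=567, fp=33)
print(round(se, 4), round(sp, 4))
```

Averaging these per-class values over the five severity levels would give summary figures of the kind the abstract reports, although the abstract does not state how the averaging was done.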
Parallel Distractor Rejection as a Binding Mechanism in Search
Dent, Kevin; Allen, Harriet A.; Braithwaite, Jason J.; Humphreys, Glyn W.
2012-01-01
The relatively common experimental visual search task of finding a red X amongst red O’s and green X’s (conjunction search) presents the visual system with a binding problem. Illusory conjunctions (ICs) of features across objects must be avoided and only features present in the same object bound together. Correct binding into unique objects by the visual system may be promoted, and ICs minimized, by inhibiting the locations of distractors possessing non-target features (e.g., Treisman and Sato, 1990). Such parallel rejection of interfering distractors leaves the target as the only item competing for selection; thus solving the binding problem. In the present article we explore the theoretical and empirical basis of this process of active distractor inhibition in search. Specific experiments that provide strong evidence for a process of active distractor inhibition in search are highlighted. In the final part of the article we consider how distractor inhibition, as defined here, may be realized at a neurophysiological level (Treisman and Sato, 1990). PMID:22908002
Decoding conjunctions of direction-of-motion and binocular disparity from human visual cortex.
Seymour, Kiley J; Clifford, Colin W G
2012-05-01
Motion and binocular disparity are two features in our environment that share a common correspondence problem. Decades of psychophysical research dedicated to understanding stereopsis suggest that these features interact early in human visual processing to disambiguate depth. Single-unit recordings in the monkey also provide evidence for the joint encoding of motion and disparity across much of the dorsal visual stream. Here, we used functional MRI and multivariate pattern analysis to examine where in the human brain conjunctions of motion and disparity are encoded. Subjects sequentially viewed two stimuli that could be distinguished only by their conjunctions of motion and disparity. Specifically, each stimulus contained the same feature information (leftward and rightward motion and crossed and uncrossed disparity) but differed exclusively in the way these features were paired. Our results revealed that a linear classifier could accurately decode which stimulus a subject was viewing based on voxel activation patterns throughout the dorsal visual areas and as early as V2. This decoding success was conditional on some voxels being individually sensitive to the unique conjunctions comprising each stimulus, thus a classifier could not rely on independent information about motion and binocular disparity to distinguish these conjunctions. This study expands on evidence that disparity and motion interact at many levels of human visual processing, particularly within the dorsal stream. It also lends support to the idea that stereopsis is subserved by early mechanisms also tuned to direction of motion.
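The logic of the stimulus design can be illustrated with a toy simulation. It assumes, purely for illustration, that a voxel's response is the sum of its responses to the two surfaces in a stimulus; the gain values and function names are hypothetical. A voxel that codes motion and disparity independently then responds identically to both stimuli, while a voxel sensitive to specific pairings does not.

```python
import math

# Each stimulus contains both motions and both disparities; only the pairing differs.
STIM_A = [("left", "crossed"), ("right", "uncrossed")]
STIM_B = [("left", "uncrossed"), ("right", "crossed")]

def independent_voxel(stim, motion_gain, disparity_gain):
    """Response of a voxel that sums separate motion and disparity signals."""
    return sum(motion_gain[m] + disparity_gain[d] for m, d in stim)

def conjunction_voxel(stim, pair_gain):
    """Response of a voxel tuned to specific motion-disparity pairings."""
    return sum(pair_gain[(m, d)] for m, d in stim)

# Hypothetical tuning values.
motion_gain = {"left": 1.0, "right": 0.3}
disparity_gain = {"crossed": 0.8, "uncrossed": 0.1}
pair_gain = {("left", "crossed"): 1.0, ("right", "uncrossed"): 0.9,
             ("left", "uncrossed"): 0.2, ("right", "crossed"): 0.1}

ind_a = independent_voxel(STIM_A, motion_gain, disparity_gain)
ind_b = independent_voxel(STIM_B, motion_gain, disparity_gain)
conj_a = conjunction_voxel(STIM_A, pair_gain)
conj_b = conjunction_voxel(STIM_B, pair_gain)

print(math.isclose(ind_a, ind_b))    # independent coding cannot separate A from B
print(math.isclose(conj_a, conj_b))  # conjunction coding can
```

Because the independent voxel gives the same value for both stimuli whatever gains are chosen, no linear readout of such voxels could separate them, which is why successful decoding implies conjunction-sensitive signals.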
Linearly Additive Shape and Color Signals in Monkey Inferotemporal Cortex
McMahon, David B. T.; Olson, Carl R.
2009-01-01
How does the brain represent a red circle? One possibility is that there is a specialized and possibly time-consuming process whereby the attributes of shape and color, carried by separate populations of neurons in low-order visual cortex, are bound together into a unitary neural representation. Another possibility is that neurons in high-order visual cortex are selective, by virtue of their bottom-up input from low-order visual areas, for particular conjunctions of shape and color. A third possibility is that they simply sum shape and color signals linearly. We tested these ideas by measuring the responses of inferotemporal cortex neurons to sets of stimuli in which two attributes—shape and color—varied independently. We find that a few neurons exhibit conjunction selectivity but that in most neurons the influences of shape and color sum linearly. Contrary to the idea of conjunction coding, few neurons respond selectively to a particular combination of shape and color. Contrary to the idea that binding requires time, conjunction signals, when present, occur as early as feature signals. We argue that neither conjunction selectivity nor a specialized feature binding process is necessary for the effective representation of shape–color combinations. PMID:19144745
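The contrast between linear summation and conjunction selectivity can be made concrete with a small additive-model check. The sketch below fits a main-effects-only model to a shape-by-color response table and reports the residual (interaction) term; the firing-rate tables are hypothetical examples and the function name is an assumption, not the authors' analysis.

```python
def additive_fit(responses):
    """Fit response[i][j] ~ shape_mean[i] + color_mean[j] - grand_mean.

    responses: rows are shapes, columns are colors.
    Returns (predicted, residual); a near-zero residual means shape and
    color influences sum linearly, a large one indicates conjunction coding.
    """
    n_shapes, n_colors = len(responses), len(responses[0])
    grand = sum(sum(row) for row in responses) / (n_shapes * n_colors)
    shape_means = [sum(row) / n_colors for row in responses]
    color_means = [sum(responses[i][j] for i in range(n_shapes)) / n_shapes
                   for j in range(n_colors)]
    predicted = [[shape_means[i] + color_means[j] - grand
                  for j in range(n_colors)] for i in range(n_shapes)]
    residual = [[responses[i][j] - predicted[i][j]
                 for j in range(n_colors)] for i in range(n_shapes)]
    return predicted, residual

# Hypothetical neurons (spikes/s): one additive, one conjunction-selective.
_, additive_residual = additive_fit([[10.0, 14.0], [20.0, 24.0]])
_, conjunction_residual = additive_fit([[30.0, 10.0], [10.0, 10.0]])
print(additive_residual)
print(conjunction_residual)
```

In the additive example the residuals vanish, whereas the neuron that fires strongly for only one shape-color pairing leaves a structured residual that the main effects cannot absorb.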
Linearly additive shape and color signals in monkey inferotemporal cortex.
McMahon, David B T; Olson, Carl R
2009-04-01
How does the brain represent a red circle? One possibility is that there is a specialized and possibly time-consuming process whereby the attributes of shape and color, carried by separate populations of neurons in low-order visual cortex, are bound together into a unitary neural representation. Another possibility is that neurons in high-order visual cortex are selective, by virtue of their bottom-up input from low-order visual areas, for particular conjunctions of shape and color. A third possibility is that they simply sum shape and color signals linearly. We tested these ideas by measuring the responses of inferotemporal cortex neurons to sets of stimuli in which two attributes, shape and color, varied independently. We find that a few neurons exhibit conjunction selectivity but that in most neurons the influences of shape and color sum linearly. Contrary to the idea of conjunction coding, few neurons respond selectively to a particular combination of shape and color. Contrary to the idea that binding requires time, conjunction signals, when present, occur as early as feature signals. We argue that neither conjunction selectivity nor a specialized feature binding process is necessary for the effective representation of shape-color combinations.
Harris, Joseph A; Wu, Chien-Te; Woldorff, Marty G
2011-06-07
It is generally agreed that considerable amounts of low-level sensory processing of visual stimuli can occur without conscious awareness. On the other hand, the degree of higher level visual processing that occurs in the absence of awareness is as yet unclear. Here, event-related potential (ERP) measures of brain activity were recorded during a sandwich-masking paradigm, a commonly used approach for attenuating conscious awareness of visual stimulus content. In particular, the present study used a combination of ERP activation contrasts to track both early sensory-processing ERP components and face-specific N170 ERP activations, in trials with versus without awareness. The electrophysiological measures revealed that the sandwich masking abolished the early face-specific N170 neural response (peaking at ~170 ms post-stimulus), an effect that paralleled the abolition of awareness of face versus non-face image content. Furthermore, however, the masking appeared to render a strong attenuation of earlier feedforward visual sensory-processing signals. This early attenuation presumably resulted in insufficient information being fed into the higher level visual system pathways specific to object category processing, thus leading to unawareness of the visual object content. These results support a coupling of visual awareness and neural indices of face processing, while also demonstrating an early low-level mechanism of interference in sandwich masking.
Storage of features, conjunctions and objects in visual working memory.
Vogel, E K; Woodman, G F; Luck, S J
2001-02-01
Working memory can be divided into separate subsystems for verbal and visual information. Although the verbal system has been well characterized, the storage capacity of visual working memory has not yet been established for simple features or for conjunctions of features. The authors demonstrate that it is possible to retain information about only 3-4 colors or orientations in visual working memory at one time. Observers are also able to retain both the color and the orientation of 3-4 objects, indicating that visual working memory stores integrated objects rather than individual features. Indeed, objects defined by a conjunction of four features can be retained in working memory just as well as single-feature objects, allowing many individual features to be retained when distributed across a small number of objects. Thus, the capacity of visual working memory must be understood in terms of integrated objects rather than individual features.
End-Stopping Predicts Curvature Tuning along the Ventral Stream
Hartmann, Till S.; Livingstone, Margaret S.
2017-01-01
Neurons in primate inferotemporal cortex (IT) are clustered into patches of shared image preferences. Functional imaging has shown that these patches are activated by natural categories (e.g., faces, body parts, and places), artificial categories (numerals, words) and geometric features (curvature and real-world size). These domains develop in the same cortical locations across monkeys and humans, which raises the possibility of common innate mechanisms. Although these commonalities could be high-level template-based categories, it is alternatively possible that the domain locations are constrained by low-level properties such as end-stopping, eccentricity, and the shape of the preferred images. To explore this, we looked for correlations among curvature preference, receptive field (RF) end-stopping, and RF eccentricity in the ventral stream. We recorded from sites in V1, V4, and posterior IT (PIT) from six monkeys using microelectrode arrays. Across all visual areas, we found a tendency for end-stopped sites to prefer curved over straight contours. Further, we found a progression in population curvature preferences along the visual hierarchy, where, on average, V1 sites preferred straight Gabors, V4 sites preferred curved stimuli, and many PIT sites showed a preference for curvature that was concave relative to fixation. Our results provide evidence that high-level functional domains may be mapped according to early rudimentary properties of the visual system. SIGNIFICANCE STATEMENT The macaque occipitotemporal cortex contains clusters of neurons with preferences for categories such as faces, body parts, and places. One common question is how these clusters (or “domains”) acquire their cortical position along the ventral stream. 
We and other investigators previously established an fMRI-level correlation among these category domains, retinotopy, and curvature preferences: for example, in inferotemporal cortex, face- and curvature-preferring domains show a central visual field bias whereas place- and rectilinear-preferring domains show a more peripheral visual field bias. Here, we have found an electrophysiological-level explanation for the correlation among domain preference, curvature, and retinotopy based on neuronal preference for short over long contours, also called end-stopping. PMID:28100746
End-Stopping Predicts Curvature Tuning along the Ventral Stream.
Ponce, Carlos R; Hartmann, Till S; Livingstone, Margaret S
2017-01-18
Neurons in primate inferotemporal cortex (IT) are clustered into patches of shared image preferences. Functional imaging has shown that these patches are activated by natural categories (e.g., faces, body parts, and places), artificial categories (numerals, words) and geometric features (curvature and real-world size). These domains develop in the same cortical locations across monkeys and humans, which raises the possibility of common innate mechanisms. Although these commonalities could be high-level template-based categories, it is alternatively possible that the domain locations are constrained by low-level properties such as end-stopping, eccentricity, and the shape of the preferred images. To explore this, we looked for correlations among curvature preference, receptive field (RF) end-stopping, and RF eccentricity in the ventral stream. We recorded from sites in V1, V4, and posterior IT (PIT) from six monkeys using microelectrode arrays. Across all visual areas, we found a tendency for end-stopped sites to prefer curved over straight contours. Further, we found a progression in population curvature preferences along the visual hierarchy, where, on average, V1 sites preferred straight Gabors, V4 sites preferred curved stimuli, and many PIT sites showed a preference for curvature that was concave relative to fixation. Our results provide evidence that high-level functional domains may be mapped according to early rudimentary properties of the visual system. The macaque occipitotemporal cortex contains clusters of neurons with preferences for categories such as faces, body parts, and places. One common question is how these clusters (or "domains") acquire their cortical position along the ventral stream. 
We and other investigators previously established an fMRI-level correlation among these category domains, retinotopy, and curvature preferences: for example, in inferotemporal cortex, face- and curvature-preferring domains show a central visual field bias whereas place- and rectilinear-preferring domains show a more peripheral visual field bias. Here, we have found an electrophysiological-level explanation for the correlation among domain preference, curvature, and retinotopy based on neuronal preference for short over long contours, also called end-stopping.
Visual Analytics for Heterogeneous Geoscience Data
NASA Astrophysics Data System (ADS)
Pan, Y.; Yu, L.; Zhu, F.; Rilee, M. L.; Kuo, K. S.; Jiang, H.; Yu, H.
2017-12-01
Geoscience data obtained from diverse sources have been routinely leveraged by scientists to study various phenomena. The principal data sources include observations and model simulation outputs. These data are characterized by spatiotemporal heterogeneity originating from the different instrument design specifications and/or computational model requirements used in the data generation processes. Such inherent heterogeneity poses several challenges in exploring and analyzing geoscience data. First, scientists often wish to identify features or patterns co-located among multiple data sources to derive and validate certain hypotheses. Heterogeneous data make it tedious to search for such features in dissimilar datasets. Second, features of geoscience data are typically multivariate. It is challenging to tackle the high dimensionality of geoscience data and explore the relations among multiple variables in a scalable fashion. Third, there is a lack of transparency in traditional automated approaches, such as feature detection or clustering, in that scientists cannot intuitively interact with their analysis processes and interpret results. To address these issues, we present a new scalable approach that can assist scientists in analyzing voluminous and diverse geoscience data. We expose a high-level query interface that allows users to easily express their customized queries to search for features of interest across multiple heterogeneous datasets. For identified features, we develop a visualization interface that enables interactive exploration and analytics in a linked-view manner. Specific visualization techniques, ranging from scatter plots to parallel coordinates, are employed in each view to allow users to explore various aspects of features. Different views are linked and refreshed according to user interactions in any individual view. In such a manner, a user can interactively and iteratively gain understanding of the data through a variety of visual analytics operations. 
We demonstrate with use cases how scientists can combine the query and visualization interfaces to enable a customized workflow facilitating studies using heterogeneous geoscience datasets.
Kriechbaumer, Thomas; Blackburn, Kim; Breckon, Toby P.; Hamilton, Oliver; Rivas Casado, Monica
2015-01-01
Autonomous survey vessels can increase the efficiency and availability of wide-area river environment surveying as a tool for environment protection and conservation. A key challenge is the accurate localisation of the vessel, where bank-side vegetation or urban settlement precludes the conventional use of line-of-sight global navigation satellite systems (GNSS). In this paper, we evaluate unaided visual odometry, via an on-board stereo camera rig attached to the survey vessel, as a novel, low-cost localisation strategy. Feature-based and appearance-based visual odometry algorithms are implemented on a six degrees of freedom platform operating under guided motion, but with stochastic variation in yaw, pitch and roll. Evaluation is based on a 663 m-long trajectory (>15,000 image frames) and statistical error analysis against ground truth position from a target tracking tachymeter integrating electronic distance and angular measurements. The position error of the feature-based technique (mean of ±0.067 m) is three times smaller than that of the appearance-based algorithm. From multi-variable statistical regression, we are able to attribute this error to the depth of tracked features from the camera in the scene and variations in platform yaw. Our findings inform effective strategies to enhance stereo visual localisation for the specific application of river monitoring. PMID:26694411
Kopp, Bruno; Wessel, Karl
2010-05-01
In the present study, event-related potentials (ERPs) were recorded to investigate cognitive processes related to the partial transmission of information from stimulus recognition to response preparation. Participants classified two-dimensional visual stimuli with dimensions size and form. One feature combination was designated as the go-target, whereas the other three feature combinations served as no-go distractors. Size discriminability was manipulated across three experimental conditions. N2c and P3a amplitudes were enhanced in response to those distractors that shared the feature from the faster dimension with the target. Moreover, N2c and P3a amplitudes showed a crossover effect: Size distractors evoked more pronounced ERPs under high size discriminability, but form distractors elicited enhanced ERPs under low size discriminability. These results suggest that partial perceptual-motor transmission of information is accompanied by acts of cognitive control and by shifts of attention between the sources of conflicting information. Selection negativity findings imply adaptive allocation of visual feature-based attention across the two stimulus dimensions.
Visualizing speciation in artificial cichlid fish.
Clement, Ross
2006-01-01
The Cichlid Speciation Project (CSP) is an ALife simulation system for investigating open problems in the speciation of African cichlid fish. The CSP can be used to perform a wide range of experiments that show that speciation is a natural consequence of certain biological systems. A visualization system capable of extracting the history of speciation from low-level trace data and creating a phylogenetic tree has been implemented. Unlike previous approaches, this visualization system presents a concrete trace of speciation, rather than a summary of low-level information from which the viewer can make subjective decisions on how speciation progressed. The phylogenetic trees are a more objective visualization of speciation, and enable automated collection and summarization of the results of experiments. The visualization system is used to create a phylogenetic tree from an experiment that models sympatric speciation.
The Timing of Visual Object Categorization
Mack, Michael L.; Palmeri, Thomas J.
2011-01-01
An object can be categorized at different levels of abstraction: as natural or man-made, animal or plant, bird or dog, or as a Northern Cardinal or Pyrrhuloxia. There has been growing interest in understanding how quickly categorizations at different levels are made and how the timing of those perceptual decisions changes with experience. We specifically contrast two perspectives on the timing of object categorization at different levels of abstraction. By one account, the relative timing implies a relative timing of stages of visual processing that are tied to particular levels of object categorization: Fast categorizations are fast because they precede other categorizations within the visual processing hierarchy. By another account, the relative timing reflects when perceptual features are available over time and the quality of perceptual evidence used to drive a perceptual decision process: Fast simply means fast, it does not mean first. Understanding the short-term and long-term temporal dynamics of object categorizations is key to developing computational models of visual object recognition. We briefly review a number of models of object categorization and outline how they explain the timing of visual object categorization at different levels of abstraction. PMID:21811480
EEG Topographic Mapping of Visual and Kinesthetic Imagery in Swimmers.
Wilson, V E; Dikman, Z; Bird, E I; Williams, J M; Harmison, R; Shaw-Thornton, L; Schwartz, G E
2016-03-01
This study investigated differences in QEEG measures between kinesthetic and visual imagery of a 100-m swim in 36 elite competitive swimmers. Background information and post-trial checks controlled for the modality of imagery, swimming skill level, preferred imagery style, intensity of image and task equality. Measures of EEG relative magnitude in theta, low (7-9 Hz) and high alpha (8-10 Hz), and low and high beta were taken from 19 scalp sites during baseline, visual, and kinesthetic imagery. QEEG magnitudes in the low alpha band were attenuated from baseline during the visual and kinesthetic conditions, but no changes were seen in any other bands. Swimmers produced more low alpha EEG magnitude during visual versus kinesthetic imagery. This was interpreted as the swimmers having a greater efficiency at producing visual imagery. Participants who reported a strong intensity versus a weaker feeling of the image (kinesthetic) had less low alpha magnitude, i.e., there was use of more cortical resources, but not for the visual condition. These data suggest that low band (7-9 Hz) alpha distinguishes imagery modalities from baseline, that visual imagery requires fewer cortical resources than kinesthetic imagery, and that intense feelings of swimming require more brain activity than less intense feelings.
Castagné, Raphaële; Boulangé, Claire Laurence; Karaman, Ibrahim; Campanella, Gianluca; Santos Ferreira, Diana L; Kaluarachchi, Manuja R; Lehne, Benjamin; Moayyeri, Alireza; Lewis, Matthew R; Spagou, Konstantina; Dona, Anthony C; Evangelos, Vangelis; Tracy, Russell; Greenland, Philip; Lindon, John C; Herrington, David; Ebbels, Timothy M D; Elliott, Paul; Tzoulaki, Ioanna; Chadeau-Hyam, Marc
2017-10-06
1H NMR spectroscopy of biofluids generates reproducible data allowing detection and quantification of small molecules in large population cohorts. Statistical models to analyze such data are now well-established, and the use of univariate metabolome wide association studies (MWAS) investigating the spectral features separately has emerged as a computationally efficient and interpretable alternative to multivariate models. The MWAS rely on the accurate estimation of a metabolome wide significance level (MWSL) to be applied to control the family wise error rate. Subsequent interpretation requires efficient visualization and formal feature annotation, which, in turn, call for efficient prioritization of spectral variables of interest. Using human serum 1H NMR spectroscopic profiles from 3948 participants from the Multi-Ethnic Study of Atherosclerosis (MESA), we have performed a series of MWAS for serum levels of glucose. We first propose an extension of the conventional MWSL that yields stable estimates of the MWSL across the different model parameterizations and distributional features of the outcome. We propose both efficient visualization methods and a strategy based on subsampling and internal validation to prioritize the associations. Our work proposes and illustrates practical and scalable solutions to facilitate the implementation of the MWAS approach and improve interpretation in large cohort studies.
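As a rough illustration of how a family-wise threshold for univariate association tests can be obtained, the following sketch permutes the outcome and records the largest absolute correlation across all features in each permutation. This is a generic max-statistic permutation procedure chosen for illustration, not the MWSL estimator proposed in the abstract; it expresses the threshold on the correlation scale rather than as a p-value, and the simulated data, function names, and parameter values are all assumptions.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fwer_threshold(features, outcome, n_perm=200, alpha=0.05, seed=0):
    """Permutation-based family-wise threshold on |r| (illustrative only)."""
    rng = random.Random(seed)
    y = list(outcome)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(y)  # break any feature-outcome association
        max_stats.append(max(abs(pearson(f, y)) for f in features))
    max_stats.sort()
    k = min(n_perm - 1, int((1 - alpha) * n_perm))
    return max_stats[k]  # roughly the (1 - alpha) quantile of the maxima

# Simulated data (hypothetical): 20 spectral features, 60 participants.
rng = random.Random(1)
features = [[rng.gauss(0, 1) for _ in range(60)] for _ in range(20)]
# The outcome depends on feature 0 only, plus a little noise.
outcome = [v + 0.1 * rng.gauss(0, 1) for v in features[0]]

threshold = fwer_threshold(features, outcome)
hits = [j for j, f in enumerate(features) if abs(pearson(f, outcome)) > threshold]
print(threshold, hits)
```

Taking the maximum over features within each permutation is what ties the threshold to the family-wise error rate: a feature is flagged only if its association exceeds what the strongest chance association among all features would typically reach.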
2017-01-01
1H NMR spectroscopy of biofluids generates reproducible data allowing detection and quantification of small molecules in large population cohorts. Statistical models to analyze such data are now well-established, and the use of univariate metabolome wide association studies (MWAS) investigating the spectral features separately has emerged as a computationally efficient and interpretable alternative to multivariate models. The MWAS rely on the accurate estimation of a metabolome wide significance level (MWSL) to be applied to control the family wise error rate. Subsequent interpretation requires efficient visualization and formal feature annotation, which, in turn, call for efficient prioritization of spectral variables of interest. Using human serum 1H NMR spectroscopic profiles from 3948 participants from the Multi-Ethnic Study of Atherosclerosis (MESA), we have performed a series of MWAS for serum levels of glucose. We first propose an extension of the conventional MWSL that yields stable estimates of the MWSL across the different model parameterizations and distributional features of the outcome. We propose both efficient visualization methods and a strategy based on subsampling and internal validation to prioritize the associations. Our work proposes and illustrates practical and scalable solutions to facilitate the implementation of the MWAS approach and improve interpretation in large cohort studies. PMID:28823158
Neural codes of seeing architectural styles
Choo, Heeyoung; Nasar, Jack L.; Nikrahei, Bardia; Walther, Dirk B.
2017-01-01
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people’s visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture. PMID:28071765
Neural codes of seeing architectural styles.
Choo, Heeyoung; Nasar, Jack L; Nikrahei, Bardia; Walther, Dirk B
2017-01-10
Images of iconic buildings, such as the CN Tower, instantly transport us to specific places, such as Toronto. Despite the substantial impact of architectural design on people's visual experience of built environments, we know little about its neural representation in the human brain. In the present study, we have found patterns of neural activity associated with specific architectural styles in several high-level visual brain regions, but not in primary visual cortex (V1). This finding suggests that the neural correlates of the visual perception of architectural styles stem from style-specific complex visual structure beyond the simple features computed in V1. Surprisingly, the network of brain regions representing architectural styles included the fusiform face area (FFA) in addition to several scene-selective regions. Hierarchical clustering of error patterns further revealed that the FFA participated to a much larger extent in the neural encoding of architectural styles than entry-level scene categories. We conclude that the FFA is involved in fine-grained neural encoding of scenes at a subordinate-level, in our case, architectural styles of buildings. This study for the first time shows how the human visual system encodes visual aspects of architecture, one of the predominant and longest-lasting artefacts of human culture.
EP Profiles Inventor Mark Sherron
ERIC Educational Resources Information Center
Williams, John M.
2006-01-01
This article profiles Mark Jerome Sherron, inventor of the ALLIES Line of electronic sensors for blind and visually-impaired people. Featuring the American Liquid Level Indicator electronic sensor (ALLI), Sherron's ALLIES product line also includes the Light Intensity Level Indicator (LILI), a multi-function electronic light sensor for electronic…
Automatic movie skimming with general tempo analysis
NASA Astrophysics Data System (ADS)
Lee, Shih-Hung; Yeh, Chia-Hung; Kuo, C. C. J.
2003-11-01
In this research, story units are extracted by general tempo analysis, including the tempos of audio and visual information. Although many schemes have been proposed to successfully segment video data into shots using basic low-level features, how to group shots into meaningful units called story units is still a challenging problem. By focusing on a certain type of video such as sport or news, we can explore models with the specific application domain knowledge. For movie contents, many heuristic rules based on audiovisual clues have been proposed with limited success. We propose a method to extract story units using general tempo analysis. Experimental results are given to demonstrate the feasibility and efficiency of the proposed technique.
The magnificent outburst of the 2016 Perseids, the analyses
NASA Astrophysics Data System (ADS)
Miskotte, Koen; Vandeputte, Michel
2017-03-01
Enhanced Perseid activity had been predicted for 2016 as a result of a sequence of encounters with some dust trails as well as the effect of perturbations by Jupiter, which made Earth cross the main stream deeper, through denser regions. Visual observations resulted in a detailed activity profile and population index profile; the observed features in these profiles could be matched with the predicted passages through the different dust trails. The 4 Rev (1479) dust trail in particular produced a distinct peak, while the 7 Rev (1079) dust trail remained at a somewhat disappointingly low level. The traditional annual Perseid maximum displayed enhanced activity due to the 12 Rev (441) dust trail.
van Ackeren, Markus J; Rueschemeyer, Shirley-Ann
2014-01-01
In recent years, numerous studies have provided converging evidence that word meaning is partially stored in modality-specific cortical networks. However, little is known about the mechanisms supporting the integration of this distributed semantic content into coherent conceptual representations. In the current study we aimed to address this issue by using EEG to look at the spatial and temporal dynamics of feature integration during word comprehension. Specifically, participants were presented with two modality-specific features (i.e., visual or auditory features such as silver and loud) and asked to verify whether these two features were compatible with a subsequently presented target word (e.g., WHISTLE). Each pair of features described properties from either the same modality (e.g., silver, tiny = visual features) or different modalities (e.g., silver, loud = visual, auditory). Behavioral and EEG data were collected. The results show that verifying features that are putatively represented in the same modality-specific network is faster than verifying features across modalities. At the neural level, integrating features across modalities induces sustained oscillatory activity around the theta range (4-6 Hz) in left anterior temporal lobe (ATL), a putative hub for integrating distributed semantic content. In addition, enhanced long-range network interactions in the theta range were seen between left ATL and a widespread cortical network. These results suggest that oscillatory dynamics in the theta range could be involved in integrating multimodal semantic content by creating transient functional networks linking distributed modality-specific networks and multimodal semantic hubs such as left ATL.
Jing, Xiao-Yuan; Zhu, Xiaoke; Wu, Fei; Hu, Ruimin; You, Xinge; Wang, Yunhong; Feng, Hui; Yang, Jing-Yu
2017-03-01
Person re-identification has been widely studied due to its importance in surveillance and forensics applications. In practice, gallery images are high resolution (HR), while probe images are usually low resolution (LR) in identification scenarios with large variation of illumination, weather, or quality of cameras. Person re-identification in this kind of scenario, which we call super-resolution (SR) person re-identification, has not been well studied. In this paper, we propose a semi-coupled low-rank discriminant dictionary learning (SLD²L) approach for the SR person re-identification task. With the HR and LR dictionary pair and mapping matrices learned from the features of HR and LR training images, SLD²L can convert the features of the LR probe images into HR features. To ensure that the converted features have favorable discriminative capability and the learned dictionaries can well characterize intrinsic feature spaces of the HR and LR images, we design a discriminant term and a low-rank regularization term for SLD²L. Moreover, considering that low resolution results in different degrees of loss for different types of visual appearance features, we propose a multi-view SLD²L (MVSLD²L) approach, which can learn the type-specific dictionary pair and mappings for each type of feature. Experimental results on multiple publicly available data sets demonstrate the effectiveness of our proposed approaches for the SR person re-identification task.
Mihalas, Stefan; Dong, Yi; von der Heydt, Rüdiger; Niebur, Ernst
2011-01-01
Visual attention is often understood as a modulatory field acting at early stages of processing, but the mechanisms that direct and fit the field to the attended object are not known. We show that a purely spatial attention field propagating downward in the neuronal network responsible for perceptual organization will be reshaped, repositioned, and sharpened to match the object's shape and scale. Key features of the model are grouping neurons integrating local features into coherent tentative objects, excitatory feedback to the same local feature neurons that caused grouping neuron activation, and inhibition between incompatible interpretations both at the local feature level and at the object representation level. PMID:21502489
Integrated web visualizations for protein-protein interaction databases.
Jeanquartier, Fleur; Jean-Quartier, Claire; Holzinger, Andreas
2015-06-16
Understanding living systems is crucial for curing diseases. To achieve this task we have to understand biological networks based on protein-protein interactions. Bioinformatics has come up with a great number of databases and tools that support analysts in exploring protein-protein interactions on an integrated level for knowledge discovery. They provide predictions and correlations, indicate possibilities for future experimental research and fill the gaps to complete the picture of biochemical processes. There are numerous and huge databases of protein-protein interactions used to gain insights into answering some of the many questions of systems biology. Many computational resources integrate interaction data with additional information on molecular background. However, the vast number of diverse Bioinformatics resources poses an obstacle to the goal of understanding. We present a survey of databases that enable the visual analysis of protein networks. We selected M=10 out of N=53 resources supporting visualization, and we tested against the following set of criteria: interoperability, data integration, quantity of possible interactions, data visualization quality and data coverage. The study reveals differences in usability, visualization features and quality as well as the quantity of interactions. StringDB is the recommended first choice. CPDB presents a comprehensive dataset and IntAct lets the user change the network layout. A comprehensive comparison table is available via web. The supplementary table can be accessed on http://tinyurl.com/PPI-DB-Comparison-2015. Only some web resources featuring graph visualization can be successfully applied to interactive visual analysis of protein-protein interaction. Study results underline the necessity for further enhancements of visualization integration in biochemical analysis tools. Identified challenges are data comprehensiveness, confidence, interactive features and visualization maturity.
ERIC Educational Resources Information Center
Argyropoulos, Vassilis; Papadimitriou, Vassilios
2015-01-01
Introduction: The present study assesses the performance of students who are visually impaired (that is, those who are blind or have low vision) in braille reading accuracy and examines potential correlations among the error categories on the basis of gender, age at loss of vision, and level of education. Methods: Twenty-one visually impaired…
Estimated capacity of object files in visual short-term memory is not improved by retrieval cueing.
Saiki, Jun; Miyatsuji, Hirofumi
2009-03-23
Visual short-term memory (VSTM) has been claimed to maintain three to five feature-bound object representations. Some results showing smaller capacity estimates for feature binding memory have been interpreted as the effects of interference in memory retrieval. However, change-detection tasks may not properly evaluate complex feature-bound representations such as triple conjunctions in VSTM. To understand the general type of feature-bound object representation, evaluation of triple conjunctions is critical. To test whether interference occurs in memory retrieval for complete object file representations in a VSTM task, we cued retrieval in novel paradigms that directly evaluate the memory for triple conjunctions, in comparison with a simple change-detection task. In our multiple object permanence tracking displays, observers monitored for a switch in feature combination between objects during an occlusion period, and we found that a retrieval cue provided no benefit with the triple conjunction tasks, but significant facilitation with the change-detection task, suggesting that low capacity estimates of object file memory in VSTM reflect a limit on maintenance, not retrieval.
Neurons in the human hippocampus and amygdala respond to both low- and high-level image properties
Cabrales, Elaine; Wilson, Michael S.; Baker, Christopher P.; Thorp, Christopher K.; Smith, Kris A.; Treiman, David M.
2011-01-01
A large number of studies have demonstrated that structures within the medial temporal lobe, such as the hippocampus, are intimately involved in declarative memory for objects and people. Although these items are abstractions of the visual scene, specific visual details can change the speed and accuracy of their recall. By recording from 415 neurons in the hippocampus and amygdala of human epilepsy patients as they viewed images drawn from 10 image categories, we showed that the firing rates of 8% of these neurons encode image illuminance and contrast, low-level properties not directly pertinent to task performance, whereas in 7% of the neurons, firing rates encode the category of the item depicted in the image, a high-level property pertinent to the task. This simultaneous representation of high- and low-level image properties within the same brain areas may serve to bind separate aspects of visual objects into a coherent percept and allow episodic details of objects to influence mnemonic performance. PMID:21471400
Fan, Jianping; Gao, Yuli; Luo, Hangzai
2008-03-01
In this paper, we have developed a new scheme for achieving multilevel annotations of large-scale images automatically. To achieve a more sufficient representation of various visual properties of the images, both the global visual features and the local visual features are extracted for image content representation. To tackle the problem of huge intraconcept visual diversity, multiple types of kernels are integrated to characterize the diverse visual similarity relationships between the images more precisely, and a multiple kernel learning algorithm is developed for SVM image classifier training. To address the problem of huge interconcept visual similarity, a novel multitask learning algorithm is developed to learn the correlated classifiers for the sibling image concepts under the same parent concept and enhance their discrimination and adaptation power significantly. To tackle the problem of huge intraconcept visual diversity for the image concepts at the higher levels of the concept ontology, a novel hierarchical boosting algorithm is developed to learn their ensemble classifiers hierarchically. In order to assist users in selecting more effective hypotheses for image classifier training, we have developed a novel hyperbolic framework for large-scale image visualization and interactive hypotheses assessment. Our experiments on large-scale image collections have also obtained very positive results.
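The idea of integrating multiple kernel types over global and local features can be sketched as a weighted sum of base kernels. This is a minimal illustration with fixed weights, assuming an RBF kernel on a global feature vector and a linear kernel on a local feature vector; the abstract does not specify the kernels used, and a multiple kernel learning algorithm would learn the weights rather than fix them. All names and values below are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) similarity between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def linear_kernel(x, y):
    """Dot-product similarity between two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def combined_kernel(img_a, img_b, weights=(0.6, 0.4)):
    """Non-negative weighted sum of base kernels (weights are assumed, not learned)."""
    w_global, w_local = weights
    return (w_global * rbf_kernel(img_a["global"], img_b["global"])
            + w_local * linear_kernel(img_a["local"], img_b["local"]))

# Hypothetical images: a global descriptor and a local (bag-of-parts style) descriptor.
images = [
    {"global": [0.2, 0.8], "local": [1.0, 0.0, 0.0]},
    {"global": [0.3, 0.7], "local": [0.0, 1.0, 0.0]},
    {"global": [0.9, 0.1], "local": [1.0, 0.0, 0.0]},
]

# Gram matrix that a kernel-based classifier such as an SVM could consume.
gram = [[combined_kernel(a, b) for b in images] for a in images]
for row in gram:
    print([round(v, 3) for v in row])
```

Because a non-negative weighted sum of valid kernels is itself a valid kernel, each feature type can contribute its own notion of similarity: here images 0 and 2 differ globally but share local content, so they still score as more similar than images 0 and 1.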
The 3D Human Motion Control Through Refined Video Gesture Annotation
NASA Astrophysics Data System (ADS)
Jin, Yohan; Suk, Myunghoon; Prabhakaran, B.
In the early days of the computer and video game industry, simple game controllers consisting of buttons and joysticks were employed, but recently game consoles have been replacing joystick buttons with novel interfaces such as the remote controllers with motion-sensing technology on the Nintendo Wii [1]. In particular, video-based human-computer interaction (HCI) techniques have been applied to games, a representative example being 'Eyetoy' on the Sony PlayStation 2. Video-based HCI has the great benefit of freeing players from unwieldy game controllers. Moreover, video-based HCI is crucial for communication between humans and computers because it is intuitive, easy to obtain, and inexpensive. However, extracting semantic low-level features from video human motion data is still a major challenge. The level of accuracy depends heavily on each subject's characteristics and on environmental noise. Of late, people have been using 3D motion-capture data to visualize real human motions in 3D space (e.g., 'Tiger Woods' in EA Sports, 'Angelina Jolie' in the movie Beowulf) and to analyze motions for specific performances (e.g., 'golf swing' and 'walking'). A 3D motion-capture system ('VICON') generates a matrix for each motion clip. Here, a column corresponds to a sub-body part of the human and a row represents a time frame of data capture. Thus, we can extract a sub-body part's motion simply by selecting specific columns. Unlike the low-level feature values of video human motion, the entries of a 3D human motion-capture data matrix are not pixel values but are closer to the human level of semantics.
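The matrix layout described above, with one row per time frame and one column per sub-body part, can be sketched as follows. The column names, the single value per body part, and the sample numbers are assumptions made for illustration; a real motion-capture export typically stores several channels per joint.

```python
# Hypothetical column layout: one channel per sub-body part.
COLUMNS = ["hip", "left_arm", "right_arm", "left_leg", "right_leg"]

def select_parts(clip, columns, wanted):
    """Return the sub-matrix holding only the columns named in `wanted`."""
    idx = [columns.index(name) for name in wanted]
    return [[row[i] for i in idx] for row in clip]

# A tiny motion clip: three time frames (rows) by five body parts (columns).
clip = [
    [0.0, 1.0, 2.0, 3.0, 4.0],
    [0.1, 1.1, 2.1, 3.1, 4.1],
    [0.2, 1.2, 2.2, 3.2, 4.2],
]

# Extract only the arm motion; the time axis (rows) is left untouched.
arms = select_parts(clip, COLUMNS, ["left_arm", "right_arm"])
print(arms)
```

Selecting columns leaves the number of frames unchanged, so the extracted sub-matrix can be analyzed as a motion clip of its own.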
Buschow, Christian; Charo, Jehad; Anders, Kathleen; Loddenkemper, Christoph; Jukica, Ana; Alsamah, Wisam; Perez, Cynthia; Willimsky, Gerald; Blankenstein, Thomas
2010-03-15
Visualizing oncogene/tumor Ag expression by noninvasive imaging is of great interest for understanding processes of tumor development and therapy. We established transgenic (Tg) mice conditionally expressing a fusion protein of the SV40 large T Ag and luciferase (TagLuc) that allows monitoring of oncogene/tumor Ag expression by bioluminescent imaging upon Cre recombinase-mediated activation. Independent of Cre-mediated recombination, the TagLuc gene was expressed at low levels in different tissues, probably due to the leakiness of the stop cassette. The level of spontaneous TagLuc expression, detected by bioluminescent imaging, varied between the different Tg lines, depended on the nature of the Tg expression cassette, and correlated with Tag-specific CTL tolerance. Following liver-specific Cre-loxP site-mediated excision of the stop cassette that separated the promoter from the TagLuc fusion gene, hepatocellular carcinoma development was visualized. The ubiquitous low level TagLuc expression caused the failure of transferred effector T cells to reject Tag-expressing tumors rather than causing graft-versus-host disease. This model may be useful to study different levels of tolerance, monitor tumor development at an early stage, and rapidly visualize the efficacy of therapeutic intervention versus potential side effects of low-level Ag expression in normal tissues.
Neural Tuning to Low-Level Features of Speech throughout the Perisylvian Cortex.
Berezutskaya, Julia; Freudenburg, Zachary V; Güçlü, Umut; van Gerven, Marcel A J; Ramsey, Nick F
2017-08-16
Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus toward anterior superior temporal gyrus in the human brain (Hullett et al., 2016). In this study, we investigate what happens to these neural representations past the superior temporal gyrus and how they engage higher-level language processing areas such as inferior frontal gyrus. We used low-level sound features to model neural responses to speech outside of the primary auditory cortex. Two complementary imaging techniques were used with human participants (both males and females): electrocorticography (ECoG) and fMRI. Both imaging techniques showed tuning of the perisylvian cortex to low-level speech features. With ECoG, we found evidence of propagation of the temporal features of speech sounds along the ventral pathway of language processing in the brain toward inferior frontal gyrus. Increasingly coarse temporal features of speech spreading from posterior superior temporal cortex toward inferior frontal gyrus were associated with linguistic features such as voice onset time, duration of the formant transitions, and phoneme, syllable, and word boundaries. The present findings provide the groundwork for a comprehensive bottom-up account of speech comprehension in the human brain. SIGNIFICANCE STATEMENT We know that, during natural speech comprehension, a broad network of perisylvian cortical regions is involved in sound and language processing. Here, we investigated the tuning to low-level sound features within these regions using neural responses to a short feature film. We also looked at whether the tuning organization along these brain regions showed any parallel to the hierarchy of language structures in continuous speech. 
Our results show that low-level speech features propagate throughout the perisylvian cortex and potentially contribute to the emergence of "coarse" speech representations in inferior frontal gyrus typically associated with high-level language processing. These findings add to the previous work on auditory processing and underline a distinctive role of inferior frontal gyrus in natural speech comprehension.
Venkatesh, Santosh S; Levenback, Benjamin J; Sultan, Laith R; Bouzghar, Ghizlane; Sehgal, Chandra M
2015-12-01
The goal of this study was to devise a machine learning methodology as a viable low-cost alternative to a second reader to help augment physicians' interpretations of breast ultrasound images in differentiating benign and malignant masses. Two independent feature sets consisting of visual features based on a radiologist's interpretation of images and computer-extracted features when used as first and second readers and combined by adaptive boosting (AdaBoost) and a pruning classifier resulted in a very high level of diagnostic performance (area under the receiver operating characteristic curve = 0.98) at a cost of pruning a fraction (20%) of the cases for further evaluation by independent methods. AdaBoost also improved the diagnostic performance of the individual human observers and increased the agreement between their analyses. Pairing AdaBoost with selective pruning is a principled methodology for achieving high diagnostic performance without the added cost of an additional reader for differentiating solid breast masses by ultrasound.
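The selective-pruning step described above can be sketched in Python as follows. This is not the authors' classifier: it assumes that some upstream ensemble has already produced a combined score in [0, 1] for each case, and it uses distance from a 0.5 threshold as a stand-in uncertainty measure. The function name, the threshold, and the toy scores are hypothetical; only the 20% pruning fraction comes from the abstract.

```python
def prune_and_classify(scores, prune_fraction=0.2, threshold=0.5):
    """Split cases into (decided, deferred).

    scores: combined malignancy scores in [0, 1] (assumed input).
    The `prune_fraction` of cases whose scores lie closest to `threshold`
    are deferred for further evaluation; the rest are labeled by thresholding.
    Returns a dict {case_index: is_malignant} and a sorted list of deferred indices.
    """
    n_prune = int(round(prune_fraction * len(scores)))
    # Rank case indices from most to least uncertain (closest to threshold first).
    by_uncertainty = sorted(range(len(scores)),
                            key=lambda i: abs(scores[i] - threshold))
    deferred = sorted(by_uncertainty[:n_prune])
    deferred_set = set(deferred)
    decided = {i: scores[i] >= threshold
               for i in range(len(scores)) if i not in deferred_set}
    return decided, deferred


# Toy scores for ten cases (made-up values).
scores = [0.05, 0.48, 0.91, 0.30, 0.55, 0.97, 0.12, 0.80, 0.65, 0.02]
decided, deferred = prune_and_classify(scores)
```

With these toy scores the two cases nearest the threshold are deferred and the remaining eight receive a label, mirroring the idea of trading a fraction of cases for higher confidence on the rest.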
Model-based analysis of pattern motion processing in mouse primary visual cortex
Muir, Dylan R.; Roth, Morgane M.; Helmchen, Fritjof; Kampa, Björn M.
2015-01-01
Neurons in sensory areas of neocortex exhibit responses tuned to specific features of the environment. In visual cortex, information about features such as edges or textures with particular orientations must be integrated to recognize a visual scene or object. Connectivity studies in rodent cortex have revealed that neurons make specific connections within sub-networks sharing common input tuning. In principle, this sub-network architecture enables local cortical circuits to integrate sensory information. However, whether feature integration indeed occurs locally in rodent primary sensory areas has not been examined directly. We studied local integration of sensory features in primary visual cortex (V1) of the mouse by presenting drifting grating and plaid stimuli, while recording the activity of neuronal populations with two-photon calcium imaging. Using a Bayesian model-based analysis framework, we classified single-cell responses as being selective for either individual grating components or for moving plaid patterns. Rather than relying on trial-averaged responses, our model-based framework takes into account single-trial responses and can easily be extended to consider any number of arbitrary predictive models. Our analysis method was able to successfully classify significantly more responses than traditional partial correlation (PC) analysis, and provides a rigorous statistical framework to rank any number of models and reject poorly performing models. We also found a large proportion of cells that respond strongly to only one stimulus class. In addition, a quarter of selectively responding neurons had more complex responses that could not be explained by any simple integration model. Our results show that a broad range of pattern integration processes already take place at the level of V1. This diversity of integration is consistent with processing of visual inputs by local sub-networks within V1 that are tuned to combinations of sensory features. 
PMID:26300738
Neural Summation in the Hawkmoth Visual System Extends the Limits of Vision in Dim Light.
Stöckl, Anna Lisa; O'Carroll, David Charles; Warrant, Eric James
2016-03-21
Most of the world's animals are active in dim light and depend on good vision for the tasks of daily life. Many have evolved visual adaptations that permit a performance superior to that of manmade imaging devices [1]. In insects, a major model visual system, nocturnal species show impressive visual abilities ranging from flight control [2, 3], to color discrimination [4, 5], to navigation using visual landmarks [6-8] or dim celestial compass cues [9, 10]. In addition to optical adaptations that improve their sensitivity in dim light [11], neural summation of light in space and time (which enhances the coarser and slower features of the scene at the expense of noisier finer and faster features) has been suggested to improve sensitivity in theoretical [12-14], anatomical [15-17], and behavioral [18-20] studies. How these summation strategies function neurally is, however, presently unknown. Here, we quantified spatial and temporal summation in the motion vision pathway of a nocturnal hawkmoth. We show that spatial and temporal summation combine supralinearly to substantially increase contrast sensitivity and visual information rate over four decades of light intensity, enabling hawkmoths to see at light levels 100 times dimmer than without summation. Our results reveal how visual motion is calculated neurally in dim light and how spatial and temporal summation improve sensitivity while simultaneously maximizing spatial and temporal resolution, thus extending models of insect motion vision derived predominantly from diurnal flies. Moreover, the summation strategies we have revealed may benefit manmade vision systems optimized for variable light levels [21].
News video story segmentation method using fusion of audio-visual features
NASA Astrophysics Data System (ADS)
Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang
2007-11-01
News story segmentation is an important aspect of news video analysis. This paper presents a method for news video story segmentation. Unlike prior works, which are based on visual feature transforms, the proposed technique uses audio features as the baseline and fuses visual features with them to refine the results. First, it selects silence clips as audio feature candidate points, and selects shot boundaries and anchor shots as two kinds of visual feature candidate points. It then takes the audio feature candidates as cues and develops a different fusion method, which effectively uses the diverse types of visual candidates to refine the audio candidates, to obtain story boundaries. Experimental results show that this method has high efficiency and adapts to different kinds of news video.
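A rough Python sketch of the audio-visual fusion idea follows. The abstract does not specify the fusion rule, so the rule used here (keep an audio silence point only if some visual candidate lies within a time tolerance) is an assumption, as are the function name, the tolerance, and the toy timestamps.

```python
def fuse_candidates(audio_candidates, visual_candidates, tolerance=1.0):
    """Keep audio candidate times (seconds) confirmed by a nearby visual candidate.

    An audio candidate is retained when at least one visual candidate
    (shot boundary or anchor shot) lies within `tolerance` seconds of it.
    This confirmation rule is an illustrative assumption.
    """
    return [t for t in audio_candidates
            if any(abs(t - v) <= tolerance for v in visual_candidates)]


# Toy candidate points in seconds (made-up values).
silence_points = [12.0, 47.5, 63.2, 118.0]   # audio candidates
shot_boundaries = [11.6, 30.0, 63.0, 90.4]   # visual candidates, type 1
anchor_shots = [118.5]                       # visual candidates, type 2

boundaries = fuse_candidates(silence_points, shot_boundaries + anchor_shots)
```

Here the silence point at 47.5 s is discarded because no shot boundary or anchor shot falls close to it, while the other three survive as story-boundary candidates.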
Lahav, Orit; Apter, Alan; Ratzon, Navah Z
2013-01-01
This study evaluates how much the effects of intervention programs are influenced by pre-existing psychological adjustment and self-esteem levels in kindergarten and first grade children with poor visual-motor integration skills, from low socioeconomic backgrounds. One hundred and sixteen mainstream kindergarten and first-grade children, from low socioeconomic backgrounds, scoring below the 25th percentile on a measure of visual-motor integration (VMI) were recruited and randomly divided into two parallel intervention groups. One intervention group received directive visual-motor intervention (DVMI), while the second intervention group received a non-directive supportive intervention (NDSI). Tests were administered to evaluate visual-motor integration skills outcome. Children with higher baseline measures of psychological adjustment and self-esteem responded better in NDSI while children with lower baseline performance on psychological adjustment and self-esteem responded better in DVMI. This study suggests that children from low socioeconomic backgrounds with low VMI performance scores will benefit more from intervention programs if clinicians choose the type of intervention according to baseline psychological adjustment and self-esteem measures.
Liu, B; Meng, X; Wu, G; Huang, Y
2012-05-17
In this article, we aimed to study whether feature precedence exists in the cognitive processing of multifeature visual information in the human brain. In our experiment, we focused on two important visual features: color and shape. To avoid semantic constraints between them and the resulting impact, pure color and simple geometric shape were chosen as the color feature and shape feature of the visual stimulus, respectively. We adopted an "old/new" paradigm to study the cognitive processing of the color feature, the shape feature, and the combination of the two. The experiment consisted of three tasks: the Color task, the Shape task, and the Color-Shape task. The results showed that the feature-based pattern would be activated in the human brain in processing multifeature visual information without semantic association between features. Furthermore, the shape feature was processed earlier than the color feature, and the cognitive processing of the color feature was more difficult than that of the shape feature.
ERIC Educational Resources Information Center
Murr, Christopher D.; Blanchard, R. Denise
2011-01-01
Advances in classroom technology have lowered barriers for the visually impaired to study geography, yet few participate. Employing stereotype threat theory, we examined whether beliefs held by the visually impaired affect perceptions toward completing courses and majors in visually oriented disciplines. A test group received a low-level threat…
Gintautas, Vadas; Ham, Michael I.; Kunsberg, Benjamin; Barr, Shawn; Brumby, Steven P.; Rasmussen, Craig; George, John S.; Nemenman, Ilya; Bettencourt, Luís M. A.; Kenyon, Garret T.
2011-01-01
Can lateral connectivity in the primary visual cortex account for the time dependence and intrinsic task difficulty of human contour detection? To answer this question, we created a synthetic image set that prevents sole reliance on either low-level visual features or high-level context for the detection of target objects. Rendered images consist of smoothly varying, globally aligned contour fragments (amoebas) distributed among groups of randomly rotated fragments (clutter). The time course and accuracy of amoeba detection by humans was measured using a two-alternative forced choice protocol with self-reported confidence and variable image presentation time (20-200 ms), followed by an image mask optimized so as to interrupt visual processing. Measured psychometric functions were well fit by sigmoidal functions with exponential time constants of 30-91 ms, depending on amoeba complexity. Key aspects of the psychophysical experiments were accounted for by a computational network model, in which simulated responses across retinotopic arrays of orientation-selective elements were modulated by cortical association fields, represented as multiplicative kernels computed from the differences in pairwise edge statistics between target and distractor images. Comparing the experimental and the computational results suggests that each iteration of the lateral interactions takes at least ms of cortical processing time. Our results provide evidence that cortical association fields between orientation selective elements in early visual areas can account for important temporal and task-dependent aspects of the psychometric curves characterizing human contour perception, with the remaining discrepancies postulated to arise from the influence of higher cortical areas. PMID:21998562
Perceptual uncertainty facilitates creative discovery
NASA Astrophysics Data System (ADS)
Tseng, Winger Sei-Wo
2018-06-01
In this study, unstructured and ambiguous figures used as visual stimuli were classified as having high, moderate, and low ambiguity and presented to participants. The experiment was designed to explore how the perceptual ambiguity inherent in the presented visual cues can affect novice and expert designers' visual discovery during design development. A total of 42 participants took part: half were recruited from non-design departments as novices, and the remainder were chosen from design companies and regarded as experts. The participants were tasked with discovering a sub-shape from the presented sketch and using this shape as a cue to design a concept. To this end, two types of sub-shapes were defined: known feature sub-shapes and innovative feature sub-shapes (IFSs). The experimental results provide strong evidence that with an increase in the ambiguity of the visual stimuli, expert designers produce more ideas and IFSs, whereas novice designers produce fewer. The capability of expert designers to exploit visual ambiguity is interesting, and its absence in novice designers suggests that this capability is likely a unique skill gained, at least in part, through professional practice. Our results can be applied in design learning and education to generalize the principles and strategies of visual discovery by expert designers during concept sketching in order to train novice designers in addressing design problems.
Lighten the Load: Scaffolding Visual Literacy in Biochemistry and Molecular Biology
Offerdahl, Erika G.; Arneson, Jessie B.; Byrne, Nicholas
2017-01-01
The development of scientific visual literacy has been identified as critical to the training of tomorrow’s scientists and citizens alike. Within the context of the molecular life sciences in particular, visual representations frequently incorporate various components, such as discipline-specific graphical and diagrammatic features, varied levels of abstraction, and spatial arrangements of visual elements to convey information. Visual literacy is achieved when an individual understands the various ways in which a discipline uses these components to represent a particular way of knowing. Owing to the complex nature of visual representations, the activities through which visual literacy is developed have high cognitive load. Cognitive load can be reduced by first helping students to become fluent with the discrete components of visual representations before asking them to simultaneously integrate these components to extract the intended meaning of a representation. We present a taxonomy for characterizing one component of visual representations—the level of abstraction—as a first step in understanding the opportunities afforded students to develop fluency. Further, we demonstrate how our taxonomy can be used to analyze course assessments and spur discussions regarding the extent to which the development of visual literacy skills is supported by instruction within an undergraduate biochemistry curriculum. PMID:28130273
Do rats use shape to solve “shape discriminations”?
Minini, Loredana; Jeffery, Kathryn J.
2006-01-01
Visual discrimination tasks are increasingly used to explore the neurobiology of vision in rodents, but it remains unclear how the animals solve these tasks: Do they process shapes holistically, or by using low-level features such as luminance and angle acuity? In the present study we found that when discriminating triangles from squares, rats did not use shape but instead relied on local luminance differences in the lower hemifield. A second experiment prevented this strategy by using stimuli—squares and rectangles—that varied in size and location, and for which the only constant predictor of reward was aspect ratio (ratio of height to width: a simple descriptor of “shape”). Rats eventually learned to use aspect ratio but only when no other discriminand was available, and performance remained very poor even at asymptote. These results suggest that although rats can process both dimensions simultaneously, they do not naturally solve shape discrimination tasks this way. This may reflect either a failure to visually process global shape information or a failure to discover shape as the discriminative stimulus in a simultaneous discrimination. Either way, our results suggest that simultaneous shape discrimination is not a good task for studies of visual perception in rodents. PMID:16705141
Enhanced visual statistical learning in adults with autism
Roser, Matthew E.; Aslin, Richard N.; McKenzie, Rebecca; Zahra, Daniel; Fiser, József
2014-01-01
Individuals with autism spectrum disorder (ASD) are often characterized as having social engagement and language deficiencies, but a sparing of visuo-spatial processing and short-term memory, with some evidence of supra-normal levels of performance in these domains. The present study expanded on this evidence by investigating the observational learning of visuospatial concepts from patterns of covariation across multiple exemplars. Child and adult participants with ASD, and age-matched control participants, viewed multi-shape arrays composed from a random combination of pairs of shapes that were each positioned in a fixed spatial arrangement. After this passive exposure phase, a post-test revealed that all participant groups could discriminate pairs of shapes with high covariation from randomly paired shapes with low covariation. Moreover, learning these shape-pairs with high covariation was superior in adults with ASD than in age-matched controls, while performance in children with ASD was no different than controls. These results extend previous observations of visuospatial enhancement in ASD into the domain of learning, and suggest that enhanced visual statistical learning may have arisen from a sustained bias to attend to local details in complex arrays of visual features. PMID:25151115
Lamti, Hachem A; Gorce, Philippe; Ben Khelifa, Mohamed Moncef; Alimi, Adel M
2016-12-01
The goal of this study is to investigate the influence of mental fatigue on the event-related potential P300 features (maximum peak, minimum amplitude, latency and period) during virtual wheelchair navigation. For this purpose, an experimental environment was set up based on customizable environmental parameters (luminosity, number of obstacles and obstacle velocities). A correlation study between P300 and fatigue ratings was conducted. Finally, the best correlated features were supplied to three classification algorithms: MLP (Multi Layer Perceptron), Linear Discriminant Analysis and Support Vector Machine. The results showed that the maximum feature over visual and temporal regions as well as the period feature over frontal, fronto-central and visual regions were correlated with mental fatigue levels. On the other hand, the minimum amplitude and latency features did not show any correlation. Among the classification techniques, MLP showed the best performance, although the differences between classification techniques are minimal. These findings can help in designing suitable mental-fatigue-based wheelchair control.
Neocortical Rebound Depolarization Enhances Visual Perception
Funayama, Kenta; Ban, Hiroshi; Chan, Allen W.; Matsuki, Norio; Murphy, Timothy H.; Ikegaya, Yuji
2015-01-01
Animals are constantly exposed to the time-varying visual world. Because visual perception is modulated by immediately prior visual experience, visual cortical neurons may register recent visual history into a specific form of offline activity and link it to later visual input. To examine how preceding visual inputs interact with upcoming information at the single neuron level, we designed a simple stimulation protocol in which a brief, orientated flashing stimulus was subsequently coupled to visual stimuli with identical or different features. Using in vivo whole-cell patch-clamp recording and functional two-photon calcium imaging from the primary visual cortex (V1) of awake mice, we discovered that a flash of sinusoidal grating per se induces an early, transient activation as well as a long-delayed reactivation in V1 neurons. This late response, which started hundreds of milliseconds after the flash and persisted for approximately 2 s, was also observed in human V1 electroencephalogram. When another drifting grating stimulus arrived during the late response, the V1 neurons exhibited a sublinear, but apparently increased response, especially to the same grating orientation. In behavioral tests of mice and humans, the flashing stimulation enhanced the detection power of the identically orientated visual stimulation only when the second stimulation was presented during the time window of the late response. Therefore, V1 late responses likely provide a neural basis for admixing temporally separated stimuli and extracting identical features in time-varying visual environments. PMID:26274866
Feature-based attentional modulations in the absence of direct visual stimulation.
Serences, John T; Boynton, Geoffrey M
2007-07-19
When faced with a crowded visual scene, observers must selectively attend to behaviorally relevant objects to avoid sensory overload. Often this selection process is guided by prior knowledge of a target-defining feature (e.g., the color red when looking for an apple), which enhances the firing rate of visual neurons that are selective for the attended feature. Here, we used functional magnetic resonance imaging and a pattern classification algorithm to predict the attentional state of human observers as they monitored a visual feature (one of two directions of motion). We find that feature-specific attention effects spread across the visual field, even to regions of the scene that do not contain a stimulus. This spread of feature-based attention to empty regions of space may facilitate the perception of behaviorally relevant stimuli by increasing sensitivity to attended features at all locations in the visual field.
Waese, Jamie; Fan, Jim; Pasha, Asher; Yu, Hans; Fucile, Geoffrey; Shi, Ruian; Cumming, Matthew; Kelley, Lawrence A; Sternberg, Michael J; Krishnakumar, Vivek; Ferlanti, Erik; Miller, Jason; Town, Chris; Stuerzlinger, Wolfgang; Provart, Nicholas J
2017-08-01
A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust research platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring multiple levels of Arabidopsis thaliana data through a zoomable user interface. ePlant connects to several publicly available web services to download genome, proteome, interactome, transcriptome, and 3D molecular structure data for one or more genes or gene products of interest. Data are displayed with a set of visualization tools that are presented using a conceptual hierarchy from big to small, and many of the tools combine information from more than one data type. We describe the development of ePlant in this article and present several examples illustrating its integrative features for hypothesis generation. We also describe the process of deploying ePlant as an "app" on Araport. Building on readily available web services, the code for ePlant is freely available for any other biological species research.
ERIC Educational Resources Information Center
Firat, Mehmet; Kabakci, Isil
2010-01-01
The interactional feature of hypermedia that allows high-level student-control is considered as one of the most important advantages that hypermedia provides for learning and teaching. However, high-level student control in hypermedia might not always lead to high-level learning performance. The learner is likely to experience navigation problems…
No Effect of Featural Attention on Body Size Aftereffects
Stephen, Ian D.; Bickersteth, Chloe; Mond, Jonathan; Stevenson, Richard J.; Brooks, Kevin R.
2016-01-01
Prolonged exposure to images of narrow bodies has been shown to induce a perceptual aftereffect, such that observers’ point of subjective normality (PSN) for bodies shifts toward narrower bodies. The converse effect is shown for adaptation to wide bodies. In low-level stimuli, object attention (attention directed to the object) and spatial attention (attention directed to the location of the object) have been shown to increase the magnitude of visual aftereffects, while object-based attention enhances the adaptation effect in faces. It is not known whether featural attention (attention directed to a specific aspect of the object) affects the magnitude of adaptation effects in body stimuli. Here, we manipulate the attention of Caucasian observers to different featural information in body images, by asking them to rate the fatness or sex typicality of male and female bodies manipulated to appear fatter or thinner than average. PSNs for body fatness were taken at baseline and after adaptation, and a change in PSN (ΔPSN) was calculated. A body size adaptation effect was found, with observers who viewed fat bodies showing an increased PSN, and those exposed to thin bodies showing a reduced PSN. However, manipulations of featural attention to body fatness or sex typicality produced equivalent results, suggesting that featural attention may not affect the strength of the body size aftereffect. PMID:27597835
Groen, Iris I. A.; Ghebreab, Sennay; Lamme, Victor A. F.; Scholte, H. Steven
2012-01-01
The visual world is complex and continuously changing. Yet, our brain transforms patterns of light falling on our retina into a coherent percept within a few hundred milliseconds. Possibly, low-level neural responses already carry substantial information to facilitate rapid characterization of the visual input. Here, we computationally estimated low-level contrast responses to computer-generated naturalistic images, and tested whether spatial pooling of these responses could predict image similarity at the neural and behavioral level. Using EEG, we show that statistics derived from pooled responses explain a large amount of variance between single-image evoked potentials (ERPs) in individual subjects. Dissimilarity analysis on multi-electrode ERPs demonstrated that large differences between images in pooled response statistics are predictive of more dissimilar patterns of evoked activity, whereas images with little difference in statistics give rise to highly similar evoked activity patterns. In a separate behavioral experiment, images with large differences in statistics were judged as different categories, whereas images with little differences were confused. These findings suggest that statistics derived from low-level contrast responses can be extracted in early visual processing and can be relevant for rapid judgment of visual similarity. We compared our results with two other, well- known contrast statistics: Fourier power spectra and higher-order properties of contrast distributions (skewness and kurtosis). Interestingly, whereas these statistics allow for accurate image categorization, they do not predict ERP response patterns or behavioral categorization confusions. These converging computational, neural and behavioral results suggest that statistics of pooled contrast responses contain information that corresponds with perceived visual similarity in a rapid, low-level categorization task. PMID:23093921
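The two higher-order contrast statistics named in this abstract, skewness and kurtosis, can be computed as in the short Python sketch below. It uses the standard population (biased) moment definitions, which may differ from the estimator the authors used; the function name and the toy contrast values are hypothetical.

```python
def skewness_kurtosis(values):
    """Return (skewness, kurtosis) of a sample using population moments.

    skewness = m3 / m2**1.5 and kurtosis = m4 / m2**2, where mk is the
    k-th central moment. Kurtosis is not excess kurtosis (a normal
    distribution gives 3). Raises ValueError for constant input.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness and kurtosis are undefined for constant input")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Toy local-contrast values for one image (made-up, symmetric about the mean).
contrast = [1.0, 2.0, 3.0, 4.0, 5.0]
skew, kurt = skewness_kurtosis(contrast)
```

For this symmetric toy sample the skewness is zero, and the kurtosis is well below 3, reflecting a flat distribution with light tails.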
Özen Tunay, Zuhal; Çalışkan, Deniz; İdil, Aysun; Öztuna, Derya
2016-01-01
Objectives: To determine the clinical features and the distribution of diagnosis in partially sighted school-age children, to report the chosen low vision rehabilitation methods and to emphasize the importance of low vision rehabilitation. Materials and Methods: The study included 150 partially sighted children between the ages of 6 and 18 years. The distribution of diagnosis, accompanying ocular findings, visual acuity of the children both for near and distance with and without low vision devices, and the methods of low vision rehabilitation (for distance and for near) were determined. The demographic characteristics of the children and the parental consanguinity were recorded. Results: The mean age of children was 10.6 years and the median age was 10 years; 88 (58.7%) of them were male and 62 (41.3%) of them were female. According to the distribution of diagnoses among the children, the most frequent diagnosis was hereditary fundus dystrophies (36%) followed by cortical visual impairment (18%). The most frequently used rehabilitation methods were: telescopic lenses (91.3%) for distance vision; magnifiers (38.7%) and telemicroscopic systems (26.0%) for near vision. A significant improvement in visual acuity both for distance and near vision was determined with low vision aids. Conclusion: A significant improvement in visual acuity can be achieved both for distance and near vision with low vision rehabilitation in partially sighted school-age children. It is important for ophthalmologists and pediatricians to guide parents and children to low vision rehabilitation. PMID:27800263
Özen Tunay, Zuhal; Çalışkan, Deniz; İdil, Aysun; Öztuna, Derya
2016-04-01
To determine the clinical features and the distribution of diagnosis in partially sighted school-age children, to report the chosen low vision rehabilitation methods and to emphasize the importance of low vision rehabilitation. The study included 150 partially sighted children between the ages of 6 and 18 years. The distribution of diagnosis, accompanying ocular findings, visual acuity of the children both for near and distance with and without low vision devices, and the methods of low vision rehabilitation (for distance and for near) were determined. The demographic characteristics of the children and the parental consanguinity were recorded. The mean age of children was 10.6 years and the median age was 10 years; 88 (58.7%) of them were male and 62 (41.3%) of them were female. According to the distribution of diagnoses among the children, the most frequent diagnosis was hereditary fundus dystrophies (36%) followed by cortical visual impairment (18%). The most frequently used rehabilitation methods were: telescopic lenses (91.3%) for distance vision; magnifiers (38.7%) and telemicroscopic systems (26.0%) for near vision. A significant improvement in visual acuity both for distance and near vision was determined with low vision aids. A significant improvement in visual acuity can be achieved both for distance and near vision with low vision rehabilitation in partially sighted school-age children. It is important for ophthalmologists and pediatricians to guide parents and children to low vision rehabilitation.
Emotion-induced trade-offs in spatiotemporal vision.
Bocanegra, Bruno R; Zeelenberg, René
2011-05-01
It is generally assumed that emotion facilitates human vision in order to promote adaptive responses to a potential threat in the environment. Surprisingly, we recently found that emotion in some cases impairs the perception of elementary visual features (Bocanegra & Zeelenberg, 2009b). Here, we demonstrate that emotion improves fast temporal vision at the expense of fine-grained spatial vision. We tested participants' threshold resolution with Landolt circles containing a small spatial or brief temporal discontinuity. The prior presentation of a fearful face cue, compared with a neutral face cue, impaired spatial resolution but improved temporal resolution. In addition, we show that these benefits and deficits were triggered selectively by the global configural properties of the faces, which were transmitted only through low spatial frequencies. Critically, the common locus of these opposite effects suggests a trade-off between magno- and parvocellular-type visual channels, which contradicts the common assumption that emotion invariably improves vision. We show that, rather than being a general "boost" for all visual features, affective neural circuits sacrifice the slower processing of small details for a coarser but faster visual signal.
Impaired search for orientation but not color in hemi-spatial neglect.
Wilkinson, David; Ko, Philip; Milberg, William; McGlinchey, Regina
2008-01-01
Patients with hemi-spatial neglect have trouble finding targets defined by a conjunction of visual features. The problem is widely believed to stem from a high-level deficit in attentional deployment, which in turn has led to disagreement over whether the detection of basic features is also disrupted. If one assumes that the detection of salient visual features can be based on the output of spared 'preattentive' processes (Treisman and Gelade, 1980), then feature detection should remain intact. However, if one assumes that all forms of detection require at least a modicum of focused attention (Duncan and Humphreys, 1992), then all forms of search will be disrupted to some degree. Here we measured the detection of feature targets that were defined by either a unique color or orientation. Comparable detection rates were observed in non-neglected space, which indicated that both forms of search placed similar demands on attention. For either of the above accounts to be true, the two targets should therefore be detected with equal efficiency in the neglected field. We found that while the detection rate for color was normal in four of our five patients, all showed an increased reaction time and/or error rate for orientation. This result points to a selective deficit in orientation discrimination, and implies that neglect disrupts specific feature representations. That is, the effects of neglect on visual search are not only attentional but also perceptual.
Visualizing complex hydrodynamic features
NASA Astrophysics Data System (ADS)
Kempf, Jill L.; Marshall, Robert E.; Yen, Chieh-Cheng
1990-08-01
The Lake Erie Forecasting System is a cooperative project by university, private and governmental institutions to provide continuous forecasting of three-dimensional structure within the lake. The forecasts will include water velocity and temperature distributions throughout the body of water, as well as water level and wind-wave distributions at the lake's surface. Many hydrodynamic features can be extracted from this data, including coastal jets, large-scale thermocline motion and zones of upwelling and downwelling. A visualization system is being developed that will aid in understanding these features and their interactions. Because of the wide variety of features, they cannot all be adequately represented by a single rendering technique. Particle tracing, surface rendering, and volumetric techniques are all necessary. This visualization effort is aimed towards creating a system that will provide meaningful forecasts for those using the lake for recreational and commercial purposes. For example, the fishing industry needs to know about large-scale thermocline motion in order to find the best fishing areas, and power plants need to know water intake temperatures. The visualization system must convey this information in a manner that is easily understood by these users. Scientists must also be able to use this system to verify their hydrodynamic simulation. The focus of the system, therefore, is to provide the information to serve these diverse interests, without overwhelming any single user with unnecessary data.
G-Induced Visual Symptoms in a Military Helicopter Pilot.
McMahon, Terry W; Newman, David G
2016-11-01
Military helicopters are increasingly agile and capable of producing significant G forces experienced in the longitudinal (z) axis of the body in a head-to-foot direction (+Gz). Dehydration and fatigue can adversely affect a pilot's +Gz tolerance, leading to +Gz-induced symptomatology occurring at lower +Gz levels than expected. The potential for adverse consequences of +Gz exposure to affect flight safety in military helicopter operations needs to be recognized. This case report describes a helicopter pilot who experienced +Gz-induced visual impairment during low-level flight. The incident occurred during a tropical training exercise, with an ambient temperature of around 35°C (95°F). As a result of the operational tempo and the environmental conditions, aircrew were generally fatigued and dehydrated. During a low-level steep turn, a Blackhawk pilot experienced significant visual deterioration. The +Gz level was estimated at +2.5 Gz. After completing the turn, the pilot's vision returned to normal, and the flight concluded without further incident. This case highlights the potential dangers of +Gz exposure in tactical helicopters. Although the +Gz level was moderate, the pilot's +Gz tolerance was reduced by the combined effects of dehydration and fatigue. The dangers of such +Gz-induced visual impairment during low-level flight are clear. More awareness of +Gz physiology and +Gz tolerance-reducing factors in helicopter operations is needed. Reprint & Copyright © 2016 Association of Military Surgeons of the U.S.
DiCarlo, James J.; Zecchina, Riccardo; Zoccolan, Davide
2013-01-01
The anterior inferotemporal cortex (IT) is the highest stage along the hierarchy of visual areas that, in primates, processes visual objects. Although several lines of evidence suggest that IT primarily represents visual shape information, some recent studies have argued that neuronal ensembles in IT code the semantic membership of visual objects (i.e., represent conceptual classes such as animate and inanimate objects). In this study, we investigated to what extent semantic, rather than purely visual information, is represented in IT by performing a multivariate analysis of IT responses to a set of visual objects. By relying on a variety of machine-learning approaches (including a cutting-edge clustering algorithm that has been recently developed in the domain of statistical physics), we found that, in most instances, IT representation of visual objects is accounted for by their similarity at the level of shape or, more surprisingly, low-level visual properties. Only in a few cases did we observe IT representations of semantic classes that were not explainable by the visual similarity of their members. Overall, these findings reassert the primary function of IT as a conveyor of explicit visual shape information, and reveal that low-level visual properties are represented in IT to a greater extent than previously appreciated. In addition, our work demonstrates how combining a variety of state-of-the-art multivariate approaches, and carefully estimating the contribution of shape similarity to the representation of object categories, can substantially advance our understanding of neuronal coding of visual objects in cortex. PMID:23950700
Stekelenburg, Jeroen J; Keetels, Mirjam
2016-05-01
The Colavita effect refers to the phenomenon that, when confronted with an audiovisual stimulus, observers more often report having perceived the visual than the auditory component. The Colavita effect depends on low-level stimulus factors such as spatial and temporal proximity between the unimodal signals. Here, we examined whether the Colavita effect is modulated by synesthetic congruency between visual size and auditory pitch. If the Colavita effect depends on synesthetic congruency, we expect a larger Colavita effect for synesthetically congruent size/pitch (large visual stimulus/low-pitched tone; small visual stimulus/high-pitched tone) than synesthetically incongruent (large visual stimulus/high-pitched tone; small visual stimulus/low-pitched tone) combinations. Participants had to identify stimulus type (visual, auditory or audiovisual). The study replicated the Colavita effect because participants reported more often the visual than auditory component of the audiovisual stimuli. Synesthetic congruency had, however, no effect on the magnitude of the Colavita effect. EEG recordings to congruent and incongruent audiovisual pairings showed a late frontal congruency effect at 400-550 ms and an occipitoparietal effect at 690-800 ms with neural sources in the anterior cingulate and premotor cortex for the 400- to 550-ms window and premotor cortex, inferior parietal lobule and the posterior middle temporal gyrus for the 690- to 800-ms window. The electrophysiological data show that synesthetic congruency was probably detected in a processing stage subsequent to the Colavita effect. We conclude that, in a modality detection task, the Colavita effect can be modulated by low-level structural factors but not by higher-order associations between auditory and visual inputs.
Linking Plasma Conditions in the Magnetosphere with Ionospheric Signatures
NASA Technical Reports Server (NTRS)
Rastaetter, Lutz; Kozyra, Janet; Kuznetsova, Maria M.; Berrios, David H.
2012-01-01
Modeling of the full magnetosphere, ring current and ionosphere system has become an indispensable tool in analyzing the series of events that occur during geomagnetic storms. The CCMC has a full model suite available for the magnetosphere, together with visualization tools that allow a user to perform a large variety of analyses. The January 21, 2005 storm was a moderate-size storm that has been found to feature a large penetration electric field and unusually large polar caps (low-latitude precipitation patterns) that are otherwise found in super storms. Based on simulation runs at CCMC we can outline the likely causes of this behavior. Using visualization tools available to the online user we compare results from different magnetosphere models and present connections found between features in the magnetosphere and the ionosphere that are connected magnetically. The range of magnetic mappings found with different models can be compared with statistical models (Tsyganenko) and the model's fidelity can be verified with observations from low earth orbiting satellites such as DMSP and TIMED.
Genoviz Software Development Kit: Java tool kit for building genomics visualization applications.
Helt, Gregg A; Nicol, John W; Erwin, Ed; Blossom, Eric; Blanchard, Steven G; Chervitz, Stephen A; Harmon, Cyrus; Loraine, Ann E
2009-08-25
Visualization software can expose previously undiscovered patterns in genomic data and advance biological science. The Genoviz Software Development Kit (SDK) is an open source, Java-based framework designed for rapid assembly of visualization software applications for genomics. The Genoviz SDK framework provides a mechanism for incorporating adaptive, dynamic zooming into applications, a desirable feature of genome viewers. Visualization capabilities of the Genoviz SDK include automated layout of features along genetic or genomic axes; support for user interactions with graphical elements (Glyphs) in a map; a variety of Glyph sub-classes that promote experimentation with new ways of representing data in graphical formats; and support for adaptive, semantic zooming, whereby objects change their appearance depending on zoom level and zooming rate adapts to the current scale. Freely available demonstration and production quality applications, including the Integrated Genome Browser, illustrate Genoviz SDK capabilities. Separation between graphics components and genomic data models makes it easy for developers to add visualization capability to pre-existing applications or build new applications using third-party data models. Source code, documentation, sample applications, and tutorials are available at http://genoviz.sourceforge.net/.
3D Surface Reconstruction and Volume Calculation of Rills
NASA Astrophysics Data System (ADS)
Brings, Christine; Gronz, Oliver; Becker, Kerstin; Wirtz, Stefan; Seeger, Manuel; Ries, Johannes B.
2015-04-01
We use the low-cost, user-friendly photogrammetric Structure from Motion (SfM) technique, as implemented in the software VisualSfM, for 3D surface reconstruction and volume calculation of an 18 meter long rill in Luxembourg. The images were taken with a Canon HD video camera 1) before a natural rainfall event, 2) after a natural rainfall event and before a rill experiment and 3) after a rill experiment. Compared to a photo camera, recording with a video camera not only saves a great deal of time but also guarantees more than adequately overlapping sharp images. For each model, approximately 8 minutes of video were taken. As SfM needs single images, we automatically selected the sharpest image from each 15-frame interval. The sharpness was estimated using a derivative-based metric. VisualSfM then detects feature points in each image, searches for matching feature points in all image pairs, recovers the camera positions and finally, by triangulation of camera positions and feature points, reconstructs a point cloud of the rill surface. From the point cloud, 3D surface models (meshes) are created, and difference calculations between the pre and post models allow visualization of the changes (erosion and accumulation areas) and quantification of erosion volumes. The calculated volumes are expressed in the spatial units of the models, so they must be converted to real values via references. The outputs are three models at three different points in time. The results show that, especially for images taken from suboptimal videos (bad lighting conditions, low contrast of the surface, too much motion blur), the sharpness algorithm leads to many more matching features. Hence the point densities of the 3D models are increased, thereby improving the calculations.
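The frame-selection step described in the abstract above (keeping the sharpest image from each 15-frame interval, scored with a derivative-based metric) could look roughly like the following sketch. The abstract does not specify the metric, so the mean squared finite-difference gradient used here is an assumption, as are the function names `sharpness` and `select_sharpest` and the representation of a frame as a list of rows of gray values.

```python
def sharpness(frame):
    """Mean squared finite-difference gradient of a 2D grayscale frame.

    `frame` is a list of rows of numeric gray values. Larger values mean
    stronger local intensity changes, i.e. a sharper image.
    """
    rows, cols = len(frame), len(frame[0])
    total, count = 0.0, 0
    for y in range(rows):
        for x in range(cols):
            if x + 1 < cols:  # horizontal difference to the right neighbor
                total += (frame[y][x + 1] - frame[y][x]) ** 2
                count += 1
            if y + 1 < rows:  # vertical difference to the neighbor below
                total += (frame[y + 1][x] - frame[y][x]) ** 2
                count += 1
    return total / count if count else 0.0


def select_sharpest(frames, interval=15):
    """Return the index of the sharpest frame in each block of `interval` frames."""
    selected = []
    for start in range(0, len(frames), interval):
        block = range(start, min(start + interval, len(frames)))
        # On ties, max() keeps the earliest frame in the block.
        selected.append(max(block, key=lambda i: sharpness(frames[i])))
    return selected
```

With real video, the frames would first be decoded and converted to grayscale; the selected indices would then identify the images passed on to the SfM software.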
Ghodrati, Masoud; Ghodousi, Mahrad; Yoonessi, Ali
2016-01-01
Humans are fast and accurate in categorizing complex natural images. It is, however, unclear what features of visual information are exploited by the brain to perceive the images with such speed and accuracy. It has been shown that low-level contrast statistics of natural scenes can explain the variance of amplitude of event-related potentials (ERP) in response to rapidly presented images. In this study, we investigated the effect of these statistics on frequency content of ERPs. We recorded ERPs from human subjects, while they viewed natural images each presented for 70 ms. Our results showed that Weibull contrast statistics, as a biologically plausible model, explained the variance of ERPs the best, compared to other image statistics that we assessed. Our time-frequency analysis revealed a significant correlation between these statistics and ERPs' power within theta frequency band (~3–7 Hz). This is interesting, as theta band is believed to be involved in context updating and semantic encoding. This correlation became significant at ~110 ms after stimulus onset, and peaked at 138 ms. Our results show that not only the amplitude but also the frequency of neural responses can be modulated with low-level contrast statistics of natural images and highlight their potential role in scene perception. PMID:28018197
Lukasczyk, Jonas; Weber, Gunther; Maciejewski, Ross; ...
2017-06-01
Tracking graphs are a well established tool in topological analysis to visualize the evolution of components and their properties over time, i.e., when components appear, disappear, merge, and split. However, tracking graphs are limited to a single level threshold and the graphs may vary substantially even under small changes to the threshold. To examine the evolution of features for varying levels, users have to compare multiple tracking graphs without a direct visual link between them. We propose a novel, interactive, nested graph visualization based on the fact that the tracked superlevel set components for different levels are related to each other through their nesting hierarchy. This approach allows us to set multiple tracking graphs in context to each other and enables users to effectively follow the evolution of components for different levels simultaneously. We show the effectiveness of our approach on datasets from finite pointset methods, computational fluid dynamics, and cosmology simulations.
Assessing clutter reduction in parallel coordinates using image processing techniques
NASA Astrophysics Data System (ADS)
Alhamaydh, Heba; Alzoubi, Hussein; Almasaeid, Hisham
2018-01-01
Information visualization has emerged in recent years as an important research field for multidimensional data and correlation analysis. Parallel coordinates (PCs) are one of the popular techniques for visualizing high-dimensional data. A problem with the PCs technique is that it suffers from crowding, a form of clutter that hides important data and obfuscates the information. Earlier research has been conducted to reduce clutter without loss in data content. We introduce the use of image processing techniques as an approach for assessing the performance of clutter reduction techniques in PC. We use histogram analysis as our first measure, where the mean feature of the color histograms of the possible alternative orderings of coordinates for the PC images is calculated and compared. The second measure is the extracted contrast feature from the texture of PC images based on gray-level co-occurrence matrices. The results show that the best PC image is the one that has the minimal mean value of the color histogram feature and the maximal contrast value of the texture feature. In addition to its simplicity, the proposed assessment method has the advantage of objectively assessing alternative orderings of PC visualization.
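The two measures named in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the image is assumed to be a single-channel list of rows of integer gray levels, the co-occurrence matrix is built for one non-symmetric pixel offset (default: right-hand neighbor), and the "mean feature" is taken to be the mean gray level, which equals the mean of the gray-level histogram. The names `glcm_contrast` and `histogram_mean` are hypothetical.

```python
from collections import Counter


def glcm_contrast(image, dx=1, dy=0):
    """Contrast of the gray-level co-occurrence matrix for offset (dx, dy).

    Contrast is the sum over level pairs (i, j) of P(i, j) * (i - j)**2,
    where P is the co-occurrence count matrix normalized to sum to 1.
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[(image[y][x], image[ny][nx])] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n * (i - j) ** 2 for (i, j), n in counts.items()) / total


def histogram_mean(image):
    """Mean gray level, i.e. the mean of the image's gray-level histogram."""
    pixels = [v for row in image for v in row]
    return sum(pixels) / len(pixels)
```

Under the criterion stated in the abstract, candidate orderings of the coordinates would each be rendered to an image and scored, preferring the one with the lowest histogram mean and the highest contrast.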
Maekawa, Toru; de Brecht, Matthew; Yamagishi, Noriko
2018-01-01
The study of visual perception has largely been completed without regard to the influence that an individual’s emotional status may have on their performance in visual tasks. However, there is a growing body of evidence to suggest that mood may affect not only creative abilities and interpersonal skills but also the capacity to perform low-level cognitive tasks. Here, we sought to determine whether rudimentary visual search processes are similarly affected by emotion. Specifically, we examined whether an individual’s perceived happiness level affects their ability to detect a target in noise. To do so, we employed pop-out and serial visual search paradigms, implemented using a novel smartphone application that allowed search times and self-rated levels of happiness to be recorded throughout each twenty-four-hour period for two weeks. This experience sampling protocol circumvented the need to alter mood artificially with laboratory-based induction methods. Using our smartphone application, we were able to replicate the classic visual search findings, whereby pop-out search times remained largely unaffected by the number of distractors whereas serial search times increased with increasing number of distractors. While pop-out search times were unaffected by happiness level, serial search times with the maximum numbers of distractors (n = 30) were significantly faster for high happiness levels than low happiness levels (p = 0.02). Our results demonstrate the utility of smartphone applications in assessing ecologically valid measures of human visual performance. We discuss the significance of our findings for the assessment of basic visual functions using search time measures, and for our ability to search effectively for targets in real world settings. PMID:29664952
Maekawa, Toru; Anderson, Stephen J; de Brecht, Matthew; Yamagishi, Noriko
2018-01-01
The study of visual perception has largely been completed without regard to the influence that an individual's emotional status may have on their performance in visual tasks. However, there is a growing body of evidence to suggest that mood may affect not only creative abilities and interpersonal skills but also the capacity to perform low-level cognitive tasks. Here, we sought to determine whether rudimentary visual search processes are similarly affected by emotion. Specifically, we examined whether an individual's perceived happiness level affects their ability to detect a target in noise. To do so, we employed pop-out and serial visual search paradigms, implemented using a novel smartphone application that allowed search times and self-rated levels of happiness to be recorded throughout each twenty-four-hour period for two weeks. This experience sampling protocol circumvented the need to alter mood artificially with laboratory-based induction methods. Using our smartphone application, we were able to replicate the classic visual search findings, whereby pop-out search times remained largely unaffected by the number of distractors whereas serial search times increased with increasing number of distractors. While pop-out search times were unaffected by happiness level, serial search times with the maximum numbers of distractors (n = 30) were significantly faster for high happiness levels than low happiness levels (p = 0.02). Our results demonstrate the utility of smartphone applications in assessing ecologically valid measures of human visual performance. We discuss the significance of our findings for the assessment of basic visual functions using search time measures, and for our ability to search effectively for targets in real world settings.
AE (Acoustic Emission) for Flip-Chip CGA/FCBGA Defect Detection
NASA Technical Reports Server (NTRS)
Ghaffarian, Reza
2014-01-01
C-mode scanning acoustic microscopy (C-SAM) is a nondestructive inspection technique that uses ultrasound to show the internal features of a specimen. A very high or ultra-high-frequency ultrasound passes through a specimen to produce a visible acoustic microimage (AMI) of its inner features. As ultrasound travels into a specimen, the wave is absorbed, scattered or reflected. The response is highly sensitive to the elastic properties of the materials and is especially sensitive to air gaps. This specific characteristic makes AMI the preferred method for finding "air gaps" such as delamination, cracks, voids, and porosity. C-SAM analysis, which is a type of AMI, was widely used in the past for evaluation of plastic microelectronic circuits, especially for detecting delamination of direct die bonding. With the introduction of the flip-chip die attachment in a package, its use has been expanded to nondestructive characterization of the flip-chip solder bumps and underfill. Figure 1.1 compares visual and C-SAM inspection approaches for defect detection, especially for solder joint interconnections and hidden defects. C-SAM is specifically useful for package features like internal cracks and delamination. C-SAM not only allows for the visualization of the interior features, it also has the ability to produce images on a layer-by-layer basis. Visual inspection, however, is superior to C-SAM only for exposed features, including solder dewetting, microcracks, and contamination. Ideally, a combination of various inspection techniques (visual, optical and SEM microscopy, C-SAM, and X-ray) needs to be performed in order to assure quality at part, package, and system levels. This report presents evaluations performed on various advanced packages/assemblies, especially the flip-chip die version of ball grid array/column grid array (BGA/CGA), using C-SAM equipment. Both external and internal equipment were used for evaluation.
The outside facility provided images of the key features that could be detected using the most advanced C-SAM equipment with a skilled operator. Investigation continued using in-house equipment with its limitations. For comparison, representative X-rays of the assemblies were also gathered to show key defect detection features of these non-destructive techniques. The key comparisons were as follows. Images from 2D X-ray and C-SAM were compared for a plastic LGA assembly, showing features that could be detected by either NDE technique; for this specific case, X-ray was a clear winner. Flip-chip CGA and FCBGA assemblies with and without heat sink were evaluated by C-SAM; only the FCCGA package that had no heat sink could be fully analyzed for underfill and bump quality. Cross-sectional microscopy did not reveal the peripheral delamination features detected by C-SAM. A number of fine-pitch PBGA assemblies were analyzed by C-SAM; even though the internal features of the package assemblies could be detected, C-SAM was unable to detect solder joint failure at either the package or board level. Twenty touch-ups with a solder iron at a 700°F tip temperature, each of about 5 seconds' duration, did not induce defects detectable in C-SAM images. Other techniques need to be considered to induce known defects for characterization. Given NASA's emphasis on the use of microelectronic packages and assemblies and on quality assurance in workmanship defect detection, understanding key features of various inspection systems that detect defects in the early stages of package and assembly is critical to developing approaches that will minimize future failures. Additional specific, tailored non-destructive inspection approaches could enable low-risk insertion of these advanced electronic packages having hidden and fine features.
Guidance of attention by information held in working memory.
Calleja, Marissa Ortiz; Rich, Anina N
2013-05-01
Information held in working memory (WM) can guide attention during visual search. The authors of recent studies have interpreted the effect of holding verbal labels in WM as guidance of visual attention by semantic information. In a series of experiments, we tested how attention is influenced by visual features versus category-level information about complex objects held in WM. Participants either memorized an object's image or its category. While holding this information in memory, they searched for a target in a four-object search display. On exact-match trials, the memorized item reappeared as a distractor in the search display. On category-match trials, another exemplar of the memorized item appeared as a distractor. On neutral trials, none of the distractors were related to the memorized object. We found attentional guidance in visual search on both exact-match and category-match trials in Experiment 1, in which the exemplars were visually similar. When we controlled for visual similarity among the exemplars by using four possible exemplars (Exp. 2) or by using two exemplars rated as being visually dissimilar (Exp. 3), we found attentional guidance only on exact-match trials when participants memorized the object's image. The same pattern of results held when the target was invariant (Exps. 2-3) and when the target was defined semantically and varied in visual features (Exp. 4). The findings of these experiments suggest that attentional guidance by WM requires active visual information.
Reilly, Jamie; Garcia, Amanda; Binney, Richard J.
2016-01-01
Much remains to be learned about the neural architecture underlying word meaning. Fully distributed models of semantic memory predict that the sound of a barking dog will conjointly engage a network of distributed sensorimotor spokes. An alternative framework holds that modality-specific features additionally converge within transmodal hubs. Participants underwent functional MRI while covertly naming familiar objects versus newly learned novel objects from only one of their constituent semantic features (visual form, characteristic sound, or point-light motion representation). Relative to the novel object baseline, familiar concepts elicited greater activation within association regions specific to that presentation modality. Furthermore, visual form elicited activation within high-level auditory association cortex. Conversely, environmental sounds elicited activation in regions proximal to visual association cortex. Both conditions commonly engaged a putative hub region within lateral anterior temporal cortex. These results support hybrid semantic models in which local hubs and distributed spokes are dually engaged in service of semantic memory. PMID:27289210