Gamifying Video Object Segmentation.
Spampinato, Concetto; Palazzo, Simone; Giordano, Daniela
2017-10-01
Video object segmentation can be considered one of the most challenging computer vision problems. Indeed, so far, no existing solution is able to effectively deal with the peculiarities of real-world videos, especially in cases of articulated motion and object occlusions; these limitations appear even more evident when we compare the performance of automated methods with that of humans. However, manually segmenting objects in videos is largely impractical, as it requires a lot of time and concentration. To address this problem, in this paper we propose an interactive video object segmentation method which exploits, on one hand, the capability of humans to correctly identify objects in visual scenes, and on the other hand, collective human brainpower to solve challenging, large-scale tasks. In particular, our method relies on a game with a purpose to collect human inputs on object locations, followed by an accurate segmentation phase achieved by optimizing an energy function encoding spatial and temporal constraints between object regions as well as human-provided location priors. Performance analysis carried out on complex video benchmarks, exploiting data provided by over 60 users, demonstrates that our method achieves a better trade-off between annotation time and segmentation accuracy than interactive video annotation and automated video object segmentation approaches.
A new user-assisted segmentation and tracking technique for an object-based video editing system
NASA Astrophysics Data System (ADS)
Yu, Hong Y.; Hong, Sung-Hoon; Lee, Mike M.; Choi, Jae-Gark
2004-03-01
This paper presents a semi-automatic segmentation method which can be used to generate video object planes (VOPs) for object-based coding schemes and multimedia authoring environments. Semi-automatic segmentation can be considered a user-assisted segmentation technique. A user initially marks objects of interest around the object boundaries, and the user-guided, selected objects are then continuously separated from the unselected areas through time evolution in the image sequence. The proposed segmentation method consists of two processing steps: partially manual intra-frame segmentation and fully automatic inter-frame segmentation. The intra-frame segmentation incorporates user assistance to define the meaningful, complete visual object of interest to be segmented and decides the precise object boundary. The inter-frame segmentation involves boundary and region tracking to obtain temporal coherence of the moving object based on the object boundary information of the previous frame. The proposed method shows stable, efficient results that are suitable for many digital video applications, such as multimedia content authoring, content-based coding and indexing. Based on these results, we have developed an object-based video editing system with several convenient editing functions.
Model-based video segmentation for vision-augmented interactive games
NASA Astrophysics Data System (ADS)
Liu, Lurng-Kuo
2000-04-01
This paper presents an architecture and algorithms for model-based video object segmentation and its application to vision-augmented interactive games. We are especially interested in real-time, low-cost vision-based applications that can be implemented in software on a PC. We use different models for the background and a player object. The object segmentation algorithm is performed at two different levels: pixel level and object level. At the pixel level, segmentation is formulated as a maximum a posteriori (MAP) estimation problem; the statistical likelihood of each pixel is calculated and used in the MAP problem. Object-level segmentation is used to improve segmentation quality by utilizing information about the spatial and temporal extent of the object. The concept of an active region, defined from a motion histogram and trajectory prediction, is introduced to indicate the possibility of a video object region for both background and foreground modeling; it also reduces the overall computational complexity. In contrast with other applications, the proposed video object segmentation system is able to create background and foreground models on the fly, even without introductory background frames. Furthermore, we apply different rates of self-tuning to the scene model so that the system can adapt to the environment when there is a scene change. We applied the proposed video object segmentation algorithms to several prototype virtual interactive games, in which a player can immerse himself/herself inside a game and virtually interact with other animated characters in real time without being constrained by helmets, gloves, special sensing devices, or the background environment. Potential applications of the proposed algorithms include human-computer gesture interfaces and object-based video coding such as MPEG-4.
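The pixel-level MAP step described above can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: it assumes single-channel frames and a per-class Gaussian likelihood with scalar mean and variance, and all function and parameter names are ours.

```python
import numpy as np

def map_classify(frame, bg_mean, bg_var, fg_mean, fg_var, prior_fg=0.5):
    """Per-pixel MAP labeling: compare the Gaussian log-likelihood of each
    pixel under the background and foreground models, weighted by the
    class priors, and pick the more probable label."""
    def log_gauss(x, mu, var):
        return -0.5 * np.log(2 * np.pi * var) - (x - mu) ** 2 / (2 * var)
    log_bg = log_gauss(frame, bg_mean, bg_var) + np.log(1 - prior_fg)
    log_fg = log_gauss(frame, fg_mean, fg_var) + np.log(prior_fg)
    return (log_fg > log_bg).astype(np.uint8)  # 1 = foreground
```

A full system would maintain and self-tune these model parameters over time, as the paper describes for scene changes.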
Fast Appearance Modeling for Automatic Primary Video Object Segmentation.
Yang, Jiong; Price, Brian; Shen, Xiaohui; Lin, Zhe; Yuan, Junsong
2016-02-01
Automatic segmentation of the primary object in a video clip is a challenging problem, as there is no prior knowledge of the primary object. Most existing techniques thus adopt an iterative approach to foreground and background appearance modeling, i.e., fix the appearance model while optimizing the segmentation, then fix the segmentation while optimizing the appearance model. However, these approaches rely on good initialization, can easily be trapped in local optima, and are usually time-consuming when analyzing videos. To address these limitations, we propose a novel and efficient appearance modeling technique for automatic primary video object segmentation in the Markov random field (MRF) framework. It embeds the appearance constraint as auxiliary nodes and edges in the MRF structure, and can optimize both the segmentation and the appearance model parameters simultaneously in one graph cut. Extensive experimental evaluations validate the superiority of the proposed approach over state-of-the-art methods in both efficiency and effectiveness.
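The one-graph-cut idea can be illustrated with a plain binary MRF (unary data terms plus a Potts pairwise term) solved by min-cut. This sketch uses networkx for the cut and omits the paper's appearance auxiliary nodes; all names and the toy cost convention are ours.

```python
import networkx as nx

def mrf_graphcut(unary_fg, unary_bg, edges, smoothness=1.0):
    """Binary MRF labeling by a single min-cut. unary_fg[i] / unary_bg[i]
    are the costs of labeling pixel i foreground / background; `edges`
    lists neighbor pairs sharing a Potts smoothness penalty."""
    G = nx.DiGraph()
    for i, (cf, cb) in enumerate(zip(unary_fg, unary_bg)):
        G.add_edge('s', i, capacity=cb)   # paid if i ends up background
        G.add_edge(i, 't', capacity=cf)   # paid if i ends up foreground
    for i, j in edges:
        G.add_edge(i, j, capacity=smoothness)
        G.add_edge(j, i, capacity=smoothness)
    _, (src_side, _) = nx.minimum_cut(G, 's', 't')
    return [1 if i in src_side else 0 for i in range(len(unary_fg))]
```

The paper's contribution is to fold the appearance-model update into this same cut via extra nodes, so no alternation between model fitting and labeling is needed.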
Selecting salient frames for spatiotemporal video modeling and segmentation.
Song, Xiaomu; Fan, Guoliang
2007-12-01
We propose a new statistical generative model for spatiotemporal video segmentation. The objective is to partition a video sequence into homogeneous segments that can be used as "building blocks" for semantic video segmentation. The baseline framework is a Gaussian mixture model (GMM)-based video modeling approach that involves a six-dimensional spatiotemporal feature space. Specifically, we introduce the concept of frame saliency to quantify the relevancy of a video frame to the GMM-based spatiotemporal video modeling. This helps us use a small set of salient frames to facilitate the model training by reducing data redundancy and irrelevance. A modified expectation maximization algorithm is developed for simultaneous GMM training and frame saliency estimation, and the frames with the highest saliency values are extracted to refine the GMM estimation for video segmentation. Moreover, it is interesting to find that frame saliency can imply some object behaviors. This makes the proposed method also applicable to other frame-related video analysis tasks, such as key-frame extraction, video skimming, etc. Experiments on real videos demonstrate the effectiveness and efficiency of the proposed method.
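The ranking of frames by saliency can be approximated as below. The paper estimates frame saliency jointly inside a modified EM; scoring each frame's average likelihood under a single fitted GMM, as here, is only a simple proxy, and all names are ours.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def salient_frames(frame_features, n_components=1, k=2, seed=0):
    """Fit one GMM on features pooled over all frames, rank frames by
    their mean log-likelihood under the model, and return the top-k
    frame indices together with all per-frame scores."""
    X = np.vstack(frame_features)
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(X)
    scores = [gmm.score(f) for f in frame_features]   # mean log-likelihood
    order = np.argsort(scores)[::-1]                  # most salient first
    return list(order[:k]), scores
```

The selected frames would then be used to refine the model, mirroring the paper's use of salient frames to reduce redundancy in training.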
NASA Astrophysics Data System (ADS)
Zhang, Chao; Zhang, Qian; Zheng, Chi; Qiu, Guoping
2018-04-01
Video foreground segmentation is one of the key problems in video processing. In this paper, we proposed a novel and fully unsupervised approach for foreground object co-localization and segmentation of unconstrained videos. We firstly compute both the actual edges and motion boundaries of the video frames, and then align them by their HOG feature maps. Then, by filling the occlusions generated by the aligned edges, we obtained more precise masks about the foreground object. Such motion-based masks could be derived as the motion-based likelihood. Moreover, the color-base likelihood is adopted for the segmentation process. Experimental Results show that our approach outperforms most of the State-of-the-art algorithms.
Improved segmentation of occluded and adjoining vehicles in traffic surveillance videos
NASA Astrophysics Data System (ADS)
Juneja, Medha; Grover, Priyanka
2013-12-01
Occlusion in image processing refers to concealment of any part of an object, or the whole object, from the view of an observer. Real-time videos captured by static cameras on roads often encounter overlapping and, hence, occlusion of vehicles. Occlusion in traffic surveillance videos usually occurs when an object being tracked is hidden by another object, which makes it difficult for object detection algorithms to distinguish all the vehicles efficiently. Morphological operations also tend to join vehicles in close proximity, resulting in a single bounding box around more than one vehicle. Such problems lead to errors in further video processing, such as counting the vehicles in a video. The proposed system brings forward an efficient moving object detection and tracking approach to reduce such errors. The paper uses a successive frame subtraction technique for detection of moving objects. Further, it implements the watershed algorithm to segment the overlapped and adjoining vehicles. The segmentation results have been improved by noise-removal and morphological operations.
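The successive-frame-subtraction step can be sketched as follows; this is our illustration, not the paper's code, and the threshold is an assumption.

```python
import numpy as np
from scipy import ndimage

def detect_moving(prev, curr, nxt, thresh=0.2):
    """Successive frame subtraction: mark a pixel as moving only when it
    differs from both the previous and the next frame, which suppresses
    the 'ghost' left behind at an object's old position. Connected
    components of the motion mask are the candidate vehicles."""
    motion = (np.abs(curr - prev) > thresh) & (np.abs(nxt - curr) > thresh)
    labels, n = ndimage.label(motion)
    return labels, n
```

The paper then splits merged blobs with the watershed transform (e.g. applied to the distance transform of this mask), which the sketch leaves out.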
Causal Video Object Segmentation From Persistence of Occlusions
2015-05-01
Precision, recall, and F-measure are reported on the ground-truth annotations converted to binary masks.
Video segmentation using keywords
NASA Astrophysics Data System (ADS)
Ton-That, Vinh; Vong, Chi-Tai; Nguyen-Dao, Xuan-Truong; Tran, Minh-Triet
2018-04-01
At the DAVIS-2016 Challenge, many state-of-the-art video segmentation methods achieve strong results, but they still depend heavily on annotated frames to distinguish between background and foreground, and creating these frames accurately takes a lot of time and effort. In this paper, we introduce a method to segment objects from video based on keywords given by the user. First, we use a real-time object detection system, YOLOv2, to identify regions in the first frame containing objects whose labels match the given keywords. Then, for each region identified in the previous step, we use the Pyramid Scene Parsing Network to assign each pixel as foreground or background. These frames can be used as input frames for the Object Flow algorithm to perform segmentation on the entire video. We conduct experiments on a subset of the DAVIS-2016 dataset, with frames downscaled to half their original size, showing that our method can handle many popular classes of the PASCAL VOC 2012 dataset with acceptable accuracy, about 75.03%. We suggest wider testing, combining our approach with other methods, to improve this result in the future.
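The keyword-matching step of this pipeline can be sketched as below. The `(label, confidence, box)` tuple layout and the confidence threshold are our assumptions standing in for YOLOv2 output, not the authors' interface.

```python
def select_regions(detections, keywords, min_conf=0.5):
    """Keep only detector outputs whose class label matches one of the
    user-given keywords (case-insensitive) and whose confidence clears
    `min_conf`; return them most-confident first."""
    wanted = {k.lower() for k in keywords}
    kept = [d for d in detections if d[0].lower() in wanted and d[1] >= min_conf]
    return sorted(kept, key=lambda d: -d[1])
```

The surviving regions would then be passed to the per-pixel labeling stage described above.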
2013-10-03
We follow the setup in the literature ([13, 14]) and use five of the videos (birdfall, cheetah, girl, monkeydog and parachute) for evaluation. F denotes the segmentation labeling produced by a method and GT the ground-truth labeling of the video. Segmentation error per video (lower is better):

Video       Ours   [14]   [13]   [20]    [6]
birdfall     155    189    288    252    454
cheetah      633    806    905   1142   1217
girl        1488   1698   1785   1304   1755
monkeydog    365    472    521    563    683
Extraction of composite visual objects from audiovisual materials
NASA Astrophysics Data System (ADS)
Durand, Gwenael; Thienot, Cedric; Faudemay, Pascal
1999-08-01
An effective analysis of Visual Objects appearing in still images and video frames is required in order to offer fine grain access to multimedia and audiovisual contents. In previous papers, we showed how our method for segmenting still images into visual objects could improve content-based image retrieval and video analysis methods. Visual Objects are used in particular for extracting semantic knowledge about the contents. However, low-level segmentation methods for still images are not likely to extract a complex object as a whole but instead as a set of several sub-objects. For example, a person would be segmented into three visual objects: a face, hair, and a body. In this paper, we introduce the concept of Composite Visual Object. Such an object is hierarchically composed of sub-objects called Component Objects.
NASA Astrophysics Data System (ADS)
Ezhova, Kseniia; Fedorenko, Dmitriy; Chuhlamov, Anton
2016-04-01
The article deals with methods of image segmentation based on color space conversion, which allow efficient detection of a single color against a complex background under varying lighting, as well as detection of objects on a homogeneous background. Results of an analysis of segmentation algorithms of this type are presented, along with the possibility of implementing them in software. The implemented algorithm is computationally very time-consuming, which limits its application to video analysis; however, it solves the problem of analyzing objects in an image when no image dictionary or knowledge base is available, as well as the problem of choosing optimal frame-quantization parameters for video analysis.
Common and Innovative Visuals: A sparsity modeling framework for video.
Abdolhosseini Moghadam, Abdolreza; Kumar, Mrityunjay; Radha, Hayder
2014-05-02
Efficient video representation models are critical for many video analysis and processing tasks. In this paper, we present a framework based on the concept of finding the sparsest solution to model video frames. To model the spatio-temporal information, frames from one scene are decomposed into two components: (i) a common frame, which describes the visual information common to all the frames in the scene/segment, and (ii) a set of innovative frames, which depict the dynamic behaviour of the scene. The proposed approach exploits and builds on recent results in the field of compressed sensing to jointly estimate the common frame and the innovative frames for each video segment. We refer to the proposed modeling framework as CIV (Common and Innovative Visuals). We show how the proposed model can be utilized to find scene change boundaries and extend CIV to videos from multiple scenes. Furthermore, the proposed model is robust to noise and can be used for various video processing applications without relying on motion estimation and detection or image segmentation. Results for object tracking, video editing (object removal, inpainting) and scene change detection are presented to demonstrate the efficiency and the performance of the proposed model.
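The common/innovative decomposition can be illustrated cheaply as below. CIV estimates both components jointly with compressed-sensing machinery; taking the temporal median as the common frame and thresholding residuals into sparse innovations, as here, is only a toy stand-in, and all names are ours.

```python
import numpy as np

def common_innovative(frames, thresh=0.1):
    """Split a scene's frames into a common frame (temporal median) plus
    sparse innovation frames (residuals with small entries zeroed out)."""
    stack = np.stack([np.asarray(f, dtype=float) for f in frames])
    common = np.median(stack, axis=0)
    innovations = stack - common
    innovations[np.abs(innovations) < thresh] = 0.0   # enforce sparsity
    return common, innovations
```

A scene change would show up as a sudden jump in the energy of the innovation frames, which is one way to motivate CIV's boundary detection.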
User-assisted video segmentation system for visual communication
NASA Astrophysics Data System (ADS)
Wu, Zhengping; Chen, Chun
2002-01-01
Video segmentation plays an important role in efficient storage and transmission for visual communication. In this paper, we introduce a novel video segmentation system using point tracking and contour formation techniques. Inspired by results from the study of the human visual system, we decompose the video segmentation problem into three separate phases: user-assisted feature point selection, automatic feature point tracking, and contour formation. This splitting relieves the computer of ill-posed automatic segmentation problems and allows a higher level of flexibility in the method. First, precise feature points are found using a combination of user assistance and an eigenvalue-based adjustment. Second, the feature points in the remaining frames are obtained using motion estimation and point refinement. Finally, contour formation is used to extract the object, plus a point insertion process that provides the feature points for the next frame's tracking.
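One plausible reading of the eigenvalue-based adjustment above is a Shi-Tomasi-style feature score: the smaller eigenvalue of the local structure tensor. The sketch below is our illustration, not the authors' exact procedure.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def corner_strength(img, win=3):
    """Score each pixel by the smaller eigenvalue of the 2x2 structure
    tensor built from image gradients and smoothed over a local window;
    large values mark corner-like points that track reliably."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    Sxx = uniform_filter(gx * gx, win)
    Syy = uniform_filter(gy * gy, win)
    Sxy = uniform_filter(gx * gy, win)
    tr = Sxx + Syy
    det = Sxx * Syy - Sxy ** 2
    disc = np.sqrt(np.maximum(tr ** 2 / 4.0 - det, 0.0))
    return tr / 2.0 - disc   # smaller eigenvalue of the tensor
```

User-selected points could then be snapped to nearby local maxima of this map before tracking.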
Real-time image sequence segmentation using curve evolution
NASA Astrophysics Data System (ADS)
Zhang, Jun; Liu, Weisong
2001-04-01
In this paper, we describe a novel approach to image sequence segmentation and its real-time implementation. This approach uses the 3D structure tensor to produce a more robust frame difference signal and uses curve evolution to extract whole objects. Our algorithm is implemented on a standard PC running the Windows operating system with video capture from a USB camera that is a standard Windows video capture device. Using the Windows standard video I/O functionalities, our segmentation software is highly portable and easy to maintain and upgrade. In its current implementation on a Pentium 400, the system can perform segmentation at 5 frames/sec with a frame resolution of 160 by 120.
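The robust frame-difference signal can be illustrated with a miniature three-frame structure tensor: normalizing the smoothed temporal gradient energy by the total gradient energy keeps textured but static regions from firing. This is our simplified illustration, not the paper's algorithm, and all names are ours.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def motion_measure(f0, f1, f2, win=3, eps=1e-6):
    """Return a per-pixel score near 1 where temporal change dominates
    image structure and near 0 in static regions."""
    f0, f1, f2 = (np.asarray(f, dtype=float) for f in (f0, f1, f2))
    It = (f2 - f0) / 2.0                          # central temporal derivative
    gy, gx = np.gradient(f1)                      # spatial derivatives
    Jtt = uniform_filter(It * It, win)            # smoothed temporal energy
    Jss = uniform_filter(gx * gx + gy * gy, win)  # smoothed spatial energy
    return Jtt / (Jtt + Jss + eps)
```

Curve evolution would then grow contours over high-scoring regions to recover whole objects, as the paper describes.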
Video-based noncooperative iris image segmentation.
Du, Yingzi; Arslanturk, Emrah; Zhou, Zhi; Belcher, Craig
2011-02-01
In this paper, we propose a video-based noncooperative iris image segmentation scheme that incorporates a quality filter to quickly eliminate images without an eye, employs a coarse-to-fine segmentation scheme to improve the overall efficiency, uses a direct least squares fitting of ellipses method to model the deformed pupil and limbic boundaries, and develops a window gradient-based method to remove noise in the iris region. A remote iris acquisition system is set up to collect noncooperative iris video images. An objective method is used to quantitatively evaluate the accuracy of the segmentation results. The experimental results demonstrate the effectiveness of this method. The proposed method would make noncooperative iris recognition or iris surveillance possible.
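The boundary-modeling step can be illustrated with an algebraic least-squares conic fit. The paper uses the ellipse-specific direct fit of Fitzgibbon et al.; the unconstrained SVD version below is a simplified stand-in, and the function name is ours.

```python
import numpy as np

def fit_conic(x, y):
    """Fit a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 to points (x, y) by
    taking the right singular vector of the design matrix with the
    smallest singular value (the least-squares null-space direction)."""
    D = np.column_stack([x * x, x * y, y * y, x, y, np.ones_like(x)])
    _, _, Vt = np.linalg.svd(D)
    return Vt[-1]   # unit-norm conic coefficients [a, b, c, d, e, f]
```

For a deformed pupil boundary, the fitted conic gives the ellipse parameters directly; the constrained Fitzgibbon fit additionally guarantees the result is an ellipse rather than another conic.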
Object class segmentation of RGB-D video using recurrent convolutional neural networks.
Pavel, Mircea Serban; Schulz, Hannes; Behnke, Sven
2017-04-01
Object class segmentation is a computer vision task which requires labeling each pixel of an image with the class of the object it belongs to. Deep convolutional neural networks (DNN) are able to learn and take advantage of local spatial correlations required for this task. They are, however, restricted by their small, fixed-sized filters, which limits their ability to learn long-range dependencies. Recurrent Neural Networks (RNN), on the other hand, do not suffer from this restriction. Their iterative interpretation allows them to model long-range dependencies by propagating activity. This property is especially useful when labeling video sequences, where both spatial and temporal long-range dependencies occur. In this work, a novel RNN architecture for object class segmentation is presented. We investigate several ways to train such a network. We evaluate our models on the challenging NYU Depth v2 dataset for object class segmentation and obtain competitive results. Copyright © 2017 Elsevier Ltd. All rights reserved.
Doulamis, A; Doulamis, N; Ntalianis, K; Kollias, S
2003-01-01
In this paper, an unsupervised video object (VO) segmentation and tracking algorithm is proposed based on an adaptable neural-network architecture. The proposed scheme comprises: 1) a VO tracking module and 2) an initial VO estimation module. Object tracking is handled as a classification problem and implemented through an adaptive network classifier, which provides better results compared to conventional motion-based tracking algorithms. Network adaptation is accomplished through an efficient and cost-effective weight updating algorithm, providing minimum degradation of the previous network knowledge and taking into account the current content conditions. A retraining set is constructed and used for this purpose based on initial VO estimation results. Two different scenarios are investigated. The first concerns extraction of human entities in video conferencing applications, while the second exploits depth information to identify generic VOs in stereoscopic video sequences. Human face/body detection based on Gaussian distributions is accomplished in the first scenario, while segmentation fusion is obtained using color and depth information in the second scenario. A decision mechanism is also incorporated to detect time instances for weight updating. Experimental results and comparisons indicate the good performance of the proposed scheme even in sequences with complicated content (object bending, occlusion).
NASA Technical Reports Server (NTRS)
Smith, Michael A.; Kanade, Takeo
1997-01-01
Digital video is rapidly becoming important for education, entertainment, and a host of multimedia applications. With the size of the video collections growing to thousands of hours, technology is needed to effectively browse segments in a short time without losing the content of the video. We propose a method to extract the significant audio and video information and create a "skim" video which represents a very short synopsis of the original. The goal of this work is to show the utility of integrating language and image understanding techniques for video skimming by extraction of significant information, such as specific objects, audio keywords and relevant video structure. The resulting skim video is much shorter, where compaction is as high as 20:1, and yet retains the essential content of the original segment.
Baca, A
1996-04-01
A method has been developed for the precise determination of anthropometric dimensions from the video images of four different body configurations. High precision is achieved by incorporating techniques for finding the location of object boundaries with sub-pixel accuracy, the implementation of calibration algorithms, and by taking into account the varying distances of the body segments from the recording camera. The system allows automatic segment boundary identification from the video image, if the boundaries are marked on the subject by black ribbons. In connection with the mathematical finite-mass-element segment model of Hatze, body segment parameters (volumes, masses, the three principal moments of inertia, the three local coordinates of the segmental mass centers etc.) can be computed by using the anthropometric data determined videometrically as input data. Compared to other, recently published video-based systems for the estimation of the inertial properties of body segments, the present algorithms reduce errors originating from optical distortions, inaccurate edge-detection procedures, and user-specified upper and lower segment boundaries or threshold levels for the edge-detection. The video-based estimation of human body segment parameters is especially useful in situations where ease of application and rapid availability of comparatively precise parameter values are of importance.
Bilayer segmentation of webcam videos using tree-based classifiers.
Yin, Pei; Criminisi, Antonio; Winn, John; Essa, Irfan
2011-01-01
This paper presents an automatic segmentation algorithm for video frames captured by a (monocular) webcam that closely approximates depth segmentation from a stereo camera. The frames are segmented into foreground and background layers that comprise a subject (participant) and other objects and individuals. The algorithm produces correct segmentations even in the presence of large background motion with a nearly stationary foreground. This research makes three key contributions: First, we introduce a novel motion representation, referred to as "motons," inspired by research in object recognition. Second, we propose estimating the segmentation likelihood from the spatial context of motion. The estimation is efficiently learned by random forests. Third, we introduce a general taxonomy of tree-based classifiers that facilitates both theoretical and experimental comparisons of several known classification algorithms and generates new ones. In our bilayer segmentation algorithm, diverse visual cues such as motion, motion context, color, contrast, and spatial priors are fused by means of a conditional random field (CRF) model. Segmentation is then achieved by binary min-cut. Experiments on many sequences of our videochat application demonstrate that our algorithm, which requires no initialization, is effective in a variety of scenes, and the segmentation results are comparable to those obtained by stereo systems.
Unsupervised motion-based object segmentation refined by color
NASA Astrophysics Data System (ADS)
Piek, Matthijs C.; Braspenning, Ralph; Varekamp, Chris
2003-06-01
For various applications, such as data compression, structure from motion, medical imaging and video enhancement, there is a need for an algorithm that divides video sequences into independently moving objects. Because our focus is on video enhancement and structure from motion for consumer electronics, we strive for a low complexity solution. For still images, several approaches exist based on colour, but these lack in both speed and segmentation quality. For instance, colour-based watershed algorithms produce a so-called oversegmentation with many segments covering each single physical object. Other colour segmentation approaches exist which somehow limit the number of segments to reduce this oversegmentation problem. However, this often results in inaccurate edges or even missed objects. Most likely, colour is an inherently insufficient cue for real world object segmentation, because real world objects can display complex combinations of colours. For video sequences, however, an additional cue is available, namely the motion of objects. When different objects in a scene have different motion, the motion cue alone is often enough to reliably distinguish objects from one another and the background. However, because of the lack of sufficient resolution of efficient motion estimators, like the 3DRS block matcher, the resulting segmentation is not at pixel resolution, but at block resolution. Existing pixel resolution motion estimators are more sensitive to noise, suffer more from aperture problems or have less correspondence to the true motion of objects when compared to block-based approaches or are too computationally expensive. From its tendency to oversegmentation it is apparent that colour segmentation is particularly effective near edges of homogeneously coloured areas. 
On the other hand, block-based true motion estimation is particularly effective in heterogeneous areas, because heterogeneous areas improve the chance that a block is unique and thus decrease the chance of a wrong position producing a good match. Consequently, a number of methods exist which combine motion and colour segmentation. These methods use colour segmentation as a base for the motion segmentation and estimation, or perform an independent colour segmentation in parallel which is in some way combined with the motion segmentation. The presented method uses both techniques to complement each other by first segmenting on motion cues and then refining the segmentation with colour. To our knowledge, few methods exist which adopt this approach. One example is [meshrefine]. This method uses an irregular mesh, which hinders its efficient implementation in consumer electronics devices. Furthermore, the method produces a foreground/background segmentation, while our applications call for the segmentation of multiple objects.
NEW METHOD
As mentioned above, we start with motion segmentation and afterwards refine the edges of this segmentation with a pixel-resolution colour segmentation method. There are several reasons for this approach:
+ Motion segmentation does not produce the oversegmentation which colour segmentation methods normally produce, because objects are more likely to have colour discontinuities than motion discontinuities. In this way, the colour segmentation only has to be done at the edges of segments, confining the colour segmentation to a smaller part of the image. In such a part, it is more likely that the colour of an object is homogeneous.
+ This approach restricts the computationally expensive pixel-resolution colour segmentation to a subset of the image. Together with the very efficient 3DRS motion estimation algorithm, this helps to reduce the computational complexity.
+ The motion cue alone is often enough to reliably distinguish objects from one another and the background.
To obtain the motion vector fields, a variant of the 3DRS block-based motion estimator which analyses three frames of input was used. The 3DRS motion estimator is known for its ability to estimate motion vectors which closely resemble the true motion.
BLOCK-BASED MOTION SEGMENTATION
As mentioned above, we start with a block-resolution segmentation based on motion vectors. The presented method is inspired by the well-known K-means segmentation method [K-means]. Several other methods (e.g. [kmeansc]) adapt K-means for connectedness by adding a weighted shape error, which adds the additional difficulty of finding the correct weights for the shape parameters; these methods also often bias one particular pre-defined shape. The presented method, which we call K-regions, encourages connectedness because only blocks at the edges of segments may be assigned to another segment. This constrains the segmentation method to such a degree that it allows the method to use least squares for the robust fitting of affine motion models for each segment. Contrary to [parmkm], the segmentation step still operates on vectors instead of model parameters. To make sure the segmentation is temporally consistent, the segmentation of the previous frame is used as initialisation for every new frame. We also present a scheme which makes the algorithm independent of the initially chosen number of segments.
COLOUR-BASED INTRA-BLOCK SEGMENTATION
The block-resolution motion-based segmentation forms the starting point for the pixel-resolution segmentation, which is obtained by reclassifying pixels only at the edges of clusters. We assume that an edge between two objects can be found in either one of two neighbouring blocks that belong to different clusters. This assumption allows us to do the pixel-resolution segmentation on each pair of such neighbouring blocks separately. Because of the local nature of the segmentation, it largely avoids problems with heterogeneously coloured areas. Because no new segments are introduced in this step, it also does not suffer from oversegmentation problems, and the presented method has no problems with bifurcations. For the pixel-resolution segmentation itself, we reclassify pixels so as to optimise an error norm which favours similarly coloured regions and straight edges.
SEGMENTATION MEASURE
To assist in the evaluation of the proposed algorithm, we developed a quality metric. Because the problem does not have an exact specification, we decided to define a ground-truth output which we find desirable for a given input, and we define the segmentation quality as how different the segmentation is from this ground truth. Our measure enables us to evaluate oversegmentation and undersegmentation separately, and to identify which parts of a frame suffer from either. The proposed algorithm has been tested on several typical sequences.
CONCLUSIONS
In this abstract we presented a new video segmentation method which performs well in segmenting multiple independently moving foreground objects from each other and from the background. It combines the strong points of both colour and motion segmentation in the way we expected. One of its weak points is that the method suffers from undersegmentation when adjacent objects display similar motion; in sequences with detailed backgrounds the segmentation will sometimes display noisy edges. Apart from these results, we think that some of the techniques, and in particular the K-regions technique, may be useful for other two-dimensional data segmentation problems.
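The K-regions step can be sketched as K-means on the block motion-vector field in which only blocks on a segment edge may switch to a 4-neighbour's cluster, which keeps segments connected. This toy version uses per-cluster mean vectors instead of the affine motion models of the abstract, and all names are ours.

```python
import numpy as np

def k_regions(mv, labels, iters=10):
    """Iteratively refine a block segmentation of an H x W x 2 motion
    vector field: recompute cluster mean vectors, then let each block
    that borders another segment adopt the nearest neighbouring cluster."""
    H, W, _ = mv.shape
    for _ in range(iters):
        means = {k: mv[labels == k].mean(axis=0) for k in np.unique(labels)}
        new = labels.copy()
        for yy in range(H):
            for xx in range(W):
                nbrs = {labels[y2, x2] for y2, x2 in
                        [(yy - 1, xx), (yy + 1, xx), (yy, xx - 1), (yy, xx + 1)]
                        if 0 <= y2 < H and 0 <= x2 < W}
                if nbrs - {labels[yy, xx]}:        # block sits on a segment edge
                    cand = nbrs | {labels[yy, xx]}
                    new[yy, xx] = min(cand, key=lambda k: np.sum((mv[yy, xx] - means[k]) ** 2))
        labels = new
    return labels
```

Restricting reassignment to edge blocks is what lets the full method fit affine motion models robustly per segment, since segment membership changes only gradually.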
Motion-seeded object-based attention for dynamic visual imagery
NASA Astrophysics Data System (ADS)
Huber, David J.; Khosla, Deepak; Kim, Kyungnam
2017-05-01
This paper describes a novel system that finds and segments "objects of interest" from dynamic imagery (video). It (1) processes each frame using an advanced motion algorithm that pulls out regions exhibiting anomalous motion, and (2) extracts the boundary of each object of interest using a biologically inspired segmentation algorithm based on feature contours. The system uses a series of modular, parallel algorithms, which allows many complicated operations to be carried out in a very short time, and can serve as a front-end to a larger system that includes object recognition and scene understanding modules. Using this method, we show 90% accuracy with fewer than 0.1 false positives per frame of video, a significant improvement over detection using a baseline attention algorithm.
Telesign: a videophone system for sign language distant communication
NASA Astrophysics Data System (ADS)
Mozelle, Gerard; Preteux, Francoise J.; Viallet, Jean-Emmanuel
1998-09-01
This paper presents a low bit rate videophone system for deaf people communicating by means of sign language. Classic video conferencing systems have focused on head-and-shoulders sequences, which are not well suited for sign language video transmission, since hearing-impaired people also use their hands and arms to communicate. To address the above-mentioned functionality, we have developed a two-step content-based video coding system based on: (1) A segmentation step. Four or five video objects (VO) are extracted using a cooperative approach between color-based and morphological segmentation. (2) A coding step. The VOs are encoded using a standardized MPEG-4 video toolbox. Results of encoded sign language video sequences, presented for three target bit rates (32 kbits/s, 48 kbits/s and 64 kbits/s), demonstrate the efficiency of the approach presented in this paper.
A Secure and Robust Object-Based Video Authentication System
NASA Astrophysics Data System (ADS)
He, Dajun; Sun, Qibin; Tian, Qi
2004-12-01
An object-based video authentication system, which combines watermarking, error correction coding (ECC), and digital signature techniques, is presented for protecting the authenticity of video objects and their associated backgrounds. In this system, a set of angular radial transformation (ART) coefficients is selected as the feature to represent the video object and the background, respectively. ECC and cryptographic hashing are applied to those selected coefficients to generate the robust authentication watermark. This content-based, semifragile watermark is then embedded into the objects frame by frame before MPEG4 coding. In watermark embedding and extraction, groups of discrete Fourier transform (DFT) coefficients are randomly selected, and their energy relationships are employed to hide and extract the watermark. The experimental results demonstrate that our system is robust to MPEG4 compression, object segmentation errors, and some common object-based video processing such as object translation, rotation, and scaling, while securely preventing malicious object modifications. The proposed solution can be further incorporated into public key infrastructure (PKI).
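The energy-relationship idea can be illustrated with a toy single-bit embedder (our own simplification: one bit, one 8x8 block, Hermitian-symmetric scaling so the inverse DFT stays real; the real system additionally uses ART features, ECC, hashing and MPEG-4 coding, none of which is reproduced here).

```python
import numpy as np

def _mirror(idx, shape):
    """Conjugate-symmetric partners of flat DFT indices (keeps the signal real)."""
    u, v = np.unravel_index(idx, shape)
    return np.ravel_multi_index(((-u) % shape[0], (-v) % shape[1]), shape)

def embed_bit(block, bit, idx_a, idx_b, margin=5.0):
    """Hide one bit by enforcing an energy ordering between two coefficient groups.
    idx_a/idx_b: flat DFT indices, assumed disjoint, non-self-conjugate, nonzero energy."""
    F = np.fft.fft2(block)
    flat = F.ravel()
    ea = np.sum(np.abs(flat[idx_a]) ** 2)
    eb = np.sum(np.abs(flat[idx_b]) ** 2)
    if bit == 1 and ea <= eb + margin:              # want energy(A) > energy(B)
        scale = 1.01 * np.sqrt((eb + margin) / ea)
        for i in (idx_a, _mirror(idx_a, F.shape)):  # scale both halves of each pair
            flat[i] *= scale
    elif bit == 0 and eb <= ea + margin:            # want energy(B) > energy(A)
        scale = 1.01 * np.sqrt((ea + margin) / eb)
        for i in (idx_b, _mirror(idx_b, F.shape)):
            flat[i] *= scale
    return np.real(np.fft.ifft2(flat.reshape(F.shape)))

def extract_bit(block, idx_a, idx_b):
    """Recover the bit from the energy ordering alone (blind extraction)."""
    F = np.fft.fft2(block).ravel()
    return int(np.sum(np.abs(F[idx_a]) ** 2) > np.sum(np.abs(F[idx_b]) ** 2))
```

The `margin` is what makes the mark semifragile in this sketch: mild distortions (e.g. compression noise) shift the energies by less than the enforced gap, while replacing the object overturns the ordering.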
Self Occlusion and Disocclusion in Causal Video Object Segmentation
2015-12-18
computation is parameter-free, in contrast to [4, 32, 10]. Taylor et al. [30] perform layer segmentation in longer video sequences leveraging occlusion cues. [...] shows that our method recovers from errors in the first frame (short of failed detection). Figure 7 (sample visual results on FBMS-59) compares various state-of-the-art methods: Lee et al. [19], Grundmann et al. [14], Ochs et al. [23], Taylor et al. [30], and ours.
Brandes, Susanne; Mokhtari, Zeinab; Essig, Fabian; Hünniger, Kerstin; Kurzai, Oliver; Figge, Marc Thilo
2015-02-01
Time-lapse microscopy is an important technique to study the dynamics of various biological processes. The labor-intensive manual analysis of microscopy videos is increasingly replaced by automated segmentation and tracking methods. These methods are often limited to certain cell morphologies and/or cell stainings. In this paper, we present an automated segmentation and tracking framework that does not have these restrictions. In particular, our framework handles highly variable cell shapes and does not rely on any cell stainings. Our segmentation approach is based on a combination of spatial and temporal image variations to detect moving cells in microscopy videos. This method yields a sensitivity of 99% and a precision of 95% in object detection. The tracking of cells proceeds in several steps: single-cell tracking based on a nearest-neighbor approach, detection of cell-cell interactions and splitting of cell clusters, and finally combination of tracklets using methods from graph theory. The segmentation and tracking framework was applied to synthetic as well as experimental datasets with varying cell densities, implying different numbers of cell-cell interactions. We established a validation framework to measure the performance of our tracking technique. The cell tracking accuracy was found to be >99% for all datasets, indicating a high accuracy for connecting the detected cells between different time points. Copyright © 2014 Elsevier B.V. All rights reserved.
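A minimal version of the staining-free detection idea (combining temporal and spatial image variation) plus the nearest-neighbour linking step might look like this; the thresholds, the greedy linking, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def detect_moving_cells(prev, curr, t_thresh=0.5, s_thresh=0.25):
    """Flag pixels showing both strong temporal change and strong spatial structure."""
    temporal = np.abs(curr.astype(float) - prev.astype(float))
    gy, gx = np.gradient(curr.astype(float))
    spatial = np.hypot(gx, gy)
    return (temporal > t_thresh) & (spatial > s_thresh)

def link_nearest(cents_prev, cents_curr, max_dist=5.0):
    """Single-cell tracking: greedy nearest-neighbour assignment of centroids,
    gated by a maximum displacement per frame."""
    links = []
    for i, p in enumerate(cents_prev):
        d = [np.hypot(p[0] - q[0], p[1] - q[1]) for q in cents_curr]
        j = int(np.argmin(d))
        if d[j] <= max_dist:
            links.append((i, j))
    return links
```

In the framework described above, ambiguous links (cell-cell interactions, cluster splits) would then be resolved by the graph-theoretic tracklet combination step, which this sketch omits.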
Shot boundary detection and label propagation for spatio-temporal video segmentation
NASA Astrophysics Data System (ADS)
Piramanayagam, Sankaranaryanan; Saber, Eli; Cahill, Nathan D.; Messinger, David
2015-02-01
This paper proposes a two-stage algorithm for streaming video segmentation. In the first stage, shot boundaries are detected within a window of frames by comparing the dissimilarity between 2-D segmentations of each frame. In the second stage, the 2-D segments are propagated across the window of frames in both the spatial and temporal directions. The window is moved across the video to find all shot transitions and obtain spatio-temporal segments simultaneously. As opposed to techniques that operate on the entire video, the proposed approach consumes significantly less memory and enables segmentation of lengthy videos. We tested our segmentation-based shot detection method on the TRECVID 2007 video dataset and compared it with a block-based technique. Cut detection results on the TRECVID 2007 dataset indicate that our algorithm is comparable to the best of the block-based methods. The streaming video segmentation routine also achieves promising results on a challenging video segmentation benchmark database.
Tracking cells in Life Cell Imaging videos using topological alignments.
Mosig, Axel; Jäger, Stefan; Wang, Chaofeng; Nath, Sumit; Ersoy, Ilker; Palaniappan, Kannappan; Chen, Su-Shing
2009-07-16
With the increasing availability of live cell imaging technology, tracking cells and other moving objects in live cell videos has become a major challenge for bioimage informatics. An inherent problem for most cell tracking algorithms is over- or under-segmentation of cells - many algorithms tend to recognize one cell as several cells or vice versa. We propose to approach this problem through so-called topological alignments, which we apply to address the problem of linking segmentations of two consecutive frames in the video sequence. Starting from the output of a conventional segmentation procedure, we align pairs of consecutive frames through assigning sets of segments in one frame to sets of segments in the next frame. We achieve this through finding maximum weighted solutions to a generalized "bipartite matching" between two hierarchies of segments, where we derive weights from relative overlap scores of convex hulls of sets of segments. For solving the matching task, we rely on an integer linear program. Practical experiments demonstrate that the matching task can be solved efficiently in practice, and that our method is both effective and useful for tracking cells in data sets derived from a so-called Large Scale Digital Cell Analysis System (LSDCAS). The source code of the implementation is available for download from http://www.picb.ac.cn/patterns/Software/topaln.
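A stripped-down variant of the frame-linking step can be sketched as follows. Note the simplifications relative to the paper: one-to-one matching instead of the generalized set-to-set matching over segment hierarchies, exhaustive search instead of an integer linear program, and plain mask IoU as the overlap weight.

```python
import numpy as np
from itertools import permutations

def overlap_weight(a, b):
    """Relative overlap (intersection over union) of two boolean segment masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def best_matching(segs_t, segs_t1):
    """Maximum-weight matching of segments in frame t to segments in frame t+1.
    Exhaustive search; assumes len(segs_t) <= len(segs_t1) and small inputs."""
    W = np.array([[overlap_weight(a, b) for b in segs_t1] for a in segs_t])
    best_score, best_pairs = -1.0, []
    for perm in permutations(range(len(segs_t1)), len(segs_t)):
        score = sum(W[i, j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(perm))
    return best_pairs, best_score
```

The point of the paper's generalized formulation is precisely what this sketch cannot do: by matching *sets* of segments on both sides, an over-segmented cell in one frame can be linked to a single segment in the next, which is why the authors need an ILP rather than a plain assignment.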
Small Moving Vehicle Detection in a Satellite Video of an Urban Area
Yang, Tao; Wang, Xiwen; Yao, Bowei; Li, Jing; Zhang, Yanning; He, Zhannan; Duan, Wencheng
2016-01-01
Vehicle surveillance of a wide area allows us to learn much about daily activities and traffic information. With the rapid development of remote sensing, satellite video has become an important data source for vehicle detection, as it provides a broader field of surveillance. Existing work generally focuses on aerial video with moderately sized objects, relying on feature extraction. However, the moving vehicles in satellite video imagery range from just a few pixels to dozens of pixels and exhibit low contrast with respect to the background, which makes it hard to obtain usable appearance or shape information. In this paper, we look into the problem of moving vehicle detection in satellite imagery. To the best of our knowledge, this is the first work to deal with moving vehicle detection from satellite videos. Our approach consists of two stages: first, through foreground motion segmentation and trajectory accumulation, a scene motion heat map is dynamically built. Following this, a novel saliency-based background model which intensifies moving objects is presented to segment the vehicles in the hot regions. Qualitative and quantitative experiments on sequences from a recent Skybox satellite video dataset demonstrate that our approach achieves a high detection rate and a low false-alarm rate simultaneously. PMID:27657091
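The heat-map stage can be paraphrased in a few lines. This is a hedged sketch: the exponential decay, the threshold, and the names are our own illustrative choices, not necessarily the authors' exact formulation.

```python
import numpy as np

def update_heat_map(heat, fg_mask, decay=0.95):
    """Accumulate per-frame foreground motion into a slowly decaying heat map."""
    return heat * decay + fg_mask.astype(float)

def hot_regions(heat, thresh=1.0):
    """Regions with sustained motion; the saliency-based background model
    would then search only inside these for small vehicles."""
    return heat >= thresh

# simulate a vehicle moving along row 2 of a 5x10 scene
heat = np.zeros((5, 10))
for t in range(8):
    mask = np.zeros((5, 10), bool)
    mask[2, t] = True
    heat = update_heat_map(heat, mask)
```

Restricting the per-pixel saliency test to the hot regions is what keeps the false-alarm rate low despite the vehicles being only a few pixels in size.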
Video Salient Object Detection via Fully Convolutional Networks.
Wang, Wenguan; Shen, Jianbing; Shao, Ling
This paper proposes a deep learning model to efficiently detect salient regions in videos. It addresses two important issues: 1) deep video saliency model training with the absence of sufficiently large and pixel-wise annotated video data and 2) fast video saliency training and detection. The proposed deep video saliency network consists of two modules, for capturing the spatial and temporal saliency information, respectively. The dynamic saliency model, explicitly incorporating saliency estimates from the static saliency model, directly produces spatiotemporal saliency inference without time-consuming optical flow computation. We further propose a novel data augmentation technique that simulates video training data from existing annotated image data sets, which enables our network to learn diverse saliency information and prevents overfitting with the limited number of training videos. Leveraging our synthetic video data (150K video sequences) and real videos, our deep video saliency model successfully learns both spatial and temporal saliency cues, thus producing accurate spatiotemporal saliency estimate. We advance the state-of-the-art on the densely annotated video segmentation data set (MAE of .06) and the Freiburg-Berkeley Motion Segmentation data set (MAE of .07), and do so with much improved speed (2 fps with all steps).
NASA Astrophysics Data System (ADS)
Ciaramello, Francis M.; Hemami, Sheila S.
2007-02-01
For members of the Deaf Community in the United States, current communication tools include TTY/TDD services, video relay services, and text-based communication. With the growth of cellular technology, mobile sign language conversations are becoming a possibility. Proper coding techniques must be employed to compress American Sign Language (ASL) video for low-rate transmission while maintaining the quality of the conversation. In order to evaluate these techniques, an appropriate quality metric is needed. This paper demonstrates that traditional video quality metrics, such as PSNR, fail to predict subjective intelligibility scores. By considering the unique structure of ASL video, an appropriate objective metric is developed. Face and hand segmentation is performed using skin-color detection techniques. The distortions in the face and hand regions are optimally weighted and pooled across all frames to create an objective intelligibility score for a distorted sequence. The objective intelligibility metric performs significantly better than PSNR in terms of correlation with subjective responses.
Joint Multi-Leaf Segmentation, Alignment, and Tracking for Fluorescence Plant Videos.
Yin, Xi; Liu, Xiaoming; Chen, Jin; Kramer, David M
2018-06-01
This paper proposes a novel framework for fluorescence plant video processing. The plant research community is interested in the leaf-level photosynthetic analysis within a plant. A prerequisite for such analysis is to segment all leaves, estimate their structures, and track them over time. We identify this as a joint multi-leaf segmentation, alignment, and tracking problem. First, leaf segmentation and alignment are applied on the last frame of a plant video to find a number of well-aligned leaf candidates. Second, leaf tracking is applied on the remaining frames with leaf candidate transformation from the previous frame. We form two optimization problems with shared terms in their objective functions for leaf alignment and tracking respectively. A quantitative evaluation framework is formulated to evaluate the performance of our algorithm with four metrics. Two models are learned to predict the alignment accuracy and detect tracking failure respectively in order to provide guidance for subsequent plant biology analysis. The limitation of our algorithm is also studied. Experimental results show the effectiveness, efficiency, and robustness of the proposed method.
Bellaïche, Yohanns; Bosveld, Floris; Graner, François; Mikula, Karol; Remesíková, Mariana; Smísek, Michal
2011-01-01
In this paper, we present a novel algorithm for tracking cells in a time-lapse confocal microscopy movie of a Drosophila epithelial tissue during pupal morphogenesis. We consider a 2D + time video as a 3D static image, where frames are stacked atop each other, and using a spatio-temporal segmentation algorithm we obtain information about spatio-temporal 3D tubes representing the evolution of cells. The main idea for tracking is the use of two distance functions--the first one from the cells in the initial frame and the second one from the segmented boundaries. We track the cells backwards in time. The first distance function attracts the subsequently constructed cell trajectories to the cells in the initial frame, and the second one forces them to be close to the centerlines of the segmented tubular structures. This makes our tracking algorithm robust against noise and missing spatio-temporal boundaries. This approach can be generalized to 3D + time video analysis, where spatio-temporal tubes are 4D objects.
3D noise-resistant segmentation and tracking of unknown and occluded objects using integral imaging
NASA Astrophysics Data System (ADS)
Aloni, Doron; Jung, Jae-Hyun; Yitzhaky, Yitzhak
2017-10-01
Three-dimensional (3D) object segmentation and tracking can be useful in various computer vision applications, such as object surveillance for security, robot navigation, etc. We present a method for 3D multiple-object tracking using computational integral imaging, based on accurate 3D object segmentation. The method does not employ object detection by motion analysis in a video as conventionally performed (such as background subtraction or block matching). This means that the movement properties do not significantly affect the detection quality. The object detection is performed by analyzing static 3D image data obtained through computational integral imaging. With regard to previous works that used integral imaging data in such a scenario, the proposed method performs 3D tracking of objects without prior information about the objects in the scene, and it is found efficient under severe noise conditions.
A novel sub-shot segmentation method for user-generated video
NASA Astrophysics Data System (ADS)
Lei, Zhuo; Zhang, Qian; Zheng, Chi; Qiu, Guoping
2018-04-01
With the proliferation of user-generated videos, temporal segmentation is becoming a challenging problem. Traditional video temporal segmentation methods such as shot detection cannot handle unedited user-generated videos, since such videos often contain only one single long shot. We propose a novel temporal segmentation framework for user-generated video. It finds similar frames with a tree-partitioning min-Hash technique, constructs sparse temporally constrained affinity sub-graphs, and finally divides the video into sub-shot-level segments with a dense-neighbor-based clustering method. Experimental results show that our approach outperforms related methods. Furthermore, the results indicate that the proposed approach is able to segment user-generated videos at the level of an average human annotator.
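The min-Hash step for finding similar frames can be illustrated as follows, with frames reduced to sets of quantized feature IDs ("visual words"). The tree-partitioning variant and the paper's parameter choices are not reproduced; the hash family and signature length here are generic textbook choices.

```python
import numpy as np

PRIME = 2_147_483_647  # modulus for the universal hash family h(x) = (a*x + b) mod p

def minhash_signature(word_set, n_hashes=32, seed=0):
    """MinHash signature of a set of integer visual-word IDs (all < PRIME)."""
    rng = np.random.default_rng(seed)      # fixed seed: same hash family for every frame
    a = rng.integers(1, PRIME, size=n_hashes)
    b = rng.integers(0, PRIME, size=n_hashes)
    words = np.fromiter(word_set, dtype=np.int64)
    # for each hash function, keep the minimum hash value over the set
    return ((a[:, None] * words[None, :] + b[:, None]) % PRIME).min(axis=1)

def estimated_jaccard(sig1, sig2):
    """The fraction of agreeing minima estimates the Jaccard similarity of the sets."""
    return float(np.mean(sig1 == sig2))
```

Comparing short signatures instead of full feature sets is what makes pairwise frame comparison cheap enough to build the affinity sub-graphs over a whole video.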
Geographic Video 3d Data Model And Retrieval
NASA Astrophysics Data System (ADS)
Han, Z.; Cui, C.; Kong, Y.; Wu, H.
2014-04-01
Geographic video includes both spatial and temporal geographic features acquired through ground-based or non-ground-based cameras. With the popularity of video capture devices such as smartphones, the volume of user-generated geographic video clips has grown significantly, and this growth is quickly accelerating. Such a massive and increasing volume poses a major challenge to efficient video management and query. Most of today's video management and query techniques are based on signal-level content extraction, and are therefore unable to fully utilize the geographic information of the videos. This paper introduces a geographic video 3D data model based on spatial information. The main idea of the model is to utilize the location, trajectory and azimuth information acquired by sensors such as GPS receivers and 3D electronic compasses in conjunction with the video contents. The raw spatial information is synthesized into point, line, polygon and solid geometries according to camcorder parameters such as focal length and angle of view. Together with the video segment and video frame, we define three categories of geometry objects using the geometry model of the OGC Simple Features Specification for SQL. Video can then be queried by computing the spatial relations between query objects and these geometry objects, such as VFLocation, VSTrajectory, VSFOView and VFFovCone. We describe the query methods using the structured query language (SQL) in detail. The experiments indicate that the model is a multi-purpose, integrated, loosely coupled, flexible and extensible data model for the management of geographic stereo video.
Real-time people counting system using a single video camera
NASA Astrophysics Data System (ADS)
Lefloch, Damien; Cheikh, Faouzi A.; Hardeberg, Jon Y.; Gouton, Pierre; Picot-Clemente, Romain
2008-02-01
There is growing interest in video-based solutions for people monitoring and counting in business and security applications. Compared to classic sensor-based solutions, video-based ones allow for more versatile functionality and improved performance at lower cost. In this paper, we propose a real-time system for people counting based on a single low-end, non-calibrated video camera. The two main challenges addressed in this paper are: robust estimation of the scene background, and estimation of the number of real persons in merge-split scenarios. The latter are likely to occur whenever multiple persons move closely together, e.g. in shopping centers. Several persons may be considered a single person by automatic segmentation algorithms, due to occlusions or shadows, leading to under-counting. Therefore, to account for noise, illumination changes and changes in static objects, background subtraction is performed using an adaptive background model (updated over time based on motion information) and automatic thresholding. Furthermore, post-processing of the segmentation results is performed in the HSV color space to remove shadows. Moving objects are tracked using an adaptive Kalman filter, allowing a robust estimation of the objects' future positions even under heavy occlusion. The system is implemented in Matlab, and gives encouraging results even at high frame rates. Experimental results obtained on the PETS2006 datasets are presented at the end of the paper.
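The tracking step can be sketched with a textbook constant-velocity Kalman filter. This is our simplification: the paper's adaptive variant tunes its behaviour online, which is omitted here, and the class and parameter names are ours.

```python
import numpy as np

class Kalman2D:
    """Constant-velocity Kalman filter for tracking a blob centroid.
    State: [x, y, vx, vy]; measurement: [x, y]."""
    def __init__(self, x0, y0, q=1e-2, r=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4) * 10.0                       # large initial uncertainty
        self.F = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
        self.Q = np.eye(4) * q                          # process noise
        self.R = np.eye(2) * r                          # measurement noise

    def predict(self):
        """Time update; during occlusion this is the only step that runs."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        """Measurement update from a detected blob centroid z = (x, y)."""
        y = np.asarray(z, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```

Running `predict` without `update` for a few frames is exactly what carries a track through the heavy occlusions mentioned above: the state keeps coasting at its last estimated velocity until the person reappears.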
NASA Astrophysics Data System (ADS)
Hatze, Herbert; Baca, Arnold
1993-01-01
The development of noninvasive techniques for the determination of biomechanical body segment parameters (volumes, masses, the three principal moments of inertia, the three local coordinates of the segmental mass centers, etc.) receives increasing attention from the medical sciences (e.g., orthopaedic gait analysis), bioengineering, sport biomechanics, and the various space programs. In the present paper, a novel method is presented for determining body segment parameters rapidly and accurately. It is based on the video-image processing of four different body configurations and a finite mass-element human body model. The four video images of the subject in question are recorded against a black background, thus permitting the application of shape recognition procedures incorporating edge detection and calibration algorithms. In this way, a total of 181 object-space dimensions of the subject's body segments can be reconstructed and used as anthropometric input data for the mathematical finite mass-element body model. The latter comprises 17 segments (abdomino-thoracic, head-neck, shoulders, upper arms, forearms, hands, abdomino-pelvic, thighs, lower legs, feet) and enables the user to compute all the required segment parameters for each of the 17 segments by means of the associated computer program. The hardware requirements are an IBM-compatible PC (1 MB memory) operating under MS-DOS or PC-DOS (Version 3.1 onwards) and incorporating a VGA board with a feature connector for connecting it to a super video windows framegrabber board, for which a free 16-bit slot must be available. In addition, a VGA monitor (50-70 Hz, horizontal scan rate at least 31.5 kHz), a common video camera and recorder, and a simple rectangular calibration frame are required. The advantage of the new method lies in its ease of application, its comparatively high accuracy, and the rapid availability of the body segment parameters, which is particularly useful in clinical practice.
An example of its practical application illustrates the technique.
Video-assisted segmentation of speech and audio track
NASA Astrophysics Data System (ADS)
Pandit, Medha; Yusoff, Yusseri; Kittler, Josef; Christmas, William J.; Chilton, E. H. S.
1999-08-01
Video database research is commonly concerned with the storage and retrieval of visual information, involving sequence segmentation, shot representation and video clip retrieval. In multimedia applications, video sequences are usually accompanied by a sound track. The sound track contains potential cues to aid shot segmentation, such as different speakers, background music, singing and distinctive sounds. These different acoustic categories can be modeled to allow for effective database retrieval. In this paper, we address the problem of automatic segmentation of the audio track of multimedia material. This audio-based segmentation can be combined with video scene shot detection in order to partition the multimedia material into semantically significant segments.
Blurry-frame detection and shot segmentation in colonoscopy videos
NASA Astrophysics Data System (ADS)
Oh, JungHwan; Hwang, Sae; Tavanapong, Wallapak; de Groen, Piet C.; Wong, Johnny
2003-12-01
Colonoscopy is an important screening procedure for colorectal cancer. During this procedure, the endoscopist visually inspects the colon. Human inspection, however, is not without error. We hypothesize that colonoscopy videos may contain additional valuable information missed by the endoscopist. Video segmentation is the first necessary step for content-based video analysis and retrieval, providing efficient access to the important images and video segments in a large colonoscopy video database. Based on the unique characteristics of colonoscopy videos, we introduce a new scheme to detect and remove blurry frames, and to segment the videos into shots based on their contents. Our experimental results show that the average precision and recall of the proposed scheme are over 90% for the detection of non-blurry images. The proposed method of blurry-frame detection and shot segmentation is extensible to videos captured from other endoscopic procedures such as upper gastrointestinal endoscopy, enteroscopy, cystoscopy, and laparoscopy.
ETHOWATCHER: validation of a tool for behavioral and video-tracking analysis in laboratory animals.
Crispim Junior, Carlos Fernando; Pederiva, Cesar Nonato; Bose, Ricardo Chessini; Garcia, Vitor Augusto; Lino-de-Oliveira, Cilene; Marino-Neto, José
2012-02-01
We present ETHOWATCHER®, software developed to support ethography, object tracking and the extraction of kinematic variables from digital video files of laboratory animals. The tracking module allows controlled segmentation of the target from the background, extracting image attributes used to calculate the distance traveled, orientation, length, area and a path graph of the experimental animal. The ethography module allows recording of catalog-based behaviors from the environment or from video files, continuously or frame by frame. The output reports the duration, frequency and latency of each behavior and the sequence of events in a time-segmented format set by the user. Validation tests were conducted on the kinematic measurements and on the detection of known behavioral effects of drugs. This software is freely available at www.ethowatcher.ufsc.br. Copyright © 2011 Elsevier Ltd. All rights reserved.
Traffic Video Image Segmentation Model Based on Bayesian and Spatio-Temporal Markov Random Field
NASA Astrophysics Data System (ADS)
Zhou, Jun; Bao, Xu; Li, Dawei; Yin, Yongwen
2017-10-01
Traffic video is a kind of dynamic imagery whose background and foreground change over time, which gives rise to occlusions. In this situation, general-purpose methods have difficulty producing an accurate image segmentation. We propose a segmentation algorithm based on Bayesian inference and a Spatio-Temporal Markov Random Field (ST-MRF). The algorithm builds energy-function models of the observation field and the label field for motion sequence images with the Markov property. Then, following Bayes' rule, it exploits the interaction between the label field and the observation field (that is, the relationship between the label field's prior probability and the observation field's likelihood) to obtain the maximum a posteriori estimate of the label field, and applies the ICM (iterated conditional modes) algorithm to extract the moving objects, which completes the segmentation. Finally, plain ST-MRF segmentation and Bayesian segmentation combined with ST-MRF are compared. Experimental results show that the Bayesian method combined with ST-MRF segments faster than ST-MRF alone, with a small computing workload, and achieves a better segmentation effect, especially in heavy-traffic dynamic scenes.
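The MAP estimation loop can be sketched with a plain MRF and ICM: a Gaussian likelihood plays the role of the observation field and a Potts prior that of the label field. This is a spatial-only illustration under assumed parameters, not the paper's spatio-temporal model.

```python
import numpy as np

def icm_segment(obs, mu, sigma=1.0, beta=2.0, n_iter=10):
    """MAP label field via Iterated Conditional Modes.
    obs: (H, W) observed intensities; mu: per-class means (Gaussian likelihood);
    beta: Potts smoothness weight over the 4-neighbourhood label prior."""
    H, W = obs.shape
    # maximum-likelihood initialisation: nearest class mean per pixel
    labels = np.argmin((obs[..., None] - np.asarray(mu)) ** 2, axis=-1)
    for _ in range(n_iter):
        changed = False
        for y in range(H):
            for x in range(W):
                best_k, best_e = labels[y, x], np.inf
                for k in range(len(mu)):
                    # observation-field term: negative Gaussian log-likelihood
                    e = (obs[y, x] - mu[k]) ** 2 / (2 * sigma ** 2)
                    # label-field term: Potts penalty for each disagreeing neighbour
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        if 0 <= y + dy < H and 0 <= x + dx < W and labels[y + dy, x + dx] != k:
                            e += beta
                    if e < best_e:
                        best_e, best_k = e, k
                if best_k != labels[y, x]:
                    labels[y, x] = best_k
                    changed = True
        if not changed:
            break
    return labels
```

Minimising the data term plus the prior term pixel-by-pixel is exactly the posterior-maximisation described above; the spatio-temporal version would simply extend the neighbourhood into the previous frame.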
Knowledge-based understanding of aerial surveillance video
NASA Astrophysics Data System (ADS)
Cheng, Hui; Butler, Darren
2006-05-01
Aerial surveillance has long been used by the military to locate, monitor and track the enemy. Recently, its scope has expanded to include law enforcement activities, disaster management and commercial applications. With the ever-growing amount of aerial surveillance video acquired daily, there is an urgent need for extracting actionable intelligence in a timely manner. Furthermore, to support high-level video understanding, this analysis needs to go beyond current approaches and consider the relationships, motivations and intentions of the objects in the scene. In this paper we propose a system for interpreting aerial surveillance videos that automatically generates a succinct but meaningful description of the observed regions, objects and events. For a given video, the semantics of important regions and objects, and the relationships between them, are summarised into a semantic concept graph. From this, a textual description is derived that provides new search and indexing options for aerial video and enables the fusion of aerial video with other information modalities, such as human intelligence, reports and signal intelligence. Using a Mixture-of-Experts video segmentation algorithm an aerial video is first decomposed into regions and objects with predefined semantic meanings. The objects are then tracked and coerced into a semantic concept graph and the graph is summarized spatially, temporally and semantically using ontology guided sub-graph matching and re-writing. The system exploits domain specific knowledge and uses a reasoning engine to verify and correct the classes, identities and semantic relationships between the objects. This approach is advantageous because misclassifications lead to knowledge contradictions and hence they can be easily detected and intelligently corrected. In addition, the graph representation highlights events and anomalies that a low-level analysis would overlook.
A motion compensation technique using sliced blocks and its application to hybrid video coding
NASA Astrophysics Data System (ADS)
Kondo, Satoshi; Sasai, Hisao
2005-07-01
This paper proposes a new motion compensation method using "sliced blocks" in DCT-based hybrid video coding. In H.264/MPEG-4 Advanced Video Coding, a recent international video coding standard, motion compensation can be performed by splitting macroblocks into multiple square or rectangular regions. In the proposed method, on the other hand, macroblocks or sub-macroblocks are divided into two regions (sliced blocks) by an arbitrary line segment. As a result, the shapes of the segmented regions are not limited to squares or rectangles, allowing them to better match the boundaries between moving objects. Thus, the proposed method can improve the performance of motion compensation. In addition, adaptive prediction of the shape from the region shapes of the surrounding macroblocks reduces the overhead of describing shape information in the bitstream. The proposed method also has the advantage that conventional coding techniques, such as mode decision using rate-distortion optimization, can be utilized, since coding processes such as the frequency transform and quantization are performed on a macroblock basis, as in conventional coding methods. The proposed method is implemented in an H.264-based P-picture codec, and a bit-rate improvement of 5% is confirmed in comparison with H.264.
Context indexing of digital cardiac ultrasound records in PACS
NASA Astrophysics Data System (ADS)
Lobodzinski, S. Suave; Meszaros, Georg N.
1998-07-01
Recent wide adoption of the DICOM 3.0 standard by ultrasound equipment vendors created a need for practical clinical implementations of cardiac imaging study visualization, management and archiving. DICOM 3.0 defines only a logical and physical format for exchanging image data (still images, video, patient and study demographics). All DICOM compliant imaging studies must presently be archived on a 650 MB recordable compact disk. This is a severe limitation for ultrasound applications, where studies 3 to 10 minutes long are common practice. In addition, DICOM digital echocardiography objects require physiological signal indexing, content segmentation and characterization. Since DICOM 3.0 is an interchange standard only, it does not define how to database composite video objects. The goal of this research was therefore to address the issues of efficient storage, retrieval and management of DICOM compliant cardiac video studies in a distributed PACS environment. Our Web based implementation has the advantage of accommodating both DICOM defined entity-relation modules (equipment data, patient data, video format, etc.) in standard relational database tables and digital indexed video with its attributes in an object relational database. The object relational data model facilitates content indexing of full motion cardiac imaging studies through bi-directional hyperlink generation that ties searchable video attributes and related objects to individual video frames in the temporal domain. Benefits realized from use of bi-directionally hyperlinked data models in an object relational database include: (1) real time video indexing during image acquisition, (2) random access and frame accurate instant playback of previously recorded full motion imaging data, and (3) time savings from faster and more accurate access to data through multiple navigation mechanisms such as multidimensional queries on an index, queries on a hyperlink attribute, free search and browsing.
Automatic multiple zebrafish larvae tracking in unconstrained microscopic video conditions.
Wang, Xiaoying; Cheng, Eva; Burnett, Ian S; Huang, Yushi; Wlodkowic, Donald
2017-12-14
The accurate tracking of zebrafish larvae movement is fundamental to research in many biomedical, pharmaceutical, and behavioral science applications. However, the locomotive characteristics of zebrafish larvae are significantly different from those of adult zebrafish, so existing adult zebrafish tracking systems cannot reliably track larvae. Further, the much smaller size of larvae relative to the container makes the detection of water impurities inevitable, which further degrades larvae tracking or requires very strict video imaging conditions that typically yield unreliable results under realistic experimental conditions. This paper investigates the adaptation of advanced computer vision segmentation techniques and multiple object tracking algorithms to develop an accurate, efficient and reliable multiple zebrafish larvae tracking system. The proposed system has been tested on a set of single and multiple adult and larvae zebrafish videos in a wide variety of (complex) video conditions, including shadowing, labels, water bubbles and background artifacts. Compared with existing state-of-the-art and commercial multiple organism tracking systems, the proposed system improves the tracking accuracy by up to 31.57% in unconstrained video imaging conditions. To facilitate the evaluation of zebrafish segmentation and tracking research, a dataset with annotated ground truth is also presented. The software is also publicly accessible.
Resolving occlusion and segmentation errors in multiple video object tracking
NASA Astrophysics Data System (ADS)
Cheng, Hsu-Yung; Hwang, Jenq-Neng
2009-02-01
In this work, we propose a method to integrate the Kalman filter and adaptive particle sampling for multiple video object tracking. The proposed framework is able to detect occlusion and segmentation error cases and perform adaptive particle sampling for accurate measurement selection. Compared with traditional particle filter based tracking methods, the proposed method generates particles only when necessary. With the concept of adaptive particle sampling, we can avoid the degeneracy problem because the sampling position and range are dynamically determined by parameters that are updated by Kalman filters. There is no need to spend time processing particles with very small weights. The adaptive appearance for the occluded object refers to the prediction results of the Kalman filters to determine the region that should be updated, avoiding the problem of using inadequate information to update the appearance under occlusion. The experimental results show that a small number of particles is sufficient to achieve high positioning and scaling accuracy. Also, the employment of adaptive appearance substantially improves the positioning and scaling accuracy of the tracking results.
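The core idea of Kalman-guided particle sampling can be sketched in a few lines: the filter's predicted position centers the sampling window, and the predicted variance sets its spread, so few particles are wasted. This is a minimal 1D constant-velocity sketch of the idea, not the paper's exact model (the matrices and noise values are illustrative):

```python
import numpy as np

# Constant-velocity Kalman filter over state [position, velocity].
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition
H = np.array([[1.0, 0.0]])               # we measure position only
Q = np.eye(2) * 0.01                     # process noise
R = np.array([[1.0]])                    # measurement noise

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (np.array([z]) - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

def sample_particles(x, P, n=50, rng=None):
    # Particles are drawn only around the Kalman prediction; the spread
    # follows the predicted position variance, so fewer particles suffice.
    rng = rng or np.random.default_rng(0)
    std = np.sqrt(P[0, 0])
    return x[0] + rng.normal(0.0, std, size=n)

x, P = np.array([0.0, 1.0]), np.eye(2)
for z in [1.1, 2.0, 2.9]:                # noisy positions of a tracked object
    x, P = predict(x, P)
    x, P = update(x, P, z)
particles = sample_particles(x, P)
```

In the paper's setting, the particles would carry 2D position and scale, and a particle's weight would come from an appearance model evaluated at its hypothesis.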
Motion video analysis using planar parallax
NASA Astrophysics Data System (ADS)
Sawhney, Harpreet S.
1994-04-01
Motion and structure analysis in video sequences can lead to efficient descriptions of objects and their motions. Interesting events in videos can be detected using such an analysis: for instance, independent object motion when the camera itself is moving, or figure-ground segregation based on the saliency of a structure compared to its surroundings. In this paper we present a method for 3D motion and structure analysis that uses a planar surface in the environment as a reference coordinate system to describe a video sequence. The motion in the video sequence is described as the motion of the reference plane plus the parallax motion of all the non-planar components of the scene. It is shown how this method simplifies the otherwise hard general 3D motion analysis problem. In addition, a natural coordinate system in the environment is used to describe the scene, which can simplify motion based segmentation. This work is part of an ongoing effort in our group towards video annotation and analysis for indexing and retrieval. Results from a demonstration system under development are presented.
NASA Astrophysics Data System (ADS)
Gohatre, Umakant Bhaskar; Patil, Venkat P.
2018-04-01
In computer vision, multiple-object detection and tracking in real time is an important research field that has attracted considerable attention in recent years for locating non-stationary entities in image sequences. Object detection is the first step toward following a moving object in video, and object representation is the step that enables tracking. Identifying multiple objects in a video sequence is a challenging task. Image registration has long been used as a basis for detecting moving objects: registration finds correspondences between consecutive frame pairs based on image appearance under rigid and affine transformations. However, image registration alone is not well suited to handling events that can result in missed objects. To address these problems, this paper proposes a novel approach. Video frames are segmented using a region adjacency graph of visual appearance and geometric properties; matching between graph sequences is then performed using multi-graph matching, and matched regions are labeled by a proposed graph coloring algorithm that assigns a foreground label to each corresponding region. The proposed design is robust to unknown transformations and shows significant improvement over existing work on detecting multiple moving objects under real-time constraints.
NASA Astrophysics Data System (ADS)
Hasan, Taufiq; Bořil, Hynek; Sangwan, Abhijeet; Hansen, John H. L.
2013-12-01
The ability to detect and organize 'hot spots' representing areas of excitement within video streams is a challenging research problem when techniques rely exclusively on video content. A generic method for sports video highlight selection is presented in this study which leverages both video/image structure and audio/speech properties. Processing begins by partitioning the video into small segments and extracting several multi-modal features from each segment. Excitability is computed based on the likelihood of the segmental features residing in regions of their joint probability density function space that are considered both exciting and rare. The proposed measure is used to rank order the partitioned segments to compress the overall video sequence and produce a contiguous set of highlights. Experiments are performed on baseball videos based on signal processing advancements for excitement assessment in the commentators' speech, audio energy, slow motion replay, scene cut density, and motion activity as features. A detailed analysis of the correlation between user excitability and various speech production parameters is conducted, and an effective scheme is designed to estimate the excitement level of the commentator's speech from the sports videos. Subjective evaluation of excitability and ranking of video segments demonstrate a higher correlation with the proposed measure compared to well-established techniques, indicating the effectiveness of the overall approach.
Segment scheduling method for reducing 360° video streaming latency
NASA Astrophysics Data System (ADS)
Gudumasu, Srinivas; Asbun, Eduardo; He, Yong; Ye, Yan
2017-09-01
360° video is an emerging new format in the media industry enabled by the growing availability of virtual reality devices. It provides the viewer a new sense of presence and immersion. Compared to conventional rectilinear video (2D or 3D), 360° video poses a new and difficult set of engineering challenges on video processing and delivery. Enabling comfortable and immersive user experience requires very high video quality and very low latency, while the large video file size poses a challenge to delivering 360° video in a quality manner at scale. Conventionally, 360° video represented in equirectangular or other projection formats can be encoded as a single standards-compliant bitstream using existing video codecs such as H.264/AVC or H.265/HEVC. Such method usually needs very high bandwidth to provide an immersive user experience. While at the client side, much of such high bandwidth and the computational power used to decode the video are wasted because the user only watches a small portion (i.e., viewport) of the entire picture. Viewport dependent 360°video processing and delivery approaches spend more bandwidth on the viewport than on non-viewports and are therefore able to reduce the overall transmission bandwidth. This paper proposes a dual buffer segment scheduling algorithm for viewport adaptive streaming methods to reduce latency when switching between high quality viewports in 360° video streaming. The approach decouples the scheduling of viewport segments and non-viewport segments to ensure the viewport segment requested matches the latest user head orientation. A base layer buffer stores all lower quality segments, and a viewport buffer stores high quality viewport segments corresponding to the most recent viewer's head orientation. The scheduling scheme determines viewport requesting time based on the buffer status and the head orientation. 
This paper also discusses how to deploy the proposed scheduling design for various viewport adaptive video streaming methods. The proposed dual buffer segment scheduling method is implemented in an end-to-end tile based 360° viewports adaptive video streaming platform, where the entire 360° video is divided into a number of tiles, and each tile is independently encoded into multiple quality level representations. The client requests different quality level representations of each tile based on the viewer's head orientation and the available bandwidth, and then composes all tiles together for rendering. The simulation results verify that the proposed dual buffer segment scheduling algorithm reduces the viewport switch latency, and utilizes available bandwidth more efficiently. As a result, a more consistent immersive 360° video viewing experience can be presented to the user.
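The dual-buffer idea above separates a base-layer buffer, which holds low-quality segments covering the full 360° picture, from a shallow viewport buffer that is refilled against the latest head orientation. The sketch below illustrates that decoupling; the class name, buffer depths, and request policy are our own illustrative choices, not the paper's algorithm:

```python
from collections import deque

class DualBufferScheduler:
    """Sketch of dual-buffer viewport-adaptive segment scheduling.

    The base buffer is filled first so playback never stalls; viewport-quality
    segments are requested only against the current head orientation, so a
    late request still matches where the viewer is actually looking.
    """

    def __init__(self, base_depth=4, viewport_depth=2):
        self.base_depth = base_depth
        self.base = deque()
        self.viewport = deque(maxlen=viewport_depth)  # kept shallow on purpose

    def next_request(self, head_orientation):
        if len(self.base) < self.base_depth:
            seg = ("base", len(self.base))            # low-quality, full picture
            self.base.append(seg)
            return seg
        seg = ("viewport", head_orientation)          # high quality, current view
        self.viewport.append(seg)
        return seg

sched = DualBufferScheduler()
requests = [sched.next_request("front") for _ in range(4)]
requests.append(sched.next_request("left"))   # head turned: viewport follows
```

Keeping the viewport buffer shallow is what bounds the switch latency: at most `viewport_depth` stale high-quality segments can ever be queued ahead of the new orientation.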
Study of moving object detecting and tracking algorithm for video surveillance system
NASA Astrophysics Data System (ADS)
Wang, Tao; Zhang, Rongfu
2010-10-01
This paper describes a process for moving-target detection and tracking in video surveillance. Obtaining a high-quality background is the key to achieving differential target detection. The paper builds a clean background using a block-based segmentation method and detects moving targets by background differencing; after a series of post-processing steps, a more complete object is extracted from the original image and located with its smallest bounding rectangle. In video surveillance systems, camera delay and other factors cause tracking lag, so a Kalman-filter model based on template matching is proposed. Using the predictive and estimation capability of the Kalman filter, the center of the smallest bounding rectangle serves as the predicted value of the object's position in the next frame. Template matching is then performed in a region centered on this predicted position: the cross-correlation similarity between the current image and the reference image determines the best matching center. Narrowing the search scope in this way reduces the search time and achieves fast tracking.
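The detection step described above, background differencing followed by locating the smallest bounding rectangle, can be sketched in a few lines of numpy. This is a minimal illustration of the idea (the function name and threshold are ours; the paper's morphological cleanup and block-based background modeling are omitted):

```python
import numpy as np

def detect_moving_object(frame, background, thresh=30):
    """Background difference + smallest bounding rectangle.

    Returns (x0, y0, x1, y1) of the changed region, or None if no pixel
    differs from the background by more than `thresh`.
    """
    diff = np.abs(frame.astype(int) - background.astype(int)) > thresh
    ys, xs = np.nonzero(diff)
    if len(ys) == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

background = np.zeros((64, 64), dtype=np.uint8)   # clean background model
frame = background.copy()
frame[20:30, 40:50] = 255                         # a bright moving object
box = detect_moving_object(frame, background)
```

The Kalman filter would then predict where this rectangle's center moves next, and template matching would search only a small window around that prediction.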
Surgical gesture segmentation and recognition.
Tao, Lingling; Zappella, Luca; Hager, Gregory D; Vidal, René
2013-01-01
Automatic surgical gesture segmentation and recognition can provide useful feedback for surgical training in robotic surgery. Most prior work in this field relies on the robot's kinematic data. Although recent work [1,2] shows that the robot's video data can be equally effective for surgical gesture recognition, the segmentation of the video into gestures is assumed to be known. In this paper, we propose a framework for joint segmentation and recognition of surgical gestures from kinematic and video data. Unlike prior work that relies on either frame-level kinematic cues, or segment-level kinematic or video cues, our approach exploits both cues by using a combined Markov/semi-Markov conditional random field (MsM-CRF) model. Our experiments show that the proposed model improves over a Markov or semi-Markov CRF when using video data alone, gives results that are comparable to state-of-the-art methods on kinematic data alone, and improves over state-of-the-art methods when combining kinematic and video data.
Logo recognition in video by line profile classification
NASA Astrophysics Data System (ADS)
den Hollander, Richard J. M.; Hanjalic, Alan
2003-12-01
We present an extension to earlier work on recognizing logos in video stills. The logo instances considered here are rigid planar objects observed at a distance in the scene, so the possible perspective transformation can be approximated by an affine transformation. For this reason we can classify the logos by matching (invariant) line profiles. We enhance our previous method by considering multiple line profiles instead of a single profile of the logo. The positions of the lines are based on maxima in the Hough transform space of the segmented logo foreground image. Experiments are performed on MPEG1 sport video sequences to show the performance of the proposed method.
Automatic video segmentation and indexing
NASA Astrophysics Data System (ADS)
Chahir, Youssef; Chen, Liming
1999-08-01
Indexing is an important aspect of video database management. Video indexing involves the analysis of video sequences, which is a computationally intensive process, and effective management of digital video requires robust indexing techniques. The main purpose of our proposed video segmentation is twofold. First, we develop an algorithm that identifies camera shot boundaries, based on a combination of color histograms and a block-based technique. Next, each temporal segment is represented by a color reference frame, which captures shot similarity and is used to group shots into scenes. Experimental results on a variety of videos selected from the corpus of the French Audiovisual National Institute are presented to demonstrate the effectiveness of the shot detection, the content characterization of shots, and the scene constitution.
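Color-histogram shot boundary detection of the kind described above reduces to comparing the intensity distributions of adjacent frames and flagging large jumps. A minimal sketch (bin count and threshold are illustrative choices, not the paper's tuned values, and the block-based refinement is omitted):

```python
import numpy as np

def hist_diff(f1, f2, bins=16):
    """L1 distance between normalized gray-level histograms, in [0, 2]."""
    h1, _ = np.histogram(f1, bins=bins, range=(0, 256))
    h2, _ = np.histogram(f2, bins=bins, range=(0, 256))
    return np.abs(h1 / h1.sum() - h2 / h2.sum()).sum()

def shot_boundaries(frames, thresh=0.5):
    """Indices i where a cut occurs between frames i-1 and i."""
    return [i for i in range(1, len(frames))
            if hist_diff(frames[i - 1], frames[i]) > thresh]

rng = np.random.default_rng(0)
# Shot A: dark frames with slight noise; shot B: uniformly bright frames.
shot_a = [np.full((32, 32), 40, np.uint8)
          + rng.integers(0, 5, (32, 32), dtype=np.uint8) for _ in range(5)]
shot_b = [np.full((32, 32), 200, np.uint8) for _ in range(5)]
cuts = shot_boundaries(shot_a + shot_b)
```

Each detected segment would then be summarized by a color reference frame used for scene grouping.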
Activity recognition using Video Event Segmentation with Text (VEST)
NASA Astrophysics Data System (ADS)
Holloway, Hillary; Jones, Eric K.; Kaluzniacki, Andrew; Blasch, Erik; Tierno, Jorge
2014-06-01
Multi-Intelligence (multi-INT) data includes video, text, and signals that require analysis by operators. Analysis methods include information fusion approaches such as filtering, correlation, and association. In this paper, we discuss the Video Event Segmentation with Text (VEST) method, which provides event boundaries of an activity to compile related message and video clips for future interest. VEST infers meaningful activities by clustering multiple streams of time-sequenced multi-INT intelligence data and derived fusion products. We discuss exemplar results that segment raw full-motion video (FMV) data by using extracted commentary message timestamps, FMV metadata, and user-defined queries.
Detection and tracking of gas plumes in LWIR hyperspectral video sequence data
NASA Astrophysics Data System (ADS)
Gerhart, Torin; Sunu, Justin; Lieu, Lauren; Merkurjev, Ekaterina; Chang, Jen-Mei; Gilles, Jérôme; Bertozzi, Andrea L.
2013-05-01
Automated detection of chemical plumes presents a segmentation challenge. The segmentation problem for gas plumes is difficult due to the diffusive nature of the cloud. The advantage of considering hyperspectral images in the gas plume detection problem over conventional RGB imagery is the presence of non-visual data, allowing for a richer representation of information. In this paper we present an effective method of visualizing hyperspectral video sequences containing chemical plumes and investigate the effectiveness of segmentation techniques on these post-processed videos. Our approach uses a combination of dimension reduction and histogram equalization to prepare the hyperspectral videos for segmentation. First, Principal Components Analysis (PCA) is used to reduce the dimension of the entire video sequence. This is done by projecting each pixel onto the first few Principal Components, resulting in a type of spectral filter. Next, a Midway method for histogram equalization is used. These methods redistribute the intensity values in order to reduce flicker between frames. This properly prepares these high-dimensional video sequences for more traditional segmentation techniques. We compare the ability of various clustering techniques to properly segment the chemical plume. These include K-means, spectral clustering, and the Ginzburg-Landau functional.
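The PCA "spectral filter" step above projects every pixel's spectrum onto the top few principal components of the band covariance. A minimal numpy sketch, assuming a toy cube of shape (frames, height, width, bands); the Midway histogram equalization step is not shown:

```python
import numpy as np

def pca_project(cube, k=3):
    """Project each pixel's spectrum onto the first k principal components.

    cube: (frames, height, width, bands) hyperspectral video sequence.
    Returns a (frames, height, width, k) reduced sequence.
    """
    t, h, w, b = cube.shape
    X = cube.reshape(-1, b).astype(float)
    X -= X.mean(axis=0)                          # center the spectra
    cov = X.T @ X / (X.shape[0] - 1)             # band covariance matrix
    vals, vecs = np.linalg.eigh(cov)             # eigh returns ascending order
    top = vecs[:, np.argsort(vals)[::-1][:k]]    # k largest-variance directions
    return (X @ top).reshape(t, h, w, k)

rng = np.random.default_rng(0)
cube = rng.random((4, 8, 8, 20))                 # toy 20-band sequence
reduced = pca_project(cube, k=3)
```

Computing the components once over the whole sequence, rather than per frame, is what keeps the projected video temporally consistent before equalization.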
Video sensor architecture for surveillance applications.
Sánchez, Jordi; Benet, Ginés; Simó, José E
2012-01-01
This paper introduces a flexible hardware and software architecture for a smart video sensor. This sensor has been applied in a video surveillance application where some of these video sensors are deployed, constituting the sensory nodes of a distributed surveillance system. In this system, a video sensor node processes images locally in order to extract objects of interest, and classify them. The sensor node reports the processing results to other nodes in the cloud (a user or higher level software) in the form of an XML description. The hardware architecture of each sensor node has been developed using two DSP processors and an FPGA that controls, in a flexible way, the interconnection among processors and the image data flow. The developed node software is based on pluggable components and runs on a provided execution run-time. Some basic and application-specific software components have been developed, in particular: acquisition, segmentation, labeling, tracking, classification and feature extraction. Preliminary results demonstrate that the system can achieve up to 7.5 frames per second in the worst case, and the true positive rates in the classification of objects are better than 80%.
Perspective Taking Promotes Action Understanding and Learning
ERIC Educational Resources Information Center
Lozano, Sandra C.; Martin Hard, Bridgette; Tversky, Barbara
2006-01-01
People often learn actions by watching others. The authors propose and test the hypothesis that perspective taking promotes encoding a hierarchical representation of an actor's goals and subgoals-a key process for observational learning. Observers segmented videos of an object assembly task into coarse and fine action units. They described what…
Activity-based exploitation of Full Motion Video (FMV)
NASA Astrophysics Data System (ADS)
Kant, Shashi
2012-06-01
Video has been a game-changer in how US forces are able to find, track and defeat their adversaries. With millions of minutes of video being generated from an increasing number of sensor platforms, the DOD has stated that the rapid increase in video is overwhelming their analysts. The manpower required to view and garner usable information from the flood of video is unaffordable, especially in light of current fiscal restraints. "Search" within full-motion video has traditionally relied on human tagging of content and video metadata to provide filtering and locate segments of interest in the context of an analyst's query. Our approach utilizes a novel machine-vision based approach to index FMV, using object recognition and tracking, event detection, and activity detection. This approach enables FMV exploitation in real time, as well as a forensic look-back within archives. It can help extract the most information from video sensor collection, help overburdened analysts focus their attention and form connections in activity over time, and conserve national fiscal resources in exploiting FMV.
Study of Temporal Effects on Subjective Video Quality of Experience.
Bampis, Christos George; Zhi Li; Moorthy, Anush Krishna; Katsavounidis, Ioannis; Aaron, Anne; Bovik, Alan Conrad
2017-11-01
HTTP adaptive streaming is being increasingly deployed by network content providers, such as Netflix and YouTube. By dividing video content into data chunks encoded at different bitrates, a client is able to request the appropriate bitrate for the segment to be played next based on the estimated network conditions. However, this can introduce a number of impairments, including compression artifacts and rebuffering events, which can severely impact an end-user's quality of experience (QoE). We have recently created a new video quality database, which simulates a typical video streaming application, using long video sequences and interesting Netflix content. Going beyond previous efforts, the new database contains highly diverse and contemporary content, and it includes the subjective opinions of a sizable number of human subjects regarding the effects on QoE of both rebuffering and compression distortions. We observed that rebuffering is always obvious and unpleasant to subjects, while bitrate changes may be less obvious due to content-related dependencies. Transient bitrate drops were preferable over rebuffering only on low complexity video content, while consistently low bitrates were poorly tolerated. We evaluated different objective video quality assessment algorithms on our database and found that objective video quality models are unreliable for QoE prediction on videos suffering from both rebuffering events and bitrate changes. This implies the need for more general QoE models that take into account objective quality models, rebuffering-aware information, and memory. The publicly available video content as well as metadata for all of the videos in the new database can be found at http://live.ece.utexas.edu/research/LIVE_NFLXStudy/nflx_index.html.
A spatiotemporal decomposition strategy for personal home video management
NASA Astrophysics Data System (ADS)
Yi, Haoran; Kozintsev, Igor; Polito, Marzia; Wu, Yi; Bouguet, Jean-Yves; Nefian, Ara; Dulong, Carole
2007-01-01
With the advent and proliferation of low cost and high performance digital video recorder devices, an increasing number of personal home video clips are recorded and stored by consumers. Compared to image data, video data is larger in size and richer in multimedia content, so efficient access to video content is expected to be more challenging than image mining. Previously, we developed a content-based image retrieval system and a benchmarking framework for personal images. In this paper, we extend our personal image retrieval system to include personal home video clips. A possible initial solution to video mining is to represent video clips by a set of key frames extracted from them, thus converting the problem into an image search one. Here we report that a careful selection of key frames may improve the retrieval accuracy. However, because video also has a temporal dimension, its key frame representation is inherently limited. The use of temporal information can give us a better representation of video content at the semantic object and concept levels than an image-only representation. In this paper we propose a bottom-up framework combining interest point tracking, image segmentation and motion-shape factorization to decompose the video into spatiotemporal regions. We show an example application of activity concept detection using the trajectories extracted from the spatiotemporal regions. The proposed approach shows good potential for concise representation and indexing of objects and their motion in real-life consumer video.
Subjective evaluation of H.265/HEVC based dynamic adaptive video streaming over HTTP (HEVC-DASH)
NASA Astrophysics Data System (ADS)
Irondi, Iheanyi; Wang, Qi; Grecos, Christos
2015-02-01
The Dynamic Adaptive Streaming over HTTP (DASH) standard is becoming increasingly popular for real-time adaptive HTTP streaming of internet video in response to unstable network conditions. Integration of DASH streaming techniques with the new H.265/HEVC video coding standard is a promising area of research. The performance of HEVC-DASH systems has been previously evaluated by a few researchers using objective metrics, however subjective evaluation would provide a better measure of the user's Quality of Experience (QoE) and overall performance of the system. This paper presents a subjective evaluation of an HEVC-DASH system implemented in a hardware testbed. Previous studies in this area have focused on using the current H.264/AVC (Advanced Video Coding) or H.264/SVC (Scalable Video Coding) codecs and moreover, there has been no established standard test procedure for the subjective evaluation of DASH adaptive streaming. In this paper, we define a test plan for HEVC-DASH with a carefully justified data set employing longer video sequences that would be sufficient to demonstrate the bitrate switching operations in response to various network condition patterns. We evaluate the end user's real-time QoE online by investigating the perceived impact of delay, different packet loss rates, fluctuating bandwidth, and the perceived quality of using different DASH video stream segment sizes on a video streaming session using different video sequences. The Mean Opinion Score (MOS) results give an insight into the performance of the system and expectation of the users. The results from this study show the impact of different network impairments and different video segments on users' QoE and further analysis and study may help in optimizing system performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duberstein, Corey A.; Matzner, Shari; Cullinan, Valerie I.
Surveying wildlife at risk from offshore wind energy development is difficult and expensive. Infrared video can be used to record birds and bats that pass through the camera view, but it is also time consuming and expensive to review video and determine what was recorded. We proposed to conduct algorithm and software development to identify and to differentiate thermally detected targets of interest that would allow automated processing of thermal image data to enumerate birds, bats, and insects. During FY2012 we developed computer code within MATLAB to identify objects recorded in video and extract attribute information that describes the objects recorded. We tested the efficiency of track identification using observer-based counts of tracks within segments of sample video. We examined object attributes, modeled the effects of random variability on attributes, and produced data smoothing techniques to limit random variation within attribute data. We also began drafting and testing methodology to identify objects recorded on video. We also recorded approximately 10 hours of infrared video of various marine birds, passerine birds, and bats near the Pacific Northwest National Laboratory (PNNL) Marine Sciences Laboratory (MSL) at Sequim, Washington. A total of 6 hours of bird video was captured overlooking Sequim Bay over a series of weeks. An additional 2 hours of video of birds was also captured during two weeks overlooking Dungeness Bay within the Strait of Juan de Fuca. Bats and passerine birds (swallows) were also recorded at dusk on the MSL campus during nine evenings. An observer noted the identity of objects viewed through the camera concurrently with recording. These video files will provide the information necessary to produce and test software developed during FY2013. The annotation will also form the basis for creation of a method to reliably identify recorded objects.
Hierarchical video summarization based on context clustering
NASA Astrophysics Data System (ADS)
Tseng, Belle L.; Smith, John R.
2003-11-01
A personalized video summary is dynamically generated in our video personalization and summarization system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order to maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. Besides finding the desired contents, the objective is to present a coherent summary. There are diverse methods for creating summaries, and we focus on the challenges of generating a hierarchical video summary based on context information. In our summarization algorithm, three inputs are used to generate the hierarchical video summary output. These inputs are (1) MPEG-7 metadata descriptions of the contents in the server, (2) user preference and usage environment declarations from the user client, and (3) context information including MPEG-7 controlled term list and classification scheme. In a video sequence, descriptions and relevance scores are assigned to each shot. Based on these shot descriptions, context clustering is performed to collect consecutively similar shots to correspond to hierarchical scene representations. The context clustering is based on the available context information, and may be derived from domain knowledge or rules engines. Finally, the selection of structured video segments to generate the hierarchical summary efficiently balances between scene representation and shot selection.
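The context clustering step described above collects consecutive, similar shots into scene-level groups. A toy sketch of that grouping, assuming similarity scores between adjacent shots are already computed from their MPEG-7 descriptions (the function name and threshold are illustrative):

```python
def cluster_consecutive_shots(adjacent_sims, thresh=0.3):
    """Group consecutive shots into scenes.

    adjacent_sims[i-1] is the similarity between shot i-1 and shot i;
    a drop below `thresh` starts a new scene.
    Returns a list of scenes, each a list of shot indices.
    """
    scenes, current = [], [0]
    for i, sim in enumerate(adjacent_sims, start=1):
        if sim >= thresh:
            current.append(i)          # shot i continues the current scene
        else:
            scenes.append(current)     # similarity dropped: close the scene
            current = [i]
    scenes.append(current)
    return scenes

# Five shots; the similarity dip between shots 2 and 3 splits the scenes.
scenes = cluster_consecutive_shots([0.9, 0.8, 0.1, 0.7])
```

The hierarchical summary would then balance selecting whole scenes against selecting individual high-relevance shots within them.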
Hey! What's Space Station Freedom?
NASA Technical Reports Server (NTRS)
Vonehrenfried, Dutch
1992-01-01
This video, 'Hey! What's Space Station Freedom?', has been produced as a classroom tool geared toward middle school children. There are three segments to this video. Segment One is a message to teachers presented by Dr. Jeannine Duane, New Jersey, 'Teacher in Space'. Segment Two is a brief Social Studies section and features a series of Presidential Announcements by President John F. Kennedy (May 1961), President Ronald Reagan (July 1982), and President George Bush (July 1989). These historical announcements are speeches concerning the present and future objectives of the United States' space programs. In the last segment, Charlie Walker, former Space Shuttle astronaut, teaches a group of middle school children, through models, computer animation, and actual footage, what Space Station Freedom is, who is involved in its construction, how it is to be built, what each of the modules on the station is for, and how long and in what sequence this construction will occur. There is a brief animation segment where, through the use of cartoons, the children fly up to Space Station Freedom as astronauts, perform several experiments and are given a tour of the station, and fly back to Earth. Space Station Freedom will take four years to build and will have three lab modules, one from ESA and another from Japan, and one habitation module for the astronauts to live in.
A unified and efficient framework for court-net sports video analysis using 3D camera modeling
NASA Astrophysics Data System (ADS)
Han, Jungong; de With, Peter H. N.
2007-01-01
The extensive amount of video data stored on available media (hard and optical disks) necessitates video content analysis, which is a cornerstone for different user-friendly applications, such as smart video retrieval and intelligent video summarization. This paper aims at finding a unified and efficient framework for court-net sports video analysis. We concentrate on techniques that are generally applicable to more than one sports type to arrive at a unified approach. To this end, our framework employs the concept of multi-level analysis, where a novel 3-D camera modeling is utilized to bridge the gap between object-level and scene-level analysis. The new 3-D camera modeling is based on collecting feature points from two planes that are perpendicular to each other, so that a true 3-D reference is obtained. Another important contribution is a new tracking algorithm for the objects (i.e., players), which can track up to four players simultaneously. The complete system contributes to summarization through various forms of information, of which the most important are the moving trajectory and real speed of each player, as well as 3-D height information of objects and the semantic event segments in a game. We illustrate the performance of the proposed system by evaluating it on a variety of court-net sports videos containing badminton, tennis, and volleyball, and we show that feature detection performance is above 92% and event detection about 90%.
International Space Station (ISS)
2000-12-04
This video still depicts the recently deployed starboard and port solar arrays towering over the International Space Station (ISS). The video was recorded on STS-97's 65th orbit. Delivery, assembly, and activation of the solar arrays were the main mission objectives of STS-97. The electrical power system, which is built into a 73-meter (240-foot) long solar array structure, consists of solar arrays, radiators, batteries, and electronics, and will provide the power necessary for the first ISS crews to live and work in the U.S. segment. The entire 15.4-metric ton (17-ton) package is called the P6 Integrated Truss Segment and is the heaviest and largest element yet delivered to the station aboard a space shuttle. The STS-97 crew of five launched aboard the Space Shuttle Orbiter Endeavour on November 30, 2000 for an 11-day mission.
Expedient range enhanced 3-D robot colour vision
NASA Astrophysics Data System (ADS)
Jarvis, R. A.
1983-01-01
Computer vision has been chosen, in many cases, as offering the richest form of sensory information which can be utilized for guiding robotic manipulation. The present investigation is concerned with the problem of three-dimensional (3D) visual interpretation of colored objects in support of robotic manipulation of those objects with a minimum of semantic guidance. The scene 'interpretations' are aimed at providing basic parameters to guide robotic manipulation rather than to provide humans with a detailed description of what the scene 'means'. Attention is given to overall system configuration, hue transforms, a connectivity analysis, plan/elevation segmentations, range scanners, elevation/range segmentation, higher level structure, eye in hand research, and aspects of array and video stream processing.
Psychovisual masks and intelligent streaming RTP techniques for the MPEG-4 standard
NASA Astrophysics Data System (ADS)
Mecocci, Alessandro; Falconi, Francesco
2003-06-01
In today's multimedia audio-video communication systems, data compression plays a fundamental role by reducing bandwidth waste and the costs of infrastructure and equipment. Among the different compression standards, MPEG-4 is becoming more and more accepted and widespread. Although one of the fundamental aspects of this standard is the possibility of coding video objects separately (i.e., separating moving objects from the background and adapting the coding strategy to the video content), currently implemented codecs work only at the full-frame level. In this way, many advantages of the flexible MPEG-4 syntax are missed. This lack is due both to the difficulty of properly segmenting moving objects in real scenes (featuring arbitrary motion of the objects and of the acquisition sensor), and to the current use of these codecs, which are mainly oriented towards the market of DVD backups (a full-frame approach is enough for these applications). In this paper we propose a codec for MPEG-4 real-time object streaming that codes the moving objects and the scene background separately. The proposed codec is capable of adapting its strategy during the transmission, by analysing the video currently transmitted and setting the coder parameters and modalities accordingly. For example, the background can be transmitted as a whole or by dividing it into "slightly detailed" and "highly detailed" zones that are coded in different ways to reduce the bit-rate while preserving the perceived quality. The coder can automatically switch from one modality to the other in real time during the transmission, depending on the current video content. Psychovisual masks and other video-content-based measurements have been used as inputs for a Self Learning Intelligent Controller (SLIC) that changes the parameters and the transmission modalities.
The current implementation is based on the ISO 14496 standard code that allows Video Object (VO) transmission (other open-source codecs, such as DivX, Xvid, and Cisco's MPEG-4IP, have been analyzed but, as of today, do not support VO). The original code has been deeply modified to integrate the SLIC and to adapt it for real-time streaming. A custom RTP (Real-time Transport Protocol) scheme has been defined and a client-server application has been developed. The viewer can decode and demultiplex the stream in real time, adapting to the changing modalities adopted by the server according to the current video content. The proposed codec works as follows: the image background is separated by means of a segmentation module and transmitted using a wavelet compression scheme similar to that used in JPEG2000. The VO are coded separately and multiplexed with the background stream. At the receiver the stream is demultiplexed to obtain the background and the VO, which are subsequently pasted together. The final quality depends on many factors, in particular: the quantization parameters, the Group Of Video Object (GOV) length, the GOV structure (i.e., the number of I-P-B VOPs), and the search area for motion compensation. These factors are strongly related to the following measurement parameters (defined during development): the Objects Apparent Size (OAS) in the scene, the Video Object Incidence factor (VOI), and the temporal correlation (measured through the Normalized Mean SAD, NMSAD). The SLIC module analyzes the currently transmitted video and selects the most appropriate settings by choosing from a predefined set of transmission modalities. For example, in the case of a highly temporally correlated sequence, the number of B-VOPs is increased to improve the compression ratio.
The strategy for selecting the number of B-VOPs turns out to be very different from those reported in the literature for B-frames (adopted in MPEG-1 and MPEG-2), due to the different behaviour of the temporal correlation when limited to moving objects only. The SLIC module also decides how to transmit the background. In our implementation we adopted the Visual Brain theory, i.e., the study of what the "psychic eye" can get from a scene. According to this theory, a Psychomask Image Analysis (PIA) module has been developed to extract the visually homogeneous regions of the background. The PIA module produces two complementary masks: one for the visually low-variance zones and one for the highly variable zones; these zones are compressed with different strategies and encoded into two multiplexed streams. Practical experiments showed that separate coding is advantageous only if the low-variance zones exceed 50% of the whole background area (due to the overhead of transmitting the zone masks). The SLIC module decides the appropriate transmission modality by analyzing the results produced by the PIA module. The main features of this codec are low bitrate, good image quality, and coding speed. The current implementation runs in real time on standard PC platforms, the major limitation being the fixed position of the acquisition sensor. This limitation is due to the difficulty of separating moving objects from the background when the acquisition sensor moves. Our current real-time segmentation module does not produce suitable results if the acquisition sensor moves (only slight oscillatory movements are tolerated).
In any case, the system is particularly suitable for tele-surveillance applications at low bit-rates, where the camera is usually fixed or alternates among some predetermined positions (our segmentation module is capable of accurately separating moving objects from the static background when the acquisition sensor stops, even if different scenes are seen as a result of the sensor displacements). Moreover, the proposed architecture is general, in the sense that when robust real-time segmentation systems (capable of separating objects from the background while the sensor itself is moving) become available, they can be easily integrated while leaving the rest of the system unchanged. Experimental results on real sequences for traffic monitoring and for people tracking and safety control are reported and discussed in depth in the paper. The whole system has been implemented in standard ANSI C and currently runs on standard PCs under the Microsoft Windows operating system (Windows 2000 Pro and Windows XP).
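The PIA module's partition of the background into low- and high-variance zones, plus the 50%-area test that decides whether separate coding pays off, can be sketched roughly as below. The block-wise variance measure and its threshold are illustrative assumptions, not the paper's psychovisual masks.

```python
def psychomask_split(image, block=2, var_thresh=10.0):
    """Partition a grayscale image (list of rows) into low-variance
    ('flat') and high-variance blocks, and report whether separate
    coding pays off, i.e. whether flat blocks exceed 50% of the area."""
    h, w = len(image), len(image[0])
    low = high = 0
    mask = []
    for by in range(0, h, block):
        row_mask = []
        for bx in range(0, w, block):
            vals = [image[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            if var < var_thresh:
                row_mask.append(0)   # visually flat zone
                low += 1
            else:
                row_mask.append(1)   # highly variable zone
                high += 1
        mask.append(row_mask)
    return mask, low / (low + high) > 0.5

mask, separate = psychomask_split([[10, 10, 200, 10],
                                   [10, 10, 10, 90]], block=2)
```

Here exactly half the blocks are flat, so the 50% test fails and the whole background would be coded as one stream, mirroring the overhead argument in the abstract.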
Faces of Homelessness: A Teacher's Guide.
ERIC Educational Resources Information Center
Massachusetts State Dept. of Education, Quincy.
A brief teacher's guide supplements a videotape of two 15-minute segments on homelessness. The stated objective of the video is to cover the issues of homelessness as they exist today and to dispel the stereotypes of homelessness left over from earlier eras. A family which has found itself homeless is introduced and then aspects of the phenomenon…
Video Segmentation Descriptors for Event Recognition
2014-12-08
Velastin, 3D Extended Histogram of Oriented Gradients (3DHOG) for Classification of Road Users in Urban Scenes, BMVC, 2009. [3] M.-Y. Chen and A. Hauptmann...computed on the 3D volume output by the hierarchical segmentation. Each video is described as follows. Each supertube is temporally divided in n-frame...strength of these descriptors is their adaptability to scene variations since they are grounded on a video segmentation. This makes them naturally robust
Moving object detection in top-view aerial videos improved by image stacking
NASA Astrophysics Data System (ADS)
Teutsch, Michael; Krüger, Wolfgang; Beyerer, Jürgen
2017-08-01
Image stacking is a well-known method used to improve the quality of images in video data. A set of consecutive images is aligned by applying image registration and warping. In the resulting image stack, each pixel has redundant information about its intensity value. This redundant information can be used to suppress image noise, resharpen blurry images, or even enhance the spatial image resolution as done in super-resolution. Small moving objects in the videos usually get blurred or distorted by image stacking and thus need to be handled explicitly. We use image stacking in an innovative way: image registration is applied to small moving objects only, and image warping blurs the stationary background that surrounds the moving objects. Our video data come from a small fixed-wing unmanned aerial vehicle (UAV) that acquires top-view gray-value images of urban scenes. Moving objects are mainly cars but also other vehicles such as motorcycles. The resulting images, after applying our proposed image stacking approach, are used to improve baseline algorithms for vehicle detection and segmentation. We improve precision and recall by up to 0.011, which corresponds to a reduction of more than 3 false positive and false negative detections per second. Furthermore, we show how our proposed image stacking approach can be implemented efficiently.
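The core idea of image stacking, exploiting per-pixel redundancy across registered frames to suppress noise, can be illustrated with a pixel-wise median over already-aligned toy frames. Registration and warping are assumed done; this is a minimal sketch, not the authors' pipeline.

```python
import statistics

def stack_frames(frames):
    """Pixel-wise median over a list of registered grayscale frames
    (each a list of rows). The median discards transient outliers
    that any single frame would keep."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[statistics.median(f[y][x] for f in frames) for x in range(w)]
            for y in range(h)]

frames = [
    [[10, 10], [10, 10]],
    [[10, 255], [10, 10]],   # one frame carries a noise spike
    [[10, 10], [10, 10]],
]
print(stack_frames(frames))  # → [[10, 10], [10, 10]]
```

The same mechanism explains why small moving objects blur under stacking: their pixels disagree across frames, so they must be registered separately, as the abstract proposes.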
Global-constrained hidden Markov model applied on wireless capsule endoscopy video segmentation
NASA Astrophysics Data System (ADS)
Wan, Yiwen; Duraisamy, Prakash; Alam, Mohammad S.; Buckles, Bill
2012-06-01
Accurate analysis of wireless capsule endoscopy (WCE) videos is vital but tedious. Automatic image analysis can expedite this task. Video segmentation of WCE into the four parts of the gastrointestinal tract is one way to assist a physician. The segmentation approach described in this paper integrates pattern recognition with statistical analysis. Initially, a support vector machine is applied to classify video frames into four classes using a combination of multiple color and texture features as the feature vector. A Poisson cumulative distribution, whose parameter depends on the length of segments, models the prior knowledge. This prior knowledge, together with inter-frame differences, serves as the global constraint driven by the underlying observation of each WCE video, which is fitted by a Gaussian distribution to constrain the transition probability of the hidden Markov model. Experimental results demonstrated the effectiveness of the approach.
Special-effect edit detection using VideoTrails: a comparison with existing techniques
NASA Astrophysics Data System (ADS)
Kobla, Vikrant; DeMenthon, Daniel; Doermann, David S.
1998-12-01
Video segmentation plays an integral role in many multimedia applications, such as digital libraries, content management systems, and various other video browsing, indexing, and retrieval systems. Many algorithms for video segmentation have appeared within the past few years. Most of these algorithms perform well on cuts but yield poor performance on gradual transitions or special-effect edits. A complete video segmentation system must also achieve good performance on special-effect edit detection. In this paper, we compare the performance of our VideoTrails-based algorithms with other existing special-effect edit-detection algorithms in the literature. We present results from experiments testing the ability to detect edits in TV programs ranging from commercials to news magazine programs, including the diverse special-effect edits we have introduced.
Robust real-time horizon detection in full-motion video
NASA Astrophysics Data System (ADS)
Young, Grace B.; Bagnall, Bryan; Lane, Corey; Parameswaran, Shibin
2014-06-01
The ability to detect the horizon in full-motion video on a real-time basis is an important capability to aid and facilitate real-time processing of full-motion videos for purposes such as object detection, recognition, and other video/image segmentation applications. In this paper, we propose a method for real-time horizon detection that is designed to be used as a front-end processing unit for a real-time marine object detection system that carries out object detection and tracking on full-motion videos captured by ship/harbor-mounted cameras, Unmanned Aerial Vehicles (UAVs), or any other method of surveillance for Maritime Domain Awareness (MDA). Unlike existing horizon detection work, we cannot assume a priori the angle or nature (e.g., a straight line) of the horizon, due to the nature of the application domain and the data. Therefore, the proposed real-time algorithm is designed to identify the horizon at any angle and irrespective of objects appearing close to and/or occluding the horizon line (e.g., trees, or vehicles at a distance) by accounting for its non-linear nature. We use a simple two-stage hierarchical methodology, leveraging color-based features, to quickly isolate the region of the image containing the horizon and then perform a more fine-grained horizon detection operation. In this paper, we present real-time horizon detection results using our algorithm on real-world full-motion video data from a variety of surveillance sensors, such as UAVs and ship-mounted cameras, confirming the real-time applicability of this method and its ability to detect the horizon with no a priori assumptions.
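The two-stage coarse-to-fine idea can be illustrated on a toy grayscale image: a coarse pass locates the strongest row-wise brightness jump, and a fine pass refines the boundary per column, so the result need not be a straight line. The brightness-jump criterion here is a stand-in for the paper's color-based features.

```python
def detect_horizon(image, band=2):
    """Two-stage sketch: (1) a coarse scan finds the row with the largest
    mean-brightness jump relative to the row above it; (2) a fine scan
    repeats the test per column inside a band around that row, yielding
    one horizon row per column (a possibly non-linear horizon)."""
    h, w = len(image), len(image[0])

    def row_mean(y):
        return sum(image[y]) / w

    # coarse stage: row with the strongest global jump
    coarse = max(range(1, h), key=lambda y: abs(row_mean(y) - row_mean(y - 1)))

    # fine stage: per-column jump inside [coarse - band, coarse + band]
    horizon = []
    for x in range(w):
        ys = range(max(1, coarse - band), min(h, coarse + band + 1))
        horizon.append(max(ys, key=lambda y: abs(image[y][x] - image[y - 1][x])))
    return horizon

sky, sea = 200, 50
img = [[sky] * 4 for _ in range(3)] + [[sea] * 4 for _ in range(3)]
print(detect_horizon(img))  # → [3, 3, 3, 3]
```

Restricting the fine stage to a narrow band is what makes such a hierarchy cheap enough for real-time use.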
Human visual system-based smoking event detection
NASA Astrophysics Data System (ADS)
Odetallah, Amjad D.; Agaian, Sos S.
2012-06-01
Human action (e.g., smoking, eating, and phoning) analysis is an important task in various application domains like video surveillance, video retrieval, human-computer interaction systems, and so on. Smoke detection is a crucial task in many video surveillance applications and could have a great impact on raising the level of safety of urban areas, public parks, airplanes, hospitals, schools, and others. The detection task is challenging since there is no prior knowledge about the object's shape, texture, and color. In addition, its visual features will change under different lighting and weather conditions. This paper presents a new scheme for a system for detecting human smoking events, or small amounts of smoke, in a sequence of images. In the developed system, motion detection and background subtraction are combined with motion-region saving, skin-based image segmentation, and smoke-based image segmentation to capture potential smoke regions, which are further analyzed to decide on the occurrence of smoking events. Experimental results show the effectiveness of the proposed approach. The developed method is also capable of detecting small smoking events of uncertain actions with various cigarette sizes, colors, and shapes.
Fully Automatic Segmentation of Fluorescein Leakage in Subjects With Diabetic Macular Edema
Rabbani, Hossein; Allingham, Michael J.; Mettu, Priyatham S.; Cousins, Scott W.; Farsiu, Sina
2015-01-01
Purpose. To create and validate software to automatically segment leakage area in real-world clinical fluorescein angiography (FA) images of subjects with diabetic macular edema (DME). Methods. Fluorescein angiography images obtained from 24 eyes of 24 subjects with DME were retrospectively analyzed. Both video and still-frame images were obtained using a Heidelberg Spectralis 6-mode HRA/OCT unit. We aligned early and late FA frames in the video by a two-step nonrigid registration method. To remove background artifacts, we subtracted early and late FA frames. Finally, after postprocessing steps, including detection and inpainting of the vessels, a robust active contour method was utilized to obtain leakage area in a 1500-μm-radius circular region centered at the fovea. Images were captured at different fields of view (FOVs) and were often contaminated with outliers, as is the case in real-world clinical imaging. Our algorithm was applied to these images with no manual input. Separately, all images were manually segmented by two retina specialists. The sensitivity, specificity, and accuracy of manual interobserver, manual intraobserver, and automatic methods were calculated. Results. The mean accuracy was 0.86 ± 0.08 for automatic versus manual, 0.83 ± 0.16 for manual interobserver, and 0.90 ± 0.08 for manual intraobserver segmentation methods. Conclusions. Our fully automated algorithm can reproducibly and accurately quantify the area of leakage of clinical-grade FA video and is congruent with expert manual segmentation. The performance was reliable for different DME subtypes. This approach has the potential to reduce time and labor costs and may yield objective and reproducible quantitative measurements of DME imaging biomarkers. PMID:25634978
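The early/late frame-subtraction step at the heart of the method can be sketched as a thresholded difference of registered frames: leakage accumulates dye over time, so it brightens between frames while stable background does not. Registration, vessel inpainting, and the active contour are omitted here, and the threshold is an illustrative assumption.

```python
def leakage_mask(early, late, thresh=50):
    """Difference two registered FA frames (lists of rows of gray
    values) and flag pixels whose brightness increased by more than
    `thresh` as candidate leakage."""
    h, w = len(early), len(early[0])
    return [[1 if late[y][x] - early[y][x] > thresh else 0 for x in range(w)]
            for y in range(h)]

early = [[10, 10, 10],
         [10, 10, 10]]
late  = [[12, 90, 11],
         [10, 95, 12]]
print(leakage_mask(early, late))  # → [[0, 1, 0], [0, 1, 0]]
```

In the actual pipeline this candidate mask would be refined by the active contour within the 1500-μm foveal region.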
An objective comparison of cell-tracking algorithms.
Ulman, Vladimír; Maška, Martin; Magnusson, Klas E G; Ronneberger, Olaf; Haubold, Carsten; Harder, Nathalie; Matula, Pavel; Matula, Petr; Svoboda, David; Radojevic, Miroslav; Smal, Ihor; Rohr, Karl; Jaldén, Joakim; Blau, Helen M; Dzyubachyk, Oleh; Lelieveldt, Boudewijn; Xiao, Pengdong; Li, Yuexiang; Cho, Siu-Yeung; Dufour, Alexandre C; Olivo-Marin, Jean-Christophe; Reyes-Aldasoro, Constantino C; Solis-Lemus, Jose A; Bensch, Robert; Brox, Thomas; Stegmaier, Johannes; Mikut, Ralf; Wolf, Steffen; Hamprecht, Fred A; Esteves, Tiago; Quelhas, Pedro; Demirel, Ömer; Malmström, Lars; Jug, Florian; Tomancak, Pavel; Meijering, Erik; Muñoz-Barrutia, Arrate; Kozubek, Michal; Ortiz-de-Solorzano, Carlos
2017-12-01
We present a combined report on the results of three editions of the Cell Tracking Challenge, an ongoing initiative aimed at promoting the development and objective evaluation of cell segmentation and tracking algorithms. With 21 participating algorithms and a data repository consisting of 13 data sets from various microscopy modalities, the challenge displays today's state-of-the-art methodology in the field. We analyzed the challenge results using performance measures for segmentation and tracking that rank all participating methods. We also analyzed the performance of all of the algorithms in terms of biological measures and practical usability. Although some methods scored high in all technical aspects, none obtained fully correct solutions. We found that methods that either take prior information into account using learning strategies or analyze cells in a global spatiotemporal video context performed better than other methods under the segmentation and tracking scenarios included in the challenge.
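Segmentation measures of the kind used to rank challenge entries are typically built on the Jaccard (intersection-over-union) overlap between predicted and reference masks; a minimal sketch, with masks represented as sets of pixel coordinates:

```python
def jaccard(a, b):
    """Jaccard (IoU) overlap between two segmentation masks given as
    sets of (row, col) pixel coordinates; 1.0 means a perfect match."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

ref  = {(0, 0), (0, 1), (1, 0), (1, 1)}
pred = {(0, 1), (1, 0), (1, 1), (2, 1)}
print(jaccard(ref, pred))  # 3 shared of 5 total → 0.6
```

A challenge-style score would average such overlaps over all reference cells, alongside separate measures for tracking correctness.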
Effects of Segmenting, Signalling, and Weeding on Learning from Educational Video
ERIC Educational Resources Information Center
Ibrahim, Mohamed; Antonenko, Pavlo D.; Greenwood, Carmen M.; Wheeler, Denna
2012-01-01
Informed by the cognitive theory of multimedia learning, this study examined the effects of three multimedia design principles on undergraduate students' learning outcomes and perceived learning difficulty in the context of learning entomology from an educational video. These principles included segmenting the video into smaller units, signalling…
Multi-view video segmentation and tracking for video surveillance
NASA Astrophysics Data System (ADS)
Mohammadi, Gelareh; Dufaux, Frederic; Minh, Thien Ha; Ebrahimi, Touradj
2009-05-01
Tracking moving objects is a critical step for smart video surveillance systems. Despite the increase in complexity, multiple-camera systems exhibit the undoubted advantages of covering wide areas and handling the occurrence of occlusions by exploiting the different viewpoints. The technical problems in multiple-camera systems are several: installation, calibration, object matching, switching, data fusion, and occlusion handling. In this paper, we address the issue of tracking moving objects in an environment covered by multiple un-calibrated cameras with overlapping fields of view, typical of most surveillance setups. Our main objective is to create a framework that can be used to integrate object-tracking information from multiple video sources. The proposed technique consists of the following steps. We first perform a single-view tracking algorithm on each camera view, and then apply a consistent object labeling algorithm on all views. In the next step, we verify objects in each view separately for inconsistencies. Corresponding objects are extracted through a homography transform from one view to the other and vice versa. Having found the corresponding objects of different views, we partition each object into homogeneous regions. In the last step, we apply the homography transform to find the region map of the first view in the second view and vice versa. For each region (in the main frame and the mapped frame) a set of descriptors is extracted to find the best match between two views based on region-descriptor similarity. This method is able to deal with multiple objects. Track management issues such as occlusion, appearance, and disappearance of objects are resolved using information from all views. The method is capable of tracking rigid and deformable objects, and this versatility makes it suitable for different application scenarios.
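Mapping an object between overlapping views rests on applying a homography to its image coordinates. A minimal sketch with a toy translation homography follows; estimating H from point correspondences is a separate step, omitted here.

```python
def apply_homography(H, point):
    """Map an image point between camera views with a 3x3 homography:
    lift to homogeneous coordinates, multiply, then divide by w."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

# toy homography: a pure translation by (5, -2) between the two views
H = [[1, 0, 5],
     [0, 1, -2],
     [0, 0, 1]]
print(apply_homography(H, (10, 10)))  # → (15.0, 8.0)
```

In the paper's pipeline, the same transform carries whole region maps across views so region descriptors can be matched in a common frame.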
Robust vehicle detection in different weather conditions: Using MIPM
Menéndez, José Manuel; Jiménez, David
2018-01-01
Intelligent Transportation Systems (ITS) allow us to have high-quality traffic information to reduce the risk of potentially critical situations. Conventional image-based traffic detection methods have difficulties acquiring good images due to perspective and background noise, poor lighting, and weather conditions. In this paper, we propose a new method to accurately segment and track vehicles. After removing perspective using Modified Inverse Perspective Mapping (MIPM), the Hough transform is applied to extract road lines and lanes. Then, Gaussian Mixture Models (GMM) are used to segment moving objects, and to tackle car-shadow effects we apply a chromaticity-based strategy. Finally, performance is evaluated through three different video benchmarks: our own videos recorded in Madrid and Tehran (with different weather conditions in urban and interurban areas), and two well-known public datasets (KITTI and DETRAC). Our results indicate that the proposed algorithms are robust and more accurate than others, especially when facing occlusions, lighting variations, and adverse weather conditions. PMID:29513664
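GMM background subtraction maintains per-pixel Gaussians and flags pixels that fit none of them as foreground. The sketch below keeps a single running Gaussian per pixel, a deliberate simplification of the full mixture; the learning rate and the 2.5-sigma gate are conventional choices, not the authors' parameters.

```python
def update_background(mean, var, pixel, alpha=0.05, k=2.5):
    """Single-Gaussian sketch of per-pixel background modelling (a full
    GMM keeps several such Gaussians per pixel). Returns the updated
    (mean, var) and whether the pixel was flagged as foreground."""
    foreground = abs(pixel - mean) > k * var ** 0.5
    if not foreground:
        # absorb only background-like samples into the model
        mean = (1 - alpha) * mean + alpha * pixel
        var = (1 - alpha) * var + alpha * (pixel - mean) ** 2
    return mean, var, foreground

mean, var = 100.0, 16.0                    # road surface around gray 100
for p in [101, 99, 100, 250]:              # 250 is a passing vehicle
    mean, var, fg = update_background(mean, var, p)
print(fg)  # → True (the 250 sample is flagged as foreground)
```

A real GMM additionally handles multi-modal backgrounds (e.g., waving trees) by keeping several weighted Gaussians per pixel and replacing the least probable one.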
Crowdsourcing for identification of polyp-free segments in virtual colonoscopy videos
NASA Astrophysics Data System (ADS)
Park, Ji Hwan; Mirhosseini, Seyedkoosha; Nadeem, Saad; Marino, Joseph; Kaufman, Arie; Baker, Kevin; Barish, Matthew
2017-03-01
Virtual colonoscopy (VC) allows a physician to virtually navigate within a reconstructed 3D colon model searching for colorectal polyps. Though VC is widely recognized as a highly sensitive and specific test for identifying polyps, one limitation is the reading time, which can take over 30 minutes per patient. Large amounts of the colon are often devoid of polyps, and a way of identifying these polyp-free segments could be of valuable use in reducing the required reading time for the interrogating radiologist. To this end, we have tested the ability of the collective crowd intelligence of non-expert workers to identify polyp candidates and polyp-free regions. We presented twenty short videos flying through a segment of a virtual colon to each worker, and the crowd was asked to determine whether or not a possible polyp was observed within that video segment. We evaluated our framework on Amazon Mechanical Turk and found that the crowd was able to achieve a sensitivity of 80.0% and specificity of 86.5% in identifying video segments which contained a clinically proven polyp. Since each polyp appeared in multiple consecutive segments, all polyps were in fact identified. Using the crowd results as a first pass, 80% of the video segments could in theory be skipped by the radiologist, equating to a significant time savings and enabling more VC examinations to be performed.
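Aggregating the crowd's per-segment answers by majority vote and scoring the result against ground truth yields the reported sensitivity and specificity; a minimal sketch (the majority quorum is an assumption, not necessarily the study's aggregation rule):

```python
def crowd_metrics(votes, truth, quorum=0.5):
    """Majority-vote aggregation of per-segment worker answers, then
    sensitivity and specificity against ground truth (1 = the video
    segment contains a polyp, 0 = polyp-free)."""
    decisions = [1 if sum(v) / len(v) >= quorum else 0 for v in votes]
    tp = sum(d == 1 and t == 1 for d, t in zip(decisions, truth))
    tn = sum(d == 0 and t == 0 for d, t in zip(decisions, truth))
    pos = sum(truth)
    neg = len(truth) - pos
    sensitivity = tp / pos if pos else 0.0
    specificity = tn / neg if neg else 0.0
    return sensitivity, specificity

votes = [[1, 1, 0], [0, 0, 0], [1, 1, 0], [1, 1, 1]]  # workers per segment
truth = [1, 0, 0, 1]
print(crowd_metrics(votes, truth))  # → (1.0, 0.5)
```

Segments the crowd unanimously clears (the true negatives) are the ones the radiologist could skip, which is where the claimed time saving comes from.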
NASA Astrophysics Data System (ADS)
Li, Wei; Chen, Ting; Zhang, Wenjun; Shi, Yunyu; Li, Jun
2012-04-01
In recent years, music video data has been increasing at an astonishing speed. Shot segmentation and keyframe extraction constitute a fundamental unit in organizing, indexing and retrieving video content. In this paper, a unified framework is proposed to detect shot boundaries and extract the keyframe of a shot. A music video is first segmented into shots using an illumination-invariant chromaticity histogram in independent component (IC) analysis feature space. Then we present a new metric, image complexity, computed from the ICs, to extract the keyframe of a shot. Experimental results show the framework is effective and has good performance.
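The illumination-invariance behind a chromaticity-histogram shot detector can be sketched without the IC-analysis step; the normalized-rg histogram and the L1-distance threshold below are simplifying assumptions used only for illustration:

```python
import numpy as np

def chroma_hist(frame, bins=8):
    # Normalized chromaticity (r, g) histogram, robust to brightness scaling.
    frame = np.asarray(frame, dtype=float).reshape(-1, 3)
    s = np.maximum(frame.sum(axis=1, keepdims=True), 1e-9)
    rg = frame[:, :2] / s
    h, _, _ = np.histogram2d(rg[:, 0], rg[:, 1], bins=bins, range=[[0, 1], [0, 1]])
    return h.ravel() / h.sum()

def is_cut(prev, cur, thresh=0.5):
    # Declare a shot boundary when the histogram L1 distance exceeds a threshold.
    return np.abs(chroma_hist(prev) - chroma_hist(cur)).sum() > thresh

dark = np.full((16, 16, 3), [0.2, 0.1, 0.1])   # reddish scene
bright = 2.0 * dark                            # same scene, brighter
blue = np.full((16, 16, 3), [0.1, 0.1, 0.4])   # different scene
print(is_cut(dark, bright))  # False: chromaticity unchanged by brightness
print(is_cut(dark, blue))    # True: different scene
```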
NASA Astrophysics Data System (ADS)
Mundhenk, T. Nathan; Ni, Kang-Yu; Chen, Yang; Kim, Kyungnam; Owechko, Yuri
2012-01-01
An aerial multiple camera tracking paradigm needs to not only spot unknown targets and track them, but also needs to know how to handle target reacquisition as well as target handoff to other cameras in the operating theater. Here we discuss such a system which is designed to spot unknown targets, track them, segment the useful features and then create a signature fingerprint for the object so that it can be reacquired or handed off to another camera. The tracking system spots unknown objects by subtracting background motion from observed motion allowing it to find targets in motion, even if the camera platform itself is moving. The area of motion is then matched to segmented regions returned by the EDISON mean shift segmentation tool. Whole segments which have common motion and which are contiguous to each other are grouped into a master object. Once master objects are formed, we have a tight bound on which to extract features for the purpose of forming a fingerprint. This is done using color and simple entropy features. These can be placed into a myriad of different fingerprints. To keep data transmission and storage size low for camera handoff of targets, we try several different simple techniques. These include Histogram, Spatiogram and Single Gaussian Model. These are tested by simulating a very large number of target losses in six videos over an interval of 1000 frames each from the DARPA VIVID video set. Since the fingerprints are very simple, they are not expected to be valid for long periods of time. As such, we test the shelf life of fingerprints. This is how long a fingerprint is good for when stored away between target appearances. Shelf life gives us a second metric of goodness and tells us if a fingerprint method has better accuracy over longer periods. 
In videos which contain multiple vehicle occlusions and vehicles of highly similar appearance we obtain a reacquisition rate for automobiles of over 80% using the simple single Gaussian model compared with the null hypothesis of <20%. Additionally, the performance for fingerprints stays well above the null hypothesis for as much as 800 frames. Thus, a simple and highly compact single Gaussian model is useful for target reacquisition. Since the model is agnostic to view point and object size, it is expected to perform as well on a test of target handoff. Since some of the performance degradation is due to problems with the initial target acquisition and tracking, the simple Gaussian model may perform even better with an improved initial acquisition technique. Also, since the model makes no assumption about the object to be tracked, it should be possible to use it to fingerprint a multitude of objects, not just cars. Further accuracy may be obtained by creating manifolds of objects from multiple samples.
Automatic and quantitative measurement of laryngeal video stroboscopic images.
Kuo, Chung-Feng Jeffrey; Kuo, Joseph; Hsiao, Shang-Wun; Lee, Chi-Lung; Lee, Jih-Chin; Ke, Bo-Han
2017-01-01
The laryngeal video stroboscope is an important instrument for physicians to analyze abnormalities and diseases in the glottal area, and it has been widely used around the world. However, without quantized indices, physicians can only make subjective judgments on glottal images. We designed a new laser projection marking module and applied it to the laryngeal video stroboscope to provide scale conversion reference parameters for glottal imaging and to convert the physiological parameters of the glottis. Image processing technology was used to segment the important image regions of interest. Information on the glottis was quantified, and the vocal fold image segmentation system was completed to assist clinical diagnosis and increase accuracy. Regarding image processing, histogram equalization was used to enhance glottal image contrast. A center-weighted median filter removes image noise while retaining the texture of the glottal image. Statistical threshold determination was used for automatic segmentation of the glottal image. As the glottis image contains saliva and light spots, which are classified as image noise, noise was eliminated by erosion, dilation, opening, and closing operations to highlight the vocal area. We also used image processing to automatically identify the vocal fold region in order to quantify information from the glottal image, such as glottal area, vocal fold perimeter, vocal fold length, glottal width, and vocal fold angle. The quantized glottis image database was created to assist physicians in diagnosing glottal diseases more objectively.
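The "statistical threshold determination" step is commonly realized with Otsu's method; the following is a minimal sketch of that standard technique (the paper's exact statistic may differ):

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: pick the threshold maximizing between-class variance.
    gray: array of integer intensities in [0, 255]."""
    hist = np.bincount(np.asarray(gray, dtype=np.int64).ravel(), minlength=256)
    p = hist / hist.sum()
    omega = np.cumsum(p)                 # class-0 probability per threshold
    mu = np.cumsum(p * np.arange(256))   # class-0 cumulative mean
    mu_t = mu[-1]                        # global mean
    # Between-class variance; guard against empty classes.
    denom = omega * (1.0 - omega)
    sigma_b = np.where(denom > 0,
                       (mu_t * omega - mu) ** 2 / np.maximum(denom, 1e-12), 0.0)
    return int(np.argmax(sigma_b))

# Two well-separated intensity populations: the threshold lands between them.
img = np.concatenate([np.full(500, 40), np.full(500, 200)])
t = otsu_threshold(img)
print(40 <= t < 200)  # True
```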
Development and human factors analysis of neuronavigation vs. augmented reality.
Pandya, Abhilash; Siadat, Mohammad-Reza; Auner, Greg; Kalash, Mohammad; Ellis, R Darin
2004-01-01
This paper is focused on the human factors analysis comparing a standard neuronavigation system with an augmented reality (AR) system. We use a passive articulated arm (Microscribe, Immersion technology) to track a calibrated end-effector-mounted video camera. In real time, we superimpose the live video view with the synchronized graphical view of CT-derived segmented object(s) of interest within a phantom skull. Using the same robotic arm, we have developed a neuronavigation system able to show the end-effector of the arm on orthogonal CT scans. Both the AR and the neuronavigation systems have been shown to be accurate to within 3 mm. A human factors study was conducted in which subjects were asked to draw craniotomies and answer questions to gauge their understanding of the phantom objects. The human factors study included 21 subjects and indicated that the subjects performed faster, more accurately, and with fewer errors using the augmented reality interface.
Video content parsing based on combined audio and visual information
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1999-08-01
While previous research on audiovisual data segmentation and indexing primarily focuses on the pictorial part, significant clues contained in the accompanying audio flow are often ignored. A fully functional system for video content parsing can be achieved more successfully through a proper combination of audio and visual information. By investigating the data structure of different video types, we present tools for both audio and visual content analysis and a scheme for video segmentation and annotation in this research. In the proposed system, video data are segmented into audio scenes and visual shots by detecting abrupt changes in audio and visual features, respectively. Then, the audio scene is categorized and indexed as one of the basic audio types, while a visual shot is represented by keyframes and associated image features. An index table is then generated automatically for each video clip based on the integration of outputs from audio and visual analysis. It is shown that the proposed system provides satisfactory video indexing results.
NASA Technical Reports Server (NTRS)
Howard, Richard T. (Inventor); Bryan, Thomas C. (Inventor); Book, Michael L. (Inventor)
2004-01-01
A method and system for processing an image, including capturing an image and storing the image as image pixel data. Each image pixel datum is stored in a respective memory location having a corresponding address. Threshold pixel data are selected from the image pixel data, and linear spot segments are identified from the selected threshold pixel data. The positions of only a first pixel and a last pixel for each linear segment are saved. Movement of one or more objects is tracked by comparing the positions of the first and last pixels of a linear segment present in the captured image with the respective first and last pixel positions in subsequent captured images. Alternatively, additional data for each linear data segment are saved, such as the sum of pixels and the weighted sum of pixels (i.e., each threshold pixel value multiplied by that pixel's x-location).
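The per-segment bookkeeping described in this record (first/last pixel plus pixel sum and x-weighted sum per linear segment) can be sketched for a single image row; the function name and data layout are illustrative, not from the patent:

```python
def extract_segments(row, thresh):
    """Scan one image row and return, per above-threshold run (linear segment),
    (first_x, last_x, pixel_sum, weighted_sum), where weighted_sum multiplies
    each threshold pixel value by its x-location."""
    segments, start, psum, wsum = [], None, 0, 0
    for x, v in enumerate(row):
        if v > thresh:
            if start is None:
                start, psum, wsum = x, 0, 0
            psum += v
            wsum += v * x
        elif start is not None:
            segments.append((start, x - 1, psum, wsum))
            start = None
    if start is not None:
        segments.append((start, len(row) - 1, psum, wsum))
    return segments

row = [0, 9, 9, 0, 0, 7, 8, 7, 0]
print(extract_segments(row, 5))
# [(1, 2, 18, 27), (5, 7, 22, 132)]
```

Note that weighted_sum / pixel_sum gives the intensity-weighted centroid of each spot segment (132 / 22 = 6.0 for the second run), which is what makes this compact summary useful for tracking.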
A bio-inspired method and system for visual object-based attention and segmentation
NASA Astrophysics Data System (ADS)
Huber, David J.; Khosla, Deepak
2010-04-01
This paper describes a method and system of human-like attention and object segmentation in visual scenes that (1) attends to regions in a scene in their rank of saliency in the image, (2) extracts the boundary of an attended proto-object based on feature contours, and (3) can be biased to boost the attention paid to specific features in a scene, such as those of a desired target object in static and video imagery. The purpose of the system is to identify regions of a scene of potential importance and extract the region data for processing by an object recognition and classification algorithm. The attention process can be performed in a default, bottom-up manner or a directed, top-down manner which will assign a preference to certain features over others. One can apply this system to any static scene, whether that is a still photograph or imagery captured from video. We employ algorithms that are motivated by findings in neuroscience, psychology, and cognitive science to construct a system that is novel in its modular and stepwise approach to the problems of attention and region extraction, its application of a flooding algorithm to break apart an image into smaller proto-objects based on feature density, and its ability to join smaller regions of similar features into larger proto-objects. This approach allows many complicated operations to be carried out by the system in a very short time, approaching real-time. A researcher can use this system as a robust front-end to a larger system that includes object recognition and scene understanding modules; it is engineered to function over a broad range of situations and can be applied to any scene with minimal tuning from the user.
Automatic generation of pictorial transcripts of video programs
NASA Astrophysics Data System (ADS)
Shahraray, Behzad; Gibbon, David C.
1995-03-01
An automatic authoring system for the generation of pictorial transcripts of video programs which are accompanied by closed caption information is presented. A number of key frames, each of which represents the visual information in a segment of the video (i.e., a scene), are selected automatically by performing a content-based sampling of the video program. The textual information is recovered from the closed caption signal and is initially segmented based on its implied temporal relationship with the video segments. The text segmentation boundaries are then adjusted, based on lexical analysis and/or caption control information, to account for synchronization errors due to possible delays in the detection of scene boundaries or the transmission of the caption information. The closed caption text is further refined through linguistic processing for conversion to lowercase with correct capitalization. The key frames and the related text generate a compact multimedia presentation of the contents of the video program which lends itself to efficient storage and transmission. This compact representation can be viewed on a computer screen, or used to generate the input to a commercial text processing package to generate a printed version of the program.
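Content-based sampling of key frames can be sketched as a simple change-driven selector; the mean-absolute-difference measure and threshold below are stand-ins for the paper's scene-change detection:

```python
import numpy as np

def keyframes(frames, thresh=0.1):
    """Content-based sampling: emit a new key frame whenever the current frame
    differs enough from the last emitted one."""
    keys = [0]
    for i in range(1, len(frames)):
        d = np.abs(frames[i] - frames[keys[-1]]).mean()
        if d > thresh:
            keys.append(i)
    return keys

# Three visually distinct "scenes" of five identical frames each.
frames = np.concatenate([np.zeros((5, 4, 4)),
                         np.ones((5, 4, 4)),
                         np.full((5, 4, 4), 0.5)])
print(keyframes(frames))  # [0, 5, 10]
```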
Aerial vehicles collision avoidance using monocular vision
NASA Astrophysics Data System (ADS)
Balashov, Oleg; Muraviev, Vadim; Strotov, Valery
2016-10-01
In this paper image-based collision avoidance algorithm that provides detection of nearby aircraft and distance estimation is presented. The approach requires a vision system with a single moving camera and additional information about carrier's speed and orientation from onboard sensors. The main idea is to create a multi-step approach based on a preliminary detection, regions of interest (ROI) selection, contour segmentation, object matching and localization. The proposed algorithm is able to detect small targets but unlike many other approaches is designed to work with large-scale objects as well. To localize aerial vehicle position the system of equations relating object coordinates in space and observed image is solved. The system solution gives the current position and speed of the detected object in space. Using this information distance and time to collision can be estimated. Experimental research on real video sequences and modeled data is performed. Video database contained different types of aerial vehicles: aircrafts, helicopters, and UAVs. The presented algorithm is able to detect aerial vehicles from several kilometers under regular daylight conditions.
Automated detection of videotaped neonatal seizures of epileptic origin.
Karayiannis, Nicolaos B; Xiong, Yaohua; Tao, Guozhi; Frost, James D; Wise, Merrill S; Hrachovy, Richard A; Mizrahi, Eli M
2006-06-01
This study aimed at the development of a seizure-detection system by training neural networks with quantitative motion information extracted from short video segments of neonatal seizures of the myoclonic and focal clonic types and random infant movements. The motion of the infants' body parts was quantified by temporal motion-strength signals extracted from video segments by motion-segmentation methods based on optical flow computation. The area of each frame occupied by the infants' moving body parts was segmented by clustering the motion parameters obtained by fitting an affine model to the pixel velocities. The motion of the infants' body parts also was quantified by temporal motion-trajectory signals extracted from video recordings by robust motion trackers based on block-motion models. These motion trackers were developed to adjust autonomously to illumination and contrast changes that may occur during the video-frame sequence. Video segments were represented by quantitative features obtained by analyzing motion-strength and motion-trajectory signals in both the time and frequency domains. Seizure recognition was performed by conventional feed-forward neural networks, quantum neural networks, and cosine radial basis function neural networks, which were trained to detect neonatal seizures of the myoclonic and focal clonic types and to distinguish them from random infant movements. The computational tools and procedures developed for automated seizure detection were evaluated on a set of 240 video segments of 54 patients exhibiting myoclonic seizures (80 segments), focal clonic seizures (80 segments), and random infant movements (80 segments). Regardless of the decision scheme used for interpreting the responses of the trained neural networks, all the neural network models exhibited sensitivity and specificity>90%. 
For one of the decision schemes proposed for interpreting the responses of the trained neural networks, the majority of the trained neural-network models exhibited sensitivity>90% and specificity>95%. In particular, cosine radial basis function neural networks achieved the performance targets of this phase of the project (i.e., sensitivity>95% and specificity>95%). The best among the motion segmentation and tracking methods developed in this study produced quantitative features that constitute a reliable basis for detecting neonatal seizures. The performance targets of this phase of the project were achieved by combining the quantitative features obtained by analyzing motion-strength signals with those produced by analyzing motion-trajectory signals. The computational procedures and tools developed in this study to perform off-line analysis of short video segments will be used in the next phase of this project, which involves the integration of these procedures and tools into a system that can process and analyze long video recordings of infants monitored for seizures in real time.
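A motion-strength signal of the kind analyzed in this study can be sketched with a frame-difference stand-in for optical flow; the resulting temporal signal is then available for the time- and frequency-domain feature extraction the study describes:

```python
import numpy as np

def motion_strength(frames):
    """Temporal motion-strength signal: one scalar per frame transition.
    Mean absolute frame difference is used here as a simple stand-in for the
    optical-flow magnitudes computed in the paper."""
    frames = np.asarray(frames, dtype=float)
    return np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

# A clip with movement in its first half and stillness in its second half.
rng = np.random.RandomState(0)
moving = rng.rand(20, 8, 8)      # changing content
still = np.ones((20, 8, 8))      # static content
s = motion_strength(np.concatenate([moving, still]))
print(s[:10].mean() > s[-10:].mean())  # True: strong motion, then none
```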
Deep residual networks for automatic segmentation of laparoscopic videos of the liver
NASA Astrophysics Data System (ADS)
Gibson, Eli; Robu, Maria R.; Thompson, Stephen; Edwards, P. Eddie; Schneider, Crispin; Gurusamy, Kurinchi; Davidson, Brian; Hawkes, David J.; Barratt, Dean C.; Clarkson, Matthew J.
2017-03-01
Motivation: For primary and metastatic liver cancer patients undergoing liver resection, a laparoscopic approach can reduce recovery times and morbidity while offering equivalent curative results; however, only about 10% of tumours reside in anatomical locations that are currently accessible for laparoscopic resection. Augmenting laparoscopic video with registered vascular anatomical models from pre-procedure imaging could support using laparoscopy in a wider population. Segmentation of liver tissue on laparoscopic video supports the robust registration of anatomical liver models by filtering out false anatomical correspondences between pre-procedure and intra-procedure images. In this paper, we present a convolutional neural network (CNN) approach to liver segmentation in laparoscopic liver procedure videos. Method: We defined a CNN architecture comprising fully-convolutional deep residual networks with multi-resolution loss functions. The CNN was trained in a leave-one-patient-out cross-validation on 2050 video frames from 6 liver resections and 7 laparoscopic staging procedures, and evaluated using the Dice score. Results: The CNN yielded segmentations with Dice scores >=0.95 for the majority of images; however, the inter-patient variability in median Dice score was substantial. Four failure modes were identified from low scoring segmentations: minimal visible liver tissue, inter-patient variability in liver appearance, automatic exposure correction, and pathological liver tissue that mimics non-liver tissue appearance. Conclusion: CNNs offer a feasible approach for accurately segmenting liver from other anatomy on laparoscopic video, but additional data or computational advances are necessary to address challenges due to the high inter-patient variability in liver appearance.
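The Dice score used for evaluation above is defined over binary masks as 2|A∩B| / (|A| + |B|); a minimal implementation:

```python
import numpy as np

def dice(pred, gt):
    """Dice similarity coefficient between two binary masks:
    2|A∩B| / (|A| + |B|); 1.0 means perfect overlap."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

gt = np.zeros((10, 10), bool); gt[2:8, 2:8] = True       # 36-pixel reference mask
pred = np.zeros((10, 10), bool); pred[3:8, 2:8] = True   # 30-pixel prediction
print(round(dice(pred, gt), 3))  # 0.909
```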
Smoke regions extraction based on two steps segmentation and motion detection in early fire
NASA Astrophysics Data System (ADS)
Jian, Wenlin; Wu, Kaizhi; Yu, Zirong; Chen, Lijuan
2018-03-01
Aiming at the problem of early video-based smoke detection in fire videos, this paper proposes a method to extract suspected smoke regions by combining two-step segmentation and motion characteristics. Early smoldering smoke appears as gray or gray-white regions. In the first stage, regions of interest (ROIs) containing smoke are obtained using a two-step segmentation method. Then, suspected smoke regions are detected by combining the two-step segmentation with motion detection. Finally, morphological processing is used for smoke region extraction. The Otsu algorithm is used as the segmentation method, and the ViBe algorithm is used to detect the motion of smoke. The proposed method was tested on 6 videos containing smoke. The experimental results, verified by visual observation, show the effectiveness of the proposed method.
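The intersection of a gray/gray-white color test with a motion mask can be sketched as follows; the saturation and brightness thresholds are illustrative, and the precomputed motion mask stands in for the ViBe detector's output:

```python
import numpy as np

def smoke_candidates(frame, motion_mask, sat_max=0.15, val_min=0.4):
    """Keep pixels that are both moving and gray/gray-white (low saturation,
    medium-to-high brightness). Thresholds are illustrative, not from the paper.
    frame: (H, W, 3) RGB in [0, 1]; motion_mask: (H, W) boolean."""
    frame = np.asarray(frame, dtype=float)
    mx = frame.max(axis=-1)
    mn = frame.min(axis=-1)
    sat = np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-9), 0.0)
    grayish = (sat < sat_max) & (mx > val_min)
    return grayish & motion_mask

frame = np.zeros((4, 4, 3)); frame[:2] = [0.7, 0.7, 0.72]  # gray-white patch on top
motion = np.zeros((4, 4), bool); motion[:, :2] = True      # moving left columns
mask = smoke_candidates(frame, motion)
print(int(mask.sum()))  # 4: only pixels that are moving AND gray-white
```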
News video story segmentation method using fusion of audio-visual features
NASA Astrophysics Data System (ADS)
Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang
2007-11-01
News story segmentation is an important aspect of news video analysis. This paper presents a method for news video story segmentation. Different from prior works, which are based on visual features, the proposed technique uses audio features as a baseline and fuses visual features with them to refine the results. First, it selects silence clips as audio feature candidate points, and selects shot boundaries and anchor shots as two kinds of visual feature candidate points. Then audio feature candidates are used as cues, and different fusion methods are developed that effectively use the diverse visual candidates to refine the audio candidates and obtain story boundaries. Experimental results show that this method has high efficiency and adaptability to different kinds of news video.
Science documentary video slides to enhance education and communication
NASA Astrophysics Data System (ADS)
Byrne, J. M.; Little, L. J.; Dodgson, K.
2010-12-01
Documentary production can convey powerful messages using a combination of authentic science and reinforcing video imagery. Conventional documentary production contains too much information for many viewers to follow; hence many powerful points may be lost. But documentary productions that are re-edited into short video sequences and made available through web based video servers allow the teacher/viewer to access the material as video slides. Each video slide contains one critical discussion segment of the larger documentary. A teacher/viewer can review the documentary one segment at a time in a class room, public forum, or in the comfort of home. The sequential presentation of the video slides allows the viewer to best absorb the documentary message. The website environment provides space for additional questions and discussion to enhance the video message.
Cooperative multisensor system for real-time face detection and tracking in uncontrolled conditions
NASA Astrophysics Data System (ADS)
Marchesotti, Luca; Piva, Stefano; Turolla, Andrea; Minetti, Deborah; Regazzoni, Carlo S.
2005-03-01
The presented work describes an innovative architecture for multi-sensor distributed video surveillance applications. The aim of the system is to track moving objects in outdoor environments with a cooperative strategy exploiting two video cameras. The system also exhibits the capacity of focusing its attention on the faces of detected pedestrians, collecting snapshot frames of face images by segmenting and tracking them over time at different resolutions. The system is designed to employ two video cameras in a cooperative client/server structure: the first camera monitors the entire area of interest and detects the moving objects using change detection techniques. The detected objects are tracked over time and their position is indicated on a map representing the monitored area. The objects' coordinates are sent to the server sensor in order to point its zooming optics towards the moving object. The second camera tracks the objects at high resolution. Like the client camera, this sensor is calibrated, and the position of an object detected on the image plane reference system is translated into coordinates referred to the same area map. In the map's common reference system, data fusion techniques are applied to achieve a more precise and robust estimation of the objects' tracks and to perform face detection and tracking. The novelty and strength of the work reside in the cooperative multi-sensor approach, in the high-resolution long-distance tracking, and in the automatic collection of biometric data, such as a person's face clip, for recognition purposes.
Automated detection of videotaped neonatal seizures based on motion segmentation methods.
Karayiannis, Nicolaos B; Tao, Guozhi; Frost, James D; Wise, Merrill S; Hrachovy, Richard A; Mizrahi, Eli M
2006-07-01
This study was aimed at the development of a seizure detection system by training neural networks using quantitative motion information extracted by motion segmentation methods from short video recordings of infants monitored for seizures. The motion of the infants' body parts was quantified by temporal motion strength signals extracted from video recordings by motion segmentation methods based on optical flow computation. The area of each frame occupied by the infants' moving body parts was segmented by direct thresholding, by clustering of the pixel velocities, and by clustering the motion parameters obtained by fitting an affine model to the pixel velocities. The computational tools and procedures developed for automated seizure detection were tested and evaluated on 240 short video segments selected and labeled by physicians from a set of video recordings of 54 patients exhibiting myoclonic seizures (80 segments), focal clonic seizures (80 segments), and random infant movements (80 segments). The experimental study described in this paper provided the basis for selecting the most effective strategy for training neural networks to detect neonatal seizures as well as the decision scheme used for interpreting the responses of the trained neural networks. Depending on the decision scheme used for interpreting the responses of the trained neural networks, the best neural networks exhibited sensitivity above 90% or specificity above 90%. The best among the motion segmentation methods developed in this study produced quantitative features that constitute a reliable basis for detecting myoclonic and focal clonic neonatal seizures. The performance targets of this phase of the project may be achieved by combining the quantitative features described in this paper with those obtained by analyzing motion trajectory signals produced by motion tracking methods. A video system based upon automated analysis potentially offers a number of advantages. 
Infants who are at risk for seizures could be monitored continuously using relatively inexpensive and non-invasive video techniques that supplement direct observation by nursery personnel. This would represent a major advance in seizure surveillance and offers the possibility for earlier identification of potential neurological problems and subsequent intervention.
Collaborative real-time motion video analysis by human observer and image exploitation algorithms
NASA Astrophysics Data System (ADS)
Hild, Jutta; Krüger, Wolfgang; Brüstle, Stefan; Trantelle, Patrick; Unmüßig, Gabriel; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen
2015-05-01
Motion video analysis is a challenging task, especially in real-time applications. In most safety and security critical applications, a human observer is an obligatory part of the overall analysis system. Over the last years, substantial progress has been made in the development of automated image exploitation algorithms. Hence, we investigate how the benefits of automated video analysis can be integrated suitably into the current video exploitation systems. In this paper, a system design is introduced which strives to combine both the qualities of the human observer's perception and the automated algorithms, thus aiming to improve the overall performance of a real-time video analysis system. The system design builds on prior work where we showed the benefits for the human observer by means of a user interface which utilizes the human visual focus of attention revealed by the eye gaze direction for interaction with the image exploitation system; eye tracker-based interaction allows much faster, more convenient, and equally precise moving target acquisition in video images than traditional computer mouse selection. The system design also builds on prior work we did on automated target detection, segmentation, and tracking algorithms. Beside the system design, a first pilot study is presented, where we investigated how the participants (all non-experts in video analysis) performed in initializing an object tracking subsystem by selecting a target for tracking. Preliminary results show that the gaze + key press technique is an effective, efficient, and easy to use interaction technique when performing selection operations on moving targets in videos in order to initialize an object tracking function.
Video segmentation and camera motion characterization using compressed data
NASA Astrophysics Data System (ADS)
Milanese, Ruggero; Deguillaume, Frederic; Jacot-Descombes, Alain
1997-10-01
We address the problem of automatically extracting visual indexes from videos, in order to provide sophisticated access methods to the contents of a video server. We focus on two tasks, namely the decomposition of a video clip into uniform segments, and the characterization of each shot by camera motion parameters. For the first task we use a Bayesian classification approach to detect scene cuts by analyzing motion vectors. For the second task a least-squares fitting procedure determines the pan/tilt/zoom camera parameters. To guarantee the highest processing speed, all techniques process and analyze MPEG-1 motion vectors directly, without the need for video decompression. Experimental results are reported for a database of news video clips.
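The least-squares camera-parameter fit can be sketched with a simplified linear model (u = pan + zoom·x, v = tilt + zoom·y); the actual model fitted to MPEG-1 motion vectors may include more terms, so this is only an illustration of the fitting procedure:

```python
import numpy as np

def fit_pan_tilt_zoom(xy, uv):
    """Least-squares fit of a simplified pan/tilt/zoom model to block motion
    vectors: u = pan + zoom*x, v = tilt + zoom*y.
    xy: (N, 2) block centers; uv: (N, 2) motion vectors."""
    x, y = xy[:, 0], xy[:, 1]
    u, v = uv[:, 0], uv[:, 1]
    # Stack both equations; the unknowns are (pan, tilt, zoom).
    A = np.block([[np.ones_like(x)[:, None], np.zeros_like(x)[:, None], x[:, None]],
                  [np.zeros_like(y)[:, None], np.ones_like(y)[:, None], y[:, None]]])
    b = np.concatenate([u, v])
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return params  # (pan, tilt, zoom)

# Synthetic motion field with pan=2, tilt=-1, zoom=0.1 is recovered exactly.
rng = np.random.RandomState(1)
xy = rng.uniform(-1, 1, size=(50, 2))
uv = np.column_stack([2.0 + 0.1 * xy[:, 0], -1.0 + 0.1 * xy[:, 1]])
pan, tilt, zoom = fit_pan_tilt_zoom(xy, uv)
print(round(pan, 3), round(tilt, 3), round(zoom, 3))  # 2.0 -1.0 0.1
```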
Temporally coherent 4D video segmentation for teleconferencing
NASA Astrophysics Data System (ADS)
Ehmann, Jana; Guleryuz, Onur G.
2013-09-01
We develop an algorithm for 4-D (RGB+Depth) video segmentation targeting immersive teleconferencing applications on emerging mobile devices. Our algorithm extracts users from their environments and places them onto virtual backgrounds, similar to green-screening. The virtual backgrounds increase immersion and interactivity, relieving the users of the system from distractions caused by disparate environments. Commodity depth sensors, while providing useful information for segmentation, result in noisy depth maps with a large number of missing depth values. By combining depth and RGB information, our work significantly improves the otherwise very coarse segmentation. Further imposing temporal coherence yields compositions where the foregrounds seamlessly blend with the virtual backgrounds with minimal flicker and other artifacts. We achieve said improvements by correcting the missing information in depth maps before fast RGB-based segmentation, which operates in conjunction with temporal coherence. Simulation results indicate the efficacy of the proposed system in video conferencing scenarios.
An intelligent crowdsourcing system for forensic analysis of surveillance video
NASA Astrophysics Data System (ADS)
Tahboub, Khalid; Gadgil, Neeraj; Ribera, Javier; Delgado, Blanca; Delp, Edward J.
2015-03-01
Video surveillance systems are of great value for public safety. With an exponential increase in the number of cameras, videos obtained from surveillance systems are often archived for forensic purposes. Many automatic methods have been proposed to do video analytics such as anomaly detection and human activity recognition. However, such methods face significant challenges due to object occlusions, shadows and scene illumination changes. In recent years, crowdsourcing has become an effective tool that utilizes human intelligence to perform tasks that are challenging for machines. In this paper, we present an intelligent crowdsourcing system for forensic analysis of surveillance video, including video recorded as part of search and rescue missions and large-scale investigation tasks. We describe a method to enhance crowdsourcing by incorporating human detection, re-identification and tracking. At the core of our system, we use a hierarchical pyramid model to distinguish the crowd members based on their ability, experience and performance record. Our proposed system operates in an autonomous fashion and produces a final output of the crowdsourcing analysis consisting of a set of video segments detailing the events of interest as one storyline.
Multilevel wireless capsule endoscopy video segmentation
NASA Astrophysics Data System (ADS)
Hwang, Sae; Celebi, M. Emre
2010-03-01
Wireless Capsule Endoscopy (WCE) is a relatively new technology (FDA approved in 2002) allowing doctors to view most of the small intestine. WCE transmits more than 50,000 video frames per examination, and the visual inspection of the resulting video is a highly time-consuming task even for the experienced gastroenterologist. Typically, a medical clinician spends one or two hours to analyze a WCE video. To reduce the assessment time, it is critical to develop a technique to automatically discriminate digestive organs and group the video into shots, each consisting of the same or similar content. In this paper a multi-level WCE video segmentation methodology is presented to reduce the examination time.
WCE video segmentation using textons
NASA Astrophysics Data System (ADS)
Gallo, Giovanni; Granata, Eliana
2010-03-01
Wireless Capsule Endoscopy (WCE) integrates wireless transmission with image and video technology. It has been used to examine the small intestine non-invasively. Medical specialists look for significant events in the WCE video by direct visual inspection, manually labelling clinically relevant frames in tiring sessions of up to one hour. This limits the usage of WCE. Automatically discriminating digestive organs such as the esophagus, stomach, small intestine and colon would therefore be of great advantage. In this paper we propose to use textons for the automatic discrimination of abrupt changes within a video. In particular, we consider, as features for each frame, hue, saturation, value, high-frequency energy content and the responses to a bank of Gabor filters. The experiments have been conducted on ten video segments extracted from WCE videos, in which the significant events had previously been labelled by experts. Results have shown that the proposed method may eliminate up to 70% of the frames from further investigation. The direct analysis of the doctors may hence be concentrated only on eventful frames. A graphical tool showing sudden changes in the texton frequencies for each frame is also proposed as a visual aid to find clinically relevant segments of the video.
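As a minimal illustration of the kind of per-frame features the abstract lists, the sketch below computes mean saturation, mean value, and a high-frequency energy proxy (variance of a Laplacian of the value channel); hue and the Gabor filter bank are omitted for brevity. This is a hedged stand-in, not the authors' implementation.

```python
import numpy as np

def frame_features(rgb):
    """Per-frame features: mean saturation, mean value, high-frequency energy.
    rgb: HxWx3 float array with values in [0, 1]."""
    v = rgb.max(axis=-1)                                # value channel
    c = v - rgb.min(axis=-1)                            # chroma
    s = np.where(v > 0, c / np.maximum(v, 1e-9), 0.0)   # saturation
    # High-frequency energy proxy: variance of a 5-point Laplacian of V.
    lap = (-4 * v
           + np.roll(v, 1, 0) + np.roll(v, -1, 0)
           + np.roll(v, 1, 1) + np.roll(v, -1, 1))
    return np.array([s.mean(), v.mean(), lap.var()])
```

Abrupt changes between consecutive frames could then be flagged by thresholding the distance between successive feature vectors.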
Automated multiple target detection and tracking in UAV videos
NASA Astrophysics Data System (ADS)
Mao, Hongwei; Yang, Chenhui; Abousleman, Glen P.; Si, Jennie
2010-04-01
In this paper, a novel system is presented to detect and track multiple targets in Unmanned Air Vehicles (UAV) video sequences. Since the output of the system is based on target motion, we first segment foreground moving areas from the background in each video frame using background subtraction. To stabilize the video, a multi-point-descriptor-based image registration method is performed where a projective model is employed to describe the global transformation between frames. For each detected foreground blob, an object model is used to describe its appearance and motion information. Rather than immediately classifying the detected objects as targets, we track them for a certain period of time and only those with qualified motion patterns are labeled as targets. In the subsequent tracking process, a Kalman filter is assigned to each tracked target to dynamically estimate its position in each frame. Blobs detected at a later time are used as observations to update the state of the tracked targets to which they are associated. The proposed overlap-rate-based data association method considers the splitting and merging of the observations, and therefore is able to maintain tracks more consistently. Experimental results demonstrate that the system performs well on real-world UAV video sequences. Moreover, careful consideration given to each component in the system has made the proposed system feasible for real-time applications.
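The per-target Kalman filter described above can be sketched as one constant-velocity predict/update cycle; the state layout and noise parameters below are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

def kalman_step(x, P, z, dt=1.0, q=1e-2, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter.
    x: state [px, py, vx, vy]; P: 4x4 covariance; z: observed [px, py]."""
    F = np.eye(4); F[0, 2] = F[1, 3] = dt           # constant-velocity model
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0   # observe position only
    Q = q * np.eye(4); R = r * np.eye(2)
    # Predict the target's next position from its current state.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the associated blob observation.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

In a tracker of this kind, each blob detected in a later frame would be associated to a track and fed in as `z`.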
A content-based news video retrieval system: NVRS
NASA Astrophysics Data System (ADS)
Liu, Huayong; He, Tingting
2009-10-01
This paper focuses on TV news programs and designs a content-based news video browsing and retrieval system, NVRS, which makes it convenient for users to quickly browse and retrieve news video by different categories such as politics, finance, amusement, etc. Combining audiovisual features and caption text information, the system automatically segments a complete news program into separate news stories. NVRS supports keyword-based news story retrieval and category-based news story browsing, and generates a key-frame-based video abstract for each story. Experiments show that the story segmentation method is effective and the retrieval is also efficient.
Integrated approach to multimodal media content analysis
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1999-12-01
In this work, we present a system for the automatic segmentation, indexing and retrieval of audiovisual data based on the combination of audio, visual and textual content analysis. The video stream is demultiplexed into audio, image and caption components. Then, a semantic segmentation of the audio signal based on audio content analysis is conducted, and each segment is indexed as one of the basic audio types. The image sequence is segmented into shots based on visual information analysis, and keyframes are extracted from each shot. Meanwhile, keywords are detected from the closed caption. Index tables are designed for both linear and non-linear access to the video. Experiments show that the proposed methods for multimodal media content analysis are effective and that the integrated framework achieves satisfactory results for video information filtering and retrieval.
Assessment of Fall Characteristics From Depth Sensor Videos.
O'Connor, Jennifer J; Phillips, Lorraine J; Folarinde, Bunmi; Alexander, Gregory L; Rantz, Marilyn
2017-07-01
Falls are a major source of death and disability in older adults; little data, however, are available about the etiology of falls in community-dwelling older adults. Sensor systems installed in independent and assisted living residences of 105 older adults participating in an ongoing technology study were programmed to record live videos of probable fall events. Sixty-four fall video segments from 19 individuals were viewed and rated using the Falls Video Assessment Questionnaire. Raters identified that 56% (n = 36) of falls were due to an incorrect shift of body weight and 27% (n = 17) from losing support of an external object, such as an unlocked wheelchair or rolling walker. In 60% of falls, mobility aids were in the room or in use at the time of the fall. Use of environmentally embedded sensors provides a mechanism for real-time fall detection and, ultimately, may supply information to clinicians for fall prevention interventions. [Journal of Gerontological Nursing, 43(7), 13-19.]. Copyright 2017, SLACK Incorporated.
Audio-based queries for video retrieval over Java enabled mobile devices
NASA Astrophysics Data System (ADS)
Ahmad, Iftikhar; Cheikh, Faouzi Alaya; Kiranyaz, Serkan; Gabbouj, Moncef
2006-02-01
In this paper we propose a generic framework for efficient retrieval of audiovisual media based on its audio content. This framework is implemented in a client-server architecture, where the client application is developed in Java to be platform independent, whereas the server application is implemented for the PC platform. The client application adapts to the characteristics of the mobile device where it runs, such as screen size and commands. The entire framework is designed to take advantage of the high-level segmentation and classification of audio content to improve the speed and accuracy of audio-based media retrieval. Therefore, the primary objective of this framework is to provide an adaptive basis for performing efficient video retrieval operations based on the audio content and types (i.e. speech, music, fuzzy and silence). Experimental results confirm that such an audio-based video retrieval scheme can be used from mobile devices to search and retrieve video clips efficiently over wireless networks.
Computer aided diagnosis of diabetic peripheral neuropathy
NASA Astrophysics Data System (ADS)
Chekh, Viktor; Soliz, Peter; McGrew, Elizabeth; Barriga, Simon; Burge, Mark; Luan, Shuang
2014-03-01
Diabetic peripheral neuropathy (DPN) refers to the nerve damage that can occur in diabetes patients. It most often affects the extremities, such as the feet, and can lead to peripheral vascular disease, deformity, infection, ulceration, and even amputation. The key to managing the diabetic foot is prevention and early detection. Unfortunately, existing diagnostic techniques are mostly based on patient sensations and exhibit significant inter- and intra-observer differences. We have developed a computer aided diagnostic (CAD) system for diabetic peripheral neuropathy. The thermal response of the feet of diabetic patients following a cold stimulus is captured using an infrared camera. The plantar foot in the images from the thermal video is segmented and registered for tracking points or specific regions. The temperature recovery of each point on the plantar foot is extracted using our bio-thermal model and analyzed. Regions that exhibit an abnormal recovery response are automatically identified to help physicians recognize problematic areas. The key to our CAD system is the segmentation of the infrared video. The main challenges for segmenting infrared video compared to normal digital video are (1) as the foot warms up, it also warms up its surroundings, creating an ever-changing contrast; and (2) there may be significant motion during imaging. To overcome this, a hybrid segmentation algorithm was developed based on a number of techniques such as continuous max-flow, model based segmentation, shape preservation, convex hull, and temperature normalization. Verification of the automatic segmentation and registration against manual segmentation and markers shows good agreement.
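The paper's bio-thermal model is not given in the abstract. As a hedged stand-in, a single-exponential recovery model T(t) = T_inf + (T0 - T_inf)·exp(-t/tau) can be fit per tracked point by log-linear least squares; points with an unusually large time constant could then be flagged as slow to recover.

```python
import numpy as np

def recovery_time_constant(t, temp, t_inf):
    """Estimate the recovery time constant tau from temperatures `temp`
    sampled at times `t`, assuming single-exponential recovery toward
    the ambient/asymptotic temperature `t_inf`.
    The log of |temp - t_inf| is linear in t with slope -1/tau."""
    y = np.log(np.abs(temp - t_inf))
    slope, _ = np.polyfit(t, y, 1)
    return -1.0 / slope
```

This is only a sketch of one plausible per-point analysis, not the authors' bio-thermal model.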
Automatic summarization of soccer highlights using audio-visual descriptors.
Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc
2015-01-01
Automatic summary generation for sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that are further analyzed to determine their relevance and interest. Of special interest in the approach is the use of audio information, which provides additional robustness to the overall performance of the summarization system. For every video shot, a set of low- and mid-level audio-visual descriptors is computed and later combined to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with the highest interest according to the specifications of the user and the results of the relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.
Brain activity and desire for internet video game play
Han, Doug Hyun; Bolo, Nicolas; Daniels, Melissa A.; Arenella, Lynn; Lyoo, In Kyoon; Renshaw, Perry F.
2010-01-01
Objective Recent studies have suggested that the brain circuitry mediating cue-induced desire for video games is similar to that elicited by cues related to drugs and alcohol. We hypothesized that desire for internet video games during cue presentation would activate brain regions similar to those which have been linked with craving for drugs or pathological gambling. Methods This study involved the acquisition of diagnostic MRI and fMRI data from 19 healthy male adults (ages 18–23 years) following training and a standardized 10-day period of game play with a specified novel internet video game, “War Rock” (K-network®). Using a videotape consisting of five contiguous 90-second segments of alternating resting, matched control and video game-related scenes, desire to play the game was assessed using a seven-point visual analogue scale before and after presentation of the videotape. Results In responding to internet video game stimuli, compared to neutral control stimuli, significantly greater activity was identified in left inferior frontal gyrus, left parahippocampal gyrus, right and left parietal lobe, right and left thalamus, and right cerebellum (FDR <0.05, p<0.009243). Self-reported desire was positively correlated with the beta values of left inferior frontal gyrus, left parahippocampal gyrus, and right and left thalamus. Compared to general players, the cohort of members who played more internet video games (MIGP) showed significantly greater activity in right medial frontal lobe, right and left frontal pre-central gyrus, right parietal post-central gyrus, right parahippocampal gyrus, and left parietal precuneus gyrus. Controlling for total game time, reported desire for the internet video game in the MIGP cohort was positively correlated with activation in right medial frontal lobe and right parahippocampal gyrus.
Discussion The present findings suggest that cue-induced activation to internet video game stimuli may be similar to that observed during cue presentation in persons with substance dependence or pathological gambling. In particular, cues appear to commonly elicit activity in the dorsolateral prefrontal, orbitofrontal cortex, parahippocampal gyrus, and thalamus. PMID:21220070
Event completion: event based inferences distort memory in a matter of seconds.
Strickland, Brent; Keil, Frank
2011-12-01
We present novel evidence that implicit causal inferences distort memory for events only seconds after viewing. Adults watched videos of someone launching (or throwing) an object. However, the videos omitted the moment of contact (or release). Subjects falsely reported seeing the moment of contact when it was implied by subsequent footage but did not do so when the contact was not implied. Causal implications were disrupted either by replacing the resulting flight of the ball with irrelevant video or by scrambling event segments. Subjects in the different causal implication conditions did not differ on false alarms for other moments of the event, nor did they differ in general recognition accuracy. These results suggest that as people perceive events, they generate rapid conceptual interpretations that can have a powerful effect on how events are remembered. Copyright © 2011 Elsevier B.V. All rights reserved.
An Objective Comparison of Cell Tracking Algorithms
Ulman, Vladimír; Maška, Martin; Magnusson, Klas E. G.; Ronneberger, Olaf; Haubold, Carsten; Harder, Nathalie; Matula, Pavel; Matula, Petr; Svoboda, David; Radojevic, Miroslav; Smal, Ihor; Rohr, Karl; Jaldén, Joakim; Blau, Helen M.; Dzyubachyk, Oleh; Lelieveldt, Boudewijn; Xiao, Pengdong; Li, Yuexiang; Cho, Siu-Yeung; Dufour, Alexandre C.; Olivo-Marin, Jean-Christophe; Reyes-Aldasoro, Constantino C.; Solis-Lemus, Jose A.; Bensch, Robert; Brox, Thomas; Stegmaier, Johannes; Mikut, Ralf; Wolf, Steffen; Hamprecht, Fred. A.; Esteves, Tiago; Quelhas, Pedro; Demirel, Ömer; Malmström, Lars; Jug, Florian; Tomancak, Pavel; Meijering, Erik; Muñoz-Barrutia, Arrate; Kozubek, Michal; Ortiz-de-Solorzano, Carlos
2017-01-01
We present a combined report on the results of three editions of the Cell Tracking Challenge, an ongoing initiative aimed at promoting the development and objective evaluation of cell tracking algorithms. With twenty-one participating algorithms and a data repository consisting of thirteen datasets of various microscopy modalities, the challenge displays today’s state of the art in the field. We analyze the results using performance measures for segmentation and tracking that rank all participating methods. We also analyze the performance of all algorithms in terms of biological measures and their practical usability. Even though some methods score high in all technical aspects, not a single one obtains fully correct solutions. We show that methods that either take prior information into account using learning strategies or analyze cells in a global spatio-temporal video context perform better than other methods under the segmentation and tracking scenarios included in the challenge. PMID:29083403
Echocardiogram video summarization
NASA Astrophysics Data System (ADS)
Ebadollahi, Shahram; Chang, Shih-Fu; Wu, Henry D.; Takoma, Shin
2001-05-01
This work aims at developing innovative algorithms and tools for summarizing echocardiogram videos. Specifically, we summarize digital echocardiogram videos by temporally segmenting them into their constituent views and representing each view by its most informative frame. For the segmentation we take advantage of the well-defined spatio-temporal structure of echocardiogram videos. Two different criteria are used: presence/absence of color and the shape of the region of interest (ROI) in each frame of the video. The change in the ROI is due to the different echocardiogram modes present in one study. The representative frame is defined to be the frame corresponding to the end-diastole of the heart cycle. To locate the end-diastole, we track the ECG of each frame to find the exact time the time-marker on the ECG crosses the peak of the R-wave; the corresponding frame is chosen to be the key-frame. The entire echocardiogram video can be summarized into either a static summary, which is a storyboard type of summary, or a dynamic summary, which is a concatenation of selected segments of the echocardiogram video. To the best of our knowledge, this is the first automated system for summarizing echocardiogram videos based on visual content.
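Locating R-wave peaks in a sampled ECG trace can be sketched as simple thresholded local-maximum detection; this is a minimal stand-in for the paper's time-marker tracking, with a hypothetical threshold parameter.

```python
import numpy as np

def r_wave_peaks(ecg, threshold):
    """Return indices of local maxima in `ecg` that exceed `threshold`.
    R-waves are the tallest deflections, so a suitable threshold isolates
    them; each peak index would map to a candidate key-frame."""
    ecg = np.asarray(ecg, dtype=float)
    interior = ecg[1:-1]
    is_peak = (interior > ecg[:-2]) & (interior > ecg[2:]) & (interior > threshold)
    return np.flatnonzero(is_peak) + 1   # +1 compensates for the slice offset
```

A production detector would add bandpass filtering and refractory-period logic; the sketch only conveys the key-frame selection idea.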
The video watermarking container: efficient real-time transaction watermarking
NASA Astrophysics Data System (ADS)
Wolf, Patrick; Hauer, Enrico; Steinebach, Martin
2008-02-01
When transaction watermarking is used to secure sales in online shops by embedding transaction-specific watermarks, the major challenge is embedding efficiency: maximum speed at minimal workload. This is true for all types of media. Video transaction watermarking presents a double challenge. Video files not only are larger than, for example, music files of the same playback time; in addition, video watermarking algorithms have a higher complexity than algorithms for other types of media. Therefore, online shops that want to protect their videos by transaction watermarking are faced with the problem that their servers need to work harder and longer for every sold medium in comparison to audio sales. In the past, many algorithms responded to this challenge by reducing their complexity, but this usually results in a loss of either robustness or transparency. This paper presents a different approach. The container technology separates watermark embedding into two stages: a preparation stage and a finalization stage. In the preparation stage, the video is divided into embedding segments. For each segment, one copy marked with "0" and another marked with "1" is created. This stage is computationally expensive but only needs to be done once. In the finalization stage, the watermarked video is assembled from the embedding segments according to the watermark message. This stage is very fast and involves no complex computations. It thus allows efficient creation of individually watermarked video files.
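The finalization stage amounts to selecting, per embedding segment, the pre-marked copy that matches the corresponding message bit. The sketch below illustrates this with abstract segment payloads; the padding choice for leftover segments is a hypothetical detail, not taken from the paper.

```python
def assemble_watermarked(segments_0, segments_1, message_bits):
    """Finalization stage: for each embedding segment, pick the copy
    pre-marked with the corresponding watermark message bit.
    segments_0 / segments_1 are parallel lists of "0"- and "1"-marked
    copies produced once in the preparation stage."""
    assert len(segments_0) == len(segments_1) >= len(message_bits)
    out = [segments_1[i] if bit else segments_0[i]
           for i, bit in enumerate(message_bits)]
    # Remaining segments might carry a repeated or padded message; here we
    # simply append the "0"-marked copies (an illustrative assumption).
    out.extend(segments_0[len(message_bits):])
    return out
```

Because this stage only selects and concatenates precomputed data, each sold copy costs almost nothing to produce.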
NASA Astrophysics Data System (ADS)
Maurer, Calvin R., Jr.; Sauer, Frank; Hu, Bo; Bascle, Benedicte; Geiger, Bernhard; Wenzel, Fabian; Recchi, Filippo; Rohlfing, Torsten; Brown, Christopher R.; Bakos, Robert J.; Maciunas, Robert J.; Bani-Hashemi, Ali R.
2001-05-01
We are developing a video see-through head-mounted display (HMD) augmented reality (AR) system for image-guided neurosurgical planning and navigation. The surgeon wears a HMD that presents him with the augmented stereo view. The HMD is custom fitted with two miniature color video cameras that capture a stereo view of the real-world scene. We are concentrating specifically at this point on cranial neurosurgery, so the images will be of the patient's head. A third video camera, operating in the near infrared, is also attached to the HMD and is used for head tracking. The pose (i.e., position and orientation) of the HMD is used to determine where to overlay anatomic structures segmented from preoperative tomographic images (e.g., CT, MR) on the intraoperative video images. Two SGI 540 Visual Workstation computers process the three video streams and render the augmented stereo views for display on the HMD. The AR system operates in real time at 30 frames/sec with a temporal latency of about three frames (100 ms) and zero relative lag between the virtual objects and the real-world scene. For an initial evaluation of the system, we created AR images using a head phantom with actual internal anatomic structures (segmented from CT and MR scans of a patient) realistically positioned inside the phantom. When using shaded renderings, many users had difficulty appreciating overlaid brain structures as being inside the head. When using wire frames, and texture-mapped dot patterns, most users correctly visualized brain anatomy as being internal and could generally appreciate spatial relationships among various objects. The 3D perception of these structures is based on both stereoscopic depth cues and kinetic depth cues, with the user looking at the head phantom from varying positions. The perception of the augmented visualization is natural and convincing. The brain structures appear rigidly anchored in the head, manifesting little or no apparent swimming or jitter. 
The initial evaluation of the system is encouraging, and we believe that AR visualization might become an important tool for image-guided neurosurgical planning and navigation.
Real-time reliability measure-driven multi-hypothesis tracking using 2D and 3D features
NASA Astrophysics Data System (ADS)
Zúñiga, Marcos D.; Brémond, François; Thonnat, Monique
2011-12-01
We propose a new multi-target tracking approach, which is able to reliably track multiple objects even with poor segmentation results due to noisy environments. The approach takes advantage of a new dual object model combining 2D and 3D features through reliability measures. In order to obtain these 3D features, a new classifier associates with each moving region an object class label (e.g. person, vehicle), a parallelepiped model, and visual reliability measures of its attributes. These reliability measures make it possible to properly weight the contribution of noisy, erroneous or false data in order to better maintain the integrity of the object dynamics model. Then, a new multi-target tracking algorithm uses these object descriptions to generate tracking hypotheses about the objects moving in the scene. This tracking approach is able to manage many-to-many visual target correspondences. To achieve this, the algorithm takes advantage of 3D models for merging dissociated visual evidence (moving regions) potentially corresponding to the same real object, according to previously obtained information. The tracking approach has been validated using publicly accessible video surveillance benchmarks. It runs in real time and its results are competitive compared with other tracking algorithms, with minimal (or no) reconfiguration effort between different videos.
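Reliability-weighted handling of noisy attributes can be sketched as a weighted fusion: estimates from low-reliability evidence contribute proportionally less. This is only an illustration of the weighting principle, not the paper's actual dual 2D/3D model.

```python
def fuse_attribute(values, reliabilities):
    """Reliability-weighted fusion of an object attribute (e.g. width)
    estimated from several noisy sources. Each value is weighted by its
    reliability measure in [0, 1]; zero total reliability yields None."""
    total = sum(reliabilities)
    if total == 0:
        return None
    return sum(v * r for v, r in zip(values, reliabilities)) / total
```

With equal reliabilities this reduces to a plain mean; a dubious measurement with low reliability barely moves the fused estimate.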
Video attention deviation estimation using inter-frame visual saliency map analysis
NASA Astrophysics Data System (ADS)
Feng, Yunlong; Cheung, Gene; Le Callet, Patrick; Ji, Yusheng
2012-01-01
A viewer's visual attention during video playback is the matching of his eye gaze movement to the changing video content over time. If the gaze movement matches the video content (e.g., following a rolling soccer ball), then the viewer keeps his visual attention. If the gaze location moves from one video object to another, then the viewer shifts his visual attention. A video that causes a viewer to shift his attention often is a "busy" video. Determining which video content is busy is an important practical problem; a busy video is difficult for an encoder to deploy region of interest (ROI)-based bit allocation on, and hard for a content provider to insert additional overlays like advertisements into, making the video even busier. One way to determine the busyness of video content is to conduct eye gaze experiments with a sizable group of test subjects, but this is time-consuming and cost-ineffective. In this paper, we propose an alternative method to determine the busyness of video, formally called video attention deviation (VAD): analyzing the spatial visual saliency maps of the video frames across time. We first derive transition probabilities of a Markov model for eye gaze using saliency maps of a number of consecutive frames. We then compute the steady state probability of the saccade state in the model, which is our estimate of VAD. We demonstrate that the computed steady state probability for saccade using saliency map analysis matches that computed using actual gaze traces for a range of videos with different degrees of busyness. Further, our analysis can also be used to segment video into shorter clips of different degrees of busyness by computing the Kullback-Leibler divergence using consecutive motion compensated saliency maps.
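Once the transition probabilities of the gaze Markov model are known, the steady-state probability of the saccade state can be computed by power iteration on the transition matrix. The sketch below shows that step in isolation; the saccade state index and the matrix values are illustrative assumptions.

```python
import numpy as np

def steady_state(P, iters=1000):
    """Steady-state distribution of a row-stochastic transition matrix P,
    found by repeatedly propagating a uniform initial distribution."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        pi = pi @ P
    return pi

# VAD estimate: the steady-state probability of the saccade state,
# e.g. steady_state(P)[saccade_index] for some assumed state ordering.
```

For an ergodic chain the iteration converges to the unique stationary distribution, so the VAD estimate does not depend on the starting distribution.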
Complete Scene Recovery and Terrain Classification in Textured Terrain Meshes
Song, Wei; Cho, Kyungeun; Um, Kyhyun; Won, Chee Sun; Sim, Sungdae
2012-01-01
Terrain classification allows a mobile robot to create an annotated map of its local environment from the three-dimensional (3D) and two-dimensional (2D) datasets collected by its array of sensors, including a GPS receiver, gyroscope, video camera, and range sensor. However, parts of objects that are outside the measurement range of the range sensor will not be detected. To overcome this problem, this paper describes an edge estimation method for complete scene recovery and complete terrain reconstruction. Here, the Gibbs-Markov random field is used to segment the ground from 2D videos and 3D point clouds. Further, a masking method is proposed to classify buildings and trees in a terrain mesh. PMID:23112653
An improvement analysis on video compression using file segmentation
NASA Astrophysics Data System (ADS)
Sharma, Shubhankar; Singh, K. John; Priya, M.
2017-11-01
Over the past two decades, the rapid evolution of the Internet has led to a massive rise in video technology and in video consumption over the Internet, which accounts for the bulk of data traffic in general. Because video occupies so much data on the World Wide Web, reducing the bandwidth it consumes would ease the burden on the Internet and let users access video data more easily. To this end, many video codecs have been developed, such as HEVC/H.265 and VP9. Choosing among such codecs raises the question of which offers the better technology in terms of rate distortion and coding standard. This paper offers a solution to the difficulty of achieving low delay in video compression and in video applications, e.g. ad-hoc video conferencing/streaming or surveillance. It also benchmarks the HEVC and VP9 video compression techniques with subjective evaluations of High Definition video content played back in web browsers. Moreover, it presents an experimental methodology of dividing the video file into several segments for compression and putting them back together, to improve the efficiency of video compression on the web as well as in offline mode.
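The split-compress-reassemble idea can be sketched generically with a byte stream and a generic compressor standing in for a video codec; segment-wise compression also enables parallel encoding of the pieces. This is an illustration of the scheme, not the paper's pipeline, and `zlib` here is only a placeholder for an actual video codec.

```python
import zlib

def segmented_compress(data, n_segments):
    """Split a byte stream into at most n_segments pieces and compress
    each piece independently (stand-in for per-segment video encoding)."""
    seg_len = -(-len(data) // n_segments)   # ceiling division
    segments = [data[i:i + seg_len] for i in range(0, len(data), seg_len)]
    return [zlib.compress(s) for s in segments]

def segmented_decompress(pieces):
    """Reassemble the original stream from independently compressed pieces."""
    return b"".join(zlib.decompress(p) for p in pieces)
```

Because each piece is self-contained, the pieces can be encoded concurrently and streamed as they finish, which is the low-delay motivation the paper describes.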
Stochastic modeling of soundtrack for efficient segmentation and indexing of video
NASA Astrophysics Data System (ADS)
Naphade, Milind R.; Huang, Thomas S.
1999-12-01
Tools for efficient and intelligent management of digital content are essential for digital video data management. An extremely challenging research area in this context is that of multimedia analysis and understanding. The capabilities of audio analysis, in particular for video data management, are yet to be fully exploited. We present a novel scheme for indexing and segmentation of video by analyzing the audio track. This analysis is then applied to the segmentation and indexing of movies. We build models for some interesting events in the motion picture soundtrack; the models built include music, human speech and silence. We propose the use of hidden Markov models to model the dynamics of the soundtrack and detect audio events. Using these models we segment and index the soundtrack. A practical problem in motion picture soundtracks is that the audio in the track is of a composite nature: it corresponds to the mixing of sounds from different sources. Speech in the foreground and music in the background are common examples. The coexistence of multiple individual audio sources forces us to model such events explicitly. Experiments reveal that explicit modeling gives better results than modeling individual audio events separately.
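Segmenting a soundtrack with an HMM amounts to decoding the most likely sequence of audio-event states given per-frame observations. A minimal Viterbi decoder is sketched below with a toy discrete-observation model; the paper's actual models (continuous features, composite events) are richer than this illustration.

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely state path for a discrete-observation HMM.
    log_pi: initial state log-probs (S,); log_A: transition log-probs (S,S);
    log_B: emission log-probs (S,V); obs: sequence of observation indices."""
    delta = log_pi + log_B[:, obs[0]]       # best log-prob ending in each state
    back = []                               # backpointers per time step
    for o in obs[1:]:
        scores = delta[:, None] + log_A     # scores[from_state, to_state]
        back.append(scores.argmax(axis=0))
        delta = scores.max(axis=0) + log_B[:, o]
    path = [int(delta.argmax())]
    for best_from in reversed(back):        # trace the best path backwards
        path.append(int(best_from[path[-1]]))
    return path[::-1]
```

Runs of identical states in the decoded path directly yield the segment boundaries and their audio-event labels.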
Video Modeling by Experts with Video Feedback to Enhance Gymnastics Skills
ERIC Educational Resources Information Center
Boyer, Eva; Miltenberger, Raymond G.; Batsche, Catherine; Fogel, Victoria
2009-01-01
The effects of combining video modeling by experts with video feedback were analyzed with 4 female competitive gymnasts (7 to 10 years old) in a multiple baseline design across behaviors. During the intervention, after the gymnast performed a specific gymnastics skill, she viewed a video segment showing an expert gymnast performing the same skill…
Creating and Using Video Segments for Rural Teacher Education.
ERIC Educational Resources Information Center
Ludlow, Barbara L.; Duff, Michael C.
This paper provides guidelines for using video presentations in teacher education programs in special education. The simplest use of video is to provide students with illustrations of basic concepts, demonstrations of specific skills, or examples of model programs and practices. Video can also deliver contextually rich case studies to stimulate…
Learning Outcomes Afforded by Self-Assessed, Segmented Video-Print Combinations
ERIC Educational Resources Information Center
Koumi, Jack
2015-01-01
Learning affordances of video and print are examined in order to assess the learning outcomes afforded by hybrid video-print learning packages. The affordances discussed for print are: navigability, surveyability and legibility. Those discussed for video are: design for constructive reflection, provision of realistic experiences, presentational…
Optimizing Educational Video through Comparative Trials in Clinical Environments
ERIC Educational Resources Information Center
Aronson, Ian David; Plass, Jan L.; Bania, Theodore C.
2012-01-01
Although video is increasingly used in public health education, studies generally do not implement randomized trials of multiple video segments in clinical environments. Therefore, the specific configurations of educational videos that will have the greatest impact on outcome measures ranging from increased knowledge of important public health…
Interacting with target tracking algorithms in a gaze-enhanced motion video analysis system
NASA Astrophysics Data System (ADS)
Hild, Jutta; Krüger, Wolfgang; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen
2016-05-01
Motion video analysis is a challenging task, particularly if real-time analysis is required. It is therefore an important issue how to provide suitable assistance for the human operator. Given that the use of customized video analysis systems is more and more established, one supporting measure is to provide system functions which perform subtasks of the analysis. Recent progress in the development of automated image exploitation algorithms allows, e.g., real-time moving target tracking. Another supporting measure is to provide a user interface which strives to reduce the perceptual, cognitive and motor load of the human operator, for example by incorporating the operator's visual focus of attention. A gaze-enhanced user interface is able to help here. This work extends prior work on automated target recognition, segmentation, and tracking algorithms, as well as on the benefits of a gaze-enhanced user interface for interaction with moving targets. We also propose a prototypical system design aiming to combine both the qualities of the human observer's perception and the automated algorithms in order to improve the overall performance of a real-time video analysis system. In this contribution, we address two novel issues in analyzing gaze-based interaction with target tracking algorithms. The first issue extends the gaze-based triggering of a target tracking process, e.g., investigating how to best relaunch in the case of track loss. The second issue addresses the initialization of tracking algorithms without motion segmentation, where the operator has to provide the system with the object's image region in order to start the tracking algorithm.
Segmentation of Pollen Tube Growth Videos Using Dynamic Bi-Modal Fusion and Seam Carving.
Tambo, Asongu L; Bhanu, Bir
2016-05-01
The growth of pollen tubes is of significant interest in plant cell biology, as it provides an understanding of internal cell dynamics that affect observable structural characteristics such as cell diameter, length, and growth rate. However, these parameters can only be measured in experimental videos if the complete shape of the cell is known. The challenge is to accurately obtain the cell boundary in noisy video images. Usually, these measurements are performed by a scientist who manually draws regions-of-interest on the images displayed on a computer screen. In this paper, a new automated technique is presented for boundary detection by fusing fluorescence and brightfield images, and a new efficient method of obtaining the final cell boundary through the process of Seam Carving is proposed. This approach takes advantage of the nature of the fusion process and also the shape of the pollen tube to efficiently search for the optimal cell boundary. In video segmentation, the first two frames are used to initialize the segmentation process by creating a search space based on a parametric model of the cell shape. Updates to the search space are performed based on the location of past segmentations and a prediction of the next segmentation. Experimental results show comparable accuracy to a previous method, but significant decrease in processing time. This has the potential for real-time applications in pollen tube microscopy.
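The Seam Carving search mentioned above can be illustrated with a minimal dynamic-programming sketch (an illustration only, not the authors' implementation): given a per-pixel cost map over the boundary search space, e.g. inverted edge strength, the optimal boundary is the 8-connected vertical path of minimum cumulative cost.

```python
import numpy as np

def min_cost_seam(cost):
    """Dynamic-programming seam search: for each row, return the column
    index of the minimum-cumulative-cost 8-connected vertical path."""
    h, w = cost.shape
    acc = cost.astype(float).copy()
    for r in range(1, h):
        left = np.r_[np.inf, acc[r - 1, :-1]]
        right = np.r_[acc[r - 1, 1:], np.inf]
        acc[r] += np.minimum(np.minimum(left, acc[r - 1]), right)
    # Backtrack from the cheapest end point on the last row.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(acc[-1]))
    for r in range(h - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(0, c - 1), min(w, c + 2)
        seam[r] = lo + int(np.argmin(acc[r, lo:hi]))
    return seam
```

A low-cost column in the map is recovered exactly as the seam, which is the behavior the boundary search relies on.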
Pedestrian detection based on redundant wavelet transform
NASA Astrophysics Data System (ADS)
Huang, Lin; Ji, Liping; Hu, Ping; Yang, Tiejun
2016-10-01
Intelligent video surveillance is the analysis of video or image sequences captured by a fixed or mobile surveillance camera, including moving object detection, segmentation and recognition. By using it, we can be notified immediately of an abnormal situation. Pedestrian detection plays an important role in an intelligent video surveillance system, and it is also a key technology in the field of intelligent vehicles. Pedestrian detection therefore has vital significance in traffic management optimization, security early warning and abnormal behavior detection. Generally, pedestrian detection can be summarized as: first, estimate moving areas; then, extract features of the regions of interest; finally, classify using a classifier. The redundant wavelet transform (RWT) overcomes the shift-variance deficiency of the discrete wavelet transform and performs better in motion estimation than the discrete wavelet transform. Addressing the problem of detecting multiple pedestrians moving at different speeds, we present a pedestrian detection algorithm based on motion estimation using RWT, combining histograms of oriented gradients (HOG) and a support vector machine (SVM). Firstly, three intensities of movement (IoM) are estimated using RWT and the corresponding areas are segmented. According to the different IoM, region proposals (RPs) are generated. Then, the features of each RP are extracted using HOG. Finally, the features are fed into an SVM trained on pedestrian databases and the final detection results are obtained. Experiments show that the proposed algorithm can detect pedestrians accurately and efficiently.
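The HOG feature stage of the pipeline above can be sketched in a few lines. This toy version computes a single unsigned-orientation histogram for one cell; real HOG descriptors tile many cells and apply block normalization, so treat this only as an illustration of the core idea.

```python
import numpy as np

def hog_cell_histogram(patch, n_bins=9):
    """Coarse HOG sketch: one orientation histogram for a single cell,
    with gradient magnitudes voted into unsigned-orientation bins."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0      # unsigned orientation
    bins = (ang / (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), mag.ravel())        # magnitude-weighted votes
    return hist / (np.linalg.norm(hist) + 1e-9)       # L2 normalization
```

A patch with a purely horizontal intensity ramp puts all of its energy into the 0-degree bin, as expected.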
Visual Object Recognition and Tracking of Tools
NASA Technical Reports Server (NTRS)
English, James; Chang, Chu-Yin; Tardella, Neil
2011-01-01
A method has been created to automatically build an algorithm off-line, using computer-aided design (CAD) models, and to apply this at runtime. The object type is discriminated, and the position and orientation are identified. This system can work with a single image and can provide improved performance using multiple images provided from videos. The spatial processing unit uses three stages: (1) segmentation; (2) initial type, pose, and geometry (ITPG) estimation; and (3) refined type, pose, and geometry (RTPG) calculation. The image segmentation module finds all the tools in an image and isolates them from the background. For this, the system uses edge-detection and thresholding to find the pixels that are part of a tool. After the pixels are identified, nearby pixels are grouped into blobs. These blobs represent the potential tools in the image and are the product of the segmentation algorithm. The second module uses matched filtering (or template matching). This approach is used for condensing synthetic images using an image subspace that captures key information. Three degrees of orientation, three degrees of position, and any number of degrees of freedom in geometry change are included. To do this, a template-matching framework is applied. This framework uses an off-line system for calculating template images, measurement images, and the measurements of the template images. These results are used online to match segmented tools against the templates. The final module is the RTPG processor. Its role is to find the exact states of the tools given initial conditions provided by the ITPG module. The requirement that the initial conditions exist allows this module to make use of a local search (whereas the ITPG module had global scope). To perform the local search, 3D model matching is used, where a synthetic image of the object is created and compared to the sensed data. The availability of low-cost PC graphics hardware allows rapid creation of synthetic images.
In this approach, a function of orientation, distance, and articulation is defined as a metric on the difference between the captured image and a synthetic image with an object in the given orientation, distance, and articulation. The synthetic image is created using a model that is looked up in an object-model database. A composable software architecture is used for implementation. Video is first preprocessed to remove sensor anomalies (like dead pixels), and then is processed sequentially by a prioritized list of tracker-identifiers.
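The matched-filtering (template-matching) step can be illustrated with zero-mean normalized cross-correlation. This brute-force sketch is illustrative only; the system described above matches against precomputed template subspaces rather than scanning raw pixels.

```python
import numpy as np

def ncc(image_patch, template):
    """Zero-mean normalized cross-correlation between a patch and a
    same-size template; 1.0 indicates a perfect (affine-intensity) match."""
    a = image_patch.astype(float) - image_patch.mean()
    b = template.astype(float) - template.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom else 0.0

def best_match(image, template):
    """Slide the template over the image and return (row, col, score)
    of the highest-scoring location."""
    th, tw = template.shape
    H, W = image.shape
    best = (-1, -1, -2.0)
    for r in range(H - th + 1):
        for c in range(W - tw + 1):
            s = ncc(image[r:r + th, c:c + tw], template)
            if s > best[2]:
                best = (r, c, s)
    return best
```

Embedding the template in an otherwise flat image recovers its location with a score of essentially 1.0.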
Audio-video feature correlation: faces and speech
NASA Astrophysics Data System (ADS)
Durand, Gwenael; Montacie, Claude; Caraty, Marie-Jose; Faudemay, Pascal
1999-08-01
This paper presents a study of the correlation of features automatically extracted from the audio stream and the video stream of audiovisual documents. In particular, we were interested in finding out whether speech analysis tools could be combined with face detection methods, and to what extent they should be combined. A generic audio signal partitioning algorithm was first used to detect Silence/Noise/Music/Speech segments in a full-length movie. A generic object detection method was applied to the keyframes extracted from the movie in order to detect the presence or absence of faces. The correlation between the presence of a face in the keyframes and of the corresponding voice in the audio stream was studied. A third stream, which is the script of the movie, is warped onto the speech channel in order to automatically label faces appearing in the keyframes with the name of the corresponding character. We naturally found that extracted audio and video features were related in many cases, and that significant benefits can be obtained from the joint use of audio and video analysis methods.
Subjective evaluation of HEVC in mobile devices
NASA Astrophysics Data System (ADS)
Garcia, Ray; Kalva, Hari
2013-03-01
Mobile compute environments provide a unique set of user needs and expectations that designers must consider. With increased multimedia use in mobile environments, video encoding methods within the smart phone market segment are key factors that contribute to positive user experience. Currently available display resolutions and expected cellular bandwidth are major factors the designer must consider when determining which encoding methods should be supported. The desired goal is to maximize the consumer experience, reduce cost, and reduce time to market. This paper presents a comparative evaluation of the quality of user experience when HEVC and AVC/H.264 video coding standards were used. The goal of the study was to evaluate any improvements in user experience when using HEVC. Subjective comparisons were made between H.264/AVC and HEVC encoding standards in accordance with the double-stimulus impairment scale (DSIS) as defined in ITU-R BT.500-13. Test environments are based on smart phone LCD resolutions and expected cellular bit rates, such as 200 kbps and 400 kbps. Subjective feedback shows both encoding methods are adequate at a 400 kbps constant bit rate. However, a noticeable consumer experience gap was observed at 200 kbps. Subjective quality for H.264 was significantly lower for video sequences that have multiple moving objects and no single point of visual attraction. Video sequences with single points of visual attraction or few moving objects tended to have higher H.264 subjective quality.
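Subjective scores from a DSIS session are typically summarized as a mean opinion score with a confidence interval. A minimal sketch follows, using a normal approximation for the interval; ITU-R BT.500 details such as observer screening are omitted.

```python
import numpy as np

def dsis_mos(scores):
    """Mean opinion score and approximate 95% confidence interval for a
    DSIS test (5-point impairment scale, 5 = imperceptible)."""
    s = np.asarray(scores, dtype=float)
    mos = s.mean()
    ci = 1.96 * s.std(ddof=1) / np.sqrt(len(s))   # normal approximation
    return mos, ci
```

For example, ratings of 4, 5, 4, 5 yield a MOS of 4.5 with a nonzero interval reflecting observer disagreement.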
Segmentation of the Speaker's Face Region with Audiovisual Correlation
NASA Astrophysics Data System (ADS)
Liu, Yuyu; Sato, Yoichi
The ability to find the speaker's face region in a video is useful for various applications. In this work, we develop a novel technique to find this region within different time windows, which is robust against changes of view, scale, and background. The main thrust of our technique is to integrate audiovisual correlation analysis into a video segmentation framework. We analyze the audiovisual correlation locally by computing quadratic mutual information between our audiovisual features. The computation of quadratic mutual information is based on the probability density functions estimated by kernel density estimation with adaptive kernel bandwidth. The results of this audiovisual correlation analysis are incorporated into graph cut-based video segmentation to achieve a globally optimal extraction of the speaker's face region. The setting of any heuristic threshold in this segmentation is avoided by learning the correlation distributions of speaker and background via expectation maximization. Experimental results demonstrate that our method can detect the speaker's face region accurately and robustly for different views, scales, and backgrounds.
NASA Astrophysics Data System (ADS)
1998-07-01
This is a composite tape showing 10 short segments primarily about asteroids. The segments have short introductory slides, which include brief descriptions about the shots. The segments are: (1) Radar movie of asteroid 1620 Geographos; (2) Animation of the trajectories of Toutatis and Earth; (3) Animation of a landing on Toutatis; (4) Simulated encounter of an asteroid with Earth, includes a simulated impact trajectory; (5) An animated overview of the Manrover vehicle; (6) The Near Earth Asteroid Tracking project, includes a photograph of a USAF station in Hawaii, and animation of Earth approaching 4179 Toutatis and the asteroid Gaspra; (7) Live video of the anchor tests of the Champollion anchoring apparatus; (8) A second live video of the Champollion anchor tests showing anchoring spikes and collision rings; (9) An animated segment with narration about the Stardust mission with sound, which describes the mission to fly close to a comet and capture cometary material for return to Earth; (10) Live video of the drop test of a Stardust replica from a hot air balloon; this includes sound but is not narrated.
NASA Astrophysics Data System (ADS)
Cai, Lei; Wang, Lin; Li, Bo; Zhang, Libao; Lv, Wen
2017-06-01
Vehicle tracking technology is currently one of the most active research topics in machine vision and an important part of intelligent transportation systems. However, in both theory and practice, it still faces many challenges, including real-time performance and robustness. In video surveillance, targets need to be detected in real time and their positions calculated accurately in order to judge their motives. The contents of video sequence images and the target motion are complex, so the objects cannot be expressed by a unified mathematical model. Object tracking is defined as locating the moving target of interest in each frame of a video. Current tracking technology can achieve reliable results in simple environments for targets with easily identified characteristics. However, in more complex environments, it is easy to lose the target because of the mismatch between the target appearance and its dynamic model. Moreover, the target usually has a complex shape, but traditional target tracking algorithms usually represent the tracking result by a simple geometric shape such as a rectangle or circle, so they cannot provide accurate information for subsequent higher-level applications. This paper combines a traditional object-tracking technique, the Mean-Shift algorithm, with an image segmentation algorithm, the Active-Contour model, to obtain the outlines of objects during tracking and to automatically handle topology changes. Meanwhile, the outline information is used to aid and improve the tracking algorithm.
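The core Mean-Shift update, moving a window to the centroid of a per-pixel weight map (e.g. histogram back-projection) until it converges, can be sketched as follows. This is an illustration of the classic iteration only, not the combined tracker described in the abstract.

```python
import numpy as np

def mean_shift(weights, center, win=8, max_iter=20, eps=0.5):
    """One Mean-Shift track update: shift a (2*win+1)-pixel window to the
    centroid of the per-pixel weights until the shift falls below eps."""
    cy, cx = center
    H, W = weights.shape
    for _ in range(max_iter):
        y0, y1 = max(0, int(cy) - win), min(H, int(cy) + win + 1)
        x0, x1 = max(0, int(cx) - win), min(W, int(cx) + win + 1)
        w = weights[y0:y1, x0:x1]
        total = w.sum()
        if total == 0:
            break                       # no support under the window
        ys, xs = np.mgrid[y0:y1, x0:x1]
        ny = (ys * w).sum() / total     # weighted centroid
        nx = (xs * w).sum() / total
        done = np.hypot(ny - cy, nx - cx) < eps
        cy, cx = ny, nx
        if done:
            break
    return cy, cx
```

Started near a Gaussian weight blob, the window climbs to the blob's mode within a few iterations.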
Hierarchical video summarization
NASA Astrophysics Data System (ADS)
Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.
1998-12-01
We address the problem of key-frame summarization of video in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with a temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
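The temporal-consecutiveness constraint on clustering can be illustrated with a greedy sketch that only ever merges neighboring segments; this is a simplification for illustration, not the paper's pairwise K-means procedure.

```python
import numpy as np

def temporal_cluster(features, n_clusters):
    """Greedy agglomerative clustering restricted to temporally adjacent
    segments: repeatedly merge the neighboring pair whose mean features
    are closest, so every cluster stays a contiguous run of frames."""
    segs = [[i] for i in range(len(features))]
    feats = [np.asarray(f, dtype=float) for f in features]
    while len(segs) > n_clusters:
        d = [np.linalg.norm(feats[i] - feats[i + 1])
             for i in range(len(segs) - 1)]
        i = int(np.argmin(d))
        n0, n1 = len(segs[i]), len(segs[i + 1])
        feats[i] = (feats[i] * n0 + feats[i + 1] * n1) / (n0 + n1)
        segs[i] += segs.pop(i + 1)
        feats.pop(i + 1)
    return segs
```

Frames with similar (here scalar) color features merge into contiguous clusters while distant ones stay separate.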
IBES: a tool for creating instructions based on event segmentation
Mura, Katharina; Petersen, Nils; Huff, Markus; Ghose, Tandra
2013-01-01
Receiving informative, well-structured, and well-designed instructions supports performance and memory in assembly tasks. We describe IBES, a tool with which users can quickly and easily create multimedia, step-by-step instructions by segmenting a video of a task into segments. In a validation study we demonstrate that the step-by-step structure of the visual instructions created by the tool corresponds to the natural event boundaries, which are assessed by event segmentation and are known to play an important role in memory processes. In one part of the study, 20 participants created instructions based on videos of two different scenarios by using the proposed tool. In the other part of the study, 10 and 12 participants respectively segmented videos of the same scenarios yielding event boundaries for coarse and fine events. We found that the visual steps chosen by the participants for creating the instruction manual had corresponding events in the event segmentation. The number of instructional steps was a compromise between the number of fine and coarse events. Our interpretation of results is that the tool picks up on natural human event perception processes of segmenting an ongoing activity into events and enables the convenient transfer into meaningful multimedia instructions for assembly tasks. We discuss the practical application of IBES, for example, creating manuals for differing expertise levels, and give suggestions for research on user-oriented instructional design based on this tool. PMID:24454296
Li, Shuben; Chai, Huiping; Huang, Jun; Zeng, Guangqiao; Shao, Wenlong; He, Jianxing
2014-04-01
The purpose of the current study is to present the clinical and surgical results in patients who underwent hybrid video-assisted thoracic surgery with segmental-main bronchial sleeve resection. Thirty-one patients, 27 men and 4 women, underwent segmental-main bronchial sleeve anastomoses for non-small cell lung cancer between May 2004 and May 2011. Twenty-six (83.9%) patients had squamous cell carcinoma, and 5 patients had adenocarcinoma. Six patients were at stage IIB, 24 patients at stage IIIA, and 1 patient at stage IIIB. Secondary sleeve anastomosis was performed in 18 patients, and Y-shaped multiple sleeve anastomosis was performed in 8 patients. Single segmental bronchiole anastomosis was performed in 5 cases. The average time for chest tube removal was 5.6 days. The average length of hospital stay was 11.8 days. No anastomosis fistula developed in any of the patients. The 1-, 2-, and 3-year survival rates were 83.9%, 71.0%, and 41.9%, respectively. Hybrid video-assisted thoracic surgery with segmental-main bronchial sleeve resection is a complex technique that requires training and experience, but it is an effective and safe operation for selected patients.
Wavelet Fusion for Concealed Object Detection Using Passive Millimeter Wave Sequence Images
NASA Astrophysics Data System (ADS)
Chen, Y.; Pang, L.; Liu, H.; Xu, X.
2018-04-01
A PMMW imaging system can create interpretable imagery of objects concealed under clothing, which gives it a great advantage in security check systems. This paper addresses wavelet fusion to detect concealed objects using passive millimeter wave (PMMW) sequence images. First, according to the image characteristics and storage methods of the PMMW real-time imager, the sum of squared differences (SSD) is used as an image-correlation parameter to screen the sequence images. Secondly, the selected images are optimized using a wavelet fusion algorithm. Finally, the concealed objects are detected by mean filtering, threshold segmentation and edge detection. The experimental results show that this method improves the detection of concealed objects by selecting the most relevant images from the PMMW sequence and using wavelet fusion to enhance the information of the concealed objects. The method can be effectively applied to detecting objects concealed on the human body in millimeter wave video.
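The SSD-based screening step is straightforward to sketch: rank the sequence frames by their sum of squared differences against a reference frame and keep the most similar ones for fusion. The choice of reference frame here is an assumption for illustration.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences between two frames (lower = more similar)."""
    d = a.astype(float) - b.astype(float)
    return float((d * d).sum())

def screen_frames(frames, reference, keep=3):
    """Return the indices of the `keep` frames most correlated (lowest SSD)
    with the reference frame, for subsequent wavelet fusion."""
    order = sorted(range(len(frames)), key=lambda i: ssd(frames[i], reference))
    return order[:keep]
```

Frames closest in intensity to the reference are selected first, regardless of their position in the sequence.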
Race and Emotion in Computer-Based HIV Prevention Videos for Emergency Department Patients
ERIC Educational Resources Information Center
Aronson, Ian David; Bania, Theodore C.
2011-01-01
Computer-based video provides a valuable tool for HIV prevention in hospital emergency departments. However, the type of video content and protocol that will be most effective remain underexplored and the subject of debate. This study employs a new and highly replicable methodology that enables comparisons of multiple video segments, each based on…
Adventure Racing and Organizational Behavior: Using Eco Challenge Video Clips to Stimulate Learning
ERIC Educational Resources Information Center
Kenworthy-U'Ren, Amy; Erickson, Anthony
2009-01-01
In this article, the Eco Challenge race video is presented as a teaching tool for facilitating theory-based discussion and application in organizational behavior (OB) courses. Before discussing the intricacies of the video series itself, the authors present a pedagogically based rationale for using reality TV-based video segments in a classroom…
Quantifying technical skills during open operations using video-based motion analysis.
Glarner, Carly E; Hu, Yue-Yung; Chen, Chia-Hsiung; Radwin, Robert G; Zhao, Qianqian; Craven, Mark W; Wiegmann, Douglas A; Pugh, Carla M; Carty, Matthew J; Greenberg, Caprice C
2014-09-01
Objective quantification of technical operative skills in surgery remains poorly defined, although the delivery of and training in these skills is essential to the profession of surgery. Attempts to measure hand kinematics to quantify operative performance primarily have relied on electromagnetic sensors attached to the surgeon's hand or instrument. We sought to determine whether a similar motion analysis could be performed with a marker-less, video-based review, allowing for a scalable approach to performance evaluation. We recorded six reduction mammoplasty operations-a plastic surgery procedure in which the attending and resident surgeons operate in parallel. Segments representative of surgical tasks were identified with Multimedia Video Task Analysis software. Video digital processing was used to extract and analyze the spatiotemporal characteristics of hand movement. Attending plastic surgeons appear to use their nondominant hand more than residents when cutting with the scalpel, suggesting more use of countertraction. While suturing, attendings were more ambidextrous, with smaller differences in movement between their dominant and nondominant hands than residents. Attendings also seem to have more conservation of movement when performing instrument tying than residents, as demonstrated by less nondominant hand displacement. These observations were consistent within procedures and between the different attending plastic surgeons evaluated in this fashion. Video motion analysis can be used to provide objective measurement of technical skills without the need for sensors or markers. Such data could be valuable in better understanding the acquisition and degradation of operative skills, providing enhanced feedback to shorten the learning curve. Copyright © 2014 Mosby, Inc. All rights reserved.
Multi-frame super-resolution with quality self-assessment for retinal fundus videos.
Köhler, Thomas; Brost, Alexander; Mogalle, Katja; Zhang, Qianyi; Köhler, Christiane; Michelson, Georg; Hornegger, Joachim; Tornow, Ralf P
2014-01-01
This paper proposes a novel super-resolution framework to reconstruct high-resolution fundus images from multiple low-resolution video frames in retinal fundus imaging. Natural eye movements during an examination are used as a cue for super-resolution in a robust maximum a-posteriori scheme. To compensate for heterogeneous illumination on the fundus, we integrate retrospective illumination correction for photometric registration into the underlying imaging model. Our method utilizes quality self-assessment to provide objective quality scores for reconstructed images as well as to select regularization parameters automatically. In our evaluation on real data acquired from six human subjects with a low-cost video camera, the proposed method achieved considerable enhancements of low-resolution frames and improved noise and sharpness characteristics by 74%. In terms of image analysis, we demonstrate the importance of our method for the improvement of automatic blood vessel segmentation as an example application, where the sensitivity was increased by 13% using super-resolution reconstruction.
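The idea of exploiting sub-pixel motion between frames can be illustrated with a toy shift-and-add scheme; the paper's robust maximum a-posteriori reconstruction with photometric registration is far more elaborate, so this is only a sketch of the underlying principle.

```python
import numpy as np

def shift_and_add_sr(frames, shifts, scale=2):
    """Toy multi-frame super-resolution: place registered low-resolution
    frames onto a finer grid using known (dy, dx) sub-pixel shifts and
    average the samples landing on each high-resolution cell."""
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))
    cnt = np.zeros_like(acc)
    ys, xs = np.mgrid[0:h, 0:w]
    for frame, (dy, dx) in zip(frames, shifts):
        ry = np.clip(np.round((ys + dy) * scale).astype(int), 0, h * scale - 1)
        rx = np.clip(np.round((xs + dx) * scale).astype(int), 0, w * scale - 1)
        np.add.at(acc, (ry, rx), frame)
        np.add.at(cnt, (ry, rx), 1)
    return np.where(cnt > 0, acc / np.maximum(cnt, 1), 0.0)
```

Two frames offset by half a pixel in each direction fill complementary cells of the doubled grid; cells with no sample remain zero, which is where a real method would interpolate or regularize.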
Egocentric Temporal Action Proposals.
Shao Huang; Weiqiang Wang; Shengfeng He; Lau, Rynson W H
2018-02-01
We present an approach to localize generic actions in egocentric videos, called temporal action proposals (TAPs), for accelerating the action recognition step. An egocentric TAP refers to a sequence of frames that may contain a generic action performed by the wearer of a head-mounted camera, e.g., taking a knife, spreading jam, pouring milk, or cutting carrots. Inspired by object proposals, this paper aims at generating a small number of TAPs, thereby replacing the popular sliding window strategy, for localizing all action events in the input video. To this end, we first propose to temporally segment the input video into action atoms, which are the smallest units that may contain an action. We then apply a hierarchical clustering algorithm with several egocentric cues to generate TAPs. Finally, we propose two actionness networks to score the likelihood of each TAP containing an action. The top ranked candidates are returned as output TAPs. Experimental results show that the proposed TAP detection framework performs significantly better than relevant approaches for egocentric action detection.
Gao, Bin; Li, Xiaoqing; Woo, Wai Lok; Tian, Gui Yun
2018-05-01
Thermographic inspection has been widely applied to non-destructive testing and evaluation with the capabilities of rapid, contactless, and large surface area detection. Image segmentation is considered essential for identifying and sizing defects. To attain a high-level performance, specific physics-based models that describe defects generation and enable the precise extraction of target region are of crucial importance. In this paper, an effective genetic first-order statistical image segmentation algorithm is proposed for quantitative crack detection. The proposed method automatically extracts valuable spatial-temporal patterns from unsupervised feature extraction algorithm and avoids a range of issues associated with human intervention in laborious manual selection of specific thermal video frames for processing. An internal genetic functionality is built into the proposed algorithm to automatically control the segmentation threshold to render enhanced accuracy in sizing the cracks. Eddy current pulsed thermography will be implemented as a platform to demonstrate surface crack detection. Experimental tests and comparisons have been conducted to verify the efficacy of the proposed method. In addition, a global quantitative assessment index F-score has been adopted to objectively evaluate the performance of different segmentation algorithms.
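The F-score adopted above to assess segmentation quality can be computed per pixel as the harmonic mean of precision and recall. A minimal sketch for binary masks follows; the generalized beta weighting is included for completeness.

```python
import numpy as np

def f_score(pred, truth, beta=1.0):
    """Pixel-wise F-score between a predicted binary mask and ground truth:
    the (beta-weighted) harmonic mean of precision and recall."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(truth.sum(), 1)
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return float((1 + b2) * precision * recall / (b2 * precision + recall))
```

A prediction covering exactly half of the true crack region with no false positives scores 2/3: precision 1.0, recall 0.5.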
Inferring segmented dense motion layers using 5D tensor voting.
Min, Changki; Medioni, Gérard
2008-09-01
We present a novel local spatiotemporal approach to produce motion segmentation and dense temporal trajectories from an image sequence. A common representation of image sequences is a 3D spatiotemporal volume, (x,y,t), and its corresponding mathematical formalism is the fiber bundle. However, directly enforcing the spatiotemporal smoothness constraint is difficult in the fiber bundle representation. Thus, we convert the representation into a new 5D space (x,y,t,vx,vy) with an additional velocity domain, where each moving object produces a separate 3D smooth layer. The smoothness constraint is now enforced by extracting 3D layers using the tensor voting framework in a single step that solves both correspondence and segmentation simultaneously. Motion segmentation is achieved by identifying those layers, and the dense temporal trajectories are obtained by converting the layers back into the fiber bundle representation. We proceed to address three applications (tracking, mosaic, and 3D reconstruction) that are hard to solve from the video stream directly because of the segmentation and dense matching steps, but become straightforward with our framework. The approach does not make restrictive assumptions about the observed scene or camera motion and is therefore generally applicable. We present results on a number of data sets.
The IXV Ground Segment design, implementation and operations
NASA Astrophysics Data System (ADS)
Martucci di Scarfizzi, Giovanni; Bellomo, Alessandro; Musso, Ivano; Bussi, Diego; Rabaioli, Massimo; Santoro, Gianfranco; Billig, Gerhard; Gallego Sanz, José María
2016-07-01
The Intermediate eXperimental Vehicle (IXV) is an ESA re-entry demonstrator that performed, on 11 February 2015, a successful re-entry demonstration mission. The project objectives were the design, development, manufacturing, and on-ground and in-flight verification of an autonomous European lifting and aerodynamically controlled re-entry system. For the IXV mission a dedicated Ground Segment was provided. The main subsystems of the IXV Ground Segment were: the IXV Mission Control Center (MCC), from where monitoring of the vehicle was performed, as well as support during pre-launch and recovery phases; the IXV Ground Stations, used to cover the IXV mission by receiving spacecraft telemetry and forwarding it toward the MCC; and the IXV Communication Network, deployed to support the operations of the IXV mission by interconnecting all remote sites with the MCC, supporting data, voice and video exchange. This paper describes the concept, architecture, development, implementation and operations of the ESA Intermediate eXperimental Vehicle (IXV) Ground Segment and outlines the main operations and lessons learned during the preparation and successful execution of the IXV mission.
A Motion Detection Algorithm Using Local Phase Information
Lazar, Aurel A.; Ukani, Nikul H.; Zhou, Yiyin
2016-01-01
Previous research demonstrated that global phase alone can be used to faithfully represent visual scenes. Here we provide a reconstruction algorithm by using only local phase information. We also demonstrate that local phase alone can be effectively used to detect local motion. The local phase-based motion detector is akin to models employed to detect motion in biological vision, for example, the Reichardt detector. The local phase-based motion detection algorithm introduced here consists of two building blocks. The first building block measures/evaluates the temporal change of the local phase. The temporal derivative of the local phase is shown to exhibit the structure of a second order Volterra kernel with two normalized inputs. We provide an efficient, FFT-based algorithm for implementing the change of the local phase. The second processing building block implements the detector; it compares the maximum of the Radon transform of the local phase derivative with a chosen threshold. We demonstrate examples of applying the local phase-based motion detection algorithm on several video sequences. We also show how the locally detected motion can be used for segmenting moving objects in video scenes and compare our local phase-based algorithm to segmentation achieved with a widely used optic flow algorithm. PMID:26880882
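The notion of a temporal change of local phase can be illustrated with a toy sketch: take the phase of the dominant Fourier component of an image block in two consecutive frames and measure the wrapped difference. The paper's detector uses windowed local phase, a Volterra-kernel formulation, and a Radon-transform test, none of which are reproduced here.

```python
import numpy as np

def local_phase(block):
    """Phase of the dominant non-DC Fourier component of an image block --
    a crude stand-in for a windowed local-phase measurement."""
    F = np.fft.fft2(block.astype(float))
    F[0, 0] = 0                                   # ignore the DC term
    k = np.unravel_index(np.argmax(np.abs(F)), F.shape)
    return np.angle(F[k])

def phase_change(block_t0, block_t1):
    """Temporal change of local phase between two frames; a large absolute
    value flags motion inside the block."""
    d = local_phase(block_t1) - local_phase(block_t0)
    return float(np.angle(np.exp(1j * d)))        # wrap to (-pi, pi]
```

A static block yields zero phase change, while translating a sinusoidal pattern by 2 of its 16-pixel period shifts the phase by pi/4, consistent with the Fourier shift theorem.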
Robotic Vision-Based Localization in an Urban Environment
NASA Technical Reports Server (NTRS)
McHenry, Michael; Cheng, Yang; Matthies
2007-01-01
A system of electronic hardware and software, now undergoing development, automatically estimates the location of a robotic land vehicle in an urban environment using a somewhat imprecise map, which has been generated in advance from aerial imagery. This system does not utilize the Global Positioning System and does not include any odometry, inertial measurement units, or any other sensors except a stereoscopic pair of black-and-white digital video cameras mounted on the vehicle. Of course, the system also includes a computer running software that processes the video image data. The software consists mostly of three components corresponding to the three major image-data-processing functions: Visual Odometry: This component automatically tracks point features in the imagery and computes the relative motion of the cameras between sequential image frames. This component incorporates a modified version of a visual-odometry algorithm originally published in 1989. The algorithm selects point features, performs multiresolution area-correlation computations to match the features in stereoscopic images, tracks the features through the sequence of images, and uses the tracking results to estimate the six-degree-of-freedom motion of the camera between consecutive stereoscopic pairs of images (see figure). Urban Feature Detection and Ranging: Using the same data as those processed by the visual-odometry component, this component strives to determine the three-dimensional (3D) coordinates of vertical and horizontal lines that are likely to be parts of, or close to, the exterior surfaces of buildings. The basic sequence of processes performed by this component is the following: 1. An edge-detection algorithm is applied, yielding a set of linked lists of edge pixels, a horizontal-gradient image, and a vertical-gradient image. 2. Straight-line segments of edges are extracted from the linked lists generated in step 1.
Any straight-line segments longer than an arbitrary threshold (e.g., 30 pixels) are assumed to belong to buildings or other artificial objects. 3. A gradient-filter algorithm is used to test straight-line segments longer than the threshold to determine whether they represent edges of natural or artificial objects. In somewhat oversimplified terms, the test is based on the assumption that the gradient of image intensity varies little along a segment that represents the edge of an artificial object.
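The gradient-filter test in step 3 can be sketched as a check that the gradient direction is nearly constant along a segment; the function name, data layout, and threshold below are illustrative assumptions, not the system's actual implementation:

```python
import numpy as np

def is_artificial_edge(segment_pixels, gx, gy, max_angle_std=0.15):
    """Flag a straight-line edge segment as artificial (building-like) when
    the gradient direction varies little along it. Threshold is illustrative."""
    angles = np.array([np.arctan2(gy[r, c], gx[r, c]) for r, c in segment_pixels])
    angles = np.unwrap(angles)  # remove the pi/-pi wrap before measuring spread
    return bool(np.std(angles) < max_angle_std)
```

A segment crossing a uniform intensity ramp passes the test, while one whose gradient direction fluctuates (as along foliage or texture) fails it.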
ERIC Educational Resources Information Center
Wang, Judy H.; Liang, Wenchi; Schwartz, Marc D.; Lee, Marion M.; Kreling, Barbara; Mandelblatt, Jeanne S.
2008-01-01
This study developed and evaluated a culturally tailored video guided by the health belief model to improve Chinese women's low rate of mammography use. Focus-group discussions and an advisory board meeting guided the video development. A 17-min video, including a soap opera and physician-recommendation segment, was made in Chinese languages. A…
Automated Music Video Generation Using Multi-level Feature-based Segmentation
NASA Astrophysics Data System (ADS)
Yoon, Jong-Chul; Lee, In-Kwon; Byun, Siwoo
The expansion of the home video market has created a requirement for video editing tools to allow ordinary people to assemble videos from short clips. However, professional skills are still necessary to create a music video, which requires a stream to be synchronized with pre-composed music. Because the music and the video are pre-generated in separate environments, even a professional producer usually requires a number of trials to obtain a satisfactory synchronization, which is something that most amateurs are unable to achieve.
A new visual navigation system for exploring biomedical Open Educational Resource (OER) videos
Zhao, Baoquan; Xu, Songhua; Lin, Shujin; Luo, Xiaonan; Duan, Lian
2016-01-01
Objective Biomedical videos as open educational resources (OERs) are increasingly proliferating on the Internet. Unfortunately, seeking personally valuable content from among the vast corpus of quality yet diverse OER videos is nontrivial due to limitations of today’s keyword- and content-based video retrieval techniques. To address this need, this study introduces a novel visual navigation system that facilitates users’ information seeking from biomedical OER videos in mass quantity by interactively offering visual and textual navigational clues that are both semantically revealing and user-friendly. Materials and Methods The authors collected and processed around 25 000 YouTube videos, which collectively last for a total length of about 4000 h, in the broad field of biomedical sciences for our experiment. For each video, its semantic clues are first extracted automatically through computationally analyzing audio and visual signals, as well as text either accompanying or embedded in the video. These extracted clues are subsequently stored in a metadata database and indexed by a high-performance text search engine. During the online retrieval stage, the system renders video search results as dynamic web pages using a JavaScript library that allows users to interactively and intuitively explore video content both efficiently and effectively. Results The authors produced a prototype implementation of the proposed system, which is publicly accessible at https://patentq.njit.edu/oer. To examine the overall advantage of the proposed system for exploring biomedical OER videos, the authors further conducted a user study of a modest scale. The study results encouragingly demonstrate the functional effectiveness and user-friendliness of the new system for facilitating information seeking from and content exploration among massive biomedical OER videos. 
Conclusion Using the proposed tool, users can efficiently and effectively find videos of interest, precisely locate video segments delivering personally valuable information, as well as intuitively and conveniently preview essential content of a single or a collection of videos. PMID:26335986
Motion adaptive Kalman filter for super-resolution
NASA Astrophysics Data System (ADS)
Richter, Martin; Nasse, Fabian; Schröder, Hartmut
2011-01-01
Superresolution is a sophisticated strategy to enhance the image quality of both low- and high-resolution video, performing tasks like artifact reduction, scaling, and sharpness enhancement in one algorithm, all of which reconstruct high-frequency components (above the Nyquist frequency) in some way. Recursive superresolution algorithms in particular can meet high quality demands because they control the video output using a feedback loop and adapt the result in the next iteration. In addition to excellent output quality, temporally recursive methods are very hardware-efficient and therefore attractive even for real-time video processing. A very promising approach is the utilization of Kalman filters, as proposed by Farsiu et al. Reliable motion estimation is crucial for the performance of superresolution. Robust global motion models are therefore mainly used, but this also limits the applicability of superresolution algorithms; handling sequences with complex object motion is essential for a wider field of application. Hence, this paper proposes improvements that extend the Kalman filter approach using motion-adaptive variance estimation and segmentation techniques. Experiments confirm the potential of our proposal for ideal and real video sequences with complex motion and further compare its performance to state-of-the-art methods like trainable filters.
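A minimal per-pixel sketch of the motion-adaptive idea (inflating the Kalman process-noise variance where inter-frame differences suggest motion) might look like the following; all parameter values and the motion test are assumptions, not the authors' model:

```python
import numpy as np

def kalman_temporal_filter(frames, meas_var=4.0, base_q=0.01,
                           motion_q=5.0, motion_thresh=10.0):
    """Recursive per-pixel temporal Kalman filtering: where the new frame
    disagrees strongly with the estimate, the process noise q is inflated
    so the filter adapts quickly instead of blurring across motion."""
    x = frames[0].astype(float)        # state estimate (per pixel)
    p = np.full_like(x, meas_var)      # estimate variance (per pixel)
    for z in frames[1:]:
        z = z.astype(float)
        q = np.where(np.abs(z - x) > motion_thresh, motion_q, base_q)
        p_pred = p + q
        k = p_pred / (p_pred + meas_var)   # Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p_pred
    return x
```

On a static sequence this converges to a temporally denoised frame; on moving content the inflated q keeps the output tracking the latest observation.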
Optimal frame-by-frame result combination strategy for OCR in video stream
NASA Astrophysics Data System (ADS)
Bulatov, Konstantin; Lynchenko, Aleksander; Krivtsov, Valeriy
2018-04-01
This paper describes the problem of combining the classification results of multiple observations of one object. This task can be regarded as a particular case of decision-making via a combination of expert votes with calculated weights. The accuracy of various methods of combining classification results under different models of input data is investigated using the example of frame-by-frame character recognition in a video stream. Experiments show that the strategy of choosing a single most competent expert has an advantage when the input data contain no irrelevant observations (here, irrelevant means affected by character localization and segmentation errors). At the same time, this work demonstrates the advantage of combining several most competent experts according to the multiplication rule or voting when irrelevant samples are present in the input data.
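The two combination strategies compared above can be illustrated on per-frame class-probability vectors; the competence weight used for the single-expert rule below is a hypothetical stand-in for the paper's calculated weights:

```python
import numpy as np

def combine_product(probs):
    """Multiplication rule: multiply per-frame class probabilities and renormalize."""
    p = np.prod(np.asarray(probs, dtype=float), axis=0)
    return p / p.sum()

def combine_best_expert(probs):
    """Single most competent expert; competence here is simply the frame's
    maximum class probability (an illustrative, not the paper's, weight)."""
    probs = np.asarray(probs, dtype=float)
    return probs[int(np.argmax(probs.max(axis=1)))]
```

With frames voting [0.6, 0.4], [0.7, 0.3], [0.2, 0.8], the product rule and the single best expert both favor class 1, but they diverge once noisy (irrelevant) frames are mixed in.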
ERIC Educational Resources Information Center
Ludlow, Barbara L.; Foshay, John B.; Duff, Michael C.
Video presentations of teaching episodes in home, school, and community settings and audio recordings of parents' and professionals' views can be important adjuncts to personnel preparation in special education. This paper describes instructional applications of digital media and outlines steps in producing audio and video segments. Digital audio…
Efficient depth intraprediction method for H.264/AVC-based three-dimensional video coding
NASA Astrophysics Data System (ADS)
Oh, Kwan-Jung; Oh, Byung Tae
2015-04-01
We present an intracoding method that is applicable to depth map coding in multiview-plus-depth systems. Our approach combines skip prediction and plane-segmentation-based prediction. The proposed depth intraskip prediction uses the estimated direction at both the encoder and decoder, and does not need to encode residual data. Our plane-segmentation-based intraprediction divides the current block into two regions and applies a different prediction scheme to each segmented region. This avoids incorrect estimations across different regions, resulting in higher prediction accuracy. Simulation results demonstrate that the proposed scheme is superior to H.264/advanced video coding intraprediction and is able to improve the subjective rendering quality.
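The bi-region idea can be conveyed with a toy predictor that splits a depth block into two regions and predicts each with its own DC value; the midpoint split below is an assumption, much simpler than the paper's plane segmentation:

```python
import numpy as np

def biregion_dc_predict(block):
    """Illustrative bi-region prediction: split the depth block at the
    midpoint between its min and max and predict each region by its mean."""
    block = np.asarray(block, dtype=float)
    if block.min() == block.max():          # flat block: single DC value
        return np.full_like(block, block.flat[0])
    t = (block.min() + block.max()) / 2.0
    mask = block >= t
    pred = np.empty_like(block)
    pred[mask] = block[mask].mean()
    pred[~mask] = block[~mask].mean()
    return pred
```

For a depth block containing a sharp object boundary between two flat surfaces, the two-region prediction is exact, whereas a single DC predictor would smear the edge.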
ERIC Educational Resources Information Center
King, Keith; Laake, Rebecca A.; Bernard, Amy
2006-01-01
This study examined the sexual messages depicted in music videos aired on MTV, MTV2, BET, and GAC from August 2, 2004 to August 15, 2004. One-hour segments of music videos were taped daily for two weeks. Depictions of sexual attire and sexual behavior were analyzed via a four-page coding sheet (interrater-reliability = 0.93). Results indicated…
Bayesian Modeling of Temporal Coherence in Videos for Entity Discovery and Summarization.
Mitra, Adway; Biswas, Soma; Bhattacharyya, Chiranjib
2017-03-01
A video is understood by users in terms of the entities present in it. Entity Discovery is the task of building an appearance model for each entity (e.g., a person) and finding all its occurrences in the video. We represent a video as a sequence of tracklets, each spanning 10-20 frames and associated with one entity. We pose Entity Discovery as tracklet clustering, and approach it by leveraging Temporal Coherence (TC): the property that temporally neighboring tracklets are likely to be associated with the same entity. Our major contributions are the first Bayesian nonparametric models of TC at the tracklet level. We extend the Chinese Restaurant Process (CRP) to TC-CRP, and further to the Temporally Coherent Chinese Restaurant Franchise (TC-CRF), to jointly model entities and temporal segments using mixture components and sparse distributions. For discovering persons in TV serial videos without metadata like scripts, these methods show considerable improvement over state-of-the-art approaches to tracklet clustering in terms of clustering accuracy, cluster purity, and entity coverage. Unlike existing approaches, the proposed methods can perform online tracklet clustering on streaming videos and can automatically reject false tracklets. Finally, we discuss entity-driven video summarization, where temporal segments of the video are selected based on the discovered entities to create a semantically meaningful summary.
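A deterministic caricature of temporally coherent tracklet clustering (not the actual Bayesian nonparametric inference) can convey the TC bias: the cluster of the immediately preceding tracklet receives a score boost, so temporally neighboring tracklets tend to land in the same cluster. All scores and thresholds are illustrative:

```python
import numpy as np

def tc_crp_assign(features, alpha=1.0, tc_boost=5.0, sim_thresh=0.9):
    """Greedy sketch of TC-CRP-style assignment: each tracklet joins the
    cluster maximizing (size + temporal boost) * cosine similarity, or
    opens a new cluster (the alpha * sim_thresh option)."""
    labels, centers, sizes = [], [], []
    prev = -1
    for f in features:
        f = np.asarray(f, dtype=float)
        scores = []
        for k, c in enumerate(centers):
            sim = f @ c / (np.linalg.norm(f) * np.linalg.norm(c) + 1e-9)
            boost = tc_boost if k == prev else 0.0
            scores.append((sizes[k] + boost) * sim)
        scores.append(alpha * sim_thresh)        # option: open a new cluster
        k = int(np.argmax(scores))
        if k == len(centers):                    # new cluster chosen
            centers.append(f.copy()); sizes.append(1)
        else:                                    # update running mean center
            centers[k] = (centers[k] * sizes[k] + f) / (sizes[k] + 1)
            sizes[k] += 1
        labels.append(k); prev = k
    return labels
```

On a stream whose appearance features change abruptly once, the sketch produces two clusters whose boundaries align with the change point.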
Video Modeling by Experts with Video Feedback to Enhance Gymnastics Skills
Boyer, Eva; Miltenberger, Raymond G; Batsche, Catherine; Fogel, Victoria
2009-01-01
The effects of combining video modeling by experts with video feedback were analyzed with 4 female competitive gymnasts (7 to 10 years old) in a multiple baseline design across behaviors. During the intervention, after the gymnast performed a specific gymnastics skill, she viewed a video segment showing an expert gymnast performing the same skill and then viewed a video replay of her own performance of the skill. The results showed that all gymnasts demonstrated improved performance across three gymnastics skills following exposure to the intervention. PMID:20514194
Extraction of Blebs in Human Embryonic Stem Cell Videos.
Guan, Benjamin X; Bhanu, Bir; Talbot, Prue; Weng, Nikki Jo-Hao
2016-01-01
Blebbing is an important biological indicator in determining the health of human embryonic stem cells (hESC). In particular, the areas of a bleb sequence in a video are often used to distinguish two cell blebbing behaviors in hESC: dynamic and apoptotic blebbing. This paper analyzes various segmentation methods for bleb extraction in hESC videos and introduces a bio-inspired score function to improve the performance of bleb extraction. Full bleb formation consists of bleb expansion and retraction. Blebs change their size and image properties dynamically in both processes and between frames; therefore, adaptive parameters are needed for each segmentation method. A score function derived from the change of bleb area and orientation between consecutive frames is proposed, which provides adaptive parameters for bleb extraction in videos. In comparison to manual analysis, the proposed method provides an automated, fast, and accurate approach for bleb sequence extraction.
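One plausible (hypothetical) form of such a score function, rewarding small frame-to-frame changes in area and orientation, could be:

```python
import math

def bleb_score(area_prev, area_curr, theta_prev, theta_curr,
               w_area=1.0, w_theta=1.0):
    """Hypothetical frame-to-frame consistency score: small relative area
    change and small orientation change score high (near 1), large changes
    score low. Weights are illustrative, not the paper's."""
    d_area = abs(area_curr - area_prev) / max(area_prev, 1e-9)
    d_theta = abs(theta_curr - theta_prev) % math.pi   # orientation is mod pi
    d_theta = min(d_theta, math.pi - d_theta)
    return 1.0 / (1.0 + w_area * d_area + w_theta * d_theta)
```

Such a score can then select, per frame, the segmentation parameters that keep consecutive bleb detections mutually consistent.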
2006-01-01
segments video game interaction into domain-independent components which together form a framework that can be used to characterize real-time interactive...multimedia applications in general and HRI in particular. We provide examples of using the components in both the video game and the Unmanned Aerial
ERIC Educational Resources Information Center
Ayala, Sandra M.
2010-01-01
Ten first grade students, participating in a Tier II response to intervention (RTI) reading program received an intervention of video self modeling to improve decoding skills and sight word recognition. The students were video recorded blending and segmenting decodable words, and reading sight words taken directly from their curriculum…
Video rate color region segmentation for mobile robotic applications
NASA Astrophysics Data System (ADS)
de Cabrol, Aymeric; Bonnin, Patrick J.; Hugel, Vincent; Blazevic, Pierre; Chetto, Maryline
2005-08-01
Color regions may be an interesting image feature to extract for visual tasks in robotics, such as navigation and obstacle avoidance. But whereas numerous methods are used for vision systems embedded on robots, only a few use this segmentation, mainly because of its processing time. In this paper, we propose a new real-time (i.e., video-rate) color region segmentation followed by a robust color classification and a merging of regions, dedicated to various applications such as the RoboCup four-legged league or an industrial conveyor wheeled robot. The performance of this algorithm and a comparison with other methods, in terms of result quality and processing time, are provided. For better-quality results, the obtained speed-up is between 2 and 4; for same-quality results, it is up to 10. We also present the outlines of the Dynamic Vision System of the CLEOPATRE Project, for which this segmentation has been developed, and the Clear Box Methodology, which allowed us to create the new color region segmentation from the evaluation and knowledge of other well-known segmentations.
Audio-guided audiovisual data segmentation, indexing, and retrieval
NASA Astrophysics Data System (ADS)
Zhang, Tong; Kuo, C.-C. Jay
1998-12-01
While current approaches for video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data, based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e., speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based upon morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes, such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based upon time-frequency analysis of audio signals and the use of the hidden Markov model as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90 percent for the coarse-level classification, and higher than 85 percent for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.
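A toy version of the coarse-level step, classifying one audio frame from two short-term features (the real system uses morphological and statistical analysis of several features; the thresholds here are illustrative):

```python
import numpy as np

def classify_frame(frame, silence_rms=0.01, speech_zcr=0.1):
    """Coarse classification of one audio frame into silence/speech/music
    using short-term energy (RMS) and zero-crossing rate (ZCR)."""
    frame = np.asarray(frame, dtype=float)
    rms = np.sqrt(np.mean(frame ** 2))
    if rms < silence_rms:
        return "silence"
    zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
    return "speech" if zcr > speech_zcr else "music"
```

Speech tends to show a high and strongly varying ZCR, while tonal music concentrates energy at low crossing rates, which is why even these two features separate the coarse classes reasonably well.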
Infrared video based gas leak detection method using modified FAST features
NASA Astrophysics Data System (ADS)
Wang, Min; Hong, Hanyu; Huang, Likun
2018-03-01
In order to detect, in a timely manner, invisible leaking gas that is usually dangerous and easily leads to fire or explosion, many new technologies have arisen in recent years, among which infrared video based gas leak detection is widely recognized as a viable tool. However, all the moving regions of a video frame can be detected as leaking gas regions by the existing infrared video based gas leak detection methods, without discriminating the property of each detected region; e.g., a walking person in a video frame may also be detected as gas by the current gas leak detection methods. To solve this problem, we propose a novel infrared video based gas leak detection method in this paper, which is able to effectively suppress strong motion disturbances. Firstly, the Gaussian mixture model (GMM) is used to establish the background model. Then, based on the observation that the shapes of gas regions differ from those of most rigid moving objects, we modify the Features From Accelerated Segment Test (FAST) algorithm and use the modified FAST (mFAST) features to describe each connected component. In view of the fact that the statistical property of the mFAST features extracted from gas regions differs from that of other motion regions, we propose the Pixel-Per-Points (PPP) condition to further select candidate connected components. Experimental results show that the algorithm is able to effectively suppress most strong motion disturbances and achieve real-time leaking gas detection.
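The PPP selection step might be sketched as a feature-density test on each connected component; both the direction of the inequality and the threshold are assumptions, since diffuse gas plumes would be expected to produce few corner-like (mFAST) features per pixel of region area:

```python
import numpy as np

def passes_ppp(component_mask, feature_points, max_density=0.02):
    """Illustrative Pixel-Per-Points style test: accept a candidate gas
    region only when corner-like features are sparse relative to its area."""
    area = int(np.count_nonzero(component_mask))
    inside = sum(bool(component_mask[r, c]) for r, c in feature_points)
    return inside / max(area, 1) < max_density
```

A rigid mover such as a walking person yields many stable corners inside its mask and is rejected, while a smooth plume passes.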
NASA Technical Reports Server (NTRS)
Tescher, Andrew G. (Editor)
1989-01-01
Various papers on image compression and automatic target recognition are presented. Individual topics addressed include: target cluster detection in cluttered SAR imagery, model-based target recognition using laser radar imagery, Smart Sensor front-end processor for feature extraction of images, object attitude estimation and tracking from a single video sensor, symmetry detection in human vision, analysis of high resolution aerial images for object detection, obscured object recognition for an ATR application, neural networks for adaptive shape tracking, statistical mechanics and pattern recognition, detection of cylinders in aerial range images, moving object tracking using local windows, new transform method for image data compression, quad-tree product vector quantization of images, predictive trellis encoding of imagery, reduced generalized chain code for contour description, compact architecture for a real-time vision system, use of human visibility functions in segmentation coding, color texture analysis and synthesis using Gibbs random fields.
Segmented cold cathode display panel
NASA Technical Reports Server (NTRS)
Payne, Leslie (Inventor)
1998-01-01
The present invention is a video display device that utilizes the novel concept of generating an electronically controlled pattern of electron emission at the output of a segmented photocathode. This pattern of electron emission is amplified via a channel plate. The result is that an intense electronic image can be accelerated toward a phosphor, thus creating a bright video image. This novel arrangement allows one to produce a full-color flat video display capable of implementation in large formats. In an alternate arrangement, the present invention is provided without the channel plate, and a porous conducting surface is provided instead. In this alternate arrangement, the brightness of the image is reduced, but the cost of the overall device is significantly lowered because fabrication complexity is significantly decreased.
Colonoscopy tutorial software made with a cadaver's sectioned images.
Chung, Beom Sun; Chung, Min Suk; Park, Hyung Seon; Shin, Byeong-Seok; Kwon, Koojoo
2016-11-01
Novice doctors may watch tutorial videos in training for actual or computed tomographic (CT) colonoscopy. The conventional learning videos can be complemented by virtual colonoscopy software made with a cadaver's sectioned images (SIs). The objective of this study was to assist colonoscopy trainees with the new interactive software. Submucosal segmentation on the SIs was carried out through the whole length of the large intestine. With the SIs and segmented images, a three-dimensional model was reconstructed. Six hundred seventy-one proximal colonoscopic views (conventional views) and corresponding distal colonoscopic views (simulating the retroflexion of a colonoscope) were produced. Navigation views showing the current location of the colonoscope tip and its course were elaborated, as well as supplementary description views. The four corresponding views were put into convenient browsing software to be downloaded free from the homepage (anatomy.co.kr). The SI colonoscopy software with the realistic images and supportive tools was available to anybody. Users could readily notice the position and direction of the virtual colonoscope tip and recognize meaningful structures in colonoscopic views. The software is expected to be an auxiliary learning tool to improve technique and related knowledge in actual and CT colonoscopies. Hopefully, the software will be updated using raw images from the Visible Korean project. Copyright © 2016 Elsevier GmbH. All rights reserved.
NASA Astrophysics Data System (ADS)
Chen, Zhenzhong; Han, Junwei; Ngan, King Ngi
2005-10-01
MPEG-4 treats a scene as a composition of several objects, or so-called video object planes (VOPs), that are separately encoded and decoded. Such a flexible video coding framework makes it possible to code different video objects with different distortion scales. It is necessary to analyze the priority of the video objects according to their semantic importance, intrinsic properties, and psycho-visual characteristics, such that the bit budget can be distributed properly among video objects to improve the perceptual quality of the compressed video. This paper aims to provide an automatic video object priority definition method based on an object-level visual attention model, and further proposes an optimization framework for video object bit allocation. One significant contribution of this work is that human visual system characteristics are incorporated into the video coding optimization process. Another advantage is that the priority of each video object can be obtained automatically instead of fixing weighting factors before encoding or relying on user interactivity. To evaluate the performance of the proposed approach, we compare it with traditional verification model bit allocation and the optimal multiple video object bit allocation algorithms. Compared with traditional bit allocation algorithms, the objective quality of the object with higher priority is significantly improved under this framework. These results demonstrate the usefulness of this unsupervised subjective quality lifting framework.
Hirano, Yutaka; Ikuta, Shin-Ichiro; Nakano, Manabu; Akiyama, Seita; Nakamura, Hajime; Nasu, Masataka; Saito, Futoshi; Nakagawa, Junichi; Matsuzaki, Masashi; Miyazaki, Shunichi
2007-02-01
Assessment of deterioration of regional wall motion by echocardiography is not only subjective but also features difficulties with interobserver agreement. Progress in digital communication technology has made it possible to send video images from a distant location via the Internet. The possibility of evaluating left ventricular wall motion using video images sent via the Internet to distant institutions was evaluated. Twenty-two subjects were randomly selected. Four sets of video images (parasternal long-axis view, parasternal short-axis view, apical four-chamber view, and apical two-chamber view) were taken for one cardiac cycle. The images were sent via the Internet to two institutions (observer C in facility A and observers D and E in facility B) for evaluation. Great care was taken to prevent disclosure of patient information to these observers. Parasternal long-axis images were divided into four segments, and the parasternal short-axis view, apical four-chamber view, and apical two-chamber view were divided into six segments. One of the following assessments, normokinesis, hypokinesis, akinesis, or dyskinesis, was assigned to each segment. The interobserver rates of agreement in judgments between observers C and D, observers D and E, and observers C and E, as well as the intraobserver agreement rate (for observer D), were calculated. The rate of interobserver agreement was 85.7% (394/460 segments; Kappa = 0.65) between observers C and D, 76.7% (353/460 segments; Kappa = 0.39) between observers D and E, and 76.3% (351/460 segments; Kappa = 0.36) between observers C and E, and intraobserver agreement was 94.3% (434/460; Kappa = 0.86). Segments with differing judgments between observers C and D were normokinesis-hypokinesis, 62.1%; hypokinesis-akinesis, 33.3%; akinesis-dyskinesis, 3.0%; and normokinesis-akinesis, 1.5%. Wall motion can be evaluated at remote institutions via the Internet.
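The Kappa values quoted above are Cohen's kappa, which corrects raw percent agreement for chance and can be computed from the two raters' per-segment labels:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters judging the same segments:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    po = sum(a == b for a, b in zip(labels_a, labels_b)) / n   # observed
    ca, cb = Counter(labels_a), Counter(labels_b)
    pe = sum(ca[k] * cb.get(k, 0) for k in ca) / (n * n)       # chance
    return (po - pe) / (1 - pe)
```

Note that two raters with identical percent agreement can have very different kappa values if their label distributions differ, which is why the abstract reports both.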
Activity Detection and Retrieval for Image and Video Data with Limited Training
2015-06-10
…applications. Here we propose two techniques for image segmentation. The first involves an automata-based multiple threshold selection scheme, where a mixture of Gaussians is fitted to the… For our second approach to segmentation, we employ a region-based segmentation technique that is capable of handling intensity inhomogeneity…
ERIC Educational Resources Information Center
Eick, Charles Joseph; King, David T., Jr.
2012-01-01
The instructor of an integrated science course for nonscience majors embedded content-related video segments from YouTube and other similar internet sources into lecture. Through this study, the instructor wanted to know students' perceptions of how video use engaged them and increased their interest and understanding of science. Written survey…
Testing with feedback improves recall of information in informed consent: A proof of concept study.
Roberts, Katherine J; Revenson, Tracey A; Urken, Mark L; Fleszar, Sara; Cipollina, Rebecca; Rowe, Meghan E; Reis, Laura L Dos; Lepore, Stephen J
2016-08-01
This study investigates whether applying educational testing approaches to an informed consent video for a medical procedure can lead to greater recall of the information presented. Undergraduate students (n=120) were randomly assigned to watch a 20-min video on informed consent under one of three conditions: 1) tested using multiple-choice knowledge questions and provided with feedback on their answers after each 5-min segment; 2) tested with multiple choice knowledge questions but not provided feedback after each segment; or 3) watched the video without knowledge testing. Participants who were tested and provided feedback had significantly greater information recall compared to those who were tested but not provided feedback and to those not tested. The effect of condition was stronger for moderately difficult questions versus easy questions. Inserting knowledge tests and providing feedback about the responses at timed intervals in videos can be effective in improving recall of information. Providing informed consent information through a video not only standardizes the material, but using testing with feedback inserted within the video has the potential to increase recall and retention of this material. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Validity and reliability of naturalistic driving scene categorization Judgments from crowdsourcing.
Cabrall, Christopher D D; Lu, Zhenji; Kyriakidis, Miltos; Manca, Laura; Dijksterhuis, Chris; Happee, Riender; de Winter, Joost
2018-05-01
A common challenge with processing naturalistic driving data is that humans may need to categorize great volumes of recorded visual information. By means of the online platform CrowdFlower, we investigated the potential of crowdsourcing to categorize driving scene features (i.e., presence of other road users, straight road segments, etc.) at greater scale than a single person or a small team of researchers would be capable of. In total, 200 workers from 46 different countries participated in 1.5 days. Validity and reliability were examined, both with and without embedding researcher-generated control questions via the CrowdFlower mechanism known as Gold Test Questions (GTQs). By employing GTQs, we found significantly more valid (accurate) and reliable (consistent) identification of driving scene items from external workers. Specifically, in a small-scale CrowdFlower Job of 48 three-second video segments, an accuracy (i.e., relative to the ratings of a confederate researcher) of 91% on items was found with GTQs, compared to 78% without. A difference in bias was found: without GTQs, external workers returned more false positives than with GTQs. In a larger-scale CrowdFlower Job making exclusive use of GTQs, 12,862 three-second video segments were released for annotation. As it was infeasible (and self-defeating) to check the accuracy of each at this scale, a random subset of 1012 categorizations was validated and returned similar levels of accuracy (95%). In the small-scale Job, where full video segments were repeated in triplicate, the percentage of unanimous agreement on the items was found to be significantly more consistent when using GTQs (90%) than without them (65%). Additionally, in the larger-scale Job (where a single second of a video segment was overlapped by ratings of three sequentially neighboring segments), a mean unanimity of 94% was obtained with validated-as-correct ratings and 91% with non-validated ratings.
Because the video segments overlapped in full for the small scale Job, and in part for the larger scale Job, it should be noted that such reliability reported here may not be directly comparable. Nonetheless, such results are both indicative of high levels of obtained rating reliability. Overall, our results provide compelling evidence for CrowdFlower, via use of GTQs, being able to yield more accurate and consistent crowdsourced categorizations of naturalistic driving scene contents than when used without such a control mechanism. Such annotations in such short periods of time present a potentially powerful resource in driving research and driving automation development. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Grieggs, Samuel M.; McLaughlin, Michael J.; Ezekiel, Soundararajan; Blasch, Erik
2015-06-01
As technology and internet use grow at an exponential rate, video and imagery data are becoming increasingly important. Various techniques such as Wide Area Motion Imagery (WAMI), Full Motion Video (FMV), and Hyperspectral Imaging (HSI) are used to collect motion data and extract relevant information. Detecting and identifying a particular object in imagery data is an important step in understanding visual imagery, such as content-based image retrieval (CBIR). Imagery data are segmented, automatically analyzed, and stored in a dynamic and robust database. In our system, we seek to utilize image fusion methods, which require quality metrics. Many Image Fusion (IF) algorithms have been proposed, but only a few metrics are used to evaluate their performance. In this paper, we seek a robust, objective metric to evaluate the performance of IF algorithms, one which compares the outcome of a given algorithm to ground truth and reports several types of errors. Given the ground truth of motion imagery data, it will compute detection failure, false alarm, precision and recall metrics, background and foreground region statistics, as well as splits and merges of foreground regions. Using the Structural Similarity Index (SSIM), Mutual Information (MI), and entropy metrics, experimental results demonstrate the effectiveness of the proposed methodology for object detection, activity exploitation, and CBIR.
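Two of the named metrics are straightforward to sketch: precision/recall computed from detection counts, and the Shannon entropy of an intensity histogram. Function names and the 8-bit histogram layout are assumptions:

```python
import numpy as np

def detection_metrics(tp, fp, fn):
    """Precision/recall from true detections, false alarms, and detection failures."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def image_entropy(img, bins=256):
    """Shannon entropy (bits) of an 8-bit intensity histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0, bins))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())
```

A constant image has zero entropy; an image split evenly between two gray levels has exactly one bit, which makes entropy a cheap sanity check on fused output.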
Multilevel analysis of sports video sequences
NASA Astrophysics Data System (ADS)
Han, Jungong; Farin, Dirk; de With, Peter H. N.
2006-01-01
We propose a fully automatic and flexible framework for analysis and summarization of tennis broadcast video sequences, using visual features and specific game-context knowledge. Our framework can analyze a tennis video sequence at three levels, which provides a broad range of different analysis results. The proposed framework includes novel pixel-level and object-level tennis video processing algorithms, such as a moving-player detection taking both the color and the court (playing-field) information into account, and a player-position tracking algorithm based on a 3-D camera model. Additionally, we employ scene-level models for detecting events, like service, base-line rally and net-approach, based on a number of real-world visual features. The system can summarize three forms of information: (1) all court-view playing frames in a game, (2) the moving trajectory and real speed of each player, as well as the relative position between the player and the court, and (3) the semantic event segments in a game. The proposed framework is flexible in choosing the level of analysis that is desired. It is effective because the framework makes use of several visual cues obtained from the real-world domain to model important events like service, thereby increasing the accuracy of the scene-level analysis. The paper presents attractive experimental results highlighting the system efficiency and analysis capabilities.
Highlight summarization in golf videos using audio signals
NASA Astrophysics Data System (ADS)
Kim, Hyoung-Gook; Kim, Jin Young
2008-01-01
In this paper, we present an automatic summarization of highlights in golf videos based on audio information alone, without video information. The proposed highlight summarization system is based on semantic audio segmentation and detection of action units from audio signals. Studio speech, field speech, music, and applause are segmented by means of sound classification. Swings are detected by impulse onset detection. Sounds like swing and applause form a complete action unit, while studio speech and music parts are used to anchor the program structure. With the advantage of highly precise detection of applause, highlights are extracted effectively. Our experimental results show high classification precision on 18 golf games, demonstrating that the proposed system is effective and computationally efficient enough to be deployed in embedded consumer electronic devices.
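Impulse-onset detection of the kind used above to find swing sounds can be sketched as a jump in short-term frame energy; the frame length and ratio threshold below are illustrative assumptions, not the paper's parameters:

```python
def detect_onsets(signal, frame_len=4, ratio=4.0):
    """Flag frames whose short-term energy jumps by `ratio` over the
    previous frame -- a crude impulse-onset detector for swing sounds."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    onsets = []
    for k in range(1, len(energies)):
        if energies[k] > ratio * max(energies[k - 1], 1e-12):
            onsets.append(k)
    return onsets

quiet = [0.01, -0.01] * 4          # two low-energy frames
impulse = [0.9, -0.8, 0.7, -0.6]   # a loud click, like a club strike
sig = quiet + impulse + quiet
print(detect_onsets(sig))   # [2]
```

In the paper's pipeline an onset of this kind, followed by detected applause, would then be grouped into one complete action unit.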
Yi, Chucai; Tian, Yingli
2012-09-01
In this paper, we propose a novel framework to extract text regions from scene images with complex backgrounds and multiple text appearances. This framework consists of three main steps: boundary clustering (BC), stroke segmentation, and string fragment classification. In BC, we propose a new bigram-color-uniformity-based method to model both text and attachment surface, and cluster edge pixels based on color pairs and spatial positions into boundary layers. Then, stroke segmentation is performed at each boundary layer by color assignment to extract character candidates. We propose two algorithms to combine the structural analysis of text stroke with color assignment and filter out background interferences. Further, we design a robust string fragment classification based on Gabor-based text features. The features are obtained from feature maps of gradient, stroke distribution, and stroke width. The proposed framework of text localization is evaluated on scene images, born-digital images, broadcast video images, and images of handheld objects captured by blind persons. Experimental results on respective datasets demonstrate that the framework outperforms state-of-the-art localization algorithms.
Depth Extraction from Videos Using Geometric Context and Occlusion Boundaries (Open Access)
2014-09-05
Raza, S. Hussain, et al. arXiv:1510.07317v1 [cs.CV], 25 Oct 2015. Depth is extracted from monocular video: frames are first given a temporal segmentation using the method proposed by Grundmann et al. [4], followed by estimation and triangulation to estimate depth maps [17, 27] (see Figure 1).
Schittek Janda, M; Tani Botticelli, A; Mattheos, N; Nebel, D; Wagner, A; Nattestad, A; Attström, R
2005-05-01
Video-based instructions for clinical procedures have been used frequently during the preceding decades. The aim was to investigate, in a randomised controlled trial, the learning effectiveness of fragmented videos vs. the complete sequential video, and to analyse the attitudes of users towards video as a learning aid. An instructional video on surgical hand wash was produced. The video was available in two different forms on two separate web pages: one as a sequential video and one fragmented into eight short clips. Twenty-eight dental students in the second semester were randomised into an experimental (n = 15) and a control group (n = 13). The experimental group used the fragmented form of the video and the control group watched the complete one. The use of the videos was logged and the students were videotaped whilst undertaking a test hand wash. The videos were analysed systematically and blindly by two independent clinicians. The students also performed a written test concerning learning outcome from the videos and answered an attitude questionnaire. The students in the experimental group watched the video for significantly longer than the control group. There were no significant differences between the groups with regard to the ratings and scores when performing the hand wash. The experimental group had significantly better results in the written test compared with those of the control group. There was no significant difference between the groups with regard to attitudes towards the use of video for learning, as measured by Visual Analogue Scales. Most students in both groups expressed satisfaction with the use of video for learning. The students demonstrated positive attitudes and an acceptable learning outcome from viewing CAL videos as a part of their pre-clinical training.
Videos that are part of computer-based learning settings would ideally be presented to the students both as a segmented and as a whole video to give the students the option to choose the form of video which suits the individual student's learning style.
NASA Astrophysics Data System (ADS)
Hidalgo-Aguirre, Maribel; Gitelman, Julian; Lesk, Mark Richard; Costantino, Santiago
2015-11-01
Optical coherence tomography (OCT) imaging has become a standard diagnostic tool in ophthalmology, providing essential information associated with various eye diseases. In order to investigate the dynamics of the ocular fundus, we present a simple and accurate automated algorithm to segment the inner limiting membrane in video-rate optic nerve head spectral domain (SD) OCT images. The method is based on morphological operations including a two-step contrast enhancement technique, proving to be very robust when dealing with low signal-to-noise ratio images and pathological eyes. An analysis algorithm was also developed to measure neuroretinal tissue deformation from the segmented retinal profiles. The performance of the algorithm is demonstrated, and deformation results are presented for healthy and glaucomatous eyes.
Operator-coached machine vision for space telerobotics
NASA Technical Reports Server (NTRS)
Bon, Bruce; Wilcox, Brian; Litwin, Todd; Gennery, Donald B.
1991-01-01
A prototype system for interactive object modeling has been developed and tested. The goal of this effort has been to create a system which would demonstrate the feasibility of highly interactive operator-coached machine vision in a realistic task environment, and to provide a testbed for experimentation with various modes of operator interaction. The purpose of such a system is to use human perception where machine vision is difficult, i.e., to segment the scene into objects and to designate their features, and to use machine vision to overcome limitations of human perception, i.e., for accurate measurement of object geometry. The system captures and displays video images from a number of cameras, allows the operator to designate a polyhedral object one edge at a time by moving a 3-D cursor within these images, performs a least-squares fit of the designated edges to edge data detected with a modified Sobel operator, and combines the edges thus detected to form a wire-frame object model that matches the Sobel data.
Use of videos for Distribution Construction and Maintenance (DC&M) training
DOE Office of Scientific and Technical Information (OSTI.GOV)
Long, G.M.
This paper presents the results of a survey taken among members of the American Gas Association (AGA)'s Distribution Construction and Maintenance (DC&M) committee to gauge the extent, sources, mode of use, and degree of satisfaction with videos as a training aid in distribution construction and maintenance skills. Also cites AGA Engineering Technical Note, DCM-88-3-1, as a catalog of the videos listed by respondents to the survey. Comments on the various sources of training videos and the characteristics of videos from each. Conference presentation included showing of a sampling of video segments from these various sources. 1 fig.
NASA Astrophysics Data System (ADS)
Fauziah; Wibowo, E. P.; Madenda, S.; Hustinawati
2018-03-01
Capturing and recording human motion is mostly done for sports, health, animated films, criminology, and robotics applications. This study combined background subtraction with a back-propagation neural network, with the aim of identifying similar movements. The acquisition process used an 8 MP camera recording in MP4 format, 48 seconds in duration at 30 frames/s; video extraction produced 1444 frames used in the hand-motion identification process. The image-processing phases performed were segmentation, feature extraction, and identification. Segmentation used background subtraction; the extracted features are used to distinguish one object from another. Feature extraction was performed using motion-based morphology analysis with the 7 invariant moments, producing four different motion classes: no object, hand down, hand to side, and hands up. The identification process recognizes hand movements using seven inputs. Testing and training with a variety of parameters showed that the architecture with one hundred hidden neurons provides the highest accuracy. This architecture is used to propagate the input values of the system implementation into the user interface. Identification of the type of human movement achieved a highest accuracy of 98.5447%. The training process is done to obtain the best results.
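The 7 invariant (Hu) moments mentioned above are translation-invariant shape features. A minimal sketch of the first Hu moment on a binary silhouette (pure Python, toy 6x6 masks; not the study's pipeline):

```python
def hu_first_moment(img):
    """First Hu invariant moment (eta20 + eta02) of a binary image,
    invariant to translation -- the kind of shape feature used to
    separate 'hands-up' from 'hand-down' silhouettes."""
    h, w = len(img), len(img[0])
    m00 = m10 = m01 = 0.0
    for y in range(h):
        for x in range(w):
            v = img[y][x]
            m00 += v; m10 += x * v; m01 += y * v
    cx, cy = m10 / m00, m01 / m00                  # centroid
    mu20 = mu02 = 0.0
    for y in range(h):
        for x in range(w):
            v = img[y][x]
            mu20 += (x - cx) ** 2 * v
            mu02 += (y - cy) ** 2 * v
    return (mu20 + mu02) / m00 ** 2                # normalised central moments

square = [[0] * 6 for _ in range(6)]
for y in range(1, 3):
    for x in range(1, 3):
        square[y][x] = 1
shifted = [[0] * 6 for _ in range(6)]
for y in range(3, 5):
    for x in range(3, 5):
        shifted[y][x] = 1
print(hu_first_moment(square) == hu_first_moment(shifted))  # True
```

Because the moment is unchanged under translation, the classifier sees the same feature wherever the hand appears in the frame; the remaining six Hu moments add rotation and scale robustness.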
Classification and Weakly Supervised Pain Localization using Multiple Segment Representation.
Sikka, Karan; Dhall, Abhinav; Bartlett, Marian Stewart
2014-10-01
Automatic pain recognition from videos is a vital clinical application and, owing to its spontaneous nature, poses interesting challenges to automatic facial expression recognition (AFER) research. Previous pain vs no-pain systems have highlighted two major challenges: (1) ground truth is provided for the sequence, but the presence or absence of the target expression for a given frame is unknown, and (2) the time point and the duration of the pain expression event(s) in each video are unknown. To address these issues we propose a novel framework (referred to as MS-MIL) where each sequence is represented as a bag containing multiple segments, and multiple instance learning (MIL) is employed to handle this weakly labeled data in the form of sequence level ground-truth. These segments are generated via multiple clustering of a sequence or running a multi-scale temporal scanning window, and are represented using a state-of-the-art Bag of Words (BoW) representation. This work extends the idea of detecting facial expressions through 'concept frames' to 'concept segments' and argues through extensive experiments that algorithms such as MIL are needed to reap the benefits of such representation. The key advantages of our approach are: (1) joint detection and localization of painful frames using only sequence-level ground-truth, (2) incorporation of temporal dynamics by representing the data not as individual frames but as segments, and (3) extraction of multiple segments, which is well suited to signals with uncertain temporal location and duration in the video. Extensive experiments on UNBC-McMaster Shoulder Pain dataset highlight the effectiveness of the approach by achieving competitive results on both tasks of pain classification and localization in videos. We also empirically evaluate the contributions of different components of MS-MIL. 
The paper also includes the visualization of discriminative facial patches, important for pain detection, as discovered by our algorithm and relates them to Action Units that have been associated with pain expression. We conclude the paper by demonstrating that MS-MIL yields a significant improvement on another spontaneous facial expression dataset, the FEEDTUM dataset.
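The MIL idea above, scoring a bag by its strongest instance, can be sketched with max-pooling over hypothetical per-segment classifier outputs (this is the generic MIL pooling rule, not the exact MS-MIL formulation):

```python
def mil_sequence_score(segment_scores):
    """Max-pooling MIL rule: a video (bag) is scored by its most
    pain-like segment (instance); the argmax localises the event."""
    best = max(range(len(segment_scores)), key=lambda i: segment_scores[i])
    return segment_scores[best], best

scores = [0.1, 0.2, 0.9, 0.3]   # hypothetical per-segment classifier outputs
label_score, where = mil_sequence_score(scores)
print(label_score, where)   # 0.9 2
```

This is why sequence-level ground truth suffices: training pushes the maximum segment score up for pain videos and down for no-pain videos, and the argmax falls out as a localisation for free.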
Pass the Popcorn: “Obesogenic” Behaviors and Stigma in Children’s Movies
Throop, Elizabeth M.; Skinner, Asheley Cockrell; Perrin, Andrew J.; Steiner, Michael J.; Odulana, Adebowale; Perrin, Eliana M.
2014-01-01
Objective To determine the prevalence of obesity-related behaviors and attitudes in children’s movies. Design and Methods We performed a mixed-methods study of the top-grossing G- and PG-rated movies, 2006–2010 (4 per year). For each 10-minute movie segment the following were assessed: 1) prevalence of key nutrition and physical activity behaviors corresponding to the American Academy of Pediatrics obesity prevention recommendations for families; 2) prevalence of weight stigma; 3) assessment as healthy, unhealthy, or neutral; 4) free-text interpretations of stigma. Results Agreement between coders was greater than 85% (Cohen’s kappa=0.7), which is good for binary responses. Among segments with food depicted: exaggerated portion size (26%); unhealthy snacks (51%); sugar-sweetened beverages (19%). Screen time was also prevalent (40% of movies showed television; 35% computer; 20% video games). Unhealthy segments outnumbered healthy segments 2:1. Most (70%) of the movies included weight-related stigmatizing content (e.g. “That fat butt! Flabby arms! And this ridiculous belly!”). Conclusions These popular children’s movies had significant “obesogenic” content, and most contained weight-based stigma. They present a mixed message to children: promoting unhealthy behaviors while stigmatizing the behaviors’ possible effects. Further research is needed to determine the effects of such messages on children. PMID:24311390
From image captioning to video summary using deep recurrent networks and unsupervised segmentation
NASA Astrophysics Data System (ADS)
Morosanu, Bogdan-Andrei; Lemnaru, Camelia
2018-04-01
Automatic captioning systems based on recurrent neural networks have been tremendously successful at providing realistic natural language captions for complex and varied image data. We explore methods for adapting existing models trained on large image caption data sets to a similar problem, that of summarising videos using natural language descriptions and frame selection. These architectures create internal high level representations of the input image that can be used to define probability distributions and distance metrics on these distributions. Specifically, we interpret each hidden unit inside a layer of the caption model as representing the un-normalised log probability of some unknown image feature of interest for the caption generation process. We can then apply well understood statistical divergence measures to express the difference between images and create an unsupervised segmentation of video frames, classifying consecutive images of low divergence as belonging to the same context, and those of high divergence as belonging to different contexts. To provide a final summary of the video, we provide a group of selected frames and a text description accompanying them, allowing a user to perform a quick exploration of large unlabeled video databases.
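The divergence-based frame grouping described above can be sketched with a symmetrised KL divergence between consecutive per-frame distributions; the 3-bin "feature distributions" and the 0.5 threshold are toy assumptions, not the paper's hidden-unit probabilities:

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) between two normalised feature histograms."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def segment_frames(frame_dists, threshold=0.5):
    """Start a new context whenever consecutive frame distributions
    diverge by more than `threshold` (symmetrised KL)."""
    boundaries = [0]
    for k in range(1, len(frame_dists)):
        a, b = frame_dists[k - 1], frame_dists[k]
        if 0.5 * (kl_divergence(a, b) + kl_divergence(b, a)) > threshold:
            boundaries.append(k)
    return boundaries

# three near-identical 'frames', then a context change
frames = [[0.7, 0.2, 0.1], [0.69, 0.21, 0.1], [0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
print(segment_frames(frames))   # [0, 3]
```

Each boundary opens a new context; picking one representative frame per context plus its generated caption yields the final summary.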
Linguistic Characteristics of Individuals with High Functioning Autism and Asperger Syndrome
ERIC Educational Resources Information Center
Seung, Hye Kyeung
2007-01-01
This study examined the linguistic characteristics of high functioning individuals with autism and Asperger syndrome. Each group consisted of 10 participants who were matched on sex, chronological age, and intelligence scores. Participants generated a narrative after watching a brief video segment of the Social Attribution Task video. Each…
Adding Feminist Therapy to Videotape Demonstrations.
ERIC Educational Resources Information Center
Konrad, Jennifer L.; Yoder, Janice D.
2000-01-01
Provides directions for presenting a 32-minute series of four videotape segments that highlights the fundamental features of four approaches to psychotherapy, extending its reach to include a feminist perspective. Describes the approaches and included segments. Reports that students' comments demonstrate that the video sequence provided a helpful…
What Makes a Message Stick? The Role of Content and Context in Social Media Epidemics
2013-09-23
First, we propose visual memes, or frequently re-posted short video segments, for detecting and monitoring latent video interactions at scale. Content ... interactions (such as quoting, or remixing, parts of a video). Visual memes are extracted by scalable detection algorithms that we develop, with ... high accuracy. We further augment visual memes with text, via a statistical model of latent topics. We model content interactions on YouTube with
Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video
Lee, Gil-beom; Lee, Myeong-jin; Lee, Woo-Kyung; Park, Joo-heon; Kim, Tae-Hwan
2017-01-01
Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object’s vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos. PMID:28327515
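The vertical-histogram partitioning step can be sketched as cutting a foreground mask at its weakest interior column; the toy mask below is an assumption, and the full algorithm additionally validates each part's orientation against the regions of light sources:

```python
def split_object_by_valley(mask):
    """Partition a foreground mask at the deepest valley of its vertical
    (column-wise) histogram -- the cue used to separate an upright object
    from an attached cast shadow."""
    n_cols = len(mask[0])
    hist = [sum(row[c] for row in mask) for c in range(n_cols)]
    # valley: interior column with the fewest foreground pixels
    cut = min(range(1, n_cols - 1), key=lambda c: hist[c])
    return hist, cut

mask = [
    [1, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],   # thin bridge where object and shadow touch
]
hist, cut = split_object_by_valley(mask)
print(hist, cut)   # [3, 3, 1, 3, 3, 3] 2
```

After the cut, the part whose orientation points toward a light-source region is screened out as shadow, which is what drives the reported 93.8% shadow removal rate.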
Lalys, Florent; Riffaud, Laurent; Bouget, David; Jannin, Pierre
2012-01-01
The need for better integration of the new generation of Computer-Assisted Surgical (CAS) systems has recently been emphasized. One necessity for achieving this objective is to retrieve data from the Operating Room (OR) with different sensors, then to derive models from these data. Recently, the use of videos from cameras in the OR has demonstrated its efficiency. In this paper, we propose a framework to assist in the development of systems for the automatic recognition of high-level surgical tasks using microscope video analysis, and we validated its use on cataract procedures. The idea is to combine state-of-the-art computer vision techniques with time series analysis. The first step of the framework consisted in the definition of several visual cues for extracting semantic information, thereby characterizing each frame of the video. Five image-based classifiers were therefore implemented. A pupil segmentation step was also applied for dedicated visual cue detection. Time series classification algorithms were then applied to model the time-varying data; Dynamic Time Warping (DTW) and Hidden Markov Models (HMM) were tested. This association combined the advantages of all methods for a better understanding of the problem. The framework was finally validated through various studies. Six binary visual cues were chosen along with 12 phases to detect, obtaining accuracies of 94%. PMID:22203700
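Dynamic Time Warping, one of the two time-series methods tested, can be sketched with the classic dynamic-programming recurrence (generic DTW over 1-D series, not the authors' exact configuration over visual-cue vectors):

```python
def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D series,
    which aligns time-varying sequences of unequal length."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping moves
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# a stretched copy of the same cue sequence aligns at zero cost
print(dtw_distance([0, 1, 2, 2, 3], [0, 1, 2, 3]))   # 0.0
print(dtw_distance([0, 0, 0], [1, 1, 1]))            # 3.0
```

The zero-cost first example is the property that matters here: two executions of the same surgical phase at different speeds still match, which a rigid frame-by-frame comparison would not allow.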
Robust and efficient fiducial tracking for augmented reality in HD-laparoscopic video streams
NASA Astrophysics Data System (ADS)
Mueller, M.; Groch, A.; Baumhauer, M.; Maier-Hein, L.; Teber, D.; Rassweiler, J.; Meinzer, H.-P.; Wegner, I.
2012-02-01
Augmented Reality (AR) is a convenient way of porting information from medical images into the surgical field of view and can deliver valuable assistance to the surgeon, especially in laparoscopic procedures. In addition, high definition (HD) laparoscopic video devices are a great improvement over the previously used low resolution equipment. However, in AR applications that rely on real-time detection of fiducials from video streams, the demand for efficient image processing has increased due to the introduction of HD devices. We present an algorithm based on the well-known Conditional Density Propagation (CONDENSATION) algorithm which can satisfy these new demands. By incorporating a prediction around an already existing and robust segmentation algorithm, we can speed up the whole procedure while leaving the robustness of the fiducial segmentation untouched. For evaluation purposes we tested the algorithm on recordings from real interventions, allowing for a meaningful interpretation of the results. Our results show that we can accelerate the segmentation by a factor of 3.5 on average. Moreover, the prediction information can be used to compensate for fiducials that are temporarily occluded or out of scope, providing greater stability.
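CONDENSATION itself propagates a sampled posterior; as a much simpler sketch of the prediction idea that makes the speed-up possible, a constant-velocity extrapolation of a fiducial's position can define the next frame's small search window (the margin value is an arbitrary assumption):

```python
def predict_roi(prev_positions, margin=10):
    """Constant-velocity prediction of the next fiducial position; only a
    small window around it needs to be searched, instead of the full
    HD frame, which is what accelerates per-frame segmentation."""
    (x1, y1), (x2, y2) = prev_positions[-2], prev_positions[-1]
    px, py = 2 * x2 - x1, 2 * y2 - y1          # linear extrapolation
    return (px - margin, py - margin, px + margin, py + margin)

track = [(100, 50), (104, 53)]                 # last two tracked centres
print(predict_roi(track))   # (98, 46, 118, 66)
```

When a fiducial leaves the window (occlusion, out of scope), the same prediction supplies a stand-in position until detection recovers, matching the compensation behaviour described above.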
Video Comprehensibility and Attention in Very Young Children
Pempek, Tiffany A.; Kirkorian, Heather L.; Richards, John E.; Anderson, Daniel R.; Lund, Anne F.; Stevens, Michael
2010-01-01
Earlier research established that preschool children pay less attention to television that is sequentially or linguistically incomprehensible. This study determines the youngest age for which this effect can be found. One-hundred and three 6-, 12-, 18-, and 24-month-olds’ looking and heart rate were recorded while they watched Teletubbies, a television program designed for very young children. Comprehensibility was manipulated by either randomly ordering shots or reversing dialogue to become backward speech. Infants watched one normal segment and one distorted version of the same segment. Only 24-month-olds, and to some extent 18-month-olds, distinguished between normal and distorted video by looking for longer durations towards the normal stimuli. The results suggest that it may not be until the middle of the second year that children demonstrate the earliest beginnings of comprehension of video as it is currently produced. PMID:20822238
A clinical pilot study of a modular video-CT augmentation system for image-guided skull base surgery
NASA Astrophysics Data System (ADS)
Liu, Wen P.; Mirota, Daniel J.; Uneri, Ali; Otake, Yoshito; Hager, Gregory; Reh, Douglas D.; Ishii, Masaru; Gallia, Gary L.; Siewerdsen, Jeffrey H.
2012-02-01
Augmentation of endoscopic video with preoperative or intraoperative image data [e.g., planning data and/or anatomical segmentations defined in computed tomography (CT) and magnetic resonance (MR)] can improve navigation, spatial orientation, confidence, and tissue resection in skull base surgery, especially with respect to critical neurovascular structures that may be difficult to visualize in the video scene. This paper presents the engineering and evaluation of a video augmentation system for endoscopic skull base surgery translated to use in a clinical study. Extension of previous research yielded a practical system with a modular design that can be applied to other endoscopic surgeries, including orthopedic, abdominal, and thoracic procedures. A clinical pilot study is underway to assess feasibility and benefit to surgical performance by overlaying CT or MR planning data in real-time, high-definition endoscopic video. Preoperative planning included segmentation of the carotid arteries, optic nerves, and surgical target volume (e.g., tumor). An automated camera calibration process was developed that demonstrates a mean re-projection accuracy of (0.7+/-0.3) pixels and a mean target registration error of (2.3+/-1.5) mm. An IRB-approved clinical study involving fifteen patients undergoing skull base tumor surgery is underway in which each surgery includes the experimental video-CT system deployed in parallel to the standard-of-care (unaugmented) video display. Questionnaires distributed to one neurosurgeon and two otolaryngologists are used to assess primary outcome measures regarding the benefit to surgical confidence in localizing critical structures and targets by means of video overlay during surgical approach, resection, and reconstruction.
Tri-state delta modulation system for Space Shuttle digital TV downlink
NASA Technical Reports Server (NTRS)
Udalov, S.; Huth, G. K.; Roberts, D.; Batson, B. H.
1981-01-01
Future requirements for Shuttle Orbiter downlink communication may include transmission of digital video which, in addition to black and white, may also be either field-sequential or NTSC color format. The use of digitized video could provide for picture privacy at the expense of additional onboard hardware, together with an increased bandwidth due to the digitization process. A general objective for the Space Shuttle application is to develop a digitization technique that is compatible with data rates in the 20-30 Mbps range but still provides good quality pictures. This paper describes a tri-state delta modulation/demodulation (TSDM) technique which is a good compromise between implementation complexity and performance. The unique feature of TSDM is that it provides for efficient run-length encoding of constant-intensity segments of a TV picture. Axiomatix has developed a hardware implementation of a high-speed TSDM transmitter and receiver for black-and-white TV and field-sequential color. The hardware complexity of this TSDM implementation is summarized in the paper.
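The run-length-friendly behaviour of TSDM can be sketched with a three-symbol encoder whose dead zone emits 0 across constant-intensity runs; the step size and dead-zone width below are illustrative assumptions, not the Axiomatix hardware's parameters:

```python
def tsdm_encode(samples, delta=4, dead_zone=2):
    """Tri-state delta modulator: emit +1/-1 steps of size `delta`, or 0
    when the tracking error stays inside the dead zone.  Runs of 0s
    (constant-intensity picture segments) compress well with RLE."""
    est, symbols = 0, []
    for s in samples:
        err = s - est
        if err > dead_zone:
            symbols.append(+1); est += delta
        elif err < -dead_zone:
            symbols.append(-1); est -= delta
        else:
            symbols.append(0)
    return symbols

def run_length(symbols):
    """Collapse the symbol stream into (symbol, count) runs."""
    runs = []
    for sym in symbols:
        if runs and runs[-1][0] == sym:
            runs[-1][1] += 1
        else:
            runs.append([sym, 1])
    return runs

line = [0, 0, 0, 0, 10, 10, 10, 10]       # a flat run, then a step edge
syms = tsdm_encode(line)
print(syms)              # [0, 0, 0, 0, 1, 1, 0, 0]
print(run_length(syms))  # [[0, 4], [1, 2], [0, 2]]
```

The long zero runs are exactly what makes run-length encoding of constant-intensity TV-line segments efficient, while the two-state step behaviour handles edges like an ordinary delta modulator.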
The Great War. [Teaching Materials].
ERIC Educational Resources Information Center
Public Broadcasting Service, Washington, DC.
This package of teaching materials is intended to accompany an eight-part film series entitled "The Great War" (i.e., World War I), produced for public television. The package consists of a "teacher's guide," "video segment index," "student resource" materials, and approximately 40 large photographs. The video series is not a war story of battles,…
Optimizing Instructional Video for Preservice Teachers in an Online Technology Integration Course
ERIC Educational Resources Information Center
Ibrahim, Mohamed; Callaway, Rebecca; Bell, David
2014-01-01
This study assessed the effect of design instructional video based on the Cognitive Theory of Multimedia Learning by applying segmentation and signaling on the learning outcome of students in an online technology integration course. The study assessed the correlation between students' personal preferences (preferred learning styles and area…
ERIC Educational Resources Information Center
di Giura, Marcella Beacco
1994-01-01
The problems and value of television as instructional material for the second-language classroom are discussed, and a new videocassette series produced by the journal "Francais dans le Monde" is described. Criteria for topic and segment selection are outlined, and suggestions are made for classroom use. (MSE)
Evolving discriminators for querying video sequences
NASA Astrophysics Data System (ADS)
Iyengar, Giridharan; Lippman, Andrew B.
1997-01-01
In this paper we present a framework for content-based query and retrieval of information from large video databases. This framework enables content-based retrieval of video sequences by characterizing the sequences using motion, texture and colorimetry cues. This characterization is biologically inspired and results in a compact parameter space where every segment of video is represented by an 8-dimensional vector. Searching and retrieval are done accurately in real time in this parameter space. Using this characterization, we then evolve a set of discriminators using Genetic Programming. Experiments indicate that these discriminators are capable of analyzing and characterizing video. The VideoBook is able to search and retrieve video sequences with 92% accuracy in real time. The experiments thus demonstrate that the characterization is capable of extracting higher-level structure from raw pixel values.
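Once every segment is an 8-dimensional vector, retrieval reduces to nearest-neighbour ranking in that space; the sketch below uses hypothetical segment signatures and Euclidean distance (the abstract does not specify the metric):

```python
def retrieve(query_vec, database, k=3):
    """Rank stored 8-D segment vectors by Euclidean distance to the
    query and return the indices of the k best matches."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(query_vec, v)) ** 0.5
    ranked = sorted(range(len(database)), key=lambda i: dist(database[i]))
    return ranked[:k]

db = [
    [0.1] * 8,          # hypothetical motion/texture/colour signatures
    [0.5] * 8,
    [0.9] * 8,
]
print(retrieve([0.45] * 8, db, k=2))   # [1, 0]
```

The compactness of the representation is what makes this real-time: comparing 8 floats per stored segment is cheap no matter how the raw video was encoded.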
Content-based management service for medical videos.
Mendi, Engin; Bayrak, Coskun; Cecen, Songul; Ermisoglu, Emre
2013-01-01
Development of health information technology has had a dramatic impact to improve the efficiency and quality of medical care. Developing interoperable health information systems for healthcare providers has the potential to improve the quality and equitability of patient-centered healthcare. In this article, we describe an automated content-based medical video analysis and management service that provides convenience and ease in accessing the relevant medical video content without sequential scanning. The system facilitates effective temporal video segmentation and content-based visual information retrieval that enable a more reliable understanding of medical video content. The system is implemented as a Web- and mobile-based service and has the potential to offer a knowledge-sharing platform for the purpose of efficient medical video content access.
Li, Jia; Xia, Changqun; Chen, Xiaowu
2017-10-12
Image-based salient object detection (SOD) has been extensively studied in past decades. However, video-based SOD is much less explored due to the lack of large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos. In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in a video can be defined as objects that consistently pop-out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD by using saliency-guided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are constructed in an unsupervised manner that automatically infers a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. In experiments, the proposed unsupervised approach is compared with 31 state-of-the-art models on the proposed dataset and outperforms 30 of them, including 19 image-based classic (unsupervised or non-deep learning) models, six image-based deep learning models, and five video-based unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD.
Vertical dynamic deflection measurement in concrete beams with the Microsoft Kinect.
Qi, Xiaojuan; Lichti, Derek; El-Badry, Mamdouh; Chow, Jacky; Ang, Kathleen
2014-02-19
The Microsoft Kinect is arguably the most popular RGB-D camera currently on the market, partially due to its low cost. It offers many advantages for the measurement of dynamic phenomena since it can directly measure three-dimensional coordinates of objects at video frame rate using a single sensor. This paper presents the results of an investigation into the development of a Microsoft Kinect-based system for measuring the deflection of reinforced concrete beams subjected to cyclic loads. New segmentation methods for object extraction from the Kinect's depth imagery and vertical displacement reconstruction algorithms have been developed and implemented to reconstruct the time-dependent displacement of concrete beams tested in laboratory conditions. The results demonstrate that the amplitude and frequency of the vertical displacements can be reconstructed with submillimetre and milliHz-level precision and accuracy, respectively. PMID:24556668
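As a sketch of the kind of time-series analysis involved (not the authors' reconstruction pipeline), the amplitude and dominant frequency of a vertical displacement signal can be recovered with a discrete Fourier transform. The 30 Hz sampling rate and the 2 mm / 1.5 Hz synthetic signal below are invented for illustration.

```python
import numpy as np

def amplitude_and_frequency(displacement, fs):
    """Estimate peak amplitude (m) and dominant frequency (Hz) of a
    vertical displacement signal sampled at fs Hz."""
    x = displacement - np.mean(displacement)
    amp = (x.max() - x.min()) / 2.0             # peak amplitude
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    f_dom = freqs[np.argmax(spec[1:]) + 1]      # skip the DC bin
    return amp, f_dom

# Synthetic beam deflection: 2 mm amplitude at 1.5 Hz, 30 fps depth stream
fs = 30.0
t = np.arange(0, 10, 1 / fs)
signal = 0.002 * np.sin(2 * np.pi * 1.5 * t)
amp, f = amplitude_and_frequency(signal, fs)
```

With 10 s of data the frequency resolution is 0.1 Hz, which is why milliHz-level accuracy in practice requires longer records or spectral interpolation.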
A Neural-Dynamic Architecture for Concurrent Estimation of Object Pose and Identity
Lomp, Oliver; Faubel, Christian; Schöner, Gregor
2017-01-01
Handling objects or interacting with a human user about objects on a shared tabletop requires that objects be identified after learning from a small number of views and that object pose be estimated. We present a neurally inspired architecture that learns object instances by storing features extracted from a single view of each object. Input features are color and edge histograms from a localized area that is updated during processing. The system finds the best-matching view for the object in a novel input image while concurrently estimating the object’s pose, aligning the learned view with current input. The system is based on neural dynamics, computationally operating in real time, and can handle dynamic scenes directly off live video input. In a scenario with 30 everyday objects, the system achieves recognition rates of 87.2% from a single training view for each object, while also estimating pose quite precisely. We further demonstrate that the system can track moving objects, and that it can segment the visual array, selecting and recognizing one object while suppressing input from another known object in the immediate vicinity. Evaluation on the COIL-100 dataset, in which objects are depicted from different viewing angles, revealed recognition rates of 91.1% on the first 30 objects, each learned from four training views. PMID:28503145
Behavior analysis of video object in complicated background
NASA Astrophysics Data System (ADS)
Zhao, Wenting; Wang, Shigang; Liang, Chao; Wu, Wei; Lu, Yang
2016-10-01
This paper aims to achieve robust behavior recognition of video objects in complicated backgrounds. Features of the video object are described and modeled according to the depth information of three-dimensional video. Multi-dimensional eigenvectors are constructed and used to process the high-dimensional data. Stable object tracking in complex scenes is achieved with multi-feature behavior analysis, yielding the motion trail. Effective behavior recognition of the video object is then obtained according to the decision criteria. Moreover, both the real-time performance of the algorithms and the accuracy of the analysis are greatly improved. The theory and methods for behavior analysis of video objects in real scenes put forward by this project have broad application prospects and important practical significance in security, counter-terrorism, military, and many other fields.
Video indexing based on image and sound
NASA Astrophysics Data System (ADS)
Faudemay, Pascal; Montacie, Claude; Caraty, Marie-Jose
1997-10-01
Video indexing is a major challenge for both scientific and economic reasons. Information extraction can sometimes be easier from the sound channel than from the image channel. We first present a multi-channel and multi-modal query interface for querying sound, image, and script through 'pull' and 'push' queries. We then summarize the segmentation phase, which needs information from the image channel. Detection of critical segments is proposed; it should speed up both automatic and manual indexing. We then present an overview of the information extraction phase. Information can be extracted from the sound channel through speaker recognition, vocal dictation with unconstrained vocabularies, and script alignment with speech. We present experimental results for these various techniques. Speaker recognition methods were tested on the TIMIT and NTIMIT databases. Vocal dictation was tested on newspaper sentences spoken by several speakers. Script alignment was tested on part of a cartoon movie, 'Ivanhoe'. For good-quality sound segments, error rates are low enough for use in indexing applications. Major issues are the processing of sound segments with noise or music, and performance improvement through the use of appropriate, low-cost architectures or networks of workstations.
Texture-adaptive hyperspectral video acquisition system with a spatial light modulator
NASA Astrophysics Data System (ADS)
Fang, Xiaojing; Feng, Jiao; Wang, Yongjin
2014-10-01
We present a new hybrid camera system based on a spatial light modulator (SLM) to capture texture-adaptive high-resolution hyperspectral video. The hybrid camera system records a hyperspectral video with low spatial resolution using a gray camera and a high-spatial-resolution video using an RGB camera. The hyperspectral video is subsampled by the SLM. The subsampled points can be adaptively selected according to the texture characteristics of the scene by combining digital image analysis with computational processing. In this paper, we propose an adaptive sampling method utilizing texture segmentation and the wavelet transform (WT). We also demonstrate the effectiveness of the sampling pattern on the SLM with the proposed method.
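A minimal sketch of the idea, assuming a one-level Haar transform as the texture measure (the function names and the 2x2-block granularity are choices of this sketch, not of the paper): candidate sample sites are placed where wavelet detail energy, i.e. local texture, is largest.

```python
import numpy as np

def haar_detail_energy(img):
    """One-level 2-D Haar transform; returns per-2x2-block detail energy
    (sum of squared horizontal, vertical and diagonal coefficients)."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    lh = (a - b + c - d) / 4.0   # horizontal detail
    hl = (a + b - c - d) / 4.0   # vertical detail
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return lh**2 + hl**2 + hh**2

def pick_sample_points(img, k):
    """Block indices of the k most textured 2x2 blocks - candidate SLM sites."""
    e = haar_detail_energy(img)
    flat = np.argsort(e.ravel())[::-1][:k]
    return np.column_stack(np.unravel_index(flat, e.shape))

# Flat background with a single high-contrast (textured) patch
img = np.zeros((8, 8))
img[2:4, 2:4] = [[1, 0], [0, 1]]
pts = pick_sample_points(img, 1)   # the textured block wins
```

Smooth regions produce near-zero detail coefficients, so the sampling budget is automatically concentrated on textured areas.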
2016-06-01
and material developers use an online game to crowdsource ideas from online players in order to increase viable synthetic prototypes. In entertainment... games, players often create videos of their game play to share with other players to demonstrate how to complete a segment of a game. This thesis... explores similar self-recorded videos of ESP game play and determines if they provide useful data to capability and material developers that can
Vodcasts and Captures: Using Multimedia to Improve Student Learning in Introductory Biology
ERIC Educational Resources Information Center
Walker, J. D.; Cotner, Sehoya; Beermann, Nicholas
2011-01-01
This study investigated the use of multimedia materials to enhance student learning in a large, introductory biology course. Two sections of this course were taught by the same instructor in the same semester. In one section, video podcasts or "vodcasts" were created which combined custom animation and video segments with music and…
Making History: An Indiana Teacher Uses Technology to Feel the History
ERIC Educational Resources Information Center
Technology & Learning, 2008
2008-01-01
Jon Carl's vision is simple: get students passionate about history by turning them into historians. To accomplish this, he created a class centered on documentary film-making. Students choose a topic, conduct research at local libraries, write a script, film video interviews, and create video segments of four to 15 minutes. District technology…
Selective Set Effects Produced by Television Adjunct in Learning from Text.
ERIC Educational Resources Information Center
Yi, Julie C.
This study used television segments to investigate the impact of multimedia in establishing context for text learning. Adult participants (n=128) were shown a video either before or after reading a story. The video shown before reading was intended to create a "set" for either a burglar or buyer perspective contained in the story. The…
Gradual cut detection using low-level vision for digital video
NASA Astrophysics Data System (ADS)
Lee, Jae-Hyun; Choi, Yeun-Sung; Jang, Ok-bae
1996-09-01
Digital video computing and organization is one of the important issues in multimedia systems, signal compression, and databases. Video should be segmented into shots to be used for identification and indexing. This approach requires a suitable method to automatically locate cut points in order to separate shots in a video. Automatic cut detection to isolate shots in a video has received considerable attention due to many practical applications: video databases, browsing, authoring systems, retrieval, and movies. Previous studies are based on difference mechanisms that measure the content changes between video frames, but they cannot detect special effects such as dissolve, wipe, fade-in, fade-out, and structured flashing. In this paper, a new cut detection method for gradual transitions based on computer vision techniques is proposed. Experimental results applied to commercial video are then presented and evaluated.
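The classic twin-comparison technique illustrates how a dual threshold can catch gradual transitions that a single frame-difference threshold misses; this is a generic textbook sketch, not the method proposed in the paper, and all thresholds and difference values are invented.

```python
def twin_comparison(diffs, t_high, t_low):
    """Twin-comparison detection of abrupt and gradual transitions from
    consecutive-frame difference values `diffs` (one value per frame pair)."""
    cuts, graduals = [], []
    start, acc = None, 0.0
    for i, d in enumerate(diffs):
        if d >= t_high:                       # abrupt cut
            cuts.append(i)
            start, acc = None, 0.0
        elif d >= t_low:                      # possible gradual transition
            if start is None:
                start, acc = i, 0.0
            acc += d
        else:
            if start is not None and acc >= t_high:
                graduals.append((start, i))   # accumulated change -> dissolve
            start, acc = None, 0.0
    return cuts, graduals

# Frame-pair differences: one hard cut, then a slow dissolve
diffs = [0.1, 0.9, 0.1, 0.3, 0.35, 0.3, 0.1]
cuts, grads = twin_comparison(diffs, t_high=0.8, t_low=0.25)
```

Each dissolve frame changes only a little, so no single difference crosses the high threshold; it is the accumulated change over the low-threshold run that reveals the transition.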
Automated fall detection on privacy-enhanced video.
Edgcomb, Alex; Vahid, Frank
2012-01-01
A privacy-enhanced video obscures the appearance of a person in the video. We consider four privacy enhancements: blurring of the person, silhouetting of the person, covering the person with a graphical box, and covering the person with a graphical oval. We demonstrate that an automated video-based fall detection algorithm can be as accurate on privacy-enhanced video as on raw video. The algorithm operated on video from a stationary in-home camera, using a foreground-background segmentation algorithm to extract a minimum bounding rectangle (MBR) around the motion in the video, and using time series shapelet analysis on the height and width of the rectangle to detect falls. We report the accuracy of fall detection applied to 23 scenarios, depicted as raw video and privacy-enhanced videos, involving a sole actor portraying normal activities and various falls. We found that fall detection on privacy-enhanced video, except for the common approach of blurring of the person, was competitive with raw video, and in particular that the graphical oval privacy enhancement yielded the same accuracy as raw video, namely 0.91 sensitivity and 0.92 specificity.
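A toy sketch of the MBR idea: a plain threshold on the height/width ratio stands in for the paper's shapelet analysis, and all masks and numbers are invented for illustration.

```python
def bounding_rect(mask):
    """Minimum bounding rectangle (height, width) of the foreground
    pixels in a binary mask (rows of 0/1 values)."""
    rows = [r for r, line in enumerate(mask) if any(line)]
    if not rows:
        return 0, 0
    cols = [c for c in range(len(mask[0])) if any(line[c] for line in mask)]
    return rows[-1] - rows[0] + 1, cols[-1] - cols[0] + 1

def detect_fall(hw_series, ratio_drop=0.5):
    """Flag a fall when the MBR height/width ratio collapses below
    `ratio_drop` times its initial value."""
    ratios = [h / w for h, w in hw_series if w > 0]
    return bool(ratios) and min(ratios) < ratio_drop * ratios[0]

# Person standing (tall, narrow MBR), then lying down (short, wide MBR)
series = [(6, 2), (6, 2), (2, 6), (2, 6)]
fell = detect_fall(series)
```

Because the detector only sees the rectangle's height and width, it is indifferent to how the pixels inside it are obscured, which is why box- and oval-style privacy enhancements barely hurt accuracy.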
A unified framework for gesture recognition and spatiotemporal gesture segmentation.
Alon, Jonathan; Athitsos, Vassilis; Yuan, Quan; Sclaroff, Stan
2009-09-01
Within the context of hand gesture recognition, spatiotemporal gesture segmentation is the task of determining, in a video sequence, where the gesturing hand is located and when the gesture starts and ends. Existing gesture recognition methods typically assume either known spatial segmentation or known temporal segmentation, or both. This paper introduces a unified framework for simultaneously performing spatial segmentation, temporal segmentation, and recognition. In the proposed framework, information flows both bottom-up and top-down. A gesture can be recognized even when the hand location is highly ambiguous and when information about when the gesture begins and ends is unavailable. Thus, the method can be applied to continuous image streams where gestures are performed in front of moving, cluttered backgrounds. The proposed method consists of three novel contributions: a spatiotemporal matching algorithm that can accommodate multiple candidate hand detections in every frame, a classifier-based pruning framework that enables accurate and early rejection of poor matches to gesture models, and a subgesture reasoning algorithm that learns which gesture models can falsely match parts of other longer gestures. The performance of the approach is evaluated on two challenging applications: recognition of hand-signed digits gestured by users wearing short-sleeved shirts, in front of a cluttered background, and retrieval of occurrences of signs of interest in a video database containing continuous, unsegmented signing in American Sign Language (ASL).
DIY Video Abstracts: Lessons from an ultimately successful experience
NASA Astrophysics Data System (ADS)
Brauman, K. A.
2013-12-01
A great video abstract can come together in as little as two days with only a laptop and a sense of adventure. From script to setup, here are tips to make the process practically pain-free. The content of every abstract is unique, but some pointers for writing a video script are universal. Keeping it short and clarifying the message into 4 or 5 single-issue segments makes any video better. Making the video itself can be intimidating, but it doesn't have to be! Practical ideas to be discussed include setting up the script as a narrow column to avoid the appearance of reading, and hunting for a colored backdrop. A lot goes into just two minutes of video, but for not too much effort the payoff is tremendous.
NASA Astrophysics Data System (ADS)
Sa, Qila; Wang, Zhihui
2018-03-01
At present, content-based video retrieval (CBVR) is the mainstream video retrieval method, using the video's own features to perform automatic identification and retrieval. This method involves a key technology, i.e., shot segmentation. In this paper, a method for automatic video shot boundary detection with K-means clustering and improved adaptive dual-threshold comparison is proposed. First, the visual features of every frame are extracted and divided into two categories using the K-means clustering algorithm, namely, one with significant change and one with no significant change. Then, given the classification results, the improved adaptive dual-threshold comparison method is used to determine the abrupt as well as gradual shot boundaries. Finally, an automatic video shot boundary detection system is achieved.
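A minimal sketch of the two-stage idea: a 1-D two-cluster k-means separates per-frame difference values into "significant" and "insignificant" change, and a simple midpoint threshold (a stand-in for the paper's adaptive dual-threshold comparison) then labels abrupt boundaries. The difference values are invented.

```python
def kmeans_1d(values, iters=20):
    """Two-cluster 1-D k-means; returns (low_centre, high_centre, labels)."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        lo = sum(v for v, l in zip(values, labels) if l == 0) / max(1, labels.count(0))
        hi = sum(v for v, l in zip(values, labels) if l == 1) / max(1, labels.count(1))
    return lo, hi, labels

# Per-frame visual change; cluster first, then threshold only frames in the
# "significant change" cluster to label abrupt boundaries
diffs = [0.05, 0.04, 0.9, 0.06, 0.5, 0.07]
lo_c, hi_c, labels = kmeans_1d(diffs)
abrupt = [i for i, (d, l) in enumerate(zip(diffs, labels))
          if l == 1 and d > (lo_c + hi_c) / 2]
```

Clustering first means the threshold adapts to the statistics of each video instead of being fixed in advance.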
A Standard-Compliant Virtual Meeting System with Active Video Object Tracking
NASA Astrophysics Data System (ADS)
Lin, Chia-Wen; Chang, Yao-Jen; Wang, Chih-Ming; Chen, Yung-Chang; Sun, Ming-Ting
2002-12-01
This paper presents an H.323 standard-compliant virtual video conferencing system. The proposed system not only serves as a multipoint control unit (MCU) for multipoint connection but also provides a gateway function between the H.323 LAN (local-area network) and the H.324 WAN (wide-area network) users. The proposed virtual video conferencing system provides user-friendly object compositing and manipulation features including 2D video object scaling, repositioning, rotation, and dynamic bit-allocation in a 3D virtual environment. A reliable and accurate scheme based on background image mosaics is proposed for real-time extraction and tracking of foreground video objects from the video captured with an active camera. Chroma-key insertion is used to facilitate video object extraction and manipulation. We have implemented a prototype of the virtual conference system with an integrated graphical user interface to demonstrate the feasibility of the proposed methods.
Indexed Captioned Searchable Videos: A Learning Companion for STEM Coursework
NASA Astrophysics Data System (ADS)
Tuna, Tayfun; Subhlok, Jaspal; Barker, Lecia; Shah, Shishir; Johnson, Olin; Hovey, Christopher
2017-02-01
Videos of classroom lectures have proven to be a popular and versatile learning resource. A key shortcoming of the lecture video format is accessing the content of interest hidden in a video. This work meets this challenge with an advanced video framework featuring topical indexing, search, and captioning (ICS videos). Standard optical character recognition (OCR) technology was enhanced with image transformations for extraction of text from video frames to support indexing and search. The images and text on video frames are analyzed to divide lecture videos into topical segments. The ICS video player integrates indexing, search, and captioning in video playback, providing instant access to the content of interest. This video framework has been used by more than 70 courses in a variety of STEM disciplines and assessed by more than 4000 students. Results presented from the surveys demonstrate the value of the videos as a learning resource and the role played by videos in a student's learning process. Survey results also establish the value of indexing and search features in a video platform for education. This paper reports on the development and evaluation of the ICS video framework and over five years of usage experience in several STEM courses.
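The topical segmentation step can be sketched with a simple word-overlap criterion between consecutive frames' OCR output. The Jaccard threshold, function names, and data are hypothetical, not the ICS framework's actual algorithm.

```python
def jaccard(a, b):
    """Jaccard similarity of two word sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

def topical_segments(frame_words, threshold=0.2):
    """Split a lecture into segments wherever the OCR word overlap between
    consecutive sampled frames drops below `threshold`."""
    boundaries = [0]
    for i in range(1, len(frame_words)):
        if jaccard(frame_words[i - 1], frame_words[i]) < threshold:
            boundaries.append(i)
    boundaries.append(len(frame_words))
    return [(boundaries[i], boundaries[i + 1]) for i in range(len(boundaries) - 1)]

# OCR'd words per sampled slide frame: two slides on sorting, then a new topic
frames = [{"sort", "merge", "array"},
          {"sort", "merge", "runtime"},
          {"graph", "edge", "vertex"}]
segments = topical_segments(frames)
```

Consecutive slides on one topic share vocabulary, so overlap stays high within a segment and collapses at a topic change.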
Spiers, Adam J; Resnik, Linda; Dollar, Aaron M
2017-07-01
New upper limb prosthetic devices are continuously being developed by a variety of industrial, academic, and hobbyist groups. Yet, little research has evaluated the long-term use of currently available prostheses in daily life activities, beyond laboratory or survey studies. We seek to objectively measure how experienced unilateral upper limb prosthesis users employ their prosthetic devices and unaffected limb for manipulation during everyday activities. In particular, our goal is to create a method for evaluating all types of amputee manipulation, including non-prehensile actions beyond conventional grasp functions, as well as to examine the relative use of both limbs in unilateral and bilateral cases. This study employs a head-mounted video camera to record participants' hands and arms as they complete unstructured domestic tasks within their own homes. A new 'Unilateral Prosthesis-User Manipulation Taxonomy' is presented based on observations from 10 hours of recorded videos. The taxonomy addresses manipulation actions of the intact hand, prostheses, bilateral activities, and environmental feature use (affordances). Our preliminary results involved tagging 23-minute segments of the full videos from 3 amputee participants using the taxonomy. This resulted in over 2,300 tag instances. Observations included that non-prehensile interactions outnumbered prehensile interactions in the affected limb for users with more distal amputations that allowed arm mobility.
ESPN2 Sports Figures Makes Math and Physics a Ball! 1996-97 Educator's Curriculum.
ERIC Educational Resources Information Center
Rusczyk, Richard; Lehoczky, Sandor
This guide is designed to accompany ESPN's SportsFigures video segments which were created to enhance the interest and learning progress of high school students in mathematics, physics, and physical science. Using actual, re-enacted, or staged events, the problems presented in each of the 16 Sports Figures segments illustrate the relationship…
Leveraging Automatic Speech Recognition Errors to Detect Challenging Speech Segments in TED Talks
ERIC Educational Resources Information Center
Mirzaei, Maryam Sadat; Meshgi, Kourosh; Kawahara, Tatsuya
2016-01-01
This study investigates the use of Automatic Speech Recognition (ASR) systems to epitomize second language (L2) listeners' problems in perception of TED talks. ASR-generated transcripts of videos often involve recognition errors, which may indicate difficult segments for L2 listeners. This paper aims to discover the root-causes of the ASR errors…
The Effects of Music on Microsurgical Technique and Performance: A Motion Analysis Study.
Shakir, Afaaf; Chattopadhyay, Arhana; Paek, Laurence S; McGoldrick, Rory B; Chetta, Matthew D; Hui, Kenneth; Lee, Gordon K
2017-05-01
Music is commonly played in operating rooms (ORs) throughout the country. If a preferred genre of music is played, surgeons have been shown to perform surgical tasks quicker and with greater accuracy. However, there are currently no studies investigating the effects of music on microsurgical technique. Motion analysis technology has recently been validated in the objective assessment of plastic surgery trainees' performance of microanastomoses. Here, we aimed to examine the effects of music on microsurgical skills using motion analysis technology as a primary objective assessment tool. Residents and fellows in the Plastic and Reconstructive Surgery program were recruited to complete a demographic survey and participate in microsurgical tasks. Each participant completed 2 arterial microanastomoses on a chicken foot model, one with music playing, and the other without music playing. Participants were blinded to the study objectives and encouraged to perform their best. The order of music and no music was randomized. Microanastomoses were video recorded using a digitalized S-video system and deidentified. Video segments were analyzed using ProAnalyst motion analysis software for automatic noncontact markerless video tracking of the needle driver tip. Nine residents and 3 plastic surgery fellows were tested. Reported microsurgical experience ranged from 1 to 10 arterial anastomoses performed (n = 2), 11 to 100 anastomoses (n = 9), and 101 to 500 anastomoses (n = 1). Mean age was 33 years (range, 29-36 years), with 11 participants right-handed and 1 ambidextrous. Of the 12 subjects tested, 11 (92%) preferred music in the OR. Composite instrument motion analysis scores significantly improved with playing preferred music during testing versus no music (paired t test, P < 0.001). Improvement with music was significant even after stratifying scores by order in which variables were tested (music first vs no music first), postgraduate year, and number of anastomoses (analysis of variance, P < 0.01). Preferred music in the OR may have a positive effect on trainees' microsurgical performance; as such, trainees should be encouraged to participate in setting the conditions of the OR to optimize their comfort and, possibly, performance. Moreover, motion analysis technology is a useful tool with a wide range of applications for surgical education and outcomes optimization.
VOP memory management in MPEG-4
NASA Astrophysics Data System (ADS)
Vaithianathan, Karthikeyan; Panchanathan, Sethuraman
2001-03-01
MPEG-4 is a multimedia standard that requires Video Object Planes (VOPs). Generation of VOPs for an arbitrary video sequence is still a challenging problem that largely remains unsolved. Nevertheless, if the problem is treated by imposing certain constraints, solutions for specific application domains can be found. MPEG-4 applications in mobile devices are one such domain, where the opposing goals of low power and high throughput must both be met. Efficient memory management plays a major role in reducing power consumption. Specifically, efficient memory management for VOPs is difficult because the lifetimes of these objects vary and may overlap. Varying object lifetimes require dynamic memory management, where memory fragmentation is a key problem that needs to be addressed. In general, memory management systems address this problem by following a combination of strategy, policy, and mechanism. For MPEG-4-based mobile devices that lack instruction processors, a hardware-based memory management solution is necessary. In MPEG-4-based mobile devices that have a RISC processor, using a real-time operating system (RTOS) for this memory management task is not expected to be efficient, because the strategies and policies used by the RTOS are often tuned for handling memory segments of much smaller sizes than the objects. Hence, a memory management scheme specifically tuned for VOPs is important. In this paper, different strategies, policies, and mechanisms for memory management are considered, and an efficient combination is proposed for VOP memory management, along with a hardware architecture that can handle the proposed combination.
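As a sketch of the kind of strategy/policy/mechanism combination involved, here is a toy best-fit allocator with coalescing of adjacent free segments. It illustrates the fragmentation problem for variable-lifetime objects; it is not the paper's proposed design, and the sizes are invented.

```python
class BestFitPool:
    """Toy best-fit allocator over a fixed buffer of `size` units."""
    def __init__(self, size):
        self.free = [(0, size)]   # (offset, length) free segments
        self.used = {}            # offset -> length

    def alloc(self, length):
        fits = [seg for seg in self.free if seg[1] >= length]
        if not fits:
            return None                                 # fragmentation/exhaustion
        off, seglen = min(fits, key=lambda s: s[1])     # tightest fit wins
        self.free.remove((off, seglen))
        if seglen > length:
            self.free.append((off + length, seglen - length))
        self.used[off] = length
        return off

    def release(self, off):
        self.free.append((off, self.used.pop(off)))
        # Coalesce adjacent free segments to fight fragmentation
        self.free.sort()
        merged = [self.free[0]]
        for o, l in self.free[1:]:
            po, pl = merged[-1]
            merged[-1] = (po, pl + l) if po + pl == o else merged.append((o, l)) or merged[-1]
        self.free = merged

pool = BestFitPool(100)
a = pool.alloc(30)   # VOP with a short lifetime
b = pool.alloc(50)
pool.release(a)      # leaves a 30-unit hole at offset 0
c = pool.alloc(20)   # best fit picks the tightest segment (the 20-unit tail)
```

Best fit keeps large holes intact for large future VOPs; the cost is searching the free list, which is why a hardware mechanism matters when no instruction processor is available.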
Federal Register 2010, 2011, 2012, 2013, 2014
2012-12-21
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-852] Certain Video Analytics Software..., 2012, based on a complaint filed by ObjectVideo, Inc. (``ObjectVideo'') of Reston, Virginia. 77 FR... United States after importation of certain video analytics software systems, components thereof, and...
Video and image retrieval beyond the cognitive level: the needs and possibilities
NASA Astrophysics Data System (ADS)
Hanjalic, Alan
2000-12-01
The worldwide research efforts in the area of image and video retrieval have concentrated so far on increasing the efficiency and reliability of extracting the elements of image and video semantics, and thus on improving search and retrieval performance at the cognitive level of content abstraction. At this abstraction level, the user is searching for 'factual' or 'objective' content, such as an image showing a panorama of San Francisco, an outdoor or an indoor image, a broadcast news report on a defined topic, a movie dialog between the actors A and B, or the parts of a basketball game showing fast breaks, steals, and scores. These efforts, however, do not address retrieval applications at the so-called affective level of content abstraction, where the 'ground truth' is not strictly defined. Such applications are, for instance, those where the subjectivity of the user plays the major role, e.g. the task of retrieving all images that the user 'likes most', and those based on 'recognizing emotions' in audiovisual data. Typical examples are searching for all images that 'radiate happiness', identifying all 'sad' movie fragments, and looking for 'romantic landscapes', 'sentimental' movie segments, 'movie highlights', or the 'most exciting' moments of a sport event. This paper discusses the needs and possibilities for widening the current scope of research in the area of image and video search and retrieval in order to enable applications at the affective level of content abstraction.
Astrometric and Photometric Analysis of the September 2008 ATV-1 Re-Entry Event
NASA Technical Reports Server (NTRS)
Mulrooney, Mark K.; Barker, Edwin S.; Maley, Paul D.; Beaulieu, Kevin R.; Stokely, Christopher L.
2008-01-01
NASA utilized image-intensified video cameras for ATV data acquisition from a jet flying at 12.8 km. Afterwards, the video was digitized and then analyzed with a modified commercial software package, Image Systems Trackeye. Astrometric results were limited by saturation, plate scale, and the imposed linear plate solution based on field reference stars. Time-dependent fragment angular trajectories, velocities, accelerations, and luminosities were derived in each video segment. It was evident that individual fragments behave differently. Photometric accuracy was insufficient to confidently assess correlations between luminosity and fragment spatial behavior (velocity, deceleration). Use of high-resolution digital video cameras in the future should remedy this shortcoming.
Intelligent video storage of visual evidences on site in fast deployment
NASA Astrophysics Data System (ADS)
Desurmont, Xavier; Bastide, Arnaud; Delaigle, Jean-Francois
2004-07-01
In this article we present a generic, flexible, scalable, and robust approach for an intelligent real-time forensic visual system. The proposed implementation can be rapidly deployed and requires minimal logistic support, as it embeds low-complexity devices (PCs and cameras) that communicate through a wireless network. The goal of these advanced tools is to provide intelligent video storage of potential video evidence for fast intervention during deployment around a hazardous sector after a terrorist attack, a disaster, an air crash, or before an attempted one. Advanced video analysis tools, such as segmentation and tracking, are provided to support intelligent storage and annotation.
Video enhancement workbench: an operational real-time video image processing system
NASA Astrophysics Data System (ADS)
Yool, Stephen R.; Van Vactor, David L.; Smedley, Kirk G.
1993-01-01
Video image sequences can be exploited in real-time, giving analysts rapid access to information for military or criminal investigations. Video-rate dynamic range adjustment subdues fluctuations in image intensity, thereby assisting discrimination of small or low- contrast objects. Contrast-regulated unsharp masking enhances differentially shadowed or otherwise low-contrast image regions. Real-time removal of localized hotspots, when combined with automatic histogram equalization, may enhance resolution of objects directly adjacent. In video imagery corrupted by zero-mean noise, real-time frame averaging can assist resolution and location of small or low-contrast objects. To maximize analyst efficiency, lengthy video sequences can be screened automatically for low-frequency, high-magnitude events. Combined zoom, roam, and automatic dynamic range adjustment permit rapid analysis of facial features captured by video cameras recording crimes in progress. When trying to resolve small objects in murky seawater, stereo video places the moving imagery in an optimal setting for human interpretation.
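The frame-averaging claim follows from basic statistics: averaging n frames of zero-mean noise reduces its standard deviation by roughly a factor of sqrt(n). A quick numerical sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
truth = np.full((32, 32), 0.5)   # static scene, constant intensity

def averaged_frame(n):
    """Mean of n frames corrupted by zero-mean Gaussian noise (sigma = 0.1)."""
    frames = truth + rng.normal(0.0, 0.1, (n, 32, 32))
    return frames.mean(axis=0)

err_1 = (averaged_frame(1) - truth).std()    # roughly 0.1
err_16 = (averaged_frame(16) - truth).std()  # roughly 0.1 / sqrt(16)
```

The catch in practice is that the scene must be static (or registered) over the averaging window, otherwise moving objects are smeared.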
MPEG-7 audio-visual indexing test-bed for video retrieval
NASA Astrophysics Data System (ADS)
Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian
2003-12-01
This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine, accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, end-users will be able to request movie shots in the database that were produced in a specific year, that contain the face of a specific actor speaking a specific word, and in which there is no motion activity. Video streaming is performed over the high-bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.
Grayscale image segmentation for real-time traffic sign recognition: the hardware point of view
NASA Astrophysics Data System (ADS)
Cao, Tam P.; Deng, Guang; Elton, Darrell
2009-02-01
In this paper, we study several grayscale-based image segmentation methods for real-time road sign recognition applications on an FPGA hardware platform. The performance of different image segmentation algorithms under different lighting conditions is initially compared using PC simulation. Based on these results and analysis, suitable algorithms are implemented and tested on a real-time FPGA speed sign detection system. Experimental results show that the system using segmented images requires significantly fewer hardware resources on an FPGA while maintaining comparable system performance. The system is capable of processing 60 live video frames per second.
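As a rough illustration of the kind of grayscale segmentation that maps well onto FPGA logic, the sketch below thresholds each pixel against a single global value derived from the frame mean; the function names and the mean-based adaptation are illustrative assumptions, not the algorithms evaluated in the paper:

```python
import numpy as np

def segment_grayscale(frame: np.ndarray, threshold: int) -> np.ndarray:
    """Binary segmentation by global thresholding.

    A fixed per-pixel comparison maps directly onto FPGA logic:
    no multipliers, just one comparator on the pixel stream.
    """
    return (frame >= threshold).astype(np.uint8)

def adaptive_threshold(frame: np.ndarray) -> int:
    """Hypothetical lighting adaptation: derive the threshold from the
    frame mean, which in hardware is a cheap running accumulator."""
    return int(frame.mean())

frame = np.array([[10, 200], [90, 160]], dtype=np.uint8)
mask = segment_grayscale(frame, adaptive_threshold(frame))
```

A real speed-sign pipeline would follow the mask with connected-component filtering, but the threshold stage is where most of the lighting sensitivity compared in the paper arises.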
Eyben, Florian; Weninger, Felix; Lehment, Nicolas; Schuller, Björn; Rigoll, Gerhard
2013-01-01
Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet, the lion's share of research in affective computing is exclusively focusing on signals conveyed by humans, such as affective speech. Uniting the fields of multimedia retrieval and affective computing is believed to lend to a multiplicity of interesting retrieval applications, and at the same time to benefit affective computing research, by moving its methodology "out of the lab" to real-world, diverse data. In this contribution, we address the problem of finding "disturbing" scenes in movies, a scenario that is highly relevant for computer-aided parental guidance. We apply large-scale segmental feature extraction combined with audio-visual classification to the particular task of detecting violence. Our system performs fully data-driven analysis including automatic segmentation. We evaluate the system in terms of mean average precision (MAP) on the official data set of the MediaEval 2012 evaluation campaign's Affect Task, which consists of 18 original Hollywood movies, achieving up to .398 MAP on unseen test data in full realism. An in-depth analysis of the worth of individual features with respect to the target class and the system errors is carried out and reveals the importance of peak-related audio feature extraction and low-level histogram-based video analysis.
Hierarchical vs non-hierarchical audio indexation and classification for video genres
NASA Astrophysics Data System (ADS)
Dammak, Nouha; BenAyed, Yassine
2018-04-01
In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based only on audio features extracted at the block level, which has the advantage of capturing local temporal information. The main contribution of our study is to show the strong effect on classification accuracy of using a hierarchical categorization structure based on the Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. The classification covers three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The validation of this approach was carried out on over 360 minutes of video, yielding a classification accuracy of over 99%.
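The hierarchical scheme the paper describes can be sketched as a two-stage router: a coarse genre classifier followed by per-genre sub-genre classifiers. As an assumption for a self-contained sketch, a nearest-centroid classifier stands in for the paper's SVMs (any object with the same fit/predict interface would slot in), and the feature vectors are taken to be precomputed block-level MFCC statistics:

```python
import numpy as np

class NearestCentroid:
    """Stand-in for the paper's SVMs; any fit/predict classifier works."""
    def fit(self, X, y):
        self.labels_ = sorted(set(y))
        self.centroids_ = {c: X[np.array(y) == c].mean(axis=0) for c in self.labels_}
        return self
    def predict_one(self, x):
        # Assign to the class whose centroid is closest in feature space.
        return min(self.labels_, key=lambda c: np.linalg.norm(x - self.centroids_[c]))

def hierarchical_predict(x, genre_clf, subgenre_clfs):
    """First pick the coarse genre (sports/music/news), then route the
    same feature vector to that genre's dedicated sub-genre classifier."""
    genre = genre_clf.predict_one(x)
    return genre, subgenre_clfs[genre].predict_one(x)
```

The point of the hierarchy is that each sub-genre classifier only has to separate acoustically similar classes, which is where the paper reports the accuracy gain over a flat classifier.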
Automatic colonic lesion detection and tracking in endoscopic videos
NASA Astrophysics Data System (ADS)
Li, Wenjing; Gustafsson, Ulf; A-Rahim, Yoursif
2011-03-01
The biology of colorectal cancer offers an opportunity for both early detection and prevention. Compared with other imaging modalities, optical colonoscopy is the procedure of choice for simultaneous detection and removal of colonic polyps. Computer-assisted screening makes it possible to assist physicians and potentially improve the accuracy of the diagnostic decision during the exam. This paper presents an unsupervised method to detect and track colonic lesions in endoscopic videos. The aim of the lesion screening and tracking is to facilitate detection of polyps and abnormal mucosa in real time as the physician is performing the procedure. For colonic lesion detection, conventional marker-controlled watershed segmentation is used to segment the colonic lesions, followed by an adaptive ellipse-fitting strategy to further validate the shape. For colonic lesion tracking, a mean shift tracker with background modeling is used to track the target region from the detection phase. The approach has been tested on colonoscopy videos acquired during regular colonoscopic procedures and demonstrated promising results.
Design, implementation and accuracy of a prototype for medical augmented reality.
Pandya, Abhilash; Siadat, Mohammad-Reza; Auner, Greg
2005-01-01
This paper is focused on prototype development and accuracy evaluation of a medical Augmented Reality (AR) system. The accuracy of such a system is of critical importance for medical use, and is hence considered in detail. We analyze the individual error contributions and the system accuracy of the prototype. A passive articulated arm is used to track a calibrated end-effector-mounted video camera. The live video view is superimposed in real time with the synchronized graphical view of CT-derived segmented object(s) of interest within a phantom skull. The AR accuracy mostly depends on the accuracy of the tracking technology, the registration procedure, the camera calibration, and the image scanning device (e.g., a CT or MRI scanner). The accuracy of the Microscribe arm was measured to be 0.87 mm. After mounting the camera on the tracking device, the AR accuracy was measured to be 2.74 mm on average (standard deviation = 0.81 mm). After using data from a 2-mm-thick CT scan, the AR error remained essentially the same at an average of 2.75 mm (standard deviation = 1.19 mm). For neurosurgery, the acceptable error is approximately 2-3 mm, and our prototype approaches these accuracy requirements. The accuracy could be increased with a higher-fidelity tracking system and improved calibration and object registration. The design and methods of this prototype device can be extrapolated to current medical robotics (due to the kinematic similarity) and neuronavigation systems.
Tracking and people counting using Particle Filter Method
NASA Astrophysics Data System (ADS)
Sulistyaningrum, D. R.; Setiyono, B.; Rizky, M. S.
2018-03-01
In recent years, technology has developed quite rapidly, especially in the field of object tracking; the task becomes more difficult when the objects under study are people and their number is large. The purpose of this research is to apply the Particle Filter method for tracking and counting people in a certain area. Tracking people is harder in the presence of obstacles, one of which is occlusion. The stages of the tracking and people-counting scheme in this study include pre-processing, segmentation using a Gaussian Mixture Model (GMM), tracking using a particle filter, and counting based on centroids. The Particle Filter method uses the estimated motion included in the model used. The test results show that tracking and people counting can be done well, with average accuracies of 89.33% and 77.33% respectively over six test videos. In the process of tracking people, the results are good under partial occlusion and no occlusion.
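The tracking stage can be sketched as a bootstrap particle filter over a 2-D position. The random-walk motion model, the Gaussian measurement likelihood, and all constants below are illustrative assumptions; the paper's actual motion model and image-based likelihood (from the GMM foreground) are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, measurement, motion_std=1.0, meas_std=2.0):
    """One predict/update/resample cycle of a bootstrap particle filter
    tracking a 2-D position. `particles` has shape (N, 2)."""
    # Predict: propagate each particle under a random-walk motion model.
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # Update: reweight by the Gaussian likelihood of the measurement.
    d2 = ((particles - measurement) ** 2).sum(axis=1)
    weights = weights * np.exp(-0.5 * d2 / meas_std**2)
    weights /= weights.sum()
    # Resample (multinomial) to fight weight degeneracy.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Track a target moving along x; the estimate is the particle mean.
particles = rng.uniform(-5, 5, (500, 2))
weights = np.full(500, 1.0 / 500)
for t in range(1, 6):
    z = np.array([float(t), 0.0])  # synthetic measurement for brevity
    particles, weights = particle_filter_step(particles, weights, z)
estimate = particles.mean(axis=0)
```

In the full scheme, each tracked person gets such a filter, and the counting step reads off the per-target centroid estimates.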
NASA Astrophysics Data System (ADS)
Kaur, Berinderjeet; Tay, Eng Guan; Toh, Tin Lam; Leong, Yew Hoong; Lee, Ngan Hoe
2018-03-01
A study of school mathematics curriculum enacted by competent teachers in Singapore secondary schools is a programmatic research project at the National Institute of Education (NIE) funded by the Ministry of Education (MOE) in Singapore through the Office of Education Research (OER) at NIE. The main goal of the project is to collect a set of data that would be used by two studies to research the enacted secondary school mathematics curriculum. The project aims to examine how competent experienced secondary school teachers implement the designated curriculum prescribed by the MOE in the 2013 revision of curriculum. It does this firstly by examining the video recordings of the classroom instruction and interactions between secondary school mathematics teachers and their students, as it is these interactions that fundamentally determine the nature of the actual mathematics learning and teaching that take place in the classroom. It also examines content through the instructional materials used—their preparation, use in classroom and as homework. The project comprises a video segment and a survey segment. Approximately 630 secondary mathematics teachers and 600 students are participating in the project. The data collection for the video segment of the project is guided by the renowned complementary accounts methodology while the survey segment adopts a self-report questionnaire approach. The findings of the project will serve several purposes. They will provide timely feedback to mathematics specialists in the MOE, inform pre-service and professional development programmes for mathematics teachers at the NIE and contribute towards articulation of "Mathematics pedagogy in Singapore secondary schools" that is evidence based.
Two novel motion-based algorithms for surveillance video analysis on embedded platforms
NASA Astrophysics Data System (ADS)
Vijverberg, Julien A.; Loomans, Marijn J. H.; Koeleman, Cornelis J.; de With, Peter H. N.
2010-05-01
This paper proposes two novel motion-vector based techniques for target detection and target tracking in surveillance videos. The algorithms are designed to operate on a resource-constrained device, such as a surveillance camera, and to reuse the motion vectors generated by the video encoder. The first novel algorithm for target detection uses motion vectors to construct a consistent motion mask, which is combined with a simple background segmentation technique to obtain a segmentation mask. The second proposed algorithm aims at multi-target tracking and uses motion vectors to assign blocks to targets employing five features. The weights of these features are adapted based on the interaction between targets. These algorithms are combined in one complete analysis application. The performance of this application for target detection has been evaluated for the i-LIDS sterile zone dataset and achieves an F1-score of 0.40-0.69. The performance of the analysis algorithm for multi-target tracking has been evaluated using the CAVIAR dataset and achieves an MOTP of around 9.7 and MOTA of 0.17-0.25. On a selection of targets in videos from other datasets, the achieved MOTP and MOTA are 8.8-10.5 and 0.32-0.49 respectively. The execution time on a PC-based platform is 36 ms. This includes the 20 ms for generating motion vectors, which are also required by the video encoder.
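The first algorithm's idea of reusing encoder motion vectors and demanding temporal consistency can be sketched as follows; the array layout and thresholds are assumptions for illustration, not the paper's parameters:

```python
import numpy as np

def motion_mask(mv_frames, mag_thresh=1.0, min_consistent=3):
    """Build a consistent-motion mask from per-block encoder motion vectors.

    mv_frames: array of shape (T, H, W, 2) holding one (dy, dx) motion
    vector per block over T frames. A block is marked moving only if its
    vector magnitude exceeds `mag_thresh` in at least `min_consistent`
    of the T frames, suppressing the spurious vectors that encoders
    emit in flat, textureless regions.
    """
    mags = np.linalg.norm(mv_frames, axis=-1)   # (T, H, W) magnitudes
    active = mags > mag_thresh                  # per-frame activity
    return active.sum(axis=0) >= min_consistent # (H, W) boolean mask
```

The resulting mask would then be intersected with the paper's simple background-segmentation result to form the final segmentation mask.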
Perceptual video quality assessment in H.264 video coding standard using objective modeling.
Karthikeyan, Ramasamy; Sainarayanan, Gopalakrishnan; Deepa, Subramaniam Nachimuthu
2014-01-01
Since usage of digital video is widespread nowadays, quality considerations have become essential, and industry demand for video quality measurement is rising. This proposal provides a method of perceptual quality assessment for the H.264 standard encoder using objective modeling. For this purpose, quality impairments are calculated and a model is developed to compute the perceptual video quality metric based on a no-reference method. Because of the subtle difference between the original video and the encoded video, the quality of the encoded picture is degraded; this quality difference is introduced by the encoding process, such as intra- and inter-prediction. The proposed model takes into account the artifacts introduced by these spatial and temporal activities in hybrid block-based coding methods, and an objective modeling of these artifacts into a subjective quality estimate is proposed. The proposed model calculates the objective quality metric using subjective impairments (blockiness, blur and jerkiness) in contrast to the existing bitrate-only calculation defined in the ITU-T G.1070 model. The accuracy of the proposed perceptual video quality metrics is compared against popular full-reference objective methods as defined by the VQEG.
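Of the three impairments, blockiness is the easiest to illustrate. A common no-reference approach (the paper's exact formula is not given here, so this is a generic sketch) compares luminance steps across the codec's 8-pixel block boundaries with steps elsewhere:

```python
import numpy as np

def blockiness(frame: np.ndarray, block: int = 8) -> float:
    """No-reference blockiness estimate for block-based codecs.

    Compares the mean absolute luminance step across 8-pixel block
    boundaries with the mean step at all other column transitions;
    a ratio well above 1 indicates visible blocking artifacts.
    """
    diffs = np.abs(np.diff(frame.astype(float), axis=1))  # steps between adjacent columns
    cols = np.arange(diffs.shape[1])
    boundary = (cols % block) == block - 1                # transition k -> k+1 crosses a block edge
    eps = 1e-9                                            # guard against flat frames
    return (diffs[:, boundary].mean() + eps) / (diffs[:, ~boundary].mean() + eps)
```

Analogous ratios along rows, plus blur (edge-width) and jerkiness (frame-difference) terms, would then be pooled into the single perceptual score the model regresses against subjective ratings.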
Video repairing under variable illumination using cyclic motions.
Jia, Jiaya; Tai, Yu-Wing; Wu, Tai-Pang; Tang, Chi-Keung
2006-05-01
This paper presents a complete system capable of synthesizing a large number of pixels that are missing due to occlusion or damage in an uncalibrated input video. These missing pixels may correspond to the static background or cyclic motions of the captured scene. Our system employs user-assisted video layer segmentation, while the main processing in video repair is fully automatic. The input video is first decomposed into the color and illumination videos. The necessary temporal consistency is maintained by tensor voting in the spatio-temporal domain. Missing colors and illumination of the background are synthesized by applying image repairing. Finally, the occluded motions are inferred by spatio-temporal alignment of collected samples at multiple scales. We experimented on our system with some difficult examples with variable illumination, where the capturing camera can be stationary or in motion.
Analysis of environmental sounds
NASA Astrophysics Data System (ADS)
Lee, Keansub
Environmental sound archives - casual recordings of people's daily life - are easily collected by MP3 players or camcorders at low cost and with high reliability, and shared on web sites. There are two kinds of user-generated recordings we would like to be able to handle in this thesis: continuous long-duration personal audio, and soundtracks of short consumer video clips. These environmental recordings contain a lot of useful information (semantic concepts) related to activity, location, occasion and content. As a consequence, the environmental archives present many new opportunities for the automatic extraction of information that can be used in intelligent browsing systems. This thesis proposes systems for detecting these interesting concepts in a collection of these real-world recordings. The first system segments and labels personal audio archives - continuous recordings of an individual's everyday experiences - into 'episodes' (relatively consistent acoustic situations lasting a few minutes or more) using the Bayesian Information Criterion and spectral clustering. The second system identifies regions of speech or music in the kinds of energetic and highly variable noise present in this real-world sound. Motivated by psychoacoustic evidence that pitch is crucial in the perception and organization of sound, we develop a noise-robust pitch detection algorithm to locate speech or music-like regions. To avoid false alarms resulting from background noise with strong periodic components (such as air conditioning), a new scheme is added in order to suppress these noises in the autocorrelogram domain. In addition, the third system automatically detects a large set of interesting semantic concepts, which we chose for being both informative and useful to users, as well as being technically feasible. 
These 25 concepts are associated with people's activities, locations, occasions, objects, scenes and sounds, and are based on a large collection of consumer videos in conjunction with user studies. We model the soundtrack of each video, regardless of its original duration, as a fixed-size clip-level summary feature. For each concept, an SVM-based classifier is trained according to three distance measures (Kullback-Leibler, Bhattacharyya, and Mahalanobis distance). Detecting the time of occurrence of a local object (for instance, a cheering sound) embedded in a longer soundtrack is useful and important for applications such as search and retrieval in consumer video archives. We finally present a Markov-model based clustering algorithm able to identify and segment consistent sets of temporal frames into regions associated with different ground-truth labels, and at the same time to exclude a set of uninformative frames shared in common by all clips. The labels are provided at the clip level, so this refinement of the time axis represents a variant of Multiple-Instance Learning (MIL). Quantitative evaluation shows that the performance of our proposed approaches, tested on the 60-hour personal audio archives and 1,900 YouTube video clips, is significantly better than that of existing algorithms for detecting these useful concepts in real-world personal audio recordings.
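The BIC-based episode segmentation mentioned above rests on a simple test: is a window of audio features better modeled as one Gaussian or as two Gaussians split at a candidate change point? A minimal sketch, assuming diagonal-covariance Gaussians and a tunable penalty weight (the thesis's exact feature set and penalty are not reproduced):

```python
import numpy as np

def delta_bic(X, t, penalty=1.0):
    """BIC difference for a change point at frame t in feature sequence X
    of shape (N, d). Positive values favor splitting into two segments,
    i.e. an acoustic change at t (diagonal-covariance Gaussians)."""
    def loglik(S):
        # Max-likelihood log-likelihood of a diagonal Gaussian fit to S.
        var = S.var(axis=0) + 1e-6
        n, d = S.shape
        return -0.5 * n * (d * np.log(2 * np.pi) + np.log(var).sum() + d)
    n, d = X.shape
    k = penalty * 0.5 * (2 * d) * np.log(n)  # extra mean+variance parameters
    return loglik(X[:t]) + loglik(X[t:]) - loglik(X) - k

def best_change_point(X, margin=10):
    """Scan candidate split points, keeping a margin from the edges."""
    return max(range(margin, len(X) - margin), key=lambda t: delta_bic(X, t))
```

Run repeatedly over a sliding window, accepted change points partition the stream into the 'episodes' that spectral clustering then groups.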
Videos for Science Communication and Nature Interpretation: The TIB|AV-Portal as Resource.
NASA Astrophysics Data System (ADS)
Marín Arraiza, Paloma; Plank, Margret; Löwe, Peter
2016-04-01
Scientific audiovisual media such as videos of research, interactive displays or computer animations have become an important part of scientific communication and education. Dynamic phenomena can be described better by audiovisual media than by words and pictures. For this reason, scientific videos help us to understand and discuss environmental phenomena more efficiently. Moreover, the creation of scientific videos is easier than ever, thanks to mobile devices and open-source editing software. Video clips, webinars or even the interactive part of a PICO are formats of scientific audiovisual media used in the Geosciences. This type of media translates location-referenced Science Communication, such as environmental interpretation, into computer-based Science Communication. A new form of Science Communication is video abstracting. A video abstract is a three- to five-minute video statement that provides background information about a research paper. It also gives authors the opportunity to present their research activities to a wider audience. Since this kind of media has become an important part of scientific communication, there is a need for reliable infrastructures capable of managing the digital assets researchers generate. Using the use case of video abstracts, this paper gives an overview of the activities of the German National Library of Science and Technology (TIB) regarding publishing and linking audiovisual media in a scientifically sound way. The TIB, in cooperation with the Hasso Plattner Institute (HPI), developed a web-based portal (av.tib.eu) that optimises access to scientific videos in the fields of science and technology. Videos from the realms of science and technology can easily be uploaded onto the TIB|AV-Portal. Within a short period of time the videos are assigned a digital object identifier (DOI). This enables them to be referenced, cited, and linked (e.g. 
to the relevant article or further supplementary materials). By using media fragment identifiers, not only the whole video can be cited, but also individual parts of it. Users are thus also likely to find high-quality related content (for instance, a video abstract and the corresponding article, or an expedition documentary and its field notebook). Based on automatic analysis of speech, images and texts within the videos, a large amount of metadata associated with segments of the video is generated automatically. These metadata enhance the searchability of the video and make it easier to retrieve and interlink meaningful parts of it. This new and reliable library-driven infrastructure allows all these different types of data to be discoverable, accessible, citable, freely reusable, and interlinked. Therefore, it simplifies Science Communication.
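The media fragment identifiers mentioned above follow the W3C Media Fragments URI syntax, where a temporal selector `#t=start,end` (in seconds) is appended to the resource URL. A minimal sketch; the DOI below is a made-up placeholder, not a real TIB|AV-Portal identifier:

```python
def media_fragment(doi_url: str, start_s: float, end_s: float) -> str:
    """Cite a temporal segment of a video by appending a W3C Media
    Fragments temporal selector (#t=start,end, in seconds) to its
    DOI resolver link."""
    return f"{doi_url}#t={start_s:g},{end_s:g}"

# Hypothetical DOI, for illustration only.
link = media_fragment("https://doi.org/10.5446/12345", 90, 150)
```

A citation built this way resolves to the whole video but lets a fragment-aware player seek directly to the cited interval.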
Extraction and analysis of neuron firing signals from deep cortical video microscopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerekes, Ryan A; Blundon, Jay
We introduce a method for extracting and analyzing neuronal activity time signals from video of the cortex of a live animal. The signals correspond to the firing activity of individual cortical neurons. Activity signals are based on the changing fluorescence of calcium indicators in the cells over time. We propose a cell segmentation method that relies on a user-specified center point, from which the signal extraction method proceeds. A stabilization approach is used to reduce tissue motion in the video. The extracted signal is then processed to flatten the baseline and detect action potentials. We show results from applying the method to a cortical video of a live mouse.
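The baseline-flattening and event-detection steps can be sketched as a running-percentile detrend followed by a robust threshold. This is a crude stand-in for the paper's pipeline, with window size, percentile, and threshold chosen purely for illustration:

```python
import numpy as np

def detrend_and_detect(trace, win=50, k=5.0):
    """Flatten a slowly drifting fluorescence baseline with a running
    low percentile, then flag samples exceeding k robust standard
    deviations above the median as candidate firing events."""
    n = len(trace)
    # Running 20th percentile over a trailing window tracks the baseline
    # without being pulled up by brief fluorescence transients.
    baseline = np.array([np.percentile(trace[max(0, i - win):i + 1], 20) for i in range(n)])
    flat = trace - baseline
    med = np.median(flat)
    mad = np.median(np.abs(flat - med)) + 1e-9
    z = (flat - med) / (1.4826 * mad)   # MAD -> std for Gaussian noise
    return flat, np.where(z > k)[0]
```

Event times found this way would then be matched against action-potential waveforms; the MAD-based noise estimate keeps the threshold stable even when events are present in the trace.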
Sensor agnostic object recognition using a map seeking circuit
NASA Astrophysics Data System (ADS)
Overman, Timothy L.; Hart, Michael
2012-05-01
Automatic object recognition capabilities are traditionally tuned to exploit the specific sensing modality they were designed for. Their successes (and shortcomings) are tied to object segmentation from the background, they typically require highly skilled personnel to train them, and they become cumbersome with the introduction of new objects. In this paper we describe a sensor-independent algorithm based on the biologically inspired technology of map seeking circuits (MSC) which overcomes many of these obstacles. In particular, the MSC concept offers transparency in object recognition through a common interface to all sensor types, analogous to a USB device. It also provides a common core framework that is independent of the sensor and expandable to support high-dimensionality decision spaces. Ease of training is assured by using commercially available 3D models from the video game community. The search time remains linear no matter how many objects are introduced, ensuring rapid object recognition. Here, we report results of an MSC algorithm applied to object recognition and pose estimation from high range resolution radar (1D), electro-optical imagery (2D), and LIDAR point clouds (3D) separately. By abstracting the sensor phenomenology from the underlying a priori knowledge base, MSC shows promise as an easily adaptable tool for incorporating additional sensor inputs.
Multi-Frame Convolutional Neural Networks for Object Detection in Temporal Data
2017-03-01
Given the problem of detecting objects in video, existing neural-network solutions rely on a post-processing step to combine information across frames and strengthen conclusions. This technique has been successful for videos with simple, dominant objects, but it cannot detect objects...
The Webb Telescope's Actuators: Curving Mirrors in Space
2017-12-08
NASA image release December 9, 2010 Caption: The James Webb Space Telescope's Engineering Design Unit (EDU) primary mirror segment, coated with gold by Quantum Coating Incorporated. The actuator is located behind the mirror. Credit: Photo by Drew Noel NASA's James Webb Space Telescope is a wonder of modern engineering. As the planned successor to the Hubble Space Telescope, even the smallest of parts on this giant observatory will play a critical role in its performance. A new video takes viewers behind the Webb's mirrors to investigate "actuators," one component that will help Webb focus on some of the earliest objects in the universe. The video called "Got Your Back" is part of an on-going video series about the Webb telescope called "Behind the Webb." It was produced at the Space Telescope Science Institute (STScI) in Baltimore, Md. and takes viewers behind the scenes with scientists and engineers who are creating the Webb telescope's components. During the 3 minute and 12 second video, STScI host Mary Estacion interviewed people involved in the project at Ball Aerospace in Boulder, Colo. and showed the actuators in action. The Webb telescope will study every phase in the history of our universe, ranging from the first luminous glows after the big bang, to the formation of solar systems capable of supporting life on planets like Earth, to the evolution of our own solar system. Measuring this distant light requires a primary mirror 6.5 meters (21 feet 4 inches) across - six times larger than the Hubble Space Telescope's mirror! Launching a mirror this large into space isn't feasible. Instead, Webb engineers and scientists innovated a unique solution - building 18 mirrors that will act in unison as one large mirror. These mirrors are packaged together into three sections that fold up - much easier to fit inside a rocket. Each mirror is made from beryllium and weighs approximately 20 kilograms (46 pounds). 
Once in space, getting these mirrors to focus correctly on faraway galaxies is another challenge entirely. Actuators, or tiny mechanical motors, provide the answer to achieving a single perfect focus. The primary and secondary mirror segments are both moved by six actuators that are attached to the back of the mirrors. The primary segment has an additional actuator at the center of the mirror that adjusts its curvature. The third mirror segment remains stationary. Lee Feinberg, Webb Optical Telescope Element Manager at NASA's Goddard Space Flight Center in Greenbelt, Md. explained "Aligning the primary mirror segments as though they are a single large mirror means each mirror is aligned to 1/10,000th the thickness of a human hair. This alignment has to be done at 50 degrees above absolute zero! What's even more amazing is that the engineers and scientists working on the Webb telescope literally had to invent how to do this." With the actuators in place, Brad Shogrin, Webb Telescope Manager at Ball Aerospace, Boulder, Colo, details the next step: attaching the hexapod (meaning six-footed) assembly and radius of curvature subsystem (ROC). "Radius of curvature" refers to the distance to the center point of the curvature of the mirror. Feinberg added "To understand the concept in a more basic sense, if you change that radius of curvature, you change the mirror's focus." The "Behind the Webb" video series is available in HQ, large and small Quicktime formats, HD, Large and Small WMV formats, and HD, Large and Small Xvid formats. To see the actuators being attached to the back of a telescope mirror in this new "Behind the Webb" video, visit: webbtelescope.org/webb_telescope/behind_the_webb/7 For more information about Webb's mirrors, visit: www.jwst.nasa.gov/mirrors.html For more information on the James Webb Space Telescope, visit: jwst.nasa.gov Rob Gutro NASA's Goddard Space Flight Center, Greenbelt, Md. 
NASA Goddard Space Flight Center enables NASA's mission through four scientific endeavors: Earth Science, Heliophysics, Solar System Exploration, and Astrophysics. Goddard plays a leading role in NASA's accomplishments by contributing compelling scientific knowledge to advance the Agency's mission.
Kim, Seung-Cheol; Dong, Xiao-Bin; Kwon, Min-Woo; Kim, Eun-Soo
2013-05-06
A novel approach for fast generation of video holograms of three-dimensional (3-D) moving objects using a motion compensation-based novel-look-up-table (MC-N-LUT) method is proposed. Motion compensation has been widely employed in compression of conventional 2-D video data because of its ability to exploit the high temporal correlation between successive video frames. Here, this concept of motion compensation is applied for the first time to the N-LUT, based on its inherent property of shift-invariance. That is, motion vectors of 3-D moving objects are extracted between two consecutive video frames, and with them the motions of the 3-D objects at each frame are compensated. Through this process, the 3-D object data to be calculated for the video holograms are massively reduced, which results in a dramatic increase in the computational speed of the proposed method. Experimental results with three kinds of 3-D video scenarios reveal that the average number of calculated object points and the average calculation time per object point of the proposed method were reduced to 86.95% and 86.53%, and to 34.99% and 32.30%, respectively, compared to those of the conventional N-LUT and temporal redundancy-based N-LUT (TR-N-LUT) methods.
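The motion-vector extraction step between consecutive frames is classically done by exhaustive block matching. A minimal sketch under the usual sum-of-absolute-differences (SAD) criterion; block size and search range are illustrative assumptions, and the paper's 3-D object representation is reduced here to 2-D intensity frames:

```python
import numpy as np

def block_motion_vector(prev, curr, top, left, bsize=8, search=4):
    """Exhaustive block matching: find the displacement (dy, dx) that
    best aligns the block at (top, left) of `curr` with a region of
    `prev`, minimizing the sum of absolute differences (SAD)."""
    block = curr[top:top + bsize, left:left + bsize].astype(int)
    best, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bsize > prev.shape[0] or x + bsize > prev.shape[1]:
                continue  # candidate window falls outside the frame
            sad = np.abs(prev[y:y + bsize, x:x + bsize].astype(int) - block).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv
```

Given such vectors, the shift-invariance of the N-LUT means a compensated block's hologram contribution can be reused with a shift rather than recomputed, which is the source of the reported speed-up.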
MPEG-4 ASP SoC receiver with novel image enhancement techniques for DAB networks
NASA Astrophysics Data System (ADS)
Barreto, D.; Quintana, A.; García, L.; Callicó, G. M.; Núñez, A.
2007-05-01
This paper presents a system for real-time video reception on low-power mobile devices using Digital Audio Broadcast (DAB) technology for transmission. A demo receiver terminal is designed on an FPGA platform using the Advanced Simple Profile (ASP) of the MPEG-4 standard for video decoding. In order to meet the demanding DAB requirements, the bandwidth of the encoded sequence must be drastically reduced. To this end, prior to the MPEG-4 coding stage, a pre-processing stage is performed. It consists, first, of a segmentation phase according to motion and texture based on Principal Component Analysis (PCA) of the input video sequence, and second, of a down-sampling phase that depends on the segmentation results. As a result of the segmentation task, a set of texture and motion maps is obtained. These motion and texture maps are also included in the bit-stream as user-data side information and are therefore known to the receiver. For all bit-rates, the whole encoder/decoder system proposed in this paper exhibits higher visual image quality than the alternative encoding/decoding method, assuming equal image sizes. A complete analysis of both techniques has also been performed to provide the optimum motion and texture maps for the global system, which has been finally validated on a variety of video sequences. Additionally, an optimal HW/SW partition for the MPEG-4 decoder has been studied and implemented on a Programmable Logic Device with an embedded ARM9 processor. Simulation results show that a throughput of 15 QCIF frames per second can be achieved with a low-area and low-power implementation.
Shape-based human detection for threat assessment
NASA Astrophysics Data System (ADS)
Lee, Dah-Jye; Zhan, Pengcheng; Thomas, Aaron; Schoenberger, Robert B.
2004-07-01
Detection of intrusions for early threat assessment requires the capability of distinguishing whether the intrusion is a human, an animal, or another object. Most low-cost security systems use simple electronic motion detection sensors to monitor motion or the location of objects within the perimeter. Although cost effective, these systems suffer from high rates of false alarm, especially when monitoring open environments: any moving object, including an animal, can falsely trigger the security system. Other security systems that utilize video equipment require human interpretation of the scene in order to make real-time threat assessments. A shape-based human detection technique has been developed for accurate early threat assessment in open and remote environments. Potential threats are isolated from the static background scene using differential motion analysis, and contours of the intruding objects are extracted for shape analysis. Contour points are simplified by removing redundant points connecting short, straight line segments and preserving only those with shape significance. Contours are represented in tangent space for comparison with shapes stored in a database. A power cepstrum technique has been developed to search for the best-matched contour in the database and to distinguish a human from other objects across different viewing angles and distances.
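The contour-simplification step described above — dropping redundant points that lie on straight segments while preserving points with shape significance — can be sketched with a collinearity (cross-product) test. The point format and tolerance below are assumptions for illustration:

```python
def simplify_contour(points, tol=1e-9):
    """Drop contour points that sit on a straight line between the last
    kept point and their successor, keeping only corner points that
    carry shape significance. `points` is a list of (x, y) tuples."""
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = kept[-1], points[i], points[i + 1]
        # 2-D cross product of the two incident segments: zero means collinear
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if abs(cross) > tol:
            kept.append(points[i])
    kept.append(points[-1])
    return kept
```

An L-shaped run of six points, for instance, collapses to its two endpoints plus the single corner.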
Jersey number detection in sports video for athlete identification
NASA Astrophysics Data System (ADS)
Ye, Qixiang; Huang, Qingming; Jiang, Shuqiang; Liu, Yang; Gao, Wen
2005-07-01
Athlete identification is important for sports video content analysis, since users often care about the video clips featuring their preferred athletes. In this paper, we propose a method for athlete identification that combines segmentation, tracking and recognition procedures into a coarse-to-fine scheme for jersey number (the digit characters on a sports shirt) detection. First, image segmentation is employed to separate the jersey number regions from their background, and size/pipe-like attributes of digit characters are used to filter out candidates. Then, a K-NN (K nearest neighbor) classifier is employed to classify a candidate as a digit in "0-9" or as negative. In the recognition procedure, we use Zernike moment features, which are invariant to rotation and scale, for digit shape recognition. Synthetic training samples with different fonts are used to represent the pattern of digit characters under non-rigid deformation. Once a character candidate is detected, an SSD (smallest square distance)-based tracking procedure is started. The recognition procedure is performed every several frames in the tracking process. After tracking tens of frames, the overall recognition results are combined by a voting procedure to determine whether a candidate is a true jersey number. Experiments on several types of sports video show encouraging results.
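The final voting step — combining per-frame classifier outputs over a tracked candidate — can be sketched as a majority vote with a minimum-support threshold. The label convention and the 50% threshold are illustrative assumptions, not the authors' exact rule:

```python
from collections import Counter

def vote_jersey_number(per_frame_labels, min_support=0.5):
    """Combine per-frame recognition results for one tracked candidate.
    'negative' marks frames where the classifier saw no digit. The
    majority digit is accepted only if it wins at least `min_support`
    of all tracked frames; otherwise the candidate is rejected."""
    votes = Counter(l for l in per_frame_labels if l != "negative")
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count / len(per_frame_labels) >= min_support else None
```

Requiring support over all frames (not just the positive ones) rejects candidates that were recognized only sporadically during the track.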
A benchmark for comparison of cell tracking algorithms
Maška, Martin; Ulman, Vladimír; Svoboda, David; Matula, Pavel; Matula, Petr; Ederra, Cristina; Urbiola, Ainhoa; España, Tomás; Venkatesan, Subramanian; Balak, Deepak M.W.; Karas, Pavel; Bolcková, Tereza; Štreitová, Markéta; Carthel, Craig; Coraluppi, Stefano; Harder, Nathalie; Rohr, Karl; Magnusson, Klas E. G.; Jaldén, Joakim; Blau, Helen M.; Dzyubachyk, Oleh; Křížek, Pavel; Hagen, Guy M.; Pastor-Escuredo, David; Jimenez-Carretero, Daniel; Ledesma-Carbayo, Maria J.; Muñoz-Barrutia, Arrate; Meijering, Erik; Kozubek, Michal; Ortiz-de-Solorzano, Carlos
2014-01-01
Motivation: Automatic tracking of cells in multidimensional time-lapse fluorescence microscopy is an important task in many biomedical applications. A novel framework for objective evaluation of cell tracking algorithms has been established under the auspices of the IEEE International Symposium on Biomedical Imaging 2013 Cell Tracking Challenge. In this article, we present the logistics, datasets, methods and results of the challenge and lay down the principles for future uses of this benchmark. Results: The main contributions of the challenge include the creation of a comprehensive video dataset repository and the definition of objective measures for comparison and ranking of the algorithms. With this benchmark, six algorithms covering a variety of segmentation and tracking paradigms have been compared and ranked based on their performance on both synthetic and real datasets. Given the diversity of the datasets, we do not declare a single winner of the challenge. Instead, we present and discuss the results for each individual dataset separately. Availability and implementation: The challenge Web site (http://www.codesolorzano.com/celltrackingchallenge) provides access to the training and competition datasets, along with the ground truth of the training videos. It also provides access to Windows and Linux executable files of the evaluation software and most of the algorithms that competed in the challenge. Contact: codesolorzano@unav.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24526711
Tompkins, Matthew L.; Woods, Andy T.; Aimola Davies, Anne M.
2016-01-01
Drawing inspiration from sleight-of-hand magic tricks, we developed an experimental paradigm to investigate whether magicians’ misdirection techniques could be used to induce the misperception of “phantom” objects. While previous experiments investigating sleight-of-hand magic tricks have focused on creating false assumptions about the movement of an object in a scene, our experiment investigated creating false assumptions about the presence of an object in a scene. Participants watched a sequence of silent videos depicting a magician performing with a single object. Following each video, participants were asked to write a description of the events in the video. In the final video, participants watched the Phantom Vanish Magic Trick, a novel magic trick developed for this experiment, in which the magician pantomimed the actions of presenting an object and then making it magically disappear. No object was presented during the final video. The silent videos precluded the use of false verbal suggestions, and participants were not asked leading questions about the objects. Nevertheless, 32% of participants reported having visual impressions of non-existent objects. These findings support an inferential model of perception, wherein top-down expectations can be manipulated by the magician to generate vivid illusory experiences, even in the absence of corresponding bottom-up information. PMID:27493635
Dynamic graph cuts for efficient inference in Markov Random Fields.
Kohli, Pushmeet; Torr, Philip H S
2007-12-01
In this paper we present a fast, new, fully dynamic algorithm for the st-mincut/max-flow problem. We show how this algorithm can be used to efficiently compute MAP solutions for certain dynamically changing MRF models in computer vision, such as image segmentation. Specifically, given the solution of the max-flow problem on a graph, the dynamic algorithm efficiently computes the maximum flow in a modified version of the graph. The time taken by it is roughly proportional to the total amount of change in the edge weights of the graph. Our experiments show that, when the number of changes in the graph is small, the dynamic algorithm is significantly faster than the best known static graph cut algorithm. We test the performance of our algorithm on one particular problem: the object-background segmentation problem for video. It should be noted that the application of our algorithm is not limited to this problem; the algorithm is generic and can be used to yield similar improvements in many other cases that involve dynamic change.
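The paper's contribution is a dynamic update of the st-mincut; the underlying static computation it builds on can be sketched with Edmonds–Karp max-flow on a toy graph, where the s-side of the minimum cut gives the object label. Node names and capacities below are illustrative assumptions, and this sketch omits the dynamic-update machinery entirely:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on an adjacency-dict graph {u: {v: capacity}}.
    `cap` is turned into the residual graph in place."""
    total = 0
    while True:
        # BFS for a shortest augmenting path
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        b = min(cap[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            cap[u][v] -= b
            cap[v][u] = cap[v].get(u, 0) + b  # reverse residual edge
        total += b

def object_nodes(cap, s, t):
    """After max-flow, the s-side of the minimum cut (nodes reachable
    through positive residual edges) is labelled as object."""
    max_flow(cap, s, t)
    seen, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in seen:
                seen.add(v)
                q.append(v)
    return seen
```

In the segmentation setting, terminal edges encode per-pixel object/background likelihoods and inter-pixel edges encode smoothness; the dynamic algorithm of the paper reuses the residual graph when those weights change.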
Three-dimensional kinematics of the lower limbs during forward ice hockey skating.
Upjohn, Tegan; Turcotte, René; Pearsall, David J; Loh, Jonathan
2008-05-01
The objectives of the study were to describe lower limb kinematics in three dimensions during the forward skating stride in hockey players and to contrast skating techniques between low- and high-calibre skaters. Participant motions were recorded with four synchronized digital video cameras while wearing reflective marker triads on the thighs, shanks, and skates. Participants skated on a specialized treadmill with a polyethylene slat bed at a self-selected speed for 1 min. Each participant completed three 1-min skating trials separated by 5 min of rest. Joint and limb segment angles were calculated within the local (anatomical) and global reference planes. Similar gross movement patterns and stride rates were observed; however, high-calibre participants showed a greater range and rate of joint motion in both the sagittal and frontal planes, contributing to greater stride length for high-calibre players. Furthermore, consequent postural differences led to greater lateral excursion during the power stroke in high-calibre skaters. In conclusion, specific kinematic differences in both joint and limb segment angle movement patterns were observed between low- and high-calibre skaters.
Motion compensated shape error concealment.
Schuster, Guido M; Katsaggelos, Aggelos K
2006-02-01
The introduction of Video Objects (VOs) is one of the innovations of MPEG-4. The alpha-plane of a VO defines its shape at a given instance in time and hence determines the boundary of its texture. In packet-based networks, shape, motion, and texture are subject to loss. While there has been considerable attention paid to the concealment of texture and motion errors, little has been done in the field of shape error concealment. In this paper we propose a post-processing shape error concealment technique that uses the motion compensated boundary information of the previously received alpha-plane. The proposed approach is based on matching received boundary segments in the current frame to the boundary in the previous frame. This matching is achieved by finding a maximally smooth motion vector field. After the current boundary segments are matched to the previous boundary, the missing boundary pieces are reconstructed by motion compensation. Experimental results demonstrating the performance of the proposed motion compensated shape error concealment method, and comparing it with the previously proposed weighted side matching method are presented.
DairyBeef: maximizing quality and profits--a consistent food safety message.
Moore, D A; Kirk, J H; Klingborg, D J; Garry, F; Wailes, W; Dalton, J; Busboom, J; Sams, R W; Poe, M; Payne, M; Marchello, J; Looper, M; Falk, D; Wright, T
2004-01-01
To respond to meat safety and quality issues in dairy market cattle, a collaborative project team for 7 western states was established to develop educational resources providing a consistent meat safety and quality message to dairy producers, farm advisors, and veterinarians. The team produced an educational website and CD-ROM course that included videos, narrated slide sets, and on-farm tools. The objectives of this course were: 1) to help producers and their advisors understand market cattle food safety and quality issues, 2) help maintain markets for these cows, and 3) help producers identify ways to improve the quality of dairy cattle going to slaughter. DairyBeef. Maximizing Quality & Profits consists of 6 sections, including 4 core segments. Successful completion of quizzes following each core segment is required for participants to receive a certificate of completion. A formative evaluation of the program revealed the necessity for minor content and technological changes with the web-based course. All evaluators considered the materials relevant to dairy producers. After editing, course availability was enabled in February, 2003. Between February and May, 2003, 21 individuals received certificates of completion.
Topical video object discovery from key frames by modeling word co-occurrence prior.
Zhao, Gangqiang; Yuan, Junsong; Hua, Gang; Yang, Jiong
2015-12-01
A topical video object refers to an object that is frequently highlighted in a video. It could be, e.g., the product logo or the leading actor/actress in a TV commercial. We propose a topic model that incorporates a word co-occurrence prior for efficient discovery of topical video objects from a set of key frames. Previous work using topic models, such as latent Dirichlet allocation (LDA), for video object discovery often takes a bag-of-visual-words representation, which ignores important co-occurrence information among the local features. We show that such bottom-up, data-driven co-occurrence information can conveniently be incorporated in LDA with a Gaussian Markov prior, which combines top-down probabilistic topic modeling with bottom-up priors in a unified model. Our experiments on challenging videos demonstrate that the proposed approach can discover different types of topical objects despite variations in scale, viewpoint, color and lighting, or even partial occlusions. The efficacy of the co-occurrence prior is clearly demonstrated when compared with topic models without such priors.
A no-reference video quality assessment metric based on ROI
NASA Astrophysics Data System (ADS)
Jia, Lixiu; Zhong, Xuefei; Tu, Yan; Niu, Wenjuan
2015-01-01
A no-reference video quality assessment metric based on the region of interest (ROI) is proposed in this paper. In the metric, objective video quality is evaluated by integrating the quality of two compression artifacts, i.e. blurring distortion and blocking distortion. The Gaussian kernel function is used to extract human density maps of the H.264-coded videos from subjective eye-tracking data. An objective bottom-up ROI extraction model is built, based on the magnitude discrepancy of the discrete wavelet transform between two consecutive frames, a center-weighted color opponent model, a luminance contrast model, and a frequency saliency model based on spectral residual. Then only the objective saliency maps are used to compute the objective blurring and blocking quality. The results indicate that the objective ROI extraction metric achieves a higher area under the curve (AUC) value. Compared with conventional video quality assessment metrics, which measure quality over all video frames, the metric proposed in this paper not only decreases the computational complexity but also improves the correlation between subjective mean opinion scores (MOS) and objective scores.
Developing assessment system for wireless capsule endoscopy videos based on event detection
NASA Astrophysics Data System (ADS)
Chen, Ying-ju; Yasen, Wisam; Lee, Jeongkyu; Lee, Dongha; Kim, Yongho
2009-02-01
With advances in wireless technology and miniature cameras, Wireless Capsule Endoscopy (WCE), the combination of both, enables a physician to diagnose a patient's digestive system without actually performing a surgical procedure. Although WCE is a technical breakthrough that allows physicians to visualize the entire small bowel noninvasively, the video viewing time takes 1-2 hours. This is very time consuming for the gastroenterologist: not only does it set a limit on the wide application of this technology, but it also incurs a considerable amount of cost. Therefore, it is important to automate the process so that medical clinicians can focus only on events of interest. As an extension of our previous work that characterizes the motility of the digestive tract in WCE videos, we propose a new assessment system for energy-based event detection (EG-EBD) to classify the events in WCE videos. In this system, we first extract general features of a WCE video that can characterize the intestinal contractions in the digestive organs. Then, event boundaries are identified by using a High Frequency Content (HFC) function. The segments are classified into WCE events using special features. Here, we focus on entering the duodenum, entering the cecum, and active bleeding. The assessment system can easily be extended to discover more WCE events, such as detailed organ segmentation and more diseases, by introducing new special features. In addition, the system provides a score for every WCE image for each event. Using the event scores, the system helps a specialist speed up the diagnosis process.
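The High Frequency Content boundary test can be sketched as follows. Treating each frame's feature vector as a 1-D signal and thresholding the frame-to-frame HFC jump are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np

def hfc(frame_signal):
    """High Frequency Content: spectral magnitudes weighted by their
    frequency-bin index, so abrupt, high-frequency content scores high."""
    mags = np.abs(np.fft.rfft(frame_signal))
    return float(np.sum(np.arange(len(mags)) * mags))

def event_boundaries(feature_seq, thresh):
    """Flag frame indices where the HFC of the per-frame feature
    sequence jumps by more than `thresh` over the previous frame."""
    scores = [hfc(f) for f in feature_seq]
    return [i for i in range(1, len(scores))
            if scores[i] - scores[i - 1] > thresh]
```

A run of flat frames interrupted by one rapidly varying frame, for example, yields a single boundary at that frame.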
NASA Astrophysics Data System (ADS)
Genovese, Mariangela; Napoli, Ettore
2013-05-01
The identification of moving objects is a fundamental step in computer vision processing chains. The development of low-cost and lightweight smart cameras steadily increases the demand for efficient, high-performance circuits able to process high-definition video in real time. This paper proposes two processor cores aimed at performing real-time background identification on High Definition (HD, 1920×1080 pixel) video streams. The implemented algorithm is the OpenCV version of the Gaussian Mixture Model (GMM), a high-performance probabilistic algorithm for background segmentation that is, however, computationally intensive and impossible to implement on a general-purpose CPU under the constraint of real-time processing. Here, the equations of the OpenCV GMM algorithm are optimized in such a way that a lightweight and low-power implementation of the algorithm is obtained. The reported performances are also the result of the use of state-of-the-art truncated binary multipliers and ROM compression techniques for the implementation of the non-linear functions. The first circuit targets commercial FPGA devices and provides speed and logic resource occupation that surpass previously proposed implementations. The second circuit is oriented to an ASIC (UMC 90 nm) standard-cell implementation. Both implementations are able to process more than 60 frames per second in 1080p format, a frame rate compatible with HD television.
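The per-pixel GMM update at the heart of the algorithm can be sketched as a simplified Stauffer–Grimson step. The OpenCV variant adds adaptive mode counts and learning-rate refinements omitted here, and all parameter values below are illustrative assumptions:

```python
import math

def gmm_update(mixture, x, alpha=0.05, match_sigma=2.5):
    """One simplified Stauffer-Grimson update for a single grey-level
    pixel. `mixture` is a list of [weight, mean, variance] modes,
    modified in place. Returns True if x matched an existing mode
    (i.e. the pixel is explained by the current background model)."""
    matched = None
    for m in mixture:
        w, mu, var = m
        if abs(x - mu) <= match_sigma * math.sqrt(var):
            matched = m
            break
    for m in mixture:
        m[0] *= (1 - alpha)                  # decay all mode weights
    if matched is not None:
        matched[0] += alpha
        matched[1] += alpha * (x - matched[1])                    # mean
        matched[2] += alpha * ((x - matched[1]) ** 2 - matched[2])  # variance
    else:
        # no mode explains x: replace the weakest mode with a new,
        # wide mode centred on the observation
        weakest = min(range(len(mixture)), key=lambda i: mixture[i][0])
        mixture[weakest] = [alpha, float(x), 30.0 ** 2]
    total = sum(m[0] for m in mixture)
    for m in mixture:
        m[0] /= total                        # renormalise weights
    return matched is not None
```

Running this per pixel per frame is exactly the arithmetic the paper maps onto FPGA/ASIC datapaths; the non-linear square root is one of the functions the authors implement via compressed ROMs.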
Identification of uncommon objects in containers
Bremer, Peer-Timo; Kim, Hyojin; Thiagarajan, Jayaraman J.
2017-09-12
A system for identifying in an image an object that is commonly found in a collection of images and for identifying a portion of an image that represents an object based on a consensus analysis of segmentations of the image. The system collects images of containers that contain objects for generating a collection of common objects within the containers. To process the images, the system generates a segmentation of each image. The image analysis system may also generate multiple segmentations for each image by introducing variations in the selection of voxels to be merged into a segment. The system then generates clusters of the segments based on similarity among the segments. Each cluster represents a common object found in the containers. Once the clustering is complete, the system may be used to identify common objects in images of new containers based on similarity between segments of images and the clusters.
Anderson, Jeffrey R; Barrett, Steven F
2009-01-01
Image segmentation is the process of isolating distinct objects within an image. Computer algorithms have been developed to aid in the process of object segmentation, but a completely autonomous segmentation algorithm has yet to be developed [1], because computers do not have the capability to understand images and recognize complex objects within them. However, computer segmentation methods [2] requiring user input have been developed to quickly segment objects in serially sectioned images, such as magnetic resonance images (MRI) and confocal laser scanning microscope (CLSM) images. In these cases, the segmentation process becomes a powerful tool in visualizing the 3D nature of an object. User input is an important part of improving the performance of many segmentation methods. A double-threshold segmentation method has been investigated [3] to separate objects in gray-scale images where the gray level of the object lies among the gray levels of the background. In order to best determine the threshold values for this segmentation method, the image must be manipulated for optimal contrast. The same is true of other segmentation and edge detection methods as well; typically, the better the image contrast, the better the segmentation results. This paper describes a graphical user interface (GUI) that allows the user to easily change image contrast parameters to optimize the performance of subsequent object segmentation. This approach makes use of the fact that the human brain is extremely effective at object recognition and understanding. The GUI provides the user with the ability to define the gray-scale range of the object of interest. The lower and upper bounds of this range are used in a histogram stretching process to improve image contrast. The user can also interactively modify the gamma correction factor, which provides a non-linear redistribution of gray-scale values, while observing the corresponding changes to the image. This interactive approach gives the user the power to make optimal choices of the contrast enhancement parameters.
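The contrast pipeline described here — stretching a user-chosen gray-level range, then applying gamma correction — can be sketched as follows. Normalizing the output to [0, 1] is an assumption for illustration:

```python
import numpy as np

def stretch_and_gamma(img, low, high, gamma=1.0):
    """Histogram stretching followed by gamma correction: map grey
    levels so the user-selected [low, high] range fills [0, 1], clip
    values outside it, then apply the non-linear gamma curve."""
    img = np.asarray(img, dtype=float)
    stretched = np.clip((img - low) / float(high - low), 0.0, 1.0)
    return stretched ** gamma
```

With gamma > 1 the mapping darkens midtones, and with gamma < 1 it brightens them, which is what the user observes interactively before running segmentation.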
Learning of perceptual grouping for object segmentation on RGB-D data☆
Richtsfeld, Andreas; Mörwald, Thomas; Prankl, Johann; Zillich, Michael; Vincze, Markus
2014-01-01
Object segmentation of unknown objects with arbitrary shape in cluttered scenes is an ambitious goal in computer vision, and it received a great impulse with the introduction of cheap and powerful RGB-D sensors. We introduce a framework for segmenting RGB-D images in which data is processed in a hierarchical fashion. After pre-clustering at the pixel level, parametric surface patches are estimated. Different relations between patch pairs, derived from perceptual grouping principles, are calculated, and support vector machine classification is employed to learn perceptual grouping. Finally, we show that object hypothesis generation with Graph-Cut finds a globally optimal solution and prevents wrong grouping. Our framework is able to segment objects even if they are stacked or jumbled in cluttered scenes. We also tackle the problem of segmenting objects when they are partially occluded. The work is evaluated on publicly available object segmentation databases and compared with state-of-the-art work on object segmentation. PMID:24478571
Query by example video based on fuzzy c-means initialized by fixed clustering center
NASA Astrophysics Data System (ADS)
Hou, Sujuan; Zhou, Shangbo; Siddique, Muhammad Abubakar
2012-04-01
Currently, the high complexity of video contents poses two major challenges for fast retrieval: (1) efficient similarity measurement, and (2) efficient indexing of compact representations. A video-retrieval strategy based on fuzzy c-means (FCM) is presented for querying by example. First, the query video is segmented into a set of shots; each shot is represented by a key frame, and video processing techniques are used to extract visual cues that represent the key frame. Next, because the FCM algorithm is sensitive to initialization, we initialize the cluster centers with the shots of the query video so that users can achieve appropriate convergence. After the FCM cluster is initialized by the query video, each shot of the query video is considered a benchmark point in the cluster, and each shot in the database is assigned a class label. The similarity between a database shot and the benchmark point with the same class label can then be transformed into the distance between them. Finally, the similarity between the query video and a video in the database is transformed into the number of similar shots. Our experimental results demonstrate the performance of the proposed approach.
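The FCM membership computation with centers fixed to the query shots can be sketched as below, using the standard formula u_ik = 1 / Σ_j (d_ik / d_jk)^(2/(m-1)). The feature dimensionality and fuzzifier m are assumptions; the paper's shot features would take the place of the toy vectors:

```python
import numpy as np

def fcm_memberships(points, centers, m=2.0, eps=1e-9):
    """Fuzzy c-means membership of each point (database shot feature)
    to each fixed cluster centre (query shot feature):
    u[n, i] = 1 / sum_j (d[n, i] / d[n, j]) ** (2 / (m - 1)).
    `points` is (N, D), `centers` is (C, D); returns (N, C)."""
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + eps
    power = 2.0 / (m - 1.0)
    ratio = d[:, :, None] / d[:, None, :]   # ratio[n, i, j] = d_ni / d_nj
    return 1.0 / np.sum(ratio ** power, axis=2)
```

Each database shot's class label is then the argmax membership, i.e. the query shot it is most similar to, and per-video similarity reduces to counting matching shots.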
Huntsville Area Students Appear in Episode of NASA CONNECT
NASA Technical Reports Server (NTRS)
2003-01-01
Students at Williams Technology Middle School in Huntsville were featured in a new segment of NASA CONNECT, a video series aimed to enhance the teaching of math, science, and technology to middle school students. The segment premiered nationwide May 15, 2003, and helped viewers understand Sir Isaac Newton's first, second, and third laws of gravity and how they relate to NASA's efforts in developing the next generation of space transportation.
Popova, I I; Orlov, O I; Matsnev, E I; Revyakin, Yu G
2016-01-01
The paper reports the results of testing diagnostic video systems that enable digital rendering of TNT, teeth and jaws. The authors substantiate the criteria for choosing and integrating imaging systems into the future kit LOR on the Russian segment of the International Space Station, developed for examination and downlink of high-quality images of cosmonauts' TNT, parodentium and teeth.
Arbelle, Assaf; Reyes, Jose; Chen, Jia-Yun; Lahav, Galit; Riklin Raviv, Tammy
2018-04-22
We present a novel computational framework for the analysis of high-throughput microscopy videos of living cells. The proposed framework is generally useful and can be applied to different datasets acquired in a variety of laboratory settings. This is accomplished by tying together two fundamental aspects of cell lineage construction, namely cell segmentation and tracking, via a Bayesian inference of dynamic models. In contrast to most existing approaches, which aim to be general, no assumption of cell shape is made. Spatial, temporal, and cross-sectional variation of the analysed data are accommodated by two key contributions. First, time series analysis is exploited to estimate the temporal cell shape uncertainty in addition to cell trajectory. Second, a fast marching (FM) algorithm is used to integrate the inferred cell properties with the observed image measurements in order to obtain image likelihood for cell segmentation, and association. The proposed approach has been tested on eight different time-lapse microscopy data sets, some of which are high-throughput, demonstrating promising results for the detection, segmentation and association of planar cells. Our results surpass the state of the art for the Fluo-C2DL-MSC data set of the Cell Tracking Challenge (Maška et al., 2014). Copyright © 2018 Elsevier B.V. All rights reserved.
Efficient Lane Boundary Detection with Spatial-Temporal Knowledge Filtering
Nan, Zhixiong; Wei, Ping; Xu, Linhai; Zheng, Nanning
2016-01-01
Lane boundary detection technology has progressed rapidly over the past few decades. However, many challenges that often lead to lane detection unavailability remain to be solved. In this paper, we propose a spatial-temporal knowledge filtering model to detect lane boundaries in videos. To address the challenges of structure variation, large noise and complex illumination, this model incorporates prior spatial-temporal knowledge with lane appearance features to jointly identify lane boundaries. The model first extracts line segments in video frames. Two novel filters—the Crossing Point Filter (CPF) and the Structure Triangle Filter (STF)—are proposed to filter out the noisy line segments. The two filters introduce spatial structure constraints and temporal location constraints into lane detection, which represent the spatial-temporal knowledge about lanes. A straight line or curve model determined by a state machine is used to fit the line segments to finally output the lane boundaries. We collected a challenging realistic traffic scene dataset. The experimental results on this dataset and other standard dataset demonstrate the strength of our method. The proposed method has been successfully applied to our autonomous experimental vehicle. PMID:27529248
Intuitive color-based visualization of multimedia content as large graphs
NASA Astrophysics Data System (ADS)
Delest, Maylis; Don, Anthony; Benois-Pineau, Jenny
2004-06-01
Data visualization techniques are penetrating various technological areas. In the field of multimedia, such as information search and retrieval in multimedia archives or digital media production and post-production, data visualization methodologies based on large graphs offer an exciting alternative to conventional storyboard visualization. In this paper we develop a new approach to the visualization of multimedia (video) documents based both on large-graph clustering and on preliminary video segmentation and indexing.
NASA Astrophysics Data System (ADS)
Al Hadhrami, Tawfik; Wang, Qi; Grecos, Christos
2012-06-01
When natural disasters or other large-scale incidents occur, obtaining accurate and timely information on the developing situation is vital to effective disaster recovery operations. High-quality video streams and high-resolution images, if available in real time, would provide an invaluable source of current situation reports to the incident management team. Meanwhile, a disaster often causes significant damage to the communications infrastructure. Therefore, another essential requirement for disaster management is the ability to rapidly deploy a flexible incident area communication network. Such a network would facilitate the transmission of real-time video streams and still images from the disrupted area to remote command and control locations. In this paper, a comprehensive end-to-end video/image transmission system between an incident area and a remote control centre is proposed and implemented, and its performance is experimentally investigated. In this study a hybrid multi-segment communication network is designed that seamlessly integrates terrestrial wireless mesh networks (WMNs), distributed wireless visual sensor networks, an airborne platform with video camera balloons, and a Digital Video Broadcasting- Satellite (DVB-S) system. By carefully integrating all of these rapidly deployable, interworking and collaborative networking technologies, we can fully exploit the joint benefits provided by WMNs, WSNs, balloon camera networks and DVB-S for real-time video streaming and image delivery in emergency situations among the disaster hit area, the remote control centre and the rescue teams in the field. The whole proposed system is implemented in a proven simulator. Through extensive simulations, the real-time visual communication performance of this integrated system has been numerically evaluated, towards a more in-depth understanding in supporting high-quality visual communications in such a demanding context.
Changes of cerebral current source by audiovisual erotic stimuli in premature ejaculation patients.
Hyun, Jae-Seog; Kam, Sung-Chul; Kwon, Oh-Young
2008-06-01
Premature ejaculation (PE) is one of the most common forms of male sexual dysfunction. The mechanisms of PE remain poorly understood, despite its high prevalence. To investigate the pathophysiology and causes of PE in the central nervous system, we tried to observe the changes in brain current source distribution by audiovisual induction of sexual arousal. Electroencephalograms were recorded in patients with PE (45.0 +/- 10.3 years old, N = 18) and in controls (45.6 +/- 9.8 years old, N = 18) during four 10-minute segments of resting, watching a music video excerpt, resting, and watching an erotic video excerpt. Five artifact-free 5-second segments were used to obtain cross-spectral low-resolution brain electromagnetic tomography (LORETA) images. Statistical nonparametric maps (SnPM) were obtained to detect the current density changes of six frequency bands between the erotic video session and the music video session in each group. Comparisons were also made between the two groups in the erotic video session. In the SnPM of each spectrum in patients with PE, the current source density of the alpha band was significantly reduced in the right precentral gyrus, the right insula, and both superior parietal lobules (P < 0.01). Comparing the two groups in the erotic video session, the current densities of the beta-2 and -3 bands in the PE group were significantly decreased in the right parahippocampal gyrus and left middle temporal gyrus (P < 0.01). Neuronal activity in the right precentral gyrus, the right insula, both superior parietal lobules, the right parahippocampal gyrus, and the left middle temporal gyrus may be decreased in PE patients upon sexual arousal. Further studies are needed to evaluate the meaning of decreased neuronal activities in PE patients.
Schroeder, Carsten; Chung, Jane M; Mackall, Judith A; Cakulev, Ivan T; Patel, Aaron; Patel, Sunny J; Hoit, Brian D; Sahadevan, Jayakumar
2018-06-14
The aim of this study was to assess the feasibility, safety, and efficacy of transesophageal echocardiography-guided intraoperative left ventricular lead placement via a video-assisted thoracoscopic surgery approach in patients with failed conventional biventricular pacing. Twelve patients who could not have the left ventricular lead placed conventionally underwent epicardial left ventricular lead placement by video-assisted thoracoscopic surgery. Eight patients (66%) had previous chest surgery. Operative positioning was a modified far lateral supine exposure with 30-degree bed tilt, allowing for groin and sternal access. To determine the optimal left ventricular location for lead placement, the left ventricular surface was divided arbitrarily into nine segments. These segments were transpericardially paced using a hand-held malleable pacing probe identifying the optimal site verified by transesophageal echocardiography. The pacing leads were screwed into position via a limited pericardiotomy. The video-assisted thoracoscopic surgery approach was successful in all patients. Biventricular pacing was achieved in all patients and all reported symptomatic benefit with reduction in New York Heart Association class from III to I-II (P = 0.016). Baseline ejection fraction was 23 ± 3%; within 1-year follow-up, the ejection fraction increased to 32 ± 10% (P = 0.05). The mean follow-up was 566 days. The median length of hospital stay was 7 days with chest tube removal between postoperative days 2 and 5. In patients who are nonresponders to conventional biventricular pacing, intraoperative left ventricular lead placement using anatomical and functional characteristics via a video-assisted thoracoscopic surgery approach is effective in improving heart failure symptoms. This optimized left ventricular lead placement is feasible and safe. Previous chest surgery is no longer an exclusion criterion for a video-assisted thoracoscopic surgery approach.
Objective assessment of MPEG-2 video quality
NASA Astrophysics Data System (ADS)
Gastaldo, Paolo; Zunino, Rodolfo; Rovetta, Stefano
2002-07-01
The increasing use of video compression standards in broadcasting television systems has required, in recent years, the development of video quality measurements that take into account artifacts specifically caused by digital compression techniques. In this paper we present a methodology for the objective quality assessment of MPEG video streams by using circular back-propagation feedforward neural networks. Mapping neural networks can render nonlinear relationships between objective features and subjective judgments, thus avoiding any simplifying assumption on the complexity of the model. The neural network processes an instantaneous set of input values, and yields an associated estimate of perceived quality. Therefore, the neural-network approach turns objective quality assessment into adaptive modeling of subjective perception. The objective features used for the estimate are chosen according to the assessed relevance to perceived quality and are continuously extracted in real time from compressed video streams. The overall system mimics perception but does not require any analytical model of the underlying physical phenomenon. The capability to process compressed video streams represents an important advantage over existing approaches, since avoiding the stream-decoding process greatly enhances real-time performance. Experimental results confirm that the system provides satisfactory, continuous-time approximations for actual scoring curves concerning real test videos.
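The core idea of mapping objective features to subjective scores with a trained network can be sketched as follows. This is an illustrative simplification only: the paper uses a circular back-propagation (CBP) network on features extracted from real MPEG streams, whereas this sketch uses a plain one-hidden-layer feedforward regressor and synthetic data.

```python
import numpy as np

# Illustrative sketch: a feedforward regressor trained by back-propagation to
# map objective stream features to a perceived-quality score. The network
# architecture, feature meanings, and data below are assumptions.

rng = np.random.default_rng(0)

# Synthetic "objective features" (stand-ins for, e.g., blockiness or motion)
X = rng.uniform(0, 1, size=(200, 3))
# Synthetic "subjective score": a smooth nonlinear function of the features
y = 5.0 / (1.0 + np.exp(-(2 * X[:, 0] - 3 * X[:, 1] + X[:, 2])))

W1 = rng.normal(0, 0.5, (3, 8)); b1 = np.zeros(8)   # hidden layer (tanh)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)   # linear output
lr = 0.05

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, (h @ W2 + b2).ravel()

_, pred0 = forward(X)
initial_loss = float(np.mean((pred0 - y) ** 2))

for _ in range(2000):
    h, pred = forward(X)
    g_out = (pred - y)[:, None] / len(X)            # dLoss/dOutput
    g_hid = (g_out @ W2.T) * (1 - h ** 2)           # back-propagate through tanh
    W2 -= lr * h.T @ g_out; b2 -= lr * g_out.sum(0)
    W1 -= lr * X.T @ g_hid; b1 -= lr * g_hid.sum(0)

_, pred = forward(X)
final_loss = float(np.mean((pred - y) ** 2))
```

Once trained on real feature/score pairs, such a model yields a quality estimate per instantaneous feature set, which is what lets the system track a scoring curve over time.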
Systematic Parameterization, Storage, and Representation of Volumetric DICOM Data.
Fischer, Felix; Selver, M Alper; Gezer, Sinem; Dicle, Oğuz; Hillen, Walter
Tomographic medical imaging systems produce hundreds to thousands of slices, enabling three-dimensional (3D) analysis. Radiologists process these images through various tools and techniques in order to generate 3D renderings for various applications, such as surgical planning, medical education, and volumetric measurements. To save and store these visualizations, current systems use snapshots or video exporting, which prevents further optimizations and requires the storage of significant additional data. The Grayscale Softcopy Presentation State extension of the Digital Imaging and Communications in Medicine (DICOM) standard resolves this issue for two-dimensional (2D) data by introducing an extensive set of parameters, namely 2D Presentation States (2DPR), that describe how an image should be displayed. 2DPR allows storing these parameters instead of storing parameter-applied images, which would cause unnecessary duplication of the image data. Since there is currently no corresponding extension for 3D data, in this study, a DICOM-compliant object called 3D presentation states (3DPR) is proposed for the parameterization and storage of 3D medical volumes. To accomplish this, the 3D medical visualization process is divided into four tasks, namely pre-processing, segmentation, post-processing, and rendering. The important parameters of each task are determined. Special focus is given to the compression of segmented data, parameterization of the rendering process, and DICOM-compliant implementation of the 3DPR object. The use of 3DPR was tested in a radiology department on three clinical cases, which require multiple segmentations and visualizations during the workflow of radiologists. The results show that 3DPR can effectively simplify the workload of physicians by directly regenerating 3D renderings without repeating intermediate tasks, increase efficiency by preserving all user interactions, and provide efficient storage as well as transfer of visualized data.
Object tracking via background subtraction for monitoring illegal activity in crossroad
NASA Astrophysics Data System (ADS)
Ghimire, Deepak; Jeong, Sunghwan; Park, Sang Hyun; Lee, Joonwhoan
2016-07-01
In the field of intelligent transportation systems, a great number of vision-based techniques have been proposed to prevent pedestrians from being hit by vehicles. This paper presents a system that can perform pedestrian and vehicle detection and monitor illegal activity in zebra crossings. In a zebra crossing, according to the traffic light status, a driver or pedestrian should be warned early about any illegal moves in order to fully avoid a collision. In this research, we first detect the pedestrian traffic light status and monitor the crossroad for vehicle and pedestrian movements. Background-subtraction-based object detection and tracking is performed to detect pedestrians and vehicles in crossroads. Shadow removal, blob segmentation, trajectory analysis, etc. are used to improve the object detection and classification performance. We demonstrate the experiment on several video sequences recorded at different times and in different environments, such as daytime and nighttime, and sunny and rainy conditions. Our experimental results show that such a simple and efficient technique can be used successfully as a traffic surveillance system to prevent accidents in zebra crossings.
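The background-subtraction step described above can be sketched minimally as follows. This is an illustrative running-average model on synthetic frames; the paper's full pipeline (shadow removal, blob segmentation, trajectory analysis, classification) is not reproduced, and the threshold and learning rate are assumed values.

```python
import numpy as np

# Minimal background subtraction sketch: maintain a running-average background
# and flag pixels that deviate from it by more than a threshold.

def detect_foreground(frames, alpha=0.05, thresh=30):
    """Return a boolean foreground mask for each frame after the first."""
    bg = frames[0].astype(float)                 # initial background model
    masks = []
    for frame in frames[1:]:
        diff = np.abs(frame.astype(float) - bg)
        masks.append(diff > thresh)              # pixels deviating from background
        bg = (1 - alpha) * bg + alpha * frame    # slowly adapt the model
    return masks

# Synthetic sequence: static background with a bright 5x5 "object" moving right
frames = [np.full((20, 20), 50, dtype=np.uint8) for _ in range(5)]
for t, f in enumerate(frames[1:], start=1):
    f[8:13, 2 + 3 * t:7 + 3 * t] = 200

masks = detect_foreground(frames)
```

The slow adaptation rate is what prevents a briefly stationary object from being absorbed into the background immediately, while still tolerating gradual illumination change.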
Video quality assessment using M-SVD
NASA Astrophysics Data System (ADS)
Tao, Peining; Eskicioglu, Ahmet M.
2007-01-01
Objective video quality measurement is a challenging problem in a variety of video processing applications ranging from lossy compression to printing. An ideal video quality measure should be able to mimic the human observer. We present a new video quality measure, M-SVD, to evaluate distorted video sequences based on singular value decomposition. A computationally efficient approach is developed for full-reference (FR) video quality assessment. This measure is tested on the Video Quality Experts Group (VQEG) phase I FR-TV test data set. Our experiments show that the graphical measure displays the amount of distortion as well as the distribution of error in all frames of the video sequence, while the numerical measure correlates well with perceived video quality and outperforms PSNR and other objective measures by a clear margin.
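The SVD-based comparison at the heart of such a measure can be sketched as follows: compare the singular values of corresponding blocks of a reference and a distorted frame. The block size and the final pooling are illustrative assumptions, not the exact M-SVD formulation.

```python
import numpy as np

# Sketch of an SVD-based distortion measure: per-block singular values of the
# reference and distorted frames are compared, and block distances are pooled.

def svd_block_distance(ref, dist, block=8):
    """Mean Euclidean distance between per-block singular-value vectors."""
    h, w = ref.shape
    dists = []
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            s_ref = np.linalg.svd(ref[i:i+block, j:j+block], compute_uv=False)
            s_dst = np.linalg.svd(dist[i:i+block, j:j+block], compute_uv=False)
            dists.append(np.sqrt(np.sum((s_ref - s_dst) ** 2)))
    return float(np.mean(dists))

rng = np.random.default_rng(1)
ref = rng.uniform(0, 255, (32, 32))
noisy = ref + rng.normal(0, 10, ref.shape)      # mild distortion
noisier = ref + rng.normal(0, 40, ref.shape)    # strong distortion

d1 = svd_block_distance(ref, noisy)
d2 = svd_block_distance(ref, noisier)
```

Keeping the per-block distances (rather than only their mean) is what yields the "graphical" view of how distortion is distributed across a frame.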
Video shot boundary detection using region-growing-based watershed method
NASA Astrophysics Data System (ADS)
Wang, Jinsong; Patel, Nilesh; Grosky, William
2004-10-01
In this paper, a novel shot boundary detection approach is presented, based on the popular region-growing segmentation method: Watershed segmentation. In image processing, gray-scale pictures can be considered as topographic reliefs, in which the numerical value of each pixel of a given image represents the elevation at that point. The Watershed method segments images by filling up basins with water starting at local minima; at points where water coming from different basins meets, dams are built. In our method, each frame in the video sequence is first transformed from the feature space into the topographic space based on a density function. Low-level features are extracted from frame to frame. Each frame is then treated as a point in the feature space. The density of each point is defined as the sum of the influence functions of all neighboring data points. The height function that is originally used in Watershed segmentation is then replaced by the inverted density at the point. Thus, all the highest density values are transformed into local minima. Subsequently, Watershed segmentation is performed in the topographic space. The intuitive idea behind our method is that frames within a shot are highly agglomerative in the feature space and have a higher possibility of being merged together, while frames at shot changes are not; hence, the latter have lower density values and, with carefully extracted markers and a suitable stopping criterion, are less likely to be clustered.
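The density construction described above can be sketched in a toy form: each frame becomes a point in feature space, its density is the sum of Gaussian influence functions of all points, and the inverted density serves as the watershed "height". The 1-D colour-like feature and the bandwidth below are illustrative assumptions; a shot cut shows up as a low-density point between two dense clusters.

```python
import numpy as np

# Toy density computation: sum of Gaussian influence functions over all frames'
# feature points; inverting it makes low-density frames (shot changes) maxima.

def density(points, sigma=1.0):
    d = points[:, None] - points[None, :]
    return np.exp(-(d ** 2) / (2 * sigma ** 2)).sum(axis=1)

# Two "shots": mean-colour features near 10, then near 50, with a transition frame
features = np.array([10.0, 10.2, 9.9, 10.1, 30.0, 50.0, 49.8, 50.1, 50.2])
rho = density(features, sigma=2.0)
height = -rho                      # invert density: minima become maxima
cut = int(np.argmax(height))       # lowest-density frame ~ shot boundary
```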
Retinal slit lamp video mosaicking.
De Zanet, Sandro; Rudolph, Tobias; Richa, Rogerio; Tappeiner, Christoph; Sznitman, Raphael
2016-06-01
To this day, the slit lamp remains the first tool used by an ophthalmologist to examine patient eyes. Imaging of the retina poses, however, a variety of problems, namely a shallow depth of focus, reflections from the optical system, a small field of view and non-uniform illumination. For ophthalmologists, the use of slit lamp images for documentation and analysis purposes therefore remains extremely challenging due to large image artifacts. For this reason, we propose an automatic retinal slit lamp video mosaicking method, which enlarges the field of view and reduces the amount of noise and reflections, thus enhancing image quality. Our method is composed of three parts: (i) viable content segmentation, (ii) global registration and (iii) image blending. Frame content is segmented using gradient boosting with custom pixel-wise features. Speeded-up robust features are used for finding pair-wise translations between frames, with robust random sample consensus estimation and graph-based simultaneous localization and mapping for global bundle adjustment. Foreground-aware blending based on feathering merges video frames into comprehensive mosaics. Foreground is segmented successfully with an area under the receiver operating characteristic curve of 0.9557. Mosaicking results from our method and from state-of-the-art methods were compared and rated by ophthalmologists, who showed a strong preference for the large field of view provided by our method. The proposed method for global registration of retinal slit lamp images into comprehensive mosaics improves over state-of-the-art methods and is preferred qualitatively.
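The feathering-based blending step can be sketched minimally as follows: each pixel is weighted by its distance to the frame border, so contributions fade out toward the edges and seams disappear. The linear ramp weights are an illustrative choice; the paper's foreground-aware variant additionally masks out non-viable content before blending.

```python
import numpy as np

# Feathering sketch: weight each pixel by its distance to the nearest image
# border, then form a weighted average wherever frames overlap.

def feather_weights(h, w):
    """Weight = distance (in pixels) to the nearest image border, per pixel."""
    y = np.minimum(np.arange(h), np.arange(h)[::-1]) + 1
    x = np.minimum(np.arange(w), np.arange(w)[::-1]) + 1
    return np.outer(y, x).astype(float)

def blend(a, b, wa, wb):
    total = wa + wb
    return (a * wa + b * wb) / np.where(total > 0, total, 1)

h, w = 8, 8
a = np.full((h, w), 100.0)
b = np.full((h, w), 200.0)
wgt = feather_weights(h, w)
mosaic = blend(a, b, wgt, wgt)   # equal weights -> plain average in the overlap
```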
Semantic-based surveillance video retrieval.
Hu, Weiming; Xie, Dan; Fu, Zhouyu; Zeng, Wenrong; Maybank, Steve
2007-04-01
Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are therefore very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using spatial and temporal information to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries, including queries by keywords, multiple-object queries, and queries by sketch. For multiple-object queries, succession and simultaneity restrictions, together with depth-first and breadth-first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested on a crowded traffic scene.
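The trajectory-clustering idea can be sketched in a toy form: trajectories are resampled to a fixed length and grouped by agglomerative merging on point-wise distance. The single-linkage rule and distance threshold below are illustrative assumptions, not the paper's exact hierarchical spatial/temporal clustering scheme.

```python
import numpy as np

# Toy trajectory clustering: resample each 2-D trajectory to a fixed length,
# then greedily merge clusters whose closest member pair is within a threshold.

def resample(traj, n=8):
    t = np.linspace(0, 1, len(traj))
    ti = np.linspace(0, 1, n)
    return np.stack([np.interp(ti, t, traj[:, d]) for d in range(2)], axis=1)

def traj_dist(a, b):
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

def cluster(trajs, thresh=5.0):
    """Greedy single-linkage agglomeration: merge while the closest pair < thresh."""
    clusters = [[resample(t)] for t in trajs]
    merged = True
    while merged and len(clusters) > 1:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(traj_dist(a, b) for a in clusters[i] for b in clusters[j])
                if d < thresh:
                    clusters[i] += clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

# Two lanes of motion: left-to-right near y=0 and near y=50
lane1 = [np.column_stack([np.arange(10), np.zeros(10) + off]) for off in (0, 1, 2)]
lane2 = [np.column_stack([np.arange(10), np.full(10, 50.0) + off]) for off in (0, 1)]
groups = cluster(lane1 + lane2)
```

Each resulting cluster plays the role of an activity model to which semantic descriptions can then be attached.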
Three-dimensional rendering of segmented object using Matlab - biomed 2010.
Anderson, Jeffrey R; Barrett, Steven F
2010-01-01
The three-dimensional rendering of microscopic objects is a difficult and challenging task that often requires specialized image processing techniques. Previous work described a semi-automatic segmentation process for fluorescently stained neurons collected as a sequence of slice images with a confocal laser scanning microscope. Once properly segmented, each individual object can be rendered and studied as a three-dimensional virtual object. This paper describes the work associated with the design and development of Matlab files to create three-dimensional images from the segmented object data previously mentioned. Part of the motivation for this work is to integrate both the segmentation and rendering processes into one software application, providing a seamless transition from the segmentation tasks to the rendering and visualization tasks. Previously these tasks were accomplished on two different computer systems, Windows and Linux. This split limits the usefulness of the segmentation and rendering applications to those who have both computer systems readily available. The focus of this work is to create custom Matlab image processing algorithms for object rendering and visualization, and to merge these capabilities with the Matlab files that were developed especially for the image segmentation task. The completed Matlab application will contain both the segmentation and rendering processes in a single graphical user interface, or GUI. This process for rendering three-dimensional images in Matlab requires that a sequence of two-dimensional binary images, each representing a cross-sectional slice of the object, be reassembled in a 3D space and covered with a surface. Additional segmented objects can be rendered in the same 3D space. The surface properties of each object can be varied by the user to aid in the study and analysis of the objects. This interactive process becomes a powerful visual tool to study and understand microscopic objects.
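The core reconstruction step, stacking binary slice masks into a volume and finding the surface voxels to be covered, can be sketched as follows. It is written in Python rather than Matlab as an illustrative translation, and the toy cube object is an assumption.

```python
import numpy as np

# Stack 2-D binary slice masks into a 3-D volume, then mark boundary (surface)
# voxels: object voxels with at least one of their 6 neighbours in background.

def stack_slices(slices):
    return np.stack(slices, axis=0)          # (z, y, x) volume

def surface_voxels(vol):
    padded = np.pad(vol, 1)                  # zero border so edges count as background
    core = padded[1:-1, 1:-1, 1:-1].astype(bool)
    interior = core.copy()
    for axis in range(3):
        for shift in (-1, 1):
            # Shifted copy of the volume = neighbour values along this axis
            interior &= np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1].astype(bool)
    return core & ~interior

# Toy object: a 4x4x4 solid cube across four identical slices
sl = np.zeros((6, 6), dtype=np.uint8)
sl[1:5, 1:5] = 1
vol = stack_slices([sl] * 4)
surf = surface_voxels(vol)
```

In the application described above, the surface voxels would then be covered with a rendered surface whose properties (colour, transparency) the user can vary per object.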
Algorithms for detection of objects in image sequences captured from an airborne imaging system
NASA Technical Reports Server (NTRS)
Kasturi, Rangachar; Camps, Octavia; Tang, Yuan-Liang; Devadiga, Sadashiva; Gandhi, Tarak
1995-01-01
This research was initiated as part of the effort at the NASA Ames Research Center to design a computer vision based system that can enhance the safety of navigation by aiding pilots in detecting various obstacles on the runway during critical sections of the flight, such as a landing maneuver. The primary goal is the development of algorithms for detection of moving objects from a sequence of images obtained from an on-board video camera. Image regions corresponding to the independently moving objects are segmented from the background by applying constraint filtering on the optical flow computed from the initial few frames of the sequence. These detected regions are tracked over subsequent frames using a model based tracking algorithm. Position and velocity of the moving objects in world coordinates are estimated using an extended Kalman filter. The algorithms are tested using the NASA line image sequence with six static trucks and a simulated moving truck, and experimental results are described. Various limitations of the currently implemented version of the above algorithm are identified, and possible solutions to build a practical working system are investigated.
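The state-estimation step can be sketched as follows. For a linear constant-velocity motion model, the extended Kalman filter reduces to the standard Kalman filter below; the 1-D state, noise levels, and synthetic measurements are illustrative assumptions, not the paper's aircraft-camera geometry.

```python
import numpy as np

# Standard Kalman filter for a constant-velocity model: predict with the
# motion model, then correct with each noisy position measurement.

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition for [position, velocity]
H = np.array([[1.0, 0.0]])              # only position is measured
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[4.0]])                   # measurement noise covariance

x = np.array([0.0, 0.0])                # initial state estimate
P = 10.0 * np.eye(2)                    # initial state covariance

rng = np.random.default_rng(2)
true_positions = [2.0 * t for t in range(1, 21)]            # motion at 2 units/s
measurements = [p + rng.normal(0, 2.0) for p in true_positions]

for z in measurements:
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    innovation = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + (K @ innovation).ravel()
    P = (np.eye(2) - K @ H) @ P

estimated_velocity = float(x[1])
```

Note that velocity is never measured directly; the filter infers it from the sequence of position measurements, which is exactly what makes it useful for tracking detected obstacle regions across frames.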
The impact of video technology on learning: A cooking skills experiment.
Surgenor, Dawn; Hollywood, Lynsey; Furey, Sinéad; Lavelle, Fiona; McGowan, Laura; Spence, Michelle; Raats, Monique; McCloat, Amanda; Mooney, Elaine; Caraher, Martin; Dean, Moira
2017-07-01
This study examines the role of video technology in the development of cooking skills. The study explored the views of 141 female participants on whether video technology can promote confidence in learning new cooking skills to assist in meal preparation. Prior to each focus group participants took part in a cooking experiment to assess the most effective method of learning for low-skilled cooks across four experimental conditions (recipe card only; recipe card plus video demonstration; recipe card plus video demonstration conducted in segmented stages; and recipe card plus video demonstration whereby participants freely accessed video demonstrations as and when needed). Focus group findings revealed that video technology was perceived to assist learning in the cooking process in the following ways: (1) improved comprehension of the cooking process; (2) real-time reassurance in the cooking process; (3) assisting the acquisition of new cooking skills; and (4) enhancing the enjoyment of the cooking process. These findings display the potential for video technology to promote motivation and confidence as well as enhancing cooking skills among low-skilled individuals wishing to cook from scratch using fresh ingredients. Copyright © 2017 Elsevier Ltd. All rights reserved.
The Simple Video Coder: A free tool for efficiently coding social video data.
Barto, Daniel; Bird, Clark W; Hamilton, Derek A; Fink, Brandi C
2017-08-01
Videotaping of experimental sessions is a common practice across many disciplines of psychology, ranging from clinical therapy, to developmental science, to animal research. Audio-visual data are a rich source of information that can be easily recorded; however, analysis of the recordings presents a major obstacle to project completion. Coding behavior is time-consuming and often requires ad-hoc training of a student coder. In addition, existing software is either prohibitively expensive or cumbersome, which leaves researchers with inadequate tools to quickly process video data. We offer the Simple Video Coder: free, open-source software for behavior coding that is flexible in accommodating different experimental designs, is intuitive for students to use, and produces outcome measures of event timing, frequency, and duration. Finally, the software also offers extraction tools to splice video into coded segments suitable for training future human coders or for use as input for pattern classification algorithms.
SuBSENSE: a universal change detection method with local adaptive sensitivity.
St-Charles, Pierre-Luc; Bilodeau, Guillaume-Alexandre; Bergevin, Robert
2015-01-01
Foreground/background segmentation via change detection in video sequences is often used as a stepping stone in high-level analytics and applications. Despite the wide variety of methods that have been proposed for this problem, none has been able to fully address the complex nature of dynamic scenes in real surveillance tasks. In this paper, we present a universal pixel-level segmentation method that relies on spatiotemporal binary features as well as color information to detect changes. This allows camouflaged foreground objects to be detected more easily while most illumination variations are ignored. Furthermore, instead of using manually set, frame-wide constants to dictate model sensitivity and adaptation speed, we use pixel-level feedback loops to dynamically adjust our method's internal parameters without user intervention. These adjustments are based on the continuous monitoring of model fidelity and local segmentation noise levels. This new approach enables us to outperform all 32 previously tested state-of-the-art methods on the 2012 and 2014 versions of the ChangeDetection.net dataset in terms of overall F-Measure. The use of local binary image descriptors for pixel-level modeling also facilitates high-speed parallel implementations: our own version, which uses no low-level or architecture-specific instructions, reached real-time processing speed on a midlevel desktop CPU. A complete C++ implementation based on OpenCV is available online.
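The pixel-level feedback idea can be sketched in a toy form: every pixel keeps its own decision threshold, which is raised where segmentation noise keeps appearing and relaxed in stable regions, so sensitivity adapts locally with no frame-wide constants. The update rule and rates below are illustrative simplifications, not SuBSENSE's actual controllers.

```python
import numpy as np

# Toy per-pixel feedback: thresholds grow at pixels flagged as noisy and
# shrink elsewhere, bounded to a sane range.

def update_thresholds(T, noisy, T_min=10.0, T_max=80.0, rate=2.0):
    """Raise thresholds at noisy pixels, relax them elsewhere."""
    T = np.where(noisy, T + rate, T - rate / 2)
    return np.clip(T, T_min, T_max)

T = np.full((4, 4), 30.0)                # one threshold per pixel
noisy = np.zeros((4, 4), dtype=bool)
noisy[0, :] = True                       # top row flickers frame after frame
for _ in range(20):
    T = update_thresholds(T, noisy)
```

After repeated frames, the flickering region ends up desensitized (high threshold) while stable regions become maximally sensitive, which is the qualitative behavior the paper's feedback loops aim for.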
Six characteristics of nutrition education videos that support learning and motivation to learn.
Ramsay, Samantha A; Holyoke, Laura; Branen, Laurel J; Fletcher, Janice
2012-01-01
To identify characteristics in nutrition education video vignettes that support learning and motivation to learn about feeding children. Nine focus group interviews were conducted with child care providers in child care settings from 4 states in the western United States: California, Idaho, Oregon, and Washington. At each focus group interview, 3-8 participants (n = 37) viewed video vignettes and participated in a facilitated focus group discussion that was audiorecorded, transcribed, and analyzed. Primary characteristics of video vignettes child care providers perceived as supporting learning and motivation to learn about feeding young children were identified: (1) use real scenarios; (2) provide short segments; (3) present simple, single messages; (4) convey a skill-in-action; (5) develop the videos so participants can relate to the settings; and (6) support participants' ability to conceptualize the information. These 6 characteristics can be used by nutrition educators in selecting and developing videos in nutrition education. Copyright © 2012 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
Mosberger, Rafael; Andreasson, Henrik; Lilienthal, Achim J
2014-09-26
This article presents a novel approach for vision-based detection and tracking of humans wearing high-visibility clothing with retro-reflective markers. Addressing industrial applications where heavy vehicles operate in the vicinity of humans, we deploy a customized stereo camera setup with active illumination that allows for efficient detection of the reflective patterns created by the worker's safety garments. After segmenting reflective objects from the image background, the interest regions are described with local image feature descriptors and classified in order to discriminate safety garments from other reflective objects in the scene. In a final step, the trajectories of the detected humans are estimated in 3D space relative to the camera. We evaluate our tracking system in two industrial real-world work environments on several challenging video sequences. The experimental results indicate accurate tracking performance and good robustness towards partial occlusions, body pose variation, and a wide range of different illumination conditions.
Automatic recognition of ship types from infrared images using superstructure moment invariants
NASA Astrophysics Data System (ADS)
Li, Heng; Wang, Xinyu
2007-11-01
Automatic object recognition is an active area of interest for military and commercial applications. In this paper, a system addressing autonomous recognition of ship types in infrared images is proposed. First, an approach to segmentation based on detection of salient features of the target, with subsequent shadow removal, is proposed; this forms the basis of the subsequent object recognition. Considering that the differences between the shapes of various ships mainly lie in their superstructures, we then use superstructure moment functions invariant to translation, rotation and scale differences in input patterns, and develop a robust algorithm for extracting the ship superstructure. Subsequently, a back-propagation neural network is used as a classifier in the recognition stage, with projection images of simulated three-dimensional ship models used as the training sets. Our recognition model was implemented and experimentally validated using both simulated three-dimensional ship model images and real images derived from video of an AN/AAS-44V Forward-Looking Infrared (FLIR) sensor.
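The kind of moment invariants used to describe a segmented superstructure can be sketched as follows: scale-normalised central moments give features unchanged by translation and scaling (the first two Hu invariants below, which are also rotation invariant in the continuous case). The blocky binary "superstructure" shape is a toy example.

```python
import numpy as np

# First two Hu moment invariants from scale-normalised central moments of a
# binary shape mask.

def hu_first_two(img):
    ys, xs = np.nonzero(img)
    m00 = len(xs)                            # zeroth moment = area in pixels
    cx, cy = xs.mean(), ys.mean()            # centroid (translation invariance)
    x, y = xs - cx, ys - cy

    def eta(p, q):                           # scale-normalised central moment
        mu = np.sum((x ** p) * (y ** q))
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return phi1, phi2

shape = np.zeros((40, 40), dtype=np.uint8)
shape[10:20, 5:30] = 1                       # a blocky "superstructure"

# Same shape, translated and scaled by 2: invariants should (nearly) match
big = np.zeros((80, 80), dtype=np.uint8)
big[30:50, 20:70] = 1

p_small = hu_first_two(shape)
p_big = hu_first_two(big)
```

Feature vectors of this kind are what the back-propagation classifier consumes, so two views of the same ship at different ranges map to nearly identical inputs.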
Fernandez-Miranda, Juan C
2018-06-07
The medial temporal lobe can be divided into anterior, middle, and posterior segments. The anterior segment is formed by the uncus and hippocampal head, and it has extra- and intraventricular structures. There are 2 main approaches to the uncohippocampal region, the anteromedial temporal lobectomy (Spencer's technique) and the transsylvian selective amygdalohippocampectomy (Yasargil's technique). In this video, we present the case of a 29-yr-old man with new onset of generalized seizures and a contrast-enhancing lesion in the left anterior segment of the medial temporal lobe compatible with high-grade glioma. He had a medical history of cervical astrocytoma at age 8 requiring craniospinal radiation therapy and ventriculoperitoneal shunt placement. The tumor was approached using a combined transsylvian transcisternal and transinferior insular sulcus approach to the extra- and intraventricular aspects of the uncohippocampal region. It was resected completely, and the patient was neurologically intact after resection with no further seizures at 6-mo follow-up. The diagnosis was glioblastoma IDH-wild type, for which he underwent adjuvant therapy. Surgical anatomy and technical nuances of this approach are illustrated using a 3-dimensional video and anatomic dissections. The selective approach, when compared to an anteromedial temporal lobectomy, has the advantage of preserving the anterolateral temporal cortex, which is particularly relevant in dominant-hemisphere lesions, and the related fiber tracts, including the inferior fronto-occipital and inferior longitudinal fascicles, and most of the optic radiation fibers.
The transsylvian approach, however, is technically and anatomically more challenging and potentially carries a higher risk of vascular injury and vasospasm. Page 1 and figures from Fernández-Miranda JC et al, Microvascular Anatomy of the Medial Temporal Region: Part 1: Its Application to Arteriovenous Malformation Surgery, Operative Neurosurgery, 2010, Volume 67, issue 3, ons237-ons276, by permission of the Congress of Neurological Surgeons (1:26-1:37 in video). Page 1 from Fernández-Miranda JC et al, Three-Dimensional Microsurgical and Tractographic Anatomy of the White Matter of the Human Brain, Neurosurgery, 2008, Volume 62, issue suppl_3, SHC989-SHC1028, by permission of the Congress of Neurological Surgeons (1:54-1:56 in video).
Virtual Surveyor based Object Extraction from Airborne LiDAR data
NASA Astrophysics Data System (ADS)
Habib, Md. Ahsan
Topographic feature detection of land cover from LiDAR data is important in various fields - city planning, disaster response and prevention, soil conservation, infrastructure or forestry. In recent years, feature classification compliant with Object-Based Image Analysis (OBIA) methodology has been gaining traction in remote sensing and geographic information science (GIS). In OBIA, the LiDAR image is first divided into meaningful segments called object candidates. This results, in addition to spectral values, in a plethora of new information, such as aggregated spectral pixel values, morphology, texture, context, as well as topology. Traditional nonparametric segmentation methods rely on segmentations at different scales to produce a hierarchy of semantically significant objects. Properly tuned scale parameters are, therefore, imperative in these methods for successful subsequent classification. Recently, some progress has been made in the development of methods for tuning the parameters for automatic segmentation. However, researchers have found that it is very difficult to automatically refine the tuning with respect to each object class present in the scene. Moreover, due to the relative complexity of real-world objects, the intra-class heterogeneity is very high, which leads to over-segmentation. Therefore, these methods fail to deliver many of the new segment features correctly. In this dissertation, a new hierarchical 3D object segmentation algorithm called Automatic Virtual Surveyor based Object Extraction (AVSOE) is presented. AVSOE segments objects based on their distinct geometric concavity/convexity. This is achieved by strategically mapping the sloping surface that connects each object to its background. Further analysis produces a hierarchical decomposition of objects into their sub-objects at a single scale level. Extensive qualitative and quantitative results are presented to demonstrate the efficacy of this hierarchical segmentation approach.
Segmentation precedes face categorization under suboptimal conditions.
Van Den Boomen, Carlijn; Fahrenfort, Johannes J; Snijders, Tineke M; Kemner, Chantal
2015-01-01
Both categorization and segmentation processes play a crucial role in face perception. However, the functional relation between these subprocesses is currently unclear. The present study investigates the temporal relation between segmentation-related and category-selective responses in the brain, using electroencephalography (EEG). Surface segmentation and category content were both manipulated using texture-defined objects, including faces. This allowed us to study brain activity related to segmentation and to categorization. In the main experiment, participants viewed texture-defined objects for a duration of 800 ms. EEG results revealed that segmentation-related responses precede category-selective responses. Three additional experiments revealed that the presence and timing of categorization depends on stimulus properties and presentation duration. Photographic objects were presented for a long and short (92 ms) duration and evoked fast category-selective responses in both cases. On the other hand, presentation of texture-defined objects for a short duration only evoked segmentation-related but no category-selective responses. Category-selective responses were much slower when evoked by texture-defined than by photographic objects. We suggest that in case of categorization of objects under suboptimal conditions, such as when low-level stimulus properties are not sufficient for fast object categorization, segmentation facilitates the slower categorization process. PMID:26074838
Multi-object segmentation using coupled nonparametric shape and relative pose priors
NASA Astrophysics Data System (ADS)
Uzunbas, Mustafa Gökhan; Soldea, Octavian; Çetin, Müjdat; Ünal, Gözde; Erçil, Aytül; Unay, Devrim; Ekin, Ahmet; Firat, Zeynep
2009-02-01
We present a new method for multi-object segmentation in a maximum a posteriori estimation framework. Our method is motivated by the observation that neighboring or coupling objects in images generate configurations and co-dependencies which could potentially aid in segmentation if properly exploited. Our approach employs coupled shape and inter-shape pose priors that are computed using training images in a nonparametric multi-variate kernel density estimation framework. The coupled shape prior is obtained by estimating the joint shape distribution of multiple objects and the inter-shape pose priors are modeled via standard moments. Based on such statistical models, we formulate an optimization problem for segmentation, which we solve by an algorithm based on active contours. Our technique provides significant improvements in the segmentation of weakly contrasted objects in a number of applications. In particular for medical image analysis, we use our method to extract brain Basal Ganglia structures, which are members of a complex multi-object system posing a challenging segmentation problem. We also apply our technique to the problem of handwritten character segmentation. Finally, we use our method to segment cars in urban scenes.
NASA Astrophysics Data System (ADS)
Yeom, Seokwon
2013-05-01
Millimeter-wave imaging draws increasing attention in security applications for the detection of weapons under clothing. In this paper, concealed object segmentation and three-dimensional localization schemes are reviewed. A concealed object is segmented by the k-means algorithm. A feature-based stereo-matching method estimates the longitudinal distance of the concealed object; the distance is estimated from the discrepancy between the corresponding centers of the segmented objects. Experimental results are provided with an analysis of the depth resolution.
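The k-means segmentation step described above is straightforward to reproduce. The sketch below clusters the pixel intensities of a grayscale image into k groups using a minimal NumPy implementation with evenly spaced initial centers; it is an illustrative stand-in, not the authors' implementation, and the synthetic image and parameters are made up for the example:

```python
import numpy as np

def kmeans_segment(image, k=2, iters=20):
    """Cluster pixel intensities with plain k-means; return label map and centers."""
    pixels = image.reshape(-1).astype(float)
    # spread the initial centers evenly over the intensity range
    centers = np.linspace(pixels.min(), pixels.max(), k)
    labels = np.zeros(pixels.shape, dtype=int)
    for _ in range(iters):
        # assign each pixel to its nearest center
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # move each center to the mean of its assigned pixels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean()
    return labels.reshape(image.shape), centers

# A bright concealed object on a dark body: the brightest cluster is the object.
img = np.zeros((10, 10))
img[3:7, 3:7] = 200.0
labels, centers = kmeans_segment(img, k=2)
```

In a millimeter-wave image, the cluster whose center has the highest (or, depending on the sensor, lowest) intensity would be taken as the concealed-object region, and its centroid used for stereo matching.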
A NDVI assisted remote sensing image adaptive scale segmentation method
NASA Astrophysics Data System (ADS)
Zhang, Hong; Shen, Jinxiang; Ma, Yanmei
2018-03-01
Multiscale segmentation can effectively form the boundaries of objects at different scales. However, for remote sensing images with wide coverage and complicated ground objects, the number of suitable segmentation scales, and the size of each scale, remain difficult to determine accurately, which severely restricts rapid information extraction from remote sensing imagery. Many experiments have shown that the normalized difference vegetation index (NDVI) can effectively express the spectral characteristics of a variety of ground objects in remote sensing images. This paper presents an NDVI-assisted adaptive segmentation method for remote sensing images, which segments local areas by using an NDVI similarity threshold to iteratively select segmentation scales. For different regions consisting of different targets, different segmentation scale boundaries can be created. The experimental results show that the NDVI-based adaptive segmentation method can effectively create object boundaries for different ground objects in remote sensing images.
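The NDVI the method relies on is a simple band ratio, NDVI = (NIR - Red) / (NIR + Red). A minimal sketch follows; the function names and the similarity tolerance are illustrative, not taken from the paper:

```python
import numpy as np

def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index, in [-1, 1]."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / (nir + red + eps)  # eps guards against division by zero

def similar_ndvi_mask(ndvi_map, seed_value, tol=0.1):
    """Pixels whose NDVI lies within `tol` of a seed value -- the kind of
    similarity threshold the adaptive scale selection iterates over."""
    return np.abs(ndvi_map - seed_value) <= tol
```

Vegetation pixels (high NIR, low red) push NDVI toward +1, while water and bare surfaces push it toward or below 0, which is why an NDVI similarity threshold can group spectrally coherent local regions before scale selection.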
Automatic segmentation of colon glands using object-graphs.
Gunduz-Demir, Cigdem; Kandemir, Melih; Tosun, Akif Burak; Sokmensuer, Cenk
2010-02-01
Gland segmentation is an important step in automating the analysis of biopsies that contain glandular structures. However, it remains a challenging problem, as variation in staining, fixation, and sectioning procedures leads to a considerable amount of artifacts and variance in tissue sections, which may result in large variance in gland appearance. In this work, we report a new approach for gland segmentation. This approach decomposes the tissue image into a set of primitive objects and segments glands by making use of the organizational properties of these objects, which are quantified through the definition of object-graphs. In contrast to the previous literature, the proposed approach employs object-based information for the gland segmentation problem instead of using pixel-based information alone. Working with images of colon tissues, our experiments demonstrate that the proposed object-graph approach yields high segmentation accuracies for the training and test sets and significantly improves on the segmentation performance of its pixel-based counterparts. The experiments also show that the object-based structure of the proposed approach provides more tolerance to artifacts and variance in tissues.
Real-time skin feature identification in a time-sequential video stream
NASA Astrophysics Data System (ADS)
Kramberger, Iztok
2005-04-01
Skin color can be an important feature when tracking skin-colored objects, particularly for computer-vision-based human-computer interfaces (HCI). Humans have a highly developed sense of space, and it is therefore reasonable to support this within intelligent HCI, where the importance of augmented reality can be foreseen. Joining human-like interaction techniques within multimodal HCI could become a feature of modern mobile telecommunication devices. On the other hand, real-time processing plays an important role in achieving more natural and physically intuitive ways of human-machine interaction. The main scope of this work is the development of a stereoscopic computer-vision hardware-accelerated framework for real-time skin feature identification as a single-pass image segmentation process. The hardware-accelerated preprocessing stage performs color and spatial filtering, where the skin color model within the hue-saturation-value (HSV) color space is given by a polyhedron of threshold values representing the basis of the filter model. An adaptive filter management unit is suggested to achieve better segmentation results; it adapts the filter parameters to current scene conditions. The suggested hardware structure is implemented at the level of field programmable system level integrated circuit (FPSLIC) devices using an embedded microcontroller as their main feature. A stereoscopic cue is obtained from a time-sequential video stream, which makes no difference to the real-time processing requirements in terms of hardware complexity. Experimental results for the hardware-accelerated preprocessing stage are given by estimating the efficiency of the presented hardware structure using a simple motion-detection algorithm based on a binary function.
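A software approximation of the HSV threshold model described above is easy to sketch. The version below uses a simple box of thresholds rather than the paper's polyhedron, and the threshold values are illustrative guesses, not the authors' calibration:

```python
import numpy as np

def rgb_to_hsv(img):
    """Vectorized RGB -> HSV for float images in [0, 1]; H in degrees [0, 360)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    v = img.max(axis=-1)
    c = v - img.min(axis=-1)                      # chroma
    s = np.where(v > 0, c / np.maximum(v, 1e-12), 0.0)
    cs = np.maximum(c, 1e-12)                     # safe divisor where chroma is zero
    h = np.select(
        [c == 0, v == r, v == g],
        [0.0, ((g - b) / cs) % 6, (b - r) / cs + 2],
        default=(r - g) / cs + 4,
    ) * 60.0
    return h, s, v

def skin_mask(img, h_max=50.0, s_min=0.2, s_max=0.7, v_min=0.3):
    """Box-shaped HSV threshold: a crude stand-in for the paper's polyhedron."""
    h, s, v = rgb_to_hsv(img)
    return (h <= h_max) & (s >= s_min) & (s <= s_max) & (v >= v_min)
```

A true polyhedral model would replace the box with a set of linear inequalities over (H, S, V); the adaptive filter management unit in the paper would then tune those inequalities to scene conditions.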
Analysis of Spatio-Temporal Traffic Patterns Based on Pedestrian Trajectories
NASA Astrophysics Data System (ADS)
Busch, S.; Schindler, T.; Klinger, T.; Brenner, C.
2016-06-01
For driver assistance and autonomous driving systems, it is essential to predict the behaviour of other traffic participants. Usually, standard filter approaches are used to this end; however, in many cases these are not sufficient. For example, pedestrians are able to change their speed or direction instantly. Also, there may not be enough observation data to determine the state of an object reliably, e.g. in case of occlusions. In those cases, it is very useful if a prior model exists which suggests certain outcomes. For example, it is useful to know that pedestrians usually cross the road at a certain location and at certain times. This information can then be stored in a map, which can be used as a prior in scene analysis or, in practical terms, to reduce the speed of a vehicle in advance in order to minimize critical situations. In this paper, we present an approach to derive such a spatio-temporal map automatically from the observed behaviour of traffic participants in everyday traffic situations. In our experiments, we use one stationary camera to observe a complex junction where cars, public transportation and pedestrians interact. We concentrate on the pedestrians' trajectories to map traffic patterns. In the first step, we extract trajectory segments from the video data. These segments are then clustered in order to derive a spatial model of the scene, in terms of a spatially embedded graph. In the second step, we analyse the temporal patterns of pedestrian movement on this graph. To evaluate our approach, we used a 4-hour video sequence and show that we are able to derive traffic light sequences as well as the timetables of nearby public transportation.
Sedentary behaviours among adults across Canada.
Herman, Katya M; Saunders, Travis J
2016-12-27
OBJECTIVES: While cross-Canada variations in physical activity and weight status have been illustrated, less is known about sedentary behaviour (SB). The aim of this study was to describe various SBs and their correlates among Canadian adults. METHODS: Cross-sectional data from the 2011-2012 Canadian Community Health Survey included 92,918 respondents aged 20-75+ years, representative of >22 million Canadian adults. TV/video viewing, computer, video game playing and reading time were self-reported. Associations with socio-demographic, health and health behaviour variables were examined. RESULTS: About 31% of adults reported >2 hours/day TV viewing, while 47% of men and 41% of women reported >5 hours/week computer use, 24% of men and 12% of women reported ≥1 hour/week video game playing, and 33% of men and 46% of women reported >5 hours/week reading; 28% of respondents reported ≥5 hours/day total SB time. Age was the strongest correlate: adults 75+ had 5 and 6 times greater odds respectively of reporting >2 hours/day TV viewing and >5 hours/week reading, but far lesser odds of reporting high computer or video game time, compared to adults 20-24. Other variables associated with specific SBs included gender, marital status, education, occupation, income and immigrant status, as well as BMI, weight perceptions, smoking, diet and physical activity. CONCLUSION: Common sedentary behaviours were associated with numerous socio-demographic, health and health behaviour characteristics in a large representative sample of Canadians. These correlates differed according to the type of SB. Public health interventions targeting SB should be behavior-specific and tailored to the population segment of interest.
Efficient Use of Video for 3d Modelling of Cultural Heritage Objects
NASA Astrophysics Data System (ADS)
Alsadik, B.; Gerke, M.; Vosselman, G.
2015-03-01
Currently, there is rapid development in the techniques of automated image-based modelling (IBM), especially in advanced structure-from-motion (SFM) and dense image matching methods, and in camera technology. One possibility is to use video imaging to create 3D reality-based models of cultural heritage architecture and monuments. In practice, video imaging is much easier to apply than still image shooting in IBM techniques, because the latter needs thorough planning and proficiency. However, three main problems arise when video image sequences are used for highly detailed modelling and dimensional survey of cultural heritage objects: the low resolution of video images, the need to process a large number of short-baseline video images, and blur effects due to camera shake in a significant number of images. In this research, the feasibility of using video images for efficient 3D modelling is investigated. A method is developed to find the minimal significant number of video images in terms of object coverage and blur effect. This reduction in video images decreases the processing time and yields a reliable textured 3D model comparable with models produced by still imaging. Two experiments, modelling a building and a monument, are carried out using a video image resolution of 1920×1080 pixels. Internal and external validations of the produced models are applied to determine the final predicted accuracy and the model level of detail. Depending on object complexity and video imaging resolution, the tests show an achievable average accuracy of 1-5 cm when using video imaging, which is suitable for visualization, virtual museums and low-detail documentation.
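One common way to rank video frames by blur, in the spirit of the frame-reduction step described above, is the variance of the Laplacian: sharp frames have strong second derivatives, blurred ones do not. This is a generic sketch, not the authors' method, and `keep_every` and `blur_threshold` are illustrative parameters:

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian; low values indicate blur."""
    lap = (-4.0 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return lap.var()

def select_sharp_frames(frames, keep_every=5, blur_threshold=1.0):
    """Subsample a dense video sequence, then drop frames that look blurred.

    This addresses two of the three problems at once: the excess of
    short-baseline frames (subsampling) and camera-shake blur (thresholding).
    """
    picked = frames[::keep_every]
    return [f for f in picked if laplacian_variance(f) >= blur_threshold]
```

In a real pipeline the subsampling rate would also be driven by object coverage (overlap between consecutive frames), which this sketch does not model.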
Cognitive Tempo, Violent Video Games, and Aggressive Behavior in Young Boys.
ERIC Educational Resources Information Center
Irwin, A. Roland; Gross, Alan M.
1995-01-01
Assesses interpersonal aggression and aggression toward inanimate objects in a free-play setting where children played video games. Results indicated that subjects who played video games with aggressive content exhibited more object aggression during free-play and more interpersonal aggression during the frustrating situation than youngsters who…
An objective method for a video quality evaluation in a 3DTV service
NASA Astrophysics Data System (ADS)
Wilczewski, Grzegorz
2015-09-01
The following article describes a proposed objective method for 3DTV video quality evaluation, the Compressed Average Image Intensity (CAII) method. Identifying the content chain nodes of a 3DTV service enables the design of a versatile, objective video quality metric based on an advanced approach to stereoscopic videostream analysis. Insights into the designed metric's mechanisms, as well as an evaluation of its performance under simulated environmental conditions, are discussed. As a result, the CAII metric might be effectively used in a variety of service quality assessment applications.
NASA Astrophysics Data System (ADS)
Lee, Feifei; Kotani, Koji; Chen, Qiu; Ohmi, Tadahiro
2010-02-01
In this paper, a fast search algorithm for MPEG-4 video clips in a video database is proposed. An adjacent pixel intensity difference quantization (APIDQ) histogram, previously applied reliably to human face recognition, is utilized as the feature vector of the VOP (video object plane). Instead of a fully decompressed video sequence, partially decoded data, namely the DC sequence of the video object, are extracted from the video sequence. Combined with active search, a temporal pruning algorithm, fast and robust video search can be realized. The proposed search algorithm has been evaluated on a total of 15 hours of video containing TV programs such as drama, talk shows, and news, searching for 200 given MPEG-4 video clips, each 15 seconds long. Experimental results show that the proposed algorithm can detect a similar video clip in merely 80 ms, and an Equal Error Rate (EER) of 2% is achieved in the drama and news categories, which is more accurate and robust than conventional fast video search algorithms.
Li, Yixian; Qi, Lehua; Song, Yongshan; Chao, Xujiang
2017-06-01
The components of carbon/carbon (C/C) composites have significant influence on thermal and mechanical properties, so a quantitative characterization of the components is necessary to study the microstructure of C/C composites and, further, to improve their macroscopic properties. Because the extinction crosses of the pyrocarbon matrix have significant motion features, polarized light microscope (PLM) video is used to characterize C/C composites quantitatively, since it contains sufficient dynamic and structural information. The optical flow method is introduced to compute the optical flow field between adjacent frames and to segment the components of C/C composites from the PLM images by image processing. Meanwhile, matrix regions with different textures are re-segmented by the length difference of the motion vectors, and the component fraction of each component and the extinction angle of the pyrocarbon matrix are calculated directly. Finally, the C/C composites are successfully characterized in terms of carbon fiber, pyrocarbon, and pores by a series of image processing operators based on PLM video, and the errors of the component fractions are less than 15%. © 2017 Wiley Periodicals, Inc.
Real time markerless motion tracking using linked kinematic chains
Luck, Jason P [Arvada, CO; Small, Daniel E [Albuquerque, NM
2007-08-14
A markerless method is described for tracking the motion of subjects in a three dimensional environment using a model based on linked kinematic chains. The invention is suitable for tracking robotic, animal or human subjects in real-time using a single computer with inexpensive video equipment, and does not require the use of markers or specialized clothing. A simple model of rigid linked segments is constructed of the subject and tracked using three dimensional volumetric data collected by a multiple camera video imaging system. A physics based method is then used to compute forces to align the model with subsequent volumetric data sets in real-time. The method is able to handle occlusion of segments and accommodates joint limits, velocity constraints, and collision constraints and provides for error recovery. The method further provides for elimination of singularities in Jacobian based calculations, which has been problematic in alternative methods.
Shor, Eran; Seida, Kimberly
2018-04-18
It is a common notion among many scholars and pundits that the pornography industry becomes "harder and harder" with every passing year. Some have suggested that porn viewers, who are mostly men, become desensitized to "soft" pornography, and producers are happy to generate videos that are more hard core, resulting in a growing demand for and supply of violent and degrading acts against women in mainstream pornographic videos. We examined this accepted wisdom by utilizing a sample of 269 popular videos uploaded to PornHub over the past decade. More specifically, we tested two related claims: (1) aggressive content in videos is on the rise and (2) viewers prefer such content, reflected in both the number of views and the rankings for videos containing aggression. Our results offer no support for these contentions. First, we did not find any consistent uptick in aggressive content over the past decade; in fact, the average video today contains shorter segments showing aggression. Second, videos containing aggressive acts are both less likely to receive views and less likely to be ranked favorably by viewers, who prefer videos where women clearly perform pleasure.
Image Segmentation Using Minimum Spanning Tree
NASA Astrophysics Data System (ADS)
Dewi, M. P.; Armiati, A.; Alvini, S.
2018-04-01
This research aims to segment digital images. The purpose of segmentation is to separate the object from the background so that the main object can be processed for other purposes. Along with the development of digital image processing applications, the segmentation process becomes increasingly necessary. The segmented image resulting from the segmentation process should be accurate, because subsequent processes need to interpret the information in the image. This article discusses the application of the minimum spanning tree of a graph to the segmentation of digital images. The method is able to separate an object from the background, converting the image into a binary image. In this case, the object of interest is set to white, while the background is black, or vice versa.
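A minimal version of minimum-spanning-tree segmentation can be sketched as Kruskal-style merging on the 4-neighbour pixel graph: edges are sorted by intensity difference and components are merged while the difference stays below a threshold, so the remaining components separate object from background. This is an illustrative NumPy implementation, not the article's exact algorithm, and the threshold is a free parameter:

```python
import numpy as np

def mst_binary_segment(image, threshold):
    """Binary segmentation via Kruskal merging on the 4-neighbour pixel graph."""
    h, w = image.shape
    idx = lambda y, x: y * w + x
    edges = []
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                edges.append((abs(float(image[y, x]) - float(image[y, x + 1])),
                              idx(y, x), idx(y, x + 1)))
            if y + 1 < h:
                edges.append((abs(float(image[y, x]) - float(image[y + 1, x])),
                              idx(y, x), idx(y + 1, x)))
    parent = list(range(h * w))           # union-find forest
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a
    for wgt, a, b in sorted(edges):
        if wgt >= threshold:
            break                          # remaining edges cross region boundaries
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    labels = np.array([find(i) for i in range(h * w)]).reshape(h, w)
    # paint the component containing the brightest pixel white, the rest black
    obj = labels == labels.flat[int(image.argmax())]
    return np.where(obj, 255, 0).astype(np.uint8)
```

Choosing the component by brightest pixel is one simple way to decide which side of the cut is "object"; practical systems pick by size, position, or a user seed instead.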
Consumer-based technology for distribution of surgical videos for objective evaluation.
Gonzalez, Ray; Martinez, Jose M; Lo Menzo, Emanuele; Iglesias, Alberto R; Ro, Charles Y; Madan, Atul K
2012-08-01
The Global Operative Assessment of Laparoscopic Skill (GOALS) is a validated metric used to grade laparoscopic skills and has been applied to score recorded operative videos. To facilitate easier viewing of these recorded videos, we are developing novel techniques that enable surgeons to view them. The objective of this study is to determine the feasibility of utilizing widespread consumer-based technology to distribute appropriate videos for objective evaluation. Videos from residents were recorded via a direct connection from the camera processor's S-video output, through a hub, to a standard laptop computer via a universal serial bus (USB) port. A standard consumer-based video editing program was used to capture the video and save it in an appropriate format. We used the mp4 format and, depending on file size, the videos were compressed, converted to another format (using a standard video editing program), or sliced into multiple videos. Standard consumer-based programs were used to convert the video into a format suitable for handheld personal digital assistants. In addition, the videos were uploaded to a social networking website and to video sharing websites. Recorded cases of laparoscopic cholecystectomy in a porcine model were used. Compression was required for all formats. All formats were accessed from home computers, work computers, and iPhones without difficulty. Qualitative analyses by four surgeons demonstrated quality appropriate for grading in all of these formats. Our preliminary results show promise that, using consumer-based technology, videos can be easily distributed to surgeons for GOALS grading via various methods. Easy accessibility may make the evaluation of resident videos less complicated and cumbersome.
Automatically monitoring driftwood in large rivers: preliminary results
NASA Astrophysics Data System (ADS)
Piegay, H.; Lemaire, P.; MacVicar, B.; Mouquet-Noppe, C.; Tougne, L.
2014-12-01
Driftwood in rivers impacts sediment transport, riverine habitat, and human infrastructure. Quantifying it, in particular large wood on fairly large rivers where it can move easily, would improve our knowledge of fluvial transport processes. There are several means of studying this phenomenon, among them RFID sensor tracking and photo and video monitoring. In this abstract, we are interested in the latter, which is easier and cheaper to deploy. However, video monitoring of driftwood generates a huge number of images, and manually labeling them is tedious. It is essential to automate such a monitoring process, which is a difficult task in the field of computer vision, and more specifically in automatic video analysis; detecting foreground against a dynamic background remains an open problem to date. We installed a video camera on the riverside at a gauging station on the Ain River, a 3500 km² piedmont river in France. Several floods were manually annotated by a human operator. We developed software that automatically extracts and characterizes wood pieces within a video stream. The algorithm is based upon a statistical model and combines static, dynamic, and spatial data. Segmented wood objects are further described with a skeleton-based approach that automatically determines their shape, diameter, and length. The first detailed comparisons between manual annotations and automatically extracted data show that we can detect large wood fairly well above a given size (approximately 120 cm in length or 15 cm in diameter), whereas smaller pieces are difficult to detect and tend to be missed by either the human operator or the algorithm. Detection is fairly accurate in high-flow conditions, where the water channel is usually brown because of suspended sediment transport. In low-flow conditions, our algorithm still needs improvement to reduce the number of false positives, so as to better distinguish shadows or turbulence structures from wood pieces.
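A running-average background model is one simple baseline for the foreground-against-dynamic-background problem mentioned above. The sketch below is generic, not the authors' statistical model, and the `alpha` and `thresh` values are illustrative:

```python
import numpy as np

def detect_foreground(frames, alpha=0.05, thresh=30.0):
    """Flag pixels that deviate from a slowly updated background estimate.

    `alpha` controls how fast the background adapts to slow changes
    (lighting, water color); `thresh` is the intensity deviation that
    counts as foreground (e.g. a floating wood piece).
    """
    background = frames[0].astype(float)
    masks = []
    for frame in frames[1:]:
        frame = frame.astype(float)
        masks.append(np.abs(frame - background) > thresh)
        # blend the new frame into the background estimate
        background = (1 - alpha) * background + alpha * frame
    return masks
```

On a real river scene this per-pixel model over-triggers on waves and glare, which is exactly why the paper combines it with spatial cues and a downstream skeleton-based shape filter.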
Open-source telemedicine platform for wireless medical video communication.
Panayides, A; Eleftheriou, I; Pantziaris, M
2013-01-01
An m-health system for real-time wireless communication of medical video based on open-source software is presented. The objective is to deliver a low-cost telemedicine platform which will allow for reliable remote diagnosis m-health applications such as emergency incidents, mass population screening, and medical education purposes. The performance of the proposed system is demonstrated using five atherosclerotic plaque ultrasound videos. The videos are encoded at the clinically acquired resolution, in addition to lower, QCIF, and CIF resolutions, at different bitrates, and four different encoding structures. Commercially available wireless local area network (WLAN) and 3.5G high-speed packet access (HSPA) wireless channels are used to validate the developed platform. Objective video quality assessment is based on PSNR ratings, following calibration using the variable frame delay (VFD) algorithm that removes temporal mismatch between original and received videos. Clinical evaluation is based on atherosclerotic plaque ultrasound video assessment protocol. Experimental results show that adequate diagnostic quality wireless medical video communications are realized using the designed telemedicine platform. HSPA cellular networks provide for ultrasound video transmission at the acquired resolution, while VFD algorithm utilization bridges objective and subjective ratings. PMID:23573082
No-reference video quality measurement: added value of machine learning
NASA Astrophysics Data System (ADS)
Mocanu, Decebal Constantin; Pokhrel, Jeevan; Garella, Juan Pablo; Seppänen, Janne; Liotou, Eirini; Narwaria, Manish
2015-11-01
Video quality measurement is an important component in the end-to-end video delivery chain. Video quality is, however, subjective, and thus there will always be interobserver differences in the subjective opinion about the visual quality of the same video. Despite this, most existing works on objective quality measurement typically focus only on predicting a single score and evaluate their prediction accuracy based on how close it is to the mean opinion scores (or similar average-based ratings). Clearly, such an approach ignores the underlying diversity in the subjective scoring process and, as a result, does not allow further analysis of how reliable the objective prediction is in terms of subjective variability. Consequently, the aim of this paper is to analyze this issue and present a machine-learning based solution to address it. We demonstrate the utility of our ideas in the practical scenario of video broadcast transmissions, focusing on digital terrestrial television (DTT), and propose a no-reference objective video quality estimator for this application. We conducted verification studies on different video content (including video clips recorded from real DTT broadcast transmissions) in order to verify the performance of the proposed solution.
Segmentation of Object Outlines into Parts: A Large-Scale Integrative Study
ERIC Educational Resources Information Center
De Winter, Joeri; Wagemans, Johan
2006-01-01
In this study, a large number of observers (N=201) were asked to segment a collection of outlines derived from line drawings of everyday objects (N=88). This data set was then used as a benchmark to evaluate current models of object segmentation. All of the previously proposed rules of segmentation were found supported in our results. For example,…
Identification of GHB and morphine in hair in a case of drug-facilitated sexual assault.
Rossi, Riccardo; Lancia, Massimo; Gambelunghe, Cristiana; Oliva, Antonio; Fucci, Nadia
2009-04-15
The authors present the case of a 24-year-old girl who was sexually assaulted after administration of gamma-hydroxybutyrate (GHB) and morphine. She had been living in an international college for foreign students for about a year and often complained of a general unhealthy feeling in the morning. At the end of the college period she returned to Italy and received at home some video clips shot with a mobile phone camera, in which she was having sex with a boy she had met while studying abroad. Toxicological analysis of her hair was performed: the hair was 20 cm long and was cut into 2/3-cm segments along its entire length. Morphine and GHB were detected in the hair segments corresponding to the period she was abroad. The segment analyses were performed by gas chromatography/mass spectrometry (GC/MS), and the concentrations of morphine and GHB were calculated. A higher value of GHB was found in the period associated with the possible criminal activity, together with the presence of morphine in the same period.
Kinematics of the field hockey penalty corner push-in.
Kerr, Rebecca; Ness, Kevin
2006-01-01
The aims of the study were to determine those variables that significantly affect push-in execution and thereby formulate coaching recommendations specific to the push-in. Two 50 Hz video cameras recorded transverse and longitudinal views of push-in trials performed by eight experienced and nine inexperienced male push-in performers. Video footage was digitized for analysis of ball speed, stance width, drag distance, drag time, drag speed, centre of mass displacement, and segment and stick displacements and velocities. Experienced push-in performers demonstrated a significantly greater (p < 0.05) stance width, a significantly greater distance between the ball and the front foot at the start of the push-in, and a significantly faster ball speed than inexperienced performers. In addition, the experienced performers showed a significant positive correlation between ball speed and playing experience and tended to adopt a combination of simultaneous and sequential segment rotation to achieve accuracy and fast ball speed. The study yielded the following coaching recommendations for enhanced push-in performance: maximize drag distance by maximizing the front foot-ball distance at the start of the push-in; use a combination of simultaneous and sequential segment rotations to optimise both accuracy and ball speed; and maximize drag speed.
NASA Astrophysics Data System (ADS)
Jiang, Yang; Gong, Yuanzheng; Wang, Thomas D.; Seibel, Eric J.
2017-02-01
Multimodal endoscopy, with fluorescence-labeled probes binding to overexpressed molecular targets, is a promising technology for visualizing early-stage cancer. The target/background (T/B) ratio is the quantitative measure used to correlate fluorescence regions with cancer. Currently, the T/B ratio is calculated in post-processing and does not provide real-time feedback to the endoscopist. To achieve real-time computer-assisted diagnosis (CAD), we establish image processing protocols for calculating the T/B ratio and locating high-risk fluorescence regions to guide biopsy and therapy in Barrett's esophagus (BE) patients. Methods: The Chan-Vese algorithm, an active contour model, is used to segment high-risk regions in fluorescence videos. A semi-implicit gradient descent method was applied to minimize the energy function of this algorithm and evolve the segmentation. The surrounding background was then identified using a morphological operation. The average T/B ratio was computed, and regions of interest were highlighted based on user-selected thresholding. Evaluation was conducted on 50 fluorescence videos acquired from clinical recordings made with a custom multimodal endoscope. Results: With a processing speed of 2 fps on a laptop computer, we obtained accurate segmentation of high-risk regions, as confirmed by expert review. For each case, the clinical user could optimize the target boundary by changing the penalty on the area inside the contour. Conclusion: An automatic, real-time procedure for calculating the T/B ratio and identifying high-risk regions of early esophageal cancer was developed. Future work will increase the processing speed to at least 5 fps, refine the clinical interface, and extend the approach to additional GI cancers and fluorescence peptides.
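The T/B computation described above can be sketched in a few lines. This is a minimal stand-in, not the paper's implementation: the `tb_ratio` helper, the shift-based dilation used to build the background ring, the ring width, and the synthetic frame are all illustrative assumptions.

```python
import numpy as np

def tb_ratio(image, mask, ring_width=3):
    """Target/background (T/B) ratio: mean fluorescence inside the
    segmented region divided by the mean in a surrounding background
    ring. The ring is built by growing the mask with axis shifts (a
    simple stand-in for a morphological dilation) and subtracting the
    mask itself."""
    dilated = mask.copy()
    for _ in range(ring_width):
        grown = dilated.copy()
        for axis in (0, 1):
            for shift in (-1, 1):
                grown |= np.roll(dilated, shift, axis=axis)
        dilated = grown
    ring = dilated & ~mask
    return image[mask].mean() / image[ring].mean()

# Synthetic frame: a bright "lesion" on a dim background.
frame = np.full((32, 32), 10.0)
target = np.zeros((32, 32), dtype=bool)
target[12:20, 12:20] = True
frame[target] = 50.0
print(round(tb_ratio(frame, target), 2))  # 5.0
```

In the paper's pipeline the mask would come from the Chan-Vese segmentation rather than being drawn by hand as here.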
Microsurgical Clipping of an Unruptured Carotid Cave Aneurysm: 3-Dimensional Operative Video.
Tabani, Halima; Yousef, Sonia; Burkhardt, Jan-Karl; Gandhi, Sirin; Benet, Arnau; Lawton, Michael T
2017-08-01
Most aneurysms originating from the clinoidal segment of the internal carotid artery (ICA) are nowadays managed conservatively or treated endovascularly with coiling (with or without stenting) or flow diverters. However, microsurgical clip occlusion remains an alternative. This video demonstrates clip occlusion of an unruptured right carotid cave aneurysm measuring 7 mm in a 39-year-old woman. The patient opted for surgery because of concerns about the prolonged antiplatelet use associated with endovascular therapy. After patient consent, a standard pterional craniotomy was performed, followed by extradural anterior clinoidectomy. After dural opening and splitting of the sylvian fissure, a clinoidal flap was opened to enter the extradural space around the clinoidal segment. The dural ring was dissected circumferentially, freeing the medial wall of the ICA down to the sellar region and mobilizing the clinoidal segment of the ICA out of its canal. With the aneurysm neck in view, the aneurysm was clipped with a 45° angled fenestrated clip over the ICA. Indocyanine green angiography confirmed no further filling of the aneurysm and patency of the ICA. Complete aneurysm occlusion was confirmed with postoperative angiography, and the patient had no neurologic deficits (Video 1). This case demonstrates the importance of anterior clinoidectomy and thorough distal dural ring dissection for effective clipping of carotid cave aneurysms. Control of venous bleeding from the cavernous sinus with fibrin glue injection simplifies the dissection, which should minimize manipulation of the optic nerve. Knowledge of this anatomy and proficiency in these techniques are important in an era of declining open aneurysm cases. Copyright © 2017 Elsevier Inc. All rights reserved.
Remote Video Monitor of Vehicles in Cooperative Information Platform
NASA Astrophysics Data System (ADS)
Qin, Guofeng; Wang, Xiaoguo; Wang, Li; Li, Yang; Li, Qiyan
Detection of vehicles plays an important role in modern intelligent traffic management, and pattern recognition is a hot issue in computer vision. An auto-recognition system within a cooperative information platform is studied. The cooperative platform integrates a 3G wireless network with GPS, GPRS (CDMA), Internet (Intranet), remote video monitoring, and M-DMB networks. Remote video information can be taken from the terminals and sent to the cooperative platform, then processed by the auto-recognition system. The images are pre-processed and segmented, including feature extraction, template matching and pattern recognition. The system identifies different vehicle models and gathers vehicular traffic statistics. Finally, the implementation of the system is introduced.
Practical system for generating digital mixed reality video holograms.
Song, Joongseok; Kim, Changseob; Park, Hanhoon; Park, Jong-Il
2016-07-10
We propose a practical system that can effectively mix the depth data of real and virtual objects by using a Z buffer and can quickly generate digital mixed reality video holograms by using multiple graphic processing units (GPUs). In an experiment, we verify that real objects and virtual objects can be merged naturally in free viewing angles, and the occlusion problem is well handled. Furthermore, we demonstrate that the proposed system can generate mixed reality video holograms at 7.6 frames per second. Finally, the system's performance is further assessed through users' subjective evaluations.
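The core Z-buffer mixing step can be sketched as a per-pixel depth test; this toy version with NumPy arrays is an illustrative sketch, not the paper's GPU implementation, and the array names and sizes are assumptions.

```python
import numpy as np

def z_merge(color_a, depth_a, color_b, depth_b):
    """Per-pixel Z-buffer test: keep whichever source (real or
    virtual) is closer to the camera, so nearer objects correctly
    occlude farther ones."""
    nearer_a = depth_a <= depth_b
    color = np.where(nearer_a[..., None], color_a, color_b)
    depth = np.minimum(depth_a, depth_b)
    return color, depth

# 2x2 example: a red "real" layer and a blue "virtual" layer.
real_rgb = np.tile([255.0, 0.0, 0.0], (2, 2, 1))
virt_rgb = np.tile([0.0, 0.0, 255.0], (2, 2, 1))
real_z = np.array([[1.0, 1.0], [1.0, 1.0]])
virt_z = np.array([[0.5, 2.0], [0.5, 2.0]])  # near in col 0, far in col 1
rgb, z = z_merge(real_rgb, real_z, virt_rgb, virt_z)
# virtual (blue) wins in column 0; real (red) wins in column 1
```

In the actual system this merged color/depth pair would feed the hologram generation stage on the GPUs.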
NASA Astrophysics Data System (ADS)
Le, Minh Tuan; Nguyen, Congdu; Yoon, Dae-Il; Jung, Eun Ku; Jia, Jie; Kim, Hae-Kwang
2007-12-01
In this paper, we propose a 3D-graphics-to-video encoding and streaming method embedded in a remote interactive 3D visualization system, allowing a 3D scene to be rapidly presented on mobile devices without downloading it from the server. In particular, a 3D-graphics-to-video framework is presented that increases the visual quality of regions of interest (ROI) by allocating more bits to the ROI during H.264 video encoding. The ROI are identified by projecting 3D objects onto a 2D plane during rasterization. The system lets users navigate the 3D scene and interact with objects of interest to query their descriptions. We developed an adaptive media streaming server that provides an adaptive video stream, in terms of object-based quality, according to the user's preferences and variations in network bandwidth. Results show that with ROI mode selection, the PSNR of the test samples changes only slightly while the visual quality of the objects of interest increases evidently.
Video-Based Big Data Analytics in Cyberlearning
ERIC Educational Resources Information Center
Wang, Shuangbao; Kelly, William
2017-01-01
In this paper, we present a novel system, inVideo, for video data analytics, and its use in transforming linear videos into interactive learning objects. InVideo is able to analyze video content automatically without the need for initial viewing by a human. Using a highly efficient video indexing engine we developed, the system is able to analyze…
ERIC Educational Resources Information Center
Lawrence, Michael A.
1985-01-01
"Narrowcasting" is information and entertainment aimed at specific population segments, including previously ignored minorities. Cable, satellite, videodisc, low-power television, and video cassette recorders may all help keep minorities from being "information poor." These elements, however, are expensive, and study is needed to understand how…
An objective measure of hyperactivity aspects with compressed webcam video.
Wehrmann, Thomas; Müller, Jörg Michael
2015-01-01
Objective measures of physical activity are currently not considered in clinical guidelines for the assessment of hyperactivity in the context of Attention-Deficit/Hyperactivity Disorder (ADHD), due to low and inconsistent associations with clinical ratings, missing age-related norm data, and high technical requirements. This pilot study introduces a new objective measure of physical activity using compressed webcam video footage, which should be less affected by age-related variables. A pre-test established a preliminary standard procedure for testing a clinical sample of 39 children aged 6-16 years (21 with a clinical ADHD diagnosis, 18 without). Subjects were filmed for 6 min while solving a standardized cognitive performance task. Our webcam-based video-activity score was compared with two independent video-based movement ratings by students; with ratings of Inattentiveness, Hyperactivity and Impulsivity by clinicians (DCL-ADHS), who gave the clinical ADHD diagnoses, and by parents (FBB-ADHD); and with physical features (age, weight, height, BMI), using mean scores, correlations and multiple regression. Our video-activity score showed a high agreement (r = 0.81) with the video-based movement ratings, but also considerable associations with age-related physical attributes. After controlling for age-related confounders, the video-activity score did not show the expected association with clinicians' or parents' hyperactivity ratings. Our preliminary conclusion is that our video-activity score assesses physical activity but not specific information related to hyperactivity. The general problem of defining and assessing hyperactivity with objective criteria remains.
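A video-based activity score of this kind is often built from inter-frame differences. The sketch below is a minimal illustrative stand-in, not the study's actual measure (which works on compressed webcam footage); the normalisation and synthetic frames are assumptions.

```python
import numpy as np

def video_activity_score(frames):
    """Simple activity measure: mean absolute intensity change
    between consecutive frames, normalised to [0, 1]. Higher values
    indicate more movement in front of the camera."""
    frames = np.asarray(frames, dtype=float)
    diffs = np.abs(np.diff(frames, axis=0))
    return diffs.mean() / 255.0

still = np.zeros((10, 8, 8))       # a motionless scene
moving = np.zeros((10, 8, 8))
moving[1::2] = 255.0               # maximal change every frame
assert video_activity_score(still) == 0.0
assert video_activity_score(moving) == 1.0
```

A real implementation would also need to factor out camera noise and, as the study notes, age-related physical attributes such as body size.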
Assessment of YouTube videos as a source of information on medication use in pregnancy.
Hansen, Craig; Interrante, Julia D; Ailes, Elizabeth C; Frey, Meghan T; Broussard, Cheryl S; Godoshian, Valerie J; Lewis, Courtney; Polen, Kara N D; Garcia, Amanda P; Gilboa, Suzanne M
2016-01-01
When making decisions about medication use in pregnancy, women consult many information sources, including the Internet. The aim of this study was to assess the content of publicly accessible YouTube videos that discuss medication use in pregnancy. Using 2023 distinct combinations of search terms related to medications and pregnancy, we extracted metadata from YouTube videos using a YouTube video Application Programming Interface. Relevant videos were defined as those with a medication search term and a pregnancy-related search term in either the video title or description. We viewed relevant videos and abstracted content from each video into a database. We documented whether videos implied each medication to be "safe" or "unsafe" in pregnancy and compared that assessment with the medication's Teratogen Information System (TERIS) rating. After viewing 651 videos, 314 videos with information about medication use in pregnancy were available for the final analyses. The majority of videos were from law firms (67%), television segments (10%), or physicians (8%). Selective serotonin reuptake inhibitors (SSRIs) were the most common medication class named (225 videos, 72%), and 88% of videos about SSRIs indicated that they were unsafe for use in pregnancy. However, the TERIS ratings for medication products in this class range from "unlikely" to "minimal" teratogenic risk. For the majority of medications, current YouTube video content does not adequately reflect what is known about the safety of their use in pregnancy and should be interpreted cautiously. However, YouTube could serve as a platform for communicating evidence-based medication safety information. Copyright © 2015 John Wiley & Sons, Ltd.
Perception of synchronization errors in haptic and visual communications
NASA Astrophysics Data System (ADS)
Kameyama, Seiji; Ishibashi, Yutaka
2006-10-01
This paper deals with a system which conveys the haptic sensation experienced by a user to a remote user. In the system, a user controls a haptic interface device through another, remote haptic interface device while watching video. Haptic media and video of a real object which the user is touching are transmitted to the other user. By subjective assessment, we investigate the allowable and imperceptible ranges of synchronization error between the haptic media and video. We employ four real objects and ask each subject whether the synchronization error is perceived for each object. Assessment results show that the synchronization error is more easily perceived when the haptic media are ahead of the video than when they are behind it.
Spatio-Temporal Video Segmentation with Shape Growth or Shrinkage Constraint
NASA Technical Reports Server (NTRS)
Tarabalka, Yuliya; Charpiat, Guillaume; Brucker, Ludovic; Menze, Bjoern H.
2014-01-01
We propose a new method for joint segmentation of monotonously growing or shrinking shapes in a time sequence of noisy images. The task of segmenting the image time series is expressed as an optimization problem using the spatio-temporal graph of pixels, in which we are able to impose the constraint of shape growth or of shrinkage by introducing monodirectional infinite links connecting pixels at the same spatial locations in successive image frames. The globally optimal solution is computed with a graph cut. The performance of the proposed method is validated on three applications: segmentation of melting sea ice floes and of growing burned areas from time series of 2D satellite images, and segmentation of a growing brain tumor from sequences of 3D medical scans. In the latter application, we impose an additional inter-sequence inclusion constraint by adding directed infinite links between pixels of dependent image structures.
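The growth constraint via monodirectional infinite links can be demonstrated on a tiny graph. The sketch below is a toy, not the paper's implementation: it uses a plain Edmonds-Karp max-flow, a single pixel tracked over three frames, and made-up unary costs. An infinite arc from frame k to frame k+1 makes the labelling "foreground at k, background at k+1" infinitely expensive, so the foreground can only grow.

```python
from collections import deque

INF = float("inf")

def max_flow_source_side(edges, s, t):
    """Edmonds-Karp on a directed graph given as {(u, v): capacity}.
    Returns the nodes reachable from s in the final residual graph,
    i.e. the source side of a minimum s-t cut (the 'foreground')."""
    res = {}
    for (u, v), c in edges.items():
        res.setdefault(u, {})
        res.setdefault(v, {}).setdefault(u, 0)  # reverse residual arc
        res[u][v] = res[u].get(v, 0) + c
    while True:
        parent = {s: None}                      # BFS for an augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)                  # no path left: min-cut side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push

# One pixel over three frames; the noisy evidence wrongly favours
# background in the middle frame, but the growth links fix it.
fg_ev = [6, 2, 6]                    # cut s->k if frame k is background
bg_ev = [1, 5, 1]                    # cut k->t if frame k is foreground
edges = {}
for k in range(3):
    edges[("s", k)] = fg_ev[k]
    edges[(k, "t")] = bg_ev[k]
for k in range(2):
    edges[(k, k + 1)] = INF          # monodirectional infinite link
foreground = max_flow_source_side(edges, "s", "t") - {"s"}
assert foreground == {0, 1, 2}       # frame 1 labelled fg despite weak evidence
```

The paper's method applies the same construction to every pixel of every frame, with neighbourhood smoothness terms as well.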
Shuttle Lesson Learned - Toxicology
NASA Technical Reports Server (NTRS)
James, John T.
2010-01-01
This is a script for a video about toxicology and the space shuttle. The first segment deals with dust in the space vehicle. The next segment will be about archival samples. Then we'll look at real-time on-board analyzers that give us a lot of capability in terms of monitoring for combustion products and the ability to monitor volatile organics on the station. Finally we will look at other issues, such as setting limits and dealing with ground-based lessons that pertain to toxicology.
A data set for evaluating the performance of multi-class multi-object video tracking
NASA Astrophysics Data System (ADS)
Chakraborty, Avishek; Stamatescu, Victor; Wong, Sebastien C.; Wigley, Grant; Kearney, David
2017-05-01
One of the challenges in evaluating multi-object video detection, tracking and classification systems is having publicly available data sets with which to compare different systems. However, the measures of performance for tracking and classification are different. Data sets that are suitable for evaluating tracking systems may not be appropriate for classification. Tracking video data sets typically only have ground truth track IDs, while classification video data sets only have ground truth class-label IDs. The former identifies the same object over multiple frames, while the latter identifies the type of object in individual frames. This paper describes an advancement of the ground truth meta-data for the DARPA Neovision2 Tower data set to allow both the evaluation of tracking and classification. The ground truth data sets presented in this paper contain unique object IDs across 5 different classes of object (Car, Bus, Truck, Person, Cyclist) for 24 videos of 871 image frames each. In addition to the object IDs and class labels, the ground truth data also contains the original bounding box coordinates together with new bounding boxes in instances where un-annotated objects were present. The unique IDs are maintained during occlusions between multiple objects or when objects re-enter the field of view. This will provide: a solid foundation for evaluating the performance of multi-object tracking of different types of objects, a straightforward comparison of tracking system performance using the standard Multi Object Tracking (MOT) framework, and classification performance using the Neovision2 metrics. These data have been hosted publicly.
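The first step of MOT-style scoring against such ground truth is matching each frame's detections to annotated IDs by bounding-box overlap. This sketch is illustrative only: the greedy matcher, the 0.5 IoU threshold, and the example boxes/IDs are assumptions, not part of the Neovision2 data set or metrics.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def match_frame(gt_boxes, det_boxes, thresh=0.5):
    """Greedy one-to-one matching of detections to ground-truth IDs.
    Unmatched ground truths count as misses, unmatched detections as
    false positives in the downstream MOT accounting."""
    pairs, used = [], set()
    for gid, g in gt_boxes.items():
        best, best_iou = None, thresh
        for j, d in enumerate(det_boxes):
            if j not in used and iou(g, d) >= best_iou:
                best, best_iou = j, iou(g, d)
        if best is not None:
            pairs.append((gid, best))
            used.add(best)
    return pairs

gt = {"Car-7": (0, 0, 10, 10), "Person-3": (20, 20, 30, 32)}
dets = [(21, 20, 30, 31), (1, 0, 10, 10)]
print(match_frame(gt, dets))  # [('Car-7', 1), ('Person-3', 0)]
```

Because the data set keeps IDs unique through occlusions, identity switches can be counted by comparing matched IDs across consecutive frames.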
An integrated framework for detecting suspicious behaviors in video surveillance
NASA Astrophysics Data System (ADS)
Zin, Thi Thi; Tin, Pyke; Hama, Hiromitsu; Toriu, Takashi
2014-03-01
In this paper, we propose an integrated framework for detecting suspicious behaviors in video surveillance systems installed in public places such as railway stations, airports, and shopping malls. In particular, suspicious loitering, unattended objects left behind, and suspicious exchanges of objects between persons are common security concerns in airports and other transit scenarios. Detecting them involves understanding the scene and its events, analyzing human movements, recognizing controllable objects, and observing the effect of human movement on those objects. In the proposed framework, a multiple-background modeling technique, a high-level motion feature extraction method, and embedded Markov chain models are integrated for detecting suspicious behaviors in real-time video surveillance systems. Specifically, the framework employs a probability-based multiple-background modeling technique to detect moving objects. Then velocity and distance measures are computed as high-level motion features of interest. By combining the computed features with the first passage time probabilities of the embedded Markov chain, suspicious behaviors are analyzed to detect loitering persons, objects left behind, and human interactions such as fighting. The proposed framework has been tested on standard public datasets and our own video surveillance scenarios.
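The first-passage-time idea behind the loitering detector can be illustrated with a small Markov chain. This is a toy sketch, not the paper's model: the zone chain, transition probabilities, and the loitering rule (observed dwell time far exceeding the expected passage time) are assumptions.

```python
import numpy as np

def expected_first_passage(P, target):
    """Expected number of steps to first reach `target` from each
    state of a Markov chain with transition matrix P, obtained by
    solving (I - Q) h = 1 over the non-target states."""
    n = P.shape[0]
    others = [i for i in range(n) if i != target]
    Q = P[np.ix_(others, others)]
    h = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
    out = np.zeros(n)
    out[others] = h
    return out

# Toy zone chain: 0 = entrance, 1 = concourse, 2 = exit (absorbing).
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])
h = expected_first_passage(P, target=2)
# h[0] = 20/3 and h[1] = 14/3 expected steps to reach the exit
```

A track whose observed dwell time greatly exceeds `h` for its zone would then be flagged as loitering for closer inspection.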
Multi-scale image segmentation method with visual saliency constraints and its application
NASA Astrophysics Data System (ADS)
Chen, Yan; Yu, Jie; Sun, Kaimin
2018-03-01
Object-based image analysis methods have many advantages over pixel-based methods, so they are one of the current research hotspots. To carry out object-based image analysis, it is essential to obtain image objects through multi-scale image segmentation. The currently popular image segmentation methods mostly share a bottom-up segmentation principle, which is simple to realize and yields accurate object boundaries. However, the macro statistical characteristics of image areas are difficult to take into account, and fragmented (over-segmented) results are hard to avoid. In addition, in information extraction, target recognition and other applications, image targets are not equally important; some specific targets or target groups with particular features warrant more attention than the others. To avoid over-segmentation and highlight the targets of interest, this paper proposes a multi-scale image segmentation method with visual saliency constraints. Visual saliency theory and a typical feature extraction method are adopted to obtain the visual saliency information, especially the macroscopic information to be analyzed. The visual saliency information is used as a distribution map of homogeneity weights, where each pixel is given a weight. This weight acts as one of the merging constraints in the multi-scale image segmentation. As a result, pixels that macroscopically belong to the same object but are locally different are more likely to be assigned to the same object. In addition, owing to the constraint of the visual saliency model, the balance between local and macroscopic characteristics can be well controlled during the segmentation of different objects. These controls improve the completeness of visually salient areas in the segmentation results while diluting the controlling effect in non-salient background areas.
Experiments show that this method works better for texture image segmentation than traditional multi-scale image segmentation methods and gives priority control to the saliency objects of interest. The method has been used in image quality evaluation, scattered residential area extraction, sparse forest extraction and other applications to verify its validity. All applications showed good results.
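One way to picture the saliency weight acting as a merging constraint is to discount the merge cost of adjacent regions by their shared saliency. This is a hypothetical rule in the spirit of the abstract, not the paper's actual criterion; the region records, weights, and threshold are invented for illustration.

```python
def merge_cost(region_a, region_b, saliency, discount=0.5):
    """Hypothetical merge rule: the spectral difference between two
    adjacent regions is discounted by their average saliency, so
    locally different pixels inside one salient object still merge,
    while non-salient background merges only if truly homogeneous."""
    diff = abs(region_a["mean"] - region_b["mean"])
    w = (saliency[region_a["id"]] + saliency[region_b["id"]]) / 2
    return diff * (1.0 - discount * w)

saliency = {1: 0.9, 2: 0.9, 3: 0.1}          # per-region saliency weight
a, b, c = ({"id": i, "mean": m}
           for i, m in [(1, 100.0), (2, 120.0), (3, 120.0)])
# Same spectral gap (20), but the salient pair falls under a merge
# threshold of 15 while the non-salient pair does not.
assert merge_cost(a, b, saliency) < 15.0
assert merge_cost(a, c, saliency) >= 15.0
```

In a real multi-scale segmenter this cost would be evaluated inside the iterative region-merging loop at each scale level.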
A segmentation editing framework based on shape change statistics
NASA Astrophysics Data System (ADS)
Mostapha, Mahmoud; Vicory, Jared; Styner, Martin; Pizer, Stephen
2017-02-01
Segmentation is a key task in medical image analysis because its accuracy significantly affects successive steps. Automatic segmentation methods often produce inadequate segmentations, which require the user to edit the result manually, slice by slice. Because such editing is time-consuming, an editing tool that enables the user to produce accurate segmentations by drawing only a sparse set of contours is needed. This paper describes such a framework as applied to a single object. Constrained by the additional information provided by the manually drawn contours, the proposed framework uses object shape statistics to transform the failed automatic segmentation into a more accurate version. Instead of modeling the object shape itself, the framework uses shape change statistics generated to capture the object deformation from the failed automatic segmentation to its corresponding correct segmentation. An optimization procedure minimizes an energy function consisting of two terms: an external contour match term and an internal shape change regularity term. The high accuracy of the proposed segmentation editing approach was confirmed by testing it on a simulated data set based on 10 in-vivo infant magnetic resonance brain data sets using four similarity metrics. Segmentation results indicated that our method can provide efficient and adequately accurate segmentations (a Dice segmentation accuracy increase of 10%) from very sparse contours (only 10%), which is promising for greatly decreasing the work expected from the user.
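The two-term energy can be illustrated with a toy 1-D contour model. This sketch is an assumption-laden stand-in for the paper's method: the contour is reduced to radii at fixed angles, the "shape change statistics" become a single hand-made deformation mode, and the energy reduces to ridge regression (contour match plus a quadratic regularity penalty on the mode weights).

```python
import numpy as np

def edit_segmentation(failed_radii, modes, user_points, lam=1.0):
    """Solve for shape-change mode weights w minimising
    ||A w - (user - failed)||^2 + lam * ||w||^2, i.e. an external
    contour match term plus an internal regularity term, then apply
    the deformation to the failed contour."""
    A = modes.T                          # (n_angles, n_modes)
    target = user_points - failed_radii  # user-demanded correction
    w = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ target)
    return failed_radii + A @ w

failed = np.full(8, 10.0)                # failed result: circle, radius 10
bulge = np.zeros(8)
bulge[0:2] = 1.0                         # one learned deformation mode
user = failed.copy()
user[0:2] = 13.0                         # user corrects two contour points
edited = edit_segmentation(failed, np.array([bulge]), user, lam=0.5)
# The contour moves toward the correction (12.4 at the edited angles)
# while untouched angles stay at 10, as the regularity term intends.
```

With real shape-change statistics the modes would come from a statistical model trained on failed/corrected segmentation pairs, and the contour match term would use the user's sparse slice contours.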
Another Way of Tracking Moving Objects Using Short Video Clips
ERIC Educational Resources Information Center
Vera, Francisco; Romanque, Cristian
2009-01-01
Physics teachers have long employed video clips to study moving objects in their classrooms and instructional labs. A number of approaches exist, both free and commercial, for tracking the coordinates of a point using video. The main characteristics of the method described in this paper are: it is simple to use; coordinates can be tracked using…
An unsupervised method for summarizing egocentric sport videos
NASA Astrophysics Data System (ADS)
Habibi Aghdam, Hamed; Jahani Heravi, Elnaz; Puig, Domenec
2015-12-01
People are increasingly interested in recording their sport activities using head-worn or hand-held cameras. This type of video, called egocentric sport video, has different motion and appearance patterns from life-logging video. While a life-logging video can be defined in terms of well-defined human-object interactions, it is not trivial to describe egocentric sport videos using well-defined activities. For this reason, summarizing egocentric sport videos based on human-object interaction might fail to produce meaningful results. In this paper, we propose an unsupervised method for summarizing egocentric videos by identifying the key-frames of the video. Our method utilizes both appearance and motion information, and it automatically finds the number of key-frames. Our blind user study on a new dataset collected from YouTube shows that in 93.5% of cases the users choose the proposed method as their first video summary choice. In addition, our method is within the top two choices of the users in 99% of studies.
Development of a video-delivered relaxation treatment of late-life anxiety for veterans.
Gould, Christine E; Zapata, Aimee Marie L; Bruce, Janine; Bereknyei Merrell, Sylvia; Wetherell, Julie Loebach; O'Hara, Ruth; Kuhn, Eric; Goldstein, Mary K; Beaudreau, Sherry A
2017-10-01
Behavioral treatments reduce anxiety, yet many older adults may not have access to these efficacious treatments. To address this need, we developed and evaluated the feasibility and acceptability of a video-delivered anxiety treatment for older Veterans. This treatment program, BREATHE (Breathing, Relaxation, and Education for Anxiety Treatment in the Home Environment), combines psychoeducation, diaphragmatic breathing, and progressive muscle relaxation training with engagement in activities. A mixed methods concurrent study design was used to examine the clarity of the treatment videos. We conducted semi-structured interviews with 20 Veterans (M age = 69.5, SD = 7.3 years; 55% White, Non-Hispanic) and collected ratings of video clarity. Quantitative ratings revealed that 100% of participants generally or definitely could follow breathing and relaxation video instructions. Qualitative findings, however, demonstrated more variability in the extent to which each video segment was clear. Participants identified both immediate benefits and motivation challenges associated with a video-delivered treatment. Participants suggested that some patients may need encouragement, whereas others need face-to-face therapy. Quantitative ratings of video clarity and qualitative findings highlight the feasibility of a video-delivered treatment for older Veterans with anxiety. Our findings demonstrate the importance of ensuring patients can follow instructions provided in self-directed treatments and the role that an iterative testing process has in addressing these issues. Next steps include testing the treatment videos with older Veterans with anxiety disorders.
Layer-based buffer aware rate adaptation design for SHVC video streaming
NASA Astrophysics Data System (ADS)
Gudumasu, Srinivas; Hamza, Ahmed; Asbun, Eduardo; He, Yong; Ye, Yan
2016-09-01
This paper proposes a layer-based, buffer-aware rate adaptation design that is able to avoid abrupt video quality fluctuation, reduce re-buffering latency, and improve bandwidth utilization when compared to a conventional simulcast-based adaptive streaming system. The proposed adaptation design schedules DASH segment requests based on the estimated bandwidth, the dependencies among video layers, and layer buffer fullness. Scalable HEVC (SHVC) is the latest state-of-the-art video coding technique and can alleviate various issues caused by simulcast-based adaptive video streaming. With scalably coded video streams, the video is encoded once into a number of layers representing different qualities and/or resolutions: a base layer (BL) and one or more enhancement layers (EL), each incrementally enhancing the quality of the lower layers. Such a layered coding structure allows fine-granularity rate adaptation for video streaming applications. Two video streaming use cases are presented in this paper. The first is streaming HD SHVC video over a wireless network with varying available bandwidth; a performance comparison between the proposed layer-based streaming approach and the conventional simulcast approach is provided. The second is streaming 4K/UHD SHVC video over a hybrid access network consisting of a 5G millimeter-wave high-speed wireless link and a conventional wired or WiFi network. The simulation results verify that the proposed layer-based rate adaptation approach utilizes the bandwidth more efficiently. As a result, a more consistent viewing experience, with higher-quality video content and minimal quality fluctuations, can be presented to the user.
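A layer-selection step of this kind can be sketched as follows. The rule, bitrates, and buffer thresholds below are illustrative assumptions, not the paper's algorithm: a layer is requested only if its cumulative bitrate fits the bandwidth estimate and every lower layer's buffer is healthy, so the base layer is always protected against re-buffering.

```python
def select_layers(est_bw_kbps, layer_rates_kbps, layer_buffers_s,
                  min_buffer_s=4.0):
    """Decide how many SHVC layers (BL + ELs) to request for the next
    DASH segment. Adding a layer requires (a) the cumulative bitrate
    to fit within the bandwidth estimate and (b) the buffers of all
    lower layers to be above a safety threshold."""
    chosen, cumulative = 0, 0.0
    for i, rate in enumerate(layer_rates_kbps):
        cumulative += rate
        if cumulative > est_bw_kbps:
            break                          # this layer no longer fits
        if i > 0 and layer_buffers_s[i - 1] < min_buffer_s:
            break                          # protect lower-layer buffers
        chosen = i + 1
    return chosen

rates = [800, 1200, 2000]                  # BL, EL1, EL2 bitrates (kbps)
assert select_layers(5000, rates, [10, 10, 10]) == 3  # everything fits
assert select_layers(2500, rates, [10, 10, 10]) == 2  # EL2 doesn't fit
assert select_layers(5000, rates, [2, 10, 10]) == 1   # low BL buffer: BL only
```

Because the ELs only refine the BL, dropping the top layer degrades quality gradually instead of forcing an abrupt switch between simulcast representations.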
Automatic topics segmentation for TV news video
NASA Astrophysics Data System (ADS)
Hmayda, Mounira; Ejbali, Ridha; Zaied, Mourad
2017-03-01
Automatic identification of television programs in a TV stream is an important task for operating archives. This article proposes a new spatio-temporal approach to identifying the programs in a TV stream in two main steps. First, a reference catalogue of video features for visual jingles is built. We exploit the features that characterize instances of the same program type to identify the different types of programs in the television stream; the role of the video features is to represent the visual invariants of each jingle, using appropriate automatic descriptors for each television program. Second, programs in the television stream are identified by examining the similarity of the video signal to the visual jingles in the catalogue. The main idea of the identification process is to compare the visual features of the video signal in the television stream to those in the catalogue. After presenting the proposed approach, the paper reports encouraging experimental results on several streams extracted from different channels and composed of several programs.
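The catalogue-matching step can be pictured as sliding each jingle's feature signature over the stream and scoring the similarity. The sketch below is a loose illustration under strong assumptions: signatures are reduced to 1-D sequences and similarity to normalized cross-correlation, whereas the paper uses richer visual descriptors.

```python
import numpy as np

def find_program(stream_sig, catalogue):
    """Slide each catalogue jingle signature over the stream's
    per-frame feature signature; return the program whose jingle
    gives the highest normalized cross-correlation anywhere."""
    best_name, best_score = None, -1.0
    for name, jingle in catalogue.items():
        j = (jingle - jingle.mean()) / jingle.std()
        for start in range(len(stream_sig) - len(jingle) + 1):
            win = stream_sig[start:start + len(jingle)]
            w = (win - win.mean()) / (win.std() + 1e-12)
            score = float(np.mean(j * w))     # in [-1, 1]
            if score > best_score:
                best_name, best_score = name, score
    return best_name

news = np.array([1.0, 5.0, 1.0, 5.0])         # toy jingle signatures
sports = np.array([9.0, 0.0, 0.0, 9.0])
stream = np.concatenate([np.zeros(3), news + 0.1, np.zeros(3)])
assert find_program(stream, {"news": news, "sports": sports}) == "news"
```

A production system would also need a detection threshold so that stream segments matching no jingle are left unlabelled.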
NASA Astrophysics Data System (ADS)
Shimada, Satoshi; Azuma, Shouzou; Teranaka, Sayaka; Kojima, Akira; Majima, Yukie; Maekawa, Yasuko
We developed a system in which knowledge can be discovered and shared cooperatively within an organization, based on the SECI model of knowledge management. The system realizes three processes by the following methods. (1) A video demonstrating a skill is segmented into a number of scenes according to its contents; tacit knowledge is shared within each scene. (2) Tacit knowledge is extracted through a bulletin board linked to each scene. (3) Knowledge is acquired by repeatedly viewing a video scene together with comments that show the technical content to be practiced. We conducted experiments in which the system was used by nurses working at general hospitals. The experimental results show that practical nursing know-how can be collected by utilizing a bulletin board linked to video scenes. The results of this study confirmed the possibility of expressing the tacit knowledge of nurses' empirical nursing skills sensitively, using video images as a clue.
Deformable M-Reps for 3D Medical Image Segmentation.
Pizer, Stephen M; Fletcher, P Thomas; Joshi, Sarang; Thall, Andrew; Chen, James Z; Fridman, Yonatan; Fritsch, Daniel S; Gash, Graham; Glotzer, John M; Jiroutek, Michael R; Lu, Conglin; Muller, Keith E; Tracton, Gregg; Yushkevich, Paul; Chaney, Edward L
2003-11-01
M-reps (formerly called DSLs) are a multiscale medial means for modeling and rendering 3D solid geometry. They are particularly well suited to model anatomic objects and in particular to capture prior geometric information effectively in deformable models segmentation approaches. The representation is based on figural models, which define objects at coarse scale by a hierarchy of figures, each figure generally a slab representing a solid region and its boundary simultaneously. This paper focuses on the use of single figure models to segment objects of relatively simple structure. A single figure is a sheet of medial atoms, which is interpolated from the model formed by a net, i.e., a mesh or chain, of medial atoms (hence the name m-reps), each atom modeling a solid region via not only a position and a width but also a local figural frame giving figural directions and an object angle between opposing, corresponding positions on the boundary implied by the m-rep. The special capability of an m-rep is to provide spatial and orientational correspondence between an object in two different states of deformation. This ability is central to effective measurement of both geometric typicality and geometry to image match, the two terms of the objective function optimized in segmentation by deformable models. The other ability of m-reps central to effective segmentation is their ability to support segmentation at multiple levels of scale, with successively finer precision. Objects modeled by single figures are segmented first by a similarity transform augmented by object elongation, then by adjustment of each medial atom, and finally by displacing a dense sampling of the m-rep implied boundary. While these models and approaches also exist in 2D, we focus on 3D objects. The segmentation of the kidney from CT and the hippocampus from MRI serve as the major examples in this paper. The accuracy of segmentation as compared to manual, slice-by-slice segmentation is reported.
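The medial atom described above (a hub position, a width, a figural frame, and an object angle between opposing spokes) can be sketched in 2D; the class below is a simplified illustration, not the paper's 3D representation, and the parameterization is an assumption.

```python
import math

class MedialAtom:
    """Minimal 2-D sketch of an m-rep medial atom: a hub at (x, y),
    a spoke length (width) r, a frame direction phi, and an object
    angle theta between the frame and each of the two spokes that
    imply opposing boundary points."""
    def __init__(self, x, y, r, phi, theta):
        self.x, self.y, self.r = x, y, r
        self.phi, self.theta = phi, theta

    def boundary_points(self):
        """The two boundary positions implied by the atom's spokes."""
        pts = []
        for sign in (+1, -1):
            a = self.phi + sign * self.theta
            pts.append((self.x + self.r * math.cos(a),
                        self.y + self.r * math.sin(a)))
        return pts

# With phi = 0 and theta = pi/2, the spokes point straight up and
# down, implying boundary points near (0, 2) and (0, -2).
atom = MedialAtom(0.0, 0.0, 2.0, phi=0.0, theta=math.pi / 2)
top, bottom = atom.boundary_points()
```

In a full m-rep, a mesh of such atoms is interpolated into a continuous medial sheet, and segmentation deforms the atoms at successively finer scales.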
Deformable M-Reps for 3D Medical Image Segmentation
Pizer, Stephen M.; Fletcher, P. Thomas; Joshi, Sarang; Thall, Andrew; Chen, James Z.; Fridman, Yonatan; Fritsch, Daniel S.; Gash, Graham; Glotzer, John M.; Jiroutek, Michael R.; Lu, Conglin; Muller, Keith E.; Tracton, Gregg; Yushkevich, Paul; Chaney, Edward L.
2013-01-01
PMID:23825898
ERIC Educational Resources Information Center
Kozma, Robert B.; Russell, Joel
1997-01-01
Examines how professional chemists and undergraduate chemistry students respond to chemistry-related video segments, graphs, animations, and equations. Discusses the role that surface features of representations play in the understanding of chemistry. Contains 36 references. (DDR)
Annotation of UAV surveillance video
NASA Astrophysics Data System (ADS)
Howlett, Todd; Robertson, Mark A.; Manthey, Dan; Krol, John
2004-08-01
Significant progress toward the development of a video annotation capability is presented in this paper. Research and development of an object tracking algorithm applicable for UAV video is described. Object tracking is necessary for attaching the annotations to the objects of interest. A methodology and format are defined for encoding video annotations using the SMPTE Key-Length-Value encoding standard. This provides the following benefits: a non-destructive annotation, compliance with existing standards, video playback in systems that are not annotation enabled, and support for a real-time implementation. A model real-time video annotation system is also presented, at a high level, using the MPEG-2 Transport Stream as the transmission medium. This work was accomplished to meet the Department of Defense's (DoD's) need for a video annotation capability. Current practice for creating annotated products is to capture a still image frame, annotate it using an Electronic Light Table application, and then pass the annotated image on as a product. That is not adequate for reporting or downstream cueing: it is too slow, and there is a severe loss of information. This paper describes a capability for annotating directly on the video.
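The Key-Length-Value packaging named above can be sketched in a few lines. This is an illustrative encoder using BER length encoding (as in SMPTE ST 336); the 16-byte key below is a placeholder, not the actual annotation Universal Label, which the abstract does not reproduce.

```python
def encode_klv(key: bytes, value: bytes) -> bytes:
    """Encode one Key-Length-Value triplet (BER short/long length form)."""
    n = len(value)
    if n < 128:
        length = bytes([n])  # short form: one length octet
    else:
        # long form: first octet = 0x80 | number of following length octets
        octets = n.to_bytes((n.bit_length() + 7) // 8, "big")
        length = bytes([0x80 | len(octets)]) + octets
    return key + length + value

# A real SMPTE key is a 16-byte Universal Label; this one is hypothetical.
packet = encode_klv(b"\x06\x0e\x2b\x34" + b"\x00" * 12, b"annotation text")
```

Because the value bytes are opaque to a KLV parser, players that do not understand the annotation key can skip it, which is what makes the annotation non-destructive.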
Motion-Blur-Free High-Speed Video Shooting Using a Resonant Mirror
Inoue, Michiaki; Gu, Qingyi; Takaki, Takeshi; Ishii, Idaku; Tajima, Kenji
2017-01-01
This study proposes a novel concept of actuator-driven frame-by-frame intermittent tracking for motion-blur-free video shooting of fast-moving objects. The camera frame and shutter timings are controlled for motion blur reduction in synchronization with a free-vibration-type actuator vibrating with a large amplitude at hundreds of hertz so that motion blur can be significantly reduced in free-viewpoint high-frame-rate video shooting for fast-moving objects by deriving the maximum performance of the actuator. We develop a prototype of a motion-blur-free video shooting system by implementing our frame-by-frame intermittent tracking algorithm on a high-speed video camera system with a resonant mirror vibrating at 750 Hz. It can capture 1024 × 1024 images of fast-moving objects at 750 fps with an exposure time of 0.33 ms without motion blur. Several experimental results for fast-moving objects verify that our proposed method can reduce image degradation from motion blur without decreasing the camera exposure time. PMID:29109385
More About The Video Event Trigger
NASA Technical Reports Server (NTRS)
Williams, Glenn L.
1996-01-01
Report presents additional information about system described in "Video Event Trigger" (LEW-15076). Digital electronic system processes video-image data to generate trigger signal when image shows significant change, such as motion, or appearance, disappearance, change in color, brightness, or dilation of object. Potential uses include monitoring of hallways, parking lots, and other areas during hours when supposed unoccupied, looking for fires, tracking airplanes or other moving objects, identification of missing or defective parts on production lines, and video recording of automobile crash tests.
Video sensor with range measurement capability
NASA Technical Reports Server (NTRS)
Howard, Richard T. (Inventor); Briscoe, Jeri M. (Inventor); Corder, Eric L. (Inventor); Broderick, David J. (Inventor)
2008-01-01
A video sensor device is provided which incorporates a rangefinder function. The device includes a single video camera and a fixed laser spaced a predetermined distance from the camera for, when activated, producing a laser beam. A diffractive optic element divides the beam so that multiple light spots are produced on a target object. A processor calculates the range to the object based on the known spacing and angles determined from the light spots on the video images produced by the camera.
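The range computation described is a triangulation from the known camera-laser spacing. A minimal sketch, assuming a pinhole camera and a beam parallel to the optical axis (the patent's exact geometry may differ):

```python
import math

def range_from_spot(pixel_offset: float, focal_px: float, baseline_m: float) -> float:
    """Range to the target from one laser spot's position in the image.

    pixel_offset: spot distance from the principal point, in pixels;
    focal_px: focal length in pixels; baseline_m: camera-laser spacing.
    """
    angle = math.atan2(pixel_offset, focal_px)  # angle subtended by the spot
    return baseline_m / math.tan(angle)
```

With a 0.1 m baseline and a 1000-pixel focal length, a spot 10 pixels from the principal point implies a range of about 10 m; the spot moves toward the axis as the target recedes.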
Moving object detection and tracking in videos through turbulent medium
NASA Astrophysics Data System (ADS)
Halder, Kalyan Kumar; Tahtali, Murat; Anavatti, Sreenatha G.
2016-06-01
This paper addresses the problem of identifying and tracking moving objects in a video sequence having a time-varying background. This is a fundamental task in many computer vision applications, though a very challenging one because of turbulence that causes blurring and spatiotemporal movements of the background images. Our proposed approach involves two major steps. First, a moving object detection algorithm is used that detects real motions by separating out the turbulence-induced motions using a two-level thresholding technique. In the second step, a feature-based generalized regression neural network is applied to track the detected objects throughout the frames of the video sequence. The proposed approach uses the centroid and area features of the moving objects and creates the reference regions instantly by selecting the objects within a circle. Simulation experiments are carried out on several turbulence-degraded video sequences, and comparison with an earlier method confirms that the proposed approach provides more effective tracking of the targets.
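The two-level thresholding and the centroid/area features named above can be sketched as follows; the threshold values are hypothetical tuning parameters, and the pixel labeling is a simplified reading of the method:

```python
import numpy as np

def separate_motions(motion_mag: np.ndarray, t_low: float, t_high: float) -> np.ndarray:
    """Label motion magnitude per pixel: 0 = static background,
    1 = turbulence-induced motion, 2 = real object motion."""
    labels = np.zeros(motion_mag.shape, dtype=np.uint8)
    labels[motion_mag > t_low] = 1
    labels[motion_mag > t_high] = 2
    return labels

def centroid_area(mask: np.ndarray):
    """Centroid and area features of one detected object mask."""
    ys, xs = np.nonzero(mask)
    return (xs.mean(), ys.mean()), xs.size
```

The intermediate band between the two thresholds is where turbulence-induced motion is expected to fall, so only the top label would be passed to the tracker.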
Automatic facial animation parameters extraction in MPEG-4 visual communication
NASA Astrophysics Data System (ADS)
Yang, Chenggen; Gong, Wanwei; Yu, Lu
2002-01-01
Facial Animation Parameters (FAPs) are defined in MPEG-4 to animate a facial object. The algorithm proposed in this paper to extract these FAPs is applied to very low bit-rate video communication, in which the scene is composed of a head-and-shoulder object with a complex background. The paper addresses the automatic extraction of all FAPs needed to animate a generic facial model and the estimation of the 3D head motion from corresponding points. The proposed algorithm extracts the human facial region by color segmentation and intra-frame and inter-frame edge detection. Facial structure and the edge distribution of facial features, such as vertical and horizontal gradient histograms, are used to locate the facial feature regions. Parabola and circle deformable templates are employed to fit facial features and extract part of the FAPs. A special data structure is proposed to describe the deformable templates and reduce the time spent computing energy functions. The remaining FAPs, the 3D rigid head motion vectors, are estimated by a corresponding-points method. A 3D head wire-frame model provides facial semantic information for the selection of proper corresponding points, which helps to increase the accuracy of 3D rigid object motion estimation.
A Multi-Objective Decision Making Approach for Solving the Image Segmentation Fusion Problem.
Khelifi, Lazhar; Mignotte, Max
2017-08-01
Image segmentation fusion is defined as the set of methods which aim at merging several image segmentations, in a manner that takes full advantage of the complementarity of each one. Previous research in this field has been impeded by the difficulty in identifying an appropriate single segmentation fusion criterion providing the best possible, i.e., the most informative, result of fusion. In this paper, we propose a new model of image segmentation fusion based on multi-objective optimization which can mitigate this problem, to obtain a final improved result of segmentation. Our fusion framework incorporates the dominance concept in order to efficiently combine and optimize two complementary segmentation criteria, namely, the global consistency error and the F-measure (precision-recall) criterion. To this end, we present a hierarchical and efficient way to optimize the multi-objective consensus energy function related to this fusion model, which exploits a simple and deterministic iterative relaxation strategy combining the different image segments. This step is followed by a decision making task based on the so-called "technique for order preference by similarity to ideal solution" (TOPSIS). Results obtained on two publicly available databases with manual ground truth segmentations clearly show that our multi-objective energy-based model gives better results than the classical mono-objective one.
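The dominance concept the fusion model relies on can be illustrated with a generic Pareto filter over the two criteria, written here as costs to minimize (e.g. GCE and 1 − F-measure). This is a textbook sketch, not the paper's hierarchical optimizer:

```python
def dominates(a, b):
    """a dominates b when it is no worse on every criterion and strictly
    better on at least one (all criteria are costs: lower is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    """Keep the non-dominated candidate segmentations."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]
```

A decision-making step such as TOPSIS then picks a single solution from the surviving non-dominated set.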
Thamjamrassri, Punyotai; Song, YuJin; Tak, JaeHyun; Kang, HoYong; Hong, Jeeyoung
2018-01-01
Objectives Customer discovery (CD) is a method to determine if there are actual customers for a product/service and what they would want before actually developing the product/service. This concept, however, is rather new to health information technology (IT) systems. Therefore, the aim of this paper was to demonstrate how to use the CD method in developing a comprehensive health IT service for patients with knee/leg pain. Methods We participated in a 6-week I-Corps program to perform CD, in which we interviewed 55 people in person, by phone, or by video conference within 6 weeks: 4 weeks in the United States and 2 weeks in Korea. The interviewees included orthopedic doctors, physical therapists, physical trainers, physicians, researchers, pharmacists, vendors, and patients. By analyzing the interview data, the aim was to revise our business model accordingly. Results Using the CD approach enabled us to understand the customer segments and identify value propositions. We concluded that a facilitating tele-rehabilitation system is needed the most and that the most suitable customer segment is early stage arthritis patients. We identified a new design concept for the customer segment. Furthermore, CD is required to identify value propositions in detail. Conclusions CD is crucial to determine a more desirable direction in developing health IT systems, and it can be a powerful tool to increase the potential for successful commercialization in the health IT field. PMID:29503756
Multiple Hypotheses Image Segmentation and Classification With Application to Dietary Assessment
Zhu, Fengqing; Bosch, Marc; Khanna, Nitin; Boushey, Carol J.; Delp, Edward J.
2016-01-01
We propose a method for dietary assessment to automatically identify and locate food in a variety of images captured during controlled and natural eating events. Two concepts are combined to achieve this: a set of segmented objects can be partitioned into perceptually similar object classes based on global and local features; and perceptually similar object classes can be used to assess the accuracy of image segmentation. These ideas are implemented by generating multiple segmentations of an image to select stable segmentations based on the classifier’s confidence score assigned to each segmented image region. Automatic segmented regions are classified using a multichannel feature classification system. For each segmented region, multiple feature spaces are formed. Feature vectors in each of the feature spaces are individually classified. The final decision is obtained by combining class decisions from individual feature spaces using decision rules. We show improved accuracy of segmenting food images with classifier feedback. PMID:25561457
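The selection of stable segmentations from the classifier's confidence scores can be sketched as keeping, among the multiple candidate segmentations, the one whose regions score highest on average; the region representation and scoring function below are placeholders:

```python
def select_stable(segmentations, confidence):
    """segmentations: list of candidate segmentations, each a list of regions;
    confidence: maps a segmented region to the classifier's confidence score."""
    def mean_conf(seg):
        return sum(confidence(r) for r in seg) / len(seg)
    return max(segmentations, key=mean_conf)
```

This captures the feedback loop in the abstract: segmentation quality is judged by how confidently the downstream classifier can label the resulting regions.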
Multiple hypotheses image segmentation and classification with application to dietary assessment.
Zhu, Fengqing; Bosch, Marc; Khanna, Nitin; Boushey, Carol J; Delp, Edward J
2015-01-01
ERIC Educational Resources Information Center
Barrett, Andrew J.; And Others
The Center for Interactive Technology, Applications, and Research at the College of Engineering of the University of South Florida (Tampa) has developed objective and descriptive evaluation models to assist in determining the educational potential of computer and video courseware. The computer-based courseware evaluation model and the video-based…
Cooperative Educational Project - The Southern Appalachians: A Changing World
NASA Astrophysics Data System (ADS)
Clark, S.; Back, J.; Tubiolo, A.; Romanaux, E.
2001-12-01
The Southern Appalachian Mountains, a popular recreation area known for its beauty and rich biodiversity, was chosen by the U.S. Geological Survey as the site to produce a video, booklet, and teachers' guide to explain basic geologic principles and how long-term geologic processes affect landscapes, ecosystems, and the quality of human life. The video was produced in cooperation with the National Park Service and has benefited from the advice of the Southern Appalachian Man and Biosphere Cooperative, a group of 11 Federal and three State agencies that works to promote the environmental health, stewardship, and sustainable development of the resources of the region. Much of the information in the video is included in the booklet. A teachers' guide provides supporting activities that teachers may use to reinforce the concepts presented in the video and booklet. Although the Southern Appalachians include some of the most visited recreation areas in the country, few are aware of the geologic underpinnings that have contributed to the beauty, biological diversity, and quality of human life in the region. The video includes several animated segments that show paleogeographic reconstructions of the Earth and movements of the North American continent over time; the formation of the Ocoee sedimentary basin beginning about 750 million years ago; the collision of the North American and African continents about 270 million years ago; the formation of granites and similar rocks, faults, and geologic windows; and the extent of glaciation in North America. The animated segments are tied to familiar public-access localities in the region. They illustrate geologic processes and time periods, making the geologic setting of the region more understandable to tourists and local students. The video reinforces the concept that understanding geologic processes and settings is an important component of informed land management to sustain the quality of life in a region.
The video and a teachers' guide will be distributed by the Southern Appalachian Man and Biosphere Cooperative to local middle and high schools, libraries, and visitor centers in the region. The video will also be distributed by the U.S. Geological Survey and sold in Park Service and Forest Service gift shops in the region.
Fostering Teacher Candidates' Reflective Practice through Video Editing
ERIC Educational Resources Information Center
Trent, Margaret; Gurvitch, Rachel
2015-01-01
Recently, interest in using video to promote the reflective practice in preservice teacher education has increased. Video recordings of teaching incidents inspire the reflective practice in preservice teachers by allowing them to analyze instruction and view teaching in an objective light. As an extension of video recording, video editing has…
Free-viewpoint video of human actors using multiple handheld Kinects.
Ye, Genzhi; Liu, Yebin; Deng, Yue; Hasler, Nils; Ji, Xiangyang; Dai, Qionghai; Theobalt, Christian
2013-10-01
We present an algorithm for creating free-viewpoint video of interacting humans using three handheld Kinect cameras. Our method reconstructs deforming surface geometry and temporally varying texture of humans through estimation of human poses and camera poses for every time step of the RGBZ video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem, which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Finally, texture recovery is achieved through joint optimization over the spatio-temporal RGB data using matrix completion. As opposed to previous methods, our algorithm succeeds on free-viewpoint video of human actors in general uncontrolled indoor scenes with potentially dynamic backgrounds, and it succeeds even if the cameras are moving.
Context-Aware Fusion of RGB and Thermal Imagery for Traffic Monitoring
Alldieck, Thiemo; Bahnsen, Chris H.; Moeslund, Thomas B.
2016-01-01
In order to enable a robust 24-h monitoring of traffic under changing environmental conditions, it is beneficial to observe the traffic scene using several sensors, preferably from different modalities. To fully benefit from multi-modal sensor output, however, one must fuse the data. This paper introduces a new approach for fusing color RGB and thermal video streams by using not only the information from the videos themselves, but also the available contextual information of a scene. The contextual information is used to judge the quality of a particular modality and guides the fusion of two parallel segmentation pipelines of the RGB and thermal video streams. The potential of the proposed context-aware fusion is demonstrated by extensive tests of quantitative and qualitative characteristics on existing and novel video datasets and benchmarked against competing approaches to multi-modal fusion. PMID:27869730
A microcomputer interface for a digital audio processor-based data recording system.
Croxton, T L; Stump, S J; Armstrong, W M
1987-10-01
An inexpensive interface is described that performs direct transfer of digitized data from the digital audio processor and video cassette recorder based data acquisition system designed by Bezanilla (1985, Biophys. J., 47:437-441) to an IBM PC/XT microcomputer. The FORTRAN callable software that drives this interface is capable of controlling the video cassette recorder and starting data collection immediately after recognition of a segment of previously collected data. This permits piecewise analysis of long intervals of data that would otherwise exceed the memory capability of the microcomputer.
A microcomputer interface for a digital audio processor-based data recording system.
Croxton, T L; Stump, S J; Armstrong, W M
1987-01-01
PMID:3676444
(abstract) Geological Tour of Southwestern Mexico
NASA Technical Reports Server (NTRS)
Adams, Steven L.; Lang, Harold R.
1993-01-01
Nineteen Landsat Thematic Mapper quarter scenes, coregistered at 28.5 m spatial resolution with three-arc-second digital topographic data, were used to create a movie simulating a flight over the Guerrero and Mixteco terrains of southwestern Mexico. The flight path was chosen to elucidate important structural, stratigraphic, and geomorphic features. The video, available in VHS format, is a 360-second animation consisting of 10,800 total frames. The simulated velocity during three 120-second flight segments of the video is approximately 37,000 km per hour, traversing approximately 1,000 km on the ground.
Static hand gesture recognition from a video
NASA Astrophysics Data System (ADS)
Rokade, Rajeshree S.; Doye, Dharmpal
2011-10-01
A sign language (also signed language) is a language which, instead of acoustically conveyed sound patterns, uses visually transmitted sign patterns to convey meaning, "simultaneously combining hand shapes, orientation and movement of the hands". Sign languages commonly develop in deaf communities, which can include interpreters, friends and families of deaf people as well as people who are deaf or hard of hearing themselves. In this paper, we propose a novel system for recognition of static hand gestures from a video, based on a Kohonen neural network. We propose an algorithm to separate out key frames, which contain correct gestures, from a video sequence. We segment hand images from complex and non-uniform backgrounds. Features are extracted by applying the Kohonen network to the key frames, and recognition is then performed.
ASSESSMENT OF YOUTUBE VIDEOS AS A SOURCE OF INFORMATION ON MEDICATION USE IN PREGNANCY
Hansen, Craig; Interrante, Julia D; Ailes, Elizabeth C; Frey, Meghan T; Broussard, Cheryl S; Godoshian, Valerie J; Lewis, Courtney; Polen, Kara ND; Garcia, Amanda P; Gilboa, Suzanne M
2015-01-01
Background When making decisions about medication use in pregnancy, women consult many information sources, including the Internet. The aim of this study was to assess the content of publicly-accessible YouTube videos that discuss medication use in pregnancy. Methods Using 2,023 distinct combinations of search terms related to medications and pregnancy, we extracted metadata from YouTube videos using a YouTube video Application Programming Interface. Relevant videos were defined as those with a medication search term and a pregnancy-related search term in either the video title or description. We viewed relevant videos and abstracted content from each video into a database. We documented whether videos implied each medication to be ‘safe’ or ‘unsafe’ in pregnancy and compared that assessment with the medication’s Teratogen Information System (TERIS) rating. Results After viewing 651 videos, 314 videos with information about medication use in pregnancy were available for the final analyses. The majority of videos were from law firms (67%), television segments (10%), or physicians (8%). Selective serotonin reuptake inhibitors (SSRIs) were the most common medication class named (225 videos, 72%), and 88% of videos about SSRIs indicated they were ‘unsafe’ for use in pregnancy. However, the TERIS ratings for medication products in this class range from ‘unlikely’ to ‘minimal’ teratogenic risk. Conclusion For the majority of medications, current YouTube video content does not adequately reflect what is known about the safety of their use in pregnancy and should be interpreted cautiously. However, YouTube could serve as a valuable platform for communicating evidence-based medication safety information. PMID:26541372
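The study's inclusion rule (a medication term and a pregnancy-related term in the title or description) is easy to sketch over video metadata; the term lists below are tiny illustrative subsets, not the 2,023 search-term combinations used in the study:

```python
MED_TERMS = {"ssri", "sertraline", "ibuprofen"}      # illustrative subset
PREG_TERMS = {"pregnancy", "pregnant", "prenatal"}   # illustrative subset

def is_relevant(video: dict) -> bool:
    """True when both a medication term and a pregnancy term appear in the
    video's title or description metadata."""
    text = (video.get("title", "") + " " + video.get("description", "")).lower()
    return (any(m in text for m in MED_TERMS)
            and any(p in text for p in PREG_TERMS))
```

A filter of this shape lets the metadata pass narrow thousands of API results down to the few hundred videos worth viewing manually.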
ERIC Educational Resources Information Center
Rohrer, Daniel M.
"Cableshop" is an experimental cable television service offering three- to seven-minute broadcast segments of product or community information and using a combination of telephone, computer, and video technology. Viewers participating in the service will have a choice of items ready for viewing listed on a "menu" channel and…
Duncan, James R; Kline, Benjamin; Glaiberman, Craig B
2007-04-01
To create and test methods of extracting efficiency data from recordings of simulated renal stent procedures. Task analysis was performed and used to design a standardized testing protocol. Five experienced angiographers then performed 16 renal stent simulations using the Simbionix AngioMentor angiographic simulator. Audio and video recordings of these simulations were captured from multiple vantage points. The recordings were synchronized and compiled. A series of efficiency metrics (procedure time, contrast volume, and tool use) were then extracted from the recordings. The intraobserver and interobserver variability of these individual metrics was also assessed. The metrics were converted to costs and aggregated to determine the fixed and variable costs of a procedure segment or the entire procedure. Task analysis and pilot testing led to a standardized testing protocol suitable for performance assessment. Task analysis also identified seven checkpoints that divided the renal stent simulations into six segments. Efficiency metrics for these different segments were extracted from the recordings and showed excellent intra- and interobserver correlations. Analysis of the individual and aggregated efficiency metrics demonstrated large differences between segments as well as between different angiographers. These differences persisted when efficiency was expressed as either total or variable costs. Task analysis facilitated both protocol development and data analysis. Efficiency metrics were readily extracted from recordings of simulated procedures. Aggregating the metrics and dividing the procedure into segments revealed potential insights that could be easily overlooked because the simulator currently does not attempt to aggregate the metrics and only provides data derived from the entire procedure. The data indicate that analysis of simulated angiographic procedures will be a powerful method of assessing performance in interventional radiology.
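The aggregation of per-segment efficiency metrics into fixed and variable costs can be sketched generically; the metric names and unit rates below are hypothetical, since the abstract does not publish a cost table:

```python
def segment_cost(metrics: dict, unit_costs: dict) -> float:
    """Variable cost of one procedure segment: each metric (time, contrast
    volume, tool use) times its unit cost rate."""
    return sum(qty * unit_costs[name] for name, qty in metrics.items())

def procedure_cost(segments: list, unit_costs: dict, fixed_cost: float = 0.0) -> float:
    """Fixed cost plus the variable costs summed over all segments."""
    return fixed_cost + sum(segment_cost(s, unit_costs) for s in segments)
```

Keeping the per-segment terms separate is what allows the comparison between segments and between angiographers that the whole-procedure totals would hide.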
Markerless identification of key events in gait cycle using image flow.
Vishnoi, Nalini; Duric, Zoran; Gerber, Naomi Lynn
2012-01-01
Gait analysis has been an interesting area of research for several decades. In this paper, we propose image-flow-based methods to compute the motion and velocities of different body segments automatically, using a single inexpensive video camera. We then identify and extract different events of the gait cycle (double-support, mid-swing, toe-off and heel-strike) from video images. Experiments were conducted in which four walking subjects were captured from the sagittal plane. Automatic segmentation was performed to isolate the moving body from the background. The head excursion and the shank motion were then computed to identify the key frames corresponding to different events in the gait cycle. Our approach does not require calibrated cameras or special markers to capture movement. We have also compared our method with the Optotrak 3D motion capture system and found our results in good agreement with the Optotrak results. The development of our method has potential use in the markerless and unencumbered video capture of human locomotion. Monitoring gait in homes and communities provides a useful application for the aged and the disabled. Our method could potentially be used as an assessment tool to determine gait symmetry or to establish the normal gait pattern of an individual.
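Identifying gait events from the computed trajectories amounts to locating extrema in a 1-D signal. A minimal sketch, assuming (as one simplified reading of the method) that an event such as heel-strike coincides with a local minimum of the vertical head excursion:

```python
def local_minima(signal):
    """Indices of interior local minima of a 1-D signal, e.g. candidate
    event frames in a per-frame head-excursion trace."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i] < signal[i - 1] and signal[i] < signal[i + 1]]
```

In practice the excursion trace would be smoothed first so that image-flow noise does not produce spurious minima.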
NASA Astrophysics Data System (ADS)
Babic, Z.; Pilipovic, R.; Risojevic, V.; Mirjanic, G.
2016-06-01
Honey bees have a crucial role in pollination across the world. This paper presents a simple, non-invasive system for pollen-bearing honey bee detection in surveillance video obtained at the entrance of a hive. The proposed system can be used as part of a more complex system for tracking and counting honey bees, with remote pollination monitoring as the final goal. The proposed method is executed in real time on embedded systems co-located with a hive. Background subtraction, color segmentation, and morphology methods are used for segmentation of honey bees. Classification into two classes, pollen-bearing honey bees and honey bees without a pollen load, is performed using a nearest mean classifier with a simple descriptor consisting of color variance and eccentricity features. On an in-house data set we achieved a correct classification rate of 88.7% with 50 training images per class. We show that the obtained classification results are not far behind the results of state-of-the-art image classification methods. This favors the proposed method, particularly bearing in mind that real-time video transmission to a remote high-performance computing workstation is still an issue, whereas transferring the extracted parameters of the pollination process is much easier.
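A nearest mean classifier over a two-dimensional descriptor (color variance, eccentricity) is simple enough to sketch directly; the training data here is synthetic:

```python
import numpy as np

def fit_means(X: np.ndarray, y: np.ndarray) -> dict:
    """One mean feature vector per class, e.g. (color variance,
    eccentricity) per detected honey bee."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(means: dict, x: np.ndarray):
    """Assign x to the class whose mean is nearest in Euclidean distance."""
    return min(means, key=lambda c: np.linalg.norm(x - means[c]))
```

The classifier stores only one vector per class, which is why it suits an embedded system at the hive: training and prediction are both a handful of arithmetic operations.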
Segmentation and object-oriented processing of single-season and multi-season Landsat-7 ETM+ data was utilized for the classification of wetlands in a 1560 km2 study area of north central Florida. This segmentation and object-oriented classification outperformed the traditional ...
Automated content and quality assessment of full-motion-video for the generation of meta data
NASA Astrophysics Data System (ADS)
Harguess, Josh
2015-05-01
Virtually all of the video data (and full-motion-video (FMV)) that is currently collected and stored in support of missions has been corrupted to various extents by image acquisition and compression artifacts. Additionally, video collected by wide-area motion imagery (WAMI) surveillance systems and unmanned aerial vehicles (UAVs) and similar sources is often of low quality or in other ways corrupted so that it is not worth storing or analyzing. In order to make progress in the problem of automatic video analysis, the first problem that should be solved is deciding whether the content of the video is even worth analyzing to begin with. We present a work in progress to address three types of scenes which are typically found in real-world data stored in support of Department of Defense (DoD) missions: no or very little motion in the scene, large occlusions in the scene, and fast camera motion. Each of these produce video that is generally not usable to an analyst or automated algorithm for mission support and therefore should be removed or flagged to the user as such. We utilize recent computer vision advances in motion detection and optical flow to automatically assess FMV for the identification and generation of meta-data (or tagging) of video segments which exhibit unwanted scenarios as described above. Results are shown on representative real-world video data.
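A minimal version of the "no or very little motion" check is frame differencing over the sequence; the threshold is an assumed tuning value, and the actual pipeline described above uses more robust motion detection and optical flow:

```python
import numpy as np

def flag_static_pairs(frames: list, motion_thresh: float = 2.0) -> list:
    """True for each consecutive frame pair whose mean absolute difference
    falls below motion_thresh; runs of True would be tagged in the meta-data
    as segments with little or no scene motion."""
    flags = []
    for prev, cur in zip(frames, frames[1:]):
        mad = np.abs(cur.astype(float) - prev.astype(float)).mean()
        flags.append(mad < motion_thresh)
    return flags
```

Analogous per-pair statistics (e.g. dominant optical-flow magnitude) could flag the fast-camera-motion case the same way.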
Unsupervised object segmentation with a hybrid graph model (HGM).
Liu, Guangcan; Lin, Zhouchen; Yu, Yong; Tang, Xiaoou
2010-05-01
In this work, we address the problem of performing class-specific unsupervised object segmentation, i.e., automatic segmentation without annotated training images. Object segmentation can be regarded as a special data clustering problem where both class-specific information and local texture/color similarities have to be considered. To this end, we propose a hybrid graph model (HGM) that can make effective use of both symmetric and asymmetric relationships among samples. The vertices of a hybrid graph represent the samples and are connected by directed edges and/or undirected ones, which represent the asymmetric and/or symmetric relationships between them, respectively. When applied to object segmentation, vertices are superpixels, the asymmetric relationship is the conditional dependence of occurrence, and the symmetric relationship is the color/texture similarity. By combining the Markov chain formed by the directed subgraph and the minimal cut of the undirected subgraph, the object boundaries can be determined for each image. Using the HGM, we can conveniently achieve simultaneous segmentation and recognition by integrating both top-down and bottom-up information into a unified process. Experiments on 42 object classes (9,415 images in total) show promising results.
Mao, Xue Gang; Du, Zi Han; Liu, Jia Qian; Chen, Shu Xin; Hou, Ji Yu
2018-01-01
Traditional field investigation and manual interpretation cannot satisfy the need for forest gap extraction at regional scale. High-spatial-resolution remote sensing imagery makes regional forest gap extraction possible. In this study, we used an object-oriented classification method to segment and classify forest gaps based on QuickBird high-resolution optical remote sensing imagery of the Jiangle National Forestry Farm of Fujian Province. In the object-oriented classification process, 10 scales (10-100, with a step length of 10) were adopted to segment the QuickBird image, and the intersection area of the reference object (RA_or) and the intersection area of the segmented object (RA_os) were used to evaluate the segmentation result at each scale. For the segmentation result at each scale, 16 spectral characteristics and a support vector machine (SVM) classifier were further used to classify forest gaps, non-forest gaps and others. The results showed that the optimal segmentation scale was 40, where RA_or was equal to RA_os. The accuracy difference between the maximum and minimum across segmentation scales was 22%. At the optimal scale, the overall classification accuracy was 88% (Kappa=0.82) with the SVM classifier. Combining high-resolution remote sensing imagery with object-oriented classification could replace traditional field investigation and manual interpretation to identify and classify forest gaps at regional scale.
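The RA_or/RA_os scale-selection criterion can be illustrated with binary masks; the masks and sizes below are made up for demonstration, and the optimal scale is the one where the two indices cross:

```python
import numpy as np

def overlap_indices(ref_mask, seg_mask):
    """RA_or: intersection area as a fraction of the reference object's area.
    RA_os: intersection area as a fraction of the segmented object's area.
    The segmentation scale at which RA_or == RA_os is taken as optimal."""
    inter = np.logical_and(ref_mask, seg_mask).sum()
    return inter / ref_mask.sum(), inter / seg_mask.sum()

ref = np.zeros((20, 20), bool); ref[5:15, 5:15] = True   # 100-px reference gap
seg = np.zeros((20, 20), bool); seg[5:15, 5:20] = True   # 150-px over-grown segment
ra_or, ra_os = overlap_indices(ref, seg)
print(round(ra_or, 2), round(ra_os, 2))  # → 1.0 0.67
```

Here the segment fully covers the reference (RA_or = 1.0) but is too large (RA_os < 1), which at a coarse scale typically signals under-segmentation; sweeping the scale parameter moves the two values toward each other.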
Meghdadi, Amir H; Irani, Pourang
2013-12-01
We propose a novel video visual analytics system for interactive exploration of surveillance video data. Our approach consists of providing analysts with various views of information related to moving objects in a video. To do this we first extract each object's movement path. We visualize each movement by (a) creating a single action shot image (a still image that coalesces multiple frames), (b) plotting its trajectory in a space-time cube and (c) displaying an overall timeline view of all the movements. The action shots provide a still view of the moving object while the path view presents movement properties such as speed and location. We also provide tools for spatial and temporal filtering based on regions of interest. This allows analysts to filter out large amounts of movement activities while the action shot representation summarizes the content of each movement. We incorporated this multi-part visual representation of moving objects in sViSIT, a tool to facilitate browsing through the video content by interactive querying and retrieval of data. Based on our interaction with security personnel who routinely interact with surveillance video data, we identified some of the most common tasks performed. This resulted in designing a user study to measure time-to-completion of the various tasks. These generally required searching for specific events of interest (targets) in videos. Fourteen different tasks were designed and a total of 120 min of surveillance video were recorded (indoor and outdoor locations recording movements of people and vehicles). The time-to-completion of these tasks were compared against a manual fast forward video browsing guided with movement detection. We demonstrate how our system can facilitate lengthy video exploration and significantly reduce browsing time to find events of interest. Reports from expert users identify positive aspects of our approach which we summarize in our recommendations for future video visual analytics systems.
Electrical Arc Ignition Testing of Spacesuit Materials
NASA Technical Reports Server (NTRS)
Smith, Sarah; Gallus, Tim; Tapia, Susana; Ball, Elizabeth; Beeson, Harold
2006-01-01
A viewgraph presentation on electrical arc ignition testing of spacesuit materials is shown. The topics include: 1) Background; 2) Test Objectives; 3) Test Sample Materials; 4) Test Methods; 5) Scratch Test Objectives; 6) Cotton Scratch Test Video; 7) Scratch Test Results; 8) Entire Data Plot; 9) Closeup Data Plot; 10) Scratch Test Problems; 11) Poke Test Objectives; 12) Poke Test Results; 13) Poke Test Problems; 14) Wire-break Test Objectives; 15) Cotton Wire-Break Test Video; 16) High Speed Cotton Wire-break Test Video; 17) Typical Data Plot; 18) Closeup Data Plot; 19) Wire-break Test Results; 20) Wire-break Tests vs. Scratch Tests; 21) Urethane-coated Nylon; and 22) Moleskin.
Surgical gesture classification from video and kinematic data.
Zappella, Luca; Béjar, Benjamín; Hager, Gregory; Vidal, René
2013-10-01
Much of the existing work on automatic classification of gestures and skill in robotic surgery is based on dynamic cues (e.g., time to completion, speed, forces, torque) or kinematic data (e.g., robot trajectories and velocities). While videos could be equally or more discriminative (e.g., videos contain semantic information not present in kinematic data), they are typically not used because of the difficulties associated with automatic video interpretation. In this paper, we propose several methods for automatic surgical gesture classification from video data. We assume that the video of a surgical task (e.g., suturing) has been segmented into video clips corresponding to a single gesture (e.g., grabbing the needle, passing the needle) and propose three methods to classify the gesture of each video clip. In the first one, we model each video clip as the output of a linear dynamical system (LDS) and use metrics in the space of LDSs to classify new video clips. In the second one, we use spatio-temporal features extracted from each video clip to learn a dictionary of spatio-temporal words, and use a bag-of-features (BoF) approach to classify new video clips. In the third one, we use multiple kernel learning (MKL) to combine the LDS and BoF approaches. Since the LDS approach is also applicable to kinematic data, we also use MKL to combine both types of data in order to exploit their complementarity. Our experiments on a typical surgical training setup show that methods based on video data perform equally well, if not better, than state-of-the-art approaches based on kinematic data. In turn, the combination of both kinematic and video data outperforms any other algorithm based on one type of data alone. Copyright © 2013 Elsevier B.V. All rights reserved.
A Multi-modal, Discriminative and Spatially Invariant CNN for RGB-D Object Labeling.
Asif, Umar; Bennamoun, Mohammed; Sohel, Ferdous
2017-08-30
While deep convolutional neural networks have shown a remarkable success in image classification, the problems of inter-class similarities, intra-class variances, the effective combination of multimodal data, and the spatial variability in images of objects remain to be major challenges. To address these problems, this paper proposes a novel framework to learn a discriminative and spatially invariant classification model for object and indoor scene recognition using multimodal RGB-D imagery. This is achieved through three postulates: 1) spatial invariance - this is achieved by combining a spatial transformer network with a deep convolutional neural network to learn features which are invariant to spatial translations, rotations, and scale changes, 2) high discriminative capability - this is achieved by introducing Fisher encoding within the CNN architecture to learn features which have small inter-class similarities and large intra-class compactness, and 3) multimodal hierarchical fusion - this is achieved through the regularization of semantic segmentation to a multi-modal CNN architecture, where class probabilities are estimated at different hierarchical levels (i.e., image- and pixel-levels), and fused into a Conditional Random Field (CRF)-based inference hypothesis, the optimization of which produces consistent class labels in RGB-D images. Extensive experimental evaluations on RGB-D object and scene datasets, and live video streams (acquired from Kinect) show that our framework produces superior object and scene classification results compared to the state-of-the-art methods.
Lennarson, P J; Smith, D W; Sawin, P D; Todd, M M; Sato, Y; Traynelis, V C
2001-04-01
The purpose of this study was to characterize and compare segmental cervical motion during orotracheal intubation in cadavers with and without a complete subaxial injury, as well as to examine the efficacy of commonly used stabilization techniques in limiting that motion. Intubation procedures were performed in 10 fresh human cadavers in which cervical spines were intact and following the creation of a complete C4-5 ligamentous injury. Movement of the cervical spine during direct laryngoscopy and intubation was recorded using video fluoroscopy and examined under the following conditions: 1) without stabilization; 2) with manual in-line cervical immobilization; and 3) with Gardner-Wells traction. Subsequently, segmental angular rotation, subluxation, and distraction at the injured C4-5 level were measured from digitized frames of the recorded video fluoroscopy. After complete C4-5 destabilization, the effects of attempted stabilization on distraction, angulation, and subluxation were analyzed. Immobilization effectively eliminated distraction, and diminished angulation, but increased subluxation. Traction significantly increased distraction, but decreased angular rotation and effectively eliminated subluxation. Orotracheal intubation without stabilization had intermediate results, causing less distraction than traction, less subluxation than immobilization, but increased angulation compared with either intervention. These results are discussed in terms of both statistical and clinical significance and recommendations are made.
System and process for detecting and monitoring surface defects
NASA Technical Reports Server (NTRS)
Mueller, Mark K. (Inventor)
1994-01-01
A system and process for detecting and monitoring defects in large surfaces such as the field joints of the container segments of a space shuttle booster motor. Beams of semi-collimated light from three non-parallel fiber optic light panels are directed at a region of the surface at non-normal angles of expected incidence. A video camera gathers some portion of the light that is reflected at an angle other than the angle of expected reflectance, and generates signals which are analyzed to discern defects in the surface. The analysis may be performed by visual inspection of an image on a video monitor, or by inspection of filtered or otherwise processed images. In one alternative embodiment, successive predetermined regions of the surface are aligned with the light source before illumination, thereby permitting efficient detection of defects in a large surface. Such alignment is performed by using a line scan gauge to sense the light which passes through an aperture in the surface. In another embodiment a digital map of the surface is created, thereby permitting the maintenance of records detailing changes in the location or size of defects as the container segment is refurbished and re-used. The defect detection apparatus may also be advantageously mounted on a fixture which engages the edge of a container segment.
Probabilistic fusion of stereo with color and contrast for bilayer segmentation.
Kolmogorov, Vladimir; Criminisi, Antonio; Blake, Andrew; Cross, Geoffrey; Rother, Carsten
2006-09-01
This paper describes models and algorithms for the real-time segmentation of foreground from background layers in stereo video sequences. Automatic separation of layers from color/contrast or from stereo alone is known to be error-prone. Here, color, contrast, and stereo matching information are fused to infer layers accurately and efficiently. The first algorithm, Layered Dynamic Programming (LDP), solves stereo in an extended six-state space that represents both foreground/background layers and occluded regions. The stereo-match likelihood is then fused with a contrast-sensitive color model that is learned on-the-fly and stereo disparities are obtained by dynamic programming. The second algorithm, Layered Graph Cut (LGC), does not directly solve stereo. Instead, the stereo match likelihood is marginalized over disparities to evaluate foreground and background hypotheses and then fused with a contrast-sensitive color model like the one used in LDP. Segmentation is solved efficiently by ternary graph cut. Both algorithms are evaluated with respect to ground truth data and found to have similar performance, substantially better than either stereo or color/contrast alone. However, their characteristics with respect to computational efficiency are rather different. The algorithms are demonstrated in the application of background substitution and shown to give good quality composite video output.
The effects of video game playing on attention, memory, and executive control.
Boot, Walter R; Kramer, Arthur F; Simons, Daniel J; Fabiani, Monica; Gratton, Gabriele
2008-11-01
Expert video game players often outperform non-players on measures of basic attention and performance. Such differences might result from exposure to video games or they might reflect other group differences between those people who do or do not play video games. Recent research has suggested a causal relationship between playing action video games and improvements in a variety of visual and attentional skills (e.g., [Green, C. S., & Bavelier, D. (2003). Action video game modifies visual selective attention. Nature, 423, 534-537]). The current research sought to replicate and extend these results by examining both expert/non-gamer differences and the effects of video game playing on tasks tapping a wider range of cognitive abilities, including attention, memory, and executive control. Non-gamers played 20+ h of an action video game, a puzzle game, or a real-time strategy game. Expert gamers and non-gamers differed on a number of basic cognitive skills: experts could track objects moving at greater speeds, better detected changes to objects stored in visual short-term memory, switched more quickly from one task to another, and mentally rotated objects more efficiently. Strikingly, extensive video game practice did not substantially enhance performance for non-gamers on most cognitive tasks, although they did improve somewhat in mental rotation performance. Our results suggest that at least some differences between video game experts and non-gamers in basic cognitive performance result either from far more extensive video game experience or from pre-existing group differences in abilities that result in a self-selection effect.
FPGA implementation for real-time background subtraction based on Horprasert model.
Rodriguez-Gomez, Rafael; Fernandez-Sanchez, Enrique J; Diaz, Javier; Ros, Eduardo
2012-01-01
Background subtraction is considered the first processing stage in video surveillance systems, and consists of determining objects in movement in a scene captured by a static camera. It is an intensive task with a high computational cost. This work proposes an embedded novel architecture on FPGA which is able to extract the background on resource-limited environments and offers low degradation (produced because of the hardware-friendly model modification). In addition, the original model is extended in order to detect shadows and improve the quality of the segmentation of the moving objects. We have analyzed the resource consumption and performance in Spartan3 Xilinx FPGAs and compared to others works available on the literature, showing that the current architecture is a good trade-off in terms of accuracy, performance and resources utilization. With less than a 65% of the resources utilization of a XC3SD3400 Spartan-3A low-cost family FPGA, the system achieves a frequency of 66.5 MHz reaching 32.8 fps with resolution 1,024 × 1,024 pixels, and an estimated power consumption of 5.76 W.
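A simplified sketch of the background-subtraction stage; this uses a plain running-average model rather than the Horprasert chromaticity model the paper implements in hardware, and all thresholds are illustrative:

```python
import numpy as np

def subtract_background(frames, alpha=0.05, thresh=25.0):
    """Running-average background subtraction (a simplified stand-in for
    the Horprasert color model the paper adapts to FPGA hardware).
    Returns one boolean foreground mask per frame after the first."""
    bg = frames[0].astype(float)
    masks = []
    for f in frames[1:]:
        f = f.astype(float)
        masks.append(np.abs(f - bg) > thresh)   # pixels far from the model
        bg = (1 - alpha) * bg + alpha * f       # slow background update
    return masks

frames = [np.full((16, 16), 50.0) for _ in range(3)]
frames[2] = frames[2].copy(); frames[2][4:8, 4:8] = 200.0  # moving object
masks = subtract_background(frames)
print(masks[0].sum(), masks[1].sum())  # → 0 16
```

The per-pixel independence of this update rule is what makes the approach amenable to the deeply pipelined FPGA datapath the abstract describes.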
Information recovery through image sequence fusion under wavelet transformation
NASA Astrophysics Data System (ADS)
He, Qiang
2010-04-01
Remote sensing is widely applied to provide information about areas with limited ground access, with applications such as assessing the destruction from natural disasters and planning relief and recovery operations. However, the collection of aerial digital images is constrained by bad weather, atmospheric conditions, and unstable cameras or camcorders. Therefore, how to recover information from low-quality remote sensing images and how to enhance image quality become very important for many visual understanding tasks, such as feature detection, object segmentation, and object recognition. The quality of remote sensing imagery can be improved through meaningful combination of the employed images, captured from different sensors or under different conditions, through information fusion. Here we particularly address information fusion applied to remote sensing images under multi-resolution analysis of the employed image sequences. Image fusion recovers complete information by integrating multiple images captured from the same scene. Through image fusion, a new image that is high-resolution or more perceptually informative for humans and machines is created from a time series of low-quality images, based on image registration between different video frames.
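A minimal wavelet-domain fusion sketch using a one-level Haar transform: approximation coefficients are averaged and the larger-magnitude detail coefficient is kept, a common fusion rule (the paper's exact wavelet and rule may differ):

```python
import numpy as np

def _fwd(x):
    """One Haar analysis step along the last axis: averages and differences."""
    return (x[..., ::2] + x[..., 1::2]) / 2.0, (x[..., ::2] - x[..., 1::2]) / 2.0

def _inv(s, d):
    """Inverse of _fwd: interleave s+d and s-d back along the original axis."""
    x = np.empty(s.shape[:-1] + (2 * s.shape[-1],))
    x[..., ::2], x[..., 1::2] = s + d, s - d
    return x

def haar2(a):
    """One-level 2-D Haar transform: returns (LL, LH, HL, HH) subbands."""
    lo, hi = _fwd(a)
    ll, lh = _fwd(lo.T)
    hl, hh = _fwd(hi.T)
    return ll.T, lh.T, hl.T, hh.T

def ihaar2(ll, lh, hl, hh):
    """Inverse one-level 2-D Haar transform."""
    lo = _inv(ll.T, lh.T).T
    hi = _inv(hl.T, hh.T).T
    return _inv(lo, hi)

def fuse(a, b):
    """Fuse two registered images: average the approximation band,
    keep the larger-magnitude detail coefficient from either image."""
    A, B = haar2(a), haar2(b)
    ll = (A[0] + B[0]) / 2.0
    details = [np.where(np.abs(x) >= np.abs(y), x, y) for x, y in zip(A[1:], B[1:])]
    return ihaar2(ll, *details)

a = np.zeros((8, 8)); a[:4] = 100.0   # image with a sharp horizontal edge
b = np.full((8, 8), 50.0)             # flat, low-contrast image
f = fuse(a, b)
print(f.shape, float(f.mean()))  # → (8, 8) 50.0
```

In practice the input frames must first be registered to a common grid, as the abstract notes; the max-abs detail rule then preserves the sharpest structure available in any frame.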
NASA Astrophysics Data System (ADS)
Butell, Bart
1996-02-01
Microsoft's Visual Basic (VB) and Borland's Delphi provide an extremely robust programming environment for delivering multimedia solutions for interactive kiosks, games and titles. Their object oriented use of standard and custom controls enable a user to build extremely powerful applications. A multipurpose, database enabled programming environment that can provide an event driven interface functions as a multimedia kernel. This kernel can provide a variety of authoring solutions (e.g. a timeline based model similar to Macromedia Director or a node authoring model similar to Icon Author). At the heart of the kernel is a set of low level multimedia components providing object oriented interfaces for graphics, audio, video and imaging. Data preparation tools (e.g., layout, palette and Sprite Editors) could be built to manage the media database. The flexible interface for VB allows the construction of an infinite number of user models. The proliferation of these models within a popular, easy to use environment will allow the vast developer segment of 'producer' types to bring their ideas to the market. This is the key to building exciting, content rich multimedia solutions. Microsoft's VB and Borland's Delphi environments combined with multimedia components enable these possibilities.
Single-incision video-assisted thoracoscopic surgery left-lower lobe anterior segmentectomy (S8)
Galvez, Carlos; Lirio, Francisco; Sesma, Julio; Baschwitz, Benno; Bolufer, Sergio
2017-01-01
Unusual anatomical segmentectomies are technically demanding procedures that require a deep knowledge of intralobar anatomy and surgical skill. On the other hand, these procedures preserve more normal lung parenchyma for lesions located in specific anatomical segments, and are indicated for benign lesions, metastases and also early-stage adenocarcinomas without nodal involvement. A 32-year-old woman was diagnosed with a benign pneumocytoma in the anterior segment of the left-lower lobe (S8, LLL), so we performed a single-incision video-assisted thoracoscopic surgery (SI-VATS) anatomical S8 segmentectomy in 140 minutes under intercostal block. There were no intraoperative or postoperative complications, the chest tube was removed at 24 hours and the patient was discharged on the 5th postoperative day with low pain on the visual analogue scale (VAS). The final pathologic exam reported a benign sclerosant pneumocytoma with free margins. The patient has completely recovered her normal activities at 3 months, with normal radiological controls at 1 and 3 months. PMID:29078674
Shojaedini, Seyed Vahab; Heydari, Masoud
2014-10-01
Shape and movement features of sperms are important parameters for infertility study and treatment. In this article, a new method is introduced for characterizing sperms in microscopic videos. In this method, first a hypothesis framework is defined to distinguish sperms from other particles in the captured video. Then a decision about each hypothesis is made in the following steps: selecting some primary regions as candidates for sperms by watershed-based segmentation, pruning false candidates across successive frames using graph theory concepts, and finally confirming correct sperms by using their movement trajectories. Performance of the proposed method is evaluated on real captured images belonging to semen with a high density of sperms. The obtained results show the proposed method may detect 97% of sperms in the presence of 5% false detections and track 91% of moving sperms. Furthermore, it can be shown that the better characterization of sperms in the proposed algorithm doesn't lead to extracting more false sperms compared to some existing approaches.
Clayman, Marla L.; Makoul, Gregory; Harper, Maya M.; Koby, Danielle G.; Williams, Adam R.
2012-01-01
Objectives: Describe the development and refinement of a scheme, Detail of Essential Elements and Participants in Shared Decision Making (DEEP-SDM), for coding Shared Decision Making (SDM) while reporting on the characteristics of decisions in a sample of patients with metastatic breast cancer. Methods: The Evidence-Based Patient Choice instrument was modified to reflect Makoul and Clayman's Integrative Model of SDM. Coding was conducted on video recordings of 20 women at the first visit with their medical oncologists after suspicion of disease progression. Noldus Observer XT v.8, a video coding software platform, was used for coding. Results: The sample contained 80 decisions (range: 1-11), divided into 150 decision making segments. Most decisions were physician-led, although patients and physicians initiated similar numbers of decision-making conversations. Conclusion: DEEP-SDM facilitates content analysis of encounters between women with metastatic breast cancer and their medical oncologists. Despite the fractured nature of decision making, it is possible to identify decision points and to code each of the Essential Elements of Shared Decision Making. Further work should include application of DEEP-SDM to non-cancer encounters. Practice Implications: A better understanding of how decisions unfold in the medical encounter can help inform the relationship of SDM to patient-reported outcomes. PMID:22784391
Integration of Research Into Science-outreach (IRIS): A Video and Web-based Approach
NASA Astrophysics Data System (ADS)
Clay, P. L.; O'Driscoll, B.
2013-12-01
The development of the IRIS (Integration of Research Into Science-outreach) initiative is aimed at using field- and laboratory-based videos and blog entries to enable a sustained outreach relationship between university researchers and local classrooms. IRIS seeks to communicate complex, cutting-edge scientific research in the Earth and Planetary sciences to school-aged children in a simple and interesting manner, in the hope of ameliorating the overall decline of children entering into science and engineering fields in future generations. The primary method of delivery IRIS utilizes is the media of film, 'webinars' and blog entries. Filmed sequences of laboratory work, field work, science demos and mini webinars on current and relevant material in the Earth and Planetary sciences are 'subscribed' to by local schools. Selected sequences are delivered in 20-30 minute film segments with accompanying written material. The level at which the subject matter is currently geared is towards secondary level school-aged children, with the purpose of inspiring and encouraging curiosity, learning and development in scientific research. The video broadcasts are supplemented by a hands-on visit 1-2 times per year by a group of scientists participating in the filmed sequences to the subscribing class, with the objective of engaging and establishing a natural rapport between the class and the scientists that they see in the broadcasts. This transgresses boundaries that traditional 'one off' outreach platforms often aren't able to achieve. The initial results of the IRIS outreach initiative including successes, problems encountered and classroom feedback will be reported.
Information processing of motion in facial expression and the geometry of dynamical systems
NASA Astrophysics Data System (ADS)
Assadi, Amir H.; Eghbalnia, Hamid; McMenamin, Brenton W.
2005-01-01
An interesting problem in analysis of video data concerns design of algorithms that detect perceptually significant features in an unsupervised manner, for instance methods of machine learning for automatic classification of human expression. A geometric formulation of this genre of problems could be modeled with help of perceptual psychology. In this article, we outline one approach for a special case where video segments are to be classified according to expression of emotion or other similar facial motions. The encoding of realistic facial motions that convey expression of emotions for a particular person P forms a parameter space XP whose study reveals the "objective geometry" for the problem of unsupervised feature detection from video. The geometric features and discrete representation of the space XP are independent of subjective evaluations by observers. While the "subjective geometry" of XP varies from observer to observer, levels of sensitivity and variation in perception of facial expressions appear to share a certain level of universality among members of similar cultures. Therefore, statistical geometry of invariants of XP for a sample of population could provide effective algorithms for extraction of such features. In cases where frequency of events is sufficiently large in the sample data, a suitable framework could be provided to facilitate the information-theoretic organization and study of statistical invariants of such features. This article provides a general approach to encode motion in terms of a particular genre of dynamical systems and the geometry of their flow. An example is provided to illustrate the general theory.
Video library for video imaging detection at intersection stop lines.
DOT National Transportation Integrated Search
2010-04-01
The objective of this activity was to record video that could be used for controlled : evaluation of video image vehicle detection system (VIVDS) products and software upgrades to : existing products based on a list of conditions that might be diffic...
Assessment of Multiresolution Segmentation for Extracting Greenhouses from WORLDVIEW-2 Imagery
NASA Astrophysics Data System (ADS)
Aguilar, M. A.; Aguilar, F. J.; García Lorca, A.; Guirado, E.; Betlej, M.; Cichon, P.; Nemmaoui, A.; Vallario, A.; Parente, C.
2016-06-01
The latest breed of very high resolution (VHR) commercial satellites opens new possibilities for cartographic and remote sensing applications. In this context, the object based image analysis (OBIA) approach has proved to be the best option when working with VHR satellite imagery. OBIA considers spectral, geometric, textural and topological attributes associated with meaningful image objects. Thus, the first step of OBIA, referred to as segmentation, is to delineate objects of interest. Determination of an optimal segmentation is crucial for a good performance of the second stage in OBIA, the classification process. The main goal of this work is to assess the multiresolution segmentation algorithm provided by eCognition software for delineating greenhouses from WorldView-2 multispectral orthoimages. Specifically, the focus is on finding the optimal parameters of the multiresolution segmentation approach (i.e., Scale, Shape and Compactness) for plastic greenhouses. The optimum Scale parameter estimation was based on the idea of local variance of object heterogeneity within a scene (ESP2 tool). Moreover, different segmentation results were attained by using different combinations of Shape and Compactness values. Assessment of segmentation quality based on the discrepancy between reference polygons and corresponding image segments was carried out to identify the optimal setting of multiresolution segmentation parameters. Three discrepancy indices were used: Potential Segmentation Error (PSE), Number-of-Segments Ratio (NSR) and Euclidean Distance 2 (ED2).
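The three discrepancy indices can be computed as in the following sketch; the formulas follow their common definitions (PSE as over-segmented area over total reference area, NSR as relative object-count mismatch, ED2 as their Euclidean combination), and the input numbers are illustrative:

```python
import math

def discrepancy_indices(over_seg_area, total_ref_area, n_ref, n_seg):
    """Segmentation-quality indices (common definitions; the paper's
    exact formulations may differ slightly):
    PSE -- area of segments falling outside their reference polygons,
           normalized by total reference area,
    NSR -- mismatch between reference and segment counts,
    ED2 -- Euclidean combination of the two."""
    pse = over_seg_area / total_ref_area
    nsr = abs(n_ref - n_seg) / n_ref
    return pse, nsr, math.hypot(pse, nsr)

pse, nsr, e = discrepancy_indices(over_seg_area=120.0, total_ref_area=1000.0,
                                  n_ref=50, n_seg=56)
print(round(pse, 3), round(nsr, 3), round(e, 3))  # → 0.12 0.12 0.17
```

A low ED2 means the candidate Scale/Shape/Compactness setting produces segments that match the reference greenhouses both geometrically and in number, which is the selection criterion the abstract describes.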
An optimized video system for augmented reality in endodontics: a feasibility study.
Bruellmann, D D; Tjaden, H; Schwanecke, U; Barth, P
2013-03-01
We propose an augmented reality system for the reliable detection of root canals in video sequences based on k-nearest neighbor color classification, and introduce a simple geometric criterion for teeth. The new software was implemented using C++, Qt, and the image processing library OpenCV. Teeth are detected in video images to restrict the segmentation of the root canal orifices, using a k-nearest neighbor algorithm. The locations of the root canal orifices were then determined using Euclidean distance-based image segmentation. A set of 126 human teeth with known and verified locations of the root canal orifices was used for evaluation. The software detects root canal orifices for automatic classification of the teeth in video images and stores the location and size of the found structures. Overall, 287 of 305 root canals were correctly detected, for an overall sensitivity of about 94%. Classification accuracy ranged from 65.0 to 81.2% for molars and from 85.7 to 96.7% for premolars. The realized software shows that observations made in anatomical studies can be exploited to automate real-time detection of root canal orifices and tooth classification. Automatic storage of the location, size, and orientation of the found structures can be used for future anatomical studies. Thus, statistical tables of canal locations can be derived, which can improve anatomical knowledge of the teeth to alleviate root canal detection in the future. For this purpose the software is freely available at: http://www.dental-imaging.zahnmedizin.uni-mainz.de/.
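The paper's implementation uses OpenCV in C++; as a minimal, library-free sketch of the underlying idea, the per-pixel k-nearest-neighbor color vote can be written as follows. The training colors and labels here are made up for illustration:

```python
import numpy as np

def knn_classify(pixels, train_colors, train_labels, k=3):
    """Label each RGB pixel by majority vote among its k nearest
    training colors (Euclidean distance in color space)."""
    pixels = np.asarray(pixels, dtype=float)
    train_colors = np.asarray(train_colors, dtype=float)
    train_labels = np.asarray(train_labels)
    # pairwise distances: (n_pixels, n_train)
    d = np.linalg.norm(pixels[:, None, :] - train_colors[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]   # indices of the k closest samples
    votes = train_labels[nearest]            # (n_pixels, k) label matrix
    return np.array([np.bincount(v).argmax() for v in votes])

# Hypothetical training set: label 1 = tooth-colored, 0 = dark background.
train = [[250, 245, 230], [240, 235, 215], [20, 30, 25], [10, 15, 10]]
labels = [1, 1, 0, 0]
pred = knn_classify([[245, 240, 220], [15, 20, 18]], train, labels, k=3)
# → pred is [1, 0]: the bright pixel is classed as tooth, the dark one as background
```

In the actual system such a classifier restricts where the subsequent orifice segmentation is applied.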
Survey statistics of automated segmentations applied to optical imaging of mammalian cells.
Bajcsy, Peter; Cardone, Antonio; Chalfoun, Joe; Halter, Michael; Juba, Derek; Kociolek, Marcin; Majurski, Michael; Peskin, Adele; Simon, Carl; Simon, Mylene; Vandecreme, Antoine; Brady, Mary
2015-10-15
The goal of this survey paper is to overview cellular measurements using optical microscopy imaging followed by automated image segmentation. The cellular measurements of primary interest are taken from mammalian cells and their components. They are denoted as two- or three-dimensional (2D or 3D) image objects of biological interest. In our applications, such cellular measurements are important for understanding cell phenomena, such as cell counts, cell-scaffold interactions, cell colony growth rates, or cell pluripotency stability, as well as for establishing quality metrics for stem cell therapies. In this context, this survey paper is focused on automated segmentation as a software-based measurement leading to quantitative cellular measurements. We define the scope of this survey and a classification schema first. Next, all retrieved and manually filtered publications are classified according to the main categories: (1) objects of interest (or objects to be segmented), (2) imaging modalities, (3) digital data axes, (4) segmentation algorithms, (5) segmentation evaluations, (6) computational hardware platforms used for segmentation acceleration, and (7) object (cellular) measurements. Finally, all classified papers are converted programmatically into a set of hyperlinked web pages with occurrence and co-occurrence statistics of assigned categories. The survey paper presents to the reader: (a) a state-of-the-art overview of published papers about automated segmentation applied to optical microscopy imaging of mammalian cells, (b) a classification of segmentation aspects in the context of cell optical imaging, (c) histogram and co-occurrence summary statistics about cellular measurements, segmentations, segmented objects, segmentation evaluations, and the use of computational platforms for accelerating segmentation execution, and (d) open research problems to pursue.
The novel contributions of this survey paper are: (1) a new type of classification of cellular measurements and automated segmentation, (2) statistics about the published literature, and (3) a web hyperlinked interface to classification statistics of the surveyed papers at https://isg.nist.gov/deepzoomweb/resources/survey/index.html.
Joint multi-object registration and segmentation of left and right cardiac ventricles in 4D cine MRI
NASA Astrophysics Data System (ADS)
Ehrhardt, Jan; Kepp, Timo; Schmidt-Richberg, Alexander; Handels, Heinz
2014-03-01
The diagnosis of cardiac function based on cine MRI requires the segmentation of cardiac structures in the images, but the problem of automatic cardiac segmentation is still open, due to the imaging characteristics of cardiac MR images and the anatomical variability of the heart. In this paper, we present a variational framework for joint segmentation and registration of multiple structures of the heart. To enable the simultaneous segmentation and registration of multiple objects, a shape prior term is introduced into a region competition approach for multi-object level set segmentation. The proposed algorithm is applied for simultaneous segmentation of the myocardium as well as the left and right ventricular blood pool in short axis cine MRI images. Two experiments are performed: first, intra-patient 4D segmentation with a given initial segmentation for one time-point in a 4D sequence, and second, a multi-atlas segmentation strategy is applied to unseen patient data. Evaluation of segmentation accuracy is done by overlap coefficients and surface distances. An evaluation based on clinical 4D cine MRI images of 25 patients shows the benefit of the combined approach compared to sole registration and sole segmentation.
Annotations of Mexican bullfighting videos for semantic index
NASA Astrophysics Data System (ADS)
Montoya Obeso, Abraham; Oropesa Morales, Lester Arturo; Fernando Vázquez, Luis; Cocolán Almeda, Sara Ivonne; Stoian, Andrei; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Montiel Perez, Jesús Yalja; de la O Torres, Saul; Ramírez Acosta, Alejandro Alvaro
2015-09-01
Video annotation is important for web indexing and browsing systems. Indeed, in order to evaluate the performance of video query and mining techniques, databases with concept annotations are required. It is therefore necessary to generate a database with a semantic indexing that represents the digital content of the Mexican bullfighting atmosphere. This paper proposes a scheme for making complex annotations in a video, in the frame of a multimedia search engine project. Each video is partitioned using our segmentation algorithm, which creates shots of different lengths and different numbers of frames. To make complex annotations of the video, we use the ELAN software. The annotations are done in two steps: first, we take notes on the whole content of each shot; second, we describe the actions using camera parameters such as direction, position and depth. As a consequence, we obtain a more complete descriptor of every action. In both cases we use the concepts of the TRECVid 2014 dataset, and we also propose new concepts. This methodology makes it possible to generate a database with the information necessary to create descriptors and algorithms capable of detecting actions, in order to automatically index and classify new bullfighting multimedia content.
Using learning analytics to evaluate a video-based lecture series.
Lau, K H Vincent; Farooque, Pue; Leydon, Gary; Schwartz, Michael L; Sadler, R Mark; Moeller, Jeremy J
2018-01-01
The video-based lecture (VBL), an important component of the flipped classroom (FC) and massive open online course (MOOC) approaches to medical education, has primarily been evaluated through direct learner feedback. Evaluation may be enhanced through learner analytics (LA) - analysis of quantitative audience usage data generated by video-sharing platforms. We applied LA to an experimental series of ten VBLs on electroencephalography (EEG) interpretation, uploaded to YouTube in the model of a publicly accessible MOOC. Trends in view count, total percentage of video viewed, and audience retention (AR) (the percentage of viewers watching at a time point relative to the initial total) were examined. The pattern of average AR decline was characterized using regression analysis, revealing a uniform linear decline in viewership for each video, with no evidence of an optimal VBL length. Segments with transient increases in AR corresponded to those focused on core concepts, indicative of content requiring more detailed evaluation. We propose a model for applying LA at four levels: global, series, video, and feedback. LA may be a useful tool for evaluating a VBL series. Our proposed model combines analytics data and learner self-report for comprehensive evaluation.
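The "uniform linear decline" finding amounts to fitting a straight line to each video's audience-retention curve. A minimal sketch, using synthetic retention values (not the study's data):

```python
import numpy as np

# Synthetic audience-retention curve: percentage of the initial audience
# still watching at each minute mark (illustrative values only).
minutes = np.arange(10)
retention = np.array([100, 93, 87, 80, 74, 67, 61, 54, 48, 41], dtype=float)

# Fit retention = slope * minute + intercept; a roughly constant negative
# slope with near-perfect linear correlation is the "uniform linear
# decline" pattern the study describes.
slope, intercept = np.polyfit(minutes, retention, 1)
r = np.corrcoef(minutes, retention)[0, 1]
```

Transient deviations above this fitted line would then flag segments (here, hypothetical core-concept moments) that retain viewers better than the baseline trend.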
A generic flexible and robust approach for intelligent real-time video-surveillance systems
NASA Astrophysics Data System (ADS)
Desurmont, Xavier; Delaigle, Jean-Francois; Bastide, Arnaud; Macq, Benoit
2004-05-01
In this article we present a generic, flexible and robust approach for an intelligent real-time video-surveillance system. A previous version of the system was presented in [1]. The goal of these advanced tools is to help operators by detecting events of interest in visual scenes, highlighting alarms and computing statistics. The proposed system is a multi-camera platform able to handle different standards of video input (composite, IP, IEEE 1394), which can compress (MPEG-4), store and display them. This platform also integrates advanced video analysis tools, such as motion detection, segmentation, tracking and interpretation. The architecture is optimized to play back, display, and process video flows efficiently for video-surveillance applications. The implementation is distributed on a scalable computer cluster based on Linux and an IP network, and relies on POSIX threads for multitasking scheduling. Data flows are transmitted between the different modules using multicast technology, under the control of a TCP-based command network (e.g. for bandwidth occupation control). We report some results and show the potential use of such a flexible system in third-generation video surveillance systems. We illustrate the interest of the system in a real case study, indoor surveillance.
SAFE: Stopping AIDS through Functional Education.
ERIC Educational Resources Information Center
Hylton, Judith
This functional curriculum is intended to teach people with developmental disabilities or other learning problems how to prevent infection with HIV/AIDS (Human Immunodeficiency Virus/Acquired Immune Deficiency Syndrome). The entire curriculum includes six video segments, four illustrated brochures, 28 slides and illustrations, as well as a guide…
ERIC Educational Resources Information Center
Rubin, Joan; And Others
This set of materials includes an interactive videotape and textbook program (tape not included here) for high-beginning and intermediate English-as-a-Second-Language (ESL) students in or about to enter the workplace. The materials provide instruction in communication skills essential for job success. The 10 video segments and corresponding student…
Science, Mathematics, and the Mimi.
ERIC Educational Resources Information Center
Doblmeier, Joyce; Fields, Barbara
1996-01-01
Students with difficulty in maintaining grade-level performance at the Model Secondary School for the Deaf (Washington, DC) are learning mathematics and science skills using "The Voyage of the Mimi," a 13-segment video series and associated educational materials that detail a scientific expedition which is studying humpback whales. Team…
The effect of input data transformations on object-based image analysis
LIPPITT, CHRISTOPHER D.; COULTER, LLOYD L.; FREEMAN, MARY; LAMANTIA-BISHOP, JEFFREY; PANG, WYSON; STOW, DOUGLAS A.
2011-01-01
The effect of using spectral transform images as input data on segmentation quality and its potential effect on products generated by object-based image analysis are explored in the context of land cover classification in Accra, Ghana. Five image data transformations are compared to untransformed spectral bands in terms of their effect on segmentation quality and final product accuracy. The relationship between segmentation quality and product accuracy is also briefly explored. Results suggest that input data transformations can aid in the delineation of landscape objects by image segmentation, but the effect is idiosyncratic to the transformation and object of interest. PMID:21673829
A wavelet-based Bayesian framework for 3D object segmentation in microscopy
NASA Astrophysics Data System (ADS)
Pan, Kangyu; Corrigan, David; Hillebrand, Jens; Ramaswami, Mani; Kokaram, Anil
2012-03-01
In confocal microscopy, target objects are labeled with fluorescent markers in the living specimen, and usually appear with irregular brightness in the observed images. Also, due to the existence of out-of-focus objects in the image, the segmentation of 3-D objects in the stack of image slices captured at different depth levels of the specimen is still heavily relied on manual analysis. In this paper, a novel Bayesian model is proposed for segmenting 3-D synaptic objects from given image stack. In order to solve the irregular brightness and out-offocus problems, the segmentation model employs a likelihood using the luminance-invariant 'wavelet features' of image objects in the dual-tree complex wavelet domain as well as a likelihood based on the vertical intensity profile of the image stack in 3-D. Furthermore, a smoothness 'frame' prior based on the a priori knowledge of the connections of the synapses is introduced to the model for enhancing the connectivity of the synapses. As a result, our model can successfully segment the in-focus target synaptic object from a 3D image stack with irregular brightness.
Teasing Apart Complex Motions using VideoPoint
NASA Astrophysics Data System (ADS)
Fischer, Mark
2002-10-01
Using video analysis software such as VideoPoint, it is possible to explore the physics of any phenomenon that can be captured on videotape. The good news is that complex motions can be filmed and analyzed. The bad news is that the motions can become very complex very quickly. As an example of such a complicated motion, the 2-dimensional motion of an object filmed by a camera that is itself moving and rotating in the same plane will be discussed. Methods for extracting the desired object motion will be given, as well as suggestions for shooting more easily analyzable video clips.
Enumeration versus multiple object tracking: the case of action video game players
Green, C.S.; Bavelier, D.
2010-01-01
Here, we demonstrate that action video game play enhances subjects’ ability in two tasks thought to indicate the number of items that can be apprehended. Using an enumeration task, in which participants have to determine the number of quickly flashed squares, accuracy measures showed a near ceiling performance for low numerosities and a sharp drop in performance once a critical number of squares was reached. Importantly, this critical number was higher by about two items in video game players (VGPs) than in non-video game players (NVGPs). A following control study indicated that this improvement was not due to an enhanced ability to instantly apprehend the numerosity of the display, a process known as subitizing, but rather due to an enhancement in the slower more serial process of counting. To confirm that video game play facilitates the processing of multiple objects at once, we compared VGPs and NVGPs on the multiple object tracking task (MOT), which requires the allocation of attention to several items over time. VGPs were able to successfully track approximately two more items than NVGPs. Furthermore, NVGPs trained on an action video game established the causal effect of game playing in the enhanced performance on the two tasks. Together, these studies confirm the view that playing action video games enhances the number of objects that can be apprehended and suggest that this enhancement is mediated by changes in visual short-term memory skills. PMID:16359652
Song, Qi; Chen, Mingqing; Bai, Junjie; Sonka, Milan; Wu, Xiaodong
2011-01-01
Multi-object segmentation with mutual interaction is a challenging task in medical image analysis. We report a novel solution to a segmentation problem, in which target objects of arbitrary shape mutually interact with terrain-like surfaces, which widely exists in the medical imaging field. The approach incorporates context information used during simultaneous segmentation of multiple objects. The object-surface interaction information is encoded by adding weighted inter-graph arcs to our graph model. A globally optimal solution is achieved by solving a single maximum flow problem in a low-order polynomial time. The performance of the method was evaluated in robust delineation of lung tumors in megavoltage cone-beam CT images in comparison with an expert-defined independent standard. The evaluation showed that our method generated highly accurate tumor segmentations. Compared with the conventional graph-cut method, our new approach provided significantly better results (p < 0.001). The Dice coefficient obtained by the conventional graph-cut approach (0.76 +/- 0.10) was improved to 0.84 +/- 0.05 when employing our new method for pulmonary tumor segmentation.
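The Dice coefficient used to compare the two methods is a standard overlap measure between a computed segmentation and an expert-defined reference. A minimal sketch on toy binary masks (not the paper's data):

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|); 1.0 means perfect overlap."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

seg = np.zeros((8, 8), bool); seg[2:6, 2:6] = True  # 16-pixel square
ref = np.zeros((8, 8), bool); ref[3:7, 3:7] = True  # shifted square, 9-pixel overlap
d = dice(seg, ref)  # → 2 * 9 / 32 = 0.5625
```

On this scale, the reported improvement from 0.76 ± 0.10 to 0.84 ± 0.05 corresponds to substantially tighter agreement with the expert standard.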
Multi person detection and tracking based on hierarchical level-set method
NASA Astrophysics Data System (ADS)
Khraief, Chadia; Benzarti, Faouzi; Amiri, Hamid
2018-04-01
In this paper, we propose an efficient unsupervised method for multi-person tracking based on a hierarchical level-set approach. The proposed method uses both edge and region information in order to detect objects effectively. The persons are tracked in each frame of the sequence by minimizing an energy functional that combines color, texture and shape information; these features are encoded in a covariance matrix used as a region descriptor. The method is fully automated, without the need to manually specify the initial level-set contour, as it builds on combined person detection and background subtraction. Edge-based information is employed to maintain a stable evolution, guide the segmentation towards apparent boundaries and inhibit region fusion. The computational cost of the level set is reduced by using the narrow-band technique. Experiments on challenging video sequences show the effectiveness of the proposed method.
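The covariance region descriptor mentioned above is simply the covariance matrix of per-pixel feature vectors gathered over a region. A minimal sketch with made-up color-only features (the paper's descriptor also includes texture and shape channels):

```python
import numpy as np

def region_covariance(features):
    """Covariance descriptor of a region: `features` is (n_pixels, d),
    one feature vector (e.g. color, texture, position) per pixel.
    Returns the d x d covariance matrix describing the region."""
    f = np.asarray(features, dtype=float)
    return np.cov(f, rowvar=False)

# Toy 5-pixel reddish region with (R, G, B) features only.
feats = [[200, 30, 30], [210, 35, 28], [205, 32, 31],
         [198, 29, 33], [207, 33, 29]]
C = region_covariance(feats)
```

Because the descriptor is a fixed-size symmetric matrix regardless of region size, regions of different shapes can be compared directly, which is what makes it convenient inside a tracking energy functional.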
Fu, Min; Wu, Wenming; Hong, Xiafei; Liu, Qiuhua; Jiang, Jialin; Ou, Yaobin; Zhao, Yupei; Gong, Xinqi
2018-04-24
Efficient computational recognition and segmentation of target organs from medical images are foundational for diagnosis and treatment, especially for pancreatic cancer. In practice, the diversity in appearance of the pancreas and other abdominal organs makes detailed texture information important in a segmentation algorithm. According to our observations, however, the structures of previous networks, such as the Richer Feature Convolutional Network (RCF), are too coarse to segment the object (pancreas) accurately, especially its edges. In this paper, we extend the RCF, originally proposed for edge detection, to the challenging task of pancreas segmentation, and put forward a novel pancreas segmentation network. By employing a multi-layer up-sampling structure in place of the simple up-sampling operation in all stages, the proposed network fully exploits the multi-scale detailed contextual information of the object (pancreas) to perform per-pixel segmentation. Additionally, we train our network on CT scans, obtaining an effective pipeline. With the multi-layer up-sampling model, our pipeline achieves better performance than RCF on the task of single-object (pancreas) segmentation. Moreover, combined with multi-scale input, it achieves a DSC (Dice Similarity Coefficient) of 76.36% on the test data. The results of our experiments show that the proposed model works better than previous networks on our dataset; in other words, it is better at capturing detailed contextual information. Therefore, our new single-object segmentation model has practical value for automated computational diagnosis.
Intentional forgetting diminishes memory for continuous events.
Fawcett, Jonathan M; Taylor, Tracy L; Nadel, Lynn
2013-01-01
In a novel event-method directed forgetting task, instructions to Remember (R) or Forget (F) were integrated throughout the presentation of four videos depicting common events (e.g., baking cookies). Participants responded more accurately to cued recall questions (E1) and true/false statements (E2-4) regarding R segments than F segments. This was true even when they were forced to attend to F segments by virtue of having to perform concurrent discrimination (E2) or conceptual segmentation (E3) tasks. The final experiment (E5) demonstrated a larger R > F difference for specific true/false statements (the woman added three cups of flour) than for general true/false statements (the woman added flour), suggesting that participants likely encoded and retained at least a general representation of the events they had intended to forget, even though this representation was not as specific as the representation of events they had intended to remember.
A computerized recognition system for the home-based physiotherapy exercises using an RGBD camera.
Ar, Ilktan; Akgul, Yusuf Sinan
2014-11-01
Computerized recognition of home-based physiotherapy exercises has many benefits and has attracted considerable interest in the computer vision community. However, most methods in the literature treat this task as a special case of motion recognition. In contrast, we propose to employ the three main components of a physiotherapy exercise (the motion patterns, the stance knowledge, and the exercise object) as different recognition tasks and embed them separately into the recognition system. The low-level information about each component is gathered using machine learning methods. Then, we use a generative Bayesian network to recognize the exercise types by combining the information from these sources at an abstract level, which takes advantage of domain knowledge for a more robust system. Finally, a novel postprocessing step is employed to estimate the exercise repetition counts. The performance evaluation of the system is conducted on a new dataset which contains RGB (red, green, and blue) and depth videos of home-based exercise sessions for commonly applied shoulder and knee exercises. The proposed system works without any body-part segmentation, body-part tracking, joint detection, or temporal segmentation methods. In the end, favorable exercise recognition rates and encouraging results on the estimation of repetition counts are obtained.
Open-source software platform for medical image segmentation applications
NASA Astrophysics Data System (ADS)
Namías, R.; D'Amato, J. P.; del Fresno, M.
2017-11-01
Segmenting 2D and 3D images is a crucial and challenging problem in medical image analysis. Although several image segmentation algorithms have been proposed for different applications, no universal method currently exists. Moreover, their use is usually limited when detection of complex and multiple adjacent objects of interest is needed. In addition, the continually increasing volumes of medical imaging scans require more efficient segmentation software design and highly usable applications. In this context, we present an extension of our previous segmentation framework which allows the combination of existing explicit deformable models in an efficient and transparent way, handling different segmentation strategies simultaneously and interacting with a graphical user interface (GUI). We present the object-oriented design and the general architecture, which consists of two layers: the GUI at the top layer, and the processing core filters at the bottom layer. We apply the framework to different real-case medical image segmentation scenarios on publicly available datasets, including bladder and prostate segmentation from 2D MRI and heart segmentation in 3D CT. Our experiments on these concrete problems show that this framework facilitates complex and multi-object segmentation goals while providing a fast-prototyping open-source segmentation tool.
Multiresolution saliency map based object segmentation
NASA Astrophysics Data System (ADS)
Yang, Jian; Wang, Xin; Dai, ZhenYou
2015-11-01
Salient object detection and segmentation have gained increasing research interest in recent years. A saliency map can be obtained from the different models presented in previous studies, and from this saliency map the most salient region (MSR) in an image can be extracted. This MSR, generally a rectangle, can be used to initialize object segmentation algorithms. However, to our knowledge, all of those saliency maps are represented at a single resolution, although some models introduce multiscale principles in the calculation process. Furthermore, some segmentation methods, such as the well-known GrabCut algorithm, need more iteration time or additional interaction to get precise results without predefined pixel types. We introduce the concept of a multiresolution saliency map. This saliency map is provided in a multiresolution format, which naturally follows the principle of the human visual mechanism. Moreover, the points in this map can be used to initialize parameters for GrabCut segmentation by labeling the feature pixels automatically. Both computing speed and segmentation precision are evaluated. The results imply that this multiresolution saliency-map-based object segmentation method is simple and efficient.
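The MSR-to-GrabCut handoff described above can be sketched without any particular saliency model: threshold the saliency map and take the bounding box of the above-threshold pixels. This is a simplified stand-in (the paper's multiresolution map and automatic pixel labeling are not reproduced here); the resulting `(x, y, w, h)` rectangle is the kind of ROI that seeds, e.g., OpenCV's `cv2.grabCut`:

```python
import numpy as np

def most_salient_rect(saliency, thresh=0.5):
    """Bounding box (x, y, w, h) of the most salient region, obtained by
    thresholding the saliency map at a fraction of its maximum and taking
    the extent of the surviving pixels."""
    mask = saliency >= thresh * saliency.max()
    ys, xs = np.nonzero(mask)
    x, y = xs.min(), ys.min()
    return int(x), int(y), int(xs.max() - x + 1), int(ys.max() - y + 1)

s = np.zeros((10, 12))
s[3:7, 4:9] = 1.0             # synthetic salient blob
rect = most_salient_rect(s)   # → (4, 3, 5, 4)
```

Automating this initialization is precisely what removes the manual rectangle-drawing step that interactive GrabCut normally requires.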
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-06
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-795] Certain Video Analytics Software... filed by ObjectVideo, Inc. of Reston, Virginia. 76 FR 45859 (Aug. 1, 2011). The complaint, as amended... certain video analytics software, systems, components thereof, and products containing same by reason of...
McCullough, D P; Gudla, P R; Harris, B S; Collins, J A; Meaburn, K J; Nakaya, M A; Yamaguchi, T P; Misteli, T; Lockett, S J
2008-05-01
Communications between cells in large part drive tissue development and function, as well as disease-related processes such as tumorigenesis. Understanding the mechanistic bases of these processes necessitates quantifying specific molecules in adjacent cells or cell nuclei of intact tissue. However, a major restriction on such analyses is the lack of an efficient method that correctly segments each object (cell or nucleus) from 3-D images of an intact tissue specimen. We report a highly reliable and accurate semi-automatic algorithmic method for segmenting fluorescence-labeled cells or nuclei from 3-D tissue images. Segmentation begins with semi-automatic 2-D object delineation in a user-selected plane, using dynamic programming (DP) to locate the border with an accumulated intensity per unit length greater than that of any other possible border around the same object. Then the two surfaces of the object in planes above and below the selected plane are found using an algorithm that combines DP and combinatorial searching. Following segmentation, any perceived errors can be interactively corrected. Segmentation accuracy is not significantly affected by intermittent labeling of object surfaces, diffuse surfaces, or spurious signals away from surfaces. The unique strength of the segmentation method was demonstrated on a variety of biological tissue samples where all cells, including irregularly shaped cells, were accurately segmented based on visual inspection.
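The DP border search can be illustrated with a much simpler analogue: instead of a closed border around an object, find the left-to-right path through an image (one pixel per column, row changing by at most one between columns) whose accumulated intensity is maximal. This is a sketch of the DP recurrence only, not the paper's closed-contour formulation:

```python
import numpy as np

def max_intensity_path(img):
    """DP over columns: score[r] holds the best accumulated intensity of
    any admissible path ending at row r; back[] records predecessors so
    the optimal path can be recovered by backtracking."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    score = img[:, 0].copy()
    back = np.zeros((h, w), dtype=int)
    for c in range(1, w):
        new = np.empty(h)
        for r in range(h):
            lo, hi = max(0, r - 1), min(h, r + 2)   # rows reachable from column c-1
            prev = lo + int(np.argmax(score[lo:hi]))
            back[r, c] = prev
            new[r] = score[prev] + img[r, c]
        score = new
    r = int(np.argmax(score))                       # best final row
    path = [r]
    for c in range(w - 1, 0, -1):
        r = back[r, c]
        path.append(r)
    return path[::-1], float(score.max())

img = [[1, 0, 0, 0],
       [0, 9, 0, 1],
       [0, 0, 9, 0]]
path, total = max_intensity_path(img)   # → path [0, 1, 2, 1], total 20.0
```

The paper's method applies the same optimality principle to a closed contour, guaranteeing the border with the highest accumulated intensity per unit length.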
Efficient region-based approach for blotch detection in archived video using texture information
NASA Astrophysics Data System (ADS)
Yous, Hamza; Serir, Amina
2017-03-01
We propose a method for blotch detection in archived videos by modeling their spatiotemporal properties. We introduce an adaptive spatiotemporal segmentation to extract candidate regions that can be classified as blotches. Then, the similarity between the preselected regions and their corresponding motion-compensated regions in the adjacent frames is assessed by means of motion trajectory estimation and textural information analysis. Perceived ground truth based on just noticeable contrast is employed for the evaluation of our approach against the state-of-the-art, and the reported results show a better performance for our approach.
NASA Astrophysics Data System (ADS)
Onley, David; Steinberg, Gary
2004-04-01
The consequences of the Special Theory of Relativity are explored in a virtual world in which the speed of light is only 10 m/s. Ray-tracing software and other visualization tools, modified to allow for the finite speed of light, are employed to create a video that brings to life a journey through this imaginary world. The aberration of light, the Doppler effect, the altered perception of time, and the power of incoming radiation are explored in separate segments of this 35-minute video. Several of the effects observed are new and quite unexpected. A commentary and animated explanations help keep the viewer from losing all perspective.
A goal bias in action: The boundaries adults perceive in events align with sites of actor intent.
Levine, Dani; Hirsh-Pasek, Kathy; Pace, Amy; Michnick Golinkoff, Roberta
2017-06-01
We live in a dynamic world comprised of continuous events. Remembering our past and predicting future events, however, requires that we segment these ongoing streams of information in a consistent manner. How is this segmentation achieved? This research examines whether the boundaries adults perceive in events, such as the Olympic figure skating routine used in these studies, align with the beginnings (sources) and endings (goals) of human goal-directed actions. Study 1 showed that a group of experts, given an explicit task with unlimited time to rewatch the event, identified the same subevents as one another, but with greater agreement as to the timing of goals than sources. In Study 2, experts, novices familiarized with the figure skating sequence, and unfamiliarized novices performed an online event segmentation task, marking boundaries as the video progressed in real time. The online boundaries of all groups corresponded with the sources and goals offered by Study 1's experts, with greater alignment of goals than sources. Additionally, expertise, but not mere perceptual familiarity, boosted the alignment of sources and goals. Finally, Study 3, which presented novices with the video played in reverse, indicated, unexpectedly, that even when spatiotemporal cues were disrupted, viewers' perceived event boundaries still aligned with their perception of the actors' intended sources and goals. This research extends the goal bias to event segmentation, and suggests that our spontaneous sensitivity toward goals may allow us to transform even relatively complex and unfamiliar event streams into structured and meaningful representations. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Perioperative outcomes of video- and robot-assisted segmentectomies.
Rinieri, Philippe; Peillon, Christophe; Salaün, Mathieu; Mahieu, Julien; Bubenheim, Michael; Baste, Jean-Marc
2016-02-01
Video-assisted thoracic surgery appears to be technically difficult for segmentectomy. Conversely, robotic surgery could facilitate the performance of segmentectomy. The aim of this study was to compare the early results of video- and robot-assisted segmentectomies. Data were collected prospectively on videothoracoscopy from 2010 and on robotic procedures from 2013. Fifty-one patients who were candidates for minimally invasive segmentectomy were included in the study. Perioperative outcomes of video-assisted and robotic segmentectomies were compared. The minimally invasive segmentectomies included 32 video- and 16 robot-assisted procedures; 3 segmentectomies (2 video-assisted and 1 robot-assisted) were converted to lobectomies. Four conversions to thoracotomy were necessary for anatomical reason or arterial injury, with no uncontrolled bleeding in the robotic arm. There were 7 benign or infectious lesions, 9 pre-invasive lesions, 25 lung cancers, and 10 metastatic diseases. Patient characteristics, type of segment, conversion to thoracotomy, conversion to lobectomy, operative time, postoperative complications, chest tube duration, postoperative stay, and histology were similar in the video and robot groups. Estimated blood loss was significantly higher in the video group (100 vs. 50 mL, p = 0.028). The morbidity rate of minimally invasive segmentectomy was low. The short-term results of video-assisted and robot-assisted segmentectomies were similar, and more data are required to show any advantages between the two techniques. Long-term oncologic outcomes are necessary to evaluate these new surgical practices. © The Author(s) 2016.
Automated Visual Event Detection, Tracking, and Data Management System for Cabled- Observatory Video
NASA Astrophysics Data System (ADS)
Edgington, D. R.; Cline, D. E.; Schlining, B.; Raymond, E.
2008-12-01
Ocean observatories and underwater video surveys have the potential to unlock important discoveries with new and existing camera systems. Yet the burden of video management and analysis often requires reducing the amount of video recorded through time-lapse video or similar methods. It's unknown how many digitized video data sets exist in the oceanographic community, but we suspect that many remain under-analyzed due to a lack of good tools or human resources to analyze the video. To help address this problem, the Automated Visual Event Detection (AVED) software and the Video Annotation and Reference System (VARS) have been under development at MBARI. The AVED software, developed over the last 5 years, detects interesting events in the video. AVED is based on a neuromorphic selective-attention algorithm, modeled on the human vision system. Frames are decomposed into specific feature maps that are combined into a unique saliency map. This saliency map is then scanned to determine the most salient locations. The candidate salient locations are then segmented from the scene using algorithms suitable for the low, non-uniform light and marine snow typical of deep underwater video. For managing the AVED descriptions of the video, the VARS system provides an interface and database for describing, viewing, and cataloging the video. VARS was developed by MBARI for annotating deep-sea video data and is currently being used to describe over 3000 dives by our remotely operated vehicles (ROVs), making it well suited to this deepwater observatory application with only a few modifications. To meet the compute- and data-intensive job of video processing, a distributed heterogeneous network of computers is managed using the Condor workload management system. This system manages data storage, video transcoding, and AVED processing.
Looking to the future, we see high-speed networks and Grid technology as an important element in addressing the problem of processing and accessing large video data sets.
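The center-surround saliency idea described above can be sketched in a toy form (purely illustrative, not MBARI's actual AVED implementation; `box_mean` and `saliency_map` are invented names, and the real system combines many feature maps across multiple scales):

```python
def box_mean(img, r, c, rad):
    """Mean intensity in a (2*rad+1)-square window clipped to the image."""
    h, w = len(img), len(img[0])
    vals = [img[i][j]
            for i in range(max(0, r - rad), min(h, r + rad + 1))
            for j in range(max(0, c - rad), min(w, c + rad + 1))]
    return sum(vals) / len(vals)

def saliency_map(img, center=1, surround=3):
    """Center-surround difference at each pixel: large where a small
    neighborhood differs strongly from its larger surround."""
    return [[abs(box_mean(img, r, c, center) - box_mean(img, r, c, surround))
             for c in range(len(img[0]))]
            for r in range(len(img))]

# A dark 7x7 frame with one bright "event" pixel; the saliency peak lands
# on it, giving a candidate location to segment from the scene.
frame = [[0] * 7 for _ in range(7)]
frame[3][3] = 255
sal = saliency_map(frame)
peak = max(((r, c) for r in range(7) for c in range(7)),
           key=lambda rc: sal[rc[0]][rc[1]])
print(peak)  # → (3, 3)
```

Scanning the saliency map for its maxima, as in the last step above, mirrors how AVED proposes candidate salient locations before segmentation.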
Hierarchical layered and semantic-based image segmentation using ergodicity map
NASA Astrophysics Data System (ADS)
Yadegar, Jacob; Liu, Xiaoqing
2010-04-01
Image segmentation plays a foundational role in image understanding and computer vision. Although great strides have been made and progress achieved on automatic/semi-automatic image segmentation algorithms, designing a generic, robust, and efficient image segmentation algorithm is still challenging. Human vision is still far superior to computer vision, especially in interpreting semantic meanings/objects in images. We present a hierarchical/layered semantic image segmentation algorithm that can automatically and efficiently segment images into hierarchical layered/multi-scaled semantic regions/objects with contextual topological relationships. The proposed algorithm bridges the gap between high-level semantics and low-level visual features/cues (such as color, intensity, edge, etc.) through utilizing a layered/hierarchical ergodicity map, where ergodicity is computed based on a space-filling fractal concept and used as a region dissimilarity measurement. The algorithm applies a highly scalable, efficient, and adaptive Peano-Cesaro triangulation/tiling technique to decompose the given image into a set of similar/homogeneous regions based on low-level visual cues in a top-down manner. The layered/hierarchical ergodicity map is built through a bottom-up region dissimilarity analysis. The recursive fractal sweep associated with the Peano-Cesaro triangulation provides efficient local multi-resolution refinement to any level of detail. The generated binary decomposition tree also provides efficient neighbor retrieval mechanisms for contextual topological object/region relationship generation. Experiments have been conducted within the maritime image environment where the segmented layered semantic objects include the basic level objects (i.e. sky/land/water) and deeper level objects in the sky/land/water surfaces.
Experimental results demonstrate the proposed algorithm has the capability to robustly and efficiently segment images into layered semantic objects/regions with contextual topological relationships.
Object tracking using multiple camera video streams
NASA Astrophysics Data System (ADS)
Mehrubeoglu, Mehrube; Rojas, Diego; McLauchlan, Lifford
2010-05-01
Two synchronized cameras are utilized to obtain independent video streams to detect moving objects from two different viewing angles. The video frames are directly correlated in time. Moving objects in image frames from the two cameras are identified and tagged for tracking. One advantage of such a system is overcoming the effects of occlusion, in which an object may be in only partial view in one camera while fully visible in another. Object registration is achieved by determining the location of common features in the moving object across simultaneous frames. Perspective differences are adjusted. Combining information from images from multiple cameras increases the robustness of the tracking process. Motion tracking is achieved by determining anomalies caused by the objects' movement across frames in time, both in each video stream and in the combined video information. The path of each object is determined heuristically. Accuracy of detection is dependent on the speed of the object as well as variations in direction of motion. Fast cameras increase accuracy but limit the speed and complexity of the algorithm. Such an imaging system has applications in traffic analysis, surveillance and security, as well as object modeling from multi-view images. The system can easily be expanded by increasing the number of cameras such that there is an overlap between the scenes from at least two cameras in proximity. An object can then be tracked long distances or across multiple cameras continuously, applicable, for example, in wireless sensor networks for surveillance or navigation.
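The occlusion-handling benefit described above can be sketched as a per-frame fusion rule (a toy illustration with an invented `fuse_detections` helper; as the abstract notes, real registration must first adjust for perspective differences):

```python
def fuse_detections(cam_a, cam_b):
    """Combine per-frame (x, y) detections from two synchronized cameras.
    When occlusion hides the object in one view, fall back to the other;
    when both see it, average the registered positions."""
    track = []
    for det_a, det_b in zip(cam_a, cam_b):
        if det_a is None and det_b is None:
            track.append(None)      # object lost in both views
        elif det_a is None:
            track.append(det_b)     # occluded in camera A
        elif det_b is None:
            track.append(det_a)     # occluded in camera B
        else:
            track.append(tuple((a + b) / 2 for a, b in zip(det_a, det_b)))
    return track

# Frame 2 is occluded in camera A, frame 3 in camera B; the fused track
# stays continuous.
track = fuse_detections([(0.0, 0.0), None, (4.0, 4.0)],
                        [(2.0, 0.0), (1.0, 1.0), None])
print(track)  # → [(1.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
```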
ERIC Educational Resources Information Center
Zlotlow, Susan F.; Allen, George J.
1981-01-01
Assessed the validity of examining the influence of counselors' physical attractiveness via observation of videotapes. Reactions to audio-only and video-only videotape segments were compared with in vivo contact. In vivo contact yielded more positive impressions than videotape observations. Technical skill was more predictive of counselor…
76 FR 44341 - Agency Information Collection Activities: Submission for OMB Review; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-25
... included the following: Objectives of the video. Targeted audiences of the video. Dissemination efforts of.... Perception of increased involvement. Demographics of the viewers. This phase will include all videos produced...
Perceiving referential intent: Dynamics of reference in natural parent-child interactions
Trueswell, John C.; Lin, Yi; Armstrong, Benjamin; Cartmill, Erica A.; Goldin-Meadow, Susan; Gleitman, Lila R.
2016-01-01
Two studies are presented which examined the temporal dynamics of the social-attentive behaviors that co-occur with referent identification during natural parent-child interactions in the home. Study 1 focused on 6.2 hours of videos of 56 parents interacting during everyday activities with their 14–18 month-olds, during which parents uttered common nouns as parts of spontaneously occurring utterances. Trained coders recorded, on a second-by-second basis, parent and child attentional behaviors relevant to reference in the period (40 sec.) immediately surrounding parental naming. The referential transparency of each interaction was independently assessed by having naïve adult participants guess what word the parent had uttered in these video segments, but with the audio turned off, forcing them to use only non-linguistic evidence available in the ongoing stream of events. We found a great deal of ambiguity in the input along with a few potent moments of word-referent transparency; these transparent moments have a particular temporal signature with respect to parent and child attentive behavior: it was the object’s appearance and/or the fact that it captured parent/child attention at the moment the word was uttered, not the presence of the object throughout the video, that predicted observers’ accuracy. Study 2 experimentally investigated the precision of the timing relation, and whether it has an effect on observer accuracy, by disrupting the timing between when the word was uttered and the behaviors present in the videos as they were originally recorded. Disrupting timing by only +/− 1 to 2 sec. reduced participant confidence and significantly decreased their accuracy in word identification. The results enhance an expanding literature on how dyadic attentional factors can influence early vocabulary growth. 
By hypothesis, this kind of time-sensitive data-selection process operates as a filter on input, removing many extraneous and ill-supported word-meaning hypotheses from consideration during children’s early vocabulary learning. PMID:26775159
Physical activity patterns across time-segmented youth sport flag football practice.
Schlechter, Chelsey R; Guagliano, Justin M; Rosenkranz, Richard R; Milliken, George A; Dzewaltowski, David A
2018-02-08
Youth sport (YS) reaches a large number of children world-wide and contributes substantially to children's daily physical activity (PA), yet less than half of YS time has been shown to be spent in moderate-to-vigorous physical activity (MVPA). Physical activity during practice is likely to vary depending on practice structure that changes across YS time, therefore the purpose of this study was 1) to describe the type and frequency of segments of time, defined by contextual characteristics of practice structure, during YS practices and 2) determine the influence of these segments on PA. Research assistants video-recorded the full duration of 28 practices from 14 boys' flag football teams (2 practices/team) while children concurrently (N = 111, aged 5-11 years, mean 7.9 ± 1.2 years) wore ActiGraph GT1M accelerometers to measure PA. Observers divided videos of each practice into continuous context time segments (N = 204; mean-segments-per-practice = 7.3, SD = 2.5) using start/stop points defined by change in context characteristics, and assigned a value for task (e.g., management, gameplay, etc.), member arrangement (e.g., small group, whole group, etc.), and setting demand (i.e., fosters participation, fosters exclusion). Segments were then paired with accelerometer data. Data were analyzed using a multilevel model with segment as unit of analysis. Whole practices averaged 34 ± 2.4% of time spent in MVPA. Free-play (51.5 ± 5.5%), gameplay (53.6 ± 3.7%), and warm-up (53.9 ± 3.6%) segments had greater percentage of time (%time) in MVPA compared to fitness (36.8 ± 4.4%) segments (p ≤ .01). Greater %time was spent in MVPA during free-play segments compared to scrimmage (30.2 ± 4.6%), strategy (30.6 ± 3.2%), and sport-skill (31.6 ± 3.1%) segments (p ≤ .01), and in segments that fostered participation (36.1 ± 2.7%) than segments that fostered exclusion (29.1 ± 3.0%; p ≤ .01). 
Significantly greater %time was spent in low-energy stationary behavior in fitness (15.7 ± 3.4%) than gameplay (4.0 ± 2.9%) segments (p ≤ .01), and in sport-skill (17.6 ± 2.2%) than free-play (8.2 ± 4.2%), gameplay, and warm-up (10.6 ± 2.6%) segments (p < .05). The %time spent in low-energy stationary behavior and in MVPA differed by characteristics of task and setting demand of the segment. Restructuring the routine of YS practice to include segments conducive to MVPA could increase %time spent in MVPA during practice. As YS reaches a large number of children worldwide, increasing PA during YS has the potential to create a public health impact.
Robust feedback zoom tracking for digital video surveillance.
Zou, Tengyue; Tang, Xiaoqi; Song, Bao; Wang, Jin; Chen, Jihong
2012-01-01
Zoom tracking is an important function in video surveillance, particularly in traffic management and security monitoring. It involves keeping an object of interest in focus during the zoom operation. Zoom tracking is typically achieved by moving the zoom and focus motors in lenses following the so-called "trace curve", which shows the in-focus motor positions versus the zoom motor positions for a specific object distance. The main task of a zoom tracking approach is to accurately estimate the trace curve for the specified object. Because a proportional-integral-derivative (PID) controller has historically been considered the best controller when no knowledge of the underlying process is available, and because of its high-quality performance in motor control, in this paper we propose a novel feedback zoom tracking (FZT) approach based on geometric trace curve estimation and a PID feedback controller. The performance of this approach is compared with existing zoom tracking methods in digital video surveillance. The real-time implementation results obtained on an actual digital video platform indicate that the developed FZT approach not only solves the traditional one-to-many mapping problem without pre-training but also improves the robustness for tracking moving or switching objects, which is the key challenge in video surveillance.
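The feedback element of the approach can be sketched with a textbook discrete PID loop (illustrative gains and a simple integrator plant; this is a generic controller sketch, not the paper's tuned FZT implementation, and all names here are invented):

```python
class PID:
    """Minimal discrete PID controller."""
    def __init__(self, kp, ki, kd, dt=0.01):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt            # accumulate I term
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error + self.ki * self.integral
                + self.kd * derivative)

# Drive a (simulated) focus motor toward the in-focus position that the
# estimated trace curve predicts for the current zoom position.
pid = PID(kp=2.0, ki=0.5, kd=0.05)
position, target = 0.0, 100.0      # motor steps (made-up units)
for _ in range(3000):              # 30 s of simulated control at dt = 10 ms
    position += pid.update(target, position) * pid.dt
print(round(position, 1))
```

The feedback structure is what lets the controller correct the focus position even when the geometric trace curve estimate is imperfect.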
NASA Astrophysics Data System (ADS)
Amit, S. N. K.; Saito, S.; Sasaki, S.; Kiyoki, Y.; Aoki, Y.
2015-04-01
Google Earth, with its high-resolution imagery, takes months to process new images before online updates. It is a time-consuming and slow process, especially for post-disaster applications. The objective of this research is to develop a fast and effective method of updating maps by detecting local differences that occur over different time series, where only regions with differences will be updated. In our system, aerial images from Massachusetts's road and building open datasets and Saitama district datasets are used as input images. Semantic segmentation, a pixel-wise classification of images, is then applied to the input images by implementing a deep neural network technique. A deep neural network is used because it is not only efficient in learning highly discriminative image features such as roads, buildings, etc., but also partially robust to incomplete and poorly registered target maps. Then, aerial images containing semantic information are stored as a database in the 5D World Map and set as ground-truth images. This system is developed to visualise multimedia data in 5 dimensions: 3 spatial dimensions, 1 temporal dimension, and 1 degenerated dimension combining semantics and colour. Next, ground-truth images chosen from the 5D World Map database and a new aerial image with the same spatial information but a different time series are compared via a difference-extraction method. The map is updated only where local changes have occurred. Hence, map updating will be cheaper, faster, and more effective, especially for post-disaster applications, by leaving unchanged regions alone and updating only the changed regions.
National network television news coverage of contraception - a content analysis.
Patton, Elizabeth W; Moniz, Michelle H; Hughes, Lauren S; Buis, Lorraine; Howell, Joel
2017-01-01
The objective was to describe and analyze national network television news framing of contraception, recognizing that onscreen news can influence the public's knowledge and beliefs. We used the Vanderbilt Television News Archives and LexisNexis Database to obtain video and print transcripts of all relevant national network television news segments covering contraception from January 2010 to June 2014. We conducted a content analysis of 116 TV news segments covering contraception during the rollout of the Affordable Care Act. Segments were quantitatively coded for contraceptive methods covered, story sources used, and inclusion of medical and nonmedical content (intercoder reliability using Krippendorff's alpha ranged from 0.6 to 1 for coded categories). Most (55%) news stories focused on contraception in general rather than specific methods. The most effective contraceptive methods were rarely discussed (implant, 1%; intrauterine device, 4%). The most frequently used sources were political figures (40%), advocates (25%), the general public (25%) and Catholic Church leaders (16%); medical professionals (11%) and health researchers (4%) appeared in a minority of stories. A minority of stories (31%) featured medical content. National network news coverage of contraception frequently focuses on contraception in political and social terms and uses nonmedical figures such as politicians and church leaders as sources. This focus deemphasizes the public health aspect of contraception, leading medical professionals and health content to be rarely featured. Media coverage of contraception may influence patients' views about contraception. Understanding the content, sources and medical accuracy of current media portrayals of contraception may enable health care professionals to dispel popular misperceptions. Published by Elsevier Inc.
An algorithm for calculi segmentation on ureteroscopic images.
Rosa, Benoît; Mozer, Pierre; Szewczyk, Jérôme
2011-03-01
The purpose of the study is to develop an algorithm for the segmentation of renal calculi on ureteroscopic images. In fact, renal calculi are a common source of urological obstruction, and laser lithotripsy during ureteroscopy is a possible therapy. A laser-based system to sweep the calculus surface and vaporize it was developed to automate a very tedious manual task. The distal tip of the ureteroscope is directed using image guidance, and this operation is not possible without an efficient segmentation of renal calculi on the ureteroscopic images. We proposed and developed a region growing algorithm to segment renal calculi on ureteroscopic images. Using real video images, we compared our segmentation with a ground-truth reference segmentation, computing statistics on several image metrics, such as Precision, Recall, and the Yasnoff Measure. The algorithm and its parameters were established for the most likely clinical scenarios. The segmentation results are encouraging: the developed algorithm was able to correctly detect more than 90% of the surface of the calculi, according to an expert observer. Implementation of an algorithm for the segmentation of calculi on ureteroscopic images is feasible. The next step is the integration of our algorithm in the command scheme of a motorized system to build a complete operating prototype.
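A bare-bones region growing step of the kind the paper describes can be sketched as follows (a simplification: a single intensity-tolerance criterion with 4-connectivity, whereas the actual algorithm's criterion and parameters were tuned for ureteroscopic images; `region_grow` is an invented name):

```python
from collections import deque

def region_grow(image, seed, tol):
    """Grow a region from `seed`, adding 4-connected pixels whose intensity
    lies within `tol` of the seed intensity."""
    h, w = len(image), len(image[0])
    seed_val = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_val) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# A bright 3x3 "calculus" on a dark background: the grown region covers
# exactly the bright pixels.
img = [[10, 10, 10, 10, 10],
       [10, 200, 210, 205, 10],
       [10, 195, 220, 200, 10],
       [10, 205, 210, 198, 10],
       [10, 10, 10, 10, 10]]
region = region_grow(img, (2, 2), tol=40)
print(len(region))  # → 9
```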
Robotic Arm Comprising Two Bending Segments
NASA Technical Reports Server (NTRS)
Mehling, Joshua S.; Diftler, Myron A.; Ambrose, Robert O.; Chu, Mars W.; Valvo, Michael C.
2010-01-01
The figure shows several aspects of an experimental robotic manipulator that includes a housing from which protrudes a tendril- or tentacle-like arm 1 cm thick and 1 m long. The arm consists of two collinear segments, each of which can be bent independently of the other, and the two segments can be bent simultaneously in different planes. The arm can be retracted to a minimum length or extended by any desired amount up to its full length. The arm can also be made to rotate about its own longitudinal axis. Some prior experimental robotic manipulators include single-segment bendable arms. Those arms are thicker and shorter than the present one. The present robotic manipulator serves as a prototype of future manipulators that, by virtue of the slenderness and multiple-bending capability of their arms, are expected to have sufficient dexterity for operation within spaces that would otherwise be inaccessible. Such manipulators could be especially well suited as means of minimally invasive inspection during construction and maintenance activities. Each of the two collinear bending arm segments is further subdivided into a series of collinear extension- and compression-type helical springs joined by threaded links. The extension springs occupy the majority of the length of the arm and engage passively in bending. The compression springs are used for actively controlled bending. Bending is effected by means of pairs of antagonistic tendons in the form of Spectra gel-spun polymer lines that are attached at specific threaded links and run the entire length of the arm inside the spring helix from the attachment links to motor-driven pulleys inside the housing. Two pairs of tendons, mounted in orthogonal planes that intersect along the longitudinal axis, are used to effect bending of each segment.
The tendons for actuating the distal bending segment are in planes offset by an angle of 45° from those of the proximal bending segment. This configuration makes it possible to accommodate all eight tendons at the same diameter along the arm. The threaded links have central bores through which power and video wires can be strung (1) from a charge-coupled-device camera mounted on the tip of the arm, (2) back along the interior of the arm into the housing, and then (3) from within the housing to an external video monitor.
Integrating Language and Vision to Generate Natural Language Descriptions of Videos in the Wild
2014-08-23
the videos and produce probabilistic detections of grammatical subjects, verbs, and objects. In our data-set there are 45 candidate entities for the grammatical subject (such as animal, baby, cat, chef, and person) and 241 for the grammatical object (such as flute, motorbike, shrimp, person, and tv). There are 218 candidate activities for the grammatical verb, including climb, cut, play, ride, and walk. Entity Related Features From each video two
Simultaneous segmentation of the bone and cartilage surfaces of a knee joint in 3D
NASA Astrophysics Data System (ADS)
Yin, Y.; Zhang, X.; Anderson, D. D.; Brown, T. D.; Hofwegen, C. Van; Sonka, M.
2009-02-01
We present a novel framework for the simultaneous segmentation of multiple interacting surfaces belonging to multiple mutually interacting objects. The method is a non-trivial extension of our previously reported optimal multi-surface segmentation. Considering an example application of knee-cartilage segmentation, the framework consists of the following main steps: 1) Shape model construction: Building a mean shape for each bone of the joint (femur, tibia, patella) from interactively segmented volumetric datasets. Using the resulting mean-shape model - identification of cartilage, non-cartilage, and transition areas on the mean-shape bone model surfaces. 2) Presegmentation: Employment of iterative optimal surface detection method to achieve approximate segmentation of individual bone surfaces. 3) Cross-object surface mapping: Detection of inter-bone equidistant separating sheets to help identify corresponding vertex pairs for all interacting surfaces. 4) Multi-object, multi-surface graph construction and final segmentation: Construction of a single multi-bone, multi-surface graph so that two surfaces (bone and cartilage) with zero and non-zero intervening distances can be detected for each bone of the joint, according to whether or not cartilage can be locally absent or present on the bone. To define inter-object relationships, corresponding vertex pairs identified using the separating sheets were interlinked in the graph. The graph optimization algorithm acted on the entire multiobject, multi-surface graph to yield a globally optimal solution. The segmentation framework was tested on 16 MR-DESS knee-joint datasets from the Osteoarthritis Initiative database. The average signed surface positioning error for the 6 detected surfaces ranged from 0.00 to 0.12 mm. When independently initialized, the signed reproducibility error of bone and cartilage segmentation ranged from 0.00 to 0.26 mm. 
The results showed that this framework provides robust, accurate, and reproducible segmentation of the knee joint bone and cartilage surfaces of the femur, tibia, and patella. As a general segmentation tool, the developed framework can be applied to a broad range of multi-object segmentation problems.
Intelligent keyframe extraction for video printing
NASA Astrophysics Data System (ADS)
Zhang, Tong
2004-10-01
Nowadays most digital cameras have the functionality of taking short video clips, with the length of video ranging from several seconds to a couple of minutes. The purpose of this research is to develop an algorithm which extracts an optimal set of keyframes from each short video clip so that the user could obtain proper video frames to print out. In current video printing systems, keyframes are normally obtained by evenly sampling the video clip over time. Such an approach, however, may not reflect highlights or regions of interest in the video. Keyframes derived in this way may also be improper for video printing in terms of either content or image quality. In this paper, we present an intelligent keyframe extraction approach to derive an improved keyframe set by performing semantic analysis of the video content. For a video clip, a number of video and audio features are analyzed to first generate a candidate keyframe set. These features include accumulative color histogram and color layout differences, camera motion estimation, moving object tracking, face detection and audio event detection. Then, the candidate keyframes are clustered and evaluated to obtain a final keyframe set. The objective is to automatically generate a limited number of keyframes to show different views of the scene; to show different people and their actions in the scene; and to tell the story in the video shot. Moreover, frame extraction for video printing, which is a rather subjective problem, is considered in this work for the first time, and a semi-automatic approach is proposed.
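The accumulative color-histogram-difference cue mentioned above can be illustrated with a toy candidate-keyframe selector (hypothetical `select_keyframes` helper; the actual system combines several more features, such as camera motion and face detection, plus a clustering stage):

```python
def histogram(frame, bins=4, max_val=256):
    """Normalized intensity histogram of a flat list of pixel values."""
    h = [0] * bins
    for v in frame:
        h[v * bins // max_val] += 1
    n = len(frame)
    return [c / n for c in h]

def hist_diff(h1, h2):
    """Half the L1 distance between two normalized histograms (in [0, 1])."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2

def select_keyframes(frames, threshold=0.5):
    """Emit a new candidate keyframe whenever the histogram drifts far
    enough from the last selected keyframe."""
    keys = [0]
    last = histogram(frames[0])
    for i in range(1, len(frames)):
        h = histogram(frames[i])
        if hist_diff(last, h) > threshold:
            keys.append(i)
            last = h
    return keys

# Three dark frames followed by two bright frames: one keyframe per "view".
frames = [[10] * 16] * 3 + [[200] * 16] * 2
keys = select_keyframes(frames)
print(keys)  # → [0, 3]
```

This captures the intuition that evenly spaced sampling would miss the content change, while a content-driven selector places keyframes at distinct views of the scene.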
Content-Aware Video Adaptation under Low-Bitrate Constraint
NASA Astrophysics Data System (ADS)
Hsiao, Ming-Ho; Chen, Yi-Wen; Chen, Hua-Tsung; Chou, Kuan-Hung; Lee, Suh-Yin
2007-12-01
With the development of wireless network and the improvement of mobile device capability, video streaming is more and more widespread in such an environment. Under the condition of limited resource and inherent constraints, appropriate video adaptations have become one of the most important and challenging issues in wireless multimedia applications. In this paper, we propose a novel content-aware video adaptation in order to effectively utilize resource and improve visual perceptual quality. First, the attention model is derived from analyzing the characteristics of brightness, location, motion vector, and energy features in compressed domain to reduce computation complexity. Then, through the integration of attention model, capability of client device and correlational statistic model, attractive regions of video scenes are derived. The information object- (IOB-) weighted rate distortion model is used for adjusting the bit allocation. Finally, the video adaptation scheme dynamically adjusts video bitstream in frame level and object level. Experimental results validate that the proposed scheme achieves better visual quality effectively and efficiently.
Moya, Nikolas; Falcão, Alexandre X; Ciesielski, Krzysztof C; Udupa, Jayaram K
2014-01-01
Graph-cut algorithms have been extensively investigated for interactive binary segmentation, when the simultaneous delineation of multiple objects can save considerable user time. We present an algorithm (named DRIFT) for 3D multiple object segmentation based on seed voxels and Differential Image Foresting Transforms (DIFTs) with relaxation. DRIFT stands behind efficient implementations of some state-of-the-art methods. The user can add/remove markers (seed voxels) along a sequence of executions of the DRIFT algorithm to improve segmentation. Its first execution takes time linear in the image's size, while the subsequent executions for corrections take sublinear time in practice. At each execution, DRIFT first runs the DIFT algorithm, then it applies diffusion filtering to smooth boundaries between objects (and background) and, finally, it corrects possible occurrences of objects becoming disconnected from their seeds. We evaluate DRIFT on 3D CT images of the thorax, segmenting the arterial system, esophagus, left pleural cavity, right pleural cavity, trachea and bronchi, and the venous system.
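The core IFT computation that DIFT builds on can be sketched as seeded shortest-path label propagation over the pixel grid (a minimal, non-differential 2D version without the relaxation and correction steps; `ift_segment` is an invented name):

```python
import heapq

def ift_segment(image, seeds):
    """Seed-based Image Foresting Transform on a 2D grid. Path cost is the
    maximum intensity step along the path (the classic f_max cost); each
    pixel takes the label of the seed offering the cheapest path."""
    h, w = len(image), len(image[0])
    cost = [[float('inf')] * w for _ in range(h)]
    label = [[None] * w for _ in range(h)]
    heap = []
    for (r, c), lab in seeds.items():
        cost[r][c] = 0
        label[r][c] = lab
        heapq.heappush(heap, (0, r, c))
    while heap:
        cst, r, c = heapq.heappop(heap)
        if cst > cost[r][c]:
            continue                      # stale heap entry
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w:
                new = max(cst, abs(image[nr][nc] - image[r][c]))
                if new < cost[nr][nc]:
                    cost[nr][nc] = new
                    label[nr][nc] = label[r][c]
                    heapq.heappush(heap, (new, nr, nc))
    return label

# One background seed, one object seed: the bright block takes the
# object label because reaching it from outside costs a large step.
img = [[10, 10, 10, 10],
       [10, 200, 200, 10],
       [10, 200, 200, 10],
       [10, 10, 10, 10]]
labels = ift_segment(img, {(0, 0): 'bg', (1, 1): 'obj'})
print(labels[2][2], labels[3][3])  # → obj bg
```

In the differential (DIFT) setting, adding or removing a seed only recomputes the affected part of this forest, which is why corrections run in sublinear time in practice.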
Feasibility of Using Video Camera for Automated Enforcement on Red-Light Running and Managed Lanes.
DOT National Transportation Integrated Search
2009-12-25
The overall objective of this study is to evaluate the feasibility, effectiveness, legality, and public acceptance aspects of automated enforcement on red light running and HOV occupancy requirement using video cameras in Nevada. This objective was a...
A shape-based segmentation method for mobile laser scanning point clouds
NASA Astrophysics Data System (ADS)
Yang, Bisheng; Dong, Zhen
2013-07-01
Segmentation of mobile laser point clouds of urban scenes into objects is an important step for post-processing (e.g., interpretation) of point clouds. Point clouds of urban scenes contain numerous objects with significant size variability, complex and incomplete structures, and holes or variable point densities, raising great challenges for the segmentation of mobile laser point clouds. This paper addresses these challenges by proposing a shape-based segmentation method. The proposed method first calculates the optimal neighborhood size of each point to derive the geometric features associated with it, and then classifies the point clouds according to geometric features using support vector machines (SVMs). Second, a set of rules are defined to segment the classified point clouds, and a similarity criterion for segments is proposed to overcome over-segmentation. Finally, the segmentation output is merged based on topological connectivity into a meaningful geometrical abstraction. The proposed method has been tested on point clouds of two urban scenes obtained by different mobile laser scanners. The results show that the proposed method segments large-scale mobile laser point clouds with good accuracy and computationally effective time cost, and that it segments pole-like objects particularly well.
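The per-point geometric features fed to the SVMs can be caricatured with a single hand-made feature (a crude stand-in for the eigenvalue-based descriptors and per-point optimal neighborhood sizes the paper actually uses; `vertical_elongation` is an invented name):

```python
def neighbors(points, p, radius):
    """All points within a fixed Euclidean radius of p (brute force)."""
    return [q for q in points
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= radius ** 2]

def vertical_elongation(points, p, radius=1.0):
    """Vertical-to-horizontal extent ratio of a point's neighborhood:
    large for pole-like objects, near zero for planar ground."""
    neigh = neighbors(points, p, radius)
    zs = [q[2] for q in neigh]
    horiz = max(max(q[0] for q in neigh) - min(q[0] for q in neigh),
                max(q[1] for q in neigh) - min(q[1] for q in neigh),
                1e-9)
    return (max(zs) - min(zs)) / horiz

# A synthetic vertical pole vs. a flat ground patch.
pole = [(0.0, 0.0, 0.2 * i) for i in range(11)]
ground = [(0.2 * i, 0.2 * j, 0.0) for i in range(11) for j in range(11)]
pole_score = vertical_elongation(pole, (0.0, 0.0, 1.0))
ground_score = vertical_elongation(ground, (1.0, 1.0, 0.0))
print(pole_score > 2.0, ground_score < 0.5)  # → True True
```

Thresholding such a feature already separates pole-like objects from ground; the paper instead trains SVMs on richer features, which handles the size variability and incomplete structures of real scans.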
Hardware accelerator design for tracking in smart camera
NASA Astrophysics Data System (ADS)
Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Vohra, Anil
2011-10-01
Smart cameras are important components in video analysis. For video analysis, a smart camera needs to detect interesting moving objects, track such objects from frame to frame, and analyze object tracks in real time. Therefore, real-time tracking is prominent in smart cameras. A software implementation of a tracking algorithm on a general-purpose processor (such as a PowerPC) achieves a low frame rate, far from real-time requirements. This paper presents a SIMD-based hardware accelerator designed for real-time tracking of objects in a scene. The system is designed and simulated using VHDL and implemented on a Xilinx XUP Virtex-II Pro FPGA. The resulting frame rate is 30 frames per second for 250x200-resolution grayscale video.
Detection of dominant flow and abnormal events in surveillance video
NASA Astrophysics Data System (ADS)
Kwak, Sooyeong; Byun, Hyeran
2011-02-01
We propose an algorithm for abnormal event detection in surveillance video. The proposed algorithm is based on a semi-unsupervised learning method, a feature-based approach that does not detect moving objects individually. It identifies the dominant flow in crowded environments without individual object tracking, using a latent Dirichlet allocation model. It can also automatically detect and localize an abnormally moving object in real-life video. Performance tests on several real-life databases show that the proposed algorithm can efficiently detect abnormally moving objects in real time. The algorithm can be applied to any situation in which abnormal directions or abnormal speeds must be detected.
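A crude stand-in for the dominant-flow idea can be sketched as follows. The paper learns flow topics with a latent Dirichlet allocation model; this sketch, with all names hypothetical, only quantizes low-level motion vectors into a direction histogram and flags directions whose estimated probability falls below a threshold.

```python
import math
from collections import Counter

def direction_bin(dx, dy, bins=8):
    """Quantize a motion vector into one of `bins` direction sectors."""
    ang = math.atan2(dy, dx) % (2 * math.pi)
    return int(ang / (2 * math.pi / bins)) % bins

def dominant_flow_model(vectors, bins=8):
    """Estimate a direction distribution from training motion vectors
    (a histogram stand-in for the paper's LDA topic model)."""
    counts = Counter(direction_bin(dx, dy, bins) for dx, dy in vectors)
    total = sum(counts.values())
    return {b: counts.get(b, 0) / total for b in range(bins)}

def is_abnormal(model, dx, dy, threshold=0.1, bins=8):
    """Flag a motion vector whose direction is rare under the model."""
    return model.get(direction_bin(dx, dy, bins), 0.0) < threshold
```

Trained on mostly-rightward motion, the model flags a leftward vector as abnormal while accepting the dominant direction.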
Magnetic Braking: A Video Analysis
NASA Astrophysics Data System (ADS)
Molina-Bolívar, J. A.; Abella-Palacios, A. J.
2012-10-01
This paper presents a laboratory exercise that introduces students to the use of video analysis software through a demonstration of Lenz's law. Digital techniques have proved to be very useful for the understanding of physical concepts. In particular, the availability of affordable digital video offers students the opportunity to actively engage in kinematics in introductory-level physics.1,2 By using digital video's frame-advance feature and "marking" the position of a moving object in each frame, students are able to determine the position of an object at much smaller time increments than would be possible with common timing devices. Once the student collects data consisting of positions and times, these values may be manipulated to determine velocity and acceleration. There are a variety of commercial and free applications that can be used for video analysis. Because the relevant technology has become inexpensive, video analysis has become a prevalent tool in introductory physics courses.
Security Event Recognition for Visual Surveillance
NASA Astrophysics Data System (ADS)
Liao, W.; Yang, C.; Yang, M. Ying; Rosenhahn, B.
2017-05-01
With the rapidly increasing deployment of surveillance cameras, reliable methods for automatically analyzing surveillance video and recognizing special events are demanded by various practical applications. This paper proposes a novel, effective framework for security event analysis in surveillance videos. First, a convolutional neural network (CNN) framework is used to detect objects of interest in the given videos. Second, the owners of the objects are recognized and monitored in real time as well. If anyone moves an object, the system verifies whether this person is its owner. If not, the event is further analyzed and distinguished between two different scenes: moving the object away or stealing it. To validate the proposed approach, a new video dataset consisting of various scenarios is constructed for more complex tasks. For comparison purposes, experiments are also carried out on benchmark databases related to abandoned-luggage detection. The experimental results show that the proposed approach outperforms state-of-the-art methods and is effective in recognizing complex security events.
Problem Video Game Use and Dimensions of Psychopathology
ERIC Educational Resources Information Center
Starcevic, Vladan; Berle, David; Porter, Guy; Fenech, Pauline
2011-01-01
The objective of this study was to examine associations between problem video game use and psychopathology. The Video Game Use Questionnaire (VGUQ) and the Symptom Checklist 90 (SCL-90) were administered in an international anonymous online survey. The VGUQ was used to identify problem video game users and SCL-90 assessed dimensions of…
Federal Register 2010, 2011, 2012, 2013, 2014
2012-07-31
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-852] Certain Video Analytics Software... 337 of the Tariff Act of 1930, as amended, 19 U.S.C. 1337, on behalf of ObjectVideo, Inc. of Reston... sale within the United States after importation of certain video analytics software, systems...
Video Game Based Learning in English Grammar
ERIC Educational Resources Information Center
Singaravelu, G.
2008-01-01
The study examines the effectiveness of video-game-based learning in English grammar at standard VI. A video game package was prepared, consisting of self-learning activities presented in a playful manner that attracted the minds of young learners. Chief objective: to find out the effectiveness of video-game-based learning in English grammar.…
Shima, Yoichiro; Suwa, Akina; Gomi, Yuichiro; Nogawa, Hiroki; Nagata, Hiroshi; Tanaka, Hiroshi
2007-01-01
Real-time video pictures can be transmitted inexpensively via a broadband connection using the DVTS (digital video transport system). However, the degradation of video pictures transmitted by DVTS has not been sufficiently evaluated. We examined the application of DVTS to remote consultation by using images of laparoscopic and endoscopic surgeries. A subjective assessment by the double stimulus continuous quality scale (DSCQS) method of the transmitted video pictures was carried out by eight doctors. Three of the four video recordings were assessed as being transmitted with no degradation in quality. None of the doctors noticed any degradation in the images due to encryption by the VPN (virtual private network) system. We also used an automatic picture quality assessment system to make an objective assessment of the same images. The objective DSCQS values were similar to the subjective ones. We conclude that although the quality of video pictures transmitted by the DVTS was slightly reduced, they were useful for clinical purposes. Encryption with a VPN did not degrade image quality.
Rigid shape matching by segmentation averaging.
Wang, Hongzhi; Oliensis, John
2010-04-01
We use segmentations to match images by shape. The new matching technique does not require point-to-point edge correspondence and is robust to small shape variations and spatial shifts. To address the unreliability of segmentations computed bottom-up, we give a closed form approximation to an average over all segmentations. Our method has many extensions, yielding new algorithms for tracking, object detection, segmentation, and edge-preserving smoothing. For segmentation, instead of a maximum a posteriori approach, we compute the "central" segmentation minimizing the average distance to all segmentations of an image. For smoothing, instead of smoothing images based on local structures, we smooth based on the global optimal image structures. Our methods for segmentation, smoothing, and object detection perform competitively, and we also show promising results in shape-based tracking.
NASA Astrophysics Data System (ADS)
Xu, Chao; Zhou, Dongxiang; Zhai, Yongping; Liu, Yunhui
2015-12-01
This paper realizes the automatic segmentation and classification of Mycobacterium tuberculosis under conventional light microscopy. First, candidate bacillus objects are segmented by the marker-based watershed transform. The markers are obtained by adaptive threshold segmentation based on an adaptive-scale Gaussian filter, whose scale is determined according to the color model of the bacillus objects. The candidate objects are then extracted integrally after region merging and contamination elimination. Second, the shapes of the bacillus objects are characterized by Hu moments, compactness, eccentricity, and roughness, which are used to classify single, touching, and non-bacillus objects. We evaluated logistic regression, random forest, and intersection-kernel support vector machine classifiers on the bacillus objects. Experimental results demonstrate that the proposed method yields high robustness and accuracy. The logistic regression classifier performs best, with an accuracy of 91.68%.
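One of the shape cues named above is easy to illustrate. Below is a minimal sketch of a compactness feature, 4πA/P², with the perimeter approximated by counting exposed pixel edges of a binary mask; the paper's exact definitions may differ.

```python
import math

def compactness(mask):
    """Compactness 4*pi*A / P^2 of a binary mask, where the perimeter P
    counts pixel edges exposed to background or the image border.
    One of several shape cues (alongside Hu moments, eccentricity,
    roughness) a classifier stage could consume."""
    rows, cols = len(mask), len(mask[0])
    area = perim = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                area += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols) or not mask[nr][nc]:
                        perim += 1
    return 4 * math.pi * area / (perim * perim) if perim else 0.0
```

A filled square scores near π/4, while an elongated, bacillus-like strip scores much lower, which is what makes the feature discriminative for rod-shaped objects.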
Mishra, Ajay; Aloimonos, Yiannis
2009-01-01
The human visual system observes and understands a scene/image by making a series of fixations. Every fixation point lies inside a particular region of arbitrary shape and size in the scene, which can either be an object or just a part of it. We define as a basic segmentation problem the task of segmenting the region containing the fixation point. Segmenting the region containing the fixation is equivalent to finding the enclosing contour (a connected set of boundary edge fragments in the edge map of the scene) around the fixation. This enclosing contour should be a depth boundary. We present here a novel algorithm that finds this bounding contour and achieves the segmentation of one object, given the fixation. The proposed segmentation framework combines monocular cues (color/intensity/texture) with stereo and/or motion, in a cue-independent manner. The semantic robots of the immediate future will be able to use this algorithm to automatically find objects in any environment. The capability of automatically segmenting objects in their visual field can bring visual processing to the next level. Our approach is different from current approaches: while existing work attempts to segment the whole scene at once into many areas, we segment only one image region, specifically the one containing the fixation point. Experiments with real imagery collected by our active robot and from known databases demonstrate the promise of the approach.
2016-01-01
Passive content fingerprinting is widely used for video content identification and monitoring. However, many challenges remain unsolved especially for partial-copies detection. The main challenge is to find the right balance between the computational cost of fingerprint extraction and fingerprint dimension, without compromising detection performance against various attacks (robustness). Fast video detection performance is desirable in several modern applications, for instance, in those where video detection involves the use of large video databases or in applications requiring real-time video detection of partial copies, a process whose difficulty increases when videos suffer severe transformations. In this context, conventional fingerprinting methods are not fully suitable to cope with the attacks and transformations mentioned before, either because the robustness of these methods is not enough or because their execution time is very high, where the time bottleneck is commonly found in the fingerprint extraction and matching operations. Motivated by these issues, in this work we propose a content fingerprinting method based on the extraction of a set of independent binary global and local fingerprints. Although these features are robust against common video transformations, their combination is more discriminant against severe video transformations such as signal processing attacks, geometric transformations and temporal and spatial desynchronization. Additionally, we use an efficient multilevel filtering system accelerating the processes of fingerprint extraction and matching. This multilevel filtering system helps to rapidly identify potential similar video copies upon which the fingerprint process is carried out only, thus saving computational time. We tested with datasets of real copied videos, and the results show how our method outperforms state-of-the-art methods regarding detection scores. 
Furthermore, the granularity of our method makes it suitable for partial-copy detection; that is, by processing only short segments of 1 second length. PMID:27861492
Rivera, Reynaldo; Santos, David; Brändle, Gaspar; Cárdaba, Miguel Ángel M
2016-04-01
Exposure to media violence might have detrimental effects on psychological adjustment and is associated with aggression-related attitudes and behaviors. As a result, many media literacy programs were implemented to tackle that major public health issue. However, there is little evidence about their effectiveness. Evaluating design effectiveness, particularly regarding targeting process, would prevent adverse effects and improve the evaluation of evidence-based media literacy programs. The present research examined whether or not different relational lifestyles may explain the different effects of an antiviolence intervention program. Based on relational and lifestyles theory, the authors designed a randomized controlled trial and applied an analysis of variance 2 (treatment: experimental vs. control) × 4 (lifestyle classes emerged from data using latent class analysis: communicative vs. autonomous vs. meta-reflexive vs. fractured). Seven hundred and thirty-five Italian students distributed in 47 classes participated anonymously in the research (51.3% females). Participants completed a lifestyle questionnaire as well as their attitudes and behavioral intentions as the dependent measures. The results indicated that the program was effective in changing adolescents' attitudes toward violence. However, behavioral intentions toward consumption of violent video games were moderated by lifestyles. Those with communicative relational lifestyles showed fewer intentions to consume violent video games, while a boomerang effect was found among participants with problematic lifestyles. Adolescents' lifestyles played an important role in influencing the effectiveness of an intervention aimed at changing behavioral intentions toward the consumption of violent video games. For that reason, audience lifestyle segmentation analysis should be considered an essential technique for designing, evaluating, and improving media literacy programs. © The Author(s) 2016.
Belgiu, Mariana; Drăguţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. 
The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
Image Segmentation Method Using Fuzzy C Mean Clustering Based on Multi-Objective Optimization
NASA Astrophysics Data System (ADS)
Chen, Jinlin; Yang, Chunzhi; Xu, Guangkui; Ning, Li
2018-04-01
Image segmentation is not only one of the hottest topics in digital image processing, but also an important part of computer vision applications. Among image segmentation algorithms, fuzzy C-means (FCM) clustering is an effective and concise segmentation algorithm. However, the drawback of FCM is that it is sensitive to image noise. To solve this problem, this paper designs a novel fuzzy C-means clustering algorithm based on multi-objective optimization. We add a parameter λ to the fuzzy distance measurement formula to improve the multi-objective optimization; the parameter λ adjusts the weight of the pixels' local information. In the algorithm, the local correlation of neighboring pixels is added to the improved multi-objective mathematical model to optimize the clustering centers. Two experiments show that the novel fuzzy C-means approach achieves efficient performance and computation time while segmenting images corrupted by different types of noise.
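The FCM core that the paper extends alternates between a membership update and a center update. The sketch below shows only that standard alternation on scalar data; the paper's λ-weighted local-information term and the multi-objective model are omitted.

```python
def fcm(data, c=2, m=2.0, iters=50):
    """Core fuzzy C-means on scalar data: alternate the membership
    update u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)) with the center
    update v_i = sum_k u_ik^m x_k / sum_k u_ik^m."""
    lo, hi = min(data), max(data)
    centers = [lo + i * (hi - lo) / (c - 1) for i in range(c)]  # deterministic init
    eps = 1e-12  # avoid division by zero when a point coincides with a center
    for _ in range(iters):
        u = []
        for x in data:
            d = [abs(x - v) + eps for v in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0))
                                for j in range(c))
                      for i in range(c)])
        centers = [sum((u[k][i] ** m) * data[k] for k in range(len(data))) /
                   sum(u[k][i] ** m for k in range(len(data)))
                   for i in range(c)]
    return sorted(centers)
```

On two well-separated scalar clusters the centers converge to the cluster means; the noise sensitivity the paper targets arises because each pixel's membership here depends only on its own value, not its neighborhood.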
Satellite switched FDMA advanced communication technology satellite program
NASA Technical Reports Server (NTRS)
Atwood, S.; Higton, G. H.; Wood, K.; Kline, A.; Furiga, A.; Rausch, M.; Jan, Y.
1982-01-01
The satellite-switched frequency division multiple access system provided a detailed system architecture that supports a point-to-point communication system for long-haul voice, video and data traffic between small Earth terminals at Ka-band frequencies (30/20 GHz). A detailed system design is presented for the space segment, the small terminal/trunking segment, and the network control segment for domestic traffic model A or B, each totaling 3.8 Gb/s of small terminal traffic and 6.2 Gb/s of trunk traffic. The small terminal traffic (3.8 Gb/s) is emphasized for the satellite router portion of the system design, which is a composite of thousands of Earth stations with digital traffic ranging from a single 32 kb/s CVSD voice channel to thousands of channels containing voice, video and data at rates as high as 33 Mb/s. The system design concept presented effectively optimizes a unique frequency and channelization plan for both traffic models A and B with minimum reorganization of the satellite payload transponder subsystem hardware design. The unique zoning concept allows multiple-beam antennas while maximizing multiple-carrier frequency reuse. Detailed hardware design estimates for an FDMA router (part of the satellite transponder subsystem) indicate a weight and dc power budget of 353 lbs and 195 W for traffic model A, and 498 lbs and 244 W for traffic model B.
Objective video presentation QoE predictor for smart adaptive video streaming
NASA Astrophysics Data System (ADS)
Wang, Zhou; Zeng, Kai; Rehman, Abdul; Yeganeh, Hojatollah; Wang, Shiqi
2015-09-01
How to deliver videos to consumers over the network for optimal quality of experience (QoE) has been the central goal of modern video delivery services. Surprisingly, despite the large volume of videos delivered every day through various systems attempting to improve visual QoE, the actual QoE of end consumers is not properly assessed, let alone used as the key factor in making critical decisions at the video hosting, network, and receiving sites. Real-world video streaming systems typically use bitrate as the main video presentation quality indicator, but using the same bitrate to encode different video content can result in drastically different visual QoE, which is further affected by the display device and viewing condition of each individual consumer who receives the video. To correct this, we have to put QoE back in the driver's seat and redesign video delivery systems. To achieve this goal, a major challenge is to find an objective video presentation QoE predictor that is accurate, fast, easy to use, display-device adaptive, and provides meaningful QoE predictions across resolutions and content. We propose the newly developed SSIMplus index (https://ece.uwaterloo.ca/~z70wang/research/ssimplus/) for this role. We demonstrate that, based on SSIMplus, one can develop a smart adaptive video streaming strategy that leads to much smoother visual QoE than is achievable with existing adaptive-bitrate video streaming approaches. Furthermore, SSIMplus finds many more applications: in live and file-based quality monitoring, in benchmarking video encoders and transcoders, and in guiding network resource allocation.
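SSIMplus builds on the SSIM index, extending it with display-device and viewing-condition adaptivity. As a point of reference, the base SSIM comparison of luminance, contrast, and structure can be sketched as a single-window computation (practical SSIM is computed over local windows and averaged; this global simplification is only illustrative).

```python
def ssim_global(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equal-length pixel sequences:
    ((2*mu_x*mu_y + C1)(2*cov + C2)) /
    ((mu_x^2 + mu_y^2 + C1)(var_x + var_y + C2))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))
```

An identical pair scores 1.0, and any luminance, contrast, or structural distortion lowers the score; a QoE-driven streaming controller would pick the encoding ladder rung maximizing such a perceptual score rather than raw bitrate.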
Binding and segmentation via a neural mass model trained with Hebbian and anti-Hebbian mechanisms.
Cona, Filippo; Zavaglia, Melissa; Ursino, Mauro
2012-04-01
Synchronization of neural activity in the gamma band, modulated by a slower theta rhythm, is assumed to play a significant role in binding and segmentation of multiple objects. In the present work, a recent neural mass model of a single cortical column is used to analyze the synaptic mechanisms which can warrant synchronization and desynchronization of cortical columns, during an autoassociation memory task. The model considers two distinct layers communicating via feedforward connections. The first layer receives the external input and works as an autoassociative network in the theta band, to recover a previously memorized object from incomplete information. The second realizes segmentation of different objects in the gamma band. To this end, units within both layers are connected with synapses trained on the basis of previous experience to store objects. The main model assumptions are: (i) recovery of incomplete objects is realized by excitatory synapses from pyramidal to pyramidal neurons in the same object; (ii) binding in the gamma range is realized by excitatory synapses from pyramidal neurons to fast inhibitory interneurons in the same object. These synapses (both at points i and ii) have a few ms dynamics and are trained with a Hebbian mechanism. (iii) Segmentation is realized with faster AMPA synapses, with rise times smaller than 1 ms, trained with an anti-Hebbian mechanism. Results show that the model, with the previous assumptions, can correctly reconstruct and segment three simultaneous objects, starting from incomplete knowledge. Segmentation of more objects is possible but requires an increased ratio between the theta and gamma periods.
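In their simplest rate-based form, the Hebbian and anti-Hebbian training rules named above amount to a signed correlational weight update. A minimal sketch follows; the paper's model uses full neural-mass dynamics with synapse-specific time constants (ms-scale Hebbian synapses versus sub-ms AMPA synapses), all of which this omits.

```python
def hebbian_update(w, x, y, eta=0.1, anti=False):
    """One rate-based (anti-)Hebbian step: dw_i = +/- eta * x_i * y.
    Hebbian growth strengthens synapses among co-active units (binding
    within one object); the anti-Hebbian sign weakens co-active fast
    synapses, supporting segmentation of distinct objects."""
    sign = -1.0 if anti else 1.0
    return [wi + sign * eta * xi * y for wi, xi in zip(w, x)]
```

Repeated presentation of a stored pattern thus carves positive within-object weights, while the anti-Hebbian pathway drives competing assemblies apart in phase.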
NASA Technical Reports Server (NTRS)
2003-01-01
This video presents an overview of the first Tracking and Data Relay Satellite (TDRS-1) in the form of text, computer animations, footage, and an interview with its program manager. Launched by the Space Shuttle Challenger in 1983, TDRS-1 was the first of a network of satellites used for relaying data to and from scientific spacecraft. Most of this short video is silent, and consists of footage and animation of the deployment of TDRS-1, written and animated explanations of what TDRS satellites do, and samples of the astronomical and Earth science data they transmit. The program manager explains in the final segment of the video the improvement TDRS satellites brought to communication with manned space missions, including alleviation of blackout during reentry, and also the role TDRS-1 played in providing telemedicine for a breast cancer patient in Antarctica.
Statistical modelling of subdiffusive dynamics in the cytoplasm of living cells: A FARIMA approach
NASA Astrophysics Data System (ADS)
Burnecki, K.; Muszkieta, M.; Sikora, G.; Weron, A.
2012-04-01
Golding and Cox (Phys. Rev. Lett., 96 (2006) 098102) tracked the motion of individual fluorescently labelled mRNA molecules inside live E. coli cells. They found that in the set of 23 trajectories from 3 different experiments, the automatically recognized motion is subdiffusive and published an intriguing microscopy video. Here, we extract the corresponding time series from this video by image segmentation method and present its detailed statistical analysis. We find that this trajectory was not included in the data set already studied and has different statistical properties. It is best fitted by a fractional autoregressive integrated moving average (FARIMA) process with the normal-inverse Gaussian (NIG) noise and the negative memory. In contrast to earlier studies, this shows that the fractional Brownian motion is not the best model for the dynamics documented in this video.
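Subdiffusion of the kind discussed here is commonly diagnosed by fitting the time-averaged mean-squared displacement to MSD(τ) ∝ τ^α and checking whether α < 1. The sketch below estimates α as the log-log regression slope of a 1D trajectory; the paper's FARIMA/NIG model fitting goes far beyond this elementary check.

```python
import math

def msd_exponent(traj, max_lag=10):
    """Log-log slope of the time-averaged mean-squared displacement,
    MSD(tau) ~ tau^alpha: alpha < 1 indicates subdiffusion, alpha = 1
    normal diffusion, alpha = 2 ballistic motion."""
    lags, msds = [], []
    for lag in range(1, max_lag + 1):
        disp = [(traj[i + lag] - traj[i]) ** 2 for i in range(len(traj) - lag)]
        lags.append(math.log(lag))
        msds.append(math.log(sum(disp) / len(disp)))
    # least-squares slope of log(MSD) versus log(lag)
    n = len(lags)
    mx, my = sum(lags) / n, sum(msds) / n
    return (sum((a - mx) * (b - my) for a, b in zip(lags, msds)) /
            sum((a - mx) ** 2 for a in lags))
```

Applied to coordinates extracted from tracking video (as in the image-segmentation step described above), an exponent well below 1 would flag the subdiffusive regime.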
Object-oriented approach to the automatic segmentation of bones from pediatric hand radiographs
NASA Astrophysics Data System (ADS)
Shim, Hyeonjoon; Liu, Brent J.; Taira, Ricky K.; Hall, Theodore R.
1997-04-01
The purpose of this paper is to develop a robust and accurate method that automatically segments phalangeal and epiphyseal bones from digital pediatric hand radiographs exhibiting various stages of growth. The development of this system draws principles from object-oriented design, model-guided analysis, and feedback control. A system architecture called 'the object segmentation machine' was implemented incorporating these design philosophies. The system is aided by a knowledge base where all model contours and other information, such as age, race, and sex, are stored. These models include object structure models, shape models, 1-D wrist profiles, and gray-level histogram models. Shape analysis is performed first by using an arc-length orientation transform to break down a given contour into elementary segments and curves. Then an interpretation tree is used as an inference engine to map known model contour segments to data contour segments obtained from the transform. Spatial and anatomical relationships among contour segments act as constraints from the shape model and aid in generating a list of candidate matches. The candidate match with the highest confidence is chosen as the current intermediate result. Verification of intermediate results is performed by a feedback control loop.
NASA Astrophysics Data System (ADS)
Yin, Y.; Sonka, M.
2010-03-01
A novel method is presented for the definition of search lines in a variety of surface segmentation approaches. The method is inspired by the properties of electric field direction lines and is applicable to general-purpose n-D shape-based image segmentation tasks. Its utility is demonstrated in graph construction and optimal segmentation of multiple mutually interacting objects. The properties of the electric-field-based graph construction guarantee that inter-object graph connecting lines are non-intersecting and inherently cover the entire object-interaction space. When applied to inter-object cross-surface mapping, our approach generates one-to-one and all-to-all vertex-correspondent pairs between the regions of mutual interaction. We demonstrate the benefits of the electric field approach in several examples, ranging from relatively simple single-surface segmentation to complex multi-object, multi-surface segmentation of femur-tibia cartilage. The performance of our approach is demonstrated on 60 MR images from the Osteoarthritis Initiative (OAI), in which it achieved very good performance as judged by surface positioning errors (averages of 0.29 and 0.59 mm for signed and unsigned cartilage positioning errors, respectively).
Joint Attributes and Event Analysis for Multimedia Event Detection.
Ma, Zhigang; Chang, Xiaojun; Xu, Zhongwen; Sebe, Nicu; Hauptmann, Alexander G
2017-06-15
Semantic attributes have been increasingly used in recent years for multimedia event detection (MED) with promising results. The motivation is that multimedia events generally consist of lower-level components such as objects, scenes, and actions. By characterizing multimedia event videos with semantic attributes, one can exploit more informative cues for improved detection results. Much existing work obtains semantic attributes from images, which may be suboptimal for video analysis since these image-inferred attributes do not carry the dynamic information that is essential for videos. To address this issue, we propose to learn semantic attributes from external videos using their semantic labels; we name them video attributes in this paper. In contrast with multimedia event videos, these external videos depict lower-level contents such as objects, scenes, and actions. To harness video attributes, we propose an algorithm built on a correlation vector that correlates them to a target event. Consequently, we can incorporate video attributes latently as extra information into the event detector learnt from multimedia event videos in a joint framework. To validate our method, we perform experiments on the real-world large-scale TRECVID MED 2013 and 2014 data sets and compare our method with several state-of-the-art algorithms. The experiments show that our method is advantageous for MED.
Detecting and Analyzing Multiple Moving Objects in Crowded Environments with Coherent Motion Regions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cheriyadat, Anil M.
Understanding the world around us from large-scale video data requires vision systems that can perform automatic interpretation. While human eyes can unconsciously perceive independent objects in crowded scenes and other challenging operating environments, automated systems have difficulty detecting, counting, and understanding their behavior in similar scenes. Computer scientists at ORNL have developed a technology termed "Coherent Motion Region Detection" that involves identifying multiple independent moving objects in crowded scenes by aggregating low-level motion cues extracted from moving objects. Humans and other species exploit such low-level motion cues seamlessly to perform perceptual grouping for visual understanding. The algorithm detects and tracks feature points on moving objects, resulting in partial trajectories that span coherent 3D regions in the space-time volume defined by the video. In the case of multi-object motion, many possible coherent motion regions can be constructed around the set of trajectories. The unique approach in the algorithm is to identify all possible coherent motion regions, then extract a subset of motion regions based on an innovative measure to automatically locate moving objects in crowded environments. The software reports a snapshot of each object, a count, and derived statistics (count over time) from input video streams. The software can directly process videos streamed over the internet or directly from a hardware device (camera).
Template-Based 3D Reconstruction of Non-rigid Deformable Object from Monocular Video
NASA Astrophysics Data System (ADS)
Liu, Yang; Peng, Xiaodong; Zhou, Wugen; Liu, Bo; Gerndt, Andreas
2018-06-01
In this paper, we propose a template-based 3D surface reconstruction system for non-rigid deformable objects from a monocular video sequence. First, we generate a semi-dense template of the target object with a structure-from-motion method using a subsequence of the video, which can be captured by a rigidly moving camera observing the static target object or by a static camera observing the rigidly moving target object. Then, with the reference template mesh as input and within the framework of classical template-based methods, we solve an energy minimization problem to find the correspondence between the template and every frame, obtaining the time-varying mesh that represents the object's deformation. The energy terms combine photometric cost, temporal and spatial smoothness costs, and an as-rigid-as-possible cost that enables elastic deformation. In this paper, an easy and controllable solution for generating the semi-dense template for complex objects is presented. Besides, we use an effective iterative Schur-based linear solver for the energy minimization problem. The experimental evaluation presents qualitative reconstruction results for deforming objects on real sequences. Compared against results obtained with other templates as input, reconstructions based on our template are more accurate and detailed in certain regions. The experimental results also show that the linear solver we used is more efficient than a traditional conjugate-gradient-based solver.
The emerging High Efficiency Video Coding standard (HEVC)
NASA Astrophysics Data System (ADS)
Raja, Gulistan; Khan, Awais
2013-12-01
High definition video (HDV) is becoming more popular day by day. This paper describes a performance analysis of the latest video coding standard, known as High Efficiency Video Coding (HEVC). HEVC is designed to fulfil the requirements of future high definition video. In this paper, three configurations of HEVC (intra only, low delay, and random access) are analyzed using various 480p, 720p and 1080p high definition test video sequences. Simulation results show the superior objective and subjective quality of HEVC.
Object Segmentation Methods for Online Model Acquisition to Guide Robotic Grasping
NASA Astrophysics Data System (ADS)
Ignakov, Dmitri
A vision system is an integral component of many autonomous robots. It enables the robot to perform essential tasks such as mapping, localization, or path planning. A vision system also assists with guiding the robot's grasping and manipulation tasks. As an increased demand is placed on service robots to operate in uncontrolled environments, advanced vision systems must be created that can function effectively in visually complex and cluttered settings. This thesis presents the development of segmentation algorithms to assist in online model acquisition for guiding robotic manipulation tasks. Specifically, the focus is placed on localizing door handles to assist in robotic door opening, and on acquiring partial object models to guide robotic grasping. First, a method for localizing a door handle of unknown geometry based on a proposed 3D segmentation method is presented. Following segmentation, localization is performed by fitting a simple box model to the segmented handle. The proposed method functions without requiring assumptions about the appearance of the handle or the door, and without a geometric model of the handle. Next, an object segmentation algorithm is developed, which combines multiple appearance (intensity and texture) and geometric (depth and curvature) cues. The algorithm is able to segment objects without utilizing any a priori appearance or geometric information in visually complex and cluttered environments. The segmentation method is based on the Conditional Random Fields (CRF) framework, and the graph cuts energy minimization technique. A simple and efficient method for initializing the proposed algorithm which overcomes graph cuts' reliance on user interaction is also developed. Finally, an improved segmentation algorithm is developed which incorporates a distance metric learning (DML) step as a means of weighing various appearance and geometric segmentation cues, allowing the method to better adapt to the available data. 
The improved method also models the distribution of 3D points in space as a distribution of algebraic distances from an ellipsoid fitted to the object, improving the method's ability to predict which points are likely to belong to the object or the background. Experimental validation of all methods is performed. Each method is evaluated in a realistic setting, utilizing scenarios of various complexities. Experimental results have demonstrated the effectiveness of the handle localization method, and the object segmentation methods.
Gena, Angeliki; Couloura, Sophia; Kymissis, Effie
2005-10-01
The purpose of this study was to modify the affective behavior of three preschoolers with autism in home settings and in the context of play activities, and to compare the effects of video modeling to the effects of in-vivo modeling in teaching these children contextually appropriate affective responses. A multiple-baseline design across subjects, with a return to baseline condition, was used to assess the effects of treatment that consisted of reinforcement, video modeling, in-vivo modeling, and prompting. During training trials, reinforcement in the form of verbal praise and tokens was delivered contingent upon appropriate affective responding. Error correction procedures differed for each treatment condition. In the in-vivo modeling condition, the therapist used modeling and verbal prompting. In the video modeling condition, video segments of a peer modeling the correct response and verbal prompting by the therapist were used as corrective procedures. Participants received treatment in three categories of affective behavior--sympathy, appreciation, and disapproval--and were presented with a total of 140 different scenarios. The study demonstrated that both treatments--video modeling and in-vivo modeling--systematically increased appropriate affective responding in all response categories for the three participants. Additionally, treatment effects generalized across responses to untrained scenarios, the child's mother, new therapists, and time.
Action Spotting and Recognition Based on a Spatiotemporal Orientation Analysis.
Derpanis, Konstantinos G; Sizintsev, Mikhail; Cannons, Kevin J; Wildes, Richard P
2013-03-01
This paper provides a unified framework for the interrelated topics of action spotting, the spatiotemporal detection and localization of human actions in video, and action recognition, the classification of a given video into one of several predefined categories. A novel compact local descriptor of video dynamics in the context of action spotting and recognition is introduced based on visual spacetime oriented energy measurements. This descriptor is efficiently computed directly from raw image intensity data and thereby forgoes the problems typically associated with flow-based features. Importantly, the descriptor allows for the comparison of the underlying dynamics of two spacetime video segments irrespective of spatial appearance, such as differences induced by clothing, and with robustness to clutter. An associated similarity measure is introduced that admits efficient exhaustive search for an action template, derived from a single exemplar video, across candidate video sequences. The general approach presented for action spotting and recognition is amenable to efficient implementation, which is deemed critical for many important applications. For action spotting, details of a real-time GPU-based instantiation of the proposed approach are provided. Empirical evaluation of both action spotting and action recognition on challenging datasets suggests the efficacy of the proposed approach, with state-of-the-art performance documented on standard datasets.
Generating OER by Recording Lectures: A Case Study
ERIC Educational Resources Information Center
Llamas-Nistal, Martín; Mikic-Fonte, Fernando A.
2014-01-01
The University of Vigo, Vigo, Spain, has the objective of making all the teaching material generated by its teachers freely available. To attain this objective, it encourages the development of Open Educational Resources, especially videos. This paper presents an experience of recording lectures and generating the corresponding videos as a step…
Foreign Language Students' Conversational Negotiations in Different Task Environments
ERIC Educational Resources Information Center
Hardy, Ilonca M.; Moore, Joyce L.
2004-01-01
This study examined the effect of structural and content characteristics of language tasks on foreign language learners' conversational negotiations. In a 2x2 Greco-Latin square design, degree of structural support of language tasks, students' degree of familiarity with German video segments, and task order were varied. Twenty-eight pairs of…
Hubble Identifies Source of Ultraviolet Light in an Old Galaxy
NASA Technical Reports Server (NTRS)
2000-01-01
This videotape comprises four segments: (1) a video zoom-in on galaxy M32 using ground-based images, (2) Hubble images of galaxy M32, (3) a ground-based color image of galaxies M31 and M32, and (4) black-and-white ground-based images of galaxy M32.
Automatic Online Lecture Highlighting Based on Multimedia Analysis
ERIC Educational Resources Information Center
Che, Xiaoyin; Yang, Haojin; Meinel, Christoph
2018-01-01
Textbook highlighting is widely considered to be beneficial for students. In this paper, we propose a comprehensive solution to highlight the online lecture videos in both sentence- and segment-level, just as is done with paper books. The solution is based on automatic analysis of multimedia lecture materials, such as speeches, transcripts, and…
Affect Response to Simulated Information Attack during Complex Task Performance
2014-12-02
The study examined how situational awareness, affect, and trait characteristics interact with human performance during cyberspace attacks in the physical and information domains. Operator state was manipulated using emotional stimulation portrayed through the presentation of video segments. The effect of emotions on…
MILE Curriculum [and Nine CD-ROM Lessons].
ERIC Educational Resources Information Center
Reiman, John
This curriculum on money management skills for deaf adolescent and young adult students is presented on nine video CD-ROMs as well as in a print version. The curriculum was developed following a survey of the needs of school and rehabilitation programs. It was also piloted and subsequently revised. Each teaching segment is presented in sign…
ERIC Educational Resources Information Center
Jones, Rachel; Hall, Sara White; Thigpen, Kamila; Murray, Tom; Loschert, Kristen
2015-01-01
This report demonstrates how one predominantly low-income school district dramatically improved student engagement in the classroom and increased high school graduation rates through project-based learning (PBL) and the effective use of technology. The report, which includes short video segments with educators and students, focuses on Talladega…
Zhang, Lei; Zeng, Zhi; Ji, Qiang
2011-09-01
Chain graph (CG) is a hybrid probabilistic graphical model (PGM) capable of modeling heterogeneous relationships among random variables. So far, however, its application in image and video analysis has been very limited due to the lack of principled learning and inference methods for a CG of general topology. To overcome this limitation, we introduce methods to extend the conventional chain-like CG model to a CG model with more general topology, along with the associated methods for learning and inference in such a general CG model. Specifically, we propose techniques to systematically construct a generally structured CG, to parameterize this model, to derive its joint probability distribution, to perform joint parameter learning, and to perform probabilistic inference in this model. To demonstrate the utility of such an extended CG, we apply it to two challenging image and video analysis problems: human activity recognition and image segmentation. The experimental results show improved performance of the extended CG model over the conventional directed or undirected PGMs. This study demonstrates the promise of the extended CG for effective modeling and inference of complex real-world problems.
Lip reading using neural networks
NASA Astrophysics Data System (ADS)
Kalbande, Dhananjay; Mishra, Akassh A.; Patil, Sanjivani; Nirgudkar, Sneha; Patel, Prashant
2011-10-01
Computerized lip reading, or speech reading, is concerned with the difficult task of converting a video signal of a speaking person into written text. It has several applications, such as teaching deaf people to speak and communicate effectively with others, crime fighting, and invariance to the acoustic environment. We convert video of the subject speaking vowels into images, which are then selected manually for processing. However, several factors such as fast speech, poor pronunciation, poor illumination, movement of the face, moustaches, and beards make lip reading difficult. Contour tracking methods and template matching are used to extract the lips from the face. A k-nearest-neighbor (KNN) algorithm is then used to classify the 'speaking' images and the 'silent' images. The sequence of images is then transformed into segments of utterances. A feature vector is calculated for each frame in every segment and stored in the database with a properly labeled class. Character recognition is performed using a modified KNN algorithm that assigns more weight to nearer neighbors. This paper reports the recognition of vowels using KNN algorithms.
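The "modified KNN" described above weights nearer neighbors more heavily. A hedged sketch of one common realization, inverse-distance weighting, is below; the paper's exact features and weighting scheme are not given, so the data and labels here are made up:

```python
# Distance-weighted k-nearest-neighbor classification: each of the k
# nearest training points votes with weight 1/distance, so closer
# neighbors dominate the decision.
import math

def weighted_knn(train, query, k=3):
    """train: list of (feature_vector, label) pairs; returns the label
    whose k nearest neighbors carry the most inverse-distance weight."""
    nearest = sorted(
        (math.dist(x, query), label) for x, label in train
    )[:k]
    scores = {}
    for d, label in nearest:
        scores[label] = scores.get(label, 0.0) + 1.0 / (d + 1e-9)
    return max(scores, key=scores.get)

# Toy 2D features standing in for per-frame lip descriptors.
train = [((0.0, 0.0), "silent"), ((0.1, 0.0), "silent"),
         ((1.0, 1.0), "speaking"), ((0.9, 1.1), "speaking")]
print(weighted_knn(train, (0.95, 1.0)))  # speaking
```

The small epsilon guards against division by zero when a query coincides exactly with a training point.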
Local and global evaluation for remote sensing image segmentation
NASA Astrophysics Data System (ADS)
Su, Tengfei; Zhang, Shengwei
2017-08-01
In object-based image analysis, producing accurate segmentation is usually an important issue that must be solved before image classification or target recognition, and the study of segmentation evaluation methods is key to solving it. Almost all existing evaluation strategies focus only on global performance assessment. However, such methods are ineffective when two segmentation results with very similar overall performance have very different local error distributions. To overcome this problem, this paper presents an approach that can quantify segmentation incorrectness both locally and globally. In doing so, region-overlapping metrics are used to quantify each reference geo-object's over- and under-segmentation error. These quantified error values are used to produce segmentation error maps, which effectively delineate local segmentation error patterns. The error values for all of the reference geo-objects are aggregated using area-weighted summation, so that global indicators can be derived. An experiment using two scenes of very different high-resolution images showed that the global evaluation part of the proposed approach was almost as effective as two other global evaluation methods, and the local part was a useful complement for comparing different segmentation results.
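The local-to-global aggregation step above (area-weighted summation of per-object errors) is simple enough to sketch directly; the field names and error values below are assumptions, not the paper's data:

```python
# Aggregate per-reference-object segmentation errors (values in [0, 1])
# into a single global indicator by area-weighted summation, so that
# larger geo-objects contribute proportionally more.

def global_error(objects):
    """objects: list of dicts with 'area' and per-object 'error'."""
    total_area = sum(o["area"] for o in objects)
    return sum(o["area"] * o["error"] for o in objects) / total_area

# A small object segmented well and a large object segmented poorly.
refs = [{"area": 100, "error": 0.10},
        {"area": 300, "error": 0.30}]
print(round(global_error(refs), 3))  # 0.25
```

Keeping the per-object values (rather than only the aggregate) is what enables the error maps the paper describes.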
Content-based intermedia synchronization
NASA Astrophysics Data System (ADS)
Oh, Dong-Young; Sampath-Kumar, Srihari; Rangan, P. Venkat
1995-03-01
Inter-media synchronization methods developed until now have been based on syntactic timestamping of video frames and audio samples. These methods are not fully appropriate for the synchronization of multimedia objects that may have to be accessed individually by their contents, e.g. content-based data retrieval. We propose a content-based multimedia synchronization scheme in which a media stream is viewed as a hierarchical composition of smaller objects that are logically structured based on content, and synchronization is achieved by deriving temporal relations among the logical units of a media object. Content-based synchronization offers several advantages, such as eliminating the need for time stamping, freedom from the limitations of jitter, synchronization of independently captured media objects in video editing, and compensation for inherent asynchronies in the capture times of video and audio.
Segmentation of touching mycobacterium tuberculosis from Ziehl-Neelsen stained sputum smear images
NASA Astrophysics Data System (ADS)
Xu, Chao; Zhou, Dongxiang; Liu, Yunhui
2015-12-01
Touching Mycobacterium tuberculosis objects in Ziehl-Neelsen stained sputum smear images present different shapes and invisible boundaries in the adhesion areas, which increases the difficulty of object recognition and counting. In this paper, we present a segmentation method that combines hierarchy tree analysis with a gradient vector flow snake to address this problem. The skeletons of the objects are used for structure analysis based on the hierarchy tree, and the gradient vector flow snake is used to estimate the object edges. Experimental results show that the single objects composing the touching objects are successfully segmented by the proposed method. This work will improve the accuracy and practicability of computer-aided diagnosis of tuberculosis.
NASA Astrophysics Data System (ADS)
Lee, Joohwi; Kim, Sun Hyung; Styner, Martin
2016-03-01
The delineation of rodent brain structures is challenging due to low-contrast multiple cortical and subcortical organs that closely interface with each other. Atlas-based segmentation has been widely employed due to its ability to delineate multiple organs at the same time via image registration. The use of multiple atlases and subsequent label fusion techniques has further improved the robustness and accuracy of atlas-based segmentation. However, the accuracy of atlas-based segmentation is still prone to registration errors; for example, the segmentation of in vivo MR images can be less accurate and robust against image artifacts than the segmentation of post mortem images. In order to improve the accuracy and robustness of atlas-based segmentation, we propose a multi-object, model-based, multi-atlas segmentation method. We first establish spatial correspondences across atlases using a set of dense pseudo-landmark particles. We build a multi-object point distribution model using those particles in order to capture inter- and intra-subject variation among brain structures. The segmentation is obtained by fitting the model to a subject image, followed by a label fusion process. Our results show that the proposed method achieves greater accuracy than comparable segmentation methods, including the widely used ANTs registration tool.
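The label fusion step mentioned above is, in its simplest form, per-voxel majority voting across the labels proposed by each registered atlas. A minimal sketch (the paper combines fusion with a point-distribution model, which is not reproduced here):

```python
# Majority-vote label fusion: each atlas contributes one label per voxel,
# and the most frequent label wins at each voxel position.
from collections import Counter

def majority_vote(atlas_labels):
    """atlas_labels: list of equal-length label lists, one per atlas."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*atlas_labels)]

# Three atlases labeling four voxels (0 = background, 1/2 = structures).
atlas_a = [0, 1, 1, 2]
atlas_b = [0, 1, 2, 2]
atlas_c = [0, 0, 1, 2]
print(majority_vote([atlas_a, atlas_b, atlas_c]))  # [0, 1, 1, 2]
```

More sophisticated fusion schemes weight each atlas by local registration quality rather than voting uniformly.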
Improvement and Extension of Shape Evaluation Criteria in Multi-Scale Image Segmentation
NASA Astrophysics Data System (ADS)
Sakamoto, M.; Honda, Y.; Kondo, A.
2016-06-01
Over the last decade, multi-scale image segmentation has attracted particular interest and is practically used for object-based image analysis. In this study, we address issues in multi-scale image segmentation, especially improving the validity of merging and the variety of derived regions' shapes. Firstly, we introduce constraints on the application of the spectral criterion that suppress excessive merging between dissimilar regions. Secondly, we extend the evaluation of the smoothness criterion by modifying the definition of the extent of an object, which controls shape diversity. Thirdly, we develop a new shape criterion called aspect ratio. This criterion helps improve the reproducibility of object shapes so that they match the actual objects of interest: it constrains the aspect ratio of an object's bounding box while keeping the properties controlled by conventional shape criteria. These improvements and extensions lead to more accurate, flexible, and diverse segmentation results according to the shape characteristics of the targets of interest. Furthermore, we also investigate a technique for quantitative and automatic parameterization of multi-scale image segmentation. This is achieved by comparing the segmentation result with a training area specified in advance, either maximizing the average area of the derived objects or satisfying an evaluation index called the F-measure. Thus it becomes possible to automate a parameterization suited to the objectives, especially from the viewpoint of shape reproducibility.
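The aspect-ratio criterion described above can be illustrated with a toy merge test on bounding boxes; the threshold and function names are hypothetical, not the paper's formulation:

```python
# Illustrative aspect-ratio shape criterion: reject region merges whose
# combined bounding box becomes too elongated in either direction.

def aspect_ratio(pixels):
    """pixels: iterable of (row, col); returns bounding-box width/height."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return width / height

def merge_allowed(pixels_a, pixels_b, max_ratio=3.0):
    """Allow a merge only if the combined box stays near-square enough."""
    ratio = aspect_ratio(list(pixels_a) + list(pixels_b))
    return ratio <= max_ratio and 1.0 / ratio <= max_ratio

square = [(0, 0), (0, 1), (1, 0), (1, 1)]
far_strip = [(0, 9)]
print(merge_allowed(square, square))     # True
print(merge_allowed(square, far_strip))  # False
```

In a real multi-scale segmentation this test would be one term in the merging cost alongside the spectral and smoothness criteria.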
Robust Feedback Zoom Tracking for Digital Video Surveillance
Zou, Tengyue; Tang, Xiaoqi; Song, Bao; Wang, Jin; Chen, Jihong
2012-01-01
Zoom tracking is an important function in video surveillance, particularly in traffic management and security monitoring. It involves keeping an object of interest in focus during the zoom operation. Zoom tracking is typically achieved by moving the zoom and focus motors in lenses following the so-called "trace curve", which shows the in-focus motor positions versus the zoom motor positions for a specific object distance. The main task of a zoom tracking approach is to accurately estimate the trace curve for the specified object. Because a proportional-integral-derivative (PID) controller has historically been considered the best controller in the absence of knowledge of the underlying process, and because of its high-quality performance in motor control, in this paper we propose a novel feedback zoom tracking (FZT) approach based on geometric trace curve estimation and a PID feedback controller. The performance of this approach is compared with existing zoom tracking methods in digital video surveillance. Real-time implementation results obtained on an actual digital video platform indicate that the developed FZT approach not only solves the traditional one-to-many mapping problem without pre-training but also improves robustness for tracking moving or switching objects, which is the key challenge in video surveillance. PMID:22969388
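A textbook discrete PID controller of the kind the feedback zoom-tracking approach builds on is sketched below; the gains and the idealized focus-motor plant are illustrative assumptions, not the paper's tuning:

```python
# Discrete PID controller: output = Kp*e + Ki*integral(e) + Kd*de/dt.
# Here it drives a simulated focus-motor position toward a target
# position on the trace curve.

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt=1.0):
        self.integral += error * dt
        deriv = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * deriv

pid, position, target = PID(0.5, 0.05, 0.1), 0.0, 10.0
for _ in range(50):
    position += pid.step(target - position)  # plant: pure integrator
print(abs(target - position) < 0.5)  # True
```

The integral term removes steady-state error and the derivative term damps overshoot; in the FZT setting the error signal would come from a focus measure rather than a known target position.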
Medical Image Segmentation by Combining Graph Cut and Oriented Active Appearance Models
Chen, Xinjian; Udupa, Jayaram K.; Bağcı, Ulaş; Zhuge, Ying; Yao, Jianhua
2017-01-01
In this paper, we propose a novel 3D segmentation method based on the effective combination of the active appearance model (AAM), live wire (LW), and graph cut (GC). The proposed method consists of three main parts: model building, initialization, and segmentation. In the model building part, we construct the AAM and train the LW cost function and GC parameters. In the initialization part, a novel algorithm is proposed for improving the conventional AAM matching method, which effectively combines the AAM and LW method, resulting in Oriented AAM (OAAM). A multi-object strategy is utilized to help in object initialization. We employ a pseudo-3D initialization strategy, and segment the organs slice by slice via multi-object OAAM method. For the segmentation part, a 3D shape constrained GC method is proposed. The object shape generated from the initialization step is integrated into the GC cost computation, and an iterative GC-OAAM method is used for object delineation. The proposed method was tested in segmenting the liver, kidneys, and spleen on a clinical CT dataset and also tested on the MICCAI 2007 grand challenge for liver segmentation training dataset. The results show the following: (a) An overall segmentation accuracy of true positive volume fraction (TPVF) > 94.3%, false positive volume fraction (FPVF) < 0.2% can be achieved. (b) The initialization performance can be improved by combining AAM and LW. (c) The multi-object strategy greatly facilitates the initialization. (d) Compared to the traditional 3D AAM method, the pseudo 3D OAAM method achieves comparable performance while running 12 times faster. (e) The performance of proposed method is comparable to the state of the art liver segmentation algorithm. The executable version of 3D shape constrained GC with user interface can be downloaded from website http://xinjianchen.wordpress.com/research/. PMID:22311862
System and method for automated object detection in an image
Kenyon, Garrett T.; Brumby, Steven P.; George, John S.; Paiton, Dylan M.; Schultz, Peter F.
2015-10-06
A contour/shape detection model may use relatively simple and efficient kernels to detect target edges in an object within an image or video. A co-occurrence probability may be calculated for two or more edge features in an image or video using an object definition. Edge features may be differentiated between in response to measured contextual support, and prominent edge features may be extracted based on the measured contextual support. The object may then be identified based on the extracted prominent edge features.
Visualizing and Writing Video Programs.
ERIC Educational Resources Information Center
Floyd, Steve
1979-01-01
Reviews 10 steps which serve as guidelines to simplify the creative process of producing a video training program: (1) audience analysis, (2) task analysis, (3) definition of objective, (4) conceptualization, (5) visualization, (6) storyboard, (7) video storyboard, (8) evaluation, (9) revision, and (10) production. (LRA)
Focused Assessment with Sonography for Trauma in weightlessness: a feasibility study
NASA Technical Reports Server (NTRS)
Kirkpatrick, Andrew W.; Hamilton, Douglas R.; Nicolaou, Savvas; Sargsyan, Ashot E.; Campbell, Mark R.; Feiveson, Alan; Dulchavsky, Scott A.; Melton, Shannon; Beck, George; Dawson, David L.
2003-01-01
BACKGROUND: The Focused Assessment with Sonography for Trauma (FAST) examines for fluid in gravitationally dependent regions. There is no prior experience with this technique in weightlessness, such as on the International Space Station, where sonography is currently the only diagnostic imaging tool. STUDY DESIGN: A ground-based (1 g) porcine model for sonography was developed. We examined both the feasibility and the comparative performance of the FAST examination in parabolic flight. Sonographic detection and fluid behavior were evaluated in four animals during alternating weightlessness (0 g) and hypergravity (1.8 g) periods. During flight, boluses of fluid were incrementally introduced into the peritoneal cavity. Standardized sonographic windows were recorded. Postflight, the video recordings were divided into 169 20-second segments for subsequent interpretation by 12 blinded ultrasonography experts. Reviewers first decided whether a video segment was of sufficient diagnostic quality to analyze (determinate). Determinate segments were then analyzed as containing or not containing fluid. A probit regression model compared the probability of a positive fluid diagnosis to actual fluid levels (0 to 500 mL) under both 0-g and 1.8-g conditions. RESULTS: The in-flight sonographers found real-time scanning and interpretation technically similar to that of terrestrial conditions, as long as restraint was maintained. On blinded review, 80% of the recorded ultrasound segments were considered determinate. The best sensitivity for diagnosis in 0 g was found to be from the subhepatic space, with probability of a positive fluid diagnosis ranging from 9% (no fluid) to 51% (500 mL fluid). CONCLUSIONS: The FAST examination is technically feasible in weightlessness, and merits operational consideration for clinical contingencies in space.
Video segmentation for post-production
NASA Astrophysics Data System (ADS)
Wills, Ciaran
2001-12-01
Specialist post-production is an industry that has much to gain from the application of content-based video analysis techniques. However the types of material handled in specialist post-production, such as television commercials, pop music videos and special effects are quite different in nature from the typical broadcast material which many video analysis techniques are designed to work with; shots are short and highly dynamic, and the transitions are often novel or ambiguous. We address the problem of scene change detection and develop a new algorithm which tackles some of the common aspects of post-production material that cause difficulties for past algorithms, such as illumination changes and jump cuts. Operating in the compressed domain on Motion JPEG compressed video, our algorithm detects cuts and fades by analyzing each JPEG macroblock in the context of its temporal and spatial neighbors. Analyzing the DCT coefficients directly we can extract the mean color of a block and an approximate detail level. We can also perform an approximated cross-correlation between two blocks. The algorithm is part of a set of tools being developed to work with an automated asset management system designed specifically for use in post-production facilities.
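The compressed-domain trick mentioned above (reading a block's mean color from DCT coefficients) works because an orthonormal 2D DCT's DC coefficient is proportional to the block mean. A minimal demonstration with a hand-rolled DCT-II (real Motion JPEG decoding involves quantization tables and entropy coding, which are omitted here):

```python
# For an n x n orthonormal 2D DCT-II, coefficient (0, 0) equals
# n * mean(block), so the mean color is available without inverse DCT.
import math

def dct2(block):
    n = len(block)
    def coeff(u, v):
        s = sum(block[x][y]
                * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                for x in range(n) for y in range(n))
        cu = math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n)
        cv = math.sqrt(1 / n) if v == 0 else math.sqrt(2 / n)
        return cu * cv * s
    return [[coeff(u, v) for v in range(n)] for u in range(n)]

block = [[100 + x + y for y in range(8)] for x in range(8)]
dc = dct2(block)[0][0]
mean = sum(sum(row) for row in block) / 64
print(abs(dc / 8 - mean) < 1e-9)  # True
```

The higher-frequency AC coefficients similarly give the "approximate detail level" the abstract mentions, again without decompressing the frame.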
Video bioinformatics analysis of human embryonic stem cell colony growth.
Lin, Sabrina; Fonteno, Shawn; Satish, Shruthi; Bhanu, Bir; Talbot, Prue
2010-05-20
Because video data are complex and are comprised of many images, mining information from video material is difficult to do without the aid of computer software. Video bioinformatics is a powerful quantitative approach for extracting spatio-temporal data from video images using computer software to perform data mining and analysis. In this article, we introduce a video bioinformatics method for quantifying the growth of human embryonic stem cells (hESC) by analyzing time-lapse videos collected in a Nikon BioStation CT incubator equipped with a camera for video imaging. In our experiments, hESC colonies that were attached to Matrigel were filmed for 48 hours in the BioStation CT. To determine the rate of growth of these colonies, recipes were developed using CL-Quant software, which enables users to extract various types of data from video images. To accurately evaluate colony growth, three recipes were created. The first segmented the image into the colony and background, the second enhanced the image to define colonies accurately throughout the video sequence, and the third measured the number of pixels in the colony over time. The three recipes were run in sequence on video data collected in a BioStation CT to analyze the rate of growth of individual hESC colonies over 48 hours. To verify the truthfulness of the CL-Quant recipes, the same data were analyzed manually using Adobe Photoshop software. When the data obtained using the CL-Quant recipes and Photoshop were compared, results were virtually identical, indicating the CL-Quant recipes were truthful. The method described here could be applied to any video data to measure growth rates of hESC or other cells that grow in colonies. In addition, other video bioinformatics recipes can be developed in the future for other cell processes such as migration, apoptosis, and cell adhesion.
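The third "recipe" above (counting colony pixels over time) amounts to thresholding each frame and tracking the area. A hedged sketch, with made-up frame data and threshold in place of the CL-Quant pipeline:

```python
# Estimate colony growth rate as the change in segmented area per hour,
# using a simple intensity threshold as a stand-in for real segmentation.

def colony_area(frame, threshold=128):
    """frame: 2D list of gray values; area = pixels above threshold."""
    return sum(1 for row in frame for px in row if px > threshold)

def growth_rate(frames, hours_per_frame):
    """Average change in colony area (pixels) per hour."""
    areas = [colony_area(f) for f in frames]
    return (areas[-1] - areas[0]) / ((len(areas) - 1) * hours_per_frame)

# A 2x2-pixel colony growing into a 4x4-pixel colony over 24 hours.
small = [[200 if r < 2 and c < 2 else 50 for c in range(10)] for r in range(10)]
large = [[200 if r < 4 and c < 4 else 50 for c in range(10)] for r in range(10)]
print(growth_rate([small, large], hours_per_frame=24))  # 0.5
```

Converting pixel areas to physical units would require the camera's spatial calibration, which the abstract does not specify.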
Rice, Sean C; Higginbotham, Tina; Dean, Melanie J; Slaughter, James C; Yachimski, Patrick S; Obstein, Keith L
2016-11-01
Successful outpatient colonoscopy (CLS) depends on many factors, including the quality of a patient's bowel preparation. Although education on consumption of the pre-CLS purgative can improve bowel preparation quality, no study has evaluated dietary education alone. We created an educational video on pre-CLS dietary instructions to determine whether dietary education would improve outpatient bowel preparation quality. A prospective randomized, blinded, controlled study of patients undergoing outpatient CLS was performed. All patients received a 4 l polyethylene glycol-based split-dose bowel preparation and standard institutional pre-procedure instructions. Patients were then randomly assigned to an intervention arm or to a no-intervention arm. A 4-min educational video detailing clear liquid diet restriction was made available to patients in the intervention arm, whereas those randomized to no intervention did not have access to the video. Patients randomized to the video were provided with the YouTube video link 48-72 h before CLS. An attending endoscopist blinded to randomization performed the CLS. Bowel preparation quality was scored using the Boston Bowel Preparation Scale (BBPS). Adequate preparation was defined as a BBPS total score of ≥6 with all segment scores ≥2. Wilcoxon rank-sum and Pearson's χ²-tests were performed to assess differences between groups. Ninety-two patients were randomized (video: n=42; control: n=50), with 47 total video views tallied. There were no demographic differences between groups. There was no statistically significant difference in adequate preparation between groups (video=74%; control=68%; P=0.54). The availability of a supplementary patient educational video on clear liquid diet alone was insufficient to improve bowel preparation quality when compared with standard pre-procedure instruction at our institution.
Scalable gastroscopic video summarization via similar-inhibition dictionary selection.
Wang, Shuai; Cong, Yang; Cao, Jun; Yang, Yunsheng; Tang, Yandong; Zhao, Huaici; Yu, Haibin
2016-01-01
This paper aims at developing an automated gastroscopic video summarization algorithm to assist clinicians to more effectively go through the abnormal contents of a video. To select the most representative frames from the original video sequence, we formulate the problem of gastroscopic video summarization as a dictionary selection issue. Different from traditional dictionary selection methods, which take into account only the number and reconstruction ability of selected key frames, our model introduces a similar-inhibition constraint to reinforce the diversity of selected key frames. We calculate the attention cost by merging both gaze and content change into a prior cue to help select the frames with more high-level semantic information. Moreover, we adopt an image quality evaluation process to eliminate the interference of poor quality images and a segmentation process to reduce the computational complexity. For experiments, we build a new gastroscopic video dataset captured from 30 volunteers with more than 400k images and compare our method with the state of the art using content consistency, index consistency, and content-index consistency with the ground truth. Compared with all competitors, our method obtains the best results in 23 of 30 videos evaluated on content consistency, 24 of 30 videos evaluated on index consistency, and all videos evaluated on content-index consistency. For gastroscopic video summarization, we propose an automated annotation method via similar-inhibition dictionary selection. Our model achieves better performance than other state-of-the-art models and supplies more suitable key frames for diagnosis. The developed algorithm can be automatically adapted to various real applications, such as the training of young clinicians, computer-aided diagnosis, or medical report generation.
Estimating Physical Activity Energy Expenditure with the Kinect Sensor in an Exergaming Environment
Nathan, David; Huynh, Du Q.; Rubenson, Jonas; Rosenberg, Michael
2015-01-01
Active video games that require physical exertion during game play have been shown to confer health benefits. Typically, energy expended during game play is measured using devices attached to players, such as accelerometers, or portable gas analyzers. Since 2010, active video gaming technology has incorporated marker-less motion capture devices to bring human movement into game play. Using the Kinect sensor and the Microsoft SDK, this research aimed to estimate the mechanical work performed by the human body and the subsequent metabolic energy using predictive algorithmic models. Nineteen university students participated in a repeated-measures experiment performing four fundamental movements (arm swings, standing jumps, body-weight squats, and jumping jacks). Metabolic energy was captured using a Cortex Metamax 3B automated gas analysis system, with mechanical movement captured by the combined motion data from two Kinect cameras. Estimates of the body segment properties, such as segment mass, length, centre-of-mass position, and radius of gyration, were calculated from the Zatsiorsky-Seluyanov equations as adjusted by de Leva, with an adjustment made for posture cost. A GPML toolbox implementation of Gaussian Process Regression, a locally weighted k-Nearest Neighbour regression, and a linear regression technique were evaluated on their performance in predicting the metabolic cost from new feature vectors. The experimental results show that Gaussian Process Regression outperformed the other two techniques by a small margin. This study demonstrated that physical activity energy expenditure during exercise, using the Kinect camera as a motion capture system, can be estimated from segmental mechanical work. Estimates for high-energy activities, such as standing jumps and jumping jacks, can be made accurately, but for low-energy activities, such as squatting, the posture of static poses should be considered as a contributing factor.
When translated into the active video gaming environment, the results could be incorporated into game play to more accurately control the energy expenditure requirements. PMID:26000460
Playing Active Video Games may not develop movement skills: An intervention trial.
Barnett, Lisa M; Ridgers, Nicola D; Reynolds, John; Hanna, Lisa; Salmon, Jo
2015-01-01
To investigate the impact of playing sports Active Video Games on children's actual and perceived object control skills. Intervention children played Active Video Games for 6 weeks (1 h/week) in 2012. The Test of Gross Motor Development-2 assessed object control skill. The Pictorial Scale of Perceived Movement Skill Competence assessed perceived object control skill. Repeated measurements of object control and perceived object control were analysed for the whole sample, using linear mixed models, which included fixed effects for group (intervention or control) and time (pre and post) and their interaction. The first model adjusted for sex only and the second model also adjusted for age, and prior ball sports experience (yes/no). Seven mixed-gender focus discussions were conducted with intervention children after programme completion. Ninety-five Australian children (55% girls; 43% intervention group) aged 4 to 8 years (M 6.2, SD 0.95) participated. Object control skill improved over time (p = 0.006) but there was no significant difference (p = 0.913) between groups in improvement (predicted means: control 31.80 to 33.53, SED = 0.748; intervention 30.33 to 31.83, SED = 0.835). A similar result held for the second model. Similarly the intervention did not change perceived object control in Model 1 (predicted means: control: 19.08 to 18.68, SED = 0.362; intervention 18.67 to 18.88, SED = 0.406) or Model 2. Children found the intervention enjoyable, but most did not perceive direct equivalence between Active Video Games and 'real life' activities. Whilst Active Video Game play may help introduce children to sport, this amount of time playing is unlikely to build skill.
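The group x time interaction tested by the mixed models above corresponds to a difference-in-differences contrast: the intervention group's change minus the control group's change. The sketch below illustrates only that contrast; the paper's actual models also account for repeated measures per child and adjust for sex, age, and prior ball sports experience.

```python
import numpy as np

def group_time_interaction(pre_ctrl, post_ctrl, pre_int, post_int):
    """Difference-in-differences estimate of the group x time interaction:
    (intervention post - pre) - (control post - pre). A linear mixed model
    with fixed effects for group, time, and their interaction estimates
    this same contrast while modelling within-child correlation."""
    return (np.mean(post_int) - np.mean(pre_int)) - \
           (np.mean(post_ctrl) - np.mean(pre_ctrl))
```

Plugging in the paper's predicted means for object control skill (control 31.80 to 33.53, intervention 30.33 to 31.83) gives an interaction of about -0.23 points, consistent with the reported non-significant group difference in improvement.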
Object detection in cinematographic video sequences for automatic indexing
NASA Astrophysics Data System (ADS)
Stauder, Jurgen; Chupeau, Bertrand; Oisel, Lionel
2003-06-01
This paper presents an object detection framework applied to cinematographic post-processing of video sequences. Post-processing is done after production and before editing. At the beginning of each shot of a video, a slate (also called a clapperboard) is shown. The slate notably contains an electronic audio timecode that is necessary for audio-visual synchronization. The framework detects slates in video sequences for automatic indexing and post-processing, and is based on five steps. The first two steps drastically reduce the video data to be analyzed; they ensure a high recall rate but have low precision. The first step detects images at the beginning of a shot that may show a slate, while the second step searches these images for candidate regions with a color distribution similar to slates. The objective is to miss no slate while eliminating long parts of video without slate appearances. The third and fourth steps apply statistical classification and pattern matching to detect and precisely locate slates in the candidate regions. These steps ensure both a high recall rate and high precision. The objective is to detect slates with very few false alarms, minimizing interactive corrections. In a last step, electronic timecodes are read from the slates to automate audio-visual synchronization. The presented slate detector has a recall rate of 89% and a precision of 97.5%. By temporal integration, far more than 89% of shots in dailies are detected, and by timecode coherence analysis the precision can be raised as well. Issues for future work are to accelerate the system beyond real-time and to extend the framework to several slate types.
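The high-recall candidate search of step two can be sketched as a sliding window that keeps regions whose color distribution resembles a slate template. The window size, stride, threshold, and histogram-intersection measure below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def color_hist(img, bins=8):
    """Per-channel color histogram of an RGB patch, normalized to sum to 1."""
    h = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    h = np.concatenate(h).astype(float)
    return h / h.sum()

def slate_candidates(frame, template_hist, win=64, stride=32, thresh=0.8):
    """Step-2 sketch: scan a frame from the start of a shot and keep windows
    whose color histogram intersects strongly with a slate template.
    By design this stage is high recall / low precision; the classifier
    and pattern-matching stages (steps 3-4) would prune the survivors."""
    H, W = frame.shape[:2]
    hits = []
    for y in range(0, H - win + 1, stride):
        for x in range(0, W - win + 1, stride):
            h = color_hist(frame[y:y + win, x:x + win])
            inter = np.minimum(h, template_hist).sum()  # intersection in [0, 1]
            if inter >= thresh:
                hits.append((x, y))
    return hits
```

A loose threshold keeps every plausible slate region (no misses) at the cost of false positives, mirroring the recall-first design of the paper's early stages.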