NASA Astrophysics Data System (ADS)
Chen, Xinyuan; Song, Li; Yang, Xiaokang
2016-09-01
Video denoising can be described as the problem of mapping a sequence of noisy frames of a specific length to a clean one. We propose a deep architecture based on the Recurrent Neural Network (RNN) for video denoising. The model learns a patch-based end-to-end mapping between noisy and clean video sequences: it takes the corrupted video sequence as input and outputs the clean one. Our deep network, which we refer to as deep Recurrent Neural Networks (deep RNNs or DRNNs), stacks RNN layers so that each layer receives the hidden state of the previous layer as input. Experiments show that (i) the recurrent architecture extracts motion information through the temporal domain and benefits video denoising, (ii) the deep architecture has enough capacity to express the mapping from corrupted input videos to clean output videos, and (iii) the model generalizes to learn different mappings from videos corrupted by different types of noise (e.g., Poisson-Gaussian noise). By training on large video databases, we are able to compete with some existing video denoising methods.
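The stacking described in this abstract, where each layer receives the hidden state of the previous layer, can be illustrated with a small forward pass. This is only a sketch under stated assumptions: plain tanh RNN cells, random untrained weights, and tiny hypothetical dimensions (`patch_dim`, `hidden_dim`). The abstract does not give the cell type, layer sizes, or training procedure.

```python
import math
import random

def make_layer(n_in, n_hidden, rng):
    """Random (untrained) weights for one tanh RNN layer; illustrative only."""
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
    return w_in, w_rec

def rnn_step(layer, x, h_prev):
    """One time step: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    w_in, w_rec = layer
    return [
        math.tanh(
            sum(a * b for a, b in zip(w_in[j], x))
            + sum(a * b for a, b in zip(w_rec[j], h_prev))
        )
        for j in range(len(w_in))
    ]

def deep_rnn_forward(layers, frames):
    """Run a sequence of flattened patches through stacked RNN layers."""
    hidden = [[0.0] * len(layer[0]) for layer in layers]
    outputs = []
    for x in frames:
        inp = x
        for i, layer in enumerate(layers):
            hidden[i] = rnn_step(layer, inp, hidden[i])
            inp = hidden[i]  # the next layer receives this layer's hidden state
        outputs.append(inp)
    return outputs

rng = random.Random(0)
patch_dim, hidden_dim = 4, 3  # hypothetical sizes
layers = [make_layer(patch_dim, hidden_dim, rng), make_layer(hidden_dim, hidden_dim, rng)]
noisy_patches = [[rng.random() for _ in range(patch_dim)] for _ in range(5)]
states = deep_rnn_forward(layers, noisy_patches)
```

A denoiser would add an output layer that maps the top hidden state back to a clean patch and would train all weights; both are omitted here.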
Presentation of 3D Scenes Through Video Example.
Baldacci, Andrea; Ganovelli, Fabio; Corsini, Massimiliano; Scopigno, Roberto
2017-09-01
Using synthetic videos to present a 3D scene is a common requirement for architects, designers, engineers, and Cultural Heritage professionals; however, producing them is usually time consuming and, to obtain high quality results, requires the support of a film maker or computer animation expert. We introduce an alternative approach that takes the 3D scene of interest and an example video as input, and automatically produces a video of the input scene that resembles the given video example. In other words, our algorithm allows the user to "replicate" an existing video on a different 3D scene. We build on the intuition that a video sequence of a static environment is strongly characterized by its optical flow, or, in other words, that two videos are similar if their optical flows are similar. We therefore recast the problem as producing a video of the input scene whose optical flow is similar to the optical flow of the input video. Our intuition is supported by a user study specifically designed to verify this statement. We have successfully tested our approach on several scenes and input videos, some of which are reported in the accompanying material of this paper.
Human action classification using procrustes shape theory
NASA Astrophysics Data System (ADS)
Cho, Wanhyun; Kim, Sangkyoon; Park, Soonyoung; Lee, Myungeun
2015-02-01
In this paper, we propose a new method that classifies human actions using Procrustes shape theory. First, we extract a pre-shape configuration vector of landmarks from each frame of an image sequence representing an arbitrary human action, and then derive the Procrustes fit vector for the pre-shape configuration vector. Second, we extract a set of pre-shape vectors from the training samples stored in a database and compute a Procrustes mean shape vector for these pre-shape vectors. Third, we extract a sequence of pre-shape vectors from the input video and project this sequence onto the tangent space, taking as the pole the sequence of mean shape vectors corresponding to a target video. We then calculate the Procrustes distance between the sequence of projected pre-shape vectors on the tangent space and the sequence of mean shape vectors. Finally, we classify the input video into the human action class with the minimum Procrustes distance. We assess the performance of the proposed method on one public dataset, the Weizmann human action dataset. Experimental results show that the proposed method performs very well on this dataset.
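The minimum-distance classification step can be sketched for a single frame of planar landmarks. This is a simplified illustration, not the authors' pipeline: it uses the standard full Procrustes distance between two pre-shapes in complex form, and it skips the tangent-space projection and the sequence-level comparison described in the abstract. The function names and the toy shapes are hypothetical.

```python
import math

def pre_shape(landmarks):
    """Centre planar landmarks and scale them to unit norm (complex form)."""
    z = [complex(x, y) for x, y in landmarks]
    centroid = sum(z) / len(z)
    centred = [p - centroid for p in z]
    norm = math.sqrt(sum(abs(p) ** 2 for p in centred))
    return [p / norm for p in centred]

def full_procrustes_distance(shape_a, shape_b):
    """sqrt(1 - |<a, b>|^2) for two pre-shapes; 0 means the same shape."""
    a, b = pre_shape(shape_a), pre_shape(shape_b)
    inner = sum(p * q.conjugate() for p, q in zip(a, b))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))

def classify(query, class_means):
    """Return the label whose mean shape is closest to the query shape."""
    return min(
        class_means,
        key=lambda label: full_procrustes_distance(query, class_means[label]),
    )

# Toy example: the query is the rectangle rotated by 90 degrees, scaled by 2
# and translated, so it should still be classified as "rect".
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rect = [(0, 0), (4, 0), (4, 1), (0, 1)]
query = [(5, 3), (5, 11), (3, 11), (3, 3)]
label = classify(query, {"square": square, "rect": rect})
```

The distance is invariant to translation, scale, and rotation, which is why the transformed rectangle still matches its class mean.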
Human silhouette matching based on moment invariants
NASA Astrophysics Data System (ADS)
Sun, Yong-Chao; Qiu, Xian-Jie; Xia, Shi-Hong; Wang, Zhao-Qi
2005-07-01
This paper applies silhouette matching based on moment invariants to infer human motion parameters from video sequences captured by a single uncalibrated monocular camera. Currently, there are two ways of tracking human motion: marker-based and markerless. This paper introduces a hybrid framework to recover the input video contents. A standard 3D motion database is built in advance using the marker technique. Given a video sequence, human silhouettes are extracted together with the camera viewpoint information, which is used to project the standard 3D motion database onto a 2D one. The video recovery problem is thus formulated as a matching problem: finding the body pose in the standard 2D library that is most similar to the one in the video image. The framework is applied to trampoline sport, where complicated human motion parameters can be obtained from single-camera video sequences, and extensive experiments demonstrate that this approach is feasible for monocular video-based 3D motion reconstruction.
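Matching a silhouette against a 2D pose library with moment invariants might look like the sketch below. It assumes Hu's first two invariants and a squared Euclidean comparison; the abstract names neither the specific invariants nor the similarity measure, so both are illustrative choices, as are the function names.

```python
def hu_first_two(mask):
    """First two Hu moment invariants of a binary silhouette (list of rows)."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    m00 = float(len(pixels))
    cx = sum(x for x, _ in pixels) / m00
    cy = sum(y for _, y in pixels) / m00

    def eta(p, q):
        # Central moment, normalised for scale.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / m00 ** (1 + (p + q) / 2.0)

    eta20, eta02, eta11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4.0 * eta11 ** 2
    return phi1, phi2

def best_match(query_mask, library):
    """Return the library key whose invariants are closest to the query's."""
    q = hu_first_two(query_mask)

    def dist(key):
        c = hu_first_two(library[key])
        return sum((a - b) ** 2 for a, b in zip(q, c))

    return min(library, key=dist)

# A horizontal bar and the same bar rotated by 90 degrees and shifted
# share the same invariants, so the rotated bar matches "bar".
bar = [[1, 1, 1, 1, 1]]
block = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
vertical = [[0, 0], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
match = best_match(vertical, {"bar": bar, "block": block})
```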
Method and Apparatus for Evaluating the Visual Quality of Processed Digital Video Sequences
NASA Technical Reports Server (NTRS)
Watson, Andrew B. (Inventor)
2002-01-01
A Digital Video Quality (DVQ) apparatus and method that incorporate a model of human visual sensitivity to predict the visibility of artifacts. The DVQ method and apparatus are used for the evaluation of the visual quality of processed digital video sequences and for adaptively controlling the bit rate of the processed digital video sequences without compromising the visual quality. The DVQ apparatus minimizes the required amount of memory and computation. The input to the DVQ apparatus is a pair of color image sequences: an original (R) non-compressed sequence, and a processed (T) sequence. Both sequences (R) and (T) are sampled, cropped, and subjected to color transformations. The sequences are then subjected to blocking and discrete cosine transformation, and the results are transformed to local contrast. The next step is a time filtering operation which implements the human sensitivity to different time frequencies. The results are converted to threshold units by dividing each discrete cosine transform coefficient by its respective visual threshold. At the next stage the two sequences are subtracted to produce an error sequence. The error sequence is subjected to a contrast masking operation, which also depends upon the reference sequence (R). The masked errors can be pooled in various ways to illustrate the perceptual error over various dimensions, and the pooled error can be converted to a visual quality measure.
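The last stages of the pipeline (conversion to threshold units, subtraction, pooling) can be sketched on a flat list of DCT coefficients. This is a heavily simplified illustration: it omits the color transforms, local contrast conversion, temporal filtering, and contrast masking, and the Minkowski exponent `beta` is an assumed pooling choice rather than a value taken from the patent abstract.

```python
def to_threshold_units(coeffs, thresholds):
    """Divide each DCT coefficient by its visual threshold."""
    return [c / t for c, t in zip(coeffs, thresholds)]

def pooled_error(ref_coeffs, test_coeffs, thresholds, beta=4.0):
    """Minkowski-pool the threshold-unit differences between R and T."""
    ref = to_threshold_units(ref_coeffs, thresholds)
    test = to_threshold_units(test_coeffs, thresholds)
    errors = [abs(r - t) for r, t in zip(ref, test)]
    return (sum(e ** beta for e in errors) / len(errors)) ** (1.0 / beta)

# Every coefficient differs by exactly one threshold, so the pooled error is 1.
score = pooled_error([10.0, 4.0], [12.0, 5.0], [2.0, 1.0])
```

A pooled value near 1 then corresponds to errors around the visibility threshold, under the assumptions above.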
Quality and noise measurements in mobile phone video capture
NASA Astrophysics Data System (ADS)
Petrescu, Doina; Pincenti, John
2011-02-01
The quality of videos captured with mobile phones has become increasingly important particularly since resolutions and formats have reached a level that rivals the capabilities available in the digital camcorder market, and since many mobile phones now allow direct playback on large HDTVs. The video quality is determined by the combined quality of the individual parts of the imaging system including the image sensor, the digital color processing, and the video compression, each of which has been studied independently. In this work, we study the combined effect of these elements on the overall video quality. We do this by evaluating the capture under various lighting, color processing, and video compression conditions. First, we measure full reference quality metrics between encoder input and the reconstructed sequence, where the encoder input changes with light and color processing modifications. Second, we introduce a system model which includes all elements that affect video quality, including a low light additive noise model, ISP color processing, as well as the video encoder. Our experiments show that in low light conditions and for certain choices of color processing the system level visual quality may not improve when the encoder becomes more capable or the compression ratio is reduced.
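A full-reference metric compares the encoder input with the reconstructed sequence pixel by pixel. The abstract does not say which metrics were used; PSNR is shown here only as a common example, with an assumed 8-bit peak value of 255.

```python
import math

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    flat_ref = [p for row in reference for p in row]
    flat_rec = [p for row in reconstructed for p in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_rec)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)

encoder_input = [[100, 120], [140, 160]]
decoded = [[101, 121], [141, 161]]  # every pixel is off by one level
quality_db = psnr(encoder_input, decoded)
```

Because the reference is the encoder input, noise added by the sensor in low light counts as signal here, which fits the abstract's observation that a more capable encoder may not improve system-level quality.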
Template-Based 3D Reconstruction of Non-rigid Deformable Object from Monocular Video
NASA Astrophysics Data System (ADS)
Liu, Yang; Peng, Xiaodong; Zhou, Wugen; Liu, Bo; Gerndt, Andreas
2018-06-01
In this paper, we propose a template-based 3D surface reconstruction system for non-rigid deformable objects from a monocular video sequence. First, we generate a semi-dense template of the target object with a structure-from-motion method using a video subsequence. This video can be captured by a rigidly moving camera pointed at the static target object or by a static camera observing the rigidly moving target object. Then, with the reference template mesh as input and based on the framework of classical template-based methods, we solve an energy minimization problem to obtain the correspondence between the template and every frame, which yields the time-varying mesh that represents the deformation of the object. The energy terms combine a photometric cost, temporal and spatial smoothness costs, and an as-rigid-as-possible cost that enables elastic deformation. We present an easy and controllable solution for generating the semi-dense template for complex objects. In addition, we use an effective iterative Schur-based linear solver for the energy minimization problem. The experimental evaluation presents qualitative reconstruction results for deforming objects in real sequences. Compared with results obtained using other templates as input, the reconstructions based on our template are more accurate and detailed in certain regions. The experimental results also show that the linear solver we use is more efficient than a traditional conjugate-gradient-based solver.
Leszczuk, Mikołaj; Dudek, Łukasz; Witkowski, Marcin
The VQiPS (Video Quality in Public Safety) Working Group, supported by the U.S. Department of Homeland Security, has been developing a user guide for public safety video applications. According to VQiPS, five parameters have particular importance influencing the ability to achieve a recognition task. They are: usage time-frame, discrimination level, target size, lighting level, and level of motion. These parameters form what are referred to as Generalized Use Classes (GUCs). The aim of our research was to develop algorithms that would automatically assist classification of input sequences into one of the GUCs. Target size and lighting level parameters were approached. The experiment described reveals the experts' ambiguity and hesitation during the manual target size determination process. However, the automatic methods developed for target size classification make it possible to determine GUC parameters with 70 % compliance to the end-users' opinion. Lighting levels of the entire sequence can be classified with an efficiency reaching 93 %. To make the algorithms available for use, a test application has been developed. It is able to process video files and display classification results, the user interface being very simple and requiring only minimal user interaction.
Markerless video analysis for movement quantification in pediatric epilepsy monitoring.
Lu, Haiping; Eng, How-Lung; Mandal, Bappaditya; Chan, Derrick W S; Ng, Yen-Ling
2011-01-01
This paper proposes a markerless video analytic system for quantifying body part movements in pediatric epilepsy monitoring. The system utilizes colored pajamas worn by a patient in bed to extract body part movement trajectories, from which various features can be obtained for seizure detection and analysis. Hence, it is non-intrusive and it requires no sensor/marker to be attached to the patient's body. It takes raw video sequences as input and a simple user-initialization indicates the body parts to be examined. In background/foreground modeling, Gaussian mixture models are employed in conjunction with HSV-based modeling. Body part detection follows a coarse-to-fine paradigm with graph-cut-based segmentation. Finally, body part parameters are estimated with domain knowledge guidance. Experimental studies are reported on sequences captured in an Epilepsy Monitoring Unit at a local hospital. The results demonstrate the feasibility of the proposed system in pediatric epilepsy monitoring and seizure detection.
A novel multiple description scalable coding scheme for mobile wireless video transmission
NASA Astrophysics Data System (ADS)
Zheng, Haifeng; Yu, Lun; Chen, Chang Wen
2005-03-01
We propose in this paper a novel multiple description scalable coding (MDSC) scheme based on the in-band motion compensation temporal filtering (IBMCTF) technique in order to achieve high video coding performance and robust video transmission. The input video sequence is first split into equal-sized groups of frames (GOFs). Within a GOF, each frame is hierarchically decomposed by discrete wavelet transform. Since there is a direct relationship between wavelet coefficients and what they represent in the image content after wavelet decomposition, we are able to reorganize the spatial orientation trees to generate multiple bit-streams and employ the SPIHT algorithm to achieve high coding efficiency. We have shown that multiple bit-stream transmission is very effective in combating error propagation in both Internet video streaming and mobile wireless video. Furthermore, we adopt the IBMCTF scheme to remove inter-frame redundancy along the temporal direction using motion compensated temporal filtering, so that the scheme provides high coding performance and flexible scalability. In order to make compressed video resilient to channel errors and to guarantee robust video transmission over mobile wireless channels, we add redundancy to each bit-stream and apply an error concealment strategy for lost motion vectors. Unlike traditional multiple description schemes, the integration of these techniques enables us to generate more than two bit-streams, which may be more appropriate for multiple antenna transmission of compressed video. Simulation results on standard video sequences show that the proposed scheme provides a flexible tradeoff between coding efficiency and error resilience.
Egocentric Temporal Action Proposals.
Shao Huang; Weiqiang Wang; Shengfeng He; Lau, Rynson W H
2018-02-01
We present an approach to localize generic actions in egocentric videos, called temporal action proposals (TAPs), for accelerating the action recognition step. An egocentric TAP refers to a sequence of frames that may contain a generic action performed by the wearer of a head-mounted camera, e.g., taking a knife, spreading jam, pouring milk, or cutting carrots. Inspired by object proposals, this paper aims at generating a small number of TAPs, thereby replacing the popular sliding window strategy, for localizing all action events in the input video. To this end, we first propose to temporally segment the input video into action atoms, which are the smallest units that may contain an action. We then apply a hierarchical clustering algorithm with several egocentric cues to generate TAPs. Finally, we propose two actionness networks to score the likelihood of each TAP containing an action. The top ranked candidates are returned as output TAPs. Experimental results show that the proposed TAP detection framework performs significantly better than relevant approaches for egocentric action detection.
Space-time light field rendering.
Wang, Huamin; Sun, Mingxuan; Yang, Ruigang
2007-01-01
In this paper, we propose a novel framework called space-time light field rendering, which allows continuous exploration of a dynamic scene in both space and time. Compared to existing light field capture/rendering systems, it offers the capability of using unsynchronized video inputs and the added freedom of controlling the visualization in the temporal domain, such as smooth slow motion and temporal integration. In order to synthesize novel views from any viewpoint at any time instant, we develop a two-stage rendering algorithm. We first interpolate in the temporal domain to generate globally synchronized images using a robust spatial-temporal image registration algorithm followed by edge-preserving image morphing. We then interpolate these software-synchronized images in the spatial domain to synthesize the final view. In addition, we introduce a very accurate and robust algorithm to estimate subframe temporal offsets among input video sequences. Experimental results from unsynchronized videos with or without time stamps show that our approach is capable of maintaining photorealistic quality from a variety of real scenes.
NASA Astrophysics Data System (ADS)
Kushwaha, Alok Kumar Singh; Srivastava, Rajeev
2015-09-01
An efficient view invariant framework for the recognition of human activities from an input video sequence is presented. The proposed framework is composed of three consecutive modules: (i) detection and localization of people by background subtraction, (ii) creation of view invariant spatiotemporal templates for different activities, and (iii) template matching for view invariant activity recognition. The foreground objects present in a scene are extracted using change detection and background modeling. The view invariant templates are constructed using the motion history images and object shape information for different human activities in a video sequence. For matching the spatiotemporal templates for various activities, the moment invariants and Mahalanobis distance are used. The proposed approach is tested successfully on our own viewpoint dataset, KTH action recognition dataset, i3DPost multiview dataset, MSR viewpoint action dataset, VideoWeb multiview dataset, and WVU multiview human action recognition dataset. From the experimental results and analysis over the chosen datasets, it is observed that the proposed framework is robust, flexible, and efficient with respect to multiple views activity recognition, scale, and phase variations.
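The motion history images used for the templates can be illustrated with the usual update rule: a pixel where motion is detected is set to a maximum value `tau`, and every other pixel decays by one per frame. The sketch below detects motion by simple frame differencing with an assumed threshold, whereas the abstract describes background modeling; `tau` and `diff_threshold` are hypothetical parameters.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, diff_threshold=20):
    """Return the motion history image after observing one new frame."""
    new_mhi = []
    for mhi_row, prev_row, row in zip(mhi, prev_frame, frame):
        new_row = []
        for h, a, b in zip(mhi_row, prev_row, row):
            if abs(b - a) > diff_threshold:
                new_row.append(tau)            # recent motion: brightest value
            else:
                new_row.append(max(0, h - 1))  # older motion fades out
        new_mhi.append(new_row)
    return new_mhi

# A bright pixel moves one step to the right in each frame.
frames = [[[200, 0, 0, 0]], [[0, 200, 0, 0]], [[0, 0, 200, 0]]]
mhi = [[0, 0, 0, 0]]
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev, cur)
```

The resulting gradient of values encodes where motion happened and how recently, and that image is the input from which shape descriptors such as moment invariants are computed.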
Optimal space communications techniques. [discussion of video signals and delta modulation]
NASA Technical Reports Server (NTRS)
Schilling, D. L.
1974-01-01
The encoding of video signals using the Song Adaptive Delta Modulator (Song ADM) is discussed. The video signals are characterized as a sequence of pulses having arbitrary height and width. Although the ADM is suited to tracking signals having fast rise times, it was found that the DM algorithm (which permits an exponential rise for estimating an input step) results in a large overshoot and an underdamped response to the step. An overshoot suppression algorithm which significantly reduces the ringing while not affecting the rise time is presented along with formulas for the rise time and the settling time. Channel errors and their effect on the DM encoded bit stream were investigated.
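The overshoot described above can be reproduced with a generic adaptive delta modulator whose step size grows exponentially while consecutive bits agree. This is not the Song ADM step-size rule, which the abstract does not state; the doubling and halving rule and the parameters `min_step` and `growth` are assumptions chosen only to show the effect.

```python
def _next_step(step, bit, prev_bit, min_step, growth):
    """Grow the step while the bit repeats, shrink it when the bit flips."""
    if bit == prev_bit:
        return step * growth
    return max(min_step, step / growth)

def adm_encode(samples, min_step=1.0, growth=2.0):
    """Return the one-bit stream and the encoder's running estimate."""
    estimate, step, prev_bit = 0.0, min_step, 0
    bits, estimates = [], []
    for x in samples:
        bit = 1 if x >= estimate else -1
        step = _next_step(step, bit, prev_bit, min_step, growth)
        estimate += bit * step
        bits.append(bit)
        estimates.append(estimate)
        prev_bit = bit
    return bits, estimates

def adm_decode(bits, min_step=1.0, growth=2.0):
    """Rebuild the estimate from the bit stream with the same step rule."""
    estimate, step, prev_bit = 0.0, min_step, 0
    estimates = []
    for bit in bits:
        step = _next_step(step, bit, prev_bit, min_step, growth)
        estimate += bit * step
        estimates.append(estimate)
        prev_bit = bit
    return estimates

# Step input of height 10: the estimate rises 1, 3, 7, 15 and overshoots.
bits, estimates = adm_encode([10.0] * 8)
```

With these assumed parameters the estimate keeps ringing around the step level, which is the kind of underdamped behaviour an overshoot suppression algorithm is meant to reduce.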
Hierarchical video summarization based on context clustering
NASA Astrophysics Data System (ADS)
Tseng, Belle L.; Smith, John R.
2003-11-01
A personalized video summary is dynamically generated in our video personalization and summarization system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order to maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. Besides finding the desired contents, the objective is to present a coherent summary. There are diverse methods for creating summaries, and we focus on the challenges of generating a hierarchical video summary based on context information. In our summarization algorithm, three inputs are used to generate the hierarchical video summary output. These inputs are (1) MPEG-7 metadata descriptions of the contents in the server, (2) user preference and usage environment declarations from the user client, and (3) context information including MPEG-7 controlled term list and classification scheme. In a video sequence, descriptions and relevance scores are assigned to each shot. Based on these shot descriptions, context clustering is performed to collect consecutively similar shots to correspond to hierarchical scene representations. The context clustering is based on the available context information, and may be derived from domain knowledge or rules engines. Finally, the selection of structured video segments to generate the hierarchical summary efficiently balances between scene representation and shot selection.
Stereo and IMU-Assisted Visual Odometry for Small Robots
NASA Technical Reports Server (NTRS)
2012-01-01
This software performs two functions: (1) taking stereo image pairs as input, it computes stereo disparity maps from them by cross-correlation to achieve 3D (three-dimensional) perception; (2) taking a sequence of stereo image pairs as input, it tracks features in the image sequence to estimate the motion of the cameras between successive image pairs. A real-time stereo vision system with IMU (inertial measurement unit)-assisted visual odometry was implemented on a single 750 MHz/520 MHz OMAP3530 SoC (system on chip) from TI (Texas Instruments). Frame rates of 46 fps (frames per second) were achieved at QVGA (Quarter Video Graphics Array, i.e., 320 × 240), or 8 fps at VGA (Video Graphics Array, 640 × 480) resolutions, while simultaneously tracking up to 200 features, taking full advantage of the OMAP3530's integer DSP (digital signal processor) and floating point ARM processors. This is a substantial advancement over previous work as the stereo implementation produces 146 Mde/s (millions of disparities evaluated per second) in 2.5 W, yielding a stereo energy efficiency of 58.8 Mde/J, which is 3.75× better than prior DSP stereo while providing more functionality.
A constrained joint source/channel coder design and vector quantization of nonstationary sources
NASA Technical Reports Server (NTRS)
Sayood, Khalid; Chen, Y. C.; Nori, S.; Araj, A.
1993-01-01
The emergence of broadband ISDN as the network for the future brings with it the promise of integration of all proposed services in a flexible environment. In order to achieve this flexibility, asynchronous transfer mode (ATM) has been proposed as the transfer technique. During this period a study was conducted on the bridging of network transmission performance and video coding. The successful transmission of variable bit rate video over ATM networks relies on the interaction between the video coding algorithm and the ATM networks. Two aspects of networks that determine the efficiency of video transmission are the resource allocation algorithm and the congestion control algorithm. These are explained in this report. Vector quantization (VQ) is one of the more popular compression techniques to appear in the last twenty years. Numerous compression techniques, which incorporate VQ, have been proposed. While the LBG VQ provides excellent compression, there are also several drawbacks to the use of the LBG quantizers including search complexity and memory requirements, and a mismatch between the codebook and the inputs. The latter mainly stems from the fact that the VQ is generally designed for a specific rate and a specific class of inputs. In this work, an adaptive technique is proposed for vector quantization of images and video sequences. This technique is an extension of the recursively indexed scalar quantization (RISQ) algorithm.
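The search-complexity drawback of a conventional vector quantizer is easiest to see in the encoder itself: every input vector is compared against every codeword. The sketch below shows only this full-search encoding step with a fixed, hypothetical codebook; it implements neither LBG codebook design nor the adaptive RISQ-based technique proposed in the report.

```python
def nearest_codeword(vector, codebook):
    """Full search: index of the codeword with the smallest squared error."""
    best_index, best_dist = 0, float("inf")
    for i, code in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(vector, code))
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index

def vq_encode(vectors, codebook):
    """Replace each input vector by the index of its nearest codeword."""
    return [nearest_codeword(v, codebook) for v in vectors]

def vq_decode(indices, codebook):
    """Look the codewords back up from their indices."""
    return [codebook[i] for i in indices]

codebook = [(0, 0), (10, 10), (0, 10)]  # hypothetical 2-D codebook
blocks = [(1, 1), (9, 8), (2, 9)]
indices = vq_encode(blocks, codebook)
reconstructed = vq_decode(indices, codebook)
```

The cost per input vector grows with the codebook size times the vector dimension, and inputs far from every codeword are reconstructed poorly, which is the codebook mismatch the abstract mentions.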
The sequence measurement system of the IR camera
NASA Astrophysics Data System (ADS)
Geng, Ai-hui; Han, Hong-xia; Zhang, Hai-bo
2011-08-01
IR cameras are now widely used in optoelectronic tracking, optoelectronic measurement, fire control, and optoelectronic countermeasures, but the output sequence of most IR cameras currently used in projects is complex, and the sequence documentation supplied by the manufacturer is not detailed. Because downstream image transmission and image processing systems require the detailed sequence of the IR camera, a sequence measurement system for the IR camera is designed and a detailed sequence measurement of the camera in use is carried out. FPGA programming combined with SignalTap online observation is applied in the sequence measurement system; the precise sequence of the IR camera's output signal is obtained, and detailed documentation of the IR camera is supplied to the downstream image transmission system, the image processing system, and so on. The sequence measurement system consists of a CameraLink input interface, an LVDS input interface, an FPGA, a CameraLink output interface, and other parts, of which the FPGA is the key component. The system accepts video signals in both CameraLink and LVDS formats, and because image processing cards and image memory cards usually use CameraLink as their input interface, the output of the sequence measurement system is designed as a CameraLink interface. The sequence measurement system therefore both measures the IR camera's sequence and performs interface conversion for some cameras. Inside the FPGA, the sequence measurement program, the pixel clock modification, the SignalTap file configuration, and the SignalTap online observation are integrated to achieve precise measurement of the IR camera. The sequence measurement program, written in Verilog and combined with SignalTap online observation, counts the number of lines in one frame and the number of pixels in one line, and also measures the line offset and row offset of the image. Given the complex sequence of the IR camera's output signal, the sequence measurement system accurately measures the sequence of the camera used in the project, supplies the detailed sequence document to downstream systems such as the image processing system and the image transmission system, and provides the concrete parameters of fval, lval, pixclk, line offset, and row offset. Experiments show that the sequence measurement system obtains precise measurement results and works stably, laying a foundation for the downstream systems.
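The counting performed by the measurement program can be modelled in software. The sketch below is an assumption-laden illustration, not the authors' Verilog: it takes one `(fval, lval)` sample per pixel clock, treats a pixel as valid when both signals are high, and reports lines per frame and pixels per line. The helper `synthetic_frame` and its blanking parameters are hypothetical, and the line and row offsets are not modelled.

```python
def synthetic_frame(lines, pixels, hblank, vblank=2):
    """Generate (fval, lval) samples, one per pixel clock, for one frame."""
    samples = [(0, 0)] * vblank
    for _ in range(lines):
        samples += [(1, 0)] * hblank  # horizontal blanking before the line
        samples += [(1, 1)] * pixels  # valid pixels
    samples += [(1, 0)] * hblank
    samples += [(0, 0)] * vblank
    return samples

def measure_timing(samples):
    """Count lines per frame and pixels per line from sampled signals."""
    line_lengths = []
    run = 0
    for fval, lval in samples:
        if fval and lval:
            run += 1                  # inside a valid line
        elif run:
            line_lengths.append(run)  # a line has just ended
            run = 0
    if run:
        line_lengths.append(run)
    return {
        "lines_per_frame": len(line_lengths),
        "pixels_per_line": sorted(set(line_lengths)),
    }

report = measure_timing(synthetic_frame(lines=3, pixels=4, hblank=2))
```

A `pixels_per_line` list with more than one entry would indicate lines of unequal length, which is one kind of irregularity such a measurement could expose.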
Resolution enhancement of low-quality videos using a high-resolution frame
NASA Astrophysics Data System (ADS)
Pham, Tuan Q.; van Vliet, Lucas J.; Schutte, Klamer
2006-01-01
This paper proposes an example-based Super-Resolution (SR) algorithm for compressed videos in the Discrete Cosine Transform (DCT) domain. Input to the system is a Low-Resolution (LR) compressed video together with a High-Resolution (HR) still image of similar content. Using a training set of corresponding LR-HR pairs of image patches from the HR still image, high-frequency details are transferred from the HR source to the LR video. The DCT-domain algorithm is much faster than example-based SR in the spatial domain [6] because of a reduction in search dimensionality, which is a direct result of the compact and uncorrelated DCT representation. Fast searching techniques like tree-structure vector quantization [16] and coherence search [1] are also key to the improved efficiency. Preliminary results on an MJPEG sequence are promising for the DCT-domain SR synthesis approach.
Image sequence analysis workstation for multipoint motion analysis
NASA Astrophysics Data System (ADS)
Mostafavi, Hassan
1990-08-01
This paper describes an application-specific engineering workstation designed and developed to analyze motion of objects from video sequences. The system combines the software and hardware environment of a modern graphic-oriented workstation with digital image acquisition, processing and display techniques. In addition to automation and increase in throughput of data reduction tasks, the objective of the system is to provide less invasive methods of measurement by offering the ability to track objects that are more complex than reflective markers. Grey level image processing and spatial/temporal adaptation of the processing parameters are used for location and tracking of more complex features of objects under uncontrolled lighting and background conditions. The applications of such an automated and noninvasive measurement tool include analysis of the trajectory and attitude of rigid bodies such as human limbs, robots, aircraft in flight, etc. The system's key features are: 1) acquisition and storage of image sequences by digitizing and storing real-time video; 2) computer-controlled movie loop playback, freeze frame display, and digital image enhancement; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from either live input video or a stored image sequence; 4) model-based estimation and tracking of the six degrees of freedom of a rigid body; 5) field-of-view and spatial calibration; 6) image sequence and measurement data base management; and 7) offline analysis software for trajectory plotting and statistical analysis.
Adaptive correlation filter-based video stabilization without accumulative global motion estimation
NASA Astrophysics Data System (ADS)
Koh, Eunjin; Lee, Chanyong; Jeong, Dong Gil
2014-12-01
We present a digital video stabilization approach that provides both robustness and efficiency for practical applications. In this approach, we adopt a stabilization model that efficiently maintains spatio-temporal information of past input frames and can track the original stabilization position. Because of the stabilization model, the proposed method does not need accumulative global motion estimation and can recover the original position even if there is a failure in interframe motion estimation. It can also intelligently handle damaged or interrupted video sequences. Moreover, because it is simple and well suited to parallel implementation, we implemented it easily on a commercial field programmable gate array and on a graphics processing unit board with compute unified device architecture. Experimental results show that the proposed approach is both fast and robust.
An Imaging And Graphics Workstation For Image Sequence Analysis
NASA Astrophysics Data System (ADS)
Mostafavi, Hassan
1990-01-01
This paper describes an application-specific engineering workstation designed and developed to analyze imagery sequences from a variety of sources. The system combines the software and hardware environment of the modern graphic-oriented workstations with the digital image acquisition, processing and display techniques. The objective is to achieve automation and high throughput for many data reduction tasks involving metric studies of image sequences. The applications of such an automated data reduction tool include analysis of the trajectory and attitude of aircraft, missile, stores and other flying objects in various flight regimes including launch and separation as well as regular flight maneuvers. The workstation can also be used in an on-line or off-line mode to study three-dimensional motion of aircraft models in simulated flight conditions such as wind tunnels. The system's key features are: 1) Acquisition and storage of image sequences by digitizing real-time video or frames from a film strip; 2) computer-controlled movie loop playback, slow motion and freeze frame display combined with digital image sharpening, noise reduction, contrast enhancement and interactive image magnification; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored image sequence; 4) automatic and manual field-of-view and spatial calibration; 5) image sequence data base generation and management, including the measurement data products; 6) off-line analysis software for trajectory plotting and statistical analysis; 7) model-based estimation and tracking of object attitude angles; and 8) interface to a variety of video players and film transport sub-systems.
Evaluation of Moving Object Detection Based on Various Input Noise Using Fixed Camera
NASA Astrophysics Data System (ADS)
Kiaee, N.; Hashemizadeh, E.; Zarrinpanjeh, N.
2017-09-01
Detecting and tracking objects in video has been a research area of interest in image processing and computer vision. This paper evaluates the performance of a novel object detection algorithm in video sequences. The evaluation helps establish the advantages of the method. The proposed framework compares the percentages of correct and wrong detections produced by the algorithm. The method was evaluated on data collected in the field of urban transport, which include cars and pedestrians captured by a fixed camera. The results show that the accuracy of the algorithm decreases as image resolution is reduced.
NASA Astrophysics Data System (ADS)
Sun, Hong; Wu, Qian-zhong
2013-09-01
To improve the precision of optical-electric tracking devices, we propose an improved optical-electric tracking device based on MEMS that addresses the tracking error and random drift of the gyroscope sensor. Following the principles of time series analysis of random sequences, an AR model of the gyro random error is established based on the Kalman filter algorithm, and the gyro output signals are then filtered multiple times with the Kalman filter. An ARM is used as the microcontroller, and the servo motor is controlled by a fuzzy PID full closed-loop control algorithm, with lead correction and feed-forward links added to reduce the response lag to the angle input. Feed-forward lets the output follow the input closely. The lead compensation link shortens the response to input signals and so reduces errors. A wireless video monitoring module and remote monitoring software (Visual Basic 6.0) monitor the servo motor state in real time: the video monitoring module gathers video signals and the wireless video module sends them to the host computer, so that the motor running state is shown in the Visual Basic 6.0 window. The main error sources are also analyzed in detail. A quantitative analysis of the errors from bandwidth and from the gyro sensor makes the proportion of each error in the total error clearer and consequently helps decrease the error of the system. Simulation and experimental results show that the system has good following characteristics and is valuable for engineering applications.
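The filtering step described above can be illustrated with a minimal sketch. It assumes a first-order AR model of the drift, x[k] = a*x[k-1] + w, observed as z[k] = x[k] + v; the model order, the coefficient a, and the noise variances q and r are illustrative assumptions and are not taken from the paper.

```python
def kalman_ar1(measurements, a=0.95, q=0.01, r=0.5, x0=0.0, p0=1.0):
    """Scalar Kalman filter for an assumed AR(1) state observed in noise.

    a  : assumed AR(1) coefficient of the drift model
    q  : assumed process-noise variance
    r  : assumed measurement-noise variance
    x0 : initial state estimate, p0 : initial estimate variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction from the AR(1) model.
        x_pred = a * x
        p_pred = a * a * p + q
        # Correction with the new measurement.
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * (z - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    import random

    random.seed(0)
    noisy = [1.0 + random.gauss(0.0, 0.7) for _ in range(200)]
    smoothed = kalman_ar1(noisy, a=1.0, q=1e-4, r=0.5)
    print(smoothed[-1])
```

In practice the AR order and coefficients would be fitted to recorded gyro output first; the sketch only shows the predict and correct steps that such a model feeds.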
Robust camera calibration for sport videos using court models
NASA Astrophysics Data System (ADS)
Farin, Dirk; Krabbe, Susanne; de With, Peter H. N.; Effelsberg, Wolfgang
2003-12-01
We propose an automatic camera calibration algorithm for court sports. The obtained camera calibration parameters are required for applications that need to convert positions in the video frame to real-world coordinates or vice versa. Our algorithm uses a model of the arrangement of court lines for calibration. Since the court model can be specified by the user, the algorithm can be applied to a variety of different sports. The algorithm starts with a model initialization step which locates the court in the image without any user assistance or a-priori knowledge about the most probable position. Image pixels are classified as court line pixels if they pass several tests including color and local texture constraints. A Hough transform is applied to extract line elements, forming a set of court line candidates. The subsequent combinatorial search establishes correspondences between lines in the input image and lines from the court model. For the succeeding input frames, an abbreviated calibration algorithm is used, which predicts the camera parameters for the new image and optimizes the parameters using a gradient-descent algorithm. We have conducted experiments on a variety of sport videos (tennis, volleyball, and goal area sequences of soccer games). Video scenes with considerable difficulties were selected to test the robustness of the algorithm. Results show that the algorithm is very robust to occlusions, partial court views, bad lighting conditions, or shadows.
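The Hough-transform step mentioned above can be sketched as follows. This is a minimal, pure-Python illustration that assumes the court-line pixels have already been classified; the function names, the angular resolution, and the bin size are hypothetical choices, not the authors' implementation.

```python
import math
from collections import Counter


def hough_votes(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in (rho, theta) space for a list of (x, y) pixels.

    Each pixel votes for every line rho = x*cos(theta) + y*sin(theta)
    that passes through it, with theta sampled in [0, pi).
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), t)] += 1
    return votes


def strongest_line(points, n_theta=180, rho_step=1.0):
    """Return (rho, theta, vote_count) of the most voted line candidate."""
    votes = hough_votes(points, n_theta, rho_step)
    (rho_bin, t), count = votes.most_common(1)[0]
    return rho_bin * rho_step, math.pi * t / n_theta, count


if __name__ == "__main__":
    # A horizontal line y = 10 plus two stray pixels.
    pixels = [(x, 10) for x in range(50)] + [(3, 40), (17, 2)]
    print(strongest_line(pixels))
```

A full pipeline would keep several accumulator peaks as court-line candidates rather than only the strongest one.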
Dactyl Alphabet Gesture Recognition in a Video Sequence Using Microsoft Kinect
NASA Astrophysics Data System (ADS)
Artyukhin, S. G.; Mestetskiy, L. M.
2015-05-01
This paper presents an efficient framework for static gesture recognition based on data obtained from web cameras and the Kinect depth sensor (RGB-D data). Each gesture is given by a pair of images: a color image and a depth map. The database stores gestures by their feature descriptions, generated frame by frame for each gesture of the alphabet. The recognition algorithm takes a video sequence (a sequence of frames) as input for labeling and either puts each frame in correspondence with a gesture from the database or decides that there is no suitable gesture in the database. First, each frame of the video sequence is classified separately, without interframe information. Then, a sequence of successfully labeled frames with the same gesture is grouped into a single static gesture. We propose a method that combines segmentation of the frame by depth map and by RGB image. The primary segmentation is based on the depth map. It gives information about the position and yields a rough border of the hand. Then, based on the color image, the border is refined and the shape of the hand is analyzed. The method of continuous skeletons is used to generate features. We propose a method of skeleton terminal branches, which makes it possible to determine the positions of the fingers and wrist. The classification feature for a gesture is a description of the positions of the fingers relative to the wrist. The experiments were carried out with the developed algorithm on the example of American Sign Language. An American Sign Language gesture has several components, including the shape of the hand, its orientation in space, and the type of movement. The accuracy of the proposed method is evaluated on a collected gesture database consisting of 2700 frames.
On the cyclic nature of perception in vision versus audition
VanRullen, Rufin; Zoefel, Benedikt; Ilhan, Barkin
2014-01-01
Does our perceptual awareness consist of a continuous stream, or a discrete sequence of perceptual cycles, possibly associated with the rhythmic structure of brain activity? This has been a long-standing question in neuroscience. We review recent psychophysical and electrophysiological studies indicating that part of our visual awareness proceeds in approximately 7–13 Hz cycles rather than continuously. On the other hand, experimental attempts at applying similar tools to demonstrate the discreteness of auditory awareness have been largely unsuccessful. We argue and demonstrate experimentally that visual and auditory perception are not equally affected by temporal subsampling of their respective input streams: video sequences remain intelligible at sampling rates of two to three frames per second, whereas audio inputs lose their fine temporal structure, and thus all significance, below 20–30 samples per second. This does not mean, however, that our auditory perception must proceed continuously. Instead, we propose that audition could still involve perceptual cycles, but the periodic sampling should happen only after the stage of auditory feature extraction. In addition, although visual perceptual cycles can follow one another at a spontaneous pace largely independent of the visual input, auditory cycles may need to sample the input stream more flexibly, by adapting to the temporal structure of the auditory inputs. PMID:24639585
Lazar, Aurel A; Pnevmatikakis, Eftychios A
2011-03-01
We investigate architectures for time encoding and time decoding of visual stimuli such as natural and synthetic video streams (movies, animation). The architecture for time encoding is akin to models of the early visual system. It consists of a bank of filters in cascade with single-input multi-output neural circuits. Neuron firing is based on either a threshold-and-fire or an integrate-and-fire spiking mechanism with feedback. We show that analog information is represented by the neural circuits as projections on a set of band-limited functions determined by the spike sequence. Under Nyquist-type and frame conditions, the encoded signal can be recovered from these projections with arbitrary precision. For the video time encoding machine architecture, we demonstrate that band-limited video streams of finite energy can be faithfully recovered from the spike trains and provide a stable algorithm for perfect recovery. The key condition for recovery calls for the number of neurons in the population to be above a threshold value.
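The integrate-and-fire mechanism named above can be sketched for a single neuron. This is a simplified illustration without the filter bank or feedback described in the abstract; the parameter names bias, kappa, and delta are assumptions, and the bias is assumed to exceed the input magnitude so that the integrator always rises.

```python
def iaf_encode(u, dt, bias=1.0, kappa=1.0, delta=0.1):
    """Encode samples u (spaced dt apart) into spike times.

    The neuron integrates (u + bias) / kappa and emits a spike each time
    the integral reaches the threshold delta, then subtracts delta.
    """
    spikes = []
    v = 0.0
    for i, sample in enumerate(u):
        v += (sample + bias) * dt / kappa
        if v >= delta:
            spikes.append((i + 1) * dt)
            v -= delta
    return spikes


if __name__ == "__main__":
    quiet = iaf_encode([0.0] * 1000, dt=0.001)
    loud = iaf_encode([1.0] * 1000, dt=0.001)
    # A larger input amplitude shortens the interval between spikes.
    print(len(quiet), len(loud))
```

Between consecutive spikes the integral of (u + bias) equals kappa * delta, which is the relation a decoder can use to recover projections of the input from the spike times.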
Lazar, Aurel A.; Pnevmatikakis, Eftychios A.
2013-01-01
We investigate architectures for time encoding and time decoding of visual stimuli such as natural and synthetic video streams (movies, animation). The architecture for time encoding is akin to models of the early visual system. It consists of a bank of filters in cascade with single-input multi-output neural circuits. Neuron firing is based on either a threshold-and-fire or an integrate-and-fire spiking mechanism with feedback. We show that analog information is represented by the neural circuits as projections on a set of band-limited functions determined by the spike sequence. Under Nyquist-type and frame conditions, the encoded signal can be recovered from these projections with arbitrary precision. For the video time encoding machine architecture, we demonstrate that band-limited video streams of finite energy can be faithfully recovered from the spike trains and provide a stable algorithm for perfect recovery. The key condition for recovery calls for the number of neurons in the population to be above a threshold value. PMID:21296708
Learning the Gestalt rule of collinearity from object motion.
Prodöhl, Carsten; Würtz, Rolf P; von der Malsburg, Christoph
2003-08-01
The Gestalt principle of collinearity (and curvilinearity) is widely regarded as being mediated by the long-range connection structure in primary visual cortex. We review the neurophysiological and psychophysical literature to argue that these connections are developed from visual experience after birth, relying on coherent object motion. We then present a neural network model that learns these connections in an unsupervised Hebbian fashion with input from real camera sequences. The model uses spatiotemporal retinal filtering, which is very sensitive to changes in the visual input. We show that it is crucial for successful learning to use the correlation of the transient responses instead of the sustained ones. As a consequence, learning works best with video sequences of moving objects. The model addresses a special case of the fundamental question of what represents the necessary a priori knowledge the brain is equipped with at birth so that the self-organized process of structuring by experience can be successful.
NASA Astrophysics Data System (ADS)
Murillo, Sergio; Pattichis, Marios; Soliz, Peter; Barriga, Simon; Loizou, C. P.; Pattichis, C. S.
2010-03-01
Motion estimation from digital video is an ill-posed problem that requires a regularization approach. Regularization introduces a smoothness constraint that can reduce the resolution of the velocity estimates. The problem is further complicated for ultrasound videos (US), where speckle noise levels can be significant. Motion estimation using optical flow models requires the modification of several parameters to satisfy the optical flow constraint as well as the level of imposed smoothness. Furthermore, except in simulations or mostly unrealistic cases, there is no ground truth to use for validating the velocity estimates. This problem is present in all real video sequences that are used as input to motion estimation algorithms. It is also an open problem in biomedical applications like motion analysis of US of carotid artery (CA) plaques. In this paper, we study the problem of obtaining reliable ultrasound video motion estimates for atherosclerotic plaques for use in clinical diagnosis. A global optimization framework for motion parameter optimization is presented. This framework uses actual carotid artery motions to provide optimal parameter values for a variety of motions and is tested on ten different US videos using two different motion estimation techniques.
Spherical rotation orientation indication for HEVC and JEM coding of 360 degree video
NASA Astrophysics Data System (ADS)
Boyce, Jill; Xu, Qian
2017-09-01
Omnidirectional (or "360 degree") video, representing a panoramic view of a spherical 360° ×180° scene, can be encoded using conventional video compression standards, once it has been projection mapped to a 2D rectangular format. Equirectangular projection format is currently used for mapping 360 degree video to a rectangular representation for coding using HEVC/JEM. However, video in the top and bottom regions of the image, corresponding to the "north pole" and "south pole" of the spherical representation, is significantly warped. We propose to perform spherical rotation of the input video prior to HEVC/JEM encoding in order to improve the coding efficiency, and to signal parameters in a supplemental enhancement information (SEI) message that describe the inverse rotation process recommended to be applied following HEVC/JEM decoding, prior to display. Experiment results show that up to 17.8% bitrate gain (using the WS-PSNR end-to-end metric) can be achieved for the Chairlift sequence using HM16.15 and 11.9% gain using JEM6.0, and an average gain of 2.9% for HM16.15 and 2.2% for JEM6.0.
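The rotation and its inverse can be sketched on sphere coordinates. This is a minimal illustration that assumes the rotation is parameterized by a yaw angle about the vertical axis followed by a pitch angle; the function names and this two-angle parameterization are assumptions for the example, not the syntax of the SEI message.

```python
import math


def _to_vec(lon, lat):
    """Longitude/latitude in radians to a unit vector."""
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def _to_lonlat(x, y, z):
    """Unit vector back to longitude/latitude in radians."""
    return math.atan2(y, x), math.asin(max(-1.0, min(1.0, z)))


def rotate_lonlat(lon, lat, yaw=0.0, pitch=0.0):
    """Apply yaw (about z) and then pitch (about y) to a sphere point."""
    x, y, z = _to_vec(lon, lat)
    x, y = (x * math.cos(yaw) - y * math.sin(yaw),
            x * math.sin(yaw) + y * math.cos(yaw))
    x, z = (x * math.cos(pitch) - z * math.sin(pitch),
            x * math.sin(pitch) + z * math.cos(pitch))
    return _to_lonlat(x, y, z)


def inverse_rotate_lonlat(lon, lat, yaw=0.0, pitch=0.0):
    """Undo rotate_lonlat: inverse pitch first, then inverse yaw."""
    x, y, z = _to_vec(lon, lat)
    x, z = (x * math.cos(pitch) + z * math.sin(pitch),
            -x * math.sin(pitch) + z * math.cos(pitch))
    x, y = (x * math.cos(yaw) + y * math.sin(yaw),
            -x * math.sin(yaw) + y * math.cos(yaw))
    return _to_lonlat(x, y, z)


if __name__ == "__main__":
    moved = rotate_lonlat(0.7, -0.4, yaw=1.1, pitch=0.5)
    print(inverse_rotate_lonlat(*moved, yaw=1.1, pitch=0.5))
```

An encoder-side tool would resample every equirectangular pixel through such a mapping before coding, and the display side would apply the inverse after decoding.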
Reconstructing Interlaced High-Dynamic-Range Video Using Joint Learning.
Inchang Choi; Seung-Hwan Baek; Kim, Min H
2017-11-01
For extending the dynamic range of video, it is a common practice to capture multiple frames sequentially with different exposures and combine them to extend the dynamic range of each video frame. However, this approach results in typical ghosting artifacts due to fast and complex motion in nature. As an alternative, video imaging with interlaced exposures has been introduced to extend the dynamic range. However, the interlaced approach has been hindered by jaggy artifacts and sensor noise, leading to concerns over image quality. In this paper, we propose a data-driven approach for jointly solving two specific problems of deinterlacing and denoising that arise in interlaced video imaging with different exposures. First, we solve the deinterlacing problem using joint dictionary learning via sparse coding. Since partial information of detail in differently exposed rows is often available via interlacing, we make use of the information to reconstruct details of the extended dynamic range from the interlaced video input. Second, we jointly solve the denoising problem by tailoring sparse coding to better handle additive noise in low-/high-exposure rows, and also adopt multiscale homography flow to temporal sequences for denoising. We anticipate that the proposed method will allow for concurrent capture of higher dynamic range video frames without suffering from ghosting artifacts. We demonstrate the advantages of our interlaced video imaging compared with the state-of-the-art high-dynamic-range video methods.
Long-term scale adaptive tracking with kernel correlation filters
NASA Astrophysics Data System (ADS)
Wang, Yueren; Zhang, Hong; Zhang, Lei; Yang, Yifan; Sun, Mingui
2018-04-01
Object tracking in video sequences has broad applications in both military and civilian domains. However, as the length of the input video sequence increases, a number of problems arise, such as severe object occlusion, object appearance variation, and object out-of-view (some portion or the entire object leaves the image space). To deal with these problems and identify the object being tracked against a cluttered background, we present a robust appearance model using Speeded Up Robust Features (SURF) and advanced integrated features consisting of Felzenszwalb's Histogram of Oriented Gradients (FHOG) and color attributes. Since re-detection is essential in long-term tracking, we develop an effective object re-detection strategy based on moving area detection. We employ the popular kernel correlation filters in our algorithm design, which facilitates high-speed object tracking. Our evaluation using the CVPR2013 Object Tracking Benchmark (OTB2013) dataset illustrates that the proposed algorithm outperforms reference state-of-the-art trackers in various challenging scenarios.
Video Vectorization via Tetrahedral Remeshing.
Wang, Chuan; Zhu, Jie; Guo, Yanwen; Wang, Wenping
2017-02-09
We present a video vectorization method that generates a video in vector representation from an input video in raster representation. A vector-based video representation offers the benefits of vector graphics, such as compactness and scalability. The vector video we generate is represented by a simplified tetrahedral control mesh over the spatial-temporal video volume, with color attributes defined at the mesh vertices. We present novel techniques for simplification and subdivision of a tetrahedral mesh to achieve high simplification ratio while preserving features and ensuring color fidelity. From an input raster video, our method is capable of generating a compact video in vector representation that allows a faithful reconstruction with low reconstruction errors.
Lifelong learning of human actions with deep neural network self-organization.
Parisi, German I; Tani, Jun; Weber, Cornelius; Wermter, Stefan
2017-12-01
Lifelong learning is fundamental in autonomous robotics for the acquisition and fine-tuning of knowledge through experience. However, conventional deep neural models for action recognition from videos do not account for lifelong learning but rather learn a batch of training data with a predefined number of action classes and samples. Thus, there is the need to develop learning systems with the ability to incrementally process available perceptual cues and to adapt their responses over time. We propose a self-organizing neural architecture for incrementally learning to classify human actions from video sequences. The architecture comprises growing self-organizing networks equipped with recurrent neurons for processing time-varying patterns. We use a set of hierarchically arranged recurrent networks for the unsupervised learning of action representations with increasingly large spatiotemporal receptive fields. Lifelong learning is achieved in terms of prediction-driven neural dynamics in which the growth and the adaptation of the recurrent networks are driven by their capability to reconstruct temporally ordered input sequences. Experimental results on a classification task using two action benchmark datasets show that our model is competitive with state-of-the-art methods for batch learning also when a significant number of sample labels are missing or corrupted during training sessions. Additional experiments show the ability of our model to adapt to non-stationary input avoiding catastrophic interference. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Tornow, Ralf P.; Milczarek, Aleksandra; Odstrcilik, Jan; Kolar, Radim
2017-07-01
A parallel video ophthalmoscope was developed to acquire short video sequences (25 fps, 250 frames) of both eyes simultaneously with exact synchronization. Video sequences were registered off-line to compensate for eye movements. From registered video sequences dynamic parameters like cardiac cycle induced reflection changes and eye movements can be calculated and compared between eyes.
Video Super-Resolution via Bidirectional Recurrent Convolutional Networks.
Huang, Yan; Wang, Wei; Wang, Liang
2018-04-01
Super resolving a low-resolution video, namely video super-resolution (SR), is usually handled by either single-image SR or multi-frame SR. Single-image SR deals with each video frame independently and ignores the intrinsic temporal dependency of video frames, which actually plays a very important role in video SR. Multi-frame SR generally extracts motion information, e.g., optical flow, to model the temporal dependency, but often at high computational cost. Considering that recurrent neural networks (RNNs) can model long-term temporal dependency of video sequences well, we propose a fully convolutional RNN named bidirectional recurrent convolutional network for efficient multi-frame SR. Unlike vanilla RNNs, 1) the commonly used full feedforward and recurrent connections are replaced with weight-sharing convolutional connections, which greatly reduce the number of network parameters and model the temporal dependency at a finer level, i.e., patch-based rather than frame-based, and 2) connections from input layers at previous timesteps to the current hidden layer are added by 3D feedforward convolutions, which aim to capture discriminative spatio-temporal patterns for short-term fast-varying motions in local adjacent frames. Due to the cheap convolutional operations, our model has a low computational complexity and runs orders of magnitude faster than other multi-frame SR methods. With the powerful temporal dependency modeling, our model can super resolve videos with complex motions and achieve good performance.
Auditory access, language access, and implicit sequence learning in deaf children.
Hall, Matthew L; Eigsti, Inge-Marie; Bortfeld, Heather; Lillo-Martin, Diane
2018-05-01
Developmental psychology plays a central role in shaping evidence-based best practices for prelingually deaf children. The Auditory Scaffolding Hypothesis (Conway et al., 2009) asserts that a lack of auditory stimulation in deaf children leads to impoverished implicit sequence learning abilities, measured via an artificial grammar learning (AGL) task. However, prior research is confounded by a lack of both auditory and language input. The current study examines implicit learning in deaf children who were (Deaf native signers) or were not (oral cochlear implant users) exposed to language from birth, and in hearing children, using both AGL and Serial Reaction Time (SRT) tasks. Neither deaf nor hearing children across the three groups show evidence of implicit learning on the AGL task, but all three groups show robust implicit learning on the SRT task. These findings argue against the Auditory Scaffolding Hypothesis, and suggest that implicit sequence learning may be resilient to both auditory and language deprivation, within the tested limits. A video abstract of this article can be viewed at: https://youtu.be/EeqfQqlVHLI [Correction added on 07 August 2017, after first online publication: The video abstract link was added.]. © 2017 John Wiley & Sons Ltd.
MPEG-4 ASP SoC receiver with novel image enhancement techniques for DAB networks
NASA Astrophysics Data System (ADS)
Barreto, D.; Quintana, A.; García, L.; Callicó, G. M.; Núñez, A.
2007-05-01
This paper presents a system for real-time video reception in low-power mobile devices using Digital Audio Broadcast (DAB) technology for transmission. A demo receiver terminal is implemented on an FPGA platform using the Advanced Simple Profile (ASP) MPEG-4 standard for video decoding. To meet the demanding DAB requirements, the bandwidth of the encoded sequence must be drastically reduced. To this end, a pre-processing stage is performed prior to the MPEG-4 coding stage. It is composed first of a segmentation phase according to motion and texture, based on the Principal Component Analysis (PCA) of the input video sequence, and second of a down-sampling phase, which depends on the segmentation results. As a result of the segmentation task, a set of texture and motion maps is obtained. These motion and texture maps are also included in the bit-stream as user data side-information and are therefore known to the receiver. For all bit-rates, the whole encoder/decoder system proposed in this paper exhibits higher image visual quality than the alternative encoding/decoding method, assuming equal image sizes. A complete analysis of both techniques has also been performed to provide the optimum motion and texture maps for the global system, which has been finally validated for a variety of video sequences. Additionally, an optimal HW/SW partition for the MPEG-4 decoder has been studied and implemented over a Programmable Logic Device with an embedded ARM9 processor. Simulation results show that a throughput of 15 QCIF frames per second can be achieved with a low-area, low-power implementation.
Multiframe digitization of x-ray (TV) images (abstract)
NASA Astrophysics Data System (ADS)
Karpenko, V. A.; Khil'chenko, A. D.; Lysenko, A. P.; Panchenko, V. E.
1989-07-01
The work in progress deals with the experimental search for a technique of digitizing x-ray TV images. The small volume of the buffer memory of the analog-to-digital (A/D) converter (ADC) we have previously used to detect TV signals made it necessary to digitize only one line at a time of the television raster and also to make use of gating to gain the video information contained in the whole frame. This paper is devoted to multiframe digitizing. The recorder of video signals comprises a broadband 8-bit A/D converter, a buffer memory having 128K words and a control circuit which forms a necessary sequence of advance pulses for the A/D converter and the memory relative to the input frame and line sync pulses (FSP and LSP). The device provides recording of video signals corresponding to one or a few frames following one after another, or to their fragments. The control circuit is responsible for the separation of the required fragment of the TV image. When loading the limit registers, the following input parameters of the control circuit are set: the skipping of a definite number of lines after the next FSP, the number of the lines of recording inside a fragment, the frequency of the information lines inside a fragment, the delay in the start of the ADC conversion relative to the arrival of the LSP, the length of the information section of a line, and the frequency of taking the readouts in a line. In addition, among the instructions given are the number of frames of recording and the frequency of their sequence. Thus, the A/D converter operates only inside a given fragment of the TV image. The information is introduced into the memory in sequence, fragment by fragment, without skipping and is then extracted as samples according to the addresses needed for representation in the required form, and processing. The video signal recorder governs the shortest time of the ADC conversion per point of 250 ns. 
As before, among the apparatus used were an image vidicon with luminophor conversion of x-radiation to light, and a single-crystal x-ray diffraction scheme necessary to form dynamic test objects from x-ray lines dispersed in space (the projections of the linear focus of an x-ray tube).
Video-tracker trajectory analysis: who meets whom, when and where
NASA Astrophysics Data System (ADS)
Jäger, U.; Willersinn, D.
2010-04-01
Unveiling unusual or hostile events by observing many moving persons in a crowd is a challenging task for human operators, especially when sitting in front of monitor walls for hours. Typically, hostile events are rare. Thus, due to tiredness and negligence the operator may miss important events. In such situations, an automatic alarming system is able to support the human operator. The system incorporates a processing chain consisting of (1) people tracking, (2) event detection, (3) data retrieval, and (4) display of the relevant video sequence overlaid by highlighted regions of interest. In this paper we focus on the event detection stage of the processing chain mentioned above. In our case, the selected event of interest is the encounter of people. Although based on a rather simple trajectory analysis, this kind of event is of great practical importance because it paves the way to answering the question "who meets whom, when and where". This, in turn, forms the basis for detecting potential situations where e.g. money, weapons or drugs are handed over from one person to another in crowded environments like railway stations, airports or busy streets and places. The input to the trajectory analysis comes from a multi-object video-based tracking system developed at IOSB which is able to track multiple individuals within a crowd in real-time [1]. From this we calculate the inter-distances between all persons on a frame-to-frame basis. We use a sequence of simple rules based on the individuals' kinematics to detect the event mentioned above and to output the frame number, the persons' IDs from the tracker and the pixel coordinates of the meeting position. Using this information, a data retrieval system may extract the corresponding part of the recorded video image sequence and finally allow the selected video clip to be replayed with a highlighted region of interest to attract the operator's attention for further visual inspection.
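The frame-to-frame distance analysis above can be illustrated with a minimal sketch. It assumes a single rule, namely that two tracked persons stay within max_dist pixels for min_frames consecutive frames; this rule, the thresholds, and the data layout are hypothetical and stand in for the kinematic rules of the paper, which the abstract does not spell out.

```python
import math
from itertools import combinations


def detect_encounters(frames, max_dist=50.0, min_frames=3):
    """Detect encounters between tracked persons.

    frames: list with one dict per frame, mapping person id -> (x, y).
    Returns a list of (frame_number, id_a, id_b, meeting_position).
    """
    streak = {}   # pair -> count of consecutive frames spent close together
    events = []
    for frame_no, positions in enumerate(frames):
        close_now = set()
        for a, b in combinations(sorted(positions), 2):
            (xa, ya), (xb, yb) = positions[a], positions[b]
            if math.hypot(xa - xb, ya - yb) <= max_dist:
                close_now.add((a, b))
        for pair in sorted(close_now):
            streak[pair] = streak.get(pair, 0) + 1
            if streak[pair] == min_frames:
                a, b = pair
                (xa, ya), (xb, yb) = positions[a], positions[b]
                midpoint = ((xa + xb) / 2, (ya + yb) / 2)
                events.append((frame_no, a, b, midpoint))
        # A pair that separates must start a new streak later.
        for pair in list(streak):
            if pair not in close_now:
                del streak[pair]
    return events


if __name__ == "__main__":
    track = [{1: (0, 0), 2: (max(30, 200 - 40 * k), 0), 3: (1000, 1000)}
             for k in range(8)]
    print(detect_encounters(track))
```

The returned frame number, IDs, and position correspond to the three outputs the abstract lists, so a retrieval stage could cut the matching clip from them.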
Empirical mode decomposition-based facial pose estimation inside video sequences
NASA Astrophysics Data System (ADS)
Qing, Chunmei; Jiang, Jianmin; Yang, Zhijing
2010-03-01
We describe a new pose-estimation algorithm that integrates the strengths of both empirical mode decomposition (EMD) and mutual information. While mutual information is exploited to measure the similarity between facial images to estimate poses, EMD is exploited to decompose input facial images into a number of intrinsic mode function (IMF) components, which redistribute the effects of noise, expression changes, and illumination variations such that, when the input facial image is described by the selected IMF components, these negative effects can be minimized. Extensive experiments were carried out in comparison with existing representative techniques, and the results show that the proposed algorithm achieves better pose-estimation performance with robustness to noise corruption, illumination variation, and facial expressions.
Optimal input sizes for neural network de-interlacing
NASA Astrophysics Data System (ADS)
Choi, Hyunsoo; Seo, Guiwon; Lee, Chulhee
2009-02-01
Neural network de-interlacing has shown promising results among various de-interlacing methods. In this paper, we investigate the effects of input size for neural networks for various video formats when the neural networks are used for de-interlacing. In particular, we investigate optimal input sizes for CIF, VGA and HD video formats.
Learning Collaborative Sparse Representation for Grayscale-Thermal Tracking.
Li, Chenglong; Cheng, Hui; Hu, Shiyi; Liu, Xiaobai; Tang, Jin; Lin, Liang
2016-09-27
Integrating multiple different yet complementary feature representations has been proved to be an effective way for boosting tracking performance. This paper investigates how to perform robust object tracking in challenging scenarios by adaptively incorporating information from grayscale and thermal videos, and proposes a novel collaborative algorithm for online tracking. In particular, an adaptive fusion scheme is proposed based on collaborative sparse representation in Bayesian filtering framework. We jointly optimize sparse codes and the reliable weights of different modalities in an online way. In addition, this work contributes a comprehensive video benchmark, which includes 50 grayscale-thermal sequences and their ground truth annotations for tracking purpose. The videos are with high diversity and the annotations were finished by one single person to guarantee consistency. Extensive experiments against other state-of-the-art trackers with both grayscale and grayscale-thermal inputs demonstrate the effectiveness of the proposed tracking approach. Through analyzing quantitative results, we also provide basic insights and potential future research directions in grayscale-thermal tracking.
Video Image Stabilization and Registration
NASA Technical Reports Server (NTRS)
Hathaway, David H. (Inventor); Meyer, Paul J. (Inventor)
2002-01-01
A method of stabilizing and registering a video image in multiple video fields of a video sequence provides accurate determination of the image change in magnification, rotation and translation between video fields, so that the video fields may be accurately corrected for these changes in the image in the video sequence. In a described embodiment, a key area of a key video field is selected which contains an image which it is desired to stabilize in a video sequence. The key area is subdivided into nested pixel blocks and the translation of each of the pixel blocks from the key video field to a new video field is determined as a precursor to determining change in magnification, rotation and translation of the image from the key video field to the new video field.
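As a rough illustration of the last step described above, the per-block translations can be turned into a single magnification, rotation and translation estimate with a least-squares fit. The sketch below is not the patented method; it assumes the block centres and their measured shifts are already available, and the function name `fit_similarity` is hypothetical.

```python
import math

def fit_similarity(centers, shifts):
    """Least-squares magnification, rotation and translation from block shifts.

    centers: (x, y) block centres in the key field (assumed input).
    shifts:  (dx, dy) measured translation of each block in the new field.
    Model:   u = a*x - b*y + tx,  v = b*x + a*y + ty,
             with a = s*cos(theta) and b = s*sin(theta).
    """
    n = len(centers)
    moved = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(centers, shifts)]
    mx = sum(x for x, _ in centers) / n
    my = sum(y for _, y in centers) / n
    mu = sum(u for u, _ in moved) / n
    mv = sum(v for _, v in moved) / n
    num_a = num_b = den = 0.0
    for (x, y), (u, v) in zip(centers, moved):
        xc, yc, uc, vc = x - mx, y - my, u - mu, v - mv
        num_a += xc * uc + yc * vc
        num_b += xc * vc - yc * uc
        den += xc * xc + yc * yc
    a, b = num_a / den, num_b / den
    scale = math.hypot(a, b)       # change in magnification
    angle = math.atan2(b, a)       # rotation in radians
    tx = mu - (a * mx - b * my)    # translation of the origin
    ty = mv - (b * mx + a * my)
    return scale, angle, (tx, ty)
```

At least two distinct block centres are needed; more blocks average out errors in the individual shift measurements.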
Video Image Stabilization and Registration
NASA Technical Reports Server (NTRS)
Hathaway, David H. (Inventor); Meyer, Paul J. (Inventor)
2003-01-01
A method of stabilizing and registering a video image in multiple video fields of a video sequence provides accurate determination of the image change in magnification, rotation and translation between video fields, so that the video fields may be accurately corrected for these changes in the image in the video sequence. In a described embodiment, a key area of a key video field is selected which contains an image which it is desired to stabilize in a video sequence. The key area is subdivided into nested pixel blocks and the translation of each of the pixel blocks from the key video field to a new video field is determined as a precursor to determining change in magnification, rotation and translation of the image from the key video field to the new video field.
NASA Astrophysics Data System (ADS)
Pandremmenou, K.; Shahid, M.; Kondi, L. P.; Lövström, B.
2015-03-01
In this work, we propose a No-Reference (NR) bitstream-based model for predicting the quality of H.264/AVC video sequences, affected by both compression artifacts and transmission impairments. The proposed model is based on a feature extraction procedure, where a large number of features are calculated from the packet-loss impaired bitstream. Many of the features are firstly proposed in this work, and the specific set of the features as a whole is applied for the first time for making NR video quality predictions. All feature observations are taken as input to the Least Absolute Shrinkage and Selection Operator (LASSO) regression method. LASSO indicates the most important features, and using only them, it is possible to estimate the Mean Opinion Score (MOS) with high accuracy. Indicatively, we point out that only 13 features are able to produce a Pearson Correlation Coefficient of 0.92 with the MOS. Interestingly, the performance statistics we computed in order to assess our method for predicting the Structural Similarity Index and the Video Quality Metric are equally good. Thus, the obtained experimental results verified the suitability of the features selected by LASSO as well as the ability of LASSO in making accurate predictions through sparse modeling.
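The feature-selection behaviour of LASSO mentioned above can be seen in a minimal coordinate-descent sketch. This is a generic textbook solver, not the authors' implementation; it assumes the features and target are already centred, minimises (1/(2n))·||y − Xw||² + λ·||w||₁, and the names `soft_threshold` and `lasso_coordinate_descent` are illustrative.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam; small values become exactly zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for LASSO without an intercept term."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue  # constant-zero feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, lam) / z
            delta = new_w - w[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                w[j] = new_w
    return w
```

Features whose weight ends at exactly zero are the ones LASSO discards; the surviving features play the role of the small subset used to predict the quality score.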
ERIC Educational Resources Information Center
Inceçay, Volkan; Koçoglu, Zeynep
2017-01-01
The present study examined whether or not different input delivery modes have an effect on listening comprehension of Turkish students learning English at the university level. It investigated the effect of one single mode, which is audio-only, and three dual input delivery modes, which were audio-video, audio-video with target language subtitles…
Evolving discriminators for querying video sequences
NASA Astrophysics Data System (ADS)
Iyengar, Giridharan; Lippman, Andrew B.
1997-01-01
In this paper we present a framework for content based query and retrieval of information from large video databases. This framework enables content based retrieval of video sequences by characterizing the sequences using motion, texture and colorimetry cues. This characterization is biologically inspired and results in a compact parameter space where every segment of video is represented by an 8 dimensional vector. Searching and retrieval is done in real-time with accuracy in this parameter space. Using this characterization, we then evolve a set of discriminators using Genetic Programming. Experiments indicate that these discriminators are capable of analyzing and characterizing video. The VideoBook is able to search and retrieve video sequences with 92% accuracy in real-time. Experiments thus demonstrate that the characterization is capable of extracting higher level structure from raw pixel values.
NASA Astrophysics Data System (ADS)
Levchuk, Georgiy; Bobick, Aaron; Jones, Eric
2010-04-01
In this paper, we describe results from experimental analysis of a model designed to recognize activities and functions of moving and static objects from low-resolution wide-area video inputs. Our model is based on representing the activities and functions using three variables: (i) time; (ii) space; and (iii) structures. The activity and function recognition is achieved by imposing lexical, syntactic, and semantic constraints on the lower-level event sequences. In the reported research, we have evaluated the utility and sensitivity of several algorithms derived from natural language processing and pattern recognition domains. We achieved high recognition accuracy for a wide range of activity and function types in the experiments using Electro-Optical (EO) imagery collected by Wide Area Airborne Surveillance (WAAS) platform.
Design of an All-Optical Network Based on LCoS Technologies
NASA Astrophysics Data System (ADS)
Cheng, Yuh-Jiuh; Shiau, Yhi
2016-06-01
In this paper, an all-optical network composed of the ROADMs (reconfigurable optical add-drop multiplexer), L2/L3 optical packet switches, and the fiber optical cross-connection for fiber scheduling and measurement based on LCoS (liquid crystal on silicon) technologies is proposed. The L2/L3 optical packet switches are designed with optical output buffers. Only the header of optical packets is converted to electronic signals to control the wavelength of input ports and the packet payloads can be transparently destined to their output ports. An optical output buffer is designed to queue the packets when more than one incoming packet should reach to the same destination output port. For preserving service-packet sequencing and fairness of routing sequence, a priority scheme and a round-robin algorithm are adopted at the optical output buffer. The wavelength of input ports is designed for routing incoming packets using LCoS technologies. Finally, the proposed OFS (optical flow switch) with input buffers can quickly transfer the big data to the output ports and the main purpose of the OFS is to reduce the number of wavelength reflections. The all-optical content delivery network is comprised of the OFSs for a large amount of audio and video data transmissions in the future.
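The output-buffer policy described above (a priority scheme combined with round-robin service) can be sketched in software as follows. This is only an illustration under assumptions not stated in the abstract: two priority levels, one FIFO per input port, and a hypothetical class name `OutputBuffer`.

```python
from collections import deque

class OutputBuffer:
    """Per-input FIFOs for one output port, served by priority then round-robin."""

    def __init__(self, n_inputs):
        self.high = [deque() for _ in range(n_inputs)]
        self.low = [deque() for _ in range(n_inputs)]
        self.next_port = 0  # input port that gets the first chance next time

    def enqueue(self, port, packet, high_priority=False):
        queues = self.high if high_priority else self.low
        queues[port].append(packet)  # FIFO per port keeps packet order

    def dequeue(self):
        """Return the next packet to send, or None if everything is empty."""
        for queues in (self.high, self.low):      # high priority first
            n = len(queues)
            for offset in range(n):               # round-robin within a level
                port = (self.next_port + offset) % n
                if queues[port]:
                    self.next_port = (port + 1) % n
                    return queues[port].popleft()
        return None
```

Keeping one FIFO per input port preserves the sequence of packets from each source, while the rotating `next_port` pointer keeps any single input from monopolising the output.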
Yang, Yang; Stanković, Vladimir; Xiong, Zixiang; Zhao, Wei
2009-03-01
Following recent works on the rate region of the quadratic Gaussian two-terminal source coding problem and limit-approaching code designs, this paper examines multiterminal source coding of two correlated, i.e., stereo, video sequences to save the sum rate over independent coding of both sequences. Two multiterminal video coding schemes are proposed. In the first scheme, the left sequence of the stereo pair is coded by H.264/AVC and used at the joint decoder to facilitate Wyner-Ziv coding of the right video sequence. The first I-frame of the right sequence is successively coded by H.264/AVC Intracoding and Wyner-Ziv coding. An efficient stereo matching algorithm based on loopy belief propagation is then adopted at the decoder to produce pixel-level disparity maps between the corresponding frames of the two decoded video sequences on the fly. Based on the disparity maps, side information for both motion vectors and motion-compensated residual frames of the right sequence are generated at the decoder before Wyner-Ziv encoding. In the second scheme, source splitting is employed on top of classic and Wyner-Ziv coding for compression of both I-frames to allow flexible rate allocation between the two sequences. Experiments with both schemes on stereo video sequences using H.264/AVC, LDPC codes for Slepian-Wolf coding of the motion vectors, and scalar quantization in conjunction with LDPC codes for Wyner-Ziv coding of the residual coefficients give a slightly lower sum rate than separate H.264/AVC coding of both sequences at the same video quality.
NASA Astrophysics Data System (ADS)
Kypraios, Ioannis; Young, Rupert C. D.; Chatwin, Chris R.; Birch, Phil M.
2009-04-01
The window unit in the design of the complex logarithmic r-θ mapping for hybrid optical neural network filter can allow multiple objects of the same class to be detected within the input image. Additionally, the architecture of the neural network unit of the complex logarithmic r-θ mapping for hybrid optical neural network filter becomes attractive for accommodating the recognition of multiple objects of different classes within the input image by modifying the output layer of the unit. We test the overall filter for the recognition of multiple objects of the same and of different classes within cluttered input images and video sequences of cluttered scenes. The logarithmic r-θ mapping for hybrid optical neural network filter is shown to exhibit, with a single pass over the input data, simultaneous in-plane rotation, out-of-plane rotation, scale, log r-θ map translation and shift invariance, and good clutter tolerance by correctly recognizing the different objects within the cluttered scenes. We record in our results additional information extracted from the cluttered scenes about the objects' relative position, scale and in-plane rotation.
Encrypting Digital Camera with Automatic Encryption Key Deletion
NASA Technical Reports Server (NTRS)
Oakley, Ernest C. (Inventor)
2007-01-01
A digital video camera includes an image sensor capable of producing a frame of video data representing an image viewed by the sensor, an image memory for storing video data such as previously recorded frame data in a video frame location of the image memory, a read circuit for fetching the previously recorded frame data, an encryption circuit having an encryption key input connected to receive the previously recorded frame data from the read circuit as an encryption key, an un-encrypted data input connected to receive the frame of video data from the image sensor and an encrypted data output port, and a write circuit for writing a frame of encrypted video data received from the encrypted data output port of the encryption circuit to the memory and overwriting the video frame location storing the previously recorded frame data.
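The data flow described above, in which the previously recorded frame serves as the key and is then overwritten by the newly encrypted frame, can be mimicked in a few lines. The patent abstract does not specify the encryption circuit, so the XOR keystream derived with SHAKE-256 below is purely a stand-in assumption and is not meant as a secure design; `xor_with_key` and `FrameSlot` are hypothetical names.

```python
import hashlib

def xor_with_key(frame: bytes, key_material: bytes) -> bytes:
    """Stand-in cipher (assumption): XOR with a keystream derived from the key."""
    stream = hashlib.shake_256(key_material).digest(len(frame))
    return bytes(a ^ b for a, b in zip(frame, stream))

class FrameSlot:
    """One video frame location whose current contents act as the next key."""

    def __init__(self, initial: bytes):
        self.data = initial

    def record(self, new_frame: bytes) -> None:
        key = self.data                            # read circuit fetches old frame
        encrypted = xor_with_key(new_frame, key)   # encryption circuit
        self.data = encrypted                      # write circuit overwrites the key
```

After `record` returns, the key exists nowhere in the slot, which is the automatic key deletion the title refers to; anyone who retained the earlier frame contents can recover the frame by applying `xor_with_key` again with that key.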
Use of Internet Resources in the Biology Lecture Classroom.
ERIC Educational Resources Information Center
Francis, Joseph W.
2000-01-01
Introduces internet resources that are available for instructional use in biology classrooms. Provides information on video-based technologies to create and capture video sequences, interactive web sites that allow interaction with biology simulations, online texts, and interactive videos that display animated video sequences. (YDS)
Sequence to Sequence - Video to Text
2015-12-11
Saenko, and S. Guadarrama. Generating natural-language video descriptions using text - mined knowledge. In AAAI, July 2013. 2 [20] P. Kuznetsova, V...Sequence to Sequence – Video to Text Subhashini Venugopalan1 Marcus Rohrbach2,4 Jeff Donahue2 Raymond Mooney1 Trevor Darrell2 Kate Saenko3...1. Introduction Describing visual content with natural language text has recently received increased interest, especially describing images with a
NASA Astrophysics Data System (ADS)
Lee, Feifei; Kotani, Koji; Chen, Qiu; Ohmi, Tadahiro
2010-02-01
In this paper, a fast search algorithm for MPEG-4 video clips in a video database is proposed. An adjacent pixel intensity difference quantization (APIDQ) histogram, which had previously been applied reliably to human face recognition, is used as the feature vector of the VOP (video object plane). Instead of the fully decompressed video sequence, partially decoded data, namely the DC sequence of the video object, are extracted from the video sequence. Combined with active search, a temporal pruning algorithm, fast and robust video search can be realized. The proposed search algorithm has been evaluated on a total of 15 hours of video consisting of TV programs such as drama, talk, news, etc., searching for 200 given MPEG-4 video clips, each 15 seconds long. Experimental results show that the proposed algorithm can detect a similar video clip in merely 80 ms, and an Equal Error Rate (EER) of 2% is achieved in the drama and news categories, which is more accurate and robust than conventional fast video search algorithms.
Region-Based Prediction for Image Compression in the Cloud.
Begaint, Jean; Thoreau, Dominique; Guillotel, Philippe; Guillemot, Christine
2018-04-01
Thanks to the increasing number of images stored in the cloud, external image similarities can be leveraged to efficiently compress images by exploiting inter-image correlations. In this paper, we propose a novel image prediction scheme for cloud storage. Unlike current state-of-the-art methods, we use a semi-local approach to exploit inter-image correlation. The reference image is first segmented into multiple planar regions determined from matched local features and super-pixels. The geometric and photometric disparities between the matched regions of the reference image and the current image are then compensated. Finally, multiple references are generated from the estimated compensation models and organized in a pseudo-sequence to differentially encode the input image using classical video coding tools. Experimental results demonstrate that the proposed approach yields significant rate-distortion performance improvements compared with current image inter-coding solutions such as high efficiency video coding.
Physiologically Modulating Videogames or Simulations which use Motion-Sensing Input Devices
NASA Technical Reports Server (NTRS)
Pope, Alan T. (Inventor); Stephens, Chad L. (Inventor); Blanson, Nina Marie (Inventor)
2014-01-01
New types of controllers allow players to make inputs to a video game or simulation by moving the entire controller itself. This capability is typically accomplished using a wireless input device having accelerometers, gyroscopes, and an infrared LED tracking camera. The present invention exploits these wireless motion-sensing technologies to modulate the player's movement inputs to the videogame based upon physiological signals. Such biofeedback-modulated video games train valuable mental skills beyond eye-hand coordination. These psychophysiological training technologies enhance personal improvement, not just the diversion, of the user.
Short-term change detection for UAV video
NASA Astrophysics Data System (ADS)
Saur, Günter; Krüger, Wolfgang
2012-11-01
In recent years, there has been an increased use of unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. An important application in this context is change detection in UAV video data. Here we address short-term change detection, in which the time between observations ranges from several minutes to a few hours. We distinguish this task from video motion detection (shorter time scale) and from long-term change detection, based on time series of still images taken between several days, weeks, or even years. Examples of relevant changes we are looking for are recently parked or moved vehicles. As a prerequisite, a precise image-to-image registration is needed. Images are selected on the basis of the geo-coordinates of the sensor's footprint and with respect to a certain minimal overlap. The automatic image-based fine-registration adjusts the image pair to a common geometry by using a robust matching approach to handle outliers. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples of non-relevant changes are stereo disparity at 3D structures of the scene, changed length of shadows, and compression or transmission artifacts. To detect changes in image pairs we analyzed image differencing, local image correlation, and a transformation-based approach (multivariate alteration detection). As input we used color and gradient magnitude images. To cope with local misalignment of image structures we extended the approaches by a local neighborhood search. The algorithms are applied to several examples covering both urban and rural scenes. The local neighborhood search in combination with intensity and gradient magnitude differencing clearly improved the results. Extended image differencing performed better than both the correlation-based approach and the multivariate alteration detection.
The algorithms are adapted to be used in semi-automatic workflows for the ABUL video exploitation system of Fraunhofer IOSB, see Heinze et al. 2010. In a further step we plan to incorporate more information from the video sequences into the change detection input images, e.g., by image enhancement or by along-track stereo, which are available in the ABUL system.
Real-time UAV trajectory generation using feature points matching between video image sequences
NASA Astrophysics Data System (ADS)
Byun, Younggi; Song, Jeongheon; Han, Dongyeob
2017-09-01
Unmanned aerial vehicles (UAVs), equipped with navigation systems and video capability, are currently being deployed for intelligence, reconnaissance and surveillance missions. In this paper, we present a systematic approach for the generation of a UAV trajectory using a video image matching system based on SURF (Speeded-Up Robust Features) and Preemptive RANSAC (Random Sample Consensus). Video image matching to find matching points is one of the most important steps for the accurate generation of the UAV trajectory (a sequence of poses in 3D space). We used the SURF algorithm to find the matching points between video image sequences, and removed mismatches by using Preemptive RANSAC, which divides all matching points into outliers and inliers. Only the inliers are used to determine the epipolar geometry for estimating the relative pose (rotation and translation) between image sequences. Experimental results from simulated video image sequences showed that our approach has good potential to be applied to the automatic geo-localization of UAV systems.
Clinical application of a light-pen computer system for quantitative angiography
NASA Technical Reports Server (NTRS)
Alderman, E. L.
1975-01-01
The paper describes an angiographic analysis system which uses a video disk for recording and playback, a light-pen for data input, minicomputer processing, and an electrostatic printer/plotter for hardcopy output. The method is applied to quantitative analysis of ventricular volumes, sequential ventriculography for assessment of physiologic and pharmacologic interventions, analysis of instantaneous time sequence of ventricular systolic and diastolic events, and quantitation of segmental abnormalities. The system is shown to provide the capability for computation of ventricular volumes and other measurements from operator-defined margins by greatly reducing the tedium and errors associated with manual planimetry.
2005-01-01
Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington University Department of Biology Science Outreach to create a video tour depicting the processes involved in large-scale sequencing. “Sequencing a Genome: Inside the Washington University Genome Sequencing Center” is a tour of the laboratory that follows the steps in the sequencing pipeline, interspersed with animated explanations of the scientific procedures used at the facility. Accompanying interviews with the staff illustrate different entry levels for a career in genome science. This video project serves as an example of how research and academic institutions can provide teachers and students with access and exposure to innovative technologies at the forefront of biomedical research. Initial feedback on the video from undergraduate students, high school teachers, and high school students provides suggestions for use of this video in a classroom setting to supplement present curricula. PMID:16341256
Video repairing under variable illumination using cyclic motions.
Jia, Jiaya; Tai, Yu-Wing; Wu, Tai-Pang; Tang, Chi-Keung
2006-05-01
This paper presents a complete system capable of synthesizing a large number of pixels that are missing due to occlusion or damage in an uncalibrated input video. These missing pixels may correspond to the static background or cyclic motions of the captured scene. Our system employs user-assisted video layer segmentation, while the main processing in video repair is fully automatic. The input video is first decomposed into the color and illumination videos. The necessary temporal consistency is maintained by tensor voting in the spatio-temporal domain. Missing colors and illumination of the background are synthesized by applying image repairing. Finally, the occluded motions are inferred by spatio-temporal alignment of collected samples at multiple scales. We experimented on our system with some difficult examples with variable illumination, where the capturing camera can be stationary or in motion.
Extended image differencing for change detection in UAV video mosaics
NASA Astrophysics Data System (ADS)
Saur, Günter; Krüger, Wolfgang; Schumann, Arne
2014-03-01
Change detection is one of the most important tasks when using unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. We address changes of short time scale, i.e. the observations are taken in time distances from several minutes up to a few hours. Each observation is a short video sequence acquired by the UAV in near-nadir view and the relevant changes are, e.g., recently parked or moved vehicles. In this paper we extend our previous approach of image differencing for single video frames to video mosaics. A precise image-to-image registration combined with a robust matching approach is needed to stitch the video frames to a mosaic. Additionally, this matching algorithm is applied to mosaic pairs in order to align them to a common geometry. The resulting registered video mosaic pairs are the input of the change detection procedure based on extended image differencing. A change mask is generated by an adaptive threshold applied to a linear combination of difference images of intensity and gradient magnitude. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples for non-relevant changes are stereo disparity at 3D structures of the scene, changed size of shadows, and compression or transmission artifacts. The special effects of video mosaicking such as geometric distortions and artifacts at moving objects have to be considered, too. In our experiments we analyze the influence of these effects on the change detection results by considering several scenes. The results show that for video mosaics this task is more difficult than for single video frames. Therefore, we extended the image registration by estimating an elastic transformation using a thin plate spline approach. The results for mosaics are comparable to that of single video frames and are useful for interactive image exploitation due to a larger scene coverage.
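The change-mask step described above, an adaptive threshold applied to a linear combination of intensity and gradient-magnitude difference images, can be sketched as follows. The weighting `alpha`, the mean-plus-`k`-standard-deviations threshold and the central-difference gradient are assumptions made for illustration; the abstract does not state which choices the authors used. Images are plain nested lists of grey values that are assumed to be already registered.

```python
import math

def gradient_magnitude(img):
    """Central-difference gradient magnitude; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (img[r][c + 1] - img[r][c - 1]) / 2.0
            gy = (img[r + 1][c] - img[r - 1][c]) / 2.0
            out[r][c] = math.hypot(gx, gy)
    return out

def change_mask(img_a, img_b, alpha=0.5, k=2.0):
    """Boolean mask of pixels whose combined difference exceeds mean + k*std."""
    grad_a, grad_b = gradient_magnitude(img_a), gradient_magnitude(img_b)
    h, w = len(img_a), len(img_a[0])
    score = [[alpha * abs(img_a[r][c] - img_b[r][c])
              + (1.0 - alpha) * abs(grad_a[r][c] - grad_b[r][c])
              for c in range(w)] for r in range(h)]
    flat = [v for row in score for v in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat))
    threshold = mean + k * std  # adapts to the overall difference level
    return [[v > threshold for v in row] for row in score]
```

Because the threshold is derived from the statistics of the difference image itself, a globally noisy pair raises the bar for what counts as a change, which is one simple way to suppress non-relevant differences.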
Vocabulary Learning through Viewing Video: The Effect of Two Enhancement Techniques
ERIC Educational Resources Information Center
Montero Perez, Maribel; Peters, Elke; Desmet, Piet
2018-01-01
While most studies on L2 vocabulary learning through input have addressed learners' vocabulary uptake from written text, this study focuses on audio-visual input. In particular, we investigate the effects of enhancing video by (1) adding different types of L2 subtitling (i.e. no captioning, full captioning, keyword captioning, and glossed keyword…
(abstract) Synthesis of Speaker Facial Movements to Match Selected Speech Sequences
NASA Technical Reports Server (NTRS)
Scott, Kenneth C.
1994-01-01
We are developing a system for synthesizing image sequences that simulate the facial motion of a speaker. To perform this synthesis, we are pursuing two major areas of effort. We are developing the necessary computer graphics technology to synthesize a realistic image sequence of a person speaking selected speech sequences. Next, we are developing a model that expresses the relation between spoken phonemes and face/mouth shape. A subject is videotaped speaking an arbitrary text that contains expression of the full list of desired database phonemes. The subject is videotaped from the front speaking normally, recording both audio and video detail simultaneously. Using the audio track, we identify the specific video frames on the tape relating to each spoken phoneme. From this range we digitize the video frame which represents the extreme of mouth motion/shape. Thus, we construct a database of images of face/mouth shape related to spoken phonemes. A selected audio speech sequence is recorded which is the basis for synthesizing a matching video sequence; the speaker need not be the same as used for constructing the database. The audio sequence is analyzed to determine the spoken phoneme sequence and the relative timing of the enunciation of those phonemes. Synthesizing an image sequence corresponding to the spoken phoneme sequence is accomplished using a graphics technique known as morphing. Image sequence keyframes necessary for this processing are based on the spoken phoneme sequence and timing. We have been successful in synthesizing the facial motion of a native English speaker for a small set of arbitrary speech segments. Our future work will focus on advancement of the face shape/phoneme model and independent control of facial features.
Code of Federal Regulations, 2012 CFR
2012-10-01
... transmissions, and video transmissions in the GSO Fixed-Satellite Service. 25.212 Section 25.212... Technical Standards § 25.212 Narrowband analog transmissions, digital transmissions, and video transmissions... narrowband and/or wideband digital services, including digital video services, if the maximum input spectral...
Code of Federal Regulations, 2010 CFR
2010-10-01
... transmissions, and video transmissions in the GSO Fixed-Satellite Service. 25.212 Section 25.212... Technical Standards § 25.212 Narrowband analog transmissions, digital transmissions, and video transmissions... narrowband and/or wideband digital services, including digital video services, if the maximum input spectral...
Code of Federal Regulations, 2011 CFR
2011-10-01
... transmissions, and video transmissions in the GSO Fixed-Satellite Service. 25.212 Section 25.212... Technical Standards § 25.212 Narrowband analog transmissions, digital transmissions, and video transmissions... narrowband and/or wideband digital services, including digital video services, if the maximum input spectral...
Computerized tomography using video recorded fluoroscopic images
NASA Technical Reports Server (NTRS)
Kak, A. C.; Jakowatz, C. V., Jr.; Baily, N. A.; Keller, R. A.
1975-01-01
A computerized tomographic imaging system is examined which employs video-recorded fluoroscopic images as input data. By hooking the video recorder to a digital computer through a suitable interface, such a system permits very rapid construction of tomograms.
Hwang, Min Gu; Har, Dong Hwan
2017-11-01
This study designs a method of identifying the camera model used to take videos that are distributed through mobile phones and determines the original version of the mobile phone video for use as legal evidence. For this analysis, an experiment was conducted to find the unique characteristics of each mobile phone. The videos recorded by mobile phones were analyzed to establish the delay time of sound signals, and the differences between the delay times of sound signals for different mobile phones were traced by classifying their characteristics. Furthermore, the sound input signals for mobile phone videos used as legal evidence were analyzed to ascertain whether they have the unique characteristics of the original version. The objective of this study was to find a method for validating the use of mobile phone videos as legal evidence using mobile phones through differences in the delay times of sound input signals. © 2017 American Academy of Forensic Sciences.
Evaluation of a video image detection system : final report.
DOT National Transportation Integrated Search
1994-05-01
A video image detection system (VIDS) is an advanced wide-area traffic monitoring system : that processes input from a video camera. The Autoscope VIDS coupled with an information : management system was selected as the monitoring device because test...
NASA Astrophysics Data System (ADS)
Pandremmenou, K.; Tziortziotis, N.; Paluri, S.; Zhang, W.; Blekas, K.; Kondi, L. P.; Kumar, S.
2015-03-01
We propose the use of the Least Absolute Shrinkage and Selection Operator (LASSO) regression method in order to predict the Cumulative Mean Squared Error (CMSE), incurred by the loss of individual slices in video transmission. We extract a number of quality-relevant features from the H.264/AVC video sequences, which are given as input to the LASSO. This method has the benefit of not only keeping a subset of the features that have the strongest effects on video quality, but also producing accurate CMSE predictions. In particular, we study the LASSO regression through two different architectures: the Global LASSO (G.LASSO) and the Local LASSO (L.LASSO). In G.LASSO, a single regression model is trained for all slice types together, while in L.LASSO, motivated by the fact that the values for some features are closely dependent on the considered slice type, each slice type has its own regression model, in an effort to improve LASSO's prediction capability. Based on the predicted CMSE values, we group the video slices into four priority classes. Additionally, we consider a video transmission scenario over a noisy channel, where Unequal Error Protection (UEP) is applied to all prioritized slices. The provided results demonstrate the efficiency of LASSO in estimating CMSE with high accuracy, using only a few features. files that typically contain high-entropy data, producing a footprint that is far less conspicuous than existing methods. The system uses a local web server to provide a file system, user interface and applications through a web architecture.
Moghadam, Saeed Montazeri; Seyyedsalehi, Seyyed Ali
2018-05-31
Nonlinear components extracted from deep structures of bottleneck neural networks exhibit a great ability to express input space in a low-dimensional manifold. Sharing and combining the components boosts the capability of the neural networks to synthesize and interpolate new and imaginary data. This synthesis is possibly a simple model of imagination in the human brain, where the components are expressed in a nonlinear low-dimensional manifold. The current paper introduces a novel Dynamic Deep Bottleneck Neural Network to analyze and extract three main features of videos regarding the expression of emotions on the face. These main features are identity, emotion and expression intensity, which lie in three different sub-manifolds of one nonlinear general manifold. The proposed model, which enjoys the advantages of recurrent networks, was used to analyze the sequence and dynamics of information in videos. It is noteworthy that this model also has the potential to synthesize new videos showing variations of one specific emotion on the face of unknown subjects. Experiments on the discrimination and recognition ability of the extracted components showed that the proposed model has an average of 97.77% accuracy in recognition of six prominent emotions (Fear, Surprise, Sadness, Anger, Disgust, and Happiness), and 78.17% accuracy in the recognition of intensity. The produced videos revealed variations from neutral to the apex of an emotion on the face of the unfamiliar test subject, with an average similarity of 0.8 to reference videos on the scale of the SSIM method. Copyright © 2018 Elsevier Ltd. All rights reserved.
High-definition video display based on the FPGA and THS8200
NASA Astrophysics Data System (ADS)
Qian, Jia; Sui, Xiubao
2014-11-01
This paper presents a high-definition video display solution based on an FPGA and the THS8200. The THS8200 is a video decoder chip launched by TI. It has three 10-bit DAC channels, can capture video data in both 4:2:2 and 4:4:4 formats, and its data synchronization can be taken either from the dedicated synchronization signals HSYNC and VSYNC or from the SAV/EAV codes embedded in the video stream. In this paper, we use the address and control signals generated by the FPGA to access the data-storage array, and the FPGA then generates the corresponding digital YCbCr video signals. These signals, combined with the synchronization signals HSYNC and VSYNC that are also generated by the FPGA, act as the input signals of the THS8200. To meet the bandwidth requirements of high-definition TV, we adopt video input in the 4:2:2 format over a 2×10-bit interface. The THS8200 must be controlled by the FPGA over the I2C bus to set its internal registers; as a result, it can generate a synchronization signal that satisfies the SMPTE standard and convert the digital YCbCr video signals into analog YPbPr video signals. Hence, the composite analog YPbPr output signals consist of the image data signal and the synchronization signal, which are superimposed inside the THS8200. The experimental research indicates that the method presented in this paper is a viable solution for high-definition video display that conforms to the input requirements of new high-definition display devices.
MPEG-1 low-cost encoder solution
NASA Astrophysics Data System (ADS)
Grueger, Klaus; Schirrmeister, Frank; Filor, Lutz; von Reventlow, Christian; Schneider, Ulrich; Mueller, Gerriet; Sefzik, Nicolai; Fiedrich, Sven
1995-02-01
A solution for real-time compression of digital YCRCB video data to an MPEG-1 video data stream has been developed. As an additional option, motion JPEG and video telephone streams (H.261) can be generated. For MPEG-1, up to two bidirectional predicted images are supported. The required computational power for motion estimation and DCT/IDCT, memory size and memory bandwidth have been the main challenges. The design uses fast-page-mode memory accesses and requires only one single 80 ns EDO-DRAM with 256 X 16 organization for video encoding. This can be achieved only by using adequate access and coding strategies. The architecture consists of an input processing and filter unit, a memory interface, a motion estimation unit, a motion compensation unit, a DCT unit, a quantization control, a VLC unit and a bus interface. For using the available memory bandwidth by the processing tasks, a fixed schedule for memory accesses has been applied, that can be interrupted for asynchronous events. The motion estimation unit implements a highly sophisticated hierarchical search strategy based on block matching. The DCT unit uses a separated fast-DCT flowgraph realized by a switchable hardware unit for both DCT and IDCT operation. By appropriate multiplexing, only one multiplier is required for: DCT, quantization, inverse quantization, and IDCT. The VLC unit generates the video-stream up to the video sequence layer and is directly coupled with an intelligent bus-interface. Thus, the assembly of video, audio and system data can easily be performed by the host computer. Having a relatively low complexity and only small requirements for DRAM circuits, the developed solution can be applied to low-cost encoding products for consumer electronics.
Physiologically Modulating Videogames or Simulations which Use Motion-Sensing Input Devices
NASA Technical Reports Server (NTRS)
Blanson, Nina Marie (Inventor); Stephens, Chad L. (Inventor); Pope, Alan T. (Inventor)
2017-01-01
New types of controllers allow a player to make inputs to a video game or simulation by moving the entire controller itself or by gesturing or by moving the player's body in whole or in part. This capability is typically accomplished using a wireless input device having accelerometers, gyroscopes, and a camera. The present invention exploits these wireless motion-sensing technologies to modulate the player's movement inputs to the videogame based upon physiological signals. Such biofeedback-modulated video games train valuable mental skills beyond eye-hand coordination. These psychophysiological training technologies enhance personal improvement, not just the diversion, of the user.
Assessing the performance of a motion tracking system based on optical joint transform correlation
NASA Astrophysics Data System (ADS)
Elbouz, M.; Alfalou, A.; Brosseau, C.; Ben Haj Yahia, N.; Alam, M. S.
2015-08-01
We present an optimized system specially designed for the tracking and recognition of moving subjects in a confined environment (such as an elderly person remaining at home). In the first step of our study, we use a VanderLugt correlator (VLC) with an adapted pre-processing treatment of the input plane and a postprocessing of the correlation plane via a nonlinear function allowing us to make a robust decision. The second step is based on an optical joint transform correlation (JTC)-based system (NZ-NL-correlation JTC) for achieving improved detection and tracking of moving persons in a confined space. The proposed system has been found to have significantly superior discrimination and robustness capabilities, allowing it to detect an unknown target in an input scene and to determine the target's trajectory when this target is in motion. This system offers robust tracking performance of a moving target in several scenarios, such as rotational variation of input faces. Test results obtained using various real life video sequences show that the proposed system is particularly suitable for real-time detection and tracking of moving objects.
Video quality assessment using M-SVD
NASA Astrophysics Data System (ADS)
Tao, Peining; Eskicioglu, Ahmet M.
2007-01-01
Objective video quality measurement is a challenging problem in a variety of video processing applications ranging from lossy compression to printing. An ideal video quality measure should be able to mimic the human observer. We present a new video quality measure, M-SVD, to evaluate distorted video sequences based on singular value decomposition. A computationally efficient approach is developed for full-reference (FR) video quality assessment. This measure is tested on the Video Quality Experts Group (VQEG) phase I FR-TV test data set. Our experiments show that the graphical measure displays the amount of distortion as well as the distribution of error in all frames of the video sequence, while the numerical measure has a good correlation with perceived video quality and outperforms PSNR and other objective measures by a clear margin.
Walsh-Hadamard transform kernel-based feature vector for shot boundary detection.
Lakshmi, Priya G G; Domnic, S
2014-12-01
Video shot boundary detection (SBD) is the first step of video analysis, summarization, indexing, and retrieval. In the SBD process, videos are segmented into basic units called shots. In this paper, a new SBD method is proposed using color, edge, texture, and motion strength as a vector of features (feature vector). Features are extracted by projecting the frames on selected basis vectors of the Walsh-Hadamard transform (WHT) kernel and WHT matrix. After the features are extracted, weights are calculated based on the significance of the features. The weighted features are combined to form a single continuity signal, used as input for the Procedure Based shot transition Identification process (PBI). Using the procedure, shot transitions are classified into abrupt and gradual transitions. Experimental results are examined using large-scale test sets provided by TRECVID 2007, which has evaluated hard cut and gradual transition detection. To evaluate the robustness of the proposed method, the system evaluation is performed. The proposed method yields an F1-score of 97.4% for cut, 78% for gradual, and 96.1% for overall transitions. We have also evaluated the proposed feature vector with a support vector machine classifier. The results show that WHT-based features can perform better than the other existing methods. In addition to this, a few more video sequences are taken from the Openvideo project and the performance of the proposed method is compared with the recent existing SBD method.
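To make the Walsh-Hadamard projection step more concrete, here is a minimal sketch of a fast Walsh-Hadamard transform on a one-dimensional signal. It only illustrates the transform itself; how the authors select basis vectors, weight the features, or build the continuity signal is not specified here, and the function name and sample input are assumptions for this example.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (natural Hadamard order, unnormalised)."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a


# Hypothetical row of 8 binarised pixel values.
row = [1, 0, 1, 0, 0, 1, 1, 0]
coefficients = fwht(row)
print(coefficients)  # [4, 2, 0, -2, 0, 2, 0, 2]
```

Each output coefficient is the inner product of the input with one row of the Hadamard matrix, so keeping a handful of coefficients corresponds to projecting the signal on a few selected basis vectors. Applying `fwht` twice returns the input scaled by its length.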
Intelligent bandwidth compression
NASA Astrophysics Data System (ADS)
Tseng, D. Y.; Bullock, B. L.; Olin, K. E.; Kandt, R. K.; Olsen, J. D.
1980-02-01
The feasibility of a 1000:1 bandwidth compression ratio for image transmission has been demonstrated using image-analysis algorithms and a rule-based controller. Such a high compression ratio was achieved by first analyzing scene content using auto-cueing and feature-extraction algorithms, and then transmitting only the pertinent information consistent with mission requirements. A rule-based controller directs the flow of analysis and performs priority allocations on the extracted scene content. The reconstructed bandwidth-compressed image consists of an edge map of the scene background, with primary and secondary target windows embedded in the edge map. The bandwidth-compressed images are updated at a basic rate of 1 frame per second, with the high-priority target window updated at 7.5 frames per second. The scene-analysis algorithms used in this system together with the adaptive priority controller are described. Results of simulated 1000:1 bandwidth-compressed images are presented. A video tape simulation of the Intelligent Bandwidth Compression system has been produced using a sequence of video input from the data base.
Passive IFF: Autonomous Nonintrusive Rapid Identification of Friendly Assets
NASA Technical Reports Server (NTRS)
Moynihan, Philip; Steenburg, Robert Van; Chao, Tien-Hsin
2004-01-01
A proposed optoelectronic instrument would identify targets rapidly, without need to radiate an interrogating signal, apply identifying marks to the targets, or equip the targets with transponders. The instrument was conceived as an identification, friend or foe (IFF) system in a battlefield setting, where it would be part of a targeting system for weapons, by providing rapid identification for aimed weapons to help in deciding whether and when to trigger them. The instrument could also be adapted to law-enforcement and industrial applications in which it is necessary to rapidly identify objects in view. The instrument would comprise mainly an optical correlator and a neural processor (see figure). The inherent parallel-processing speed and capability of the optical correlator would be exploited to obtain rapid identification of a set of probable targets within a scene of interest and to define regions within the scene for the neural processor to analyze. The neural processor would then concentrate on each region selected by the optical correlator in an effort to identify the target. Depending on whether or not a target was recognized by comparison of its image data with data in an internal database on which the neural processor was trained, the processor would generate an identifying signal (typically, friend or foe ). The time taken for this identification process would be less than the time needed by a human or robotic gunner to acquire a view of, and aim at, a target. An optical correlator that has been under development for several years and that has been demonstrated to be capable of tracking a cruise missile might be considered a prototype of the optical correlator in the proposed IFF instrument. This optical correlator features a 512-by-512-pixel input image frame and operates at an input frame rate of 60 Hz. 
It includes a spatial light modulator (SLM) for video-to-optical image conversion, a pair of precise lenses to effect Fourier transforms, a filter SLM for digital-to-optical correlation-filter data conversion, and a charge-coupled device (CCD) for detection of correlation peaks. In operation, the input scene grabbed by a video sensor is streamed into the input SLM. Precomputed correlation-filter data files representative of known targets are then downloaded and sequenced into the filter SLM at a rate of 1,000 Hz. When a match occurs between the input target data and one of the known-target data files, the CCD detects a correlation peak at the location of the target. Distortion-invariant correlation filters from a bank of such filters are then sequenced through the optical correlator for each input frame. The net result is the rapid preliminary recognition of one or a few targets.
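The correlation-peak idea can be mimicked digitally. The sketch below slides a small template over a scene, computes the plain cross-correlation at each offset, and reports where the peak occurs. This is a software analogy under simplifying assumptions (binary images, no normalisation, no Fourier optics), not a model of the proposed instrument; the function name and test images are hypothetical.

```python
def correlation_peak(scene, template):
    """Return ((row, col), score) of the highest cross-correlation value."""
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for r in range(len(scene) - th + 1):
        for c in range(len(scene[0]) - tw + 1):
            # Sum of element-wise products of the template and the scene window.
            score = sum(
                scene[r + i][c + j] * template[i][j]
                for i in range(th)
                for j in range(tw)
            )
            if best_score is None or score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Hypothetical 4x5 scene containing one diagonal "target".
scene = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [1, 0],
    [0, 1],
]
print(correlation_peak(scene, template))  # ((1, 2), 2)
```

Unnormalised correlation favours bright regions, so a practical digital version would normally normalise each window; that refinement is left out to keep the sketch short.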
Merkel, Daniel; Brinkmann, Eckard; Kämmer, Joerg C; Köhler, Miriam; Wiens, Daniel; Derwahl, Karl-Michael
2015-09-01
The electronic colorization of grayscale B-mode sonograms using various color schemes aims to enhance the adaptability and practicability of B-mode sonography in daylight conditions. The purpose of this study was to determine the diagnostic effectiveness and importance of colorized B-mode sonography. Fifty-three video sequences of sonographic examinations of the liver were digitized and subsequently colorized in 2 different color combinations (yellow-brown and blue-white). The set of 53 images consisted of 33 with isoechoic masses, 8 with obvious lesions of the liver (hypoechoic or hyperechoic), and 12 with inconspicuous reference images of the liver. The video sequences were combined in a random order and edited into half-hour video clips. Isoechoic liver lesions were successfully detected in 58% of the yellow-brown video sequences and in 57% of the grayscale video sequences (P = .74, not significant). Fifty percent of the isoechoic liver lesions were successfully detected in the blue-white video sequences, as opposed to a 55% detection rate in the corresponding grayscale video sequences (P= .11, not significant). In 2 subgroups, significantly more liver lesions were detected with grayscale sonography compared to blue-white sonography. Yellow-brown-colorized B-mode sonography appears to be similarly effective for detection of isoechoic parenchymal liver lesions as traditional grayscale sonography. Blue-white colorization in B-mode sonography is probably not as effective as grayscale sonography, although a statistically significant disadvantage was shown only in the subgroup of hyperechoic liver lesions. © 2015 by the American Institute of Ultrasound in Medicine.
Yao, Guangle; Lei, Tao; Zhong, Jiandan; Jiang, Ping; Jia, Wenwu
2017-01-01
Background subtraction (BS) is one of the most commonly encountered tasks in video analysis and tracking systems. It distinguishes the foreground (moving objects) from the video sequences captured by static imaging sensors. Background subtraction in remote scene infrared (IR) video is important and common to many fields. This paper provides a Remote Scene IR Dataset captured by our designed medium-wave infrared (MWIR) sensor. Each video sequence in this dataset is identified with specific BS challenges, and the pixel-wise ground truth of foreground (FG) for each frame is also provided. A series of experiments were conducted to evaluate BS algorithms on this proposed dataset. The overall performance of BS algorithms and the processor/memory requirements were compared. Proper evaluation metrics or criteria were employed to evaluate the capability of each BS algorithm to handle different kinds of BS challenges represented in this dataset. The results and conclusions in this paper provide valid references for developing new BS algorithms for remote scene IR video sequences, and some of them are not limited to remote scene or IR video sequences but are also generic for background subtraction. The Remote Scene IR dataset and the foreground masks detected by each evaluated BS algorithm are available online: https://github.com/JerryYaoGl/BSEvaluationRemoteSceneIR. PMID:28837112
Violent Interaction Detection in Video Based on Deep Learning
NASA Astrophysics Data System (ADS)
Zhou, Peipei; Ding, Qinghai; Luo, Haibo; Hou, Xinglin
2017-06-01
Violent interaction detection is of vital importance in some video surveillance scenarios like railway stations, prisons or psychiatric centres. Existing vision-based methods are mainly based on hand-crafted features such as statistical features between motion regions, leading to poor adaptability to other datasets. Enlightened by the development of convolutional networks on common activity recognition, we construct a FightNet to represent the complicated visual violence interaction. In this paper, a new input modality, the image acceleration field, is proposed to better extract the motion attributes. Firstly, each video is framed as RGB images. Secondly, the optical flow field is computed using the consecutive frames, and the acceleration field is obtained according to the optical flow field. Thirdly, the FightNet is trained with three kinds of input modalities, i.e., RGB images for spatial networks, and optical flow images and acceleration images for temporal networks. By fusing results from different inputs, we conclude whether a video depicts a violent event or not. To provide researchers a common ground for comparison, we have collected a violent interaction dataset (VID), containing 2314 videos with 1077 fight ones and 1237 no-fight ones. By comparison with other algorithms, experimental results demonstrate that the proposed model for violent interaction detection shows higher accuracy and better robustness.
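One simple way to picture an acceleration field is as the per-pixel change between two consecutive optical flow fields. The sketch below assumes exactly that definition, which may differ from the formula the authors use; the flow fields are hypothetical hand-written grids of `(dx, dy)` tuples rather than the output of a real optical flow estimator.

```python
def acceleration_field(flow_prev, flow_next):
    """Per-pixel difference of two flow fields, each a grid of (dx, dy) tuples."""
    return [
        [(nx - px, ny - py) for (px, py), (nx, ny) in zip(row_prev, row_next)]
        for row_prev, row_next in zip(flow_prev, flow_next)
    ]


# Hypothetical 1x2 flow fields for frames t and t+1.
flow_t = [[(1.0, 0.0), (0.0, 0.0)]]
flow_t1 = [[(3.0, 1.0), (0.0, 0.0)]]
print(acceleration_field(flow_t, flow_t1))  # [[(2.0, 1.0), (0.0, 0.0)]]
```

Pixels whose motion is steady produce a zero acceleration vector, so only abrupt changes in motion stand out in this representation.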
Constructing storyboards based on hierarchical clustering analysis
NASA Astrophysics Data System (ADS)
Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu
2005-07-01
There is a growing need for quick previews of video content, both to improve the accessibility of video archives and to reduce network traffic. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoiding repetition of computationally intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
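The storyboard idea can be sketched as agglomerative clustering over per-frame feature vectors, followed by picking one representative frame per cluster. The version below assumes centroid linkage and squared Euclidean distance, and uses made-up one-dimensional features in place of wavelet coefficients; none of these choices is confirmed by the abstract, and the function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def storyboard_keyframes(features, k):
    """Merge the two closest clusters until k remain, then return the index
    of the frame nearest each cluster centroid, in temporal order."""
    clusters = [[i] for i in range(len(features))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                ca = centroid([features[i] for i in clusters[a]])
                cb = centroid([features[i] for i in clusters[b]])
                d = squared_distance(ca, cb)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    keyframes = []
    for members in clusters:
        c = centroid([features[i] for i in members])
        keyframes.append(
            min(members, key=lambda i: squared_distance(features[i], c))
        )
    return sorted(keyframes)


# Hypothetical features for 7 frames forming three visually distinct groups.
frame_features = [[0.0], [1.0], [2.0], [50.0], [51.0], [53.0], [90.0]]
print(storyboard_keyframes(frame_features, 3))  # [1, 4, 6]
```

Because the features are computed once and reused for any requested `k`, the same extracted vectors can serve storyboards of different lengths without re-parsing the video, which is the reuse the abstract emphasises.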
Detection and tracking of gas plumes in LWIR hyperspectral video sequence data
NASA Astrophysics Data System (ADS)
Gerhart, Torin; Sunu, Justin; Lieu, Lauren; Merkurjev, Ekaterina; Chang, Jen-Mei; Gilles, Jérôme; Bertozzi, Andrea L.
2013-05-01
Automated detection of chemical plumes presents a segmentation challenge. The segmentation problem for gas plumes is difficult due to the diffusive nature of the cloud. The advantage of considering hyperspectral images in the gas plume detection problem over the conventional RGB imagery is the presence of non-visual data, allowing for a richer representation of information. In this paper we present an effective method of visualizing hyperspectral video sequences containing chemical plumes and investigate the effectiveness of segmentation techniques on these post-processed videos. Our approach uses a combination of dimension reduction and histogram equalization to prepare the hyperspectral videos for segmentation. First, Principal Components Analysis (PCA) is used to reduce the dimension of the entire video sequence. This is done by projecting each pixel onto the first few Principal Components resulting in a type of spectral filter. Next, a Midway method for histogram equalization is used. These methods redistribute the intensity values in order to reduce flicker between frames. This properly prepares these high-dimensional video sequences for more traditional segmentation techniques. We compare the ability of various clustering techniques to properly segment the chemical plume. These include K-means, spectral clustering, and the Ginzburg-Landau functional.
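Of the clustering techniques listed, K-means is the simplest to sketch. The example below runs Lloyd's algorithm on a handful of two-dimensional points that stand in for pixels after projection onto two principal components. The data, the initial centers, and the function name are assumptions for illustration, and the PCA and Midway equalization stages are not reproduced.

```python
def kmeans(points, initial_centers, n_iter=20):
    """Lloyd's algorithm with caller-supplied initial centers."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(
                range(len(centers)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centers[c])),
            )
            for point in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical projected pixels: three background-like, two plume-like.
pixels = [[0.1, 0.0], [0.2, 0.1], [0.0, 0.2], [2.0, 2.1], [2.2, 1.9]]
labels, centers = kmeans(pixels, initial_centers=[[0.0, 0.0], [2.0, 2.0]])
print(labels)  # [0, 0, 0, 1, 1]
```

Reshaping the returned labels back to the frame's height and width would give a segmentation mask; with real data the result depends on the initial centers, which is one reason the abstract also compares other clustering methods.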
Electronic data generation and display system
NASA Technical Reports Server (NTRS)
Wetekamm, Jules
1988-01-01
The Electronic Data Generation and Display System (EDGADS) is a field-tested paperless technical manual system. The authoring system provides subject matter experts the option of developing procedureware from digital or hardcopy inputs of technical information from text, graphics, pictures, and recorded media (video, audio, etc.). The display system provides multi-window presentations of graphics, pictures, animations, and action sequences with text and audio overlays on high resolution color CRT and monochrome portable displays. The database management system allows direct access via hierarchical menus, keyword name, ID number, voice command or touch of a screen pictorial of the item (ICON). It contains operations and maintenance technical information at three levels of intelligence for a total system.
Content-based TV sports video retrieval using multimodal analysis
NASA Astrophysics Data System (ADS)
Yu, Yiqing; Liu, Huayong; Wang, Hongbin; Zhou, Dongru
2003-09-01
In this paper, we propose content-based video retrieval, which is a kind of retrieval based on semantic content. Because video data is composed of multimodal information streams such as video, auditory and textual streams, we describe a strategy of using multimodal analysis for automatically parsing sports video. The paper first defines the basic structure of a sports video database system, and then introduces a new approach that integrates visual stream analysis, speech recognition, speech signal processing and text extraction to realize video retrieval. The experimental results for TV sports video of football games indicate that the multimodal analysis is effective for video retrieval by quickly browsing tree-like video clips or inputting keywords within a predefined domain.
Learning Sociolinguistically Appropriate Language through the Video Drama "Connect with English"
ERIC Educational Resources Information Center
Hwang, Caroline C.
2005-01-01
Video provides (1) simultaneous audio/visual input, and (2) complete and contextualized conversations, and thus proves to be a rich vehicle in foreign language instruction. The video drama "Connect with English" (a.k.a. "Rebecca's Dream"), created to promote English language learning, is particularly outstanding in that it contains a captivating…
"Espanol para ti": A Video Program That Works.
ERIC Educational Resources Information Center
Steele, Elena; Johnson, Holly
2000-01-01
Describes the development of "Espanol para ti," a video program for teaching Spanish at the elementary school level. The program was designed for use in Clark County, Nevada elementary schools and is taught by a certified Spanish teacher via video twice a week, utilizing comprehensible input through visuals, games, and songs that are conducive to…
Woo, Kevin L; Rieucau, Guillaume
2008-07-01
The increasing use of the video playback technique in behavioural ecology reveals a growing need to ensure better control of the visual stimuli that focal animals experience. Technological advances now allow researchers to develop computer-generated animations instead of using video sequences of live-acting demonstrators. However, care must be taken to match the motion characteristics (speed and velocity) of the animation to the original video source. Here, we presented a tool based on the use of an optic flow analysis program to measure the resemblance of motion characteristics of computer-generated animations compared to videos of live-acting animals. We examined three distinct displays (tail-flick (TF), push-up body rock (PUBR), and slow arm wave (SAW)) exhibited by animations of Jacky dragons (Amphibolurus muricatus) that were compared to the original video sequences of live lizards. We found no significant differences between the motion characteristics of videos and animations across all three displays. Our results showed that our animations are similar in the speed and velocity features of each display. Researchers need to ensure that similar motion characteristics in animation and video stimuli are represented, and this feature is a critical component in the future success of the video playback technique.
Motion video compression system with neural network having winner-take-all function
NASA Technical Reports Server (NTRS)
Fang, Wai-Chi (Inventor); Sheu, Bing J. (Inventor)
1997-01-01
A motion video data system includes a compression system, including an image compressor, an image decompressor correlative to the image compressor having an input connected to an output of the image compressor, a feedback summing node having one input connected to an output of the image decompressor, a picture memory having an input connected to an output of the feedback summing node, apparatus for comparing an image stored in the picture memory with a received input image and deducing therefrom pixels having differences between the stored image and the received image and for retrieving from the picture memory a partial image including the pixels only and applying the partial image to another input of the feedback summing node, whereby to produce at the output of the feedback summing node an updated decompressed image, a subtraction node having one input connected to receive the received image and another input connected to receive the partial image so as to generate a difference image, the image compressor having an input connected to receive the difference image whereby to produce a compressed difference image at the output of the image compressor.
Simulation and Real-Time Verification of Video Algorithms on the TI C6400 Using Simulink
2004-08-20
Distribution/availability statement: Approved for public release...plot estimates over time (scrolling data); adjust detection threshold (click mouse on graph); monitor video capture; input video frames; captured frames ...Video App: Surveillance Recording; SL for video; explanation of GUI; target options; build process; M-code snippet
Identifying swimmers as water-polo or swim team-mates from visual displays of less than one second.
Steel, Kylie A; Adams, Roger D; Canning, Colleen G
2007-09-01
Opportunities for ball passing in water-polo may be brief and the decision to pass only informed by minimal visual input. Since researchers using point light displays have shown that the walking or running gait of familiars can be identified, water-polo players may have the ability to recognize team-mates from their swimming gait. To test this hypothesis, members of a water-polo team and a competition swim team viewed two randomized sets of video clips, each less than one second long, of swimmers from both teams sprinting freestyle past a fixed camera. The arm stroke clip sequence showed only the upper body, and the kick sequence showed only the lower body. After viewing each video clip, observers rated their level of certainty as to whether the swimmer presented was a team-mate or not. Discrimination was significantly above chance in both groups. Water-polo players were better able to identify team-mates from their kick, whereas swimmers were better able to do so by viewing arm stroke. Our results suggest that, as with walking and running gait, small amounts of visual information about swimmers can be used for recognition, and so raise the possibility that specific training may be able to improve team-mate classification in water-polo, particularly in newly formed teams.
Performance evaluation of the intra compression in the video coding standards
NASA Astrophysics Data System (ADS)
Abramowski, Andrzej
2015-09-01
The article presents a comparison of the Intra prediction algorithms in the current state-of-the-art video coding standards, including MJPEG 2000, VP8, VP9, H.264/AVC and H.265/HEVC. The effectiveness of techniques employed by each standard is evaluated in terms of compression efficiency and average encoding time. The compression efficiency is measured using BD-PSNR and BD-RATE metrics with H.265/HEVC results as an anchor. Tests are performed on a set of video sequences, composed of sequences gathered by Joint Collaborative Team on Video Coding during the development of the H.265/HEVC standard and 4K sequences provided by Ultra Video Group. According to results, H.265/HEVC provides significant bit-rate savings at the expense of computational complexity, while VP9 may be regarded as a compromise between the efficiency and required encoding time.
Video-assisted segmentation of speech and audio track
NASA Astrophysics Data System (ADS)
Pandit, Medha; Yusoff, Yusseri; Kittler, Josef; Christmas, William J.; Chilton, E. H. S.
1999-08-01
Video database research is commonly concerned with the storage and retrieval of visual information involving sequence segmentation, shot representation and video clip retrieval. In multimedia applications, video sequences are usually accompanied by a sound track. The sound track contains potential cues to aid shot segmentation such as different speakers, background music, singing and distinctive sounds. These different acoustic categories can be modeled to allow for effective database retrieval. In this paper, we address the problem of automatic segmentation of the audio track of multimedia material. This audio-based segmentation can be combined with video scene shot detection in order to achieve partitioning of the multimedia material into semantically significant segments.
A system for endobronchial video analysis
NASA Astrophysics Data System (ADS)
Byrnes, Patrick D.; Higgins, William E.
2017-03-01
Image-guided bronchoscopy is a critical component in the treatment of lung cancer and other pulmonary disorders. During bronchoscopy, a high-resolution endobronchial video stream facilitates guidance through the lungs and allows for visual inspection of a patient's airway mucosal surfaces. Despite the detailed information it contains, little effort has been made to incorporate recorded video into the clinical workflow. Follow-up procedures often required in cancer assessment or asthma treatment could significantly benefit from effectively parsed and summarized video. Tracking diagnostic regions of interest (ROIs) could potentially better equip physicians to detect early airway-wall cancer or improve asthma treatments, such as bronchial thermoplasty. To address this need, we have developed a system for the postoperative analysis of recorded endobronchial video. The system first parses an input video stream into endoscopic shots, derives motion information, and selects salient representative key frames. Next, a semi-automatic method for CT-video registration creates data linkages between a CT-derived airway-tree model and the input video. These data linkages then enable the construction of a CT-video chest model comprised of a bronchoscopy path history (BPH) - defining all airway locations visited during a procedure - and texture-mapping information for rendering registered video frames onto the airway-tree model. A suite of analysis tools is included to visualize and manipulate the extracted data. Video browsing and retrieval is facilitated through a video table of contents (TOC) and a search query interface. The system provides a variety of operational modes and additional functionality, including the ability to define regions of interest. We demonstrate the potential of our system using two human case study examples.
Video enhancement workbench: an operational real-time video image processing system
NASA Astrophysics Data System (ADS)
Yool, Stephen R.; Van Vactor, David L.; Smedley, Kirk G.
1993-01-01
Video image sequences can be exploited in real-time, giving analysts rapid access to information for military or criminal investigations. Video-rate dynamic range adjustment subdues fluctuations in image intensity, thereby assisting discrimination of small or low-contrast objects. Contrast-regulated unsharp masking enhances differentially shadowed or otherwise low-contrast image regions. Real-time removal of localized hotspots, when combined with automatic histogram equalization, may enhance resolution of objects directly adjacent. In video imagery corrupted by zero-mean noise, real-time frame averaging can assist resolution and location of small or low-contrast objects. To maximize analyst efficiency, lengthy video sequences can be screened automatically for low-frequency, high-magnitude events. Combined zoom, roam, and automatic dynamic range adjustment permit rapid analysis of facial features captured by video cameras recording crimes in progress. When trying to resolve small objects in murky seawater, stereo video places the moving imagery in an optimal setting for human interpretation.
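Frame averaging works because zero-mean noise tends to cancel when several frames of a static scene are averaged, while the underlying signal does not. The sketch below computes a pixel-wise mean over a short stack of frames; the tiny frames and the function name are assumptions for illustration, and a real-time system would more likely use a running or recursive average than store every frame.

```python
def average_frames(frames):
    """Pixel-wise mean of equally sized frames (each a list of rows)."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(frame[r][c] for frame in frames) / n for c in range(width)]
        for r in range(height)
    ]


# Hypothetical 1x2 frames: a constant scene of value 10 plus zero-mean noise.
noisy = [
    [[11, 9]],
    [[9, 11]],
    [[10, 10]],
]
print(average_frames(noisy))  # [[10.0, 10.0]]
```

For independent zero-mean noise, averaging N frames reduces the noise standard deviation by a factor of about the square root of N, at the cost of blurring anything that moves during the averaging window.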
Popular video for rural development in Peru.
Calvelo Rios, J M
1989-01-01
Peru developed its first use of video for training and education in rural areas over a decade ago. On completion of the project in 1986, over 400,000 peasants had attended video courses lasting from 5-20 days. The courses included rural health, family planning, reforestation, agriculture, animal husbandry, housing, nutrition, and water sanitation. There were 125 course packages made and 1,260 video programs from 10-18 minutes in length. There were 780 additional video programs created on human resource development, socioeconomic diagnostics and culture. 160 specialists were trained to produce audiovisual materials and run the programs. Also, 70 trainers from other countries were trained. The results showed many used the training in practical applications. To promote rural development, two things are needed: capital and physical inputs, such as equipment, fertilizers, pesticides, etc. The video project provided peasants an additional input that would help them manage the financial and physical inputs more efficiently. Video was used because many farmers are illiterate or speak a language different from the official one. Printed guides that contained many illustrations and few words served as memory aids, and group discussions reinforced practical learning. By seeing, hearing, and doing, the training was effective. Women made up 46% of participants, which made fertility and family planning subjects easier to communicate. The production of teaching modules included field investigations, academic research, field recording, tape editing, and experimental application in the field. An agreement with the peasants was initiated before a course began to help ensure full participation and to make sure resources were available to use the knowledge gained. The courses were limited to 30 and the cost per participant was $34 per course.
Semantic-based crossmodal processing during visual suppression.
Cox, Dustin; Hong, Sang Wook
2015-01-01
To reveal the mechanisms underpinning the influence of auditory input on visual awareness, we examine, (1) whether purely semantic-based multisensory integration facilitates the access to visual awareness for familiar visual events, and (2) whether crossmodal semantic priming is the mechanism responsible for the semantic auditory influence on visual awareness. Using continuous flash suppression, we rendered dynamic and familiar visual events (e.g., a video clip of an approaching train) inaccessible to visual awareness. We manipulated the semantic auditory context of the videos by concurrently pairing them with a semantically matching soundtrack (congruent audiovisual condition), a semantically non-matching soundtrack (incongruent audiovisual condition), or with no soundtrack (neutral video-only condition). We found that participants identified the suppressed visual events significantly faster (an earlier breakup of suppression) in the congruent audiovisual condition compared to the incongruent audiovisual condition and video-only condition. However, this facilitatory influence of semantic auditory input was only observed when audiovisual stimulation co-occurred. Our results suggest that the enhanced visual processing with a semantically congruent auditory input occurs due to audiovisual crossmodal processing rather than semantic priming, which may occur even when visual information is not available to visual awareness.
Co-Located Collaborative Learning Video Game with Single Display Groupware
ERIC Educational Resources Information Center
Infante, Cristian; Weitz, Juan; Reyes, Tomas; Nussbaum, Miguel; Gomez, Florencia; Radovic, Darinka
2010-01-01
Role Game is a co-located CSCL video game played by three students sitting at one machine sharing a single screen, each with their own input device. Inspired by video console games, Role Game enables students to learn by doing, acquiring social abilities and mastering subject matter in a context of co-located collaboration. After describing the…
The Effect of Interactivity with a Music Video Game on Second Language Vocabulary Recall
ERIC Educational Resources Information Center
deHaan, Jonathan; Reed, W. Michael; Kuwada, Katsuko
2010-01-01
Video games are potential sources of second language input; however, the medium's fundamental characteristic, interactivity, has not been thoroughly examined in terms of its effect on learning outcomes. This experimental study investigated to what degree, if at all, video game interactivity would help or hinder the noticing and recall of second…
ERIC Educational Resources Information Center
Hayes, John; Pulliam, Robert
A video performance monitoring system was developed by the URS/Matrix Company, under contract to the USAF Human Resources Laboratory and was evaluated experimentally in three technical training settings. Using input from 1 to 8 video cameras, the system provided a flexible combination of signal processing, direct monitor, recording and replay…
Standardized access, display, and retrieval of medical video
NASA Astrophysics Data System (ADS)
Bellaire, Gunter; Steines, Daniel; Graschew, Georgi; Thiel, Andreas; Bernarding, Johannes; Tolxdorff, Thomas; Schlag, Peter M.
1999-05-01
The system presented here enhances documentation and data-secured, second-opinion facilities by integrating video sequences into DICOM 3.0. We present an implementation for a medical video server extended by a DICOM interface. Security mechanisms conforming with DICOM are integrated to enable secure internet access. Digital video documents of diagnostic and therapeutic procedures should be examined regarding the clip length and size necessary for second opinion and manageable with today's hardware. Image sources relevant for this paper include 3D laparoscope, 3D surgical microscope, 3D open surgery camera, synthetic video, and monoscopic endoscopes. The global DICOM video concept and three special workplaces of distinct applications are described. Additionally, an approach is presented to analyze the motion of the endoscopic camera for future automatic video-cutting. Digital stereoscopic video sequences (DSVS) are especially in demand for surgery. Therefore, DSVS are also integrated into the DICOM video concept. Results are presented describing the suitability of stereoscopic display techniques for the operating room.
Human Motion Capture Data Tailored Transform Coding.
Junhui Hou; Lap-Pui Chau; Magnenat-Thalmann, Nadia; Ying He
2015-07-01
Human motion capture (mocap) is a widely used technique for digitalizing human movements. With growing usage, compressing mocap data has received increasing attention, since compact data size enables efficient storage and transmission. Our analysis shows that mocap data have some unique characteristics that distinguish themselves from images and videos. Therefore, directly borrowing image or video compression techniques, such as discrete cosine transform, does not work well. In this paper, we propose a novel mocap-tailored transform coding algorithm that takes advantage of these features. Our algorithm segments the input mocap sequences into clips, which are represented in 2D matrices. Then it computes a set of data-dependent orthogonal bases to transform the matrices to frequency domain, in which the transform coefficients have significantly less dependency. Finally, the compression is obtained by entropy coding of the quantized coefficients and the bases. Our method has low computational cost and can be easily extended to compress mocap databases. It also requires neither training nor complicated parameter setting. Experimental results demonstrate that the proposed scheme significantly outperforms state-of-the-art algorithms in terms of compression performance and speed.
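The pipeline in this abstract (clip as a 2D matrix, data-dependent orthogonal bases, quantized coefficients) can be sketched as follows. The abstract does not say how the bases are computed, so this sketch assumes a plain SVD of the mean-removed clip as a stand-in; the entropy-coding stage is omitted, and all names and parameters are hypothetical.

```python
import numpy as np


def encode_clip(clip, n_bases, step):
    """Transform-code one clip given as a (frames x degrees-of-freedom) matrix.

    Assumption: the orthogonal bases come from an SVD of the centred clip,
    which is one common data-dependent choice, not necessarily the paper's.
    """
    mean = clip.mean(axis=0)
    centered = clip - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    bases = vt[:n_bases]                      # orthonormal rows
    coeffs = centered @ bases.T               # decorrelated coefficients
    quantized = np.round(coeffs / step).astype(np.int64)
    return mean, bases, quantized


def decode_clip(mean, bases, quantized, step):
    """Invert the quantization and the orthogonal transform."""
    return (quantized * step) @ bases + mean


def make_clip(n_frames=60, n_dof=12, seed=0):
    """Synthetic low-rank 'motion': three latent signals mixed into 12 channels."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 2.0 * np.pi, n_frames)
    latent = np.stack([np.sin(t), np.cos(2.0 * t), t], axis=1)
    return latent @ rng.normal(size=(3, n_dof))


if __name__ == "__main__":
    clip = make_clip()
    mean, bases, q = encode_clip(clip, n_bases=3, step=0.01)
    rec = decode_clip(mean, bases, q, step=0.01)
    print("max abs reconstruction error:", np.abs(rec - clip).max())
```

In a real codec the integer array `quantized` (and the bases) would then be passed to an entropy coder; the quantization step controls the rate-distortion trade-off.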
Algorithm for Video Summarization of Bronchoscopy Procedures
2011-01-01
Background The duration of bronchoscopy examinations varies considerably depending on the diagnostic and therapeutic procedures used. It can last more than 20 minutes if a complex diagnostic work-up is included. With wide access to videobronchoscopy, the whole procedure can be recorded as a video sequence. Common practice relies on an active attitude of the bronchoscopist who initiates the recording process and usually chooses to archive only selected views and sequences. However, it may be important to record the full bronchoscopy procedure as documentation when liability issues are at stake. Furthermore, an automatic recording of the whole procedure enables the bronchoscopist to focus solely on the performed procedures. Video recordings registered during bronchoscopies include a considerable number of frames of poor quality due to blurry or unfocused images. It seems that such frames are unavoidable due to the relatively tight endobronchial space, rapid movements of the respiratory tract due to breathing or coughing, and secretions which occur commonly in the bronchi, especially in patients suffering from pulmonary disorders. Methods The use of recorded bronchoscopy video sequences for diagnostic, reference and educational purposes could be considerably extended with efficient, flexible summarization algorithms. Thus, the authors developed a prototype system to create shortcuts (called summaries or abstracts) of bronchoscopy video recordings. Such a system, based on models described in previously published papers, employs image analysis methods to exclude frames or sequences of limited diagnostic or educational value. Results The algorithm for the selection or exclusion of specific frames or shots from video sequences recorded during bronchoscopy procedures is based on several criteria, including automatic detection of "non-informative" frames, frames showing the branching of the airways, and frames including pathological lesions.
Conclusions The paper focuses on the challenge of generating summaries of bronchoscopy video recordings. PMID:22185344
Motion video analysis using planar parallax
NASA Astrophysics Data System (ADS)
Sawhney, Harpreet S.
1994-04-01
Motion and structure analysis in video sequences can lead to efficient descriptions of objects and their motions. Interesting events in videos can be detected using such an analysis--for instance independent object motion when the camera itself is moving, figure-ground segregation based on the saliency of a structure compared to its surroundings. In this paper we present a method for 3D motion and structure analysis that uses a planar surface in the environment as a reference coordinate system to describe a video sequence. The motion in the video sequence is described as the motion of the reference plane, and the parallax motion of all the non-planar components of the scene. It is shown how this method simplifies the otherwise hard general 3D motion analysis problem. In addition, a natural coordinate system in the environment is used to describe the scene which can simplify motion based segmentation. This work is a part of an ongoing effort in our group towards video annotation and analysis for indexing and retrieval. Results from a demonstration system being developed are presented.
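The plane-plus-parallax decomposition described here can be illustrated with a small geometric sketch. It assumes calibrated cameras with identity intrinsics and a known reference plane n·X = d in the first camera's frame; the homography formula H = R + t nᵀ/d is the standard plane-induced homography, while the function names and test geometry are assumptions made for illustration only.

```python
import numpy as np


def project(points_h):
    """Convert N x 3 homogeneous image points to N x 2 pixel coordinates."""
    return points_h[:, :2] / points_h[:, 2:3]


def plane_homography(R, t, n, d):
    """Homography induced by the plane n.X = d for camera motion X -> R X + t."""
    return R + np.outer(t, n) / d


def parallax_residual(H, pts1, pts2):
    """Displacement left over after compensating the reference-plane motion.

    Points on the reference plane give (near) zero residual; points off
    the plane keep a residual parallax vector.
    """
    ones = np.ones((len(pts1), 1))
    warped = project(np.hstack([pts1, ones]) @ H.T)
    return pts2 - warped


if __name__ == "__main__":
    R = np.eye(3)
    t = np.array([0.2, 0.0, 0.0])          # sideways camera translation
    n, d = np.array([0.0, 0.0, 1.0]), 5.0  # reference plane Z = 5
    X = np.array([[1.0, 1.0, 5.0],         # on the plane
                  [-1.0, 0.5, 5.0],        # on the plane
                  [0.0, 0.0, 3.0]])        # off the plane (closer to camera)
    pts1 = project(X)
    pts2 = project(X @ R.T + t)
    H = plane_homography(R, t, n, d)
    print(parallax_residual(H, pts1, pts2))
```

Thresholding the residual magnitude is one simple way such a decomposition could feed motion-based segmentation, since only off-plane or independently moving points survive the plane compensation.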
NASA Astrophysics Data System (ADS)
Sano, Kimikazu; Nagatani, Munehiko; Mutoh, Miwa; Murata, Koichi
This paper is a report on a high ESD breakdown-voltage InP HBT transimpedance amplifier IC for optical video distribution systems. To make ESD breakdown-voltage higher, we designed ESD protection circuits integrated in the TIA IC using base-collector/base-emitter diodes of InP HBTs and resistors. These components for ESD protection circuits have already existed in the employed InP HBT IC process, so no process modifications were needed. Furthermore, to meet requirements for use in optical video distribution systems, we studied circuit design techniques to obtain a good input-output linearity and a low-noise characteristic. Fabricated InP HBT TIA IC exhibited high human-body-model ESD breakdown voltages (±1000V for power supply terminals, ±200V for high-speed input/output terminals), good input-output linearity (less than 2.9-% duty-cycle-distortion), and low noise characteristic (10.7pA/√Hz averaged input-referred noise current density) with a -3-dB-down higher frequency of 6.9GHz. To the best of our knowledge, this paper is the first literature describing InP ICs with high ESD-breakdown voltages.
Video and LAN solutions for a digital OR: the Varese experience
NASA Astrophysics Data System (ADS)
Nocco, Umberto; Cocozza, Eugenio; Sivo, Monica; Peta, Giancarlo
2007-03-01
Purpose: build 20 ORs equipped with independent video acquisition and broadcasting systems and powerful LAN connectivity. Methods: a digital PC-controlled video matrix has been installed in each OR. The LAN connectivity has been developed to allow data to enter the OR and to provide high-speed connectivity to a server and to broadcasting devices. Video signals are broadcast within the OR. Fixed inputs and five additional video inputs have been placed in the OR. Images can be stored locally on a high-capacity HDD and a DVD recorder. Images can also be stored in a central archive for future acquisition and reference. Ethernet plugs have been placed within the OR to acquire images and data from the Hospital LAN; the OR is connected to the server/archive using a dedicated optical fiber. Results: 20 independent digital ORs have been built. Each OR is "self contained" and images can be digitally managed and broadcast. Security issues concerning both image visualization and electrical safety have been addressed and each OR is fully integrated in the Hospital LAN. Conclusions: Digital ORs were fully implemented; they fulfill surgeons' needs in terms of video acquisition and distribution and grant high quality video for each kind of surgery in a major hospital.
NASA Astrophysics Data System (ADS)
Liu, Yu; Lin, Xiaocheng; Fan, Nianfei; Zhang, Lin
2016-01-01
Wireless video multicast has become one of the key technologies in wireless applications. But the main challenge of conventional wireless video multicast, i.e., the cliff effect, remains unsolved. To overcome the cliff effect, a hybrid digital-analog (HDA) video transmission framework based on SoftCast, which transmits the digital bitstream with the quantization residuals, is proposed. With an effective power allocation algorithm and appropriate parameter settings, the residual gains can be maximized; meanwhile, the digital bitstream can assure transmission of a basic video to the multicast receiver group. In the multiple-input multiple-output (MIMO) system, since nonuniform noise interference on different antennas can be regarded as the cliff effect problem, ParCast, which is a variation of SoftCast, is also applied to video transmission to solve it. The HDA scheme with corresponding power allocation algorithms is also applied to improve video performance. Simulations show that the proposed HDA scheme can overcome the cliff effect completely with the transmission of residuals. What is more, it outperforms the compared WSVC scheme by more than 2 dB when transmitting under the same bandwidth, and it can further improve performance by nearly 8 dB in MIMO when compared with the ParCast scheme.
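The core idea of sending a digital base layer together with its quantization residual can be shown with a toy sketch. This is not the paper's SoftCast or ParCast pipeline and does not reproduce its power allocation: it assumes the digital layer is decoded without error, the residual is sent as an analog signal over an additive white Gaussian noise channel, and the receiver applies a linear MMSE gain. All names and parameters are hypothetical.

```python
import numpy as np


def hda_transmit(signal, step, snr_db, rng):
    """Toy hybrid digital-analog link (illustrative assumptions only).

    Digital layer: uniform quantization with step `step`, assumed to be
    received error-free. Analog layer: the quantization residual, sent
    over an AWGN channel at the given SNR and rescaled at the receiver.
    Returns (base_layer, hybrid_reconstruction).
    """
    base = np.round(signal / step) * step
    residual = signal - base
    power = float(np.mean(residual ** 2))
    if power == 0.0:                       # nothing left for the analog layer
        return base, base.copy()
    noise_power = power / (10.0 ** (snr_db / 10.0))
    received = residual + rng.normal(0.0, np.sqrt(noise_power), residual.shape)
    gain = power / (power + noise_power)   # linear MMSE estimate of the residual
    return base, base + gain * received


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pixels = rng.uniform(0.0, 255.0, 10000)
    for snr in (0.0, 10.0, 20.0):
        base, rec = hda_transmit(pixels, step=16.0, snr_db=snr, rng=rng)
        print(snr, "dB  base MSE:", np.mean((base - pixels) ** 2),
              " hybrid MSE:", np.mean((rec - pixels) ** 2))
```

The point of the sketch is the graceful behaviour: reconstruction quality improves smoothly with channel SNR instead of collapsing at a threshold, while the base layer guarantees a minimum quality.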
NASA Astrophysics Data System (ADS)
Kypraios, Ioannis; Young, Rupert C. D.; Chatwin, Chris R.
2009-08-01
Motivated by the non-linear interpolation and generalization abilities of the hybrid optical neural network filter between the reference and non-reference images of the true-class object, we designed the modified-hybrid optical neural network filter. We applied an optical mask to the hybrid optical neural network's filter input. The mask was built with the constant weight connections of a randomly chosen image included in the training set. The resulting design of the modified-hybrid optical neural network filter is optimized for performing best in cluttered scenes of the true-class object. Due to the shift invariance properties inherited by its correlator unit, the filter can accommodate multiple objects of the same class to be detected within an input cluttered image. Additionally, the architecture of the neural network unit of the general hybrid optical neural network filter allows the recognition of multiple objects of different classes within the input cluttered image by modifying the output layer of the unit. We test the modified-hybrid optical neural network filter for recognition of multiple objects of the same and of different classes within cluttered input images and video sequences of cluttered scenes. The filter is shown to exhibit, with a single pass over the input data, simultaneous out-of-plane rotation invariance, shift invariance and good clutter tolerance. It is able to successfully detect and correctly classify the true-class objects within background clutter for which there has been no previous training.
Deep linear autoencoder and patch clustering-based unified one-dimensional coding of image and video
NASA Astrophysics Data System (ADS)
Li, Honggui
2017-09-01
This paper proposes a unified one-dimensional (1-D) coding framework of image and video, which depends on deep learning neural network and image patch clustering. First, an improved K-means clustering algorithm for image patches is employed to obtain the compact inputs of deep artificial neural network. Second, for the purpose of best reconstructing original image patches, deep linear autoencoder (DLA), a linear version of the classical deep nonlinear autoencoder, is introduced to achieve the 1-D representation of image blocks. Under the circumstances of 1-D representation, DLA is capable of attaining zero reconstruction error, which is impossible for the classical nonlinear dimensionality reduction methods. Third, a unified 1-D coding infrastructure for image, intraframe, interframe, multiview video, three-dimensional (3-D) video, and multiview 3-D video is built by incorporating different categories of videos into the inputs of patch clustering algorithm. Finally, it is shown in the results of simulation experiments that the proposed methods can simultaneously gain higher compression ratio and peak signal-to-noise ratio than those of the state-of-the-art methods in the situation of low bitrate transmission.
Synthesis of Speaker Facial Movement to Match Selected Speech Sequences
NASA Technical Reports Server (NTRS)
Scott, K. C.; Kagels, D. S.; Watson, S. H.; Rom, H.; Wright, J. R.; Lee, M.; Hussey, K. J.
1994-01-01
A system is described which allows for the synthesis of a video sequence of a realistic-appearing talking human head. A phonic based approach is used to describe facial motion; image processing rather than physical modeling techniques are used to create video frames.
The MaizeGDB Genome Browser tutorial: one example of database outreach to biologists via video.
Harper, Lisa C; Schaeffer, Mary L; Thistle, Jordan; Gardiner, Jack M; Andorf, Carson M; Campbell, Darwin A; Cannon, Ethalinda K S; Braun, Bremen L; Birkett, Scott M; Lawrence, Carolyn J; Sen, Taner Z
2011-01-01
Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At MaizeGDB, we have developed a number of video tutorials that demonstrate how to use various tools and explicitly outline the caveats researchers should know to interpret the information available to them. One such popular video currently available is 'Using the MaizeGDB Genome Browser', which describes how the maize genome was sequenced and assembled as well as how the sequence can be visualized and interacted with via the MaizeGDB Genome Browser. Database
NASA Astrophysics Data System (ADS)
Bulan, Orhan; Bernal, Edgar A.; Loce, Robert P.; Wu, Wencheng
2013-03-01
Video cameras are widely deployed along city streets, interstate highways, traffic lights, stop signs and toll booths by entities that perform traffic monitoring and law enforcement. The videos captured by these cameras are typically compressed and stored in large databases. Performing a rapid search for a specific vehicle within a large database of compressed videos is often required and can be a time-critical life or death situation. In this paper, we propose video compression and decompression algorithms that enable fast and efficient vehicle or, more generally, event searches in large video databases. The proposed algorithm selects reference frames (i.e., I-frames) based on a vehicle having been detected at a specified position within the scene being monitored while compressing a video sequence. A search for a specific vehicle in the compressed video stream is performed across the reference frames only, which does not require decompression of the full video sequence as in traditional search algorithms. Our experimental results on videos captured in a local road show that the proposed algorithm significantly reduces the search space (thus reducing time and computational resources) in vehicle search tasks within compressed video streams, particularly those captured in light traffic volume conditions.
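The reference-frame selection idea can be sketched independently of any particular codec. The sketch assumes a per-frame boolean flag that says whether a vehicle was detected at the specified position, forces the first frame to be a reference frame, and adds a hypothetical maximum distance between reference frames; none of these details are specified in the abstract, and no real encoder API is used.

```python
def select_reference_frames(trigger_flags, max_gop=250):
    """Return the indices of frames to encode as reference (I-) frames.

    A frame becomes a reference frame when it is the first frame, when a
    vehicle is detected at the trigger position, or when `max_gop` frames
    have passed since the last reference frame (assumed safety limit).
    """
    refs = []
    last = None
    for index, triggered in enumerate(trigger_flags):
        if last is None or triggered or index - last >= max_gop:
            refs.append(index)
            last = index
    return refs


def search_space_fraction(trigger_flags, max_gop=250):
    """Fraction of frames a search has to inspect if it reads only I-frames."""
    refs = select_reference_frames(trigger_flags, max_gop)
    return len(refs) / len(trigger_flags)


if __name__ == "__main__":
    flags = [False] * 1000
    for hit in (120, 480, 900):      # three vehicles pass the trigger position
        flags[hit] = True
    print(select_reference_frames(flags))
    print(search_space_fraction(flags))
```

Under these assumptions the benefit grows as traffic gets lighter, since fewer triggers mean fewer reference frames to inspect, which matches the abstract's remark about light traffic conditions.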
Evaluation of privacy in high dynamic range video sequences
NASA Astrophysics Data System (ADS)
Řeřábek, Martin; Yuan, Lin; Krasula, Lukáš; Korshunov, Pavel; Fliegel, Karel; Ebrahimi, Touradj
2014-09-01
The ability of high dynamic range (HDR) to capture details in environments with high contrast has a significant impact on privacy in video surveillance. However, the extent to which HDR imaging affects privacy, when compared to a typical low dynamic range (LDR) imaging, is neither well studied nor well understood. To achieve such an objective, a suitable dataset of images and video sequences is needed. Therefore, we have created a publicly available dataset of HDR video for privacy evaluation PEViD-HDR, which is an HDR extension of an existing Privacy Evaluation Video Dataset (PEViD). PEViD-HDR video dataset can help in the evaluations of privacy protection tools, as well as for showing the importance of HDR imaging in video surveillance applications and its influence on the privacy-intelligibility trade-off. We conducted a preliminary subjective experiment demonstrating the usability of the created dataset for evaluation of privacy issues in video. The results confirm that a tone-mapped HDR video contains more privacy sensitive information and details compared to a typical LDR video.
The Fringe Reading Facility at the Max-Planck-Institut fuer Stroemungsforschung
NASA Astrophysics Data System (ADS)
Becker, F.; Meier, G. E. A.; Wegner, H.; Timm, R.; Wenskus, R.
1987-05-01
A Mach-Zehnder interferometer is used for optical flow measurements in a transonic wind tunnel. Holographic interferograms are reconstructed by illumination with a He-Ne-laser and viewed by a video camera through wide angle optics. This setup was used for investigating industrial double exposure holograms of truck tires in order to develop methods of automatic recognition of certain manufacturing faults. Automatic input is achieved by a transient recorder digitizing the output of a TV camera and transferring the digitized data to a PDP11-34. Interest centered around sequences of interferograms showing the interaction of vortices with a profile and subsequent emission of sound generated by this process. The objective is the extraction of quantitative data which relates to the emission of noise.
The Fringe Reading Facility at the Max-Planck-Institut fuer Stroemungsforschung
NASA Technical Reports Server (NTRS)
Becker, F.; Meier, G. E. A.; Wegner, H.; Timm, R.; Wenskus, R.
1987-01-01
A Mach-Zehnder interferometer is used for optical flow measurements in a transonic wind tunnel. Holographic interferograms are reconstructed by illumination with a He-Ne-laser and viewed by a video camera through wide angle optics. This setup was used for investigating industrial double exposure holograms of truck tires in order to develop methods of automatic recognition of certain manufacturing faults. Automatic input is achieved by a transient recorder digitizing the output of a TV camera and transferring the digitized data to a PDP11-34. Interest centered around sequences of interferograms showing the interaction of vortices with a profile and subsequent emission of sound generated by this process. The objective is the extraction of quantitative data which relates to the emission of noise.
NASA Technical Reports Server (NTRS)
Serebreny, S. M.; Evans, W. E.; Wiegman, E. J.
1974-01-01
The usefulness of dynamic display techniques in exploiting the repetitive nature of ERTS imagery was investigated. A specially designed Electronic Satellite Image Analysis Console (ESIAC) was developed and employed to process data for seven ERTS principal investigators studying dynamic hydrological conditions for diverse applications. These applications include measurement of snowfield extent and sediment plumes from estuary discharge, Playa Lake inventory, and monitoring of phreatophyte and other vegetation changes. The ESIAC provides facilities for storing registered image sequences in a magnetic video disc memory for subsequent recall, enhancement, and animated display in monochrome or color. The most unique feature of the system is the capability to time lapse the imagery and analytic displays of the imagery. Data products included quantitative measurements of distances and areas, binary thematic maps based on monospectral or multispectral decisions, radiance profiles, and movie loops. Applications of animation for uses other than creating time-lapse sequences are identified. Input to the ESIAC can be either digital or via photographic transparencies.
Deriving video content type from HEVC bitstream semantics
NASA Astrophysics Data System (ADS)
Nightingale, James; Wang, Qi; Grecos, Christos; Goma, Sergio R.
2014-05-01
As network service providers seek to improve customer satisfaction and retention levels, they are increasingly moving from traditional quality of service (QoS) driven delivery models to customer-centred quality of experience (QoE) delivery models. QoS models only consider metrics derived from the network; however, QoE models also consider metrics derived from within the video sequence itself. Various spatial and temporal characteristics of a video sequence have been proposed, both individually and in combination, to derive methods of classifying video content either on a continuous scale or as a set of discrete classes. QoE models can be divided into three broad categories: full-reference, reduced-reference and no-reference models. Due to the need to have the original video available at the client for comparison, full-reference metrics are of limited practical value in adaptive real-time video applications. Reduced-reference metrics often require metadata to be transmitted with the bitstream, while no-reference metrics typically operate in the decompressed domain at the client side and require significant processing to extract spatial and temporal features. This paper proposes a heuristic, no-reference approach to video content classification which is specific to HEVC encoded bitstreams. The HEVC encoder already makes use of spatial characteristics to determine partitioning of coding units and temporal characteristics to determine the splitting of prediction units. We derive a function which approximates the spatio-temporal characteristics of the video sequence by using the weighted averages of the depth at which the coding unit quadtree is split and the prediction mode decision made by the encoder to estimate spatial and temporal characteristics respectively.
Since the video content type of a sequence is determined by using high level information parsed from the video stream, spatio-temporal characteristics are identified without the need for full decoding and can be used in a timely manner to aid decision making in QoE oriented adaptive real time streaming.
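A minimal sketch of the weighted-average idea follows. The abstract does not give the actual function or its weights, so everything here is an assumption: coding units are weighted by their pixel area, the quadtree depth runs from 0 (64x64) to 3 (8x8), and the per-mode scores are hypothetical placeholders. The input is assumed to have been parsed from the bitstream already; no real HEVC parser is invoked.

```python
from dataclasses import dataclass

# Hypothetical scores: higher means the encoder found less temporal redundancy.
MODE_SCORE = {"skip": 0.0, "inter": 0.5, "intra": 1.0}


@dataclass
class CodingUnit:
    depth: int   # quadtree split depth, 0 (64x64) .. 3 (8x8)
    mode: str    # one of the keys of MODE_SCORE


def cu_area(depth, ctu_size=64):
    """Pixel area of a coding unit at the given quadtree depth."""
    side = ctu_size >> depth
    return side * side


def content_features(cus, max_depth=3):
    """Area-weighted averages of split depth and prediction-mode score.

    Returns (spatial_index, temporal_index), both normalised to [0, 1].
    """
    total = sum(cu_area(cu.depth) for cu in cus)
    spatial = sum(cu_area(cu.depth) * cu.depth for cu in cus) / (total * max_depth)
    temporal = sum(cu_area(cu.depth) * MODE_SCORE[cu.mode] for cu in cus) / total
    return spatial, temporal


if __name__ == "__main__":
    # One flat, static 64x64 block plus one 64x64 region split into 8x8 intra CUs.
    cus = [CodingUnit(0, "skip")] + [CodingUnit(3, "intra")] * 64
    print(content_features(cus))
```

The two indices could then be thresholded or clustered into content classes; how the paper maps them to classes is not stated in the abstract.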
DOT National Transportation Integrated Search
2013-10-21
Today many intersections are operated based on data input from nonintrusive video detection systems. With those systems the video detectors can be easily deployed/modified for different application requirements. This research project is initiated to ...
Telesign: a videophone system for sign language distant communication
NASA Astrophysics Data System (ADS)
Mozelle, Gerard; Preteux, Francoise J.; Viallet, Jean-Emmanuel
1998-09-01
This paper presents a low bit rate videophone system for deaf people communicating by means of sign language. Classic video conferencing systems have focused on head and shoulders sequences which are not well-suited for sign language video transmission since hearing impaired people also use their hands and arms to communicate. To address the above-mentioned functionality, we have developed a two-step content-based video coding system based on: (1) A segmentation step. Four or five video objects (VO) are extracted using a cooperative approach between color-based and morphological segmentation. (2) A coding step. VO coding is achieved by using a standardized MPEG-4 video toolbox. Results of encoded sign language video sequences, presented for three target bit rates (32 kbits/s, 48 kbits/s and 64 kbits/s), demonstrate the efficiency of the approach presented in this paper.
Data Management Rubric for Video Data in Organismal Biology.
Brainerd, Elizabeth L; Blob, Richard W; Hedrick, Tyson L; Creamer, Andrew T; Müller, Ulrike K
2017-07-01
Standards-based data management facilitates data preservation, discoverability, and access for effective data reuse within research groups and across communities of researchers. Data sharing requires community consensus on standards for data management, such as storage and formats for digital data preservation, metadata (i.e., contextual data about the data) that should be recorded and stored, and data access. Video imaging is a valuable tool for measuring time-varying phenotypes in organismal biology, with particular application for research in functional morphology, comparative biomechanics, and animal behavior. The raw data are the videos, but videos alone are not sufficient for scientific analysis. Nearly endless videos of animals can be found on YouTube and elsewhere on the web, but these videos have little value for scientific analysis because essential metadata such as true frame rate, spatial calibration, genus and species, weight, age, etc. of organisms, are generally unknown. We have embarked on a project to build community consensus on video data management and metadata standards for organismal biology research. We collected input from colleagues at early stages, organized an open workshop, "Establishing Standards for Video Data Management," at the Society for Integrative and Comparative Biology meeting in January 2017, and then collected two more rounds of input on revised versions of the standards. The result we present here is a rubric consisting of nine standards for video data management, with three levels within each standard: good, better, and best practices. The nine standards are: (1) data storage; (2) video file formats; (3) metadata linkage; (4) video data and metadata access; (5) contact information and acceptable use; (6) camera settings; (7) organism(s); (8) recording conditions; and (9) subject matter/topic. 
The first four standards address data preservation and interoperability for sharing, whereas standards 5-9 establish minimum metadata standards for organismal biology video, and suggest additional metadata that may be useful for some studies. This rubric was developed with substantial input from researchers and students, but still should be viewed as a living document that should be further refined and updated as technology and research practices change. The audience for these standards includes researchers, journals, and granting agencies, and also the developers and curators of databases that may contribute to video data sharing efforts. We offer this project as an example of building community consensus for data management, preservation, and sharing standards, which may be useful for future efforts by the organismal biology research community. © The Author 2017. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology.
Data Management Rubric for Video Data in Organismal Biology
Brainerd, Elizabeth L.; Blob, Richard W.; Hedrick, Tyson L.; Creamer, Andrew T.; Müller, Ulrike K.
2017-01-01
Synopsis Standards-based data management facilitates data preservation, discoverability, and access for effective data reuse within research groups and across communities of researchers. Data sharing requires community consensus on standards for data management, such as storage and formats for digital data preservation, metadata (i.e., contextual data about the data) that should be recorded and stored, and data access. Video imaging is a valuable tool for measuring time-varying phenotypes in organismal biology, with particular application for research in functional morphology, comparative biomechanics, and animal behavior. The raw data are the videos, but videos alone are not sufficient for scientific analysis. Nearly endless videos of animals can be found on YouTube and elsewhere on the web, but these videos have little value for scientific analysis because essential metadata such as true frame rate, spatial calibration, genus and species, weight, age, etc. of organisms, are generally unknown. We have embarked on a project to build community consensus on video data management and metadata standards for organismal biology research. We collected input from colleagues at early stages, organized an open workshop, “Establishing Standards for Video Data Management,” at the Society for Integrative and Comparative Biology meeting in January 2017, and then collected two more rounds of input on revised versions of the standards. The result we present here is a rubric consisting of nine standards for video data management, with three levels within each standard: good, better, and best practices. The nine standards are: (1) data storage; (2) video file formats; (3) metadata linkage; (4) video data and metadata access; (5) contact information and acceptable use; (6) camera settings; (7) organism(s); (8) recording conditions; and (9) subject matter/topic. 
The first four standards address data preservation and interoperability for sharing, whereas standards 5–9 establish minimum metadata standards for organismal biology video, and suggest additional metadata that may be useful for some studies. This rubric was developed with substantial input from researchers and students, but still should be viewed as a living document that should be further refined and updated as technology and research practices change. The audience for these standards includes researchers, journals, and granting agencies, and also the developers and curators of databases that may contribute to video data sharing efforts. We offer this project as an example of building community consensus for data management, preservation, and sharing standards, which may be useful for future efforts by the organismal biology research community. PMID:28881939
Xu, Yilei; Roy-Chowdhury, Amit K
2007-05-01
In this paper, we present a theory for combining the effects of motion, illumination, 3D structure, albedo, and camera parameters in a sequence of images obtained by a perspective camera. We show that the set of all Lambertian reflectance functions of a moving object, at any position, illuminated by arbitrarily distant light sources, lies "close" to a bilinear subspace consisting of nine illumination variables and six motion variables. This result implies that, given an arbitrary video sequence, it is possible to recover the 3D structure, motion, and illumination conditions simultaneously using the bilinear subspace formulation. The derivation builds upon existing work on linear subspace representations of reflectance by generalizing it to moving objects. Lighting can change slowly or suddenly, locally or globally, and can originate from a combination of point and extended sources. We experimentally compare the results of our theory with ground truth data and also provide results on real data by using video sequences of a 3D face and the entire human body with various combinations of motion and illumination directions. We also show results of our theory in estimating 3D motion and illumination model parameters from a video sequence.
The MaizeGDB Genome Browser tutorial: one example of database outreach to biologists via video
Harper, Lisa C.; Schaeffer, Mary L.; Thistle, Jordan; Gardiner, Jack M.; Andorf, Carson M.; Campbell, Darwin A.; Cannon, Ethalinda K.S.; Braun, Bremen L.; Birkett, Scott M.; Lawrence, Carolyn J.; Sen, Taner Z.
2011-01-01
Video tutorials are an effective way for researchers to quickly learn how to use online tools offered by biological databases. At MaizeGDB, we have developed a number of video tutorials that demonstrate how to use various tools and explicitly outline the caveats researchers should know to interpret the information available to them. One such popular video currently available is ‘Using the MaizeGDB Genome Browser’, which describes how the maize genome was sequenced and assembled as well as how the sequence can be visualized and interacted with via the MaizeGDB Genome Browser. Database URL: http://www.maizegdb.org/ PMID:21565781
NASA Astrophysics Data System (ADS)
Bartolini, Franco; Pasquini, Cristina; Piva, Alessandro
2001-04-01
Recent developments in video compression algorithms have enabled the spread of systems for transmitting video sequences over data networks. However, transmission over error-prone mobile communication channels is still an open issue. In this paper, a system developed for the real-time transmission of H.263-coded video sequences over TETRA mobile networks is presented. TETRA is an open digital trunked radio standard defined by the European Telecommunications Standards Institute (ETSI) for professional mobile radio users, providing full integration of voice and data services. Experimental tests demonstrate that, in spite of the low frame rate allowed by the software-only implementation of the decoder and by the low channel rate, a video compression technique such as one complying with the H.263 standard is still preferable to a simpler but less effective frame-based compression system.
Mental Verb Input for Promoting Children's Theory of Mind: A Training Study
ERIC Educational Resources Information Center
Gola, Alice Ann Howard
2012-01-01
An experimental study investigated the effect of the type of mental verb input (i.e., input with "think", "know", and "remember") on preschoolers' theory of mind development. Preschoolers (n = 72) heard 128 mental verb utterances presented in video format across four sessions over two weeks. The training conditions differed only in the way the…
ERIC Educational Resources Information Center
Ghavamnia, M.; Eslami-Rasekh, A.; Vahid Dastjerdi, H.
2018-01-01
This study investigates the relative effectiveness of four types of input-enhanced instruction on the development of Iranian EFL learners' production of pragmatically appropriate and grammatically accurate suggestions. Over a 16-week course, input delivered through video clips was enhanced differently in four intact classes: (1) metapragmatic…
Video denoising using low rank tensor decomposition
NASA Astrophysics Data System (ADS)
Gui, Lihua; Cui, Gaochao; Zhao, Qibin; Wang, Dongsheng; Cichocki, Andrzej; Cao, Jianting
2017-03-01
Reducing noise in a video sequence is of vital importance in many real-world applications. One popular method is block-matching collaborative filtering. However, the main drawback of this method is that the noise standard deviation for the whole video sequence must be known in advance. In this paper, we present a tensor-based denoising framework that considers 3D patches instead of 2D patches. By collecting similar 3D patches non-locally, we employ low-rank tensor decomposition for collaborative filtering. Since we specify a non-informative prior over the noise precision parameter, the noise variance can be inferred automatically from the observed video data. Our method is therefore more practical, because it does not require the noise variance to be known. Experiments on video denoising demonstrate the effectiveness of the proposed method.
Multicore-based 3D-DWT video encoder
NASA Astrophysics Data System (ADS)
Galiano, Vicente; López-Granado, Otoniel; Malumbres, Manuel P.; Migallón, Hector
2013-12-01
Three-dimensional wavelet transform (3D-DWT) encoders are good candidates for applications such as professional video editing, video surveillance, and multi-spectral satellite imaging, where a frame must be reconstructed as quickly as possible. In this paper, we present a new 3D-DWT video encoder based on a fast run-length coding engine. Furthermore, we present several multicore optimizations to speed up the 3D-DWT computation. An exhaustive evaluation of the proposed encoder (3D-GOP-RL) has been performed, and we have compared the evaluation results with other video encoders in terms of rate/distortion (R/D), coding/decoding delay, and memory consumption. Results show that the proposed encoder obtains good R/D results for high-resolution video sequences with nearly in-place computation, using only the memory needed to store a group of pictures. After applying the multicore optimization strategies to the 3D-DWT, the proposed encoder is able to compress a full high-definition video sequence in real time.
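As a rough illustration of what the temporal part of a 3D-DWT does to a group of pictures (GOP), the following minimal sketch applies one Haar level along the time axis only. It is not the 3D-GOP-RL encoder described above: the Haar filter, the averaging normalization, and the function names `temporal_haar` and `inverse_temporal_haar` are assumptions chosen for brevity, and the spatial transform and run-length coding stages are omitted.

```python
from typing import List, Tuple

Frame = List[List[float]]


def temporal_haar(gop: List[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """One Haar level along time: pairwise frame averages and differences."""
    if len(gop) % 2:
        raise ValueError("GOP length must be even")
    low: List[Frame] = []
    high: List[Frame] = []
    # Pair frame 0 with 1, frame 2 with 3, and so on.
    for a, b in zip(gop[0::2], gop[1::2]):
        low.append([[(p + q) / 2 for p, q in zip(ra, rb)]
                    for ra, rb in zip(a, b)])
        high.append([[(p - q) / 2 for p, q in zip(ra, rb)]
                     for ra, rb in zip(a, b)])
    return low, high


def inverse_temporal_haar(low: List[Frame], high: List[Frame]) -> List[Frame]:
    """Rebuild the GOP from its temporal averages and differences."""
    gop: List[Frame] = []
    for l, h in zip(low, high):
        gop.append([[p + q for p, q in zip(rl, rh)] for rl, rh in zip(l, h)])
        gop.append([[p - q for p, q in zip(rl, rh)] for rl, rh in zip(l, h)])
    return gop
```

Frames that change little over time produce difference frames that are mostly near zero, which is what makes a subsequent run-length stage effective; only one GOP has to be held in memory at a time.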
Peña, Raul; Ávila, Alfonso; Muñoz, David; Lavariega, Juan
2015-01-01
The recognition of clinical manifestations in both video images and physiological-signal waveforms is an important aid to improving the safety and effectiveness of medical care. Physicians can rely on video-waveform (VW) observations to recognize difficult-to-spot signs and symptoms. The VW observations can also reduce the number of false positive incidents and expand the recognition coverage to abnormal health conditions. The synchronization between the video images and the physiological-signal waveforms is fundamental for the successful recognition of the clinical manifestations. The use of conventional equipment to synchronously acquire and display the video-waveform information involves complex tasks such as the video capture/compression, the acquisition/compression of each physiological signal, and the video-waveform synchronization based on timestamps. This paper introduces a data hiding technique capable of both enabling embedding channels and synchronously hiding samples of physiological signals in encoded video sequences. Our data hiding technique offers large data capacity and reduces the complexity of video-waveform acquisition and reproduction. The experimental results revealed successful embedding and full restoration of the signal samples. Our results also demonstrated a small distortion in the video objective quality, a small increment in bit-rate, and embedded cost savings of -2.6196% for high and medium motion video sequences.
Video bandwidth compression system
NASA Astrophysics Data System (ADS)
Ludington, D.
1980-08-01
The objective of this program was the development of a Video Bandwidth Compression brassboard model for use by the Air Force Avionics Laboratory, Wright-Patterson Air Force Base, in evaluation of bandwidth compression techniques for use in tactical weapons and to aid in the selection of particular operational modes to be implemented in an advanced flyable model. The bandwidth compression system is partitioned into two major divisions: the encoder, which processes the input video with a compression algorithm and transmits the most significant information; and the decoder where the compressed data is reconstructed into a video image for display.
Sanderson, Saskia C.; Suckiel, Sabrina A.; Zweig, Micol; Bottinger, Erwin P.; Jabs, Ethylin Wang; Richardson, Lynne D.
2016-01-01
Background: As whole-genome sequencing (WGS) increases in availability, WGS educational aids are needed for research participants, patients, and the general public. Our aim was therefore to develop an accessible and scalable WGS educational aid. Genet Med 18 5, 501–512. Methods: We engaged multiple stakeholders in an iterative process over a 1-year period culminating in the production of a novel 10-minute WGS educational animated video, “Whole Genome Sequencing and You” (https://goo.gl/HV8ezJ). We then presented the animated video to 281 online-survey respondents (the video-information group). There were also two comparison groups: a written-information group (n = 281) and a no-information group (n = 300). Results: In the video-information group, 79% reported the video was easy to understand, satisfaction scores were high (mean 4.00 on 1–5 scale, where 5 = high satisfaction), and knowledge increased significantly. There were significant differences in knowledge compared with the no-information group but few differences compared with the written-information group. Intention to receive personal results from WGS and decisional conflict in response to a hypothetical scenario did not differ between the three groups. Conclusions: The educational animated video, “Whole Genome Sequencing and You,” was well received by this sample of online-survey respondents. Further work is needed to evaluate its utility as an aid to informed decision making about WGS in other populations. PMID:26334178
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis
Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.; ...
2017-10-16
This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow-moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly-free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However, this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.
Achieving an Optimal Medium Altitude UAV Force Balance in Support of COIN Operations
2009-02-02
and execute operations. UAS with common data links and remote video terminals (RVTs) provide input to the common operational picture (COP) and...full-motion video (FMV) is intuitive to many tactical warfighters who have used similar sensors in manned aircraft. Modern data links allow the video ...Document (AFDD) 2-9. Intelligence, Surveillance, and Reconnaissance Operations, 17 July 2007. Baldor, Lolita C. “Increased UAV reliance evident in
Action Bank: A High Level Representation of Activity in Video (Author’s Manuscript)
2012-07-26
of highly discriminative performance. We have tested action bank on four major activity recognition benchmarks. In all cases, our performance is...that seek a more semantically rich and discriminative Bank of Action Detectors View 1 View 2 View n Biking Javelin Jump Rope Fencing Input Video...Positive: jumping, throwing, running, ... Negative: biking, fencing, drumming, ... Figure 1. Action bank is a high-level representation for video ac
ERIC Educational Resources Information Center
Casinghino, Carl
2015-01-01
Teaching advanced video production is an art that requires great sensitivity to the process of providing feedback that helps students to learn and grow. Some students experience difficulty in developing narrative sequences or cause-and-effect strings of motion picture sequences. But when students learn to work collaboratively through the revision…
ERIC Educational Resources Information Center
Yakubova, Gulnoza; Hughes, Elizabeth M.; Shinaberry, Megan
2016-01-01
The purpose of this study was to determine the effectiveness of a video modeling intervention with concrete-representational-abstract instructional sequence in teaching mathematics concepts to students with autism spectrum disorder (ASD). A multiple baseline across skills design of single-case experimental methodology was used to determine the…
NASA Astrophysics Data System (ADS)
Ongena, G.; van de Wijngaert, L. A. L.; Huizer, E.
2013-03-01
The purpose of this study is to seek input for a new online audiovisual heritage service. In doing so, we assess comparable online video services to gain insights into the motivations and perceived innovation characteristics of the video services. The research is based on data from a Dutch survey held among 1,939 online video service users. The results show that online video services share overlapping antecedents but differ in motivations and in perceived innovation characteristics. Hence, in general, one can state that online video services meet different needs and differ in perceived innovation characteristics. This implies that one can design online video services for different needs. In addition to scientific implications, the outcomes also provide guidance for practitioners in implementing new online video services.
Systems and methods for improved telepresence
Anderson, Matthew O.; Willis, W. David; Kinoshita, Robert A.
2005-10-25
The present invention provides a modular, flexible system for deploying multiple video perception technologies. The telepresence system of the present invention is capable of allowing an operator to control multiple mono and stereo video inputs in a hands-free manner. The raw data generated by the input devices is processed into a common zone structure that corresponds to the commands of the user, and the commands represented by the zone structure are transmitted to the appropriate device. This modularized approach permits input devices to be easily interfaced with various telepresence devices. Additionally, new input devices and telepresence devices are easily added to the system and are frequently interchangeable. The present invention also provides a modular configuration component that allows an operator to define a plurality of views each of which defines the telepresence devices to be controlled by a particular input device. The present invention provides a modular flexible system for providing telepresence for a wide range of applications. The modularization of the software components combined with the generalized zone concept allows the systems and methods of the present invention to be easily expanded to encompass new devices and new uses.
ERIC Educational Resources Information Center
Scheflen, Sarah Clifford; Freeman, Stephanny F. N.; Paparella, Tanya
2012-01-01
Four children with autism were taught play skills through the use of video modeling. Video instruction was used to model play and appropriate language through a developmental sequence of play levels integrated with language techniques. Results showed that children with autism could successfully use video modeling to learn how to play appropriately…
A DSP-based neural network non-uniformity correction algorithm for IRFPA
NASA Astrophysics Data System (ADS)
Liu, Chong-liang; Jin, Wei-qi; Cao, Yang; Liu, Xiu
2009-07-01
An effective neural network non-uniformity correction (NUC) algorithm based on DSP is proposed in this paper. The non-uniform response in infrared focal plane array (IRFPA) detectors produces corrupted images with fixed-pattern noise (FPN). We introduce and analyze the artificial neural network scene-based non-uniformity correction (SBNUC) algorithm. The design of a DSP-based NUC development platform for IRFPA is described. The DSP hardware platform has low power consumption, with the 32-bit fixed-point DSP TMS320DM643 as the kernel processor. The dependability and expansibility of the software have been improved by the DSP/BIOS real-time operating system and Reference Framework 5. To achieve real-time performance, the calibration parameter update is set at a lower task priority than video input and output in DSP/BIOS. In this way, updating the calibration parameters does not affect the video streams. The work flow of the system and the strategy for real-time realization are introduced. Experiments on real infrared imaging sequences demonstrate that this algorithm requires only a few frames to obtain high-quality corrections. It is computationally efficient and suitable for all kinds of non-uniformity.
A Motion Detection Algorithm Using Local Phase Information
Lazar, Aurel A.; Ukani, Nikul H.; Zhou, Yiyin
2016-01-01
Previous research demonstrated that global phase alone can be used to faithfully represent visual scenes. Here we provide a reconstruction algorithm by using only local phase information. We also demonstrate that local phase alone can be effectively used to detect local motion. The local phase-based motion detector is akin to models employed to detect motion in biological vision, for example, the Reichardt detector. The local phase-based motion detection algorithm introduced here consists of two building blocks. The first building block measures/evaluates the temporal change of the local phase. The temporal derivative of the local phase is shown to exhibit the structure of a second order Volterra kernel with two normalized inputs. We provide an efficient, FFT-based algorithm for implementing the change of the local phase. The second processing building block implements the detector; it compares the maximum of the Radon transform of the local phase derivative with a chosen threshold. We demonstrate examples of applying the local phase-based motion detection algorithm on several video sequences. We also show how the locally detected motion can be used for segmenting moving objects in video scenes and compare our local phase-based algorithm to segmentation achieved with a widely used optic flow algorithm. PMID:26880882
Application of TrackEye in equine locomotion research.
Drevemo, S; Roepstorff, L; Kallings, P; Johnston, C J
1993-01-01
TrackEye is an analysis system that is applicable to equine biokinematic studies. It covers the whole process, from digitizing of images through automatic target tracking to analysis. Key components in the system are an image work station for processing of video images and a high-resolution film-to-video scanner for 16-mm film. A recording module controls the input device and handles the capture of image sequences into a videodisc system, and a tracking module is able to follow reference markers automatically. The system offers a flexible analysis including calculations of marker displacements, distances and joint angles, velocities and accelerations. TrackEye was used to study effects of phenylbutazone on the fetlock and carpal joint angle movements in a horse with a mild lameness caused by osteo-arthritis in the fetlock joint of a forelimb. Significant differences, most evident before treatment, were observed in the minimum fetlock and carpal joint angles when contralateral limbs were compared (p < 0.001). The minimum fetlock angle and the minimum carpal joint angle were significantly greater in the lame limb before treatment compared to those 6, 37 and 49 h after the last treatment (p < 0.001).
Data compression of discrete sequence: A tree based approach using dynamic programming
NASA Technical Reports Server (NTRS)
Shivaram, Gurusrasad; Seetharaman, Guna; Rao, T. R. N.
1994-01-01
A dynamic programming based approach for data compression of a 1D sequence is presented. The compression of an input sequence of size N to one of a smaller size k is achieved by dividing the input sequence into k subsequences and replacing each subsequence by its average value. The partitioning of the input sequence is carried out with the aim of minimizing the mean squared error in the reconstructed sequence. The complexity involved in finding the partition that yields such an optimal compressed sequence is reduced by using the dynamic programming approach, which is presented.
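The partitioning idea in this abstract can be sketched as a textbook dynamic program over contiguous runs. This is a minimal sketch under stated assumptions rather than the authors' algorithm: the function names `optimal_partition` and `compress`, the use of prefix sums, and the O(k·N²) recurrence are illustrative choices, and the abstract's tree-based formulation is not reproduced.

```python
from typing import List, Tuple


def optimal_partition(seq: List[float], k: int) -> Tuple[float, List[int]]:
    """Split seq into k contiguous runs minimizing total squared error.

    Returns (total squared error, start index of each run).
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(seq)")

    # Prefix sums let us evaluate the squared error of any run in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(seq):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        """Squared error of replacing seq[i:j] by its mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[m][j]: best error for the first j samples using m runs.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # the last run is seq[i:j]
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers to recover where each run starts.
    starts: List[int] = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return cost[k][n], starts


def compress(seq: List[float], k: int) -> List[float]:
    """Return the k run averages, i.e. the compressed sequence."""
    _, starts = optimal_partition(seq, k)
    bounds = starts + [len(seq)]
    return [sum(seq[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]
```

Dividing the returned total squared error by N gives the mean squared error of the reconstruction; a decoder would also need the run boundaries to expand the k averages back to length N.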
Lehmann, Ronny; Seitz, Anke; Bosse, Hans Martin; Lutz, Thomas; Huwendiek, Sören
2016-11-01
Physical examination skills are crucial for a medical doctor. The physical examination of children differs significantly from that of adults. Students often have only limited contact with pediatric patients to practice these skills. In order to improve the acquisition of pediatric physical examination skills during bedside teaching, we have developed a combined video-based training concept, subsequently evaluating its use and perception. Fifteen videos were compiled, demonstrating defined physical examination sequences in children of different ages. Students were encouraged to use these videos as preparation for bedside teaching during their pediatric clerkship. After bedside teaching, acceptance of this approach was evaluated using a 10-item survey, asking for the frequency of video use and the benefits to learning, self-confidence, and preparation of bedside teaching as well as the concluding OSCE. N=175 out of 299 students returned survey forms (58.5%). Students most frequently used videos, either illustrating complete examination sequences or corresponding focus examinations frequently assessed in the OSCE. Students perceived the videos as a helpful method of conveying the practical process and preparation for bedside teaching as well as the OSCE, and altogether considered them a worthwhile learning experience. Self-confidence at bedside teaching was enhanced by preparation with the videos. The demonstration of a defined standardized procedural sequence, explanatory comments, and demonstration of infrequent procedures and findings were perceived as particularly supportive. Long video segments, poor alignment with other curricular learning activities, and technical problems were perceived as less helpful. Students prefer an optional individual use of the videos, with easy technical access, thoughtful combination with the bedside teaching, and consecutive standardized practice of demonstrated procedures. 
Preparation with instructional videos combined with bedside teaching was perceived to improve the acquisition of pediatric physical examination skills. Copyright © 2016 Elsevier GmbH. All rights reserved.
47 CFR 76.70 - Exemption from input selector switch rules.
Code of Federal Regulations, 2014 CFR
2014-10-01
... 47 Telecommunication 4 2014-10-01 2014-10-01 false Exemption from input selector switch rules. 76.70 Section 76.70 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST RADIO SERVICES MULTICHANNEL VIDEO AND CABLE TELEVISION SERVICE Carriage of Television Broadcast Signals § 76.70...
47 CFR 76.70 - Exemption from input selector switch rules.
Code of Federal Regulations, 2012 CFR
2012-10-01
... 47 Telecommunication 4 2012-10-01 2012-10-01 false Exemption from input selector switch rules. 76.70 Section 76.70 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST RADIO SERVICES MULTICHANNEL VIDEO AND CABLE TELEVISION SERVICE Carriage of Television Broadcast Signals § 76.70...
47 CFR 76.70 - Exemption from input selector switch rules.
Code of Federal Regulations, 2013 CFR
2013-10-01
... 47 Telecommunication 4 2013-10-01 2013-10-01 false Exemption from input selector switch rules. 76.70 Section 76.70 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST RADIO SERVICES MULTICHANNEL VIDEO AND CABLE TELEVISION SERVICE Carriage of Television Broadcast Signals § 76.70...
47 CFR 76.70 - Exemption from input selector switch rules.
Code of Federal Regulations, 2011 CFR
2011-10-01
... 47 Telecommunication 4 2011-10-01 2011-10-01 false Exemption from input selector switch rules. 76.70 Section 76.70 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST RADIO SERVICES MULTICHANNEL VIDEO AND CABLE TELEVISION SERVICE Carriage of Television Broadcast Signals § 76.70...
Video image stabilization and registration--plus
NASA Technical Reports Server (NTRS)
Hathaway, David H. (Inventor)
2009-01-01
A method of stabilizing a video image displayed in multiple video fields of a video sequence includes the steps of: subdividing a selected area of a first video field into nested pixel blocks; determining horizontal and vertical translation of each of the pixel blocks in each of the pixel block subdivision levels from the first video field to a second video field; and determining translation of the image from the first video field to the second video field by determining a change in magnification of the image from the first video field to the second video field in each of horizontal and vertical directions, and determining shear of the image from the first video field to the second video field in each of the horizontal and vertical directions.
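The per-block translation step named in this abstract can be illustrated with a small exhaustive block-matching search. This is only a sketch under assumptions: the sum-of-absolute-differences (SAD) criterion, the integer-pixel search window, and the function name `block_translation` are illustrative and are not taken from the patent. The nesting of block levels and the magnification and shear estimates are not shown.

```python
from typing import List, Tuple

Frame = List[List[int]]


def block_translation(ref: Frame, cur: Frame, top: int, left: int,
                      size: int, search: int) -> Tuple[int, int]:
    """Return the (dy, dx) shift that best moves a block of `ref` onto `cur`.

    Exhaustive search minimizing the sum of absolute differences (SAD).
    """
    height, width = len(cur), len(cur[0])
    best = None
    best_shift = (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = top + dy, left + dx
            # Skip candidate positions that fall outside the frame.
            if y0 < 0 or x0 < 0 or y0 + size > height or x0 + size > width:
                continue
            sad = 0
            for r in range(size):
                for c in range(size):
                    sad += abs(ref[top + r][left + c] - cur[y0 + r][x0 + c])
            if best is None or sad < best:
                best, best_shift = sad, (dy, dx)
    return best_shift
```

Running this for blocks at several positions and sizes yields a field of translations; how those per-block translations vary with block position is the kind of information from which a change in magnification or shear could then be estimated.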
The emerging High Efficiency Video Coding standard (HEVC)
NASA Astrophysics Data System (ADS)
Raja, Gulistan; Khan, Awais
2013-12-01
High definition video (HDV) is becoming increasingly popular. This paper describes a performance analysis of the latest video standard, High Efficiency Video Coding (HEVC). HEVC is designed to fulfil the requirements of future high definition video. In this paper, three configurations of HEVC (intra only, low delay and random access) are analyzed using various 480p, 720p and 1080p high definition test video sequences. Simulation results show the superior objective and subjective quality of HEVC.
Moving object detection and tracking in videos through turbulent medium
NASA Astrophysics Data System (ADS)
Halder, Kalyan Kumar; Tahtali, Murat; Anavatti, Sreenatha G.
2016-06-01
This paper addresses the problem of identifying and tracking moving objects in a video sequence having a time-varying background. This is a fundamental task in many computer vision applications, though a very challenging one because of turbulence that causes blurring and spatiotemporal movements of the background images. Our proposed approach involves two major steps. First, a moving object detection algorithm is used that detects real motions by separating them from turbulence-induced motions using a two-level thresholding technique. In the second step, a feature-based generalized regression neural network is applied to track the detected objects throughout the frames in the video sequence. The proposed approach uses the centroid and area features of the moving objects and creates the reference regions instantly by selecting the objects within a circle. Simulation experiments are carried out on several turbulence-degraded video sequences, and comparisons with an earlier method confirm that the proposed approach provides more effective tracking of the targets.
Effects of blurring and vertical misalignment on visual fatigue of stereoscopic displays
NASA Astrophysics Data System (ADS)
Baek, Sangwook; Lee, Chulhee
2015-03-01
In this paper, we investigate two error issues in stereo images, which may produce visual fatigue. When two cameras are used to produce 3D video sequences, vertical misalignment can be a problem. Although this problem may not occur in professionally produced 3D programs, it is still a major issue in many low-cost 3D programs. Recently, efforts have been made to produce 3D video programs using smart phones or tablets, which may present the vertical alignment problem. Also, in 2D-3D conversion techniques, the simulated frame may have blur effects, which can also introduce visual fatigue in 3D programs. In this paper, to investigate the relationship between these two errors (vertical misalignment and blurring in one image), we performed a subjective test using simulated 3D video sequences that include stereo video sequences with various vertical misalignments and blurring in a stereo image. We present some analyses along with objective models to predict the degree of visual fatigue from vertical misalignment and blurring.
Performance Evaluation of the NASA/KSC Transmission System
NASA Technical Reports Server (NTRS)
Christensen, Kenneth J.
2000-01-01
NASA-KSC currently uses three bridged 100-Mbps FDDI segments as its backbone for data traffic. The FDDI Transmission System (FTXS) connects the KSC industrial area, KSC launch complex 39 area, and the Cape Canaveral Air Force Station. The report presents a performance modeling study of the FTXS and the proposed ATM Transmission System (ATXS). The focus of the study is on performance of MPEG video transmission on these networks. Commercial modeling tools - the CACI Predictor and Comnet tools - were used. In addition, custom software tools were developed to characterize conversation pairs in Sniffer trace (capture) files to use as input to these tools. A baseline study of both non-launch and launch day data traffic on the FTXS is presented. MPEG-1 and MPEG-2 video traffic was characterized and the shaping of it evaluated. It is shown that the characteristics of a video stream have a direct effect on its performance in a network. It is also shown that shaping of video streams is necessary to prevent overflow losses and resulting poor video quality. The developed models can be used to predict when the existing FTXS will 'run out of room' and for optimizing the parameters of ATM links used for transmission of MPEG video. Future work with these models can provide useful input and validation to set-top box projects within the Advanced Networks Development group in NASA-KSC Development Engineering.
Content-based video retrieval by example video clip
NASA Astrophysics Data System (ADS)
Dimitrova, Nevenka; Abdel-Mottaleb, Mohamed
1997-01-01
This paper presents a novel approach for video retrieval from a large archive of MPEG or Motion JPEG compressed video clips. We introduce a retrieval algorithm that takes a video clip as a query and searches the database for clips with similar contents. Video clips are characterized by a sequence of representative frame signatures, which are constructed from DC coefficients and motion information (`DC+M' signatures). The similarity between two video clips is determined by using their respective signatures. This method facilitates retrieval of clips for the purpose of video editing, broadcast news retrieval, or copyright violation detection.
Using Video Modeling to Teach Complex Social Sequences to Children with Autism
ERIC Educational Resources Information Center
Nikopoulos, Christos K.; Keenan, Mickey
2007-01-01
This study, comprising two experiments, was designed to teach complex social sequences to children with autism. Experimental control was achieved by collecting data using means of within-system design methodology. Across a number of conditions children were taken to a room to view one of the four short videos of two people engaging in a simple…
Optimal frame-by-frame result combination strategy for OCR in video stream
NASA Astrophysics Data System (ADS)
Bulatov, Konstantin; Lynchenko, Aleksander; Krivtsov, Valeriy
2018-04-01
This paper describes the problem of combining classification results of multiple observations of one object. This task can be regarded as a particular case of decision-making using a combination of experts' votes with calculated weights. The accuracy of various methods of combining the classification results, depending on different models of input data, is investigated using the example of frame-by-frame character recognition in a video stream. Experiments show that choosing the single most competent expert has an advantage when the input data contain no irrelevant observations (here, irrelevant means affected by character localization and segmentation errors). At the same time, this work demonstrates the advantage of combining several of the most competent experts according to the multiplication rule or voting if irrelevant samples are present in the input data.
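The three combination strategies named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the per-frame confidence dictionaries and the sample values are all hypothetical, and the calculated expert weights mentioned in the abstract are omitted. It assumes every frame reports a confidence for the same set of classes and that at least one class has a non-zero product.

```python
from collections import Counter
from math import prod

def combine_product(frames):
    """Multiplication rule: multiply per-class confidences across frames, then renormalize."""
    classes = frames[0].keys()
    scores = {c: prod(f[c] for f in frames) for c in classes}
    total = sum(scores.values())  # assumed non-zero in this sketch
    return {c: s / total for c, s in scores.items()}

def combine_vote(frames):
    """Voting: each frame votes for its top class; the most frequent class wins."""
    votes = Counter(max(f, key=f.get) for f in frames)
    return votes.most_common(1)[0][0]

def best_single(frames):
    """Single most competent expert: take the top class of the most confident frame."""
    best = max(frames, key=lambda f: max(f.values()))
    return max(best, key=best.get)

# Hypothetical per-frame recognition confidences for one character.
frames = [
    {"O": 0.6, "0": 0.3, "Q": 0.1},
    {"O": 0.4, "0": 0.5, "Q": 0.1},
    {"O": 0.7, "0": 0.2, "Q": 0.1},
]
product = combine_product(frames)
print(max(product, key=product.get), combine_vote(frames), best_single(frames))
```

In this toy input the second frame disagrees with the other two, yet all three strategies still agree on the same class; the abstract's point is that they diverge once irrelevant (mislocalized or missegmented) frames enter the input.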
Source-Adaptation-Based Wireless Video Transport: A Cross-Layer Approach
NASA Astrophysics Data System (ADS)
Qu, Qi; Pei, Yong; Modestino, James W.; Tian, Xusheng
2006-12-01
Real-time packet video transmission over wireless networks is expected to experience bursty packet losses that can cause substantial degradation to the transmitted video quality. In wireless networks, channel state information is hard to obtain in a reliable and timely manner due to the rapid change of wireless environments. However, the source motion information is always available and can be obtained easily and accurately from video sequences. Therefore, in this paper, we propose a novel cross-layer framework that exploits only the motion information inherent in video sequences and efficiently combines a packetization scheme, a cross-layer forward error correction (FEC)-based unequal error protection (UEP) scheme, an intracoding rate selection scheme as well as a novel intraframe interleaving scheme. Our objective and subjective results demonstrate that the proposed approach is very effective in dealing with the bursty packet losses occurring on wireless networks without incurring any additional implementation complexity or delay. Thus, the simplicity of our proposed system has important implications for the implementation of a practical real-time video transmission system.
Visual Attention Modeling for Stereoscopic Video: A Benchmark and Computational Model.
Fang, Yuming; Zhang, Chi; Li, Jing; Lei, Jianjun; Perreira Da Silva, Matthieu; Le Callet, Patrick
2017-10-01
In this paper, we investigate the visual attention modeling for stereoscopic video from the following two aspects. First, we build one large-scale eye tracking database as the benchmark of visual attention modeling for stereoscopic video. The database includes 47 video sequences and their corresponding eye fixation data. Second, we propose a novel computational model of visual attention for stereoscopic video based on Gestalt theory. In the proposed model, we extract the low-level features, including luminance, color, texture, and depth, from discrete cosine transform coefficients, which are used to calculate feature contrast for the spatial saliency computation. The temporal saliency is calculated by the motion contrast from the planar and depth motion features in the stereoscopic video sequences. The final saliency is estimated by fusing the spatial and temporal saliency with uncertainty weighting, which is estimated by the laws of proximity, continuity, and common fate in Gestalt theory. Experimental results show that the proposed method outperforms the state-of-the-art stereoscopic video saliency detection models on our built large-scale eye tracking database and one other database (DML-ITRACK-3D).
Digital video steganalysis exploiting collusion sensitivity
NASA Astrophysics Data System (ADS)
Budhia, Udit; Kundur, Deepa
2004-09-01
In this paper we present an effective steganalysis technique for digital video sequences based on the collusion attack. Steganalysis is the process of detecting with a high probability and low complexity the presence of covert data in multimedia. Existing algorithms for steganalysis target detecting covert information in still images. When applied directly to video sequences these approaches are suboptimal. In this paper, we present a method that overcomes this limitation by using redundant information present in the temporal domain to detect covert messages in the form of Gaussian watermarks. Our gains are achieved by exploiting the collusion attack that has recently been studied in the field of digital video watermarking, and more sophisticated pattern recognition tools. Applications of our scheme include cybersecurity and cyberforensics.
Active Voodoo Dolls: A Vision Based Input Device for Nonrigid Control.
1998-08-01
A vision based technique for nonrigid control is presented that can be used for animation and video game applications. The user grasps a soft...allowing the user to control it interactively. Our use of texture mapping hardware in tracking makes the system responsive enough for interactive animation and video game character control.
Using Video To Teach for Sociolinguistic Competence in the Foreign Language Classroom.
ERIC Educational Resources Information Center
Witten, Caryn
2000-01-01
This study worked to develop the sociolinguistic competence of college learners of first-year Spanish using input enhancement techniques that required learners to actively view video. Research shows that native speakers are more sensitive to sociolinguistic errors than to grammatical errors made by nonnative speakers. Therefore, the study…
RAPID: A random access picture digitizer, display, and memory system
NASA Technical Reports Server (NTRS)
Yakimovsky, Y.; Rayfield, M.; Eskenazi, R.
1976-01-01
RAPID is a system capable of providing convenient digital analysis of video data in real-time. It has two modes of operation. The first allows for continuous digitization of an EIA RS-170 video signal. Each frame in the video signal is digitized and written in 1/30 of a second into RAPID's internal memory. The second mode leaves the content of the internal memory independent of the current input video. In both modes of operation the image contained in the memory is used to generate an EIA RS-170 composite video output signal representing the digitized image in the memory so that it can be displayed on a monitor.
Video Kills the Lecturing Star: New Technologies and the Teaching of Meteorology.
ERIC Educational Resources Information Center
Sumner, Graham
1984-01-01
The educational potential of time-lapse video sequences and weather data obtained using a conventional microcomputer are considered in the light of recent advances in both fields. Illustrates how videos and microcomputers can be used to study clouds in meteorology classes. (RM)
NASA Astrophysics Data System (ADS)
Barnett, Barry S.; Bovik, Alan C.
1995-04-01
This paper presents a real time full motion video conferencing system based on the Visual Pattern Image Sequence Coding (VPISC) software codec. The prototype system hardware is comprised of two personal computers, two camcorders, two frame grabbers, and an Ethernet connection. The prototype system software has a simple structure. It runs under the Disk Operating System, and includes a user interface, a video I/O interface, an event driven network interface, and a free running or frame synchronous video codec that also acts as the controller for the video and network interfaces. Two video coders have been tested in this system. Simple implementations of Visual Pattern Image Coding and VPISC have both proven to support full motion video conferencing with good visual quality. Future work will concentrate on expanding this prototype to support the motion compensated version of VPISC, as well as encompassing point-to-point modem I/O and multiple network protocols. The application will be ported to multiple hardware platforms and operating systems. The motivation for developing this prototype system is to demonstrate the practicality of software based real time video codecs. Furthermore, software video codecs are not only cheaper, but are more flexible system solutions because they enable different computer platforms to exchange encoded video information without requiring on-board protocol compatible video codec hardware. Software based solutions enable true low cost video conferencing that fits the 'open systems' model of interoperability that is so important for building portable hardware and software applications.
User Input Devices’ Impact on Virtual Desktop Trainers
2010-07-01
effectiveness?” 3 Background • Literature Review – Evolution of game controllers – Use of Game controllers outside of video games – Personnel...computers verses console video games • Virtual Battlespace 2 (VBS2TM) • Sony PlayStation 3 game controller • Natural Point TrackIR 5 4 Methodology • Phases...gamers” averaged 4.6 years of experience playing video games at 2.1 hours per week – The “Gamers” averaged 10.4 years of experience playing PC Games
ERIC Educational Resources Information Center
Arslanyilmaz, Abdurrahman; Pedersen, Susan
2010-01-01
This study examines the effects of task familiarity through the use of subtitled videos on negotiation of meaning in an online task-based language learning (TBLL) environment. It explores the amount of negotiation of meaning produced by non-native speakers (NNSs) aimed at improving input comprehension to enhance second language acquisition. Ten…
47 CFR 73.687 - Transmission system requirements.
Code of Federal Regulations, 2014 CFR
2014-10-01
... modulating signal to the transmitter input terminals in place of the normal composite television video signal... taken by the use of a video sweep generator and without the use of pedestal synchronizing pulses. The d..., of zero microseconds up to a frequency of 3.0 MHz; and then linearly decreasing to 4.18 MHz so as to...
47 CFR 73.687 - Transmission system requirements.
Code of Federal Regulations, 2010 CFR
2010-10-01
... modulating signal to the transmitter input terminals in place of the normal composite television video signal... taken by the use of a video sweep generator and without the use of pedestal synchronizing pulses. The d..., of zero microseconds up to a frequency of 3.0 MHz; and then linearly decreasing to 4.18 MHz so as to...
47 CFR 73.687 - Transmission system requirements.
Code of Federal Regulations, 2013 CFR
2013-10-01
... modulating signal to the transmitter input terminals in place of the normal composite television video signal... taken by the use of a video sweep generator and without the use of pedestal synchronizing pulses. The d..., of zero microseconds up to a frequency of 3.0 MHz; and then linearly decreasing to 4.18 MHz so as to...
47 CFR 73.687 - Transmission system requirements.
Code of Federal Regulations, 2011 CFR
2011-10-01
... modulating signal to the transmitter input terminals in place of the normal composite television video signal... taken by the use of a video sweep generator and without the use of pedestal synchronizing pulses. The d..., of zero microseconds up to a frequency of 3.0 MHz; and then linearly decreasing to 4.18 MHz so as to...
47 CFR 73.687 - Transmission system requirements.
Code of Federal Regulations, 2012 CFR
2012-10-01
... modulating signal to the transmitter input terminals in place of the normal composite television video signal... taken by the use of a video sweep generator and without the use of pedestal synchronizing pulses. The d..., of zero microseconds up to a frequency of 3.0 MHz; and then linearly decreasing to 4.18 MHz so as to...
Learning-Based Just-Noticeable-Quantization-Distortion Modeling for Perceptual Video Coding.
Ki, Sehwan; Bae, Sung-Ho; Kim, Munchurl; Ko, Hyunsuk
2018-07-01
Conventional predictive video coding-based approaches are reaching the limit of their potential coding efficiency improvements, because of severely increasing computation complexity. As an alternative approach, perceptual video coding (PVC) has attempted to achieve high coding efficiency by eliminating perceptual redundancy, using just-noticeable-distortion (JND) directed PVC. The previous JNDs were modeled by adding white Gaussian noise or specific signal patterns into the original images, which were not appropriate in finding JND thresholds due to distortion with energy reduction. In this paper, we present a novel discrete cosine transform-based energy-reduced JND model, called ERJND, that is more suitable for JND-based PVC schemes. Then, the proposed ERJND model is extended to two learning-based just-noticeable-quantization-distortion (JNQD) models as preprocessing that can be applied for perceptual video coding. The two JNQD models can automatically adjust JND levels based on given quantization step sizes. One of the two JNQD models, called LR-JNQD, is based on linear regression and determines the model parameter for JNQD based on extracted handcraft features. The other JNQD model is based on a convolution neural network (CNN), called CNN-JNQD. To our best knowledge, our paper is the first approach to automatically adjust JND levels according to quantization step sizes for preprocessing the input to video encoders. In experiments, both the LR-JNQD and CNN-JNQD models were applied to high efficiency video coding (HEVC) and yielded maximum (average) bitrate reductions of 38.51% (10.38%) and 67.88% (24.91%), respectively, with little subjective video quality degradation, compared with the input without preprocessing applied.
Subjective quality evaluation of low-bit-rate video
NASA Astrophysics Data System (ADS)
Masry, Mark; Hemami, Sheila S.; Osberger, Wilfried M.; Rohaly, Ann M.
2001-06-01
A subjective quality evaluation was performed to qualify viewer responses to visual defects that appear in low bit rate video at full and reduced frame rates. The stimuli were eight sequences compressed by three motion compensated encoders - Sorenson Video, H.263+ and a Wavelet based coder - operating at five bit/frame rate combinations. The stimulus sequences exhibited obvious coding artifacts whose nature differed across the three coders. The subjective evaluation was performed using the Single Stimulus Continuous Quality Evaluation method of ITU-R Rec. BT.500-8. Viewers watched concatenated coded test sequences and continuously registered the perceived quality using a slider device. Data from 19 viewers were collected. An analysis of their responses to the presence of various artifacts across the range of possible coding conditions and content is presented. The effects of blockiness and blurriness on perceived quality are examined. The effects of changes in frame rate on perceived quality are found to be related to the nature of the motion in the sequence.
NASA Astrophysics Data System (ADS)
Rogotis, Savvas; Palaskas, Christos; Ioannidis, Dimosthenis; Tzovaras, Dimitrios; Likothanassis, Spiros
2015-11-01
This work aims to present an extended framework for automatically recognizing suspicious activities in outdoor perimeter surveilling systems based on infrared video processing. By combining size-, speed-, and appearance-based features, like the local phase quantization and the histograms of oriented gradients, actions of small duration are recognized and used as input, along with spatial information, for modeling target activities using the theory of hidden conditional random fields (HCRFs). HCRFs are used to classify an observation sequence into the most appropriate activity label class, thus discriminating high-risk activities like trespassing from zero risk activities, such as loitering outside the perimeter. The effectiveness of this approach is demonstrated with experimental results in various scenarios that represent suspicious activities in perimeter surveillance systems.
NASA Astrophysics Data System (ADS)
Hashimoto, Ryoji; Matsumura, Tomoya; Nozato, Yoshihiro; Watanabe, Kenji; Onoye, Takao
A multi-agent object attention system is proposed, which is based on a biologically inspired attractor selection model. Object attention is facilitated by using a video sequence and a depth map obtained through a compound-eye image sensor TOMBO. Robustness of the multi-agent system over environmental changes is enhanced by utilizing the biological model of adaptive response by attractor selection. To implement the proposed system, an efficient VLSI architecture is employed that reduces the enormous computational costs and memory accesses required for depth map processing and the multi-agent attractor selection process. According to the FPGA implementation result of the proposed object attention system, which is accomplished by using 7,063 slices, 640×512 pixel input images can be processed in real-time with three agents at a rate of 9 fps at 48 MHz operation.
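The throughput figures quoted above imply a per-pixel clock budget, which the short calculation below works out. It uses only the numbers in the abstract (48 MHz, 640×512 pixels, 9 fps); the function name is hypothetical, and the result is an average budget, not a statement about how the reported architecture actually schedules its work.

```python
def cycles_per_pixel(clock_hz, width, height, fps):
    """Average number of clock cycles available per input pixel."""
    return clock_hz / (width * height * fps)

# Figures taken from the abstract: 48 MHz clock, 640x512 frames, 9 frames per second.
budget = cycles_per_pixel(48_000_000, 640, 512, 9)
print(f"{budget:.2f} clock cycles per input pixel")
```

The budget comes to roughly 16 cycles per pixel, shared among depth map processing and the three agents.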
Design of video interface conversion system based on FPGA
NASA Astrophysics Data System (ADS)
Zhao, Heng; Wang, Xiang-jun
2014-11-01
This paper presents an FPGA-based video interface conversion system that enables the inter-conversion between digital and analog video. Cyclone IV series EP4CE22F17C chip from Altera Corporation is used as the main video processing chip, and single-chip is used as the information interaction control unit between FPGA and PC. The system is able to encode/decode messages from the PC. Technologies including video decoding/encoding circuits, bus communication protocol, data stream de-interleaving and de-interlacing, color space conversion and the Camera Link timing generator module of FPGA are introduced. The system converts Composite Video Broadcast Signal (CVBS) from the CCD camera into Low Voltage Differential Signaling (LVDS), which will be collected by the video processing unit with Camera Link interface. The processed video signals will then be input to the system output board and displayed on the monitor. The current experiment shows that it can achieve high-quality video conversion with minimum board size.
Heterogeneity image patch index and its application to consumer video summarization.
Dang, Chinh T; Radha, Hayder
2014-06-01
Automatic video summarization is indispensable for fast browsing and efficient management of large video libraries. In this paper, we introduce an image feature that we refer to as heterogeneity image patch (HIP) index. The proposed HIP index provides a new entropy-based measure of the heterogeneity of patches within any picture. By evaluating this index for every frame in a video sequence, we generate a HIP curve for that sequence. We exploit the HIP curve in solving two categories of video summarization applications: key frame extraction and dynamic video skimming. Under the key frame extraction framework, a set of candidate key frames is selected from abundant video frames based on the HIP curve. Then, a proposed patch-based image dissimilarity measure is used to create an affinity matrix of these candidates. Finally, a set of key frames is extracted from the affinity matrix using a min–max based algorithm. Under video skimming, we propose a method to measure the distance between a video and its skimmed representation. The video skimming problem is then mapped into an optimization framework and solved by minimizing a HIP-based distance for a set of extracted excerpts. The HIP framework is pixel-based and does not require semantic information or complex camera motion estimation. Our simulation results are based on experiments performed on consumer videos and are compared with state-of-the-art methods. It is shown that the HIP approach outperforms other leading methods, while maintaining low complexity.
Robust video super-resolution with registration efficiency adaptation
NASA Astrophysics Data System (ADS)
Zhang, Xinfeng; Xiong, Ruiqin; Ma, Siwei; Zhang, Li; Gao, Wen
2010-07-01
Super-Resolution (SR) is a technique to construct a high-resolution (HR) frame by fusing a group of low-resolution (LR) frames describing the same scene. The effectiveness of the conventional super-resolution techniques, when applied to video sequences, strongly relies on the efficiency of motion alignment achieved by image registration. Unfortunately, such efficiency is limited by the motion complexity in the video and the capability of the adopted motion model. In image regions with severe registration errors, annoying artifacts usually appear in the produced super-resolution video. This paper proposes a robust video super-resolution technique that adapts itself to the spatially-varying registration efficiency. The reliability of each reference pixel is measured by the corresponding registration error and incorporated into the optimization objective function of SR reconstruction. This makes the SR reconstruction highly immune to the registration errors, as outliers with higher registration errors are assigned lower weights in the objective function. In particular, we carefully design a mechanism to assign weights according to registration errors. The proposed super-resolution scheme has been tested with various video sequences and experimental results clearly demonstrate the effectiveness of the proposed method.
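The idea of down-weighting reference pixels with large registration errors can be illustrated with a toy weighted fusion of one pixel. This is only a sketch under stated assumptions: the abstract does not give the weighting mechanism, so the Gaussian weight, the `sigma` parameter and both function names are hypothetical, and the fusion shown is the closed-form minimizer of a simple weighted least-squares term, not the paper's full SR objective.

```python
import math

def reliability_weight(error, sigma=10.0):
    """Assumed weighting: larger registration error gives a smaller weight."""
    return math.exp(-(error ** 2) / (2.0 * sigma ** 2))

def fuse_pixel(values, errors, sigma=10.0):
    """Weighted average minimizing sum_i w_i * (x - v_i)^2 over x."""
    weights = [reliability_weight(e, sigma) for e in errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Two well-registered candidates and one outlier with a large registration error.
fused = fuse_pixel([100.0, 100.0, 200.0], [0.0, 0.0, 50.0])
print(round(fused, 3))
```

With equal errors the function reduces to a plain average; with one badly registered candidate, that candidate contributes almost nothing to the fused value.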
Optical neural network system for pose determination of spinning satellites
NASA Technical Reports Server (NTRS)
Lee, Andrew; Casasent, David
1990-01-01
An optical neural network architecture and algorithm based on a Hopfield optimization network are presented for multitarget tracking. This tracker utilizes a neuron for every possible target track, and a quadratic energy function of neural activities which is minimized using gradient descent neural evolution. The neural net tracker is demonstrated as part of a system for determining position and orientation (pose) of spinning satellites with respect to a robotic spacecraft. The input to the system is time sequence video from a single camera. Novelty detection and filtering are utilized to locate and segment novel regions from the input images. The neural net multitarget tracker determines the correspondences (or tracks) of the novel regions as a function of time, and hence the paths of object (satellite) parts. The path traced out by a given part or region is approximately elliptical in image space, and the position, shape and orientation of the ellipse are functions of the satellite geometry and its pose. Having a geometric model of the satellite, and the elliptical path of a part in image space, the three-dimensional pose of the satellite is determined. Digital simulation results using this algorithm are presented for various satellite poses and lighting conditions.
NASA Astrophysics Data System (ADS)
Boumehrez, Farouk; Brai, Radhia; Doghmane, Noureddine; Mansouri, Khaled
2018-01-01
Recently, video streaming has attracted much attention and interest due to its capability to process and transmit large data. We propose a quality of experience (QoE) model relying on a high efficiency video coding (HEVC) encoder adaptation scheme, in turn based on multiple description coding (MDC) for video streaming. The main contributions of the paper are (1) a performance evaluation of the new and emerging video coding standard HEVC/H.265, which is based on the variation of quantization parameter (QP) values depending on different video contents to deduce their influence on the sequence to be transmitted, (2) an investigation of QoE support for multimedia applications in wireless networks, in which we inspect the packet loss impact on the QoE of transmitted video sequences, and (3) an HEVC encoder parameter adaptation scheme based on MDC, modeled with the encoder parameter and objective QoE model. A comparative study revealed that the proposed MDC approach is effective for improving the transmission with a peak signal-to-noise ratio (PSNR) gain of about 2 to 3 dB. Results show that a good choice of QP value can compensate for transmission channel effects and improve received video quality, although HEVC/H.265 is also sensitive to packet loss. The obtained results show the efficiency of our proposed method in terms of PSNR and mean-opinion-score.
Dual-Layer Video Encryption using RSA Algorithm
NASA Astrophysics Data System (ADS)
Chadha, Aman; Mallik, Sushmit; Chadha, Ankit; Johar, Ravdeep; Mani Roja, M.
2015-04-01
This paper proposes a video encryption algorithm using RSA and Pseudo Noise (PN) sequence, aimed at applications requiring sensitive video information transfers. The system is primarily designed to work with files encoded using the Audio Video Interleaved (AVI) codec, although it can be easily ported for use with Moving Picture Experts Group (MPEG) encoded files. The audio and video components of the source separately undergo two layers of encryption to ensure a reasonable level of security. Encryption of the video component involves applying the RSA algorithm followed by the PN-based encryption. Similarly, the audio component is first encrypted using PN and further subjected to encryption using the Discrete Cosine Transform. By combining these techniques, an efficient system is put forth that is invulnerable to security breaches and attacks, with favorable values of parameters such as encryption/decryption speed, encryption/decryption ratio and visual degradation. For applications requiring encryption of sensitive data wherein stringent security requirements are of prime concern, the system is found to yield negligible similarities in visual perception between the original and the encrypted video sequence. For applications wherein visual similarity is not of major concern, we limit the encryption task to a single level of encryption which is accomplished by using RSA, thereby quickening the encryption process. Although some similarity between the original and encrypted video is observed in this case, it is not enough to comprehend the happenings in the video.
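A PN-based layer of the kind mentioned above is commonly built by XORing the data with a pseudo-noise bit stream. The sketch below shows that general technique only: the abstract does not say how the PN sequence is generated or applied, so the 16-bit linear feedback shift register, its tap positions, the seed and both function names are assumptions for illustration, and the RSA and DCT layers are not shown. It is not a secure cipher.

```python
def pn_bits(seed, count):
    """Yield `count` bits from a 16-bit Fibonacci LFSR (assumed PN generator)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero, or the register stays at zero")
    for _ in range(count):
        yield state & 1
        # Feedback taps at bit positions 0, 2, 3 and 5 (polynomial x^16 + x^14 + x^13 + x^11 + 1).
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)

def pn_xor(data, seed):
    """XOR every byte of `data` with 8 PN bits; applying it twice restores the data."""
    bits = pn_bits(seed, 8 * len(data))
    out = bytearray()
    for byte in data:
        key = 0
        for _ in range(8):
            key = (key << 1) | next(bits)
        out.append(byte ^ key)
    return bytes(out)

scrambled = pn_xor(b"frame", 0xACE1)   # hypothetical payload and seed
restored = pn_xor(scrambled, 0xACE1)
print(scrambled.hex(), restored)
```

Because XOR is its own inverse, the same function with the same seed both scrambles and restores the data, which is why the seed acts as the shared secret in this kind of layer.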
Video error concealment using block matching and frequency selective extrapolation algorithms
NASA Astrophysics Data System (ADS)
P. K., Rajani; Khaparde, Arti
2017-06-01
Error Concealment (EC) is a technique at the decoder side to hide the transmission errors. It is done by analyzing the spatial or temporal information from available video frames. It is very important to recover distorted video because video is used for various applications such as video-telephone, video-conference, TV, DVD, internet video streaming, video games etc. Retransmission-based and resilient-based methods are also used for error removal, but these methods add delay and redundant data, so error concealment is the best option for error hiding. In this paper, the Block Matching error concealment algorithm is compared with the Frequency Selective Extrapolation algorithm. Both take video frames with manually introduced errors as input. The parameters used for objective quality measurement were PSNR (Peak Signal to Noise Ratio) and SSIM (Structural Similarity Index). The original video frames, along with the error video frames, are compared with the output of both error concealment algorithms. According to simulation results, Frequency Selective Extrapolation shows better quality measures, with 48% improved PSNR and 94% increased SSIM compared with the Block Matching algorithm.
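PSNR, the first of the two objective measures used above, follows directly from the mean squared error between a reference frame and a processed frame. The sketch below uses the standard definition for 8-bit samples; the function names and the flat lists standing in for frames are hypothetical, and SSIM is not shown.

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized sample sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)

def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the frames are identical."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 2x2 frames flattened to lists; every sample is off by one level.
original = [100, 100, 100, 100]
concealed = [101, 101, 101, 101]
print(f"{psnr(original, concealed):.2f} dB")
```

Because PSNR is logarithmic, a percentage improvement such as the one quoted in the abstract depends on the baseline value, so absolute dB figures are usually reported alongside it.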
BNU-LSVED: a multimodal spontaneous expression database in educational environment
NASA Astrophysics Data System (ADS)
Sun, Bo; Wei, Qinglan; He, Jun; Yu, Lejun; Zhu, Xiaoming
2016-09-01
In the field of pedagogy or educational psychology, emotions are treated as very important factors, which are closely associated with cognitive processes. Hence, it is meaningful for teachers to analyze students' emotions in classrooms, thus adjusting their teaching activities and improving students' individual development. To provide a benchmark for different expression recognition algorithms, obtaining a large collection of training and test data in a classroom environment has become an acute problem that needs to be resolved. In this paper, we present a multimodal spontaneous database in a real learning environment. To collect the data, students watched seven kinds of teaching videos and were simultaneously filmed by a camera. Trained coders assigned one of five learning expression labels to each image sequence extracted from the captured videos. This subset consists of 554 multimodal spontaneous expression image sequences (22,160 frames) recorded in real classrooms. There are four main advantages in this database. 1) Because the data were recorded in a real classroom environment, the viewer's distance from the camera and the lighting vary considerably between image sequences. 2) All the data presented are natural spontaneous responses to teaching videos. 3) The multimodal database also contains nonverbal behavior including eye movement, head posture and gestures to infer a student's affective state during the courses. 4) In the video sequences, there are different kinds of temporal activation patterns. In addition, we have demonstrated through Cronbach's alpha that the labels for the image sequences are highly reliable.
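Cronbach's alpha, the reliability measure cited above, can be computed from the variance of each coder's scores and the variance of the per-sequence totals. The sketch below uses the standard formula; how the authors encoded the five labels numerically and arranged coders and sequences is not stated, so the `scores[coder][sequence]` layout, the function name and the sample values are assumptions.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores[j][i] is the numeric score given by coder j to image sequence i."""
    k = len(scores)
    if k < 2:
        raise ValueError("at least two coders are required")
    item_variance = sum(variance(row) for row in scores)
    totals = [sum(column) for column in zip(*scores)]
    return k / (k - 1) * (1.0 - item_variance / variance(totals))

# Hypothetical scores from two coders who agree perfectly on four sequences.
alpha = cronbach_alpha([[1, 2, 3, 4], [1, 2, 3, 4]])
print(alpha)
```

Perfect agreement gives an alpha of 1, and values fall as the coders' scores diverge.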
NASA Astrophysics Data System (ADS)
Walton, James S.; Hodgson, Peter; Hallamasek, Karen; Palmer, Jake
2003-07-01
4DVideo is creating a general purpose capability for capturing and analyzing kinematic data from video sequences in near real-time. The core element of this capability is a software package designed for the PC platform. The software ("4DCapture") is designed to capture and manipulate customized AVI files that can contain a variety of synchronized data streams -- including audio, video, centroid locations -- and signals acquired from more traditional sources (such as accelerometers and strain gauges). The code includes simultaneous capture or playback of multiple video streams, and linear editing of the images (together with the ancillary data embedded in the files). Corresponding landmarks seen from two or more views are matched automatically, and photogrammetric algorithms permit multiple landmarks to be tracked in two and three dimensions -- with or without lens calibrations. Trajectory data can be processed within the main application or they can be exported to a spreadsheet where they can be processed or passed along to a more sophisticated, stand-alone, data analysis application. Previous attempts to develop such applications for high-speed imaging have been limited in their scope, or by the complexity of the application itself. 4DVideo has devised a friendly ("FlowStack") user interface that assists the end-user to capture and treat image sequences in a natural progression. 4DCapture employs the AVI 2.0 standard and DirectX technology, which effectively eliminates the file size limitations found in older applications. In early tests, 4DVideo has streamed three RS-170 video sources to disk for more than an hour without loss of data. At this time, the software can acquire video sequences in three ways: (1) directly, from up to three hard-wired cameras supplying RS-170 (monochrome) signals; (2) directly, from a single camera or video recorder supplying an NTSC (color) signal; and (3) by importing existing video streams in the AVI 1.0 or AVI 2.0 formats. 
The latter is particularly useful for high-speed applications where the raw images are often captured and stored by the camera before being downloaded. Provision has been made to synchronize data acquired from any combination of these video sources using audio and visual "tags." Additional "front-ends," designed for digital cameras, are anticipated.
Weight distributions for turbo codes using random and nonrandom permutations
NASA Technical Reports Server (NTRS)
Dolinar, S.; Divsalar, D.
1995-01-01
This article takes a preliminary look at the weight distributions achievable for turbo codes using random, nonrandom, and semirandom permutations. Due to the recursiveness of the encoders, it is important to distinguish between self-terminating and non-self-terminating input sequences. The non-self-terminating sequences have little effect on decoder performance, because they accumulate high encoded weight until they are artificially terminated at the end of the block. From probabilistic arguments based on selecting the permutations randomly, it is concluded that the self-terminating weight-2 data sequences are the most important consideration in the design of constituent codes; higher-weight self-terminating sequences have successively decreasing importance. Also, increasing the number of codes and, correspondingly, the number of permutations makes it more and more likely that the bad input sequences will be broken up by one or more of the permuters. It is possible to design nonrandom permutations that ensure that the minimum distance due to weight-2 input sequences grows roughly as the square root of (2N), where N is the block length. However, these nonrandom permutations amplify the bad effects of higher-weight inputs, and as a result they are inferior in performance to randomly selected permutations. But there are 'semirandom' permutations that perform nearly as well as the designed nonrandom permutations with respect to weight-2 input sequences and are not as susceptible to being foiled by higher-weight inputs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lupinetti, F.
1988-01-01
This paper outlines a video communication system capable of non-line-of-sight (NLOS), secure, low-probability of intercept (LPI), antijam, real time transmission and reception of video information in a tactical environment. An introduction to a class of ternary PN sequences is presented to familiarize the reader with yet another avenue for spreading and despreading baseband information. The use of the high frequency (HF) band (1.5 to 30 MHz) for real time video transmission is suggested to allow NLOS communication. The spreading of the baseband information by means of multiple nontrivially different ternary pseudonoise (PN) sequences is used in order to assure encryption of the signal, enhanced security, a good degree of LPI, and good antijam features. 18 refs., 3 figs., 1 tab.
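The spreading and despreading idea mentioned in the abstract above can be sketched in a few lines. This is a generic direct-sequence illustration, not the system described in the paper: the functions `spread` and `despread` are hypothetical, and the chip sequence `EXAMPLE_CODE` is an arbitrary ternary pattern chosen for the example, not an actual PN sequence from the source.

```python
# Arbitrary ternary chip pattern with values in {-1, 0, +1}.
# Assumption for illustration only; it is NOT a real PN sequence.
EXAMPLE_CODE = [1, 0, -1, 1, 1, 0, -1]


def spread(bits, code):
    """Map each bit to a +1/-1 symbol and multiply it by every chip."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips


def despread(chips, code):
    """Correlate each chip block with the code and decide by the sign."""
    n = len(code)
    bits = []
    for start in range(0, len(chips), n):
        block = chips[start:start + n]
        correlation = sum(x * c for x, c in zip(block, code))
        bits.append(1 if correlation > 0 else 0)
    return bits
```

A receiver that does not know the chip sequence cannot form the correlation, which is the intuition behind the security and LPI properties the abstract claims.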
Real-time video quality monitoring
NASA Astrophysics Data System (ADS)
Liu, Tao; Narvekar, Niranjan; Wang, Beibei; Ding, Ran; Zou, Dekun; Cash, Glenn; Bhagavathy, Sitaram; Bloom, Jeffrey
2011-12-01
The ITU-T Recommendation G.1070 is a standardized opinion model for video telephony applications that uses video bitrate, frame rate, and packet-loss rate to measure the video quality. However, this model was originally designed as an offline quality planning tool. It cannot be directly used for quality monitoring since the above three input parameters are not readily available within a network or at the decoder. There is also considerable room for improving the performance of this quality metric. In this article, we present a real-time video quality monitoring solution based on this Recommendation. We first propose a scheme to efficiently estimate the three parameters from video bitstreams, so that it can be used as a real-time video quality monitoring tool. Furthermore, an enhanced algorithm based on the G.1070 model that provides more accurate quality prediction is proposed. Finally, to use this metric in real-world applications, we present an example emerging application of real-time quality measurement to the management of transmitted videos, especially those delivered to mobile devices.
Video Denoising via Dynamic Video Layering
NASA Astrophysics Data System (ADS)
Guo, Han; Vaswani, Namrata
2018-07-01
Video denoising refers to the problem of removing "noise" from a video sequence. Here the term "noise" is used in a broad sense to refer to any corruption or outlier or interference that is not the quantity of interest. In this work, we develop a novel approach to video denoising that is based on the idea that many noisy or corrupted videos can be split into three parts - the "low-rank layer", the "sparse layer", and a small residual (which is small and bounded). We show, using extensive experiments, that our denoising approach outperforms the state-of-the-art denoising algorithms.
Compression of computer generated phase-shifting hologram sequence using AVC and HEVC
NASA Astrophysics Data System (ADS)
Xing, Yafei; Pesquet-Popescu, Béatrice; Dufaux, Frederic
2013-09-01
With the capability of achieving twice the compression ratio of Advanced Video Coding (AVC) with similar reconstruction quality, High Efficiency Video Coding (HEVC) is expected to become the new leading technique of video coding. In order to reduce the storage and transmission burden of digital holograms, in this paper we propose to use HEVC for compressing the phase-shifting digital hologram sequences (PSDHS). By simulating phase-shifting digital holography (PSDH) interferometry, interference patterns between illuminated three-dimensional (3D) virtual objects and the stepwise phase changed reference wave are generated as digital holograms. The hologram sequences are obtained by the movement of the virtual objects and compressed by AVC and HEVC. The experimental results show that AVC and HEVC are efficient to compress PSDHS, with HEVC giving better performance. Good compression rate and reconstruction quality can be obtained with bitrate above 15000 kbps.
Online and unsupervised face recognition for continuous video stream
NASA Astrophysics Data System (ADS)
Huo, Hongwen; Feng, Jufu
2009-10-01
We present a novel online face recognition approach for video stream in this paper. Our method includes two stages: pre-training and online training. In the pre-training phase, our method observes interactions, collects batches of input data, and attempts to estimate their distributions (Box-Cox transformation is adopted here to normalize rough estimates). In the online training phase, our method incrementally improves classifiers' knowledge of the face space and updates it continuously with incremental eigenspace analysis. The performance achieved by our method shows its great potential in video stream processing.
A video, text, and speech-driven realistic 3-d virtual head for human-machine interface.
Yu, Jun; Wang, Zeng-Fu
2015-05-01
A multiple inputs-driven realistic facial animation system based on 3-D virtual head for human-machine interface is proposed. The system can be driven independently by video, text, and speech, thus can interact with humans through diverse interfaces. The combination of parameterized model and muscular model is used to obtain a tradeoff between computational efficiency and high realism of 3-D facial animation. The online appearance model is used to track 3-D facial motion from video in the framework of particle filtering, and multiple measurements, i.e., pixel color value of input image and Gabor wavelet coefficient of illumination ratio image, are infused to reduce the influence of lighting and person dependence for the construction of online appearance model. The tri-phone model is used to reduce the computational consumption of visual co-articulation in speech synchronized viseme synthesis without sacrificing any performance. The objective and subjective experiments show that the system is suitable for human-machine interaction.
Pre-processing SAR image stream to facilitate compression for transport on bandwidth-limited-link
Rush, Bobby G.; Riley, Robert
2015-09-29
Pre-processing is applied to a raw VideoSAR (or similar near-video rate) product to transform the image frame sequence into a product that resembles more closely the type of product for which conventional video codecs are designed, while sufficiently maintaining utility and visual quality of the product delivered by the codec.
Test Input Generation for Red-Black Trees using Abstraction
NASA Technical Reports Server (NTRS)
Visser, Willem; Pasareanu, Corina S.; Pelanek, Radek
2005-01-01
We consider the problem of test input generation for code that manipulates complex data structures. Test inputs are sequences of method calls from the data structure interface. We describe test input generation techniques that rely on state matching to avoid generation of redundant tests. Exhaustive techniques use explicit state model checking to explore all the possible test sequences up to predefined input sizes. Lossy techniques rely on abstraction mappings to compute and store abstract versions of the concrete states; they explore under-approximations of all the possible test sequences. We have implemented the techniques on top of the Java PathFinder model checker and we evaluate them using a Java implementation of red-black trees.
Real-time image sequence segmentation using curve evolution
NASA Astrophysics Data System (ADS)
Zhang, Jun; Liu, Weisong
2001-04-01
In this paper, we describe a novel approach to image sequence segmentation and its real-time implementation. This approach uses the 3D structure tensor to produce a more robust frame difference signal and uses curve evolution to extract whole objects. Our algorithm is implemented on a standard PC running the Windows operating system with video capture from a USB camera that is a standard Windows video capture device. Using the Windows standard video I/O functionalities, our segmentation software is highly portable and easy to maintain and upgrade. In its current implementation on a Pentium 400, the system can perform segmentation at 5 frames/sec with a frame resolution of 160 by 120.
Video-to-film color-image recorder.
NASA Technical Reports Server (NTRS)
Montuori, J. S.; Carnes, W. R.; Shim, I. H.
1973-01-01
A precision video-to-film recorder for use in image data processing systems, being developed for NASA, will convert three video input signals (red, blue, green) into a single full-color light beam for image recording on color film. Argon ion and krypton lasers are used to produce three spectral lines which are independently modulated by the appropriate video signals, combined into a single full-color light beam, and swept over the recording film in a raster format for image recording. A rotating multi-faceted spinner mounted on a translating carriage generates the raster, and an annotation head is used to record up to 512 alphanumeric characters in a designated area outside the image area.
Video Salient Object Detection via Fully Convolutional Networks.
Wang, Wenguan; Shen, Jianbing; Shao, Ling
This paper proposes a deep learning model to efficiently detect salient regions in videos. It addresses two important issues: 1) deep video saliency model training with the absence of sufficiently large and pixel-wise annotated video data and 2) fast video saliency training and detection. The proposed deep video saliency network consists of two modules, for capturing the spatial and temporal saliency information, respectively. The dynamic saliency model, explicitly incorporating saliency estimates from the static saliency model, directly produces spatiotemporal saliency inference without time-consuming optical flow computation. We further propose a novel data augmentation technique that simulates video training data from existing annotated image data sets, which enables our network to learn diverse saliency information and prevents overfitting with the limited number of training videos. Leveraging our synthetic video data (150K video sequences) and real videos, our deep video saliency model successfully learns both spatial and temporal saliency cues, thus producing accurate spatiotemporal saliency estimate. We advance the state-of-the-art on the densely annotated video segmentation data set (MAE of .06) and the Freiburg-Berkeley Motion Segmentation data set (MAE of .07), and do so with much improved speed (2 fps with all steps).
Mapping wide row crops with video sequences acquired from a tractor moving at treatment speed.
Sainz-Costa, Nadir; Ribeiro, Angela; Burgos-Artizzu, Xavier P; Guijarro, María; Pajares, Gonzalo
2011-01-01
This paper presents a mapping method for wide row crop fields. The resulting map shows the crop rows and weeds present in the inter-row spacing. Because field videos are acquired with a camera mounted on top of an agricultural vehicle, a method for image sequence stabilization was needed and consequently designed and developed. The proposed stabilization method uses the centers of some crop rows in the image sequence as features to be tracked, which compensates for the lateral movement (sway) of the camera and leaves the pitch unchanged. A region of interest is selected using the tracked features, and an inverse perspective technique transforms the selected region into a bird's-eye view that is centered on the image and that enables map generation. The algorithm developed has been tested on several video sequences of different fields recorded at different times and under different lighting conditions, with good initial results. Indeed, lateral displacements of up to 66% of the inter-row spacing were suppressed through the stabilization process, and crop rows in the resulting maps appear straight.
Extending Talk on a Prescribed Discussion Topic in a Learner-Native Speaker eTandem Learning Task
ERIC Educational Resources Information Center
Black, Emily
2017-01-01
Opportunities for language learners to access authentic input and engage in consequential interactions with native speakers of their target language abound in this era of computer mediated communication. Synchronous audio/video calling software represents one opportunity to access such input and address the challenges of developing pragmatic and…
Secure and Efficient Reactive Video Surveillance for Patient Monitoring.
Braeken, An; Porambage, Pawani; Gurtov, Andrei; Ylianttila, Mika
2016-01-02
Video surveillance is widely deployed for many kinds of monitoring applications in healthcare and assisted living systems. Security and privacy are two promising factors that align the quality and validity of video surveillance systems with the caliber of patient monitoring applications. In this paper, we propose a symmetric key-based security framework for the reactive video surveillance of patients based on the inputs coming from data measured by a wireless body area network attached to the human body. Only authenticated patients are able to activate the video cameras, whereas the patient and authorized people can consult the video data. User and location privacy are at each moment guaranteed for the patient. A tradeoff between security and quality of service is defined in order to ensure that the surveillance system gets activated even in emergency situations. In addition, the solution includes resistance against tampering with the device on the patient's side.
Secure and Efficient Reactive Video Surveillance for Patient Monitoring
Braeken, An; Porambage, Pawani; Gurtov, Andrei; Ylianttila, Mika
2016-01-01
Video surveillance is widely deployed for many kinds of monitoring applications in healthcare and assisted living systems. Security and privacy are two promising factors that align the quality and validity of video surveillance systems with the caliber of patient monitoring applications. In this paper, we propose a symmetric key-based security framework for the reactive video surveillance of patients based on the inputs coming from data measured by a wireless body area network attached to the human body. Only authenticated patients are able to activate the video cameras, whereas the patient and authorized people can consult the video data. User and location privacy are at each moment guaranteed for the patient. A tradeoff between security and quality of service is defined in order to ensure that the surveillance system gets activated even in emergency situations. In addition, the solution includes resistance against tampering with the device on the patient’s side. PMID:26729130
NASA Astrophysics Data System (ADS)
Beigi, Parmida; Salcudean, Septimiu E.; Rohling, Robert; Ng, Gary C.
2016-03-01
This paper presents an automatic localization method for a standard hand-held needle in ultrasound based on temporal motion analysis of spatially decomposed data. Subtle displacement arising from tremor motion has a periodic pattern which is usually imperceptible in the intensity image but may convey information in the phase image. Our method aims to detect such periodic motion of a hand-held needle and distinguish it from intrinsic tissue motion, using a technique inspired by video magnification. Complex steerable pyramids allow specific design of the wavelets' orientations according to the insertion angle as well as the measurement of the local phase. We therefore use steerable pairs of even and odd Gabor wavelets to decompose the ultrasound B-mode sequence into various spatial frequency bands. Variations of the local phase measurements in the spatially decomposed input data are then temporally analyzed using a finite impulse response bandpass filter to detect regions with a tremor motion pattern. Results obtained from different pyramid levels are then combined and thresholded to generate the binary mask input for the Hough transform, which determines an estimate of the direction angle and discards some of the outliers. Polynomial fitting is used at the final stage to remove any remaining outliers and improve the trajectory detection. The detected needle is finally added back to the input sequence as an overlay of a cloud of points. We demonstrate the efficiency of our approach to detect the needle using subtle tremor motion in an agar phantom and in-vivo porcine cases where intrinsic motion is also present. The localization accuracy was calculated by comparing to expert manual segmentation, and is reported as (mean, standard deviation and root-mean-square error) of (0.93°, 1.26° and 0.87°) and (1.53 mm, 1.02 mm and 1.82 mm) for the trajectory and the tip, respectively.
A design of real time image capturing and processing system using Texas Instrument's processor
NASA Astrophysics Data System (ADS)
Wee, Toon-Joo; Chaisorn, Lekha; Rahardja, Susanto; Gan, Woon-Seng
2007-09-01
In this work, we developed and implemented an image capturing and processing system equipped with the capability of capturing images from an input video in real time. The input video can be a video from a PC, video camcorder or DVD player. We developed two modes of operation in the system. In the first mode, an input image from the PC is processed on the processing board (development platform with a digital signal processor) and is displayed on the PC. In the second mode, the current captured image from the video camcorder (or from DVD player) is processed on the board but is displayed on the LCD monitor. The major difference between our system and other existing conventional systems is that image-processing functions are performed on the board instead of the PC (so that the functions can be used for further developments on the board). The user can control the operations of the board through the Graphical User Interface (GUI) provided on the PC. In order to have a smooth image data transfer between the PC and the board, we employed Real Time Data Transfer (RTDX™) technology to create a link between them. For image processing functions, we developed three main groups of functions: (1) Point Processing; (2) Filtering; and (3) 'Others'. Point Processing includes rotation, negation and mirroring. The Filtering category provides median, adaptive, smooth and sharpen filtering in the time domain. In the 'Others' category, auto-contrast adjustment, edge detection, segmentation and sepia color are provided; these functions either add effects to the image or enhance the image. We have developed and implemented our system using the C/C# programming languages on the TMS320DM642 (or DM642) board from Texas Instruments (TI). The system was showcased at the College of Engineering (CoE) exhibition 2006 at Nanyang Technological University (NTU), where more than 40 users tried our system. It is demonstrated that our system is adequate for real time image capturing. 
Our system can be used or applied for applications such as medical imaging, video surveillance, etc.
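The three point-processing operations listed in the abstract above (rotation, negation and mirroring) can be sketched as follows. This is an illustrative Python sketch, not the authors' DSP code: the function names, the list-of-rows image representation, the 8-bit peak value, and the choice of a 90-degree clockwise rotation are all assumptions.

```python
def negate(image, peak=255):
    """Invert every pixel intensity (assumes 8-bit grayscale by default)."""
    return [[peak - pixel for pixel in row] for row in image]


def mirror(image):
    """Flip the image left to right."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image by 90 degrees clockwise."""
    # Reversing the row order and then transposing gives a clockwise turn.
    return [list(column) for column in zip(*image[::-1])]
```

Each operation touches a pixel independently of its neighbours (or only moves it), which is why such functions are a natural first group to port to an embedded processing board.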
Alant, Erna; du Plooy, Amelia; Dada, Shakila
2007-01-01
Although the sequence of graphic or pictorial symbols displayed on a communication board can have an impact on the language output of children, very little research has been conducted to describe this. Research in this area is particularly relevant for prioritising the importance of specific visual and graphic features in providing more effective and user-friendly access to communication boards. This study is concerned with understanding the impact of specific sequences of graphic symbol input on the graphic and spoken output of children who have acquired language. Forty participants were divided into two comparable groups. Each group was exposed to graphic symbol input with a certain word order sequence. The structure of input was either in typical English word order sequence Subject-Verb-Object (SVO) or in the word order sequence of Subject-Object-Verb (SOV). Both input groups had to answer six questions by using graphic output as well as speech. The findings indicated that there are significant differences in the PCS graphic output patterns of children who are exposed to graphic input in the SOV and SVO sequences. Furthermore, the output produced in the graphic mode differed considerably to the output produced in the spoken mode. Clinical implications of these findings are discussed.
Extraction of Blebs in Human Embryonic Stem Cell Videos.
Guan, Benjamin X; Bhanu, Bir; Talbot, Prue; Weng, Nikki Jo-Hao
2016-01-01
Blebbing is an important biological indicator in determining the health of human embryonic stem cells (hESC). Especially, areas of a bleb sequence in a video are often used to distinguish two cell blebbing behaviors in hESC: dynamic and apoptotic blebbings. This paper analyzes various segmentation methods for bleb extraction in hESC videos and introduces a bio-inspired score function to improve the performance in bleb extraction. Full bleb formation consists of bleb expansion and retraction. Blebs change their size and image properties dynamically in both processes and between frames. Therefore, adaptive parameters are needed for each segmentation method. A score function derived from the change of bleb area and orientation between consecutive frames is proposed which provides adaptive parameters for bleb extraction in videos. In comparison to manual analysis, the proposed method provides an automated fast and accurate approach for bleb sequence extraction.
Capturing Revolute Motion and Revolute Joint Parameters with Optical Tracking
NASA Astrophysics Data System (ADS)
Antonya, C.
2017-12-01
Optical tracking of users and various technical systems is becoming more and more popular. It consists of analysing sequences of images recorded using video capturing devices and image processing algorithms. The returned data contain mainly point clouds, coordinates of markers or coordinates of points of interest. These data can be used for retrieving information related to the geometry of the objects, but also to extract parameters for the analytical model of the system useful in a variety of computer aided engineering simulations. The parameter identification of joints deals with extraction of physical parameters (mainly geometric parameters) for the purpose of constructing accurate kinematic and dynamic models. The input data are the time-series of the marker's position. The least squares method was used for fitting the data to different geometrical shapes (ellipse, circle, plane) and for obtaining the position and orientation of revolute joints.
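As an illustration of the least-squares shape fitting mentioned in the abstract above, the sketch below fits a circle to 2D marker positions using the algebraic (Kasa) formulation. The abstract does not say which formulation the author used, so this choice, the function names, and the assumption that the marker positions have already been projected into the plane of rotation are all illustrative.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_circle(points):
    """Least-squares fit of x^2 + y^2 + D*x + E*y + F = 0 to 2D points.

    Returns (center_x, center_y, radius).
    """
    n = len(points)
    if n < 3:
        raise ValueError("at least three points are required")
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)
    sz = sum(x * x + y * y for x, y in points)
    # Normal equations of the linear least-squares problem in (D, E, F).
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [-sxz, -syz, -sz]
    det = det3(a)
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or degenerate")
    solution = []
    for k in range(3):  # Cramer's rule: swap column k for the right side
        m = [row[:] for row in a]
        for r in range(3):
            m[r][k] = rhs[r]
        solution.append(det3(m) / det)
    d, e, f = solution
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)
```

For a marker attached to a body turning about a revolute joint, the fitted center estimates where the joint axis crosses the plane of motion, and the radius gives the marker's distance from that axis.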
Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun
2016-01-01
Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent’s SureSelect-XT and KAPA Biosystems’ Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent’s SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682
Still-to-video face recognition in unconstrained environments
NASA Astrophysics Data System (ADS)
Wang, Haoyu; Liu, Changsong; Ding, Xiaoqing
2015-02-01
Face images from video sequences captured in unconstrained environments usually contain several kinds of variations, e.g. pose, facial expression, illumination, image resolution and occlusion. Motion blur and compression artifacts also deteriorate recognition performance. Besides, in various practical systems such as law enforcement, video surveillance and e-passport identification, only a single still image per person is enrolled as the gallery set. Many existing methods may fail to work due to variations in face appearances and the limit of available gallery samples. In this paper, we propose a novel approach for still-to-video face recognition in unconstrained environments. By assuming that faces from still images and video frames share the same identity space, a regularized least squares regression method is utilized to tackle the multi-modality problem. Regularization terms based on heuristic assumptions are enrolled to avoid overfitting. In order to deal with the single image per person problem, we exploit face variations learned from training sets to synthesize virtual samples for gallery samples. We adopt a learning algorithm combining both affine/convex hull-based approach and regularizations to match image sets. Experimental results on a real-world dataset consisting of unconstrained video sequences demonstrate that our method outperforms the state-of-the-art methods impressively.
A novel visual saliency detection method for infrared video sequences
NASA Astrophysics Data System (ADS)
Wang, Xin; Zhang, Yuzhen; Ning, Chen
2017-12-01
Infrared video applications such as target detection and recognition, moving target tracking, and so forth can benefit a lot from visual saliency detection, which is essentially a method to automatically localize the "important" content in videos. In this paper, a novel visual saliency detection method for infrared video sequences is proposed. Specifically, for infrared video saliency detection, both the spatial saliency and temporal saliency are considered. For spatial saliency, we adopt a mutual consistency-guided spatial cues combination-based method to capture the regions with obvious luminance contrast and contour features. For temporal saliency, a multi-frame symmetric difference approach is proposed to discriminate salient moving regions of interest from background motions. Then, the spatial saliency and temporal saliency are combined to compute the spatiotemporal saliency using an adaptive fusion strategy. Besides, to highlight the spatiotemporal salient regions uniformly, a multi-scale fusion approach is embedded into the spatiotemporal saliency model. Finally, a Gestalt theory-inspired optimization algorithm is designed to further improve the reliability of the final saliency map. Experimental results demonstrate that our method outperforms many state-of-the-art saliency detection approaches for infrared videos under various backgrounds.
Automatic summarization of soccer highlights using audio-visual descriptors.
Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc
2015-01-01
Automatic summary generation for sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that are further analyzed to determine their relevance and interest. Of special interest in the approach is the use of the audio information, which provides additional robustness to the overall performance of the summarization system. For every video shot, a set of low- and mid-level audio-visual descriptors is computed and later combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with highest interest according to the specifications of the user and the results of relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.
On continuous user authentication via typing behavior.
Roth, Joseph; Liu, Xiaoming; Metaxas, Dimitris
2014-10-01
We hypothesize that an individual computer user has a unique and consistent habitual pattern of hand movements, independent of the text, while typing on a keyboard. As a result, this paper proposes a novel biometric modality named typing behavior (TB) for continuous user authentication. Given a webcam pointing toward a keyboard, we develop real-time computer vision algorithms to automatically extract hand movement patterns from the video stream. Unlike the typical continuous biometrics, such as keystroke dynamics (KD), TB provides a reliable authentication with a short delay, while avoiding explicit key-logging. We collect a video database where 63 unique subjects type static text and free text for multiple sessions. For one typing video, the hands are segmented in each frame and a unique descriptor is extracted based on the shape and position of hands, as well as their temporal dynamics in the video sequence. We propose a novel approach, named bag of multi-dimensional phrases, to match the cross-feature and cross-temporal pattern between a gallery sequence and probe sequence. The experimental results demonstrate a superior performance of TB when compared with KD, which, together with our ultrareal-time demo system, warrant further investigation of this novel vision application and biometric modality.
Playing Action Video Games Improves Visuomotor Control.
Li, Li; Chen, Rongrong; Chen, Jing
2016-08-01
Can playing action video games improve visuomotor control? If so, can these games be used in training people to perform daily visuomotor-control tasks, such as driving? We found that action gamers have better lane-keeping and visuomotor-control skills than do non-action gamers. We then trained non-action gamers with action or nonaction video games. After they played a driving or first-person-shooter video game for 5 or 10 hr, their visuomotor control improved significantly. In contrast, non-action gamers showed no such improvement after they played a nonaction video game. Our model-driven analysis revealed that although different action video games have different effects on the sensorimotor system underlying visuomotor control, action gaming in general improves the responsiveness of the sensorimotor system to input error signals. The findings support a causal link between action gaming (for as little as 5 hr) and enhancement in visuomotor control, and suggest that action video games can be beneficial training tools for driving.
Knowledge-based approach to video content classification
NASA Astrophysics Data System (ADS)
Chen, Yu; Wong, Edward K.
2001-01-01
A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence, and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.
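As a rough illustration of MYCIN-style inexact reasoning, the sketch below uses the commonly cited certainty-factor combination rule (values in [-1, 1]) to merge evidence from several fired rules. It is written in Python rather than CLIPS, and the class names, certainty values, and the `classify` helper are hypothetical; the abstract does not give the actual rules.

```python
def combine_cf(a, b):
    """Combine two certainty factors in [-1, 1] with the usual MYCIN-style rule."""
    if a >= 0 and b >= 0:
        return a + b * (1 - a)
    if a < 0 and b < 0:
        return a + b * (1 + a)
    denominator = 1 - min(abs(a), abs(b))
    if denominator == 0:
        # Total belief against total disbelief: the rule is undefined here.
        raise ValueError("cannot combine certainty factors +1 and -1")
    return (a + b) / denominator


def classify(evidence):
    """Pick the class whose combined certainty is highest.

    `evidence` maps a class name to the certainty factors contributed by
    the rules that fired for it (illustrative values only).
    """
    combined = {}
    for label, factors in evidence.items():
        total = 0.0
        for cf in factors:
            total = combine_cf(total, cf)
        combined[label] = total
    best = max(combined, key=combined.get)
    return best, combined
```

For instance, two rules supporting "news" with certainties 0.6 and 0.5 combine to 0.8, so independent pieces of weak evidence accumulate without ever exceeding 1.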
Knowledge-based approach to video content classification
NASA Astrophysics Data System (ADS)
Chen, Yu; Wong, Edward K.
2000-12-01
A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence, and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.
A robust coding scheme for packet video
NASA Technical Reports Server (NTRS)
Chen, Y. C.; Sayood, Khalid; Nelson, D. J.
1991-01-01
We present a layered packet video coding algorithm based on a progressive transmission scheme. The algorithm provides good compression and can handle significant packet loss with graceful degradation in the reconstruction sequence. Simulation results for various conditions are presented.
A robust coding scheme for packet video
NASA Technical Reports Server (NTRS)
Chen, Yun-Chung; Sayood, Khalid; Nelson, Don J.
1992-01-01
A layered packet video coding algorithm based on a progressive transmission scheme is presented. The algorithm provides good compression and can handle significant packet loss with graceful degradation in the reconstruction sequence. Simulation results for various conditions are presented.
Optimal Filter Estimation for Lucas-Kanade Optical Flow
Sharmin, Nusrat; Brad, Remus
2012-01-01
Optical flow algorithms offer a way to estimate motion from a sequence of images. The computation of optical flow plays a key role in several computer vision applications, including motion detection and segmentation, frame interpolation, three-dimensional scene reconstruction, robot navigation and video compression. In the case of gradient-based optical flow implementations, the pre-filtering step plays a vital role, not only for accurate computation of optical flow, but also for the improvement of performance. Generally, in optical flow computation, filtering is applied first to the original input images, and the images are resized afterwards. In this paper, we propose an image filtering approach as a pre-processing step for the Lucas-Kanade pyramidal optical flow algorithm. Based on a study of different filtering methods applied to the Iterative Refined Lucas-Kanade algorithm, we identify the best filtering practice. As the Gaussian smoothing filter was selected, an empirical approach for the Gaussian variance estimation was introduced. Tested on the Middlebury image sequences, a correlation between the image intensity value and the standard deviation value of the Gaussian function was established. Finally, we have found that our selection method offers a better performance for the Lucas-Kanade optical flow algorithm.
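To make the pre-filtering step concrete, here is a minimal sketch of separable Gaussian smoothing applied to an image before any flow computation. It assumes 2D lists of grayscale values, a kernel radius of three standard deviations, and edge clamping; how the standard deviation should be chosen from image intensity is the paper's contribution and is not reproduced here.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Return a normalized 1D Gaussian kernel (radius defaults to ceil(3*sigma))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if radius is None:
        radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                i = min(max(x + k - radius, 0), n - 1)
                acc += w * row[i]
            new_row.append(acc)
        out.append(new_row)
    return out


def gaussian_prefilter(image, sigma):
    """Separable Gaussian blur: smooth rows, transpose, smooth again, transpose back."""
    kernel = gaussian_kernel(sigma)
    rows = smooth_rows(image, kernel)
    cols = smooth_rows([list(c) for c in zip(*rows)], kernel)
    return [list(r) for r in zip(*cols)]
```

Both frames of a pair would be passed through `gaussian_prefilter` with the same `sigma` before gradients are computed, so that the brightness-constancy linearization used by gradient-based methods is less disturbed by noise.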
Image and Video Compression with VLSI Neural Networks
NASA Technical Reports Server (NTRS)
Fang, W.; Sheu, B.
1993-01-01
An advanced motion-compensated predictive video compression system based on artificial neural networks has been developed to effectively eliminate the temporal and spatial redundancy of video image sequences and thus reduce the bandwidth and storage required for the transmission and recording of the video signal. The VLSI neuroprocessor for high-speed high-ratio image compression based upon a self-organization network and the conventional algorithm for vector quantization are compared. The proposed method is quite efficient and can achieve near-optimal results.
Joint Video Stitching and Stabilization from Moving Cameras.
Guo, Heng; Liu, Shuaicheng; He, Tong; Zhu, Shuyuan; Zeng, Bing; Gabbouj, Moncef
2016-09-08
In this paper, we extend image stitching to video stitching for videos that are captured for the same scene simultaneously by multiple moving cameras. In practice, videos captured under this circumstance often appear shaky. Directly applying image stitching methods to shaky videos often produces strong spatial and temporal artifacts. To solve this problem, we propose a unified framework in which video stitching and stabilization are performed jointly. Specifically, our system takes several overlapping videos as inputs. We estimate both inter motions (between different videos) and intra motions (between neighboring frames within a video). Then, we solve an optimal virtual 2D camera path from all original paths. An enlarged field of view along the virtual path is finally obtained by a space-temporal optimization that takes both inter and intra motions into consideration. Two important components of this optimization are that (1) a grid-based tracking method is designed for an improved robustness, which produces features that are distributed evenly within and across multiple views, and (2) a mesh-based motion model is adopted for the handling of the scene parallax. Some experimental results are provided to demonstrate the effectiveness of our approach on various consumer-level videos, and a plugin named "Video Stitcher" is developed for Adobe After Effects CC2015 to show the processed videos.
Lossless Video Sequence Compression Using Adaptive Prediction
NASA Technical Reports Server (NTRS)
Li, Ying; Sayood, Khalid
2007-01-01
We present an adaptive lossless video compression algorithm based on predictive coding. The proposed algorithm exploits temporal, spatial, and spectral redundancies in a backward adaptive fashion with extremely low side information. The computational complexity is further reduced by using a caching strategy. We also study the relationship between the operational domain for the coder (wavelet or spatial) and the amount of temporal and spatial redundancy in the sequence being encoded. Experimental results show that the proposed scheme provides significant improvements in compression efficiencies.
A fuzzy measure approach to motion frame analysis for scene detection. M.S. Thesis - Houston Univ.
NASA Technical Reports Server (NTRS)
Leigh, Albert B.; Pal, Sankar K.
1992-01-01
This paper addresses a solution to the problem of scene estimation of motion video data in the fuzzy set theoretic framework. Using fuzzy image feature extractors, a new algorithm is developed to compute the change of information in each of two successive frames to classify scenes. This classification process of raw input visual data can be used to establish structure for correlation. The algorithm attempts to fulfill the need for nonlinear, frame-accurate access to video data for applications such as video editing and visual document archival/retrieval systems in multimedia environments.
Science documentary video slides to enhance education and communication
NASA Astrophysics Data System (ADS)
Byrne, J. M.; Little, L. J.; Dodgson, K.
2010-12-01
Documentary production can convey powerful messages using a combination of authentic science and reinforcing video imagery. Conventional documentary production contains too much information for many viewers to follow; hence many powerful points may be lost. But documentary productions that are re-edited into short video sequences and made available through web based video servers allow the teacher/viewer to access the material as video slides. Each video slide contains one critical discussion segment of the larger documentary. A teacher/viewer can review the documentary one segment at a time in a class room, public forum, or in the comfort of home. The sequential presentation of the video slides allows the viewer to best absorb the documentary message. The website environment provides space for additional questions and discussion to enhance the video message.
Activity recognition using Video Event Segmentation with Text (VEST)
NASA Astrophysics Data System (ADS)
Holloway, Hillary; Jones, Eric K.; Kaluzniacki, Andrew; Blasch, Erik; Tierno, Jorge
2014-06-01
Multi-Intelligence (multi-INT) data includes video, text, and signals that require analysis by operators. Analysis methods include information fusion approaches such as filtering, correlation, and association. In this paper, we discuss the Video Event Segmentation with Text (VEST) method, which provides event boundaries of an activity to compile related message and video clips for future interest. VEST infers meaningful activities by clustering multiple streams of time-sequenced multi-INT intelligence data and derived fusion products. We discuss exemplar results that segment raw full-motion video (FMV) data by using extracted commentary message timestamps, FMV metadata, and user-defined queries.
Self-aligning and compressed autosophy video databases
NASA Astrophysics Data System (ADS)
Holtz, Klaus E.
1993-04-01
Autosophy, an emerging new science, explains "self-assembling structures," such as crystals or living trees, in mathematical terms. This research provides a new mathematical theory of "learning" and a new "information theory" which permits the growing of self-assembling data network in a computer memory similar to the growing of "data crystals" or "data trees" without data processing or programming. Autosophy databases are educated very much like a human child to organize their own internal data storage. Input patterns, such as written questions or images, are converted to points in a mathematical omni dimensional hyperspace. The input patterns are then associated with output patterns, such as written answers or images. Omni dimensional information storage will result in enormous data compression because each pattern fragment is only stored once. Pattern recognition in the text or image files is greatly simplified by the peculiar omni dimensional storage method. Video databases will absorb input images from a TV camera and associate them with textual information. The "black box" operations are totally self-aligning where the input data will determine their own hyperspace storage locations. Self-aligning autosophy databases may lead to a new generation of brain-like devices.
Automatic video segmentation and indexing
NASA Astrophysics Data System (ADS)
Chahir, Youssef; Chen, Liming
1999-08-01
Indexing is an important aspect of video database management. Video indexing involves the analysis of video sequences, which is a computationally intensive process. However, effective management of digital video requires robust indexing techniques. The main purpose of our proposed video segmentation is twofold. Firstly, we develop an algorithm that identifies camera shot boundary. The approach is based on the use of combination of color histograms and block-based technique. Next, each temporal segment is represented by a color reference frame which specifies the shot similarities and which is used in the constitution of scenes. Experimental results using a variety of videos selected in the corpus of the French Audiovisual National Institute are presented to demonstrate the effectiveness of performing shot detection, the content characterization of shots and the scene constitution.
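The combination of histograms with a block-based technique for shot boundary detection can be sketched as follows. This is an assumption-laden illustration, not the authors' algorithm: it uses grayscale intensity histograms instead of color histograms, and the block size, bin count, and both thresholds are hypothetical.

```python
def histogram(pixels, bins=4, max_value=256):
    """Count pixel values into equal-width bins."""
    hist = [0] * bins
    for value in pixels:
        hist[value * bins // max_value] += 1
    return hist


def hist_distance(h1, h2):
    """Normalized L1 distance between two histograms, in [0, 1]."""
    total = sum(h1)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * total)


def blocks(frame, block):
    """Yield the pixel values of each block x block tile of a 2D frame."""
    height, width = len(frame), len(frame[0])
    for y in range(0, height, block):
        for x in range(0, width, block):
            yield [frame[j][i]
                   for j in range(y, min(y + block, height))
                   for i in range(x, min(x + block, width))]


def shot_boundaries(frames, block=2, block_thresh=0.5, frac_thresh=0.5):
    """Return indices t where frame t starts a new shot.

    A cut is declared when more than frac_thresh of the blocks change
    their histogram by more than block_thresh between frames t-1 and t.
    """
    cuts = []
    for t in range(1, len(frames)):
        pairs = list(zip(blocks(frames[t - 1], block), blocks(frames[t], block)))
        changed = sum(
            hist_distance(histogram(a), histogram(b)) > block_thresh
            for a, b in pairs
        )
        if changed / len(pairs) > frac_thresh:
            cuts.append(t)
    return cuts
```

Comparing histograms per block rather than per frame keeps some spatial information, so a small moving object changes only a few blocks and is less likely to be mistaken for a cut.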
Programmable remapper for image processing
NASA Technical Reports Server (NTRS)
Juday, Richard D. (Inventor); Sampsell, Jeffrey B. (Inventor)
1991-01-01
A video-rate coordinate remapper includes a memory for storing a plurality of transformations on look-up tables for remapping input images from one coordinate system to another. Such transformations are operator selectable. The remapper includes a collective processor by which certain input pixels of an input image are transformed to a portion of the output image in a many-to-one relationship. The remapper includes an interpolative processor by which the remaining input pixels of the input image are transformed to another portion of the output image in a one-to-many relationship. The invention includes certain specific transforms for creating output images useful for certain defects of visually impaired people. The invention also includes means for shifting input pixels and means for scrolling the output matrix.
Video bioinformatics analysis of human embryonic stem cell colony growth.
Lin, Sabrina; Fonteno, Shawn; Satish, Shruthi; Bhanu, Bir; Talbot, Prue
2010-05-20
Because video data are complex and are comprised of many images, mining information from video material is difficult to do without the aid of computer software. Video bioinformatics is a powerful quantitative approach for extracting spatio-temporal data from video images using computer software to perform data mining and analysis. In this article, we introduce a video bioinformatics method for quantifying the growth of human embryonic stem cells (hESC) by analyzing time-lapse videos collected in a Nikon BioStation CT incubator equipped with a camera for video imaging. In our experiments, hESC colonies that were attached to Matrigel were filmed for 48 hours in the BioStation CT. To determine the rate of growth of these colonies, recipes were developed using CL-Quant software which enables users to extract various types of data from video images. To accurately evaluate colony growth, three recipes were created. The first segmented the image into the colony and background, the second enhanced the image to define colonies throughout the video sequence accurately, and the third measured the number of pixels in the colony over time. The three recipes were run in sequence on video data collected in a BioStation CT to analyze the rate of growth of individual hESC colonies over 48 hours. To verify the truthfulness of the CL-Quant recipes, the same data were analyzed manually using Adobe Photoshop software. When the data obtained using the CL-Quant recipes and Photoshop were compared, results were virtually identical, indicating the CL-Quant recipes were truthful. The method described here could be applied to any video data to measure growth rates of hESC or other cells that grow in colonies. In addition, other video bioinformatics recipes can be developed in the future for other cell processes such as migration, apoptosis, and cell adhesion.
NASA Astrophysics Data System (ADS)
Froehlich, Jan; Grandinetti, Stefan; Eberhardt, Bernd; Walter, Simon; Schilling, Andreas; Brendel, Harald
2014-03-01
High quality video sequences are required for the evaluation of tone mapping operators and high dynamic range (HDR) displays. We provide scenic and documentary scenes with a dynamic range of up to 18 stops. The scenes are staged using professional film lighting, make-up and set design to enable the evaluation of image and material appearance. To address challenges for HDR-displays and temporal tone mapping operators, the sequences include highlights entering and leaving the image, brightness changing over time, high contrast skin tones, specular highlights and bright, saturated colors. HDR-capture is carried out using two cameras mounted on a mirror-rig. To achieve a cinematic depth of field, digital motion picture cameras with Super-35mm size sensors are used. We provide HDR-video sequences to serve as a common ground for the evaluation of temporal tone mapping operators and HDR-displays. They are available to the scientific community for further research.
Automated frame selection process for high-resolution microendoscopy
NASA Astrophysics Data System (ADS)
Ishijima, Ayumu; Schwarz, Richard A.; Shin, Dongsuk; Mondrik, Sharon; Vigneswaran, Nadarajah; Gillenwater, Ann M.; Anandasabapathy, Sharmila; Richards-Kortum, Rebecca
2015-04-01
We developed an automated frame selection algorithm for high-resolution microendoscopy video sequences. The algorithm rapidly selects a representative frame with minimal motion artifact from a short video sequence, enabling fully automated image analysis at the point-of-care. The algorithm was evaluated by quantitative comparison of diagnostically relevant image features and diagnostic classification results obtained using automated frame selection versus manual frame selection. A data set consisting of video sequences collected in vivo from 100 oral sites and 167 esophageal sites was used in the analysis. The area under the receiver operating characteristic curve was 0.78 (automated selection) versus 0.82 (manual selection) for oral sites, and 0.93 (automated selection) versus 0.92 (manual selection) for esophageal sites. The implementation of fully automated high-resolution microendoscopy at the point-of-care has the potential to reduce the number of biopsies needed for accurate diagnosis of precancer and cancer in low-resource settings where there may be limited infrastructure and personnel for standard histologic analysis.
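The abstract does not state how "minimal motion artifact" is measured, so the sketch below substitutes a simple, hypothetical criterion: pick the frame whose mean absolute difference to its two neighbors is smallest. Frames are assumed to be 2D lists of grayscale values, and the function names are illustrative.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized 2D frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    return total / count


def select_frame(frames):
    """Return the index of the interior frame that differs least from its neighbors.

    Inter-frame difference is used here as a stand-in for motion artifact;
    this is an assumption, not the metric of the cited work.
    """
    if len(frames) < 3:
        raise ValueError("need at least three frames")
    scores = {
        t: mean_abs_diff(frames[t - 1], frames[t])
        + mean_abs_diff(frames[t], frames[t + 1])
        for t in range(1, len(frames) - 1)
    }
    return min(scores, key=scores.get)
```

The selected index can then be handed to the same feature extraction and classification steps that would otherwise run on a manually chosen frame.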
Heart rate measurement based on face video sequence
NASA Astrophysics Data System (ADS)
Xu, Fang; Zhou, Qin-Wu; Wu, Peng; Chen, Xing; Yang, Xiaofeng; Yan, Hong-jian
2015-03-01
This paper proposes a new non-contact heart rate measurement method based on photoplethysmography (PPG) theory. With this method we can measure heart rate remotely with a camera and ambient light. We collected video sequences of subjects, and detected remote PPG signals through video sequences. Remote PPG signals were analyzed with two methods, Blind Source Separation Technology (BSST) and Cross Spectral Power Technology (CSPT). BSST is a commonly used method, and CSPT is used for the first time in the study of remote PPG signals in this paper. Both of the methods can acquire heart rate, but compared with BSST, CSPT has clearer physical meaning, and the computational complexity of CSPT is lower than that of BSST. Our work shows that heart rates detected by CSPT method have good consistency with the heart rates measured by a finger clip oximeter. With good accuracy and low computational complexity, the CSPT method has a good prospect for the application in the field of home medical devices and mobile health devices.
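Whatever method extracts the remote PPG signal, the final step of reading off a heart rate typically amounts to finding the dominant frequency in a plausible band. The sketch below shows only that generic spectral peak picking with a plain discrete Fourier transform; it is neither BSST nor CSPT, and the band limits and function name are assumptions.

```python
import cmath
import math


def dominant_bpm(signal, fps, lo_bpm=40, hi_bpm=180):
    """Return the frequency (in beats per minute) with the largest DFT power.

    `signal` is a list of samples (e.g. mean skin-region intensity per frame)
    taken at `fps` frames per second. Only DFT bins inside [lo_bpm, hi_bpm]
    are considered.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC component
    best_bpm = None
    best_power = -1.0
    for k in range(1, n // 2 + 1):
        bpm = 60.0 * k * fps / n
        if not lo_bpm <= bpm <= hi_bpm:
            continue
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm
```

With a 10 s clip at 30 frames per second the bin spacing is 6 beats per minute, so longer recordings or interpolation around the peak would be needed for finer resolution.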
NASA Technical Reports Server (NTRS)
Pope, Alan T. (Inventor); Stephens, Chad L. (Inventor); Habowski, Tyler (Inventor)
2017-01-01
Method for physiologically modulating videogames and simulations includes utilizing input from a motion-sensing video game system and input from a physiological signal acquisition device. The inputs from the physiological signal sensors are utilized to change the response of a user's avatar to inputs from the motion-sensing sensors. The motion-sensing system comprises a 3D sensor system having full-body 3D motion capture of a user's body. This arrangement encourages health-enhancing physiological self-regulation skills or therapeutic amplification of healthful physiological characteristics. The system provides increased motivation for users to utilize biofeedback as may be desired for treatment of various conditions.
Subjective evaluation of H.265/HEVC based dynamic adaptive video streaming over HTTP (HEVC-DASH)
NASA Astrophysics Data System (ADS)
Irondi, Iheanyi; Wang, Qi; Grecos, Christos
2015-02-01
The Dynamic Adaptive Streaming over HTTP (DASH) standard is becoming increasingly popular for real-time adaptive HTTP streaming of internet video in response to unstable network conditions. Integration of DASH streaming techniques with the new H.265/HEVC video coding standard is a promising area of research. The performance of HEVC-DASH systems has been previously evaluated by a few researchers using objective metrics, however subjective evaluation would provide a better measure of the user's Quality of Experience (QoE) and overall performance of the system. This paper presents a subjective evaluation of an HEVC-DASH system implemented in a hardware testbed. Previous studies in this area have focused on using the current H.264/AVC (Advanced Video Coding) or H.264/SVC (Scalable Video Coding) codecs and moreover, there has been no established standard test procedure for the subjective evaluation of DASH adaptive streaming. In this paper, we define a test plan for HEVC-DASH with a carefully justified data set employing longer video sequences that would be sufficient to demonstrate the bitrate switching operations in response to various network condition patterns. We evaluate the end user's real-time QoE online by investigating the perceived impact of delay, different packet loss rates, fluctuating bandwidth, and the perceived quality of using different DASH video stream segment sizes on a video streaming session using different video sequences. The Mean Opinion Score (MOS) results give an insight into the performance of the system and expectation of the users. The results from this study show the impact of different network impairments and different video segments on users' QoE and further analysis and study may help in optimizing system performance.
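For reference, a Mean Opinion Score is simply the arithmetic mean of the viewers' ratings for one test condition, usually reported with a confidence interval. The sketch below assumes ratings on a 1 to 5 scale and a normal-approximation 95% interval; the function name and the example ratings are illustrative, not data from the study.

```python
import math
import statistics


def mean_opinion_score(ratings, z=1.96):
    """Return (MOS, half-width of the confidence interval) for one condition.

    Uses the sample standard deviation and a normal approximation, which
    is an assumption that becomes rough for very small panels.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width
```

Computing this per condition (for example per packet loss rate or per segment size) gives the values that are compared when judging which impairment hurts perceived quality most.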
Keyhole imaging method for dynamic objects behind the occlusion area
NASA Astrophysics Data System (ADS)
Hao, Conghui; Chen, Xi; Dong, Liquan; Zhao, Yuejin; Liu, Ming; Kong, Lingqin; Hui, Mei; Liu, Xiaohua; Wu, Hong
2018-01-01
A method of keyhole imaging based on a camera array is realized to obtain the video image behind a keyhole in a shielded space at a relatively long distance. We obtain multi-angle video images by using a 2×2 CCD camera array to capture the scene behind the keyhole from four directions. The multi-angle video images are saved in the form of frame sequences. This paper presents a method of video frame alignment. In order to remove the non-target area outside the aperture, we use the Canny operator and morphological methods to detect edges in the images and fill them. The stitching of four images is accomplished on the basis of the stitching algorithm for two images. In the two-image stitching algorithm, the SIFT method is adopted to accomplish the initial matching of images, and then the RANSAC algorithm is applied to eliminate the wrong matching points and to obtain a homography matrix. A method of optimizing the transformation matrix is proposed in this paper. Finally, the video image with a larger field of view behind the keyhole can be synthesized from an image frame sequence in which every single frame is stitched. The results show that the video picture is clear and natural and the brightness transition is smooth. There are no obvious artificial stitching marks in the video, and the method can be applied in different engineering environments.
Dynamic Textures Modeling via Joint Video Dictionary Learning.
Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng
2017-04-06
Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes which could be modeled in a dynamic textures (DT) framework. At first, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on such transition matrix and dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both sparse properties and the temporal correlations of consecutive video frames. Moreover, such learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. Especially, it performs significantly better in dealing with DT synthesis and recognition on heavily corrupted data.
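In symbols, the model described above can be written roughly as follows. The notation is an assumption introduced here for illustration (the abstract gives none): y_t is the vectorized frame at time t, D the learned dictionary, x_t the sparse coefficient vector acting as the hidden state, and A the linear transition matrix.

```latex
\begin{aligned}
y_t     &\approx D\,x_t, \qquad x_t \ \text{sparse},\\
x_{t+1} &\approx A\,x_t .
\end{aligned}
```

A dynamic scene is then summarized by the pair (D, A); the abstract states that constraints are imposed on both for stability but does not say which, so none are written here.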
Menon, Durgapoorna; Chelakkot, Prameela G; Sunil, Devika; Lakshmaiah, Ashwini
2017-12-01
The purpose of this study is to assess the quality of videos available in YouTube on CyberKnife. The term "CyberKnife" was input into the search window of www.youtube.com on a specific date and the first 50 videos were assessed for technical and content issues. The data was tabulated and analysed. The search yielded 32,300 videos in 0.33 s. Among the first 50 analysed, most were professional videos, mostly on CyberKnife in general and for brain tumours. Most of the videos did not mention anything about patient selection or lesion size. The other technical details were covered by most although they seemed muffled by the animations. Many patient videos were recordings of one entire treatment, thus giving future patients an insight on what to expect. Almost half the videos projected glorified views about the treatment technique. The company videos were reasonably accurate and well presented as were many institutional videos, although there was a tendency to gloss over a few points. The glorification of the treatment technique was a disturbing finding. The profound trust of the patients on the health care system is humbling.
ERIC Educational Resources Information Center
Flowers, Susan K.; Easter, Carla; Holmes, Andrea; Cohen, Brian; Bednarski, April E.; Mardis, Elaine R.; Wilson, Richard K.; Elgin, Sarah C. R.
2005-01-01
Sequencing of the human genome has ushered in a new era of biology. The technologies developed to facilitate the sequencing of the human genome are now being applied to the sequencing of other genomes. In 2004, a partnership was formed between Washington University School of Medicine Genome Sequencing Center's Outreach Program and Washington…
Non-mydriatic video ophthalmoscope to measure fast temporal changes of the human retina
NASA Astrophysics Data System (ADS)
Tornow, Ralf P.; Kolář, Radim; Odstrčilík, Jan
2015-07-01
The analysis of fast temporal changes of the human retina can be used to gain insight into normal physiological behavior and to detect pathological deviations. This can be important for the early detection of glaucoma and other eye diseases. We developed a small, lightweight, USB-powered video ophthalmoscope that allows taking video sequences of the human retina with at least 25 frames per second without dilating the pupil. Short sequences (about 10 s) of the optic nerve head (20° x 15°) are recorded from subjects and registered offline using a two-stage process (phase correlation and Lucas-Kanade approach) to compensate for eye movements. From registered video sequences, different parameters can be calculated. Two applications are described here: measurement of (i) cardiac cycle induced pulsatile reflection changes and (ii) eye movements and fixation pattern. Cardiac cycle induced pulsatile reflection changes are caused by changing blood volume in the retina. Waveform and pulse parameters like amplitude and rise time can be measured in any selected areas within the retinal image. Fixation pattern ΔY(ΔX) can be assessed from eye movements during video acquisition. The eye movements ΔX[t], ΔY[t] are derived from image registration results with high temporal (40 ms) and spatial (1.86 arcmin) resolution. Parameters of pulsatile reflection changes and fixation pattern can be affected in beginning glaucoma, and the method described here may support early detection of glaucoma and other eye diseases.
Object detection in cinematographic video sequences for automatic indexing
NASA Astrophysics Data System (ADS)
Stauder, Jurgen; Chupeau, Bertrand; Oisel, Lionel
2003-06-01
This paper presents an object detection framework applied to cinematographic post-processing of video sequences. Post-processing is done after production and before editing. At the beginning of each shot of a video, a slate (also called clapperboard) is shown. The slate notably contains an electronic audio timecode that is necessary for audio-visual synchronization. This paper presents an object detection framework to detect slates in video sequences for automatic indexing and post-processing. It is based on five steps. The first two steps aim to drastically reduce the video data to be analyzed. They ensure a high recall rate but have low precision. The first step detects images at the beginning of a shot that possibly show a slate, while the second step searches these images for candidate regions with a color distribution similar to slates. The objective is to not miss any slate while eliminating long parts of video without slate appearance. The third and fourth steps are statistical classification and pattern matching to detect and precisely locate slates in candidate regions. These steps ensure a high recall rate and high precision. The objective is to detect slates with very few false alarms to minimize interactive corrections. In a last step, electronic timecodes are read from slates to automate audio-visual synchronization. The presented slate detector has a recall rate of 89% and a precision of 97.5%. By temporal integration, much more than 89% of shots in dailies are detected. By timecode coherence analysis, the precision can be raised too. Issues for future work are to accelerate the system to be faster than real-time and to extend the framework to several slate types.
Video quality pooling adaptive to perceptual distortion severity.
Park, Jincheol; Seshadrinathan, Kalpana; Lee, Sanghoon; Bovik, Alan Conrad
2013-02-01
It is generally recognized that severe video distortions that are transient in space and/or time have a large effect on overall perceived video quality. In order to understand this phenomenon, we study the distribution of spatio-temporally local quality scores obtained from several video quality assessment (VQA) algorithms on videos suffering from compression and lossy transmission over communication channels. We propose a content adaptive spatial and temporal pooling strategy based on the observed distribution. Our method adaptively emphasizes "worst" scores along both the spatial and temporal dimensions of a video sequence and also considers the perceptual effect of large-area cohesive motion flow such as egomotion. We demonstrate the efficacy of the method by testing it using three different VQA algorithms on the LIVE Video Quality database and the EPFL-PoliMI video quality database.
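The idea of emphasizing the "worst" local scores can be illustrated with a much simpler, fixed-fraction pooling rule. This is only a sketch under stated assumptions: the abstract's method chooses its emphasis adaptively from the observed score distribution and also accounts for cohesive motion, neither of which is modeled here. The function name `worst_fraction_pool`, the `fraction` parameter, and the convention that higher scores mean better quality are all assumptions.

```python
def worst_fraction_pool(scores, fraction=0.05):
    """Pool local quality scores by averaging only the lowest `fraction` of them.

    Assumes higher score = better quality, so the "worst" scores are the smallest.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(scores)
    count = max(1, round(fraction * len(ordered)))
    return sum(ordered[:count]) / count
```

Compared with a plain mean, this pooled value drops sharply when a small part of the video is badly distorted, which is the behavior the abstract attributes to human viewers.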
A MPEG-4 encoder based on TMS320C6416
NASA Astrophysics Data System (ADS)
Li, Gui-ju; Liu, Wei-ning
2013-08-01
Engineering applications and products often require real-time video encoding on a DSP, but the high computational complexity and large amount of data demand high system data throughput. In this paper, a real-time MPEG-4 video encoder is designed on the TMS320C6416 platform. The core is a TMS320C6416T DSP, with an FPGA chip that organizes and manages the video data and controls the flow of input and output data. The encoded stream is output through the synchronous serial port. The system has a clock frequency of 1 GHz and a processing capacity of up to 8000 MIPS when running at full speed. Because an MPEG-4 video encoder ported directly to the DSP platform has low coding efficiency, the program structure, data structures and algorithms need to be adapted to the characteristics of the TMS320C6416T. First, the image storage architecture is designed by balancing computation cost, storage cost and EDMA read time. Multiple buffers are allocated in memory, each caching 16 lines of the video data to be encoded, the reconstructed image and the reference image including the search range. Memory is saved by using the variable alignment mode of the DSP, modifying the definitions of structure variables, and replacing look-up tables that occupy a large amount of space with a direct-calculation array. After the program structure optimization, the program code, all variables, the buffers and the interpolated image including the search range can be placed in memory. Then, the time-consuming processing modules and the functions that are called many times are written in the parallel assembly language of the TMS320C6416T to increase running speed. In addition, the motion estimation algorithm is improved by using a cross-hexagon search algorithm, which noticeably increases the search speed. Finally, the execution time, signal-to-noise ratio and compression ratio for a real-time image acquisition sequence are given.
The experimental results show that the encoder designed in this paper can accomplish real-time encoding of 768×576, 25 frames per second grayscale video. The bit rate is 1.5 Mbit/s.
Method of determining the necessary number of observations for video stream documents recognition
NASA Astrophysics Data System (ADS)
Arlazarov, Vladimir V.; Bulatov, Konstantin; Manzhikov, Temudzhin; Slavin, Oleg; Janiszewski, Igor
2018-04-01
This paper discusses the task of document recognition on a sequence of video frames. In order to optimize the processing speed, the stability of recognition results obtained from several video frames is estimated. For identity document (Russian internal passport) recognition on a mobile device, it is shown that the number of observations necessary for obtaining a precise recognition result can be decreased significantly.
Selective encryption for H.264/AVC video coding
NASA Astrophysics Data System (ADS)
Shi, Tuo; King, Brian; Salama, Paul
2006-02-01
Due to the ease with which digital data can be manipulated and due to the ongoing advancements that have brought us closer to pervasive computing, the secure delivery of video and images has become a challenging problem. Despite the advantages and opportunities that digital video provides, illegal copying and distribution as well as plagiarism of digital audio, images, and video is still ongoing. In this paper we describe two techniques for securing H.264 coded video streams. The first technique, SEH264Algorithm1, groups the data into the following blocks of data: (1) a block that contains the sequence parameter set and the picture parameter set, (2) a block containing a compressed intra coded frame, (3) a block containing the slice header of a P slice, all the headers of the macroblocks within the same P slice, and all the luma and chroma DC coefficients belonging to all the macroblocks within the same slice, (4) a block containing all the AC coefficients, and (5) a block containing all the motion vectors. The first three are encrypted whereas the last two are not. The second method, SEH264Algorithm2, relies on the use of multiple slices per coded frame. The algorithm searches the compressed video sequence for start codes (0x000001) and then encrypts the next N bits of data.
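The start-code scan described for the second method can be sketched as below. This is an illustration under loud assumptions, not SEH264Algorithm2 itself: it works on whole bytes rather than N bits, the XOR with a fixed byte is a placeholder that offers no real security (a real system would use a proper cipher), and the names `find_start_codes` and `toy_encrypt_after_start_codes` are hypothetical.

```python
START_CODE = b"\x00\x00\x01"  # 0x000001, as named in the abstract


def find_start_codes(stream):
    """Return the byte offsets of every non-overlapping start code in the stream."""
    positions = []
    index = stream.find(START_CODE)
    while index != -1:
        positions.append(index)
        index = stream.find(START_CODE, index + len(START_CODE))
    return positions


def toy_encrypt_after_start_codes(stream, n_bytes, key_byte=0x5A):
    """XOR up to n_bytes following each start code (placeholder for a real cipher).

    The protected span stops early at the next start code so that the
    stream's unit boundaries stay visible to a decoder.
    """
    out = bytearray(stream)
    positions = find_start_codes(stream)
    for i, pos in enumerate(positions):
        begin = pos + len(START_CODE)
        limit = positions[i + 1] if i + 1 < len(positions) else len(out)
        for j in range(begin, min(begin + n_bytes, limit)):
            out[j] ^= key_byte
    return bytes(out)
```

One caveat of this toy version: the transformed payload could by chance contain the byte pattern of a start code, which a practical scheme would have to prevent or escape.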
Real-time interactive simulation: using touch panels, graphics tablets, and video-terminal keyboards
DOE Office of Scientific and Technical Information (OSTI.GOV)
Venhuizen, J.R.
1983-01-01
A Simulation Laboratory utilizing only digital computers for interactive computing must rely on CRT based graphics devices for output devices, and keyboards, graphics tablets, and touch panels, etc., for input devices. The devices all work well, with the combination of a CRT with a touch panel mounted on it as the most flexible combination of input/output devices for interactive simulation.
ERIC Educational Resources Information Center
Li, C.-H.
2014-01-01
Most second/foreign language (L2) learners have difficulty understanding listening input because of its implicit and ephemeral nature, and they typically have better reading comprehension than listening comprehension skills. This study examines the effects of using an interactive advance-organizer activity on the DVD video comprehension of L2…
ERIC Educational Resources Information Center
Wagner, Elvis
2013-01-01
The use of video technology has become widespread in the teaching and testing of second-language (L2) listening, yet research into how this technology affects the learning and testing process has lagged. The current study investigated how the channel of input (audiovisual vs. audio-only) used on an L2 listening test affected test-taker…
Video calls from lay bystanders to dispatch centers - risk assessment of information security.
Bolle, Stein R; Hasvold, Per; Henriksen, Eva
2011-09-30
Video calls from mobile phones can improve communication during medical emergencies. Lay bystanders can be instructed and supervised by health professionals at Emergency Medical Communication Centers. Before implementation of video mobile calls in emergencies, issues of information security should be addressed. Information security was assessed for risk, based on the information security standard ISO/IEC 27005:2008. A multi-professional team used structured brainstorming to find threats to the information security aspects confidentiality, quality, integrity, and availability. Twenty security threats of different risk levels were identified and analyzed. Solutions were proposed to reduce the risk level. Given proper implementation, we found no risks to information security that would advocate against the use of video calls between lay bystanders and Emergency Medical Communication Centers. The identified threats should be used as input to formal requirements when planning and implementing video calls from mobile phones for these call centers.
Video calls from lay bystanders to dispatch centers - risk assessment of information security
2011-01-01
Background: Video calls from mobile phones can improve communication during medical emergencies. Lay bystanders can be instructed and supervised by health professionals at Emergency Medical Communication Centers. Before implementation of video mobile calls in emergencies, issues of information security should be addressed. Methods: Information security was assessed for risk, based on the information security standard ISO/IEC 27005:2008. A multi-professional team used structured brainstorming to find threats to the information security aspects confidentiality, quality, integrity, and availability. Results: Twenty security threats of different risk levels were identified and analyzed. Solutions were proposed to reduce the risk level. Conclusions: Given proper implementation, we found no risks to information security that would advocate against the use of video calls between lay bystanders and Emergency Medical Communication Centers. The identified threats should be used as input to formal requirements when planning and implementing video calls from mobile phones for these call centers. PMID:21958387
The Simple Video Coder: A free tool for efficiently coding social video data.
Barto, Daniel; Bird, Clark W; Hamilton, Derek A; Fink, Brandi C
2017-08-01
Videotaping of experimental sessions is a common practice across many disciplines of psychology, ranging from clinical therapy, to developmental science, to animal research. Audio-visual data are a rich source of information that can be easily recorded; however, analysis of the recordings presents a major obstacle to project completion. Coding behavior is time-consuming and often requires ad-hoc training of a student coder. In addition, existing software is either prohibitively expensive or cumbersome, which leaves researchers with inadequate tools to quickly process video data. We offer the Simple Video Coder: free, open-source software for behavior coding that is flexible in accommodating different experimental designs, is intuitive for students to use, and produces outcome measures of event timing, frequency, and duration. Finally, the software also offers extraction tools to splice video into coded segments suitable for training future human coders or for use as input for pattern classification algorithms.
A review of video security training and assessment-systems and their applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cellucci, J.; Hall, R.J.
1991-01-01
This paper reports that during the last 10 years computer-aided video data collection and playback systems have been used as nuclear facility security training and assessment tools with varying degrees of success. These mobile systems have been used by trained security personnel for response force training, vulnerability assessment, force-on-force exercises and crisis management. Typically, synchronous recordings from multiple video cameras, communications audio, and digital sensor inputs are played back to the exercise participants and then edited for training and briefing. Factors that have influenced user acceptance include: frequency of use, the demands placed on security personnel, fear of punishment, user training requirements and equipment cost. The introduction of S-VHS video and new software for scenario planning, video editing and data reduction should bring about a wider range of security applications and supply the opportunity for significant cost sharing with other user groups.
Visual content highlighting via automatic extraction of embedded captions on MPEG compressed video
NASA Astrophysics Data System (ADS)
Yeo, Boon-Lock; Liu, Bede
1996-03-01
Embedded captions in TV programs such as news broadcasts, documentaries and coverage of sports events provide important information on the underlying events. In digital video libraries, such captions represent a highly condensed form of key information on the contents of the video. In this paper we propose a scheme to automatically detect the presence of captions embedded in video frames. The proposed method operates on reduced image sequences which are efficiently reconstructed from compressed MPEG video and thus does not require full frame decompression. The detection, extraction and analysis of embedded captions help to capture the highlights of visual contents in video documents for better organization of video, to present succinctly the important messages embedded in the images, and to facilitate browsing, searching and retrieval of relevant clips.
Two-Stream Transformer Networks for Video-based Face Alignment.
Liu, Hao; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie
2017-08-01
In this paper, we propose a two-stream transformer network (TSTN) approach for video-based face alignment. Unlike conventional image-based face alignment approaches, which cannot explicitly model the temporal dependency in videos, and motivated by the fact that consistent movements of facial landmarks usually occur across consecutive frames, our TSTN aims to capture the complementary information of both the spatial appearance on still frames and the temporal consistency information across frames. To achieve this, we develop a two-stream architecture, which decomposes video-based face alignment into spatial and temporal streams accordingly. Specifically, the spatial stream aims to transform the facial image to the landmark positions by preserving the holistic facial shape structure. Accordingly, the temporal stream encodes the video input as active appearance codes, where the temporal consistency information across frames is captured to help shape refinements. Experimental results on the benchmark video-based face alignment datasets show very competitive performance of our method in comparison with the state of the art.
Multilevel analysis of sports video sequences
NASA Astrophysics Data System (ADS)
Han, Jungong; Farin, Dirk; de With, Peter H. N.
2006-01-01
We propose a fully automatic and flexible framework for analysis and summarization of tennis broadcast video sequences, using visual features and specific game-context knowledge. Our framework can analyze a tennis video sequence at three levels, which provides a broad range of different analysis results. The proposed framework includes novel pixel-level and object-level tennis video processing algorithms, such as a moving-player detection taking both the color and the court (playing-field) information into account, and a player-position tracking algorithm based on a 3-D camera model. Additionally, we employ scene-level models for detecting events, like service, base-line rally and net-approach, based on a number of real-world visual features. The system can summarize three forms of information: (1) all court-view playing frames in a game, (2) the moving trajectory and real speed of each player, as well as the relative position between the player and the court, (3) the semantic event segments in a game. The proposed framework is flexible in choosing the level of analysis that is desired. It is effective because the framework makes use of several visual cues obtained from the real-world domain to model important events like service, thereby increasing the accuracy of the scene-level analysis. The paper presents attractive experimental results highlighting the system efficiency and analysis capabilities.
Gear Shifting of Quadriceps during Isometric Knee Extension Disclosed Using Ultrasonography.
Zhang, Shu; Huang, Weijian; Zeng, Yu; Shi, Wenxiu; Diao, Xianfen; Wei, Xiguang; Ling, Shan
2018-01-01
Ultrasonography has been widely employed to estimate the morphological changes of muscle during contraction. To further investigate the motion pattern of the quadriceps during isometric knee extensions, we studied the relative motion pattern between the femur and the quadriceps under ultrasonography. An interesting observation is that although the force of isometric knee extension can be controlled to change almost linearly, the femur in the simultaneously captured ultrasound video sequences has several different piecewise moving patterns. This phenomenon resembles the quadriceps having several forward gear ratios, like a car starting from rest, accelerating towards maximal voluntary contraction (MVC) and then returning to rest. Therefore, to verify this assumption, we captured several ultrasound video sequences of isometric knee extension and collected the torque/force signal simultaneously. We then extracted the shapes of the femur from these ultrasound video sequences using video processing techniques and studied the motion pattern both qualitatively and quantitatively. The phenomenon can be seen more easily via a comparison between the torque signal and the relative spatial distance between the femur and the quadriceps. Furthermore, we used cluster analysis techniques to study the process, and the clustering results also provided preliminary support for the conclusion that, during both the ramp increasing and decreasing phases, quadriceps contraction may have several forward gear ratios relative to the femur.
NASA Astrophysics Data System (ADS)
Chen, H.; Ye, Sh.; Nedzvedz, O. V.; Ablameyko, S. V.
2018-03-01
Study of crowd movement is an important practical problem, and its solution is used in video surveillance systems for preventing various emergency situations. In the general case, a group of fast-moving people is of more interest than a group of stationary or slow-moving people. We propose a new method for crowd movement analysis using a video sequence, based on integral optical flow. We have determined several characteristics of a moving crowd such as density, speed, direction of motion, symmetry, and in/out index. These characteristics are used for further analysis of a video scene.
Online tracking of outdoor lighting variations for augmented reality with moving cameras.
Liu, Yanli; Granier, Xavier
2012-04-01
In augmented reality, one of the key tasks in achieving a convincing visual appearance consistency between virtual objects and video scenes is to have a coherent illumination along the whole sequence. As outdoor illumination is largely dependent on the weather, the lighting condition may change from frame to frame. In this paper, we propose a full image-based approach for online tracking of outdoor illumination variations from videos captured with moving cameras. Our key idea is to estimate the relative intensities of sunlight and skylight via a sparse set of planar feature-points extracted from each frame. To address the inevitable feature misalignments, a set of constraints is introduced to select the most reliable ones. Exploiting the spatial and temporal coherence of illumination, the relative intensities of sunlight and skylight are finally estimated by using an optimization process. We validate our technique on a set of real-life videos and show that the results with our estimations are visually coherent along the video sequences.
Scollato, A; Perrini, P; Benedetto, N; Di Lorenzo, N
2007-06-01
We propose an easy-to-construct digital video editing system ideal to produce video documentation and still images. A digital video editing system applicable to many video sources in the operating room is described in detail. The proposed system has proved easy to use and permits one to obtain videography quickly and easily. Mixing different streams of video input from all the devices in use in the operating room, the application of filters and effects produces a final, professional end-product. Recording on a DVD provides an inexpensive, portable and easy-to-use medium to store or re-edit or tape at a later time. From stored videography it is easy to extract high-quality, still images useful for teaching, presentations and publications. In conclusion digital videography and still photography can easily be recorded by the proposed system, producing high-quality video recording. The use of firewire ports provides good compatibility with next-generation hardware and software. The high standard of quality makes the proposed system one of the lowest priced products available today.
Detection of illegal transfer of videos over the Internet
NASA Astrophysics Data System (ADS)
Chaisorn, Lekha; Sainui, Janya; Manders, Corey
2010-07-01
In this paper, a method for detecting infringements or modifications of a video in real-time is proposed. The method first segments a video stream into shots, after which it extracts some reference frames as keyframes. This process is performed employing a Singular Value Decomposition (SVD) technique developed in this work. Next, for each input video (represented by its keyframes), ordinal-based signature and SIFT (Scale Invariant Feature Transform) descriptors are generated. The ordinal-based method employs a two-level bitmap indexing scheme to construct the index for each video signature. The first level clusters all input keyframes into k clusters while the second level converts the ordinal-based signatures into bitmap vectors. On the other hand, the SIFT-based method directly uses the descriptors as the index. Given a suspect video (being streamed or transferred on the Internet), we generate its signature (ordinal and SIFT descriptors) and then compute the similarity between its signature and the signatures in the database, based on the ordinal signature and the SIFT descriptors separately. As similarity measures, Boolean operators are used in addition to the Euclidean distance during the matching process. We have tested our system by performing several experiments on 50 videos (each about 1/2 hour in duration) obtained from the TRECVID 2006 data set. For the experimental setup, we refer to the conditions provided by TRECVID 2009 for the "Content-based copy detection" task. In addition, we also refer to the requirements issued in the call for proposals by the MPEG standard on a similar task. Initial results show that our framework is effective and robust. Compared with our previous work, on top of the reductions in storage space and processing time obtained with the ordinal-based method, introducing the SIFT features allowed us to achieve an overall accuracy in F1 measure of about 96% (an improvement of about 8%).
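An ordinal-based keyframe signature of the general kind mentioned above can be sketched as follows. The abstract does not spell out how its signature is built, so this shows the common block-rank form of an ordinal measure as an assumption: the frame is divided into a grid, each block's mean intensity is computed, and the signature is the rank of each block. The grid size, the function names, and the sum-of-absolute-differences distance are illustrative choices, and the two-level bitmap index is not modeled.

```python
def ordinal_signature(frame, grid_rows=2, grid_cols=2):
    """Rank the mean intensities of grid blocks in a grayscale frame.

    `frame` is a list of rows of pixel intensities. The result lists, for each
    block in raster order, its rank (0 = darkest block).
    """
    height, width = len(frame), len(frame[0])
    means = []
    for r in range(grid_rows):
        y0, y1 = r * height // grid_rows, (r + 1) * height // grid_rows
        for c in range(grid_cols):
            x0, x1 = c * width // grid_cols, (c + 1) * width // grid_cols
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(block) / len(block))
    order = sorted(range(len(means)), key=lambda i: means[i])
    ranks = [0] * len(means)
    for rank, block_index in enumerate(order):
        ranks[block_index] = rank
    return ranks


def signature_distance(sig_a, sig_b):
    """Sum of absolute rank differences between two equal-length signatures."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))
```

Because only the ordering of block intensities is stored, a uniformly brightened or darkened copy of a frame produces the same signature, which is the property that makes rank-based signatures attractive for copy detection.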
Hardware/Software Issues for Video Guidance Systems: The Coreco Frame Grabber
NASA Technical Reports Server (NTRS)
Bales, John W.
1996-01-01
The F64 frame grabber is a high performance video image acquisition and processing board utilizing the TMS320C40 and TMS34020 processors. The hardware is designed for the ISA 16 bit bus and supports multiple digital or analog cameras. It has an acquisition rate of 40 million pixels per second, with a variable sampling frequency of 510 kHz to MO MHz. The board has a 4MB frame buffer memory expandable to 32 MB, and has a simultaneous acquisition and processing capability. It supports both VGA and RGB displays, and accepts all analog and digital video input standards.
Video-modelling to improve task completion in a child with autism.
Rayner, Christopher Stephen
2010-01-01
To evaluate the use of video modelling as an intervention for increasing task completion for individuals with autism who have high support needs. A 12-year-old-boy with autism received video modelling intervention on two routines (unpacking his bag and brushing his teeth). Use of the video modelling intervention led to rapid increases in the percentage of steps performed in the unpacking his bag sequence and these gains generalized to packing his bag prior to departure from school. There was limited success in the use of the video modelling intervention for teaching the participant to brush his teeth. Video modelling can be successfully applied to enhance daily functioning in a classroom environment for students with autism and high support needs.
Storage, retrieval, and edit of digital video using Motion JPEG
NASA Astrophysics Data System (ADS)
Sudharsanan, Subramania I.; Lee, D. H.
1994-04-01
In a companion paper we describe a Micro Channel adapter card that can perform real-time JPEG (Joint Photographic Experts Group) compression of a 640 by 480 24-bit image within 1/30th of a second. Since this corresponds to NTSC video rates at considerably good perceptual quality, this system can be used for real-time capture and manipulation of continuously fed video. To facilitate capturing the compressed video in a storage medium, an IBM Bus master SCSI adapter with cache is utilized. Efficacy of the data transfer mechanism is considerably improved using the System Control Block architecture, an extension to Micro Channel bus masters. We show experimental results that the overall system can perform at compressed data rates of about 1.5 MBytes/second sustained and with sporadic peaks to about 1.8 MBytes/second depending on the image sequence content. We also describe mechanisms to access the compressed data very efficiently through special file formats. This in turn permits creation of simpler sequence editors. Another advantage of the special file format is easy control of forward, backward and slow motion playback. The proposed method can be extended for design of a video compression subsystem for a variety of personal computing systems.
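The figures quoted above imply a compression ratio that is easy to work out. The sketch below assumes that "MBytes" means 10^6 bytes and that the video runs at 30 frames per second, as suggested by the 1/30th-of-a-second figure; the function name is hypothetical.

```python
def raw_rate_bytes_per_second(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video data rate in bytes per second."""
    return width * height * bits_per_pixel / 8 * frames_per_second


# 640 x 480 pixels, 24 bits per pixel, 30 frames per second
raw_rate = raw_rate_bytes_per_second(640, 480, 24, 30)

# Ratios against the sustained (1.5 MB/s) and peak (1.8 MB/s) compressed rates
sustained_ratio = raw_rate / 1.5e6
peak_ratio = raw_rate / 1.8e6
```

Under these assumptions the uncompressed stream is 27,648,000 bytes per second, so the reported sustained rate corresponds to a compression ratio of roughly 18:1 and the peaks to roughly 15:1.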
Gamifying Video Object Segmentation.
Spampinato, Concetto; Palazzo, Simone; Giordano, Daniela
2017-10-01
Video object segmentation can be considered as one of the most challenging computer vision problems. Indeed, so far, no existing solution is able to effectively deal with the peculiarities of real-world videos, especially in cases of articulated motion and object occlusions; limitations that appear more evident when we compare the performance of automated methods with the human one. However, manually segmenting objects in videos is largely impractical as it requires a lot of time and concentration. To address this problem, in this paper we propose an interactive video object segmentation method, which exploits, on one hand, the capability of humans to identify correctly objects in visual scenes, and on the other hand, the collective human brainpower to solve challenging and large-scale tasks. In particular, our method relies on a game with a purpose to collect human inputs on object locations, followed by an accurate segmentation phase achieved by optimizing an energy function encoding spatial and temporal constraints between object regions as well as human-provided location priors. Performance analysis carried out on complex video benchmarks, and exploiting data provided by over 60 users, demonstrated that our method shows a better trade-off between annotation times and segmentation accuracy than interactive video annotation and automated video object segmentation approaches.
SNP ID-info: SNP ID searching and visualization platform.
Yang, Cheng-Hong; Chuang, Li-Yeh; Cheng, Yu-Huei; Wen, Cheng-Hao; Chang, Phei-Lang; Chang, Hsueh-Wei
2008-09-01
Many association studies report relationships between single nucleotide polymorphisms (SNPs), diseases and cancers without giving a SNP ID. Here, we developed the SNP ID-info freeware to provide SNP IDs from input genetic and physical information of genomes. The program provides an "SNP-ePCR" function to generate the full sequence using primers and template inputs. In "SNPosition," a sequence from SNP-ePCR or direct input is matched against the SNP fasta sequences to find the SNP IDs. The "SNP search" and "SNP fasta" functions accept cytogenetic band, contig position, and keyword inputs for retrieving SNP information. Finally, the SNP ID neighboring environment for the inputs is completely visualized in the order of contig position and marked with SNP and flanking hits. The SNP identification problems inherent in NCBI SNP BLAST are also avoided. In conclusion, SNP ID-info provides a visualized SNP ID environment for multiple inputs and assists systematic SNP association studies. The server and user manual are available at http://bio.kuas.edu.tw/snpid-info.
Minimizing structural vibrations with Input Shaping (TM)
NASA Technical Reports Server (NTRS)
Singhose, Bill; Singer, Neil
1995-01-01
A new method for commanding machines to move with increased dynamic performance was developed. This method is an enhanced version of input shaping, a patented vibration suppression algorithm. The technique intercepts a command input and converts it into a system command that moves the mechanical system with increased performance and reduced residual vibration. This document describes many advanced methods for generating highly optimized shaping sequences which are tuned to particular systems. The shaping sequence is important because it determines the trade-off between the move/settle time of the system and the insensitivity of the input shaping algorithm to variations or uncertainties in the machine being controlled. For example, a system with a 5 Hz resonance that takes 1 second to settle can be improved to settle instantaneously using a 0.2 second shaping sequence (thus improving settle time by a factor of 5). This system could vary by plus or minus 15% in its natural frequency and still have no apparent vibration. However, the same system shaped with a 0.3 second shaping sequence could tolerate plus or minus 40% or more variation in natural frequency. This document describes how to generate sequences that maximize performance, sequences that maximize insensitivity, and sequences that trade off between the two. Several software tools are documented and included.
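A shaping sequence of the kind discussed above can be illustrated with the textbook zero-vibration-and-derivative (ZVD) shaper. The abstract does not say which shaper it uses, so ZVD is an assumption chosen because, for an undamped 5 Hz mode, its three impulses span exactly one vibration period, 0.2 seconds. The function names and the zero default damping ratio are illustrative.

```python
import math


def zvd_shaper(freq_hz, damping_ratio=0.0):
    """Impulse amplitudes and times (seconds) of a three-impulse ZVD shaper."""
    omega_n = 2 * math.pi * freq_hz
    root = math.sqrt(1 - damping_ratio ** 2)
    half_period = math.pi / (omega_n * root)  # half of the damped period
    k = math.exp(-damping_ratio * math.pi / root)
    norm = (1 + k) ** 2
    amplitudes = [1 / norm, 2 * k / norm, k * k / norm]
    times = [0.0, half_period, 2 * half_period]
    return amplitudes, times


def residual_vibration(amplitudes, times, freq_hz, damping_ratio=0.0):
    """Residual vibration amplitude relative to an unshaped unit impulse."""
    omega_n = 2 * math.pi * freq_hz
    omega_d = omega_n * math.sqrt(1 - damping_ratio ** 2)
    cos_sum = sum(
        a * math.exp(damping_ratio * omega_n * t) * math.cos(omega_d * t)
        for a, t in zip(amplitudes, times)
    )
    sin_sum = sum(
        a * math.exp(damping_ratio * omega_n * t) * math.sin(omega_d * t)
        for a, t in zip(amplitudes, times)
    )
    return math.exp(-damping_ratio * omega_n * times[-1]) * math.hypot(cos_sum, sin_sum)
```

Designing the shaper with `zvd_shaper(5.0)` and then evaluating `residual_vibration` at 5.75 Hz (a 15% frequency error) gives a residual of roughly 5% of the unshaped vibration, which is in line with the abstract's remark that such a variation leaves no apparent vibration. Longer sequences buy more insensitivity at the cost of a slower move.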
AlignMe—a membrane protein sequence alignment web server
Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.
2014-01-01
We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
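The family-averaged hydropathy profile used in HP mode can be sketched as a per-column average over an alignment. This is an illustration only: the abstract does not specify which hydrophobicity scale, gap handling or smoothing AlignMe applies, so the Kyte-Doolittle scale, the skipping of gaps, and the function name `averaged_hydropathy_profile` are assumptions. The subsequent alignment of two profiles is not shown.

```python
# Kyte-Doolittle hydropathy values (one commonly used scale; an assumption here)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def averaged_hydropathy_profile(alignment, gap="-"):
    """Average hydropathy per alignment column, ignoring gaps and unknown residues.

    `alignment` is a list of equal-length aligned sequences. Columns with no
    scorable residue yield None.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        values = [
            KYTE_DOOLITTLE[seq[col]]
            for seq in alignment
            if seq[col] != gap and seq[col] in KYTE_DOOLITTLE
        ]
        profile.append(sum(values) / len(values) if values else None)
    return profile
```

Averaging over homologs smooths out residue-level noise, so stretches of consistently high values hint at transmembrane segments even when two families share little sequence identity.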
A bio-inspired system for spatio-temporal recognition in static and video imagery
NASA Astrophysics Data System (ADS)
Khosla, Deepak; Moore, Christopher K.; Chelian, Suhas
2007-04-01
This paper presents a bio-inspired method for spatio-temporal recognition in static and video imagery. It builds upon and extends our previous work on a bio-inspired Visual Attention and object Recognition System (VARS). The VARS approach locates and recognizes objects in a single frame. This work presents two extensions of VARS. The first extension is a Scene Recognition Engine (SCE) that learns to recognize spatial relationships between objects that compose a particular scene category in static imagery. This could be used for recognizing the category of a scene, e.g., office vs. kitchen scene. The second extension is the Event Recognition Engine (ERE) that recognizes spatio-temporal sequences or events in sequences. This extension uses a working memory model to recognize events and behaviors in video imagery by maintaining and recognizing ordered spatio-temporal sequences. The working memory model is based on an ARTSTORE neural network that combines an ART-based neural network with a cascade of sustained temporal order recurrent (STORE) neural networks. A series of Default ARTMAP classifiers ascribes event labels to these sequences. Our preliminary studies have shown that this extension is robust to variations in an object's motion profile. We evaluated the performance of the SCE and ERE on real datasets. The SCE module was tested on a visual scene classification task using the LabelMe dataset. The ERE was tested on real world video footage of vehicles and pedestrians in a street scene. Our system is able to recognize the events in this footage involving vehicles and pedestrians.
Validation of a Video-based Game-Understanding Test Procedure in Badminton.
ERIC Educational Resources Information Center
Blomqvist, Minna T.; Luhtanen, Pekka; Laakso, Lauri; Keskinen, Esko
2000-01-01
Reports the development and validation of video-based game-understanding tests in badminton for elementary and secondary students. The tests included different sequences that simulated actual game situations. Players had to solve tactical problems by selecting appropriate solutions and arguments for their decisions. Results suggest that the test…
Traveling wave linear accelerator with RF power flow outside of accelerating cavities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dolgashev, Valery A.
A high power RF traveling wave accelerator structure includes a symmetric RF feed, an input matching cell coupled to the symmetric RF feed, a sequence of regular accelerating cavities coupled to the input matching cell at an input beam pipe end of the sequence, one or more waveguides parallel to and coupled to the sequence of regular accelerating cavities, an output matching cell coupled to the sequence of regular accelerating cavities at an output beam pipe end of the sequence, and output waveguide circuit or RF loads coupled to the output matching cell. Each of the regular accelerating cavities has a nose cone that cuts off field propagating into the beam pipe and therefore all power flows in a traveling wave along the structure in the waveguide.
Converting from DDOR SASF to APF
NASA Technical Reports Server (NTRS)
Gladden, Roy E.; Khanampompan, Teerapat; Fisher, Forest W.
2008-01-01
A computer program called ddor_sasf2apf converts delta-DOR (delta differential one-way range) requests from an SASF (spacecraft activity sequence file) format to an APF (apgen plan file) format for use in the Mars Reconnaissance Orbiter (MRO) mission-planning-and-sequencing process. The APF is used as an input to APGEN/AUTOGEN in the MRO activity-planning and command-sequence-generating process to sequence the delta-DOR (DDOR) activity. The DDOR activity is a spacecraft tracking technique for determining spacecraft location. The input to ddor_sasf2apf is an input request SASF provided by an observation team that utilizes DDOR. ddor_sasf2apf parses this DDOR SASF input, rearranging parameters and reformatting the request to produce an APF file for use in AUTOGEN and/or APGEN. The benefit afforded by ddor_sasf2apf is to enable the use of the DDOR SASF file earlier in the planning stage of the command-sequence-generating process and to produce sequences, optimized for DDOR operations, that are more accurate and more robust than would otherwise be possible.
NASA Astrophysics Data System (ADS)
Zhang, Xunxun; Xu, Hongke; Fang, Jianwu
2018-01-01
Along with the rapid development of unmanned aerial vehicle technology, multiple vehicle tracking (MVT) in aerial video sequences has received widespread interest as a way to provide required traffic information. Because of camera motion and complex backgrounds, MVT in aerial video sequences poses unique challenges. We propose an efficient MVT algorithm based on a driver behavior-based Kalman filter (DBKF) and an improved deterministic data association (IDDA) method. First, a hierarchical image registration method is put forward to compensate for the camera motion. Afterward, to improve the accuracy of the state estimation, we propose the DBKF module, which incorporates driver behavior into the Kalman filter; an artificial potential field is introduced to reflect the driver behavior. Then, to implement the data association, a local optimization method is designed instead of global optimization. By introducing an adaptive operating strategy, the proposed IDDA method can also deal with situations in which vehicles suddenly appear or disappear. Finally, comprehensive experiments on the DARPA VIVID data set and KIT AIS data set demonstrate that the proposed algorithm can generate satisfactory and superior results.
Recognition of Indian Sign Language in Live Video
NASA Astrophysics Data System (ADS)
Singha, Joyeeta; Das, Karen
2013-05-01
Sign Language Recognition has emerged as one of the important areas of research in Computer Vision. The difficulty faced by researchers is that instances of signs vary in both motion and appearance. Thus, in this paper a novel approach for recognizing various alphabets of Indian Sign Language is proposed, in which continuous video sequences of the signs are considered. The proposed system comprises three stages: Preprocessing, Feature Extraction and Classification. The Preprocessing stage includes skin filtering and histogram matching. Eigenvalues and eigenvectors are used in the Feature Extraction stage, and finally an eigenvalue-weighted Euclidean distance is used to recognize the sign. The system deals with bare hands, thus allowing the user to interact with it in a natural way. We have considered 24 different alphabets in the video sequences and attained a success rate of 96.25%.
Variable disparity-motion estimation based fast three-view video coding
NASA Astrophysics Data System (ADS)
Bae, Kyung-Hoon; Kim, Seung-Cheol; Hwang, Yong Seok; Kim, Eun-Soo
2009-02-01
In this paper, variable disparity-motion estimation (VDME) based 3-view video coding is proposed. In the encoding, key-frame coding (KFC) based motion estimation and variable disparity estimation (VDE) are used for fast and effective three-view video encoding. The proposed algorithms enhance the performance of a 3-D video encoding/decoding system in terms of accuracy of disparity estimation and computational overhead. Experiments on the stereo sequences 'Pot Plant' and 'IVO' show that the proposed algorithm's PSNRs are 37.66 and 40.55 dB, and the processing times are 0.139 and 0.124 sec/frame, respectively.
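For readers unfamiliar with the PSNR figures quoted above, the following is a minimal sketch of the standard peak signal-to-noise ratio computation. It assumes 8-bit samples (peak value 255) given as flat sequences; the abstract does not state how the authors computed their values, and the function name `psnr` is hypothetical.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length
    sequences of pixel values (e.g. a flattened greyscale frame)."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


# A decoded frame that is off by one grey level at every pixel.
value_db = psnr([100, 100, 100, 100], [101, 101, 101, 101])
```

Higher values indicate a decoded frame closer to the reference; a per-sequence figure is typically an average over frames.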
Application of M-JPEG compression hardware to dynamic stimulus production.
Mulligan, J B
1997-01-01
Inexpensive circuit boards have appeared on the market which transform a normal micro-computer's disk drive into a video disk capable of playing extended video sequences in real time. This technology enables the performance of experiments which were previously impossible, or at least prohibitively expensive. The new technology achieves this capability using special-purpose hardware to compress and decompress individual video frames, enabling a video stream to be transferred over relatively low-bandwidth disk interfaces. This paper will describe the use of such devices for visual psychophysics and present the technical issues that must be considered when evaluating individual products.
Selecting salient frames for spatiotemporal video modeling and segmentation.
Song, Xiaomu; Fan, Guoliang
2007-12-01
We propose a new statistical generative model for spatiotemporal video segmentation. The objective is to partition a video sequence into homogeneous segments that can be used as "building blocks" for semantic video segmentation. The baseline framework is a Gaussian mixture model (GMM)-based video modeling approach that involves a six-dimensional spatiotemporal feature space. Specifically, we introduce the concept of frame saliency to quantify the relevancy of a video frame to the GMM-based spatiotemporal video modeling. This helps us use a small set of salient frames to facilitate the model training by reducing data redundancy and irrelevance. A modified expectation maximization algorithm is developed for simultaneous GMM training and frame saliency estimation, and the frames with the highest saliency values are extracted to refine the GMM estimation for video segmentation. Moreover, it is interesting to find that frame saliency can imply some object behaviors. This makes the proposed method also applicable to other frame-related video analysis tasks, such as key-frame extraction, video skimming, etc. Experiments on real videos demonstrate the effectiveness and efficiency of the proposed method.
A no-reference image and video visual quality metric based on machine learning
NASA Astrophysics Data System (ADS)
Frantc, Vladimir; Voronin, Viacheslav; Semenishchev, Evgenii; Minkin, Maxim; Delov, Aliy
2018-04-01
The paper presents a novel visual quality metric for quality assessment of lossy compressed video. A high degree of correlation with subjective estimates of quality is achieved by using a convolutional neural network trained on a large number of pairs of video sequences and subjective quality scores. We demonstrate how our predicted no-reference quality metric correlates with qualitative opinion in a human observer study. Results are shown on the EVVQ dataset in comparison with existing approaches.
NASA Astrophysics Data System (ADS)
Gohatre, Umakant Bhaskar; Patil, Venkat P.
2018-04-01
In computer vision applications, real-time detection and tracking of multiple objects is an important research field that has gained considerable attention in the last few years for finding non-stationary entities in image sequences. Object detection is a step toward following a moving object in video, and object representation is a step toward tracking it. Recognizing multiple objects in a video sequence is a challenging task. Image registration has long been used as a basis for detecting multiple moving objects: registration finds correspondences between consecutive frame pairs based on image appearance under rigid and affine transformations. However, image registration is not well suited to handling events that can result in missed objects. To address these problems, this paper proposes a novel approach. Video frames are segmented using a region adjacency graph of visual appearance and geometric properties. Matching is then performed between graph sequences using multi-graph matching, and the matched regions are labeled by a proposed graph coloring algorithm that assigns a foreground label to the respective regions. The proposed design is robust to unknown transformations and shows a significant improvement over existing work on real-time detection of multiple moving objects.
Film grain noise modeling in advanced video coding
NASA Astrophysics Data System (ADS)
Oh, Byung Tae; Kuo, C.-C. Jay; Sun, Shijun; Lei, Shawmin
2007-01-01
A new technique for film grain noise extraction, modeling and synthesis is proposed and applied to the coding of high definition video in this work. The film grain noise is viewed as a part of artistic presentation by people in the movie industry. On one hand, since the film grain noise can boost the natural appearance of pictures in high definition video, it should be preserved in high-fidelity video processing systems. On the other hand, video coding with film grain noise is expensive. It is desirable to extract film grain noise from the input video as a pre-processing step at the encoder and re-synthesize the film grain noise and add it back to the decoded video as a post-processing step at the decoder. Under this framework, the coding gain of the denoised video is higher while the quality of the final reconstructed video can still be well preserved. Following this idea, we present a method to remove film grain noise from image/video without distorting its original content. Besides, we describe a parametric model containing a small set of parameters to represent the extracted film grain noise. The proposed model generates the film grain noise that is close to the real one in terms of power spectral density and cross-channel spectral correlation. Experimental results are shown to demonstrate the efficiency of the proposed scheme.
Efficient video-equipped fire detection approach for automatic fire alarm systems
NASA Astrophysics Data System (ADS)
Kang, Myeongsu; Tung, Truong Xuan; Kim, Jong-Myon
2013-01-01
This paper proposes an efficient four-stage approach that automatically detects fire using video capabilities. In the first stage, an approximate median method is used to detect video frame regions involving motion. In the second stage, a fuzzy c-means-based clustering algorithm is employed to extract candidate regions of fire from all of the movement-containing regions. In the third stage, a gray level co-occurrence matrix is used to extract texture parameters by tracking red-colored objects in the candidate regions. These texture features are, subsequently, used as inputs of a back-propagation neural network to distinguish between fire and nonfire. Experimental results indicate that the proposed four-stage approach outperforms other fire detection algorithms in terms of consistently increasing the accuracy of fire detection in both indoor and outdoor test videos.
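The first stage above can be sketched as follows. This is a minimal illustration of the commonly described approximate median background model, in which each background pixel is nudged one step toward the current frame; the step size, threshold and function names are assumptions, and the paper's exact variant is not given in the abstract.

```python
def update_background(background, frame, step=1):
    """Move each background pixel one step toward the current frame.

    Over many frames the estimate drifts toward the temporal median
    of each pixel. Frames are flat lists of grey levels.
    """
    updated = []
    for b, p in zip(background, frame):
        if p > b:
            updated.append(b + step)
        elif p < b:
            updated.append(b - step)
        else:
            updated.append(b)
    return updated


def motion_mask(background, frame, threshold=25):
    """Flag pixels that differ from the background by more than threshold."""
    return [abs(p - b) > threshold for b, p in zip(background, frame)]


# Toy example: three pixels, the last one changes sharply.
background = [10, 10, 10]
frame = [12, 10, 60]
mask = motion_mask(background, frame)           # only the last pixel moves
background = update_background(background, frame)
```

Pixels flagged in the mask would form the movement-containing regions passed on to the later clustering and texture stages.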
Methods and new approaches to the calculation of physiological parameters by videodensitometry
NASA Technical Reports Server (NTRS)
Kedem, D.; Londstrom, D. P.; Rhea, T. C., Jr.; Nelson, J. H.; Price, R. R.; Smith, C. W.; Graham, T. P., Jr.; Brill, A. B.; Kedem, D.
1976-01-01
A complex system featuring a video-camera connected to a video disk, cine (medical motion picture) camera and PDP-9 computer with various input/output facilities has been developed. This system enables the performance of quantitative analysis of various functions recorded in clinical studies. Several studies are described, such as heart chamber volume calculations, left ventricle ejection fraction, blood flow through the lungs and also the possibility of obtaining information about blood flow and constrictions in small cross-section vessels.
Adaptive Motor Resistance Video Game Exercise Apparatus and Method of Use Thereof
NASA Technical Reports Server (NTRS)
Reich, Alton (Inventor); Shaw, James (Inventor)
2015-01-01
The invention comprises a method and/or an apparatus using computer configured exercise equipment and an electric motor provided physical resistance in conjunction with a game system, such as a video game system, where the exercise system provides real physical resistance to a user interface. Results of user interaction with the user interface are integrated into a video game, such as running on a game console. The resistance system comprises: a subject interface, software control, a controller, an electric servo assist/resist motor, an actuator, and/or a subject sensor. The system provides actual physical interaction with a resistance device as input to the game console and game run thereon.
Incremental Implicit Learning of Bundles of Statistical Patterns
Qian, Ting; Jaeger, T. Florian; Aslin, Richard N.
2016-01-01
Forming an accurate representation of a task environment often takes place incrementally as the information relevant to learning the representation only unfolds over time. This incremental nature of learning poses an important problem: it is usually unclear whether a sequence of stimuli consists of only a single pattern, or multiple patterns that are spliced together. In the former case, the learner can directly use each observed stimulus to continuously revise its representation of the task environment. In the latter case, however, the learner must first parse the sequence of stimuli into different bundles, so as to not conflate the multiple patterns. We created a video-game statistical learning paradigm and investigated 1) whether learners without prior knowledge of the existence of multiple “stimulus bundles” — subsequences of stimuli that define locally coherent statistical patterns — could detect their presence in the input, and 2) whether learners are capable of constructing a rich representation that encodes the various statistical patterns associated with bundles. By comparing human learning behavior to the predictions of three computational models, we find evidence that learners can handle both tasks successfully. In addition, we discuss the underlying reasons for why the learning of stimulus bundles occurs even when such behavior may seem irrational. PMID:27639552
Ultrafast learning in a hard-limited neural network pattern recognizer
NASA Astrophysics Data System (ADS)
Hu, Chia-Lun J.
1996-03-01
As we have published over the last five years, supervised learning in a hard-limited perceptron system can be accomplished in a noniterative manner if the input-output mapping to be learned satisfies a certain positive-linear-independency (PLI) condition. When this condition is satisfied (as it should be for most practical pattern recognition applications), the connection matrix required to meet this mapping can be obtained noniteratively in one step. Generally, there exist infinitely many solutions for the connection matrix when the PLI condition is satisfied. We can then select an optimum solution such that the recognition of any untrained patterns will become optimally robust in the recognition mode. The learning speed is very fast and close to real-time because the learning process is noniterative and one-step. This paper reports the theoretical analysis and the design of a practical character recognition system for recognizing hand-written alphabets. The experimental result is recorded in real-time on an unedited video tape for demonstration purposes. It is seen from this real-time movie that the recognition of the untrained hand-written alphabets is invariant to size, location, orientation, and writing sequence, even though the training is done with standard size, standard orientation, central location and standard writing sequence.
Vehicle speed detection based on gaussian mixture model using sequential of images
NASA Astrophysics Data System (ADS)
Setiyono, Budi; Ratna Sulistyaningrum, Dwi; Soetrisno; Fajriyah, Farah; Wahyu Wicaksono, Danang
2017-09-01
The Intelligent Transportation System is one of the important components in the development of smart cities. Detecting vehicle speed on the highway supports traffic engineering management. The purpose of this study is to detect the speed of moving vehicles using digital image processing. Our approach is as follows. The inputs are a sequence of frames, the frame rate (fps) and a region of interest (ROI). First, we separate foreground and background in each frame using a Gaussian Mixture Model (GMM). Then, in each frame, we calculate the location of the object and its centroid. Next, we determine the speed by computing the movement of the centroid across the sequence of frames. In the speed calculation, we only consider frames in which the centroid is inside the predefined ROI. Finally, we convert the pixel displacement per frame into a speed in km/hour. The system is validated by comparing the speed calculated manually with the speed obtained by the system. In software testing, the system detected vehicle speed with a highest accuracy of 97.52% and a lowest accuracy of 77.41%. Detection results on real road video footage are reported alongside the actual vehicle speeds.
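The last steps of the pipeline (centroid displacement inside the ROI, then conversion to km/hour) can be sketched as below. This is an illustration only: it assumes the centroids have already been extracted, and the calibration factor `metres_per_pixel`, the rectangular ROI format and the function name are all hypothetical, since the abstract does not say how pixels are mapped to distance.

```python
import math


def estimate_speed_kmh(centroids, fps, metres_per_pixel, roi):
    """Estimate speed from per-frame centroids of one vehicle.

    centroids: list of (x, y) pixel positions, one per consecutive frame.
    roi: (x_min, y_min, x_max, y_max); only frame pairs whose centroids
         both lie inside the ROI contribute.
    Returns km/h, or None if no usable frame pair exists.
    """
    x_min, y_min, x_max, y_max = roi

    def inside(point):
        return x_min <= point[0] <= x_max and y_min <= point[1] <= y_max

    total_pixels = 0.0
    intervals = 0
    for prev, curr in zip(centroids, centroids[1:]):
        if inside(prev) and inside(curr):
            total_pixels += math.hypot(curr[0] - prev[0], curr[1] - prev[1])
            intervals += 1
    if intervals == 0:
        return None
    metres_per_second = (total_pixels / intervals) * metres_per_pixel * fps
    return metres_per_second * 3.6  # m/s to km/h


# A vehicle moving 10 pixels per frame at 25 fps, with 0.05 m per pixel.
speed = estimate_speed_kmh([(0, 0), (10, 0), (20, 0), (30, 0)],
                           fps=25, metres_per_pixel=0.05,
                           roi=(0, 0, 100, 100))
```

A single scale factor is only reasonable when the camera looks roughly straight down on the ROI; an oblique view would need a perspective correction before measuring displacement.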
NASA Astrophysics Data System (ADS)
Grossman, Barry G.; Gonzalez, Frank S.; Blatt, Joel H.; Hooker, Jeffery A.
1992-03-01
The development of efficient high speed techniques to recognize, locate, and quantify damage is vitally important for successful automated inspection systems such as ones used for the inspection of undersea pipelines. Two critical problems must be solved to achieve these goals: the reduction of nonuseful information present in the video image and automatic recognition and quantification of extent and location of damage. Artificial neural network processed moire profilometry appears to be a promising technique to accomplish this. Real time video moire techniques have been developed which clearly distinguish damaged and undamaged areas on structures, thus reducing the amount of extraneous information input into an inspection system. Artificial neural networks have demonstrated advantages for image processing, since they can learn the desired response to a given input and are inherently fast when implemented in hardware due to their parallel computing architecture. Video moire images of pipes with dents of different depths were used to train a neural network, with the desired output being the location and severity of the damage. The system was then successfully tested with a second series of moire images. The techniques employed and the results obtained are discussed.
Exploring associations between gaze patterns and putative human mirror neuron system activity.
Donaldson, Peter H; Gurvich, Caroline; Fielding, Joanne; Enticott, Peter G
2015-01-01
The human mirror neuron system (MNS) is hypothesized to be crucial to social cognition. Given that key MNS-input regions such as the superior temporal sulcus are involved in biological motion processing, and mirror neuron activity in monkeys has been shown to vary with visual attention, aberrant MNS function may be partly attributable to atypical visual input. To examine the relationship between gaze pattern and interpersonal motor resonance (IMR; an index of putative MNS activity), healthy right-handed participants aged 18-40 (n = 26) viewed videos of transitive grasping actions or static hands, whilst the left primary motor cortex received transcranial magnetic stimulation. Motor-evoked potentials recorded in contralateral hand muscles were used to determine IMR. Participants also underwent eyetracking analysis to assess gaze patterns whilst viewing the same videos. No relationship was observed between predictive gaze and IMR. However, IMR was positively associated with fixation counts in areas of biological motion in the videos, and negatively associated with object areas. These findings are discussed with reference to visual influences on the MNS, and the possibility that MNS atypicalities might be influenced by visual processes such as aberrant gaze pattern.
Helping Video Games Rewire "Our Minds"
NASA Technical Reports Server (NTRS)
Pope, Alan T.; Palsson, Olafur S.
2001-01-01
Biofeedback-modulated video games are games that respond to physiological signals as well as mouse, joystick or game controller input; they embody the concept of improving physiological functioning by rewarding specific healthy body signals with success at playing a video game. The NASA patented biofeedback-modulated game method blends biofeedback into popular off-the-shelf video games in such a way that the games do not lose their entertainment value. This method uses physiological signals (e.g., electroencephalogram frequency band ratio) not simply to drive a biofeedback display directly, or periodically modify a task as in other systems, but to continuously modulate parameters (e.g., game character speed and mobility) of a game task in real time while the game task is being performed by other means (e.g., a game controller). Biofeedback-modulated video games represent a new generation of computer and video game environments that train valuable mental skills beyond eye-hand coordination. These psychophysiological training technologies are poised to exploit the revolution in interactive multimedia home entertainment for the personal improvement, not just the diversion, of the user.
Video Analysis in Cross-Cultural Environments and Methodological Issues
ERIC Educational Resources Information Center
Montandon, Christiane
2015-01-01
This paper addresses the use of videography combined with group interviews, as a way to better understand the informal learnings of 11-12 year old children in cross-cultural encounters during French-German school exchanges. The complete, consistent video data required the researchers to choose the most significant sequences to highlight the…
ERIC Educational Resources Information Center
Martin, James E.; And Others
1992-01-01
This study examined the effects of two indirect corrective feedback procedures (picture and video referencing involving instructor prompting) on the assembly skills of five secondary students with moderate mental retardation. Picture and video referencing conditions were more effective than assembly photographs, sequenced pictures, sequenced…
Self-expressive Dictionary Learning for Dynamic 3D Reconstruction.
Zheng, Enliang; Ji, Dinghuang; Dunn, Enrique; Frahm, Jan-Michael
2017-08-22
We target the problem of sparse 3D reconstruction of dynamic objects observed by multiple unsynchronized video cameras with unknown temporal overlap. To this end, we develop a framework to recover the unknown structure without sequencing information across video sequences. Our proposed compressed sensing framework poses the estimation of 3D structure as the problem of dictionary learning, where the dictionary is defined as an aggregation of the temporally varying 3D structures. Given the smooth motion of dynamic objects, we observe any element in the dictionary can be well approximated by a sparse linear combination of other elements in the same dictionary (i.e. self-expression). Our formulation optimizes a biconvex cost function that leverages a compressed sensing formulation and enforces both structural dependency coherence across video streams, as well as motion smoothness across estimates from common video sources. We further analyze the reconstructability of our approach under different capture scenarios, and its comparison and relation to existing methods. Experimental results on large amounts of synthetic data as well as real imagery demonstrate the effectiveness of our approach.
Objective assessment of MPEG-2 video quality
NASA Astrophysics Data System (ADS)
Gastaldo, Paolo; Zunino, Rodolfo; Rovetta, Stefano
2002-07-01
The increasing use of video compression standards in broadcasting television systems has required, in recent years, the development of video quality measurements that take into account artifacts specifically caused by digital compression techniques. In this paper we present a methodology for the objective quality assessment of MPEG video streams by using circular back-propagation feedforward neural networks. Mapping neural networks can render nonlinear relationships between objective features and subjective judgments, thus avoiding any simplifying assumption on the complexity of the model. The neural network processes an instantaneous set of input values, and yields an associated estimate of perceived quality. Therefore, the neural-network approach turns objective quality assessment into adaptive modeling of subjective perception. The objective features used for the estimate are chosen according to the assessed relevance to perceived quality and are continuously extracted in real time from compressed video streams. The overall system mimics perception but does not require any analytical model of the underlying physical phenomenon. The capability to process compressed video streams represents an important advantage over existing approaches, as avoiding the stream-decoding process greatly enhances real-time performance. Experimental results confirm that the system provides satisfactory, continuous-time approximations for actual scoring curves concerning real test videos.
Decontaminate feature for tracking: adaptive tracking via evolutionary feature subset
NASA Astrophysics Data System (ADS)
Liu, Qiaoyuan; Wang, Yuru; Yin, Minghao; Ren, Jinchang; Li, Ruizhi
2017-11-01
Although various visual tracking algorithms have been proposed in the last 2-3 decades, it remains a challenging problem for effective tracking with fast motion, deformation, occlusion, etc. Under complex tracking conditions, most tracking models are not discriminative and adaptive enough. When the combined feature vectors are inputted to the visual models, this may lead to redundancy causing low efficiency and ambiguity causing poor performance. An effective tracking algorithm is proposed to decontaminate features for each video sequence adaptively, where the visual modeling is treated as an optimization problem from the perspective of evolution. Every feature vector is compared to a biological individual and then decontaminated via classical evolutionary algorithms. With the optimized subsets of features, the "curse of dimensionality" has been avoided while the accuracy of the visual model has been improved. The proposed algorithm has been tested on several publicly available datasets with various tracking challenges and benchmarked with a number of state-of-the-art approaches. The comprehensive experiments have demonstrated the efficacy of the proposed methodology.
Thompson, Joseph J; McColeman, C M; Stepanova, Ekaterina R; Blair, Mark R
2017-04-01
Many theories of complex cognitive-motor skill learning are built on the notion that basic cognitive processes group actions into easy-to-perform sequences. The present work examines predictions derived from laboratory-based studies of motor chunking and motor preparation using data collected from the real-time strategy video game StarCraft 2. We examined 996,163 action sequences in the telemetry data of 3,317 players across seven levels of skill. As predicted, the latency to the first action (thought to be the beginning of a chunked sequence) is delayed relative to the other actions in the group. Other predictions, inspired by the memory drum theory of Henry and Rogers, received only weak support. Copyright © 2017 Cognitive Science Society, Inc.
Estimation of velocities via optical flow
NASA Astrophysics Data System (ADS)
Popov, A.; Miller, A.; Miller, B.; Stepanyan, K.
2017-02-01
This article presents an approach to using optical flow (OF) as a general navigation means that provides information about a vehicle's linear and angular velocities. The term "OF" comes from opto-electronic devices, where it corresponds to a video sequence of images related to the camera motion over either static surfaces or a set of objects. Even if the positions of these objects are unknown in advance, one can estimate the camera motion from the video sequence itself and some metric information, such as the distance between the objects or the range to the surface. This approach is applicable to any passive observation system that is able to produce a sequence of images, such as radio locator or sonar. Here the UAV application of the OF is considered since it is historically
Chess-playing epilepsy: a case report with video-EEG and back averaging.
Mann, M W; Gueguen, B; Guillou, S; Debrand, E; Soufflet, C
2004-12-01
A patient suffering from juvenile myoclonic epilepsy experienced myoclonic jerks, fairly regularly, while playing chess. The myoclonus appeared particularly when he had to plan his strategy, to choose between two solutions or while raising the arm to move a chess figure. Video-EEG-polygraphy was performed, with back averaging of the myoclonus registered during a chess match and during neuropsychological testing with Kohs cubes. The EEG spike wave complexes were localised in the fronto-central region. [Published with video sequences].
Video enhancement method with color-protection post-processing
NASA Astrophysics Data System (ADS)
Kim, Youn Jin; Kwak, Youngshin
2015-01-01
The current study is aimed to propose a post-processing method for video enhancement by adopting a color-protection technique. The color-protection intends to attenuate perceptible artifacts due to over-enhancements in visually sensitive image regions such as low-chroma colors, including skin and gray objects. In addition, reducing the loss in color texture caused by the out-of-color-gamut signals is also taken into account. Consequently, color reproducibility of video sequences could be remarkably enhanced while the undesirable visual exaggerations are minimized.
Tan, M L H; Kok, K; Ganesh, V; Thomas, S S
2014-02-01
Breast cancer patients' expectations and choice of reconstruction are increasing, and patients often satisfy their information needs outside clinic time by searching the world wide web. The aim of our study was to analyse the quality of content and extent of information regarding breast reconstruction available in YouTube videos and whether this is an appropriate additional source of information for patients. A snapshot qualitative and quantitative analysis of the first 100 videos was performed after the term 'breast reconstruction' was input into the search window of the video sharing website www.youtube.com on the 1st of September 2011. Qualitative categorical analysis included patient, oncological and reconstruction factors. It was concluded that although videos uploaded onto YouTube do not provide comprehensive information, YouTube is a useful resource that can be utilised in patient education provided comprehensive and validated videos are made available. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Reiber, J. H. C.
1976-01-01
To automate the data acquisition procedure, a real-time contour detection and data acquisition system for the left ventricular outline was developed using video techniques. The X-ray image of the contrast-filled left ventricle is stored for subsequent processing on film (cineangiogram), video tape or disc. The cineangiogram is converted into video format using a television camera. The video signal from either the TV camera, video tape or disc is the input signal to the system. The contour detection is based on a dynamic thresholding technique. Since the left ventricular outline is a smooth continuous function, for each contour side a narrow expectation window is defined in which the next borderpoint will be detected. A computer interface was designed and built for the online acquisition of the coordinates using a PDP-12 computer. The advantage of this system over other available systems is its potential for online, real-time acquisition of the left ventricular size and shape during angiocardiography.
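The windowed border search described in this abstract can be illustrated with a minimal sketch. It assumes each video line is a list of gray levels, that the "dynamic" threshold is the midpoint of the local minimum and maximum inside the expectation window, and that the function names and window half-width are purely illustrative; none of these details come from the original 1976 system.

```python
def detect_border(line, expected, half_width):
    """Find the border point on one video line, searching only inside a
    narrow expectation window centred on the previously detected point."""
    lo = max(0, expected - half_width)
    hi = min(len(line), expected + half_width + 1)
    window = line[lo:hi]
    if not window or min(window) == max(window):
        return expected  # no edge inside the window: keep the prediction
    # Dynamic threshold (assumption): midpoint of the local gray-level range.
    threshold = (min(window) + max(window)) / 2
    for offset, value in enumerate(window):
        if value >= threshold:
            return lo + offset
    return expected


def trace_contour(image, start, half_width=3):
    """Follow one contour side down the image, line by line, using each
    detected point as the centre of the next expectation window."""
    border = []
    expected = start
    for line in image:
        expected = detect_border(line, expected, half_width)
        border.append(expected)
    return border
```

Because the outline is assumed smooth, each search is restricted to a few pixels around the previous border point, which keeps the per-line work small.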
NASA Technical Reports Server (NTRS)
McCarty, Kaley Corinne
2013-01-01
One of the projects that I am completing this summer is a Launch Services Program intern 'How to' set up a clean room informational video. The purpose of this video is to go along with a clean room kit that can be checked out by employees at the Kennedy Space Center and to be taken to classrooms to help educate students and intrigue them about NASA. The video will include 'how to' set up and operate a clean room at NASA. This is a group project so we will be acting as a team and contributing our own input and ideas. We will include various activities for children in classrooms to complete, while learning and having fun. Activities that we will explain and film include: helping children understand the proper way to wear a bunny suit, a brief background on cleanrooms, and the importance of maintaining the cleanliness of a space craft. This project will be shown to LSP management and co-workers; we will be presenting the video once it is completed.
About subjective evaluation of adaptive video streaming
NASA Astrophysics Data System (ADS)
Tavakoli, Samira; Brunnström, Kjell; Garcia, Narciso
2015-03-01
The usage of HTTP Adaptive Streaming (HAS) technology by content providers is increasing rapidly. With the video content available in multiple qualities, HAS makes it possible to adapt the quality of the downloaded video to the current network conditions, providing smooth video playback. However, the time-varying video quality by itself introduces a new type of impairment. The quality adaptation can be done in different ways. In order to find the best adaptation strategy for maximizing users' perceptual quality, it is necessary to investigate the subjective perception of adaptation-related impairments. However, the novelty of these impairments and their comparably long duration make most of the standardized assessment methodologies less suited for studying HAS degradation. Furthermore, in traditional testing methodologies, the quality of the video in audiovisual services is often evaluated separately and not in the presence of audio. The requirement of jointly evaluating the audio and the video within a subjective test nevertheless remains a relatively under-explored research field. In this work, we address the research question of determining the appropriate assessment methodology to evaluate sequences with time-varying quality due to the adaptation. This was done by studying the influence of different adaptation-related parameters through two different subjective experiments using a methodology developed to evaluate long test sequences. In order to study the impact of audio presence on quality assessment by the test subjects, one of the experiments was done in the presence of audio stimuli. The experimental results were subsequently compared with another experiment using the standardized single stimulus Absolute Category Rating (ACR) methodology.
Weinstein, Ronald S; López, Ana Mariá; Barker, Gail P; Krupinski, Elizabeth A; Beinar, Sandra J; Major, Janet; Skinner, Tracy; Holcomb, Michael J; McNeely, Richard A
2007-10-01
The Institute for Advanced Telemedicine and Telehealth (i.e., T-Health Institute), a division of the state-wide Arizona Telemedicine Program (ATP), specializes in the creation of innovative health care education programs. This paper describes a first-of-a-kind video amphitheater specifically designed to promote communication within heterogeneous student groups training in the various health care professions. The amphitheater has an audio-video system that facilitates the assembly of ad hoc "in-the-room" electronic interdisciplinary student groups. Off-site faculty members and students can be inserted into groups by video conferencing. When fully implemented, every student will have a personal video camera trained on them, a head phone/microphone, and a personal voice channel. A command and control system will manage the video inputs of the individual participant's head-and-shoulder video images. An audio mixer will manage the separate voice channels of the individual participants and mix them into individual group-specific voice channels for use by the groups' participants. The audio-video system facilitates the easy reconfiguration of the interprofessional electronic groups, viewed on the video wall, without the individual participants in the electronic groups leaving their seats. The amphitheater will serve as a classroom as well as a unique education research laboratory.
[An fMRI study on brain activation patterns of males and females during video sexual stimulation].
Yang, Bo; Zhang, Jin-shan; Wang, Tao; Zhou, Yi-cheng; Liu, Ji-hong; Ma, Lin
2007-08-01
To investigate the difference in the brain activation patterns of males and females during video sexual stimulation by functional magnetic resonance imaging (fMRI). The participants were 20 adult males and 20 adult females, all healthy, right-handed, and with no history of sexual function disorder or of physical, psychiatric or neurological diseases. Blood-oxygen-level-dependent fMRI was performed using a 1.5 T MR scanner. Three-dimensional anatomical images of the entire brain were obtained by using a T1-weighted three-dimensional spoiled gradient echo pulse sequence. Each person was shown neutral and erotic video sequences for 60 s each in a block-study fashion, i.e. neutral scenes--erotic scenes--neutral scenes, and so on. The total scanning time was approximately 7 minutes, with a 12 s interval between two subsequent video sequences in order to avoid any overlapping between erotic and neutral information. The video sexual stimulation produced different results in the men and women. The females showed activation both in the left and the right amygdala, greater in the former than in the latter ([220.52 +/- 17.09] mm3 vs. [155.45 +/- 18.34] mm3, P < 0.05), but in the males only the left amygdala was activated. The males showed greater brain activation than the females in the left anterior cingulate gyrus ([420.75 +/- 19.37] mm3 vs. [310.67 +/- 10.53] mm3, P < 0.05), but less than the females in the splenium of the corpus callosum ([363.32 +/- 13.30] mm3 vs. [473.45 +/- 14.92] mm3, P < 0.01). Brain activation patterns of males and females during video sexual stimulation are different, which presumably reflects differences in both the structure and function of the brain between men and women.
3D video coding: an overview of present and upcoming standards
NASA Astrophysics Data System (ADS)
Merkle, Philipp; Müller, Karsten; Wiegand, Thomas
2010-07-01
An overview of existing and upcoming 3D video coding standards is given. Various 3D video formats are available, each with individual pros and cons. The 3D video formats can be separated into two classes: video-only formats (such as stereo and multiview video) and depth-enhanced formats (such as video plus depth and multiview video plus depth). Since all these formats consist of at least two video sequences and possibly additional depth data, efficient compression is essential for the success of 3D video applications and technologies. For the video-only formats the H.264 family of coding standards already provides efficient and widely established compression algorithms: H.264/AVC simulcast, H.264/AVC stereo SEI message, and H.264/MVC. For the depth-enhanced formats standardized coding algorithms are currently being developed. New and specially adapted coding approaches are necessary, as the depth or disparity information included in these formats has significantly different characteristics than video and is not displayed directly, but used for rendering. Motivated by evolving market needs, MPEG has started an activity to develop a generic 3D video standard within the 3DVC ad-hoc group. Key features of the standard are efficient and flexible compression of depth-enhanced 3D video representations and decoupling of content creation and display requirements.
Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video
Lee, Gil-beom; Lee, Myeong-jin; Lee, Woo-Kyung; Park, Joo-heon; Kim, Tae-Hwan
2017-01-01
Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object’s vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos. PMID:28327515
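As a rough illustration of partitioning a foreground object by its vertical histogram, the sketch below counts foreground pixels per column of a binary mask and splits the object wherever the count drops below a minimum. The splitting rule, the `min_count` parameter and the function names are assumptions made for illustration; the abstract does not specify them, and the light-source orientation check is not modelled here.

```python
def vertical_histogram(mask):
    """Number of foreground pixels (1s) in each column of a binary mask."""
    return [sum(column) for column in zip(*mask)]


def split_by_histogram(mask, min_count=1):
    """Return (first_column, last_column) pairs for runs of columns whose
    foreground count is at least min_count."""
    hist = vertical_histogram(mask)
    segments = []
    start = None
    for x, count in enumerate(hist):
        if count >= min_count and start is None:
            start = x                      # a new segment begins
        elif count < min_count and start is not None:
            segments.append((start, x - 1))  # the segment just ended
            start = None
    if start is not None:
        segments.append((start, len(hist) - 1))
    return segments
```

Each resulting segment could then be tested separately, for example to decide whether it is an object part or a shadow part.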
A deep learning pipeline for Indian dance style classification
NASA Astrophysics Data System (ADS)
Dewan, Swati; Agarwal, Shubham; Singh, Navjyoti
2018-04-01
In this paper, we address the problem of dance style classification to classify Indian dance or any dance in general. We propose a 3-step deep learning pipeline. First, we extract 14 essential joint locations of the dancer from each video frame. This lets us derive the location of any body region within the frame, which we use in the second step, the main part of our pipeline. Here, we divide the dancer into regions of important motion in each video frame. We then extract patches centered at these regions. The main discriminative motion is captured in these patches. We stack the features from all such patches of a frame into a single vector and form our hierarchical dance pose descriptor. Finally, in the third step, we build a high-level representation of the dance video using the hierarchical descriptors and train a Recurrent Neural Network (RNN) on it for classification. Our novelty also lies in the way we use multiple representations for a single video. This helps us to: (1) Overcome the RNN limitation of learning small sequences over big sequences such as dance; (2) Extract more data from the available dataset for effective deep learning by training multiple representations. Our contributions in this paper are threefold: (1) We provide a deep learning pipeline for classification of any form of dance; (2) We show that a segmented representation of a dance video works well with sequence learning techniques for recognition purposes; (3) We extend and refine the ICD dataset and provide a new dataset for evaluation of dance. Our model performs comparably to, or in some cases better than, the state of the art on action recognition benchmarks.
NASA Astrophysics Data System (ADS)
Adedayo, Bada; Wang, Qi; Alcaraz Calero, Jose M.; Grecos, Christos
2015-02-01
The recent explosion in video-related Internet traffic has been driven by the widespread use of smart mobile devices, particularly smartphones with advanced cameras that are able to record high-quality videos. Although many of these devices offer the facility to record videos at different spatial and temporal resolutions, primarily with local storage considerations in mind, most users only ever use the highest quality settings. The vast majority of these devices are optimised for compressing the acquired video using a single built-in codec and have neither the computational resources nor battery reserves to transcode the video to alternative formats. This paper proposes a new low-complexity dynamic resource allocation engine for cloud-based video transcoding services that are both scalable and capable of being delivered in real-time. Firstly, through extensive experimentation, we establish resource requirement benchmarks for a wide range of transcoding tasks. The set of tasks investigated covers the most widely used input formats (encoder type, resolution, amount of motion and frame rate) associated with mobile devices and the most popular output formats derived from a comprehensive set of use cases, e.g. a mobile news reporter directly transmitting videos to the TV audience of various video format requirements, with minimal usage of resources both at the reporter's end and at the cloud infrastructure end for transcoding services.
Semi-automated camera trap image processing for the detection of ungulate fence crossing events.
Janzen, Michael; Visser, Kaitlyn; Visscher, Darcy; MacLeod, Ian; Vujnovic, Dragomir; Vujnovic, Ksenija
2017-09-27
Remote cameras are an increasingly important tool for ecological research. While remote camera traps collect field data with minimal human attention, the images they collect require post-processing and characterization before they can be ecologically and statistically analyzed, requiring the input of substantial time and money from researchers. The need for post-processing is due, in part, to a high incidence of non-target images. We developed a stand-alone semi-automated computer program to aid in image processing, categorization, and data reduction by employing background subtraction and histogram rules. Unlike previous work that uses video as input, our program uses still camera trap images. The program was developed for an ungulate fence crossing project and tested against an image dataset which had been previously processed by a human operator. Our program placed images into categories representing the confidence of a particular sequence of images containing a fence crossing event. This resulted in a reduction of 54.8% of images that required further human operator characterization while retaining 72.6% of the known fence crossing events. This program can provide researchers using remote camera data the ability to reduce the time and cost required for image post-processing and characterization. Further, we discuss how this procedure might be generalized to situations not specifically related to animal use of linear features.
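A minimal sketch of how background subtraction combined with a histogram rule might sort still images into confidence categories follows. The bin count, the two cut-off fractions and the category labels are hypothetical choices for illustration and are not taken from the program described above.

```python
def diff_histogram(image, background, bins=8, max_level=256):
    """Histogram of absolute gray-level differences between an image and
    a background image of the same size (both are lists of pixel rows)."""
    hist = [0] * bins
    width = max_level / bins
    for row_img, row_bg in zip(image, background):
        for pixel, bg in zip(row_img, row_bg):
            index = min(bins - 1, int(abs(pixel - bg) / width))
            hist[index] += 1
    return hist


def categorize(image, background, low=0.02, high=0.20):
    """Histogram rule (assumed): the share of pixels outside the lowest
    difference bin decides the confidence category."""
    hist = diff_histogram(image, background)
    total = sum(hist)
    if total == 0:
        return "uncertain"
    changed = sum(hist[1:]) / total
    if changed < low:
        return "likely empty"
    if changed < high:
        return "uncertain"
    return "likely event"
```

Only images in the uncertain or likely-event categories would then need to be passed to a human operator.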
Visual Perceptual Echo Reflects Learning of Regularities in Rapid Luminance Sequences.
Chang, Acer Y-C; Schwartzman, David J; VanRullen, Rufin; Kanai, Ryota; Seth, Anil K
2017-08-30
A novel neural signature of active visual processing has recently been described in the form of the "perceptual echo", in which the cross-correlation between a sequence of randomly fluctuating luminance values and occipital electrophysiological signals exhibits a long-lasting periodic (∼100 ms cycle) reverberation of the input stimulus (VanRullen and Macdonald, 2012). As yet, however, the mechanisms underlying the perceptual echo and its function remain unknown. Reasoning that natural visual signals often contain temporally predictable, though nonperiodic features, we hypothesized that the perceptual echo may reflect a periodic process associated with regularity learning. To test this hypothesis, we presented subjects with successive repetitions of a rapid nonperiodic luminance sequence, and examined the effects on the perceptual echo, finding that echo amplitude linearly increased with the number of presentations of a given luminance sequence. These data suggest that the perceptual echo reflects a neural signature of regularity learning. Furthermore, when a set of repeated sequences was followed by a sequence with inverted luminance polarities, the echo amplitude decreased to the same level evoked by a novel stimulus sequence. Crucially, when the original stimulus sequence was re-presented, the echo amplitude returned to a level consistent with the number of presentations of this sequence, indicating that the visual system retained sequence-specific information, for many seconds, even in the presence of intervening visual input. Altogether, our results reveal a previously undiscovered regularity learning mechanism within the human visual system, reflected by the perceptual echo. SIGNIFICANCE STATEMENT How the brain encodes and learns fast-changing but nonperiodic visual input remains unknown, even though such visual input characterizes natural scenes. We investigated whether the phenomenon of "perceptual echo" might index such learning.
The perceptual echo is a long-lasting reverberation between a rapidly changing visual input and evoked neural activity, apparent in cross-correlations between occipital EEG and stimulus sequences, peaking in the alpha (∼10 Hz) range. We indeed found that perceptual echo is enhanced by repeatedly presenting the same visual sequence, indicating that the human visual system can rapidly and automatically learn regularities embedded within fast-changing dynamic sequences. These results point to a previously undiscovered regularity learning mechanism, operating at a rate defined by the alpha frequency. Copyright © 2017 the authors 0270-6474/17/378486-12$15.00/0.
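The cross-correlation on which the perceptual echo is measured can be sketched as a lagged Pearson correlation between a stimulus sequence and a recorded response. This is only an illustration of the general technique, assuming two equal-length lists sampled at the same rate; the function name and the choice of non-negative lags are assumptions, and the study's actual EEG processing is not reproduced here.

```python
import math


def cross_correlation(stimulus, response, max_lag):
    """Pearson correlation between stimulus[t] and response[t + lag]
    for every lag from 0 to max_lag (inclusive)."""
    result = []
    for lag in range(max_lag + 1):
        x = stimulus[:len(stimulus) - lag]
        y = response[lag:]
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        var_x = sum((a - mean_x) ** 2 for a in x)
        var_y = sum((b - mean_y) ** 2 for b in y)
        if var_x > 0 and var_y > 0:
            result.append(cov / math.sqrt(var_x * var_y))
        else:
            result.append(0.0)  # a constant segment has no defined correlation
    return result
```

Plotting the returned values against lag would show at which delays the response follows the stimulus; a periodic pattern across lags is what the abstract calls an echo.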
Reading your own lips: common-coding theory and visual speech perception.
Tye-Murray, Nancy; Spehar, Brent P; Myerson, Joel; Hale, Sandra; Sommers, Mitchell S
2013-02-01
Common-coding theory posits that (1) perceiving an action activates the same representations of motor plans that are activated by actually performing that action, and (2) because of individual differences in the ways that actions are performed, observing recordings of one's own previous behavior activates motor plans to an even greater degree than does observing someone else's behavior. We hypothesized that if observing oneself activates motor plans to a greater degree than does observing others, and if these activated plans contribute to perception, then people should be able to lipread silent video clips of their own previous utterances more accurately than they can lipread video clips of other talkers. As predicted, two groups of participants were able to lipread video clips of themselves, recorded more than two weeks earlier, significantly more accurately than video clips of others. These results suggest that visual input activates speech motor activity that links to word representations in the mental lexicon.
ATLAS Live: Collaborative Information Streams
NASA Astrophysics Data System (ADS)
Goldfarb, Steven; ATLAS Collaboration
2011-12-01
I report on a pilot project launched in 2010 focusing on facilitating communication and information exchange within the ATLAS Collaboration, through the combination of digital signage software and webcasting. The project, called ATLAS Live, implements video streams of information, ranging from detailed detector and data status to educational and outreach material. The content, including text, images, video and audio, is collected, visualised and scheduled using digital signage software. The system is robust and flexible, utilizing scripts to input data from remote sources, such as the CERN Document Server, Indico, or any available URL, and to integrate these sources into professional-quality streams, including text scrolling, transition effects, inter and intra-screen divisibility. Information is published via the encoding and webcasting of standard video streams, viewable on all common platforms, using a web browser or other common video tool. Authorisation is enforced at the level of the streaming and at the web portals, using the CERN SSO system.
Chaos based video encryption using maps and Ikeda time delay system
NASA Astrophysics Data System (ADS)
Valli, D.; Ganesan, K.
2017-12-01
Chaos based cryptosystems offer an efficient route to fast and highly secure multimedia encryption because of their elegant features, such as randomness, mixing, ergodicity, and sensitivity to initial conditions and control parameters. In this paper, two chaos based cryptosystems are proposed: one is based on a higher-dimensional 12D chaotic map and the other on the Ikeda delay differential equation (DDE), both suitable for designing a real-time secure symmetric video encryption scheme. These encryption schemes employ a substitution box (S-box) to diffuse the relationship between pixels of the plain video and the cipher video, together with diffusion of the current input pixel with the previous cipher pixel, known as cipher block chaining (CBC). The proposed method enhances the robustness against statistical, differential and chosen/known plain text attacks. Detailed analysis is carried out in this paper to demonstrate the security and uniqueness of the proposed scheme.
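The chaining idea mentioned above, where each cipher pixel depends on the previous cipher pixel, can be shown with a toy sketch. It deliberately swaps in a one-dimensional logistic map for the 12D map and the Ikeda DDE, omits the S-box, and uses made-up parameter values (`x0`, `r`, `iv`); it is an illustration of CBC-style diffusion only and is not secure.

```python
def logistic_keystream(x0, r, n, burn_in=100):
    """Generate n key bytes from the logistic map x -> r*x*(1-x).
    The logistic map is a stand-in here, not the paper's chaotic system."""
    x = x0
    for _ in range(burn_in):      # discard the transient
        x = r * x * (1 - x)
    stream = []
    for _ in range(n):
        x = r * x * (1 - x)
        stream.append(int(x * 256) % 256)
    return stream


def encrypt(pixels, x0=0.3141, r=3.99, iv=0x5A):
    """XOR each pixel with a key byte and with the previous cipher pixel."""
    keys = logistic_keystream(x0, r, len(pixels))
    cipher = []
    previous = iv
    for pixel, key in zip(pixels, keys):
        c = pixel ^ key ^ previous
        cipher.append(c)
        previous = c              # chaining: feeds into the next pixel
    return cipher


def decrypt(cipher, x0=0.3141, r=3.99, iv=0x5A):
    """Invert encrypt() using the same key parameters and IV."""
    keys = logistic_keystream(x0, r, len(cipher))
    plain = []
    previous = iv
    for c, key in zip(cipher, keys):
        plain.append(c ^ key ^ previous)
        previous = c
    return plain
```

Because of the chaining, changing one plain pixel changes that cipher pixel and every cipher pixel after it, which is the diffusion property the abstract attributes to CBC.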
Prinz, A; Bolz, M; Findl, O
2005-11-01
Owing to the complex topographical aspects of ophthalmic surgery, teaching with conventional surgical videos has led to a poor understanding among medical students. A novel multimedia three dimensional (3D) computer animated program, called "Ophthalmic Operation Vienna", has been developed, in which surgical videos are accompanied by 3D animated sequences of all surgical steps for five operations. The aim of the study was to assess the effect of 3D animations on the understanding of cataract and glaucoma surgery among medical students. Set in the Medical University of Vienna, Department of Ophthalmology, 172 students were randomised into two groups: a 3D group (n=90), which saw the 3D animations and video sequences, and a control group (n=82), which saw only the surgical videos. The narrated text was identical for both groups. After the presentation, students were questioned and tested using multiple choice questions. Students in the 3D group found the interactive multimedia teaching methods to be a valuable supplement to the conventional surgical videos. The 3D group outperformed the control group not only in topographical understanding by 16% (p<0.0001), but also in theoretical understanding by 7% (p<0.003). Women in the 3D group gained most, by 19% over the control group (p<0.0001). The use of 3D animations led to a better understanding of difficult surgical topics among medical students, especially for female users. Gender related benefits of using multimedia should be further explored.
NASA Astrophysics Data System (ADS)
Cicala, L.; Angelino, C. V.; Ruatta, G.; Baccaglini, E.; Raimondo, N.
2015-08-01
Unmanned Aerial Vehicles (UAVs) are often employed to collect high resolution images in order to perform image mosaicking and/or 3D reconstruction. Images are usually stored on board and then processed with on-ground desktop software. In this way the computational load, and hence the power consumption, is moved to the ground, leaving on board only the task of storing data. Such an approach is important in the case of small multi-rotorcraft UAVs because of their low endurance due to the short battery life. Images can be stored on board with either still image or video data compression. Still image systems are preferred when low frame rates are involved, because video coding systems are based on motion estimation and compensation algorithms which fail when the motion vectors are significantly long and when the overlap between subsequent frames is very small. In this scenario, UAV attitude and position metadata from the Inertial Navigation System (INS) can be employed to estimate global motion parameters without video analysis. A low complexity image analysis can still be performed in order to refine the motion field estimated using only the metadata. In this work, we propose to use this refinement step to improve the position and attitude estimation produced by the navigation system in order to maximize the encoder performance. Experiments are performed on both simulated and real world video sequences.
Image quality assessment for video stream recognition systems
NASA Astrophysics Data System (ADS)
Chernov, Timofey S.; Razumnuy, Nikita P.; Kozharinov, Alexander S.; Nikolaev, Dmitry P.; Arlazarov, Vladimir V.
2018-04-01
Recognition and machine vision systems have long been widely used in many disciplines to automate various processes of life and industry. Input images of optical recognition systems can be subjected to a large number of different distortions, especially in uncontrolled or natural shooting conditions, which leads to unpredictable results of recognition systems, making it impossible to assess their reliability. For this reason, it is necessary to perform quality control of the input data of recognition systems, which is facilitated by modern progress in the field of image quality evaluation. In this paper, we investigate the approach to designing optical recognition systems with built-in input image quality estimation modules and feedback, for which the necessary definitions are introduced and a model for describing such systems is constructed. The efficiency of this approach is illustrated by the example of solving the problem of selecting the best frames for recognition in a video stream for a system with limited resources. Experimental results are presented for the system for identity documents recognition, showing a significant increase in the accuracy and speed of the system under simulated conditions of automatic camera focusing, leading to blurring of frames.
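Selecting the best frames from a video stream can be sketched with a simple sharpness score. The abstract does not say which quality measure the authors use; the variance of a 4-neighbour Laplacian below is a commonly used focus measure chosen here as an assumption, and the function names are illustrative.

```python
def laplacian_variance(frame):
    """Focus score for a grayscale frame (list of pixel rows): variance of
    the 4-neighbour Laplacian over interior pixels. Blurred frames score low."""
    height, width = len(frame), len(frame[0])
    values = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            lap = (frame[y - 1][x] + frame[y + 1][x]
                   + frame[y][x - 1] + frame[y][x + 1]
                   - 4 * frame[y][x])
            values.append(lap)
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def select_best_frames(frames, k):
    """Indices of the k sharpest frames, returned in temporal order."""
    ranked = sorted(range(len(frames)),
                    key=lambda i: laplacian_variance(frames[i]),
                    reverse=True)
    return sorted(ranked[:k])
```

A resource-limited system could run the recognizer only on the selected indices and skip frames blurred by camera refocusing.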
Yakubova, Gulnoza; Hughes, Elizabeth M; Shinaberry, Megan
2016-07-01
The purpose of this study was to determine the effectiveness of a video modeling intervention with concrete-representational-abstract instructional sequence in teaching mathematics concepts to students with autism spectrum disorder (ASD). A multiple baseline across skills design of single-case experimental methodology was used to determine the effectiveness of the intervention on the acquisition and maintenance of addition, subtraction, and number comparison skills for four elementary school students with ASD. Findings supported the effectiveness of the intervention in improving skill acquisition and maintenance at a 3-week follow-up. Implications for practice and future research are discussed.
Qin, Lei; Snoussi, Hichem; Abdallah, Fahed
2014-01-01
We propose a novel approach for tracking an arbitrary object in video sequences for visual surveillance. The first contribution of this work is an automatic feature extraction method that is able to extract compact discriminative features from a feature pool before computing the region covariance descriptor. As the feature extraction method is adaptive to a specific object of interest, we refer to the region covariance descriptor computed using the extracted features as the adaptive covariance descriptor. The second contribution is a weakly supervised method for updating the object appearance model during tracking. The method performs a mean-shift clustering procedure among the tracking result samples accumulated over a period of time and selects a group of reliable samples for updating the object appearance model. As such, the object appearance model is kept up-to-date and is protected from contamination even in the case of tracking mistakes. We conducted comparative experiments on real-world video sequences, which confirmed the effectiveness of the proposed approaches. The tracking system that integrates the adaptive covariance descriptor and the clustering-based model updating method accomplished stable object tracking on challenging video sequences. PMID:24865883
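For readers unfamiliar with the region covariance descriptor, the sketch below computes it as the sample covariance matrix of per-pixel feature vectors inside a region. The fixed feature set (x, y, intensity) is an assumption for illustration; the adaptive feature extraction that the abstract describes is not modelled.

```python
def pixel_features(patch):
    """One feature vector [x, y, intensity] per pixel of a grayscale patch.
    This fixed feature set is an illustrative assumption."""
    features = []
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            features.append([float(x), float(y), float(value)])
    return features


def region_covariance(features):
    """Sample covariance matrix (d x d) of a list of d-dimensional vectors.
    Requires at least two feature vectors."""
    n = len(features)
    d = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j])
    return [[c / (n - 1) for c in row] for row in cov]
```

The descriptor has a fixed size of d x d regardless of the region size, which is one reason it is convenient for comparing candidate regions during tracking.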
Tracking Algorithm of Multiple Pedestrians Based on Particle Filters in Video Sequences
Liu, Yun; Wang, Chuanxu; Zhang, Shujun; Cui, Xuehong
2016-01-01
Pedestrian tracking is a critical problem in the field of computer vision. Particle filters have proven very useful in pedestrian tracking for nonlinear and non-Gaussian estimation problems. However, pedestrian tracking in complex environments still faces many problems due to changes in pedestrian posture and scale, moving backgrounds, mutual occlusion, and the presence of pedestrians. To surmount these difficulties, this paper presents a tracking algorithm for multiple pedestrians based on particle filters in video sequences. The algorithm acquires confidence values of the object and the background by extracting a priori knowledge, and thus achieves multipedestrian detection; it incorporates color and texture features into the particle filter to get better observation results and then automatically adjusts the weight of each feature according to the current tracking environment. During tracking, the algorithm handles severe occlusion to prevent the drift and loss caused by object occlusion, and associates detection results with particle states to provide a method for discriminating object disappearance and emergence, thus achieving robust tracking of multiple pedestrians. Experimental verification and analysis on video sequences demonstrate that the proposed algorithm improves tracking performance and produces better tracking results. PMID:27847514
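The predict, weight and resample cycle at the core of a particle filter can be sketched in one dimension. This is a generic bootstrap filter under assumed Gaussian motion and measurement noise, with hypothetical parameter names; the colour and texture feature fusion, occlusion handling and multi-target association described above are not included.

```python
import math
import random


def particle_filter_step(particles, measurement, velocity=1.0,
                         motion_std=0.5, meas_std=1.0, rng=random):
    """One cycle of a 1D bootstrap particle filter.
    Returns the resampled particles and the weighted position estimate."""
    # Predict: move every particle by the assumed velocity plus noise.
    predicted = [p + velocity + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))
    estimate = sum(w * p for w, p in zip(weights, predicted)) / total
    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate
```

In a tracker along the lines of the abstract, the single Gaussian likelihood would be replaced by a combination of feature likelihoods whose relative weights are adapted over time.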
De Ley, Paul; De Ley, Irma Tandingan; Morris, Krystalynne; Abebe, Eyualem; Mundo-Ocampo, Manuel; Yoder, Melissa; Heras, Joseph; Waumann, Dora; Rocha-Olivares, Axayácatl; Jay Burr, A.H; Baldwin, James G; Thomas, W. Kelley
2005-01-01
Molecular surveys of meiofaunal diversity face some interesting methodological challenges when it comes to interstitial nematodes from soils and sediments. Morphology-based surveys are greatly limited in processing speed, while barcoding approaches for nematodes are hampered by difficulties of matching sequence data with traditional taxonomy. Intermediate technology is needed to bridge the gap between both approaches. An example of such technology is video capture and editing microscopy, which consists of the recording of taxonomically informative multifocal series of microscopy images as digital video clips. The integration of multifocal imaging with sequence analysis of the D2D3 region of large subunit (LSU) rDNA is illustrated here in the context of a combined morphological and barcode sequencing survey of marine nematodes from Baja California and California. The resulting video clips and sequence data are made available online in the database NemATOL (http://nematol.unh.edu/). Analyses of 37 barcoded nematodes suggest that these represent at least 32 species, none of which matches available D2D3 sequences in public databases. The recorded multifocal vouchers allowed us to identify most specimens to genus, and will be used to match specimens with subsequent species identifications and descriptions of preserved specimens. Like molecular barcodes, multifocal voucher archives are part of a wider effort at structuring and changing the process of biodiversity discovery. We argue that data-rich surveys and phylogenetic tools for analysis of barcode sequences are an essential component of the exploration of phyla with a high fraction of undiscovered species. Our methods are also directly applicable to other meiofauna such as for example gastrotrichs and tardigrades. PMID:16214752
Display system employing acousto-optic tunable filter
NASA Technical Reports Server (NTRS)
Lambert, James L. (Inventor)
1995-01-01
An acousto-optic tunable filter (AOTF) is employed to generate a display by driving the AOTF with a RF electrical signal comprising modulated red, green, and blue video scan line signals and scanning the AOTF with a linearly polarized, pulsed light beam, resulting in encoding of color video columns (scan lines) of an input video image into vertical columns of the AOTF output beam. The AOTF is illuminated periodically as each acoustically-encoded scan line fills the cell aperture of the AOTF. A polarizing beam splitter removes the unused first order beam component of the AOTF output and, if desired, overlays a real world scene on the output plane. Resolutions as high as 30,000 lines are possible, providing holographic display capability.
Display system employing acousto-optic tunable filter
NASA Technical Reports Server (NTRS)
Lambert, James L. (Inventor)
1993-01-01
An acousto-optic tunable filter (AOTF) is employed to generate a display by driving the AOTF with a RF electrical signal comprising modulated red, green, and blue video scan line signals and scanning the AOTF with a linearly polarized, pulsed light beam, resulting in encoding of color video columns (scan lines) of an input video image into vertical columns of the AOTF output beam. The AOTF is illuminated periodically as each acoustically-encoded scan line fills the cell aperture of the AOTF. A polarizing beam splitter removes the unused first order beam component of the AOTF output and, if desired, overlays a real world scene on the output plane. Resolutions as high as 30,000 lines are possible, providing holographic display capability.
Water surface modeling from a single viewpoint video.
Li, Chuan; Pickup, David; Saunders, Thomas; Cosker, Darren; Marshall, David; Hall, Peter; Willis, Philip
2013-07-01
We introduce a video-based approach for producing water surface models. Recent advances in this field output high-quality results but require dedicated capturing devices and only work in limited conditions. In contrast, our method achieves a good tradeoff between the visual quality and the production cost: It automatically produces a visually plausible animation using a single viewpoint video as the input. Our approach is based on two discoveries: first, shape from shading (SFS) is adequate to capture the appearance and dynamic behavior of the example water; second, shallow water model can be used to estimate a velocity field that produces complex surface dynamics. We will provide qualitative evaluation of our method and demonstrate its good performance across a wide range of scenes.
Lip-reading enhancement for law enforcement
NASA Astrophysics Data System (ADS)
Theobald, Barry J.; Harvey, Richard; Cox, Stephen J.; Lewis, Colin; Owen, Gari P.
2006-09-01
Accurate lip-reading techniques would be of enormous benefit for agencies involved in counter-terrorism and other law-enforcement areas. Unfortunately, there are very few skilled lip-readers, and it is apparently a difficult skill to transmit, so the area is under-resourced. In this paper we investigate the possibility of making the lip-reading task more amenable to a wider range of operators by enhancing lip movements in video sequences using active appearance models. These are generative, parametric models commonly used to track faces in images and video sequences. The parametric nature of the model allows a face in an image to be encoded in terms of a few tens of parameters, while the generative nature allows faces to be re-synthesised using the parameters. The aim of this study is to determine if exaggerating lip-motions in video sequences by amplifying the parameters of the model improves lip-reading ability. We also present results of lip-reading tests undertaken by experienced (but non-expert) adult subjects who claim to use lip-reading in their speech recognition process. The results, which are comparisons of word error-rates on unprocessed and processed video, are mixed. We find that there appears to be the potential to improve the word error rate but, for the method to improve the intelligibility there is need for more sophisticated tracking and visual modelling. Our technique can also act as an expression or visual gesture amplifier and so has applications to animation and the presentation of information via avatars or synthetic humans.
NASA Astrophysics Data System (ADS)
Zingoni, Andrea; Diani, Marco; Corsini, Giovanni
2016-10-01
We developed an algorithm for automatically detecting small and poorly contrasted (dim) moving objects in real-time, within video sequences acquired through a steady infrared camera. The algorithm is suitable for different situations since it is independent of the background characteristics and of changes in illumination. Unlike other solutions, small objects of any size (up to single-pixel), either hotter or colder than the background, can be successfully detected. The algorithm is based on accurately estimating the background at the pixel level and then rejecting it. A novel approach permits background estimation to be robust to changes in the scene illumination and to noise, and not to be biased by the transit of moving objects. Care was taken in avoiding computationally costly procedures, in order to ensure the real-time performance even using low-cost hardware. The algorithm was tested on a dataset of 12 video sequences acquired in different conditions, providing promising results in terms of detection rate and false alarm rate, independently of background and objects characteristics. In addition, the detection map was produced frame by frame in real-time, using cheap commercial hardware. The algorithm is particularly suitable for applications in the fields of video-surveillance and computer vision. Its reliability and speed permit it to be used also in critical situations, like in search and rescue, defence and disaster monitoring.
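The general idea of estimating the background at the pixel level and then rejecting it can be sketched as follows. This is not the authors' estimator, which the abstract does not describe: a per-pixel temporal median over recent frames is swapped in as a common robust stand-in, and the name `detect_dim_targets` and the fixed `threshold` are assumptions made for illustration.

```python
from statistics import median


def detect_dim_targets(history, frame, threshold):
    """Flag pixels that deviate from a per-pixel background estimate.

    `history` is a list of previous frames and `frame` the current one,
    each a list of rows of intensities. The background is taken as the
    temporal median (an assumed stand-in for the paper's estimator).
    The absolute difference is used so that targets either hotter or
    colder than the background are both flagged.
    """
    rows, cols = len(frame), len(frame[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            background = median(f[r][c] for f in history)
            mask[r][c] = abs(frame[r][c] - background) > threshold
    return mask
```

Because the median ignores short-lived outliers, a target crossing a pixel for a few frames biases this background estimate less than a running mean would.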
Coding visual features extracted from video sequences.
Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2014-05-01
Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
NASA Astrophysics Data System (ADS)
Wang, Guanxi; Tie, Yun; Qi, Lin
2017-07-01
In this paper, we propose a novel approach based on Depth Maps and compute Multi-Scale Histograms of Oriented Gradient (MSHOG) from sequences of depth maps to recognize actions. Each depth frame in a depth video sequence is projected onto three orthogonal Cartesian planes. Under each projection view, the absolute difference between two consecutive projected maps is accumulated through a depth video sequence to form a Depth Map, which is called Depth Motion Trail Images (DMTI). The MSHOG is then computed from the Depth Maps for the representation of an action. In addition, we apply L2-Regularized Collaborative Representation (L2-CRC) to classify actions. We evaluate the proposed approach on the MSR Action3D and MSRGesture3D datasets. Promising experimental results demonstrate the effectiveness of our proposed method.
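The accumulation step described in this abstract, summing the absolute differences between consecutive projected maps, can be sketched for a single projection view as below. The function name `accumulate_motion` is hypothetical, and the projection onto the three Cartesian planes and the MSHOG computation are not shown.

```python
def accumulate_motion(maps):
    """Accumulate |map[i+1] - map[i]| over a sequence of 2D projected maps.

    `maps` is a list of equally sized 2D lists (one projection view).
    Returns one 2D list in which larger values mark pixels that changed
    more over the sequence.
    """
    if len(maps) < 2:
        raise ValueError("need at least two maps")
    rows, cols = len(maps[0]), len(maps[0][0])
    trail = [[0] * cols for _ in range(rows)]
    for prev, curr in zip(maps, maps[1:]):
        for r in range(rows):
            for c in range(cols):
                trail[r][c] += abs(curr[r][c] - prev[r][c])
    return trail
```

Running the same function once per projection view would give three accumulated maps per depth sequence, from which gradient histograms could then be computed.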
CVD2014-A Database for Evaluating No-Reference Video Quality Assessment Algorithms.
Nuutinen, Mikko; Virtanen, Toni; Vaahteranoksa, Mikko; Vuori, Tero; Oittinen, Pirkko; Hakkinen, Jukka
2016-07-01
In this paper, we present a new video database: CVD2014-Camera Video Database. In contrast to previous video databases, this database uses real cameras rather than introducing distortions via post-processing, which results in a complex distortion space in regard to the video acquisition process. CVD2014 contains a total of 234 videos that are recorded using 78 different cameras. Moreover, this database contains the observer-specific quality evaluation scores rather than only providing mean opinion scores. We have also collected open-ended quality descriptions that are provided by the observers. These descriptions were used to define the quality dimensions for the videos in CVD2014. The dimensions included sharpness, graininess, color balance, darkness, and jerkiness. At the end of this paper, a performance study of image and video quality algorithms for predicting the subjective video quality is reported. For this performance study, we proposed a new performance measure that accounts for observer variance. The performance study revealed that there is room for improvement regarding the video quality assessment algorithms. The CVD2014 video database has been made publicly available for the research community. All video sequences and corresponding subjective ratings can be obtained from the CVD2014 project page (http://www.helsinki.fi/psychology/groups/visualcognition/).
Wireless Augmented Reality Communication System
NASA Technical Reports Server (NTRS)
Agan, Martin (Inventor); Devereaux, Ann (Inventor); Jedrey, Thomas (Inventor)
2015-01-01
A portable unit is for video communication to select a user name in a user name network. A transceiver wirelessly accesses a communication network through a wireless connection to a general purpose node coupled to the communication network. A user interface can receive user input to log on to a user name network through the communication network. The user name network has a plurality of user names, at least one of the plurality of user names is associated with a remote portable unit, logged on to the user name network and available for video communication.
Wireless Augmented Reality Communication System
NASA Technical Reports Server (NTRS)
Jedrey, Thomas (Inventor); Agan, Martin (Inventor); Devereaux, Ann (Inventor)
2017-01-01
A portable unit is for video communication to select a user name in a user name network. A transceiver wirelessly accesses a communication network through a wireless connection to a general purpose node coupled to the communication network. A user interface can receive user input to log on to a user name network through the communication network. The user name network has a plurality of user names, at least one of the plurality of user names is associated with a remote portable unit, logged on to the user name network and available for video communication.
Improving truck and speed data using paired video and single-loop sensors
DOT National Transportation Integrated Search
2006-12-01
Real-time speed and truck data are important inputs for modern freeway traffic control and management systems. However, these data are not directly measurable by single-loop detectors. Although dual-loop detectors provide speeds and classified ve...
Adam, Maya; Chen, Sharon F; Amieva, Manuel; Deitz, Jennifer; Jang, Heeju; Porwal, Aarti; Prober, Charles
2017-07-01
Medical students often struggle to appreciate the clinical relevance of material taught in the preclinical years. The authors believe videos could be effectively used to interweave a patient's illness script with foundational basic science concepts. In collaboration with four other U.S. medical schools, educators at the Stanford University School of Medicine created 36 short, animated, patient-centered springboard videos (third-person, narrated accounts of authentic patient cases conveying foundational pathophysiology) in 2014. The videos were used to introduce students to 36 content modules, created as part of a microbiology, immunology, and infectious diseases curriculum. The videos were created with input from faculty content experts and in some cases medical students, and were piloted using a flipped classroom pedagogical approach in January 2015-June 2016. Student feedback from course evaluations and focus groups was analyzed using a mixed-methods approach. On the course evaluations, the majority of students rated the patient-centered videos positively, and the majority of comments on the videos were positive, highlighting both enhanced engagement and enhanced learning and retention. Comments from focus groups mirrored the course evaluation comments and highlighted different usage patterns for the videos. The authors will continue to gather and analyze data from schools using the videos as part of their core preclinical curriculum, and will produce similar videos for use in other areas of undergraduate medical education. These videos could support students' review of content taught previously and be repurposed for use in continuing and graduate medical education, as well as patient education.
Deep Recurrent Neural Networks for Human Activity Recognition
Murad, Abdulmajid
2017-01-01
Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convolutional neural networks (CNNs) address this issue by using convolutions across a one-dimensional temporal sequence to capture dependencies among input data. However, the size of convolutional kernels restricts the captured range of dependencies between data samples. As a result, typical models are unadaptable to a wide range of activity-recognition configurations and require fixed-length input windows. In this paper, we propose the use of deep recurrent neural networks (DRNNs) for building recognition models that are capable of capturing long-range dependencies in variable-length input sequences. We present unidirectional, bidirectional, and cascaded architectures based on long short-term memory (LSTM) DRNNs and evaluate their effectiveness on miscellaneous benchmark datasets. Experimental results show that our proposed models outperform methods employing conventional machine learning, such as support vector machine (SVM) and k-nearest neighbors (KNN). Additionally, the proposed models yield better performance than other deep learning techniques, such as deep belief networks (DBNs) and CNNs. PMID:29113103
Deep Recurrent Neural Networks for Human Activity Recognition.
Murad, Abdulmajid; Pyun, Jae-Young
2017-11-06
Adopting deep learning methods for human activity recognition has been effective in extracting discriminative features from raw input sequences acquired from body-worn sensors. Although human movements are encoded in a sequence of successive samples in time, typical machine learning methods perform recognition tasks without exploiting the temporal correlations between input data samples. Convolutional neural networks (CNNs) address this issue by using convolutions across a one-dimensional temporal sequence to capture dependencies among input data. However, the size of convolutional kernels restricts the captured range of dependencies between data samples. As a result, typical models are unadaptable to a wide range of activity-recognition configurations and require fixed-length input windows. In this paper, we propose the use of deep recurrent neural networks (DRNNs) for building recognition models that are capable of capturing long-range dependencies in variable-length input sequences. We present unidirectional, bidirectional, and cascaded architectures based on long short-term memory (LSTM) DRNNs and evaluate their effectiveness on miscellaneous benchmark datasets. Experimental results show that our proposed models outperform methods employing conventional machine learning, such as support vector machine (SVM) and k-nearest neighbors (KNN). Additionally, the proposed models yield better performance than other deep learning techniques, such as deep belief networks (DBNs) and CNNs.
Master/Programmable-Slave Computer
NASA Technical Reports Server (NTRS)
Smaistrla, David; Hall, William A.
1990-01-01
Unique modular computer features compactness, low power, mass storage of data, multiprocessing, and choice of various input/output modes. Master processor communicates with user via usual keyboard and video display terminal. Coordinates operations of as many as 24 slave processors, each dedicated to different experiment. Each slave circuit card includes slave microprocessor and assortment of input/output circuits for communication with external equipment, with master processor, and with other slave processors. Adaptable to industrial process control with selectable degrees of automatic control, automatic and/or manual monitoring, and manual intervention.
A generic flexible and robust approach for intelligent real-time video-surveillance systems
NASA Astrophysics Data System (ADS)
Desurmont, Xavier; Delaigle, Jean-Francois; Bastide, Arnaud; Macq, Benoit
2004-05-01
In this article we present a generic, flexible and robust approach for an intelligent real-time video-surveillance system. A previous version of the system was presented in [1]. The goal of these advanced tools is to help operators by detecting events of interest in visual scenes, highlighting alarms and computing statistics. The proposed system is a multi-camera platform able to handle different standards of video inputs (composite, IP, IEEE1394) and which can compress (MPEG4), store and display them. This platform also integrates advanced video analysis tools, such as motion detection, segmentation, tracking and interpretation. The design of the architecture is optimised to play back, display, and process video flows efficiently for video-surveillance applications. The implementation is distributed on a scalable computer cluster based on Linux and an IP network. It relies on POSIX threads for multitasking scheduling. Data flows are transmitted between the different modules using multicast technology and under control of a TCP-based command network (e.g. for bandwidth occupation control). We report here some results and we show the potential use of such a flexible system in third-generation video-surveillance systems. We illustrate the interest of the system in a real case study, namely indoor surveillance.
Molinari, Luisa; Mameli, Consuelo; Gnisci, Augusto
2013-09-01
A sequential analysis of classroom discourse is needed to investigate the conditions under which the triadic initiation-response-feedback (IRF) pattern may host different teaching orientations. The purpose of the study is twofold: first, to describe the characteristics of classroom discourse and, second, to identify and explore the different interactive sequences that can be captured with a sequential statistical analysis. Twelve whole-class activities were video recorded in three Italian primary schools. We observed classroom interaction as it occurs naturally on an everyday basis. In total, we collected 587 min of video recordings. Subsequently, 828 triadic IRF patterns were extracted from this material and analysed with the programme Generalized Sequential Query (GSEQ). The results indicate that classroom discourse may unfold in different ways. In particular, we identified and described four types of sequences. Dialogic sequences were triggered by authentic questions, and continued through further relaunches. Monologic sequences were directed to fulfil the teachers' pre-determined didactic purposes. Co-constructive sequences fostered deduction, reasoning, and thinking. Scaffolding sequences helped and sustained children with difficulties. The application of sequential analyses allowed us to show that interactive sequences may account for a variety of meanings, thus making a significant contribution to the literature and research practice in classroom discourse. © 2012 The British Psychological Society.
Automatic generation of pictorial transcripts of video programs
NASA Astrophysics Data System (ADS)
Shahraray, Behzad; Gibbon, David C.
1995-03-01
An automatic authoring system for the generation of pictorial transcripts of video programs which are accompanied by closed caption information is presented. A number of key frames, each of which represents the visual information in a segment of the video (i.e., a scene), are selected automatically by performing a content-based sampling of the video program. The textual information is recovered from the closed caption signal and is initially segmented based on its implied temporal relationship with the video segments. The text segmentation boundaries are then adjusted, based on lexical analysis and/or caption control information, to account for synchronization errors due to possible delays in the detection of scene boundaries or the transmission of the caption information. The closed caption text is further refined through linguistic processing for conversion to lowercase with correct capitalization. The key frames and the related text generate a compact multimedia presentation of the contents of the video program which lends itself to efficient storage and transmission. This compact representation can be viewed on a computer screen, or used to generate the input to a commercial text processing package to generate a printed version of the program.
Scrambling for anonymous visual communications
NASA Astrophysics Data System (ADS)
Dufaux, Frederic; Ebrahimi, Touradj
2005-08-01
In this paper, we present a system for anonymous visual communications. The target application is an anonymous video chat. The system identifies faces in the video sequence by means of face detection or skin detection. The corresponding regions are subsequently scrambled. We investigate several approaches for scrambling, either in the image domain or in the transform domain. Experimental results show the effectiveness of the proposed system.
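To make image-domain scrambling of a detected region concrete, the sketch below permutes the pixels inside a bounding box with a key-seeded pseudo-random permutation, which can be inverted by anyone holding the same key. This is only an illustrative assumption about what such scrambling could look like; the abstract does not specify the authors' schemes, the names `scramble_region` and `unscramble_region` are hypothetical, and Python's `random` module is not cryptographically secure.

```python
import random


def _region_order(box, key):
    """Pixel coordinates of the box and a key-seeded permutation of them."""
    top, left, bottom, right = box
    coords = [(r, c) for r in range(top, bottom) for c in range(left, right)]
    order = list(range(len(coords)))
    random.Random(key).shuffle(order)  # deterministic for a given key
    return coords, order


def scramble_region(image, box, key):
    """Return a copy of `image` with the pixels inside `box` permuted."""
    coords, order = _region_order(box, key)
    out = [row[:] for row in image]
    for dst, idx in zip(coords, order):
        src = coords[idx]
        out[dst[0]][dst[1]] = image[src[0]][src[1]]
    return out


def unscramble_region(image, box, key):
    """Invert `scramble_region` given the same box and key."""
    coords, order = _region_order(box, key)
    out = [row[:] for row in image]
    for dst, idx in zip(coords, order):
        src = coords[idx]
        out[src[0]][src[1]] = image[dst[0]][dst[1]]
    return out
```

Pixels outside the box are left untouched, so only the detected face region becomes unrecognisable while the rest of the scene stays viewable.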
Fingerprint multicast in secure video streaming.
Zhao, H Vicky; Liu, K J Ray
2006-01-01
Digital fingerprinting is an emerging technology to protect multimedia content from illegal redistribution, where each distributed copy is labeled with unique identification information. In video streaming, a huge amount of data has to be transmitted to a large number of users under stringent latency constraints, so the bandwidth-efficient distribution of uniquely fingerprinted copies is crucial. This paper investigates the secure multicast of anticollusion fingerprinted video in streaming applications and analyzes its performance. We first propose a general fingerprint multicast scheme that can be used with most spread spectrum embedding-based multimedia fingerprinting systems. To further improve the bandwidth efficiency, we explore the special structure of the fingerprint design and propose a joint fingerprint design and distribution scheme. From our simulations, the two proposed schemes can reduce the bandwidth requirement by 48% to 87%, depending on the number of users, the characteristics of video sequences, and the network and computation constraints. We also show that under the constraint that all colluders have the same probability of detection, the embedded fingerprints in the two schemes have approximately the same collusion resistance. Finally, we propose a fingerprint drift compensation scheme to improve the quality of the reconstructed sequences at the decoder's side without introducing extra communication overhead.
Motion adaptive Kalman filter for super-resolution
NASA Astrophysics Data System (ADS)
Richter, Martin; Nasse, Fabian; Schröder, Hartmut
2011-01-01
Superresolution is a sophisticated strategy to enhance image quality of both low and high resolution video, performing tasks like artifact reduction, scaling and sharpness enhancement in one algorithm, all of them reconstructing high frequency components (above Nyquist frequency) in some way. Recursive superresolution algorithms in particular can meet high quality requirements because they control the video output using a feedback loop and adapt the result in the next iteration. In addition to excellent output quality, temporal recursive methods are very hardware efficient and therefore attractive even for real-time video processing. A very promising approach is the utilization of Kalman filters as proposed by Farsiu et al. Reliable motion estimation is crucial for the performance of superresolution. Therefore, robust global motion models are mainly used, but this also limits the application of superresolution algorithms. Thus, handling sequences with complex object motion is essential for a wider field of application. Hence, this paper proposes improvements by extending the Kalman filter approach using motion adaptive variance estimation and segmentation techniques. Experiments confirm the potential of our proposal for ideal and real video sequences with complex motion and further compare its performance to state-of-the-art methods like trainable filters.
Automated multiple target detection and tracking in UAV videos
NASA Astrophysics Data System (ADS)
Mao, Hongwei; Yang, Chenhui; Abousleman, Glen P.; Si, Jennie
2010-04-01
In this paper, a novel system is presented to detect and track multiple targets in Unmanned Air Vehicles (UAV) video sequences. Since the output of the system is based on target motion, we first segment foreground moving areas from the background in each video frame using background subtraction. To stabilize the video, a multi-point-descriptor-based image registration method is performed where a projective model is employed to describe the global transformation between frames. For each detected foreground blob, an object model is used to describe its appearance and motion information. Rather than immediately classifying the detected objects as targets, we track them for a certain period of time and only those with qualified motion patterns are labeled as targets. In the subsequent tracking process, a Kalman filter is assigned to each tracked target to dynamically estimate its position in each frame. Blobs detected at a later time are used as observations to update the state of the tracked targets to which they are associated. The proposed overlap-rate-based data association method considers the splitting and merging of the observations, and therefore is able to maintain tracks more consistently. Experimental results demonstrate that the system performs well on real-world UAV video sequences. Moreover, careful consideration given to each component in the system has made the proposed system feasible for real-time applications.
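The per-target Kalman filtering mentioned in this abstract can be illustrated with a one-dimensional constant-velocity filter. The motion model, the noise parameters `q` and `r`, and the function name `kalman_step` are all assumptions made for this sketch; the abstract does not state the state vector or noise settings used by the authors.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-3, r=1.0):
    """One predict/update cycle of a 1D constant-velocity Kalman filter.

    state = (position, velocity); cov = 2x2 covariance as nested tuples;
    z = measured position (e.g. the centre of an associated blob).
    q and r are assumed process and measurement noise variances.
    """
    p, v = state
    (p00, p01), (p10, p11) = cov
    # Predict: x' = F x, P' = F P F^T + q I, with F = [[1, dt], [0, 1]].
    p_pred = p + dt * v
    v_pred = v
    a00 = p00 + dt * p10 + dt * (p01 + dt * p11) + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q
    # Update with H = [1, 0]: only the position is observed.
    innovation = z - p_pred
    s = a00 + r
    k0, k1 = a00 / s, a10 / s
    new_state = (p_pred + k0 * innovation, v_pred + k1 * innovation)
    new_cov = (
        ((1 - k0) * a00, (1 - k0) * a01),
        (a10 - k1 * a00, a11 - k1 * a01),
    )
    return new_state, new_cov
```

In a tracker of this kind, one such filter would be kept per target (and per image axis), and the update step would be skipped for frames in which no blob is associated with the target.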
Human Splice-Site Prediction with Deep Neural Networks.
Naito, Tatsuhiko
2018-04-18
Accurate splice-site prediction is essential to delineate gene structures from sequence data. Several computational techniques have been applied to create a system to predict canonical splice sites. For classification tasks, deep neural networks (DNNs) have achieved record-breaking results and often outperformed other supervised learning techniques. In this study, a new method of splice-site prediction using DNNs was proposed. The proposed system receives input sequence data and returns an answer as to whether it is a splice site. The length of the input is 140 nucleotides, with the consensus sequence (i.e., "GT" and "AG" for the donor and acceptor sites, respectively) in the middle. Each input sequence is applied to the pretrained DNN model that determines the probability that an input is a splice site. The model consists of convolutional layers and bidirectional long short-term memory network layers. The pretraining and validation were conducted using the data set tested in previously reported methods. The performance evaluation results showed that the proposed method can outperform the previous methods. In addition, the pattern learned by the DNNs was visualized as position frequency matrices (PFMs). Some of the PFMs were very similar to the consensus sequence. The trained DNN model and the brief source code for the prediction system are uploaded. Further improvement will be achieved following the further development of DNNs.
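The input format described in this abstract, a 140-nucleotide window with the consensus dinucleotide in the middle, can be illustrated with a small encoding sketch. The one-hot encoding, the A/C/G/T channel order, the treatment of unknown bases, and the exact central position (0-based indices 69 and 70) are assumptions made for illustration and are not taken from the authors' code.

```python
BASES = "ACGT"  # assumed channel order


def one_hot(sequence):
    """Encode a nucleotide string as a list of 4-element 0/1 rows.

    Bases outside A/C/G/T (e.g. N) become all-zero rows, an assumption.
    """
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]


def has_central_motif(sequence, motif):
    """Check that `motif` (e.g. "GT" or "AG") sits at the window centre.

    For a 140-nt window the centre is assumed to be indices 69 and 70.
    """
    start = len(sequence) // 2 - 1
    return sequence.upper()[start:start + len(motif)] == motif
```

A candidate donor window could then be screened with `has_central_motif(window, "GT")` before `one_hot(window)` is passed to a classifier.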
Applied learning-based color tone mapping for face recognition in video surveillance system
NASA Astrophysics Data System (ADS)
Yew, Chuu Tian; Suandi, Shahrel Azmin
2012-04-01
In this paper, we present an applied learning-based color tone mapping technique for video surveillance systems. This technique can be applied to both color and grayscale surveillance images. The basic idea is to learn the color or intensity statistics from a training dataset of photorealistic images of the candidates appearing in the surveillance images, and remap the color or intensity of the input image so that the color or intensity statistics match those in the training dataset. It is well known that differences in commercial surveillance camera models and in the signal processing chipsets used by different manufacturers cause the color and intensity of the images to differ from one another, thus creating additional challenges for face recognition in video surveillance systems. Using Multi-Class Support Vector Machines as the classifier on a publicly available video surveillance camera database, namely the SCface database, this approach is validated and compared to the results of using a holistic approach on grayscale images. The results show that this technique is suitable for improving the color or intensity quality of video surveillance systems for face recognition.
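One simple way to remap intensities so that their statistics match a training set is to match the mean and standard deviation of a channel. The sketch below shows that idea only; the abstract does not say which statistics the authors learn, so mean/standard-deviation matching, the function name `match_statistics`, and the 0 to 255 clipping range are assumptions.

```python
import statistics


def match_statistics(values, target_mean, target_std, lo=0.0, hi=255.0):
    """Linearly remap `values` so their mean/std match the targets.

    `values` is a flat list of intensities from one channel. The target
    statistics would be learned from training images (assumed given).
    Results are clipped to [lo, hi].
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # A constant channel has no spread to rescale; shift it to the mean.
    scale = target_std / std if std > 0 else 0.0
    remapped = [(v - mean) * scale + target_mean for v in values]
    return [min(hi, max(lo, v)) for v in remapped]
```

For a color image the same function could be applied to each channel separately, using per-channel target statistics.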
NASA Astrophysics Data System (ADS)
Thiebaud, P.; Cross, D. C.
1980-07-01
A new solid-state radar switchboard equipped with 16 input ports which will output data to 16 displays is presented. Each of the ports will handle a single two-dimensional radar input, or three ports will accommodate a three-dimensional radar input. A video switch card of the switchboard is used to switch all signals, with the exception of the IFF-mode-control lines. Each card accepts inputs from up to 16 sources and can pass a signal with bandwidth greater than 20 MHz to the display assigned to that card. The synchro amplifier of current systems has been eliminated and in the new design each PPI receives radar data via a single coaxial cable. This significant reduction in cabling is achieved by adding a serial-to-parallel interface and a digital-to-synchro converter located at the PPI.
Prinz, A; Bolz, M; Findl, O
2005-01-01
Background/aim: Owing to the complex topographical aspects of ophthalmic surgery, teaching with conventional surgical videos has led to a poor understanding among medical students. A novel multimedia three dimensional (3D) computer animated program, called “Ophthalmic Operation Vienna”, has been developed, where surgical videos are accompanied by 3D animated sequences of all surgical steps for five operations. The aim of the study was to assess the effect of 3D animations on the understanding of cataract and glaucoma surgery among medical students. Method: Set in the Medical University of Vienna, Department of Ophthalmology, 172 students were randomised into two groups: a 3D group (n = 90), which saw the 3D animations and video sequences, and a control group (n = 82), which saw only the surgical videos. The narrated text was identical for both groups. After the presentation, students were questioned and tested using multiple choice questions. Results: Students in the 3D group found the interactive multimedia teaching methods to be a valuable supplement to the conventional surgical videos. The 3D group outperformed the control group not only in topographical understanding by 16% (p<0.0001), but also in theoretical understanding by 7% (p<0.003). Women in the 3D group gained most by 19% over the control group (p<0.0001). Conclusions: The use of 3D animations leads to a better understanding of difficult surgical topics among medical students, especially for female users. Gender related benefits of using multimedia should be further explored. PMID:16234460
Topper, Nicholas C.; Burke, S.N.; Maurer, A.P.
2014-01-01
BACKGROUND: Current methods for aligning neurophysiology and video data are either prepackaged, requiring the additional purchase of a software suite, or use a blinking LED with a stationary pulse-width and frequency. These methods lack significant user interface for adaptation, are expensive, or risk a misalignment of the two data streams. NEW METHOD: A cost-effective means to obtain high-precision alignment of behavioral and neurophysiological data is obtained by generating an audio-pulse embedded with two domains of information, a low-frequency binary-counting signal and a high, randomly changing frequency. This enabled the derivation of temporal information while maintaining enough entropy in the system for algorithmic alignment. RESULTS: The sample to frame index constructed using the audio input correlation method described in this paper enables video and data acquisition to be aligned at a sub-frame level of precision. COMPARISONS WITH EXISTING METHOD: Traditionally, a synchrony pulse is recorded on-screen via a flashing diode. The higher sampling rate of the audio input of the camcorder enables the timing of an event to be detected with greater precision. CONCLUSIONS: While on-line analysis and synchronization using specialized equipment may be the ideal situation in some cases, the method presented in the current paper presents a viable, low cost alternative, and gives the flexibility to interface with custom off-line analysis tools. Moreover, the ease of constructing and implementing the set-up presented in the current paper makes it applicable to a wide variety of applications that require video recording. PMID:25256648
Topper, Nicholas C; Burke, Sara N; Maurer, Andrew Porter
2014-12-30
Current methods for aligning neurophysiology and video data are either prepackaged, requiring the additional purchase of a software suite, or use a blinking LED with a stationary pulse-width and frequency. These methods lack significant user interface for adaptation, are expensive, or risk a misalignment of the two data streams. A cost-effective means to obtain high-precision alignment of behavioral and neurophysiological data is obtained by generating an audio-pulse embedded with two domains of information, a low-frequency binary-counting signal and a high, randomly changing frequency. This enabled the derivation of temporal information while maintaining enough entropy in the system for algorithmic alignment. The sample to frame index constructed using the audio input correlation method described in this paper enables video and data acquisition to be aligned at a sub-frame level of precision. Traditionally, a synchrony pulse is recorded on-screen via a flashing diode. The higher sampling rate of the audio input of the camcorder enables the timing of an event to be detected with greater precision. While on-line analysis and synchronization using specialized equipment may be the ideal situation in some cases, the method presented in the current paper presents a viable, low cost alternative, and gives the flexibility to interface with custom off-line analysis tools. Moreover, the ease of constructing and implementing the set-up presented in the current paper makes it applicable to a wide variety of applications that require video recording. Copyright © 2014 Elsevier B.V. All rights reserved.
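Correlation-based alignment of two recordings of the same randomly varying signal can be sketched as a brute-force search for the lag that maximises the cross-correlation. The abstract does not give the authors' algorithm, so this is a generic illustration with hypothetical names (`best_lag`, `max_lag`), and it resolves the offset only to whole samples.

```python
def best_lag(reference, recorded, max_lag):
    """Return the lag (in samples) at which `recorded` best matches `reference`.

    A positive result means the shared signal appears later in `recorded`.
    Scores are plain dot products over the overlapping samples.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(recorded):
                score += ref_sample * recorded[j]
        if score > best_score:
            best, best_score = lag, score
    return best
```

The search relies on the signal being unpredictable: a pulse train with fixed width and frequency would correlate equally well at many lags, which is the ambiguity the randomly changing component is meant to avoid.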
Background-Modeling-Based Adaptive Prediction for Surveillance Video Coding.
Zhang, Xianguo; Huang, Tiejun; Tian, Yonghong; Gao, Wen
2014-02-01
The exponential growth of surveillance videos presents an unprecedented challenge for high-efficiency surveillance video coding technology. Compared with the existing coding standards that were basically developed for generic videos, surveillance video coding should be designed to make the best use of the special characteristics of surveillance videos (e.g., relatively static background). To do so, this paper first conducts two analyses on how to improve the background and foreground prediction efficiencies in surveillance video coding. Following the analysis results, we propose a background-modeling-based adaptive prediction (BMAP) method. In this method, all blocks to be encoded are first classified into three categories. Then, according to the category of each block, two novel inter predictions are selectively utilized, namely, the background reference prediction (BRP), which uses the background modeled from the original input frames as the long-term reference, and the background difference prediction (BDP), which predicts the current data in the background difference domain. For background blocks, the BRP can effectively improve the prediction efficiency using the higher quality background as the reference, whereas for foreground-background-hybrid blocks, the BDP can provide a better reference after subtracting its background pixels. Experimental results show that the BMAP can achieve at least twice the compression ratio on surveillance videos as AVC (MPEG-4 Advanced Video Coding) high profile, with only slightly higher encoding complexity. Moreover, for the foreground coding performance, which is crucial to the subjective quality of moving objects in surveillance videos, BMAP also obtains remarkable gains over several state-of-the-art methods.
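The two ingredients named in this abstract, a background modeled from input frames and a difference against that background, can be illustrated with a toy example. Everything specific here is an assumption: the per-pixel mean as the background model, the threshold, and the three labels ("background", "foreground", "hybrid") are placeholders, since the abstract does not define the actual model or the classification rule.

```python
def model_background(frames):
    """Estimate the background as the per-pixel mean of the given frames.

    Each frame is a flat list of intensities of equal length.
    """
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def background_difference(frame, background):
    """Move a frame into the background-difference domain."""
    return [pixel - bg for pixel, bg in zip(frame, background)]


def classify_block(diff_block, threshold=10):
    """Label a block by how many of its pixels deviate from the background."""
    moving = sum(1 for d in diff_block if abs(d) > threshold)
    if moving == 0:
        return "background"
    if moving == len(diff_block):
        return "foreground"
    return "hybrid"
```

In a codec built along these lines, a block labeled as background would be predicted from the modeled background itself, while a mixed block would be predicted after subtraction, so that only the moving content remains to be coded.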
Development and Pilot Testing of a Video-Assisted Informed Consent Process
Sonne, Susan C.; Andrews, Jeannette O.; Gentilin, Stephanie M.; Oppenheimer, Stephanie; Obeid, Jihad; Brady, Kathleen; Wolf, Sharon; Davis, Randal; Magruder, Kathryn
2013-01-01
The informed consent process for research has come under scrutiny, as consent documents are increasingly long and difficult to understand. Innovations are needed to improve comprehension in order to make the consent process truly informed. We report on the development and pilot testing of video clips that could be used during the consent process to better explain research procedures to potential participants. Based on input from researchers and community partners, 15 videos of common research procedures/concepts were produced. The utility of the videos was then tested by embedding them in mock informed consent documents that were presented via an online electronic consent system designed for delivery via iPad. Three mock consents were developed, each containing five videos. All participants (n=61) read both a paper version and the video-assisted iPad version of the same mock consent and were randomized to which format they reviewed first. Participants were given a competency quiz that posed specific questions about the information in the consent after reviewing the first consent document to which they were exposed. Most participants (78.7%) preferred the video-assisted format compared to paper (12.9%). Nearly all (96.7%) reported that the videos improved their understanding of the procedures described in the consent document; however, comprehension of material did not significantly differ by consent format. Results suggest videos may be helpful in providing participants with information about study procedures in a way that is easy to understand. Additional testing of video consents for complex protocols and with subjects of lower literacy is warranted. PMID:23747986
Development and pilot testing of a video-assisted informed consent process.
Sonne, Susan C; Andrews, Jeannette O; Gentilin, Stephanie M; Oppenheimer, Stephanie; Obeid, Jihad; Brady, Kathleen; Wolf, Sharon; Davis, Randal; Magruder, Kathryn
2013-09-01
The informed consent process for research has come under scrutiny, as consent documents are increasingly long and difficult to understand. Innovations are needed to improve comprehension in order to make the consent process truly informed. We report on the development and pilot testing of video clips that could be used during the consent process to better explain research procedures to potential participants. Based on input from researchers and community partners, 15 videos of common research procedures/concepts were produced. The utility of the videos was then tested by embedding them in mock-informed consent documents that were presented via an online electronic consent system designed for delivery via iPad. Three mock consents were developed, each containing five videos. All participants (n = 61) read both a paper version and the video-assisted iPad version of the same mock consent and were randomized to which format they reviewed first. Participants were given a competency quiz that posed specific questions about the information in the consent after reviewing the first consent document to which they were exposed. Most participants (78.7%) preferred the video-assisted format compared to paper (12.9%). Nearly all (96.7%) reported that the videos improved their understanding of the procedures described in the consent document; however, the comprehension of material did not significantly differ by consent format. Results suggest videos may be helpful in providing participants with information about study procedures in a way that is easy to understand. Additional testing of video consents for complex protocols and with subjects of lower literacy is warranted. Copyright © 2013 Elsevier Inc. All rights reserved.
Real-time filtering and detection of dynamics for compression of HDTV
NASA Technical Reports Server (NTRS)
Sauer, Ken D.; Bauer, Peter
1991-01-01
The preprocessing of video sequences for data compression is discussed. The end goal associated with this is a compression system for HDTV capable of transmitting perceptually lossless sequences at under one bit per pixel. Two subtopics were emphasized to prepare the video signal for more efficient coding: (1) nonlinear filtering to remove noise and shape the signal spectrum to take advantage of insensitivities of human viewers; and (2) segmentation of each frame into temporally dynamic/static regions for conditional frame replenishment. The latter technique operates best under the assumption that the sequence can be modelled as a superposition of active foreground and static background. The considerations were restricted to monochrome data, since the standard luminance/chrominance decomposition, which concentrates most of the bandwidth requirements in the luminance, was expected to be used. Similar methods may be applied to the two chrominance signals.
Dynamic video encryption algorithm for H.264/AVC based on a spatiotemporal chaos system.
Xu, Hui; Tong, Xiao-Jun; Zhang, Miao; Wang, Zhu; Li, Ling-Hao
2016-06-01
Video encryption schemes mostly employ the selective encryption method to encrypt parts of important and sensitive video information, aiming to ensure the real-time performance and encryption efficiency. The classic block cipher is not applicable to video encryption due to the high computational overhead. In this paper, we propose an encryption selection control module, controlled by a chaotic pseudorandom sequence, to encrypt video syntax elements dynamically. A novel spatiotemporal chaos system and binarization method are used to generate a key stream for encrypting the chosen syntax elements. The proposed scheme enhances the resistance against attacks through the dynamic encryption process and high-security stream cipher. Experimental results show that the proposed method exhibits high security and high efficiency with little effect on the compression ratio and time cost.
The Concrete-Representational-Abstract Sequence of Instruction in Mathematics Classrooms
ERIC Educational Resources Information Center
Mudaly, Vimolan; Naidoo, Jayaluxmi
2015-01-01
The purpose of this paper is to explore how master mathematics teachers use the concrete-representational-abstract (CRA) sequence of instruction in mathematics classrooms. Data was collected from a convenience sample of six master teachers by observations, video recordings of their teaching, and semi-structured interviews. Data collection also…
Teacher Deployment of "Oh" in Known-Answer Question Sequences
ERIC Educational Resources Information Center
Hosoda, Yuri
2016-01-01
This conversation analytic study describes some specific interactional contexts in which native English-speaking teachers produce "oh" in known-answer question sequences in English language classes. The data for this study come from 10 video-recorded Japanese primary school English language class sessions. The analysis identified three…
Underwater video enhancement using multi-camera super-resolution
NASA Astrophysics Data System (ADS)
Quevedo, E.; Delory, E.; Callicó, G. M.; Tobajas, F.; Sarmiento, R.
2017-12-01
Image spatial resolution is critical in several fields such as medicine, communications, satellite, and underwater applications. While a large variety of techniques for image restoration and enhancement has been proposed in the literature, this paper focuses on a novel Super-Resolution fusion algorithm based on a Multi-Camera environment that makes it possible to enhance the quality of underwater video sequences without significantly increasing computation. In order to compare the quality enhancement, two objective quality metrics have been used: PSNR (Peak Signal-to-Noise Ratio) and the SSIM (Structural SIMilarity) index. Results have shown that the proposed method enhances the objective quality of several underwater sequences, avoiding the appearance of undesirable artifacts, with respect to basic fusion Super-Resolution algorithms.
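Of the two metrics named in this abstract, PSNR is simple enough to state directly: it is 10·log10(MAX² / MSE), where MSE is the mean squared error between the reference and the processed image and MAX is the peak pixel value. The sketch below uses the standard definition for flat lists of pixel values; the function name and the default peak of 255 (8-bit images) are choices made for illustration.

```python
import math


def psnr(reference, test, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized images.

    Both images are flat lists of pixel values. Identical images have zero
    error, for which PSNR is conventionally reported as infinity.
    """
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a closer match to the reference. SSIM, the other metric, compares local luminance, contrast and structure and is not reproduced here.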
Phase-based motion magnification video for monitoring of vital signals using the Hermite transform
NASA Astrophysics Data System (ADS)
Brieva, Jorge; Moya-Albor, Ernesto
2017-11-01
In this paper we present a new Eulerian phase-based motion magnification technique using the Hermite Transform (HT) decomposition that is inspired by the Human Vision System (HVS). We test our method on one sequence of the breathing of a newborn baby and on a video sequence that shows the heartbeat on the wrist. We detect and magnify the heart pulse applying our technique. Our motion magnification approach is compared to the Laplacian phase based approach by means of quantitative metrics (based on the RMS error and the Fourier transform) to measure the quality of both reconstruction and magnification. In addition a noise robustness analysis is performed for the two methods.
Low-cost synchronization of high-speed audio and video recordings in bio-acoustic experiments.
Laurijssen, Dennis; Verreycken, Erik; Geipel, Inga; Daems, Walter; Peremans, Herbert; Steckel, Jan
2018-02-27
In this paper, we present a method for synchronizing high-speed audio and video recordings of bio-acoustic experiments. By embedding a random signal into the recorded video and audio data, robust synchronization of a diverse set of sensor streams can be performed without the need to keep detailed records. The synchronization can be performed using recording devices without dedicated synchronization inputs. We demonstrate the efficacy of the approach in two sets of experiments: behavioral experiments on different species of echolocating bats and the recordings of field crickets. We present the general operating principle of the synchronization method, discuss its synchronization strength and provide insights into how to construct such a device using off-the-shelf components. © 2018. Published by The Company of Biologists Ltd.
NASA Astrophysics Data System (ADS)
Mantel, Claire; Korhonen, Jari; Pedersen, Jesper M.; Bech, Søren; Andersen, Jakob Dahl; Forchhammer, Søren
2015-01-01
This paper focuses on the influence of ambient light on the perceived quality of videos displayed on Liquid Crystal Display (LCD) with local backlight dimming. A subjective test assessing the quality of videos with two backlight dimming methods and three lighting conditions, i.e. no light, low light level (5 lux) and higher light level (60 lux) was organized to collect subjective data. Results show that participants prefer the method exploiting local dimming possibilities to the conventional full backlight but that this preference varies depending on the ambient light level. The clear preference for one method at the low light conditions decreases at the high ambient light, confirming that the ambient light significantly attenuates the perception of the leakage defect (light leaking through dark pixels). Results are also highly dependent on the content of the sequence, which can modulate the effect of the ambient light from having an important influence on the quality grades to no influence at all.
Zhao, Zijian; Voros, Sandrine; Weng, Ying; Chang, Faliang; Li, Ruijian
2017-12-01
Worldwide propagation of minimally invasive surgeries (MIS) is hindered by their drawback of indirect observation and manipulation, while monitoring of surgical instruments moving in the operated body required by surgeons is a challenging problem. Tracking of surgical instruments by vision-based methods is quite attractive, due to its flexible implementation via software-based control with no need to modify instruments or surgical workflow. A MIS instrument is conventionally split into a shaft and end-effector portions, while a 2D/3D tracking-by-detection framework is proposed, which performs the shaft tracking followed by the end-effector one. The former portion is described by line features via the RANSAC scheme, while the latter is depicted by special image features based on deep learning through a well-trained convolutional neural network. The method verification in 2D and 3D formulation is performed through the experiments on ex-vivo video sequences, while qualitative validation on in-vivo video sequences is obtained. The proposed method provides robust and accurate tracking, which is confirmed by the experimental results: its 3D performance in ex-vivo video sequences exceeds that of the available state-of-the-art methods. Moreover, the experiments on in-vivo sequences demonstrate that the proposed method can tackle the difficult condition of tracking with unknown camera parameters. Further refinements of the method will refer to the occlusion and multi-instrumental MIS applications.
NASA Astrophysics Data System (ADS)
Huber, Samuel; Dunau, Patrick; Wellig, Peter; Stein, Karin
2017-10-01
Background: In target detection, the success rates depend strongly on human observer performances. Two prior studies tested the contributions of target detection algorithms and prior training sessions. The aim of this Swiss-German cooperation study was to evaluate the dependency of human observer performance on the quality of supporting image analysis algorithms. Methods: The participants were presented 15 different video sequences. Their task was to detect all targets in the shortest possible time. Each video sequence showed a heavily cluttered simulated public area from a different viewing angle. In each video sequence, the number of avatars in the area was altered to 100, 150 and 200 subjects. The number of targets appearing was kept at 10%. The number of marked targets varied from 0, 5, 10, 20 up to 40 marked subjects while keeping the positive predictive value of the detection algorithm at 20%. During the task, workload level was assessed by applying an acoustic secondary task. Detection rates and detection times for the targets were analyzed using inferential statistics. Results: The study found Target Detection Time to increase and Target Detection Rates to decrease with increasing numbers of avatars. The same is true for the Secondary Task Reaction Time while there was no effect on Secondary Task Hit Rate. Furthermore, we found a trend for a u-shaped correlation between the numbers of markings and RTST indicating increased workload. Conclusion: The trial results may indicate useful criteria for the design of training and support of observers in observational tasks.
Tezer, Fadime Irsel; Agan, Kadriye; Borggraefe, Ingo; Noachtar, Soheyl
2013-09-01
This patient report demonstrates the importance of seizure evolution in the localising value of seizure semiology. Spread of epileptic activity from frontal to temporal lobe, as demonstrated by invasive recordings, was reflected by change from hyperkinetic movements to arrest of activity with mild oral and manual automatisms. [Published with video sequences].
Kychakoff, George; Afromowitz, Martin A; Hugle, Richard E
2005-06-21
A system for detection and control of deposition on pendant tubes in recovery and power boilers includes one or more deposit monitoring sensors operating in infrared regions of about 4 or 8.7 microns and directly producing images of the interior of the boiler. An image pre-processing circuit (95) captures a 2-D image formed by the video data input and includes a low pass filter for performing noise filtering of said video input. An image segmentation module (105) separates the image of the recovery boiler interior into background, pendant tubes, and deposition. An image-understanding unit (115) matches derived regions to a 3-D model of said boiler. It derives a 3-D structure of the deposition on pendant tubes in the boiler and provides the information about deposits to the plant distributed control system (130) for more efficient operation of the plant pendant tube cleaning and operating systems.
Activity Recognition in Egocentric video using SVM, kNN and Combined SVMkNN Classifiers
NASA Astrophysics Data System (ADS)
Sanal Kumar, K. P.; Bhavani, R., Dr.
2017-08-01
Egocentric vision is a unique, human-centric perspective in computer vision. The recognition of egocentric actions is a challenging task that can help in assisting elderly people, disabled patients and so on. In this work, life logging activity videos are taken as input. The activities fall into two categories: top level and second level. Recognition is done using features such as Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH) and Trajectory. The features are fused together to act as a single feature. The extracted features are reduced using Principal Component Analysis (PCA). The reduced features are provided as input to the classifiers: Support Vector Machine (SVM), k nearest neighbor (kNN), and a combination of the two (combined SVMkNN). These classifiers are evaluated, and the combined SVMkNN provided better results than other classifiers in the literature.
NASA Astrophysics Data System (ADS)
Cavigelli, Lukas; Bernath, Dominic; Magno, Michele; Benini, Luca
2016-10-01
Detecting and classifying targets in video streams from surveillance cameras is a cumbersome, error-prone and expensive task. Often, the incurred costs are prohibitive for real-time monitoring. This leads to data being stored locally or transmitted to a central storage site for post-incident examination. The required communication links and archiving of the video data are still expensive and this setup excludes preemptive actions to respond to imminent threats. An effective way to overcome these limitations is to build a smart camera that analyzes the data on-site, close to the sensor, and transmits alerts when relevant video sequences are detected. Deep neural networks (DNNs) have come to outperform humans in visual classification tasks and are also performing exceptionally well on other computer vision tasks. The concept of DNNs and Convolutional Networks (ConvNets) can easily be extended to make use of higher-dimensional input data such as multispectral data. We explore this opportunity in terms of achievable accuracy and required computational effort. To analyze the precision of DNNs for scene labeling in an urban surveillance scenario we have created a dataset with 8 classes obtained in a field experiment. We combine an RGB camera with a 25-channel VIS-NIR snapshot sensor to assess the potential of multispectral image data for target classification. We evaluate several new DNNs, showing that the spectral information fused together with the RGB frames can be used to improve the accuracy of the system or to achieve similar accuracy with a 3x smaller computation effort. We achieve a very high per-pixel accuracy of 99.1%. Even for scarcely occurring, but particularly interesting classes, such as cars, 75% of the pixels are labeled correctly with errors occurring only around the border of the objects. This high accuracy was obtained with a training set of only 30 labeled images, paving the way for fast adaptation to various application scenarios.
Classification and Weakly Supervised Pain Localization using Multiple Segment Representation.
Sikka, Karan; Dhall, Abhinav; Bartlett, Marian Stewart
2014-10-01
Automatic pain recognition from videos is a vital clinical application and, owing to its spontaneous nature, poses interesting challenges to automatic facial expression recognition (AFER) research. Previous pain vs no-pain systems have highlighted two major challenges: (1) ground truth is provided for the sequence, but the presence or absence of the target expression for a given frame is unknown, and (2) the time point and the duration of the pain expression event(s) in each video are unknown. To address these issues we propose a novel framework (referred to as MS-MIL) where each sequence is represented as a bag containing multiple segments, and multiple instance learning (MIL) is employed to handle this weakly labeled data in the form of sequence level ground-truth. These segments are generated via multiple clustering of a sequence or running a multi-scale temporal scanning window, and are represented using a state-of-the-art Bag of Words (BoW) representation. This work extends the idea of detecting facial expressions through 'concept frames' to 'concept segments' and argues through extensive experiments that algorithms such as MIL are needed to reap the benefits of such representation. The key advantages of our approach are: (1) joint detection and localization of painful frames using only sequence-level ground-truth, (2) incorporation of temporal dynamics by representing the data not as individual frames but as segments, and (3) extraction of multiple segments, which is well suited to signals with uncertain temporal location and duration in the video. Extensive experiments on UNBC-McMaster Shoulder Pain dataset highlight the effectiveness of the approach by achieving competitive results on both tasks of pain classification and localization in videos. We also empirically evaluate the contributions of different components of MS-MIL. 
The paper also includes the visualization of discriminative facial patches, important for pain detection, as discovered by our algorithm and relates them to Action Units that have been associated with pain expression. We conclude the paper by demonstrating that MS-MIL yields a significant improvement on another spontaneous facial expression dataset, the FEEDTUM dataset.
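The bag-of-segments idea in this abstract can be sketched with the standard multiple instance learning rule that a bag is scored by its best instance. The sketch assumes a multi-scale scanning window for segment generation and uses the mean of hypothetical per-frame scores as each segment's score; in the framework described, segments are instead represented with Bag of Words features and scored by a learned model, so `frame_scores`, the window sizes and the stride are all illustrative.

```python
def scanning_windows(n_frames, sizes=(4, 8), stride=2):
    """Generate (start, end) segments with a multi-scale temporal scanning window."""
    segments = []
    for size in sizes:
        for start in range(0, n_frames - size + 1, stride):
            segments.append((start, start + size))
    return segments


def bag_prediction(frame_scores, segments):
    """Score a sequence (bag) by its highest-scoring segment (instance).

    Returns (bag_score, best_segment). The best segment doubles as a
    localisation of the event, which only sequence-level labels supervise.
    """
    best_segment, best_score = None, float("-inf")
    for start, end in segments:
        score = sum(frame_scores[start:end]) / (end - start)
        if score > best_score:
            best_segment, best_score = (start, end), score
    return best_score, best_segment
```

Taking the maximum over segments is what lets a sequence-level label train and evaluate the model without frame-level annotation: a sequence counts as positive if at least one of its segments looks positive.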
Xiaoming Zou; Grizelle Gonzalez
1997-01-01
Plant community succession alters the quantity and chemistry of organic inputs to soils. These differences in organic input may trigger changes in soil fertility and faunal activity. We examined earthworm density and community structure along a successional sequence of plant communities in abandoned tropical pastures in Puerto Rico. The chronological sequence of these...
ERIC Educational Resources Information Center
Kudoh, Masaharu; Shibuki, Katsuei
2006-01-01
We have previously reported that sound sequence discrimination learning requires cholinergic inputs to the auditory cortex (AC) in rats. In that study, reward was used for motivating discrimination behavior in rats. Therefore, dopaminergic inputs mediating reward signals may have an important role in the learning. We tested the possibility in the…
Comparative study of methods for recognition of an unknown person's action from a video sequence
NASA Astrophysics Data System (ADS)
Hori, Takayuki; Ohya, Jun; Kurumisawa, Jun
2009-02-01
This paper proposes a Tensor Decomposition Based method that can recognize an unknown person's action from a video sequence, where the unknown person is not included in the database (tensor) used for the recognition. The tensor consists of persons, actions and time-series image features. For the observed unknown person's action, one of the actions stored in the tensor is assumed. Using the motion signature obtained from the assumption, the unknown person's actions are synthesized. The actions of one of the persons in the tensor are replaced by the synthesized actions. Then, the core tensor for the replaced tensor is computed. This process is repeated for the actions and persons. For each iteration, the difference between the replaced and original core tensors is computed. The assumption that gives the minimal difference is the action recognition result. For the time-series image features to be stored in the tensor and to be extracted from the observed video sequence, the human body silhouette's contour shape based feature is used. To show the validity of our proposed method, our proposed method is experimentally compared with Nearest Neighbor rule and Principal Component analysis based method. Experiments using 33 persons' seven kinds of action show that our proposed method achieves better recognition accuracies for the seven actions than the other methods.
Genome sequencing of a single tardigrade Hypsibius dujardini individual
Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru
2016-01-01
Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies. PMID:27529330
Genome sequencing of a single tardigrade Hypsibius dujardini individual.
Arakawa, Kazuharu; Yoshida, Yuki; Tomita, Masaru
2016-08-16
Tardigrades are ubiquitous microscopic animals that play an important role in the study of metazoan phylogeny. Most terrestrial tardigrades can withstand extreme environments by entering an ametabolic desiccated state termed anhydrobiosis. Due to their small size and the non-axenic nature of laboratory cultures, molecular studies of tardigrades are prone to contamination. To minimize the possibility of microbial contaminations and to obtain high-quality genomic information, we have developed an ultra-low input library sequencing protocol to enable the genome sequencing of a single tardigrade Hypsibius dujardini individual. Here, we describe the details of our sequencing data and the ultra-low input library preparation methodologies.
A modular DNA signal translator for the controlled release of a protein by an aptamer.
Beyer, Stefan; Simmel, Friedrich C
2006-01-01
Owing to the intimate linkage of sequence and structure in nucleic acids, DNA is an extremely attractive molecule for the development of molecular devices, in particular when a combination of information processing and chemomechanical tasks is desired. Many of the previously demonstrated devices are driven by hybridization between DNA 'effector' strands and specific recognition sequences on the device. For applications it is of great interest to link several of such molecular devices together within artificial reaction cascades. Often it will not be possible to choose DNA sequences freely, e.g. when functional nucleic acids such as aptamers are used. In such cases translation of an arbitrary 'input' sequence into a desired effector sequence may be required. Here we demonstrate a molecular 'translator' for information encoded in DNA and show how it can be used to control the release of a protein by an aptamer using an arbitrarily chosen DNA input strand. The function of the translator is based on branch migration and the action of the endonuclease FokI. The modular design of the translator facilitates the adaptation of the device to various input or output sequences.
NASA Astrophysics Data System (ADS)
Hasan, Taufiq; Bořil, Hynek; Sangwan, Abhijeet; L Hansen, John H.
2013-12-01
The ability to detect and organize 'hot spots' representing areas of excitement within video streams is a challenging research problem when techniques rely exclusively on video content. A generic method for sports video highlight selection is presented in this study which leverages both video/image structure as well as audio/speech properties. Processing begins where the video is partitioned into small segments and several multi-modal features are extracted from each segment. Excitability is computed based on the likelihood of the segmental features residing in certain regions of their joint probability density function space which are considered both exciting and rare. The proposed measure is used to rank order the partitioned segments to compress the overall video sequence and produce a contiguous set of highlights. Experiments are performed on baseball videos based on signal processing advancements for excitement assessment in the commentators' speech, audio energy, slow motion replay, scene cut density, and motion activity as features. Detailed analysis on correlation between user excitability and various speech production parameters is conducted and an effective scheme is designed to estimate the excitement level of commentator's speech from the sports videos. Subjective evaluation of excitability and ranking of video segments demonstrate a higher correlation with the proposed measure compared to well-established techniques indicating the effectiveness of the overall approach.
Video-based face recognition via convolutional neural networks
NASA Astrophysics Data System (ADS)
Bao, Tianlong; Ding, Chunhui; Karmoshi, Saleem; Zhu, Ming
2017-06-01
Face recognition has been widely studied recently, while video-based face recognition still remains a challenging task because of the low quality and large intra-class variation of video-captured face images. In this paper, we focus on two scenarios of video-based face recognition: 1) Still-to-Video (S2V) face recognition, i.e., querying a still face image against a gallery of video sequences; 2) Video-to-Still (V2S) face recognition, in contrast to the S2V scenario. A novel method is proposed in this paper to transfer still and video face images to a Euclidean space by a carefully designed convolutional neural network; Euclidean metrics are then used to measure the distance between still and video images. Identities of still and video images that group as pairs are used as supervision. In the training stage, a joint loss function that measures the Euclidean distance between the predicted features of training pairs and expanding vectors of still images is optimized to minimize the intra-class variation, while the inter-class variation is guaranteed due to the large margin of still images. Transferred features are finally learned via the designed convolutional neural network. Experiments are performed on the COX face dataset. Experimental results show that our method achieves reliable performance compared with other state-of-the-art methods.
McCormick, Paul C
2014-09-01
Dumbbell tumors of the cervical spine can present considerable management challenges related to adequate exposure of both intraspinal and paraspinal tumor components, potential injury to the vertebral artery, and spinal stability. This video demonstrates the microsurgical removal of a large cervical dumbbell schwannoma with instrumented fusion via a single stage extended posterior approach. The video shows patient positioning, tumor exposure, and the sequence and techniques of tumor resection, vertebral artery identification and protection, and dural repair. The video can be found here: http://youtu.be/3lIVfKEcxss.
Self-induced stretch syncope of adolescence: a video-EEG documentation.
Mazzuca, Michel; Thomas, Pierre
2007-12-01
We present the first video-EEG documentation, with ECG and EMG features, of stretch syncope of adolescence in a young, healthy 16-year-old boy. Stretch syncope of adolescence is a rarely reported, benign cause of fainting in young patients, which can be confused with epileptic seizures. In our patient, syncopes were self-induced to avoid school. Dynamic transcranial Doppler showed evidence of blood flow decrease in both posterior cerebral arteries mimicking effects of a Valsalva manoeuvre. Dynamic angiogram of the vertebral arteries was normal. Hypotheses concerning the physiopathology are discussed. [Published with video sequences].
Registration of retinal sequences from new video-ophthalmoscopic camera.
Kolar, Radim; Tornow, Ralf P; Odstrcilik, Jan; Liberdova, Ivana
2016-05-20
Analysis of fast temporal changes on retinas has become an important part of diagnostic video-ophthalmology. It enables investigation of the hemodynamic processes in retinal tissue, e.g. blood-vessel diameter changes as a result of blood-pressure variation, spontaneous venous pulsation influenced by intracranial-intraocular pressure difference, blood-volume changes as a result of changes in light reflection from retinal tissue, and blood flow using laser speckle contrast imaging. For such applications, image registration of the recorded sequence must be performed. Here we use a new non-mydriatic video-ophthalmoscope for simple and fast acquisition of low SNR retinal sequences. We introduce a novel, two-step approach for fast image registration. The phase correlation in the first stage removes large eye movements. Lucas-Kanade tracking in the second stage removes small eye movements. We propose robust adaptive selection of the tracking points, which is the most important part of tracking-based approaches. We also describe a method for quantitative evaluation of the registration results, based on vascular tree intensity profiles. The achieved registration error evaluated on 23 sequences (5840 frames) is 0.78 ± 0.67 pixels inside the optic disc and 1.39 ± 0.63 pixels outside the optic disc. We compared the results with the commonly used approaches based on Lucas-Kanade tracking and scale-invariant feature transform, which achieved worse results. The proposed method can efficiently correct particular frames of retinal sequences for shift and rotation. The registration results for each frame (shift in X and Y direction and eye rotation) can also be used for eye-movement evaluation during single-spot fixation tasks.
Picardi, N
1999-01-01
The ease of tape-recording a surgical operation with simple, manageable, low-cost equipment, especially compared with earlier cinematography, makes it possible for all surgeons to record their own operative activity. Video demonstrations of surgical interventions are therefore now very common, but the videotapes very often show surgical events only in straight chronological succession, like a news chronicle. The simplification of otherwise sophisticated digital image-processing technology makes it more convenient and advisable to assemble the more meaningful sequences into a final product of higher scientific value. Digital technology contributes most during the post-production phase of the videotape, where the surgeon himself can assemble an end product of greater value because it is aimed at scientific and rational communication. Thanks to such processing, the videotape can aim not simply to be a good documentary but also to serve an educational purpose or become a truly scientific film. The initial video is recorded following a specific project, the script, which foresees and plans what has to be demonstrated of the surgical operation and thus establishes in advance the most important steps of the intervention. The recorded sequences are then assembled, not necessarily in chronological succession, integrating the moving images with static pictures such as drawings, schemes, and tables, alongside the picture-in-picture technique and a spoken descriptive commentary. The language of cinema has accustomed us to a series of transitions between sequences, such as fades, cross-overs, and flashbacks, which aim to stimulate the viewer's powers of association and encourage critical ones.
The videotape can be suitably shortened, taking care to show only the essential phases of the operation in order to demonstrate only the core of the problem and make the best use of the observer's physiological span of active attention. Digital processing has become so easy that the surgeon himself can personally edit, on his personal computer and with a professional and scientific attitude, the sequences of his surgical activity into a product of more general value. His personal engagement in the post-production phase also gives him the possibility to demonstrate faithfully with images the complex surgical experience of science, skill and ability to communicate, perhaps better than he is able to do with words.
Modeling of video compression effects on target acquisition performance
NASA Astrophysics Data System (ADS)
Cha, Jae H.; Preece, Bradley; Espinola, Richard L.
2009-05-01
The effect of video compression on image quality was investigated from the perspective of target acquisition performance modeling. Human perception tests were conducted recently at the U.S. Army RDECOM CERDEC NVESD, measuring identification (ID) performance on simulated military vehicle targets at various ranges. These videos were compressed with different quality and/or quantization levels utilizing motion JPEG, motion JPEG2000, and MPEG-4 encoding. To model the degradation on task performance, the loss in image quality is fit to an equivalent Gaussian MTF scaled by the Structural Similarity Image Metric (SSIM). Residual compression artifacts are treated as 3-D spatio-temporal noise. This 3-D noise is found by taking the difference of the uncompressed frame, with the estimated equivalent blur applied, and the corresponding compressed frame. Results show good agreement between the experimental data and the model prediction. This method has led to a predictive performance model for video compression by correlating various compression levels to particular blur and noise input parameters for NVESD target acquisition performance model suite.
Ranging Apparatus and Method Implementing Stereo Vision System
NASA Technical Reports Server (NTRS)
Li, Larry C. (Inventor); Cox, Brian J. (Inventor)
1997-01-01
A laser-directed ranging system for use in telerobotics applications and other applications involving physically handicapped individuals. The ranging system includes a left and right video camera mounted on a camera platform, and a remotely positioned operator. The position of the camera platform is controlled by three servo motors to orient the roll axis, pitch axis and yaw axis of the video cameras, based upon an operator input such as head motion. A laser is provided between the left and right video camera and is directed by the user to point to a target device. The images produced by the left and right video cameras are processed to eliminate all background images except for the spot created by the laser. This processing is performed by creating a digital image of the target prior to illumination by the laser, and then eliminating common pixels from the subsequent digital image which includes the laser spot. The horizontal disparity between the two processed images is calculated for use in a stereometric ranging analysis from which range is determined.
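The processing chain described in this abstract (difference the image taken before laser illumination against the one taken after, locate the remaining spot in each camera, then convert horizontal disparity to range) can be sketched as follows. This is a minimal illustration, not the patented method: the function names, the threshold value, and the parallel-camera pinhole relation range = focal length × baseline / disparity are assumptions introduced here, since the abstract only says a "stereometric ranging analysis" is used.

```python
def spot_column(before, after, threshold=50):
    """Mean column index of pixels that changed by more than `threshold`.

    `before` and `after` are equal-sized 2-D lists of grey levels; pixels
    common to both images (the background) are ignored. The threshold is
    an assumed tuning value. Returns None if no pixel changed.
    """
    cols = [
        x
        for row_b, row_a in zip(before, after)
        for x, (p, q) in enumerate(zip(row_b, row_a))
        if abs(q - p) > threshold
    ]
    return sum(cols) / len(cols) if cols else None


def range_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Range in metres under an assumed parallel-axis pinhole stereo model."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a target in front of the cameras")
    return focal_px * baseline_m / disparity


# Tiny synthetic example: a 2x4 background and one bright laser pixel per view.
background = [[10, 10, 10, 10], [10, 10, 10, 10]]
left_lit = [[10, 10, 10, 200], [10, 10, 10, 10]]
right_lit = [[10, 200, 10, 10], [10, 10, 10, 10]]

x_l = spot_column(background, left_lit)
x_r = spot_column(background, right_lit)
distance = range_from_disparity(x_l, x_r, focal_px=800, baseline_m=0.5)
```

Taking the centroid of the changed pixels, rather than the single brightest one, is a design choice that gives sub-pixel disparity when the spot covers several pixels.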
Comparison of ASGARD and UFOCapture
NASA Technical Reports Server (NTRS)
Blaauw, Rhiannon C.; Cruse, Katherine S.
2011-01-01
The Meteoroid Environment Office is undertaking a comparison between UFOCapture/Analyzer and ASGARD (All Sky and Guided Automatic Realtime Detection). To accomplish this, video output from a Watec video camera on a 17 mm Schneider lens (25 degree field of view) was split and input into the two different meteor detection software packages. The purpose of this study is to compare the sensitivity of the two systems, false alarm rates and trajectory information, among other quantities. The important components of each software package will be highlighted and comments made about the detection/rejection algorithms and the amount of user labor required for each system.
Temporal flicker reduction and denoising in video using sparse directional transforms
NASA Astrophysics Data System (ADS)
Kanumuri, Sandeep; Guleryuz, Onur G.; Civanlar, M. Reha; Fujibayashi, Akira; Boon, Choong S.
2008-08-01
The bulk of the video content available today over the Internet and over mobile networks suffers from many imperfections caused during acquisition and transmission. In the case of user-generated content, which is typically produced with inexpensive equipment, these imperfections manifest in various ways through noise, temporal flicker and blurring, just to name a few. Imperfections caused by compression noise and temporal flicker are present in both studio-produced and user-generated video content transmitted at low bit-rates. In this paper, we introduce an algorithm designed to reduce temporal flicker and noise in video sequences. The algorithm takes advantage of the sparse nature of video signals in an appropriate transform domain that is chosen adaptively based on local signal statistics. When the signal corresponds to a sparse representation in this transform domain, flicker and noise, which are spread over the entire domain, can be reduced easily by enforcing sparsity. Our results show that the proposed algorithm reduces flicker and noise significantly and enables better presentation of compressed videos.
Video documentation of experiments at the USGS debris-flow flume 1992–2017
Logan, Matthew; Iverson, Richard M.
2007-11-23
This set of videos presents about 18 hours of footage documenting the 163 experiments conducted at the USGS debris-flow flume from 1992 to 2017. Owing to improvements in video technology over the years, the quality of footage from recent experiments generally exceeds that from earlier experiments. Use the list below to access the individual videos, which are mostly grouped by date and subject matter. When a video is selected from the list, multiple video sequences are generally shown in succession, beginning with a far-field overview and proceeding to close-up views and post-experiment documentation. Interpretations and data from experiments at the USGS debris-flow flume are not provided here but can be found in published reports, many of which are available online at: https://profile.usgs.gov/riverson/ A brief introduction to the flume facility is also available online in USGS Open-File Report 92–483 [http://pubs.er.usgs.gov/usgspubs/ofr/ofr92483].
NASA Astrophysics Data System (ADS)
Duplaga, M.; Leszczuk, M. I.; Papir, Z.; Przelaskowski, A.
2008-12-01
Wider dissemination of medical digital video libraries is affected by two correlated factors: resource-effective content compression and the diagnostic credibility that compression directly influences. It has been shown that these contradictory requirements can be met halfway for long-lasting, low-motion surgery recordings at compression ratios close to 100 (bronchoscopic procedures were the case study investigated). As the main supporting assumption, it has been accepted that the content can be compressed as long as clinicians are not able to sense a loss of video diagnostic fidelity (visually lossless compression). Different market codecs were inspected by means of combined subjective and objective tests of their usability in medical video libraries. Subjective tests involved a panel of clinicians who had to classify compressed bronchoscopic video content according to its quality under the bubble sort algorithm. For objective tests, two metrics (hybrid vector measure and Hosaka plots) were calculated frame by frame and averaged over a whole sequence.
Authoring Data-Driven Videos with DataClips.
Amini, Fereshteh; Riche, Nathalie Henry; Lee, Bongshin; Monroy-Hernandez, Andres; Irani, Pourang
2017-01-01
Data videos, or short data-driven motion graphics, are an increasingly popular medium for storytelling. However, creating data videos is difficult as it involves pulling together a unique combination of skills. We introduce DataClips, an authoring tool aimed at lowering the barriers to crafting data videos. DataClips allows non-experts to assemble data-driven "clips" together to form longer sequences. We constructed the library of data clips by analyzing the composition of over 70 data videos produced by reputable sources such as The New York Times and The Guardian. We demonstrate that DataClips can reproduce over 90% of our data videos corpus. We also report on a qualitative study comparing the authoring process and outcome achieved by (1) non-experts using DataClips, and (2) experts using Adobe Illustrator and After Effects to create data-driven clips. Results indicated that non-experts are able to learn and use DataClips with a short training period. In the span of one hour, they were able to produce more videos than experts using a professional editing tool, and their clips were rated similarly by an independent audience.
Meldrum, Sarah; Savarimuthu, Bastin Tr; Licorish, Sherlock; Tahir, Amjed; Bosu, Michael; Jayakaran, Prasath
2017-01-01
There is little research that characterises knee pain related information disseminated via social media. However, variances in the content and quality of such sources could compromise optimal patient care. This study explored the nature of the comments on YouTube videos related to non-specific knee pain, to determine their helpfulness to the users. A systematic search identified 900 videos related to knee pain on the YouTube database. A total of 3537 comments from 58 videos were included in the study. A categorisation scheme was developed and 1000 randomly selected comments were analysed according to this scheme. The most common category was the users providing personal information or describing a personal situation (19%), followed by appreciation or acknowledgement of others' inputs (17%) and asking questions (15%). Of the questions, 33% were related to seeking help in relation to a specific situation. Over 10% of the comments contained negativity or disagreement; while 4.4% of comments reported they intended to pursue an action, based on the information presented in the video and/or from user comments. It was observed that individuals commenting on YouTube videos on knee pain were most often soliciting advice and information specific to their condition. The analysis of comments from the most commented videos using a keyword-based search approach suggests that the YouTube videos can be used for disseminating general advice on knee pain.
Meldrum, Sarah; Savarimuthu, Bastin TR; Licorish, Sherlock; Tahir, Amjed; Bosu, Michael; Jayakaran, Prasath
2017-01-01
Objective There is little research that characterises knee pain related information disseminated via social media. However, variances in the content and quality of such sources could compromise optimal patient care. This study explored the nature of the comments on YouTube videos related to non-specific knee pain, to determine their helpfulness to the users. Methods A systematic search identified 900 videos related to knee pain on the YouTube database. A total of 3537 comments from 58 videos were included in the study. A categorisation scheme was developed and 1000 randomly selected comments were analysed according to this scheme. Results The most common category was the users providing personal information or describing a personal situation (19%), followed by appreciation or acknowledgement of others’ inputs (17%) and asking questions (15%). Of the questions, 33% were related to seeking help in relation to a specific situation. Over 10% of the comments contained negativity or disagreement; while 4.4% of comments reported they intended to pursue an action, based on the information presented in the video and/or from user comments. Conclusion It was observed that individuals commenting on YouTube videos on knee pain were most often soliciting advice and information specific to their condition. The analysis of comments from the most commented videos using a keyword-based search approach suggests that the YouTube videos can be used for disseminating general advice on knee pain. PMID:29942583
High Tech and Library Access for People with Disabilities.
ERIC Educational Resources Information Center
Roatch, Mary A.
1992-01-01
Describes tools that enable people with disabilities to access print information, including optical character recognition, synthetic voice output, other input devices, Braille access devices, large print displays, television and video, TDD (Telecommunications Devices for the Deaf), and Telebraille. Use of technology by libraries to meet mandates…
NASA Dryden Flight Research Center
NASA Technical Reports Server (NTRS)
Navarro, Robert
2009-01-01
This DVD has several short videos showing some of the work that Dryden is involved in with experimental aircraft. These are: shots showing the Active AeroElastic Wing (AAW) loads calibration tests, AAW roll maneuvers, AAW flight control surface inputs, Helios flight, and takeoff, and Pathfinder takeoff, flight and landing.
Miniature electrometer preamplifier effectively compensates for input capacitance
NASA Technical Reports Server (NTRS)
Burrous, C. N.; Deboo, G. J.
1966-01-01
Negative capacitance preamplifier using a dual MOS (Metal Oxide Silicon) transistor in conjunction with bipolar transistors is used with intracellular microelectrodes in recording bioelectric potentials. Applications would include use as a pickup plate video amplifier in storage tube tests and for pH and ionization chamber measurements.
A Macintosh-Based Scientific Images Video Analysis System
NASA Technical Reports Server (NTRS)
Groleau, Nicolas; Friedland, Peter (Technical Monitor)
1994-01-01
A set of experiments was designed at MIT's Man-Vehicle Laboratory in order to evaluate the effects of zero gravity on the human orientation system. During many of these experiments, the movements of the eyes are recorded on high quality video cassettes. The images must be analyzed off-line to calculate the position of the eyes at every moment in time. To this aim, I have implemented a simple inexpensive computerized system which measures the angle of rotation of the eye from digitized video images. The system is implemented on a desktop Macintosh computer, processes one play-back frame per second and exhibits adequate levels of accuracy and precision. The system uses LabVIEW, a digital output board, and a video input board to control a VCR, digitize video images, analyze them, and provide a user friendly interface for the various phases of the process. The system uses the Concept Vi LabVIEW library (Graftek's Image, Meudon la Foret, France) for image grabbing and displaying as well as translation to and from LabVIEW arrays. Graftek's software layer drives an Image Grabber board from Neotech (Eastleigh, United Kingdom). A Colour Adapter box from Neotech provides adequate video signal synchronization. The system also requires a LabVIEW driven digital output board (MacADIOS II from GW Instruments, Cambridge, MA) controlling a slightly modified VCR remote control used mainly to advance the video tape frame by frame.
Detection of distorted frames in retinal video-sequences via machine learning
NASA Astrophysics Data System (ADS)
Kolar, Radim; Liberdova, Ivana; Odstrcilik, Jan; Hracho, Michal; Tornow, Ralf P.
2017-07-01
This paper describes the detection of distorted frames in retinal sequences based on a set of global features extracted from each frame. The feature vector is subsequently used in a classification step, in which three types of classifiers are tested. The best classification accuracy, 96%, was achieved with the support vector machine approach.
Authentic L2 Interactions as Material for a Pragmatic Awareness-Raising Activity
ERIC Educational Resources Information Center
Cheng, Tsui-Ping
2016-01-01
This study draws on conversation analysis to explore the pedagogical possibility of using audiovisual depictions of authentic disagreement sequences from L2 interactions as sources for an awareness-raising activity in an English as a Second Language (ESL) classroom. Video excerpts of disagreement sequences collected from two ESL classes were used…
Using video-oriented instructions to speed up sequence comparison.
Wozniak, A
1997-04-01
This document presents an implementation of the well-known Smith-Waterman algorithm for comparison of protein and nucleic acid sequences, using specialized video instructions. These instructions, SIMD-like in their design, make possible parallelization of the algorithm at the instruction level. Benchmarks on an ULTRA SPARC running at 167 MHz show a speed-up factor of two compared to the same algorithm implemented with integer instructions on the same machine. Performance reaches over 18 million matrix cells per second on a single processor, giving to our knowledge the fastest implementation of the Smith-Waterman algorithm on a workstation. The accelerated procedure was introduced in LASSAP (a LArge Scale Sequence compArison Package developed at INRIA), which handles parallelism at a higher level. On a SUN Enterprise 6000 server with 12 processors, a speed of nearly 200 million matrix cells per second has been obtained. A sequence of length 300 amino acids is scanned against SWISSPROT R33 (18,531,385 residues) in 29 s. This procedure is not restricted to databank scanning. It applies to all cases handled by LASSAP (intra- and inter-bank comparisons, Z-score computation, etc.).
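For readers unfamiliar with the "matrix cells" that the throughput figures count, the following is a minimal scalar sketch of the Smith-Waterman recurrence. It is not the SIMD implementation described in the abstract: the match/mismatch scores and the linear gap penalty are illustrative assumptions, whereas production tools typically use a substitution matrix and affine gaps.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the dynamic
    programming matrix, so memory is O(len(b)). Scoring values are
    assumed defaults chosen for illustration.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # Each cell is the best of: restart (0), diagonal, up, left.
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


score = smith_waterman("TTACGTTT", "GGACGGG")
```

Every evaluation of the inner `max` is one matrix cell, so comparing a 300-residue query against a databank of N residues costs about 300 × N cell updates; the cells along an anti-diagonal do not depend on each other, which is what makes instruction-level parallelization possible.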
Hierarchical structure for audio-video based semantic classification of sports video sequences
NASA Astrophysics Data System (ADS)
Kolekar, M. H.; Sengupta, S.
2005-07-01
A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and yet unexplored. We have successfully solved cricket video classification problem using a six level hierarchical structure. The first level performs event detection based on audio energy and Zero Crossing Rate (ZCR) of short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to any other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
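The two audio features named for the first level of the hierarchy, short-time energy and Zero Crossing Rate, can be computed per frame as in the sketch below. The normalisation conventions (mean of squared samples, crossings divided by the number of adjacent sample pairs) are assumptions, as the abstract does not specify them.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one audio frame (a non-empty list of samples)."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (frame length >= 2)."""
    crossings = sum(
        1 for p, q in zip(frame, frame[1:]) if (p >= 0) != (q >= 0)
    )
    return crossings / (len(frame) - 1)


# A rapidly alternating frame versus a slowly rising one.
noisy = [1, -1, 1, -1]
smooth = [1, 2, 3, 4]
features = [(short_time_energy(f), zero_crossing_rate(f)) for f in (noisy, smooth)]
```

A detector built on these features would compare them with thresholds frame by frame; the threshold values would have to be tuned on real commentary and crowd audio, which the abstract does not report.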
Privacy enabling technology for video surveillance
NASA Astrophysics Data System (ADS)
Dufaux, Frédéric; Ouaret, Mourad; Abdeljaoued, Yousri; Navarro, Alfonso; Vergnenègre, Fabrice; Ebrahimi, Touradj
2006-05-01
In this paper, we address the problem of privacy in video surveillance. We propose an efficient solution based on transform-domain scrambling of regions of interest in a video sequence. More specifically, the sign of selected transform coefficients is flipped during encoding. We address more specifically the case of Motion JPEG 2000. Simulation results show that the technique can be successfully applied to conceal information in regions of interest in the scene while providing a good level of security. Furthermore, the scrambling is flexible and allows adjusting the amount of distortion introduced. This is achieved with a small impact on coding performance and a negligible increase in computational complexity. In the proposed video surveillance system, heterogeneous clients can remotely access the system through the Internet or 2G/3G mobile phone network. Thanks to the inherently scalable Motion JPEG 2000 codestream, the server is able to adapt the resolution and bandwidth of the delivered video depending on the usage environment of the client.
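The core idea of flipping the signs of selected transform coefficients can be illustrated with the short sketch below. It is a toy model under stated assumptions: the coefficients are a flat list of integers, the selection is driven by Python's `random.Random` seeded with a key, and the optional `roi` mask is a hypothetical way of marking the region of interest. A real system would operate on wavelet coefficients inside the codec and use a cryptographically secure generator.

```python
import random


def scramble_signs(coeffs, key, roi=None):
    """Flip the sign of pseudo-randomly selected coefficients.

    `key` seeds the generator; `roi` is an optional list of booleans
    marking which coefficients belong to the region of interest.
    Applying the function twice with the same key and mask restores
    the input, so the same routine also descrambles.
    """
    rng = random.Random(key)
    out = []
    for i, c in enumerate(coeffs):
        # Draw for every coefficient so the stream stays aligned on descrambling.
        flip = rng.random() < 0.5
        inside = roi is None or roi[i]
        out.append(-c if (flip and inside) else c)
    return out


coeffs = [5, -3, 0, 7, -1, 2, 8, -6]
mask = [False] * 4 + [True] * 4
scrambled = scramble_signs(coeffs, key=42, roi=mask)
restored = scramble_signs(scrambled, key=42, roi=mask)
```

Because only signs change, coefficient magnitudes are untouched, which is consistent with the abstract's claim of a small impact on coding performance; a viewer without the key cannot tell which signs were flipped.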
Design and implementation of a non-linear symphonic soundtrack of a video game
NASA Astrophysics Data System (ADS)
Sporka, Adam J.; Valta, Jan
2017-10-01
The music in the contemporary video games is often interactive. The music playback is based on transitions between pieces of available music material. These transitions happen in response to evolving gameplay. This paradigm is referred to as the adaptive music. Our challenge was to design, create, and implement the soundtrack of the upcoming video game Kingdom Come: Deliverance. Our soundtrack is a collection of compositions with symphonic orchestration. Per our design decision, our intention was to implement the adaptive music in a way which respected the nature of the orchestral film score. We created our own adaptive music middleware, called Sequence Music Engine, implementing a high-level music logic as well as the low-level playback infrastructure. Our system can handle hours of video game music, helps maintain the relevance of the music throughout the video game, and minimises the repetitiveness of the individual pieces.
Library orientation on videotape: production planning and administrative support.
Shedlock, J; Tawyea, E W
1989-01-01
New student-faculty-staff orientation is an important public service in a medical library and demands creativity, imagination, teaching skill, coordination, and cooperation on the part of public services staff. The Northwestern University Medical Library (NUML) implemented a video production service in the spring of 1986 and used the new service to produce an orientation videotape for incoming students, new faculty, and medical center staff. Planning is an important function in video production, and the various phases of outlining topics, drafting scripts, matching video sequences, and actual taping of video, voice, and music are described. The NUML orientation videotape demonstrates how reference and audiovisual services merge talent and skills to benefit the library user. Videotape production, however, cannot happen in a vacuum of good intentions and high ideals. This paper also presents the management support and cost analysis needed to make video production services a reality for use by public service departments.
Video watermarking for mobile phone applications
NASA Astrophysics Data System (ADS)
Mitrea, M.; Duta, S.; Petrescu, M.; Preteux, F.
2005-08-01
Nowadays, alongside the traditional voice signal, music, video, and 3D characters are becoming common data to be run, stored and/or processed on mobile phones. Hence, protecting their related intellectual property rights also becomes a crucial issue. The video sequences involved in such applications are generally coded at very low bit rates. The present paper starts by presenting an accurate statistical investigation of such video as well as of a very dangerous attack (the StirMark attack). The obtained results are put into practice when adapting a spread spectrum watermarking method to such applications. The informed watermarking approach was also considered: an outstanding method belonging to this paradigm has been adapted and re-evaluated under the low rate video constraint. The experiments were conducted in collaboration with the SFR mobile services provider in France. They also allow a comparison between the spread spectrum and informed embedding techniques.
Characterization, adaptive traffic shaping, and multiplexing of real-time MPEG II video
NASA Astrophysics Data System (ADS)
Agrawal, Sanjay; Barry, Charles F.; Binnai, Vinay; Kazovsky, Leonid G.
1997-01-01
We obtain a network traffic model for real-time MPEG-II encoded digital video by analyzing video stream samples from real-time encoders from NUKO Information Systems. MPEG-II sample streams include a resolution intensive movie, City of Joy, an action intensive movie, Aliens, a luminance intensive (black and white) movie, Road To Utopia, and a chrominance intensive (color) movie, Dick Tracy. From our analysis we obtain a heuristic model for the encoded video traffic which uses a 15-stage Markov process to model the I, B, P frame sequences within a group of pictures (GOP). A jointly-correlated Gaussian process is used to model the individual frame sizes. Scene change arrivals are modeled according to a gamma process. Simulations show that our MPEG-II traffic model generates I, B, P frame sequences and frame sizes that closely match the sample MPEG-II stream traffic characteristics as they relate to latency and buffer occupancy in network queues. To achieve high multiplexing efficiency we propose a traffic shaping scheme which sets preferred I-frame generation times among a group of encoders so as to minimize the overall variation in total offered traffic while still allowing the individual encoders to react to scene changes. Simulations show that our scheme results in multiplexing gains of up to 10%, enabling us to multiplex twenty 6 Mbps MPEG-II video streams instead of 18 streams over an ATM/SONET OC3 link without latency or cell loss penalty. This scheme is due for a patent.
USDA-ARS's Scientific Manuscript database
The advancement of next-generation sequencing technologies in conjunction with new bioinformatics tools enabled fine-tuning of sequence-based high resolution mapping strategies for complex genomes. Although genotyping-by-sequencing (GBS) provides a large number of markers, its application for assoc...
Research on compression performance of ultrahigh-definition videos
NASA Astrophysics Data System (ADS)
Li, Xiangqun; He, Xiaohai; Qing, Linbo; Tao, Qingchuan; Wu, Di
2017-11-01
With the popularization of high-definition (HD) images and videos (1920×1080 pixels and above), there are even 4K (3840×2160) television signals and 8K (8192×4320) ultrahigh-definition videos. The demand for HD images and videos is increasing continuously, along with the increasing data volume. The storage and transmission problems cannot be properly solved only by expanding the capacity of hard disks and by updating and improving transmission devices. Based on the full use of the high-efficiency video coding (HEVC) standard, super-resolution reconstruction technology, and the correlation between intra- and interprediction, we first put forward a "division-compensation"-based strategy to further improve the compression performance for a single image and an I-frame. Then, by making use of this idea and the HEVC encoder and decoder, a video compression coding framework is designed. HEVC is used inside the framework. Last, with the super-resolution reconstruction technology, the reconstructed video quality is further improved. The experiment shows that the proposed compression method for a single image (I-frame) and for a video sequence outperforms HEVC in a low bit rate environment.
Bring It to the Pitch: Combining Video and Movement Data to Enhance Team Sport Analysis.
Stein, Manuel; Janetzko, Halldor; Lamprecht, Andreas; Breitkreutz, Thorsten; Zimmermann, Philipp; Goldlucke, Bastian; Schreck, Tobias; Andrienko, Gennady; Grossniklaus, Michael; Keim, Daniel A
2018-01-01
Analysts in professional team sport regularly perform analysis to gain strategic and tactical insights into player and team behavior. Goals of team sport analysis regularly include identification of weaknesses of opposing teams, or assessing performance and improvement potential of a coached team. Current analysis workflows are typically based on the analysis of team videos. Also, analysts can rely on techniques from Information Visualization, to depict e.g., player or ball trajectories. However, video analysis is typically a time-consuming process, where the analyst needs to memorize and annotate scenes. In contrast, visualization typically relies on an abstract data model, often using abstract visual mappings, and is not directly linked to the observed movement context anymore. We propose a visual analytics system that tightly integrates team sport video recordings with abstract visualization of underlying trajectory data. We apply appropriate computer vision techniques to extract trajectory data from video input. Furthermore, we apply advanced trajectory and movement analysis techniques to derive relevant team sport analytic measures for region, event and player analysis in the case of soccer analysis. Our system seamlessly integrates video and visualization modalities, enabling analysts to draw on the advantages of both analysis forms. Several expert studies conducted with team sport analysts indicate the effectiveness of our integrated approach.
Doulamis, A; Doulamis, N; Ntalianis, K; Kollias, S
2003-01-01
In this paper, an unsupervised video object (VO) segmentation and tracking algorithm is proposed based on an adaptable neural-network architecture. The proposed scheme comprises: 1) a VO tracking module and 2) an initial VO estimation module. Object tracking is handled as a classification problem and implemented through an adaptive network classifier, which provides better results compared to conventional motion-based tracking algorithms. Network adaptation is accomplished through an efficient and cost-effective weight updating algorithm, providing a minimum degradation of the previous network knowledge and taking into account the current content conditions. A retraining set is constructed and used for this purpose based on initial VO estimation results. Two different scenarios are investigated. The first concerns extraction of human entities in video conferencing applications, while the second exploits depth information to identify generic VOs in stereoscopic video sequences. Human face/body detection based on Gaussian distributions is accomplished in the first scenario, while segmentation fusion is obtained using color and depth information in the second scenario. A decision mechanism is also incorporated to detect time instances for weight updating. Experimental results and comparisons indicate the good performance of the proposed scheme even in sequences with complicated content (object bending, occlusion).
Universal sensor interface module (USIM)
NASA Astrophysics Data System (ADS)
King, Don; Torres, A.; Wynn, John
1999-01-01
A universal sensor interface module (USIM) is being developed by the Raytheon-TI Systems Company for use with fields of unattended distributed sensors. In its production configuration, the USIM will be a multichip module consisting of a set of common modules. The common module USIM set consists of (1) a sensor adapter interface (SAI) module, (2) a digital signal processor (DSP) and associated memory module, and (3) an RF transceiver module. The multispectral sensor interface is designed around a low-power A/D converter, whose input/output interface consists of eight buffered, sampled inputs from various devices including environmental, acoustic, seismic and magnetic sensors. The eight sensor inputs are each high-impedance, low-capacitance, differential amplifiers. The inputs are ideally suited for interface with discrete or MEMS sensors, since the differential input will allow direct connection with high-impedance bridge sensors and capacitance voltage sources. Each amplifier is connected to a 22-bit delta-sigma A/D converter to enable simultaneous samples. The low power delta-sigma converter provides 22-bit resolution at sample frequencies up to 142 hertz (used for magnetic sensors) and 16-bit resolution at frequencies up to 1168 hertz (used for acoustic and seismic sensors). The video interface module is based around the TMS320C5410 DSP. It can provide sensor array addressing, video data input, data calibration and correction. The processor module is based upon an MPC555. It will be used for mode control, synchronization of complex sensors, sensor signal processing, array processing, target classification and tracking. Many functions of the A/D, DSP and transceiver can be powered down by using variable clock speeds under software command or chip power switches. They can be returned to intermediate or full operation by DSP command. Power management may be based on the USIM's internal timer, command from the USIM transceiver, or by sleep mode processing management. The low power detection mode is implemented by monitoring any of the sensor analog outputs at lower sample rates for detection over a software controllable threshold.
Automated sequence-specific protein NMR assignment using the memetic algorithm MATCH.
Volk, Jochen; Herrmann, Torsten; Wüthrich, Kurt
2008-07-01
MATCH (Memetic Algorithm and Combinatorial Optimization Heuristics) is a new memetic algorithm for automated sequence-specific polypeptide backbone NMR assignment of proteins. MATCH employs local optimization for tracing partial sequence-specific assignments within a global, population-based search environment, where the simultaneous application of local and global optimization heuristics guarantees high efficiency and robustness. MATCH thus makes combined use of the two predominant concepts in use for automated NMR assignment of proteins. Dynamic transition and inherent mutation are new techniques that enable automatic adaptation to variable quality of the experimental input data. The concept of dynamic transition is incorporated in all major building blocks of the algorithm, where it enables switching between local and global optimization heuristics at any time during the assignment process. Inherent mutation restricts the intrinsically required randomness of the evolutionary algorithm to those regions of the conformation space that are compatible with the experimental input data. Using intact and artificially deteriorated APSY-NMR input data of proteins, MATCH performed sequence-specific resonance assignment with high efficiency and robustness.
Long period pseudo random number sequence generator
NASA Technical Reports Server (NTRS)
Wang, Charles C. (Inventor)
1989-01-01
A circuit for generating a sequence of pseudo random numbers, (A sub K). There is an exponentiator in GF(2 sup m) for the normal basis representation of elements in a finite field GF(2 sup m), each represented by m binary digits, having two inputs and an output from which the sequence (A sub K) of pseudo random numbers is taken. One of the two inputs is connected to receive the outputs (E sub K) of a maximal length shift register of n stages. There is a switch having a pair of inputs and an output. The switch output is connected to the other of the two inputs of the exponentiator. One of the switch inputs is connected for initially receiving a primitive element (A sub O) in GF(2 sup m). Finally, there is a delay circuit having an input and an output. The delay circuit output is connected to the other of the switch inputs and the delay circuit input is connected to the output of the exponentiator. Thus, after the exponentiator initially receives the primitive element (A sub O) in GF(2 sup m) through the switch, the switch can be switched to cause the exponentiator to receive as its input a delayed output A(K-1) from the exponentiator, thereby generating (A sub K) continuously at the output of the exponentiator. The exponentiator in GF(2 sup m) is novel and comprises a cyclic-shift circuit, a Massey-Omura multiplier, and a control logic circuit, all operably connected together to perform the function U(sub i) = 92(sup i) (for n(sub i) = 1) or 1 (for n(sub i) = 0).
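The feedback loop described in this abstract can be illustrated with a small software sketch. It assumes the recurrence A_K = A_(K-1) raised to the power E_K, which is one reading of the abstract and not a statement of the patented circuit. Polynomial-basis arithmetic in GF(2^5) stands in for the normal-basis Massey-Omura hardware, and the field size, modulus, shift-register taps, and all function names are hypothetical choices made only for illustration.

```python
# Illustrative sketch only. Polynomial-basis GF(2^M) arithmetic replaces the
# normal-basis Massey-Omura multiplier named in the abstract (assumption).

M = 5                  # GF(2^5); chosen because 2^5 - 1 = 31 is prime
MOD_POLY = 0b100101    # x^5 + x^2 + 1, a primitive polynomial over GF(2)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= MOD_POLY
    return result


def gf_pow(base, exponent):
    """Square-and-multiply exponentiation in GF(2^M)."""
    result = 1
    square = base              # holds base^(2^i) at step i
    while exponent:
        if exponent & 1:
            result = gf_mul(result, square)
        square = gf_mul(square, square)
        exponent >>= 1
    return result


def lfsr_states(seed=0b001, taps=(2, 1), nbits=3):
    """Yield states of a 3-stage maximal-length shift register (period 7)."""
    state = seed
    while True:
        yield state
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & ((1 << nbits) - 1)


def generate(a0, count):
    """Return A_1..A_count using the assumed recurrence A_K = A_(K-1)^E_K."""
    sequence = []
    a = a0
    exponents = lfsr_states()
    for _ in range(count):
        a = gf_pow(a, next(exponents))   # delayed output fed back as the base
        sequence.append(a)
    return sequence
```

Calling `generate(2, 7)` feeds the element x (encoded as 2) through the loop for one full period of the shift register. Because 31 is prime and the exponents 1 to 7 are all coprime to it, the fed-back value never collapses to 1 in this toy configuration; a real design would need its own analysis of the field and register sizes.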
Introduction to study and simulation of low rate video coding schemes
NASA Technical Reports Server (NTRS)
1992-01-01
During this period, simulators for the various HDTV systems proposed to the FCC were developed. These simulators will be tested using test sequences from the MPEG committee. The results will be extrapolated to HDTV video sequences. Currently, the simulator for the compression aspects of the Advanced Digital Television (ADTV) system has been completed. Other HDTV proposals are at various stages of development. A brief overview of the ADTV system is given. Some coding results obtained using the simulator are discussed. These results are compared to those obtained using the CCITT H.261 standard. These results are evaluated in the context of the CCSDS specifications, and some suggestions are made as to how the ADTV system could be implemented in the NASA network.
An improved real time superresolution FPGA system
NASA Astrophysics Data System (ADS)
Lakshmi Narasimha, Pramod; Mudigoudar, Basavaraj; Yue, Zhanfeng; Topiwala, Pankaj
2009-05-01
In numerous computer vision applications, enhancing the quality and resolution of captured video can be critical. Acquired video is often grainy and low quality due to motion, transmission bottlenecks, etc. Postprocessing can enhance it. Superresolution greatly decreases camera jitter to deliver a smooth, stabilized, high quality video. In this paper, we extend previous work on a real-time superresolution application implemented in ASIC/FPGA hardware. A gradient based technique is used to register the frames at the sub-pixel level. Once we get the high resolution grid, we use an improved regularization technique in which the image is iteratively modified by applying back-projection to get a sharp and undistorted image. The algorithm was first tested in software and migrated to hardware, to achieve 320x240 -> 1280x960, about 30 fps, a stunning superresolution by 16X in total pixels. Various input parameters, such as size of input image, enlarging factor and the number of nearest neighbors, can be tuned conveniently by the user. We use a maximum word size of 32 bits to implement the algorithm in Matlab Simulink as well as in FPGA hardware, which gives us a fine balance between the number of bits and performance. The proposed system is robust and highly efficient. We have shown the performance improvement of the hardware superresolution over the software version (C code).
A design for living technology: experiments with the mind time machine.
Ikegami, Takashi
2013-01-01
Living technology aims to help people expand their experiences in everyday life. The environment offers people ways to interact with it, which we call affordances. Living technology is a design for new affordances. When we experience something new, we remember it by the way we perceive and interact with it. Recent studies in neuroscience have led to the idea of a default mode network, which is a baseline activity of a brain system. The autonomy of artificial life must be understood as a sort of default mode that self-organizes its baseline activity, preparing for its external inputs and its interaction with humans. I thus propose a method for creating a suitable default mode as a design principle for living technology. I built a machine called the mind time machine (MTM), which runs continuously for 10 h per day and receives visual data from its environment using 15 video cameras. The MTM receives and edits the video inputs while it self-organizes the momentary now. Its base program is a neural network that includes chaotic dynamics inside the system and a meta-network that consists of video feedback systems. Using this system as the hardware and a default mode network as a conceptual framework, I describe the system's autonomous behavior. Using the MTM as a testing ground, I propose a design principle for living technology.
SIRSALE: integrated video database management tools
NASA Astrophysics Data System (ADS)
Brunie, Lionel; Favory, Loic; Gelas, J. P.; Lefevre, Laurent; Mostefaoui, Ahmed; Nait-Abdesselam, F.
2002-07-01
Video databases became an active field of research during the last decade. The main objective in such systems is to provide users with capabilities to search, access and play back distributed stored video data in a user-friendly way, just as they do for traditional distributed databases. Hence, such systems need to deal with hard issues: (a) video documents generate huge volumes of data and are time sensitive (streams must be delivered at a specific bitrate), (b) the contents of video data are very hard to extract automatically and need to be annotated by humans. To cope with these issues, many approaches have been proposed in the literature including data models, query languages, video indexing etc. In this paper, we present SIRSALE: a set of video database management tools that allow users to manipulate video documents and streams stored in large distributed repositories. All the proposed tools are based on generic models that can be customized for specific applications using ad-hoc adaptation modules. More precisely, SIRSALE allows users to: (a) browse video documents by structures (sequences, scenes, shots) and (b) query the video database content by using a graphical tool, adapted to the nature of the target video documents. This paper also presents an annotating interface which allows archivists to describe the content of video documents. All these tools are coupled to a video player integrating remote VCR functionalities and are based on active network technology. We then present how dedicated active services allow an optimized video transport for video streams (with Tamanoir active nodes). We then describe experiments using SIRSALE on an archive of news video and soccer matches. The system has been demonstrated to professionals with positive feedback. Finally, we discuss open issues and present some perspectives.
Collaborative Estimation in Distributed Sensor Networks
ERIC Educational Resources Information Center
Kar, Swarnendu
2013-01-01
Networks of smart ultra-portable devices are already indispensable in our lives, augmenting our senses and connecting our lives through real time processing and communication of sensory (e.g., audio, video, location) inputs. Though usually hidden from the user's sight, the engineering of these devices involves fierce tradeoffs between energy…
78 FR 29387 - Government-Owned Inventions, Available for Licensing
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-20
... System for Physiologically Modulating Action Role-playing Open World Video Games and Simulations Which... Deposition Measurement for the Electron Beam Free Form Fabrication (EBF3) Process; NASA Case No.: LAR-17887-1... Modulating Videogames and Simulations Which Use Gesture and Body Image Sensing Control Input Devices; NASA...
Fast repurposing of high-resolution stereo video content for mobile use
NASA Astrophysics Data System (ADS)
Karaoglu, Ali; Lee, Bong Ho; Boev, Atanas; Cheong, Won-Sik; Gotchev, Atanas
2012-06-01
3D video content is captured and created mainly in high resolution targeting big cinema or home TV screens. For 3D mobile devices, equipped with small-size auto-stereoscopic displays, such content has to be properly repurposed, preferably in real-time. The repurposing requires not only spatial resizing but also properly maintaining the output stereo disparity, as it should deliver realistic, pleasant and harmless 3D perception. In this paper, we propose an approach to adapt the disparity range of the source video to the comfort disparity zone of the target display. To achieve this, we adapt the scale and the aspect ratio of the source video. We aim at maximizing the disparity range of the retargeted content within the comfort zone, and minimizing the letterboxing of the cropped content. The proposed algorithm consists of five stages. First, we analyse the display profile, which characterises what 3D content can be comfortably observed in the target display. Then, we perform fast disparity analysis of the input stereoscopic content. Instead of returning the dense disparity map, it returns an estimate of the disparity statistics (min, max, mean and variance) per frame. Additionally, we detect scene cuts, where sharp transitions in disparities occur. Based on the estimated input and desired output disparity ranges, we derive the optimal cropping parameters and scale of the cropping window, which would yield the targeted disparity range and minimize the area of cropped and letterboxed content. Once the rescaling and cropping parameters are known, we perform a resampling procedure using spline-based and perceptually optimized resampling (anti-aliasing) kernels, which also have a very efficient computational structure. Perceptual optimization is achieved through adjusting the cut-off frequency of the anti-aliasing filter with the throughput of the target display.
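The disparity-analysis and scaling stages can be sketched in a few lines. This is a simplified illustration, not the authors' algorithm: it assumes that pixel disparity scales linearly with a uniform image scale factor, that the comfort zone straddles zero, and it ignores cropping, aspect ratio, and letterboxing. The function names and the sample values are hypothetical.

```python
from statistics import mean, pvariance


def disparity_stats(samples):
    """Summarise sparse per-frame disparity samples (in pixels)."""
    return {
        "min": min(samples),
        "max": max(samples),
        "mean": mean(samples),
        "var": pvariance(samples),
    }


def max_comfortable_scale(d_min, d_max, comfort_min, comfort_max):
    """Largest uniform scale that keeps scaled disparities in the comfort zone.

    Assumes comfort_min < 0 < comfort_max and that scaling the image by s
    scales every pixel disparity by s (simplifying assumption).
    """
    limits = []
    if d_max > 0:
        limits.append(comfort_max / d_max)   # limit from the positive side
    if d_min < 0:
        limits.append(comfort_min / d_min)   # limit from the negative side
    return min(limits) if limits else float("inf")


stats = disparity_stats([-40, 0, 20, 20])
scale = max_comfortable_scale(stats["min"], stats["max"], -10, 10)
```

Taking the minimum over both limits reflects the stated goal of using as much of the comfort zone as possible without leaving it: whichever extreme disparity reaches its bound first determines the scale.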
Leonard, Laurence B.; Fey, Marc E.; Deevy, Patricia; Bredin-Oja, Shelley L.
2015-01-01
We tested four predictions based on the assumption that optional infinitives can be attributed to properties of the input whereby children inappropriately extract nonfinite subject-verb sequences (e.g. the girl run) from larger input utterances (e.g. Does the girl run? Let’s watch the girl run). Thirty children with specific language impairment (SLI) and 30 typically developing children heard novel and familiar verbs that appeared exclusively either in utterances containing nonfinite subject-verb sequences or in simple sentences with the verb inflected for third person singular –s. Subsequent testing showed strong input effects, especially for the SLI group. The results provide support for input-based factors as significant contributors not only to the optional infinitive period in typical development, but also to the especially protracted optional infinitive period seen in SLI. PMID:25076070
Input preshaping with frequency domain information for flexible-link manipulator control
NASA Technical Reports Server (NTRS)
Tzes, Anthony; Englehart, Matthew J.; Yurkovich, Stephen
1989-01-01
The application of an input preshaping scheme to flexible manipulators is considered. The resulting control corresponds to a feedforward term that convolves in real-time the desired reference input with a sequence of impulses and produces a vibration free output. The robustness of the algorithm with respect to injected disturbances and modal frequency variations is not satisfactory and can be improved by convolving the input with a longer sequence of impulses. The incorporation of the preshaping scheme to a closed-loop plant, using acceleration feedback, offers satisfactory disturbance rejection due to feedback and cancellation of the flexible mode effects due to the preshaping. A frequency domain identification scheme is used to estimate the modal frequencies on-line and subsequently update the spacing between the impulses. The combined adaptive input preshaping scheme provides the fastest possible slew that results in a vibration free output.
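The idea of convolving the reference input with a sequence of impulses can be illustrated with the standard two-impulse zero-vibration (ZV) shaper from the input-shaping literature. This is a textbook form used here as an assumption; the abstract does not state which impulse sequence the authors use, and the function names are hypothetical.

```python
import math


def zv_shaper(freq_hz, damping):
    """Return (amplitudes, times) of a two-impulse zero-vibration shaper.

    freq_hz is the undamped modal frequency, damping the damping ratio.
    """
    root = math.sqrt(1.0 - damping ** 2)
    damped_omega = 2.0 * math.pi * freq_hz * root
    k = math.exp(-damping * math.pi / root)
    amplitudes = [1.0 / (1.0 + k), k / (1.0 + k)]   # sum to 1
    times = [0.0, math.pi / damped_omega]           # half a damped period
    return amplitudes, times


def shape(reference, dt, amplitudes, times):
    """Convolve a sampled reference signal with the impulse sequence."""
    shaped = [0.0] * len(reference)
    for amp, t in zip(amplitudes, times):
        delay = round(t / dt)                       # delay in samples
        for i in range(delay, len(reference)):
            shaped[i] += amp * reference[i - delay]
    return shaped


amps, times = zv_shaper(freq_hz=1.0, damping=0.0)
step = shape([1.0] * 10, dt=0.1, amplitudes=amps, times=times)
```

The impulse spacing depends directly on the modal frequency, which is why an on-line frequency estimate, as described in the abstract, can be used to update the spacing. Longer impulse sequences trade a slower response for more robustness to frequency errors.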
Cervinka, Miroslav; Cervinková, Zuzana; Novák, Jan; Spicák, Jan; Rudolf, Emil; Peychl, Jan
2004-06-01
Alternatives and their teaching are an essential part of the curricula at the Faculty of Medicine. Dynamic screen-based video recordings are the most important type of alternative models employed for teaching purposes. Currently, the majority of teaching materials for this purpose are based on PowerPoint presentations, which are very popular because of their high versatility and visual impact. Furthermore, current developments in the field of image capturing devices and software enable the use of digitised video streams, tailored precisely to the specific situation. Here, we demonstrate that with reasonable financial resources, it is possible to prepare video sequences and to introduce them into the PowerPoint presentation, thereby shaping the teaching process according to individual students' needs and specificities.
Weighted-MSE based on saliency map for assessing video quality of H.264 video streams
NASA Astrophysics Data System (ADS)
Boujut, H.; Benois-Pineau, J.; Hadar, O.; Ahmed, T.; Bonnet, P.
2011-01-01
The human vision system is very complex and has been studied for many years, specifically for purposes of efficient encoding of visual content, e.g. video content from digital TV. There is physiological and psychological evidence indicating that viewers do not pay equal attention to all exposed visual information, but only focus on certain areas known as focus of attention (FOA) or saliency regions. In this work, we propose a novel objective quality assessment metric for assessing the perceptual quality of decoded video sequences affected by transmission errors and packet losses. The proposed method weights the Mean Square Error (MSE) according to the calculated saliency map at each pixel, giving a Weighted-MSE (WMSE). Our method was validated through subjective quality experiments.
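A saliency-weighted MSE of this kind can be sketched as follows. The abstract does not give the exact formula, so the normalisation by the sum of the weights is an assumption, and the function name and the tiny 2×2 frames are hypothetical.

```python
def weighted_mse(reference, decoded, saliency):
    """Saliency-weighted mean squared error between two frames.

    All three arguments are equally sized 2-D lists. Each squared pixel
    error is multiplied by its saliency weight, and the total is divided
    by the sum of the weights (assumed normalisation).
    """
    weighted_error = 0.0
    weight_sum = 0.0
    for ref_row, dec_row, sal_row in zip(reference, decoded, saliency):
        for ref, dec, weight in zip(ref_row, dec_row, sal_row):
            weighted_error += weight * (ref - dec) ** 2
            weight_sum += weight
    if weight_sum == 0:
        raise ValueError("saliency map must contain a non-zero weight")
    return weighted_error / weight_sum


ref = [[10, 10], [10, 10]]
dec = [[12, 10], [10, 10]]          # a single corrupted pixel
uniform = weighted_mse(ref, dec, [[1, 1], [1, 1]])
focused = weighted_mse(ref, dec, [[1, 0], [0, 0]])
```

With a uniform map the measure reduces to the plain MSE, while a map concentrated on the corrupted pixel raises the score, which matches the intuition that errors inside attended regions should count more.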
High-performance software-only H.261 video compression on PC
NASA Astrophysics Data System (ADS)
Kasperovich, Leonid
1996-03-01
This paper describes an implementation of a software H.261 codec for PC that takes advantage of the fast computational algorithms for DCT-based video compression, which were presented by the author at the February 1995 SPIE/IS&T meeting. The motivation for developing the H.261 prototype system is to demonstrate the feasibility of a real-time software-only videoconferencing solution that operates across a wide range of network bandwidth, frame rate, and resolution of the input video. As the bandwidth of current network technology increases, higher frame rates and resolutions of transmitted video become possible, which in turn requires a software codec to be able to compress pictures of CIF (352 × 288) resolution at up to 30 frames/sec. Running on a Pentium 133 MHz PC, the codec presented is capable of compressing video in CIF format at 21-23 frames/sec. This result is comparable to the known hardware-based H.261 solutions, but it doesn't require any specific hardware. The methods to achieve high performance and the program optimization technique for the Pentium microprocessor are presented, along with the performance profile showing the actual contribution of the different encoding/decoding stages to the overall computational process.
From image captioning to video summary using deep recurrent networks and unsupervised segmentation
NASA Astrophysics Data System (ADS)
Morosanu, Bogdan-Andrei; Lemnaru, Camelia
2018-04-01
Automatic captioning systems based on recurrent neural networks have been tremendously successful at providing realistic natural language captions for complex and varied image data. We explore methods for adapting existing models trained on large image caption data sets to a similar problem, that of summarising videos using natural language descriptions and frame selection. These architectures create internal high level representations of the input image that can be used to define probability distributions and distance metrics on these distributions. Specifically, we interpret each hidden unit inside a layer of the caption model as representing the un-normalised log probability of some unknown image feature of interest for the caption generation process. We can then apply well understood statistical divergence measures to express the difference between images and create an unsupervised segmentation of video frames, classifying consecutive images of low divergence as belonging to the same context, and those of high divergence as belonging to different contexts. To provide a final summary of the video, we provide a group of selected frames and a text description accompanying them, allowing a user to perform a quick exploration of large unlabeled video databases.
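The segmentation step can be sketched as follows. The sketch treats each frame's hidden activations as un-normalised log probabilities, normalises them with a softmax, and starts a new segment whenever the divergence between consecutive frames exceeds a threshold. The choice of the Kullback-Leibler divergence, the threshold value, and all names are assumptions; the abstract only refers to well understood statistical divergence measures.

```python
import math


def softmax(logits):
    """Turn un-normalised log probabilities into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def segment_frames(activations, threshold):
    """Group consecutive frame indices whose divergence stays below threshold."""
    if not activations:
        return []
    segments = [[0]]
    previous = softmax(activations[0])
    for index in range(1, len(activations)):
        current = softmax(activations[index])
        if kl_divergence(previous, current) > threshold:
            segments.append([index])        # high divergence: new context
        else:
            segments[-1].append(index)      # low divergence: same context
        previous = current
    return segments


# Hypothetical hidden-unit activations for four frames.
frames = [[5.0, 0.0, 0.0], [5.0, 0.1, 0.0], [0.0, 5.0, 0.0], [0.0, 5.0, 0.1]]
segments = segment_frames(frames, threshold=0.5)
```

One representative frame per segment, paired with its generated caption, would then give the kind of summary the abstract describes.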
Constructing spherical panoramas of a bladder phantom from endoscopic video using bundle adjustment
NASA Astrophysics Data System (ADS)
Soper, Timothy D.; Chandler, John E.; Porter, Michael P.; Seibel, Eric J.
2011-03-01
The high recurrence rate of bladder cancer requires patients to undergo frequent surveillance screenings over their lifetime following initial diagnosis and resection. Our laboratory is developing panoramic stitching software that would compile several minutes of cystoscopic video into a single panoramic image, covering the entire bladder, for review by a urologist at a later time or remote location. Global alignment of video frames is achieved by using a bundle adjuster that simultaneously recovers both the 3D structure of the bladder and the scope motion using only the video frames as input. The result of the algorithm is a complete 360° spherical panorama of the outer surface. The details of the software algorithms are presented here along with results from both a virtual cystoscopy and real endoscopic imaging of a bladder phantom. The software successfully stitched several hundred video frames into a single panorama with subpixel accuracy and with no knowledge of the intrinsic camera properties, such as focal length and radial distortion. In the discussion, we outline future work in development of the software and identify factors pertinent to clinical translation of this technology.
NASA Technical Reports Server (NTRS)
Haines, Richard F.; Chuang, Sherry L.
1993-01-01
Current plans indicate that there will be a large number of life science experiments carried out during the thirty-year-long mission of the Biological Flight Research Laboratory (BFRL) on board Space Station Freedom (SSF). Non-human life science experiments will be performed in the BFRL. Two distinct types of activities have already been identified for this facility: (1) collect, store, distribute, analyze and manage engineering and science data from the Habitats, Glovebox and Centrifuge, (2) perform a broad range of remote science activities in the Glovebox and Habitat chambers in conjunction with the remotely located principal investigator (PI). These activities require extensive video coverage, viewing and/or recording and distribution to video displays on board SSF and to the ground. This paper concentrates mainly on the second type of activity. Each of the two BFRL habitat racks is designed to be configurable for either six rodent habitats per rack, four plant habitats per rack, or a combination of the above. Two video cameras will be installed in each habitat with a spare attachment for a third camera when needed. Therefore, a video system that can accommodate up to 12-18 camera inputs per habitat rack must be considered.
Tabletop Games: Platforms, Experimental Games and Design Recommendations
NASA Astrophysics Data System (ADS)
Haller, Michael; Forlines, Clifton; Koeffel, Christina; Leitner, Jakob; Shen, Chia
While the last decade has seen massive improvements in not only the rendering quality, but also the overall performance of console and desktop video games, these improvements have not necessarily led to a greater population of video game players. In addition to continuing these improvements, the video game industry is also constantly searching for new ways to convert non-players into dedicated gamers. Despite the growing popularity of computer-based video games, people still love to play traditional board games, such as Risk, Monopoly, and Trivial Pursuit. Both video and board games have their strengths and weaknesses, and an intriguing conclusion is to merge both worlds. We believe that a tabletop form-factor provides an ideal interface for digital board games. The design and implementation of tabletop games will be influenced by the hardware platforms, form factors, sensing technologies, as well as input techniques and devices that are available and chosen. This chapter is divided into three major sections. In the first section, we describe the most recent tabletop hardware technologies that have been used by tabletop researchers and practitioners. In the second section, we discuss a set of experimental tabletop games. The third section presents ten evaluation heuristics for tabletop game design.
Combining 3D structure of real video and synthetic objects
NASA Astrophysics Data System (ADS)
Kim, Man-Bae; Song, Mun-Sup; Kim, Do-Kyoon
1998-04-01
This paper presents a new approach to combining real video and synthetic objects. The purpose of this work is to use the proposed technology in the fields of advanced animation, virtual reality, games, and so forth. Computer graphics has been used in the fields previously mentioned. Recently, some applications have added real video to graphic scenes to supply the realism that computer graphics lacks. This approach, called augmented or mixed reality, can produce a more realistic environment than the exclusive use of computer graphics. Our approach differs from virtual reality and augmented reality in that computer-generated graphic objects are combined with 3D structure extracted from monocular image sequences. The extraction of the 3D structure requires the estimation of 3D depth followed by the construction of a height map. Graphic objects are then combined with the height map. The realization of our proposed approach is carried out in the following steps: (1) We derive 3D structure from test image sequences. The extraction of the 3D structure requires the estimation of depth and the construction of a height map. Due to the contents of the test sequence, the height map represents the 3D structure. (2) The height map is modeled by Delaunay triangulation or Bezier surface and each planar surface is texture-mapped. (3) Finally, graphic objects are combined with the height map. Because the 3D structure of the height map is already known, Step (3) is easily carried out. Following this procedure, we produced an animation video demonstrating the combination of the 3D structure and graphic models. Users can navigate the realistic 3D world whose associated image is rendered on the display monitor.
Enhanced learning of natural visual sequences in newborn chicks.
Wood, Justin N; Prasad, Aditya; Goldman, Jason G; Wood, Samantha M W
2016-07-01
To what extent are newborn brains designed to operate over natural visual input? To address this question, we used a high-throughput controlled-rearing method to examine whether newborn chicks (Gallus gallus) show enhanced learning of natural visual sequences at the onset of vision. We took the same set of images and grouped them into either natural sequences (i.e., sequences showing different viewpoints of the same real-world object) or unnatural sequences (i.e., sequences showing different images of different real-world objects). When raised in virtual worlds containing natural sequences, newborn chicks developed the ability to recognize familiar images of objects. Conversely, when raised in virtual worlds containing unnatural sequences, newborn chicks' object recognition abilities were severely impaired. In fact, the majority of the chicks raised with the unnatural sequences failed to recognize familiar images of objects despite acquiring over 100 h of visual experience with those images. Thus, newborn chicks show enhanced learning of natural visual sequences at the onset of vision. These results indicate that newborn brains are designed to operate over natural visual input.
PrimerMapper: high throughput primer design and graphical assembly for PCR and SNP detection
O’Halloran, Damien M.
2016-01-01
Primer design represents a widely employed gambit in diverse molecular applications including PCR, sequencing, and probe hybridization. Variations of PCR, including primer walking, allele-specific PCR, and nested PCR provide specialized validation and detection protocols for molecular analyses that often require screening large numbers of DNA fragments. In these cases, automated sequence retrieval and processing become important features, and furthermore, a graphic that provides the user with a visual guide to the distribution of designed primers across targets is most helpful in quickly ascertaining primer coverage. To this end, I describe here, PrimerMapper, which provides a comprehensive graphical user interface that designs robust primers from any number of inputted sequences while providing the user with both, graphical maps of primer distribution for each inputted sequence, and also a global assembled map of all inputted sequences with designed primers. PrimerMapper also enables the visualization of graphical maps within a browser and allows the user to draw new primers directly onto the webpage. Other features of PrimerMapper include allele-specific design features for SNP genotyping, a remote BLAST window to NCBI databases, and remote sequence retrieval from GenBank and dbSNP. PrimerMapper is hosted at GitHub and freely available without restriction. PMID:26853558
Senile myoclonic epilepsy in Down syndrome: a video and EEG presentation of two cases.
De Simone, Roberto; Daquin, Géraldine; Genton, Pierre
2006-09-01
Myoclonic epilepsy is being increasingly recognized as a late-onset complication in middle-aged or elderly patients with Down syndrome, in association with cognitive decline. We show video and EEG recordings of two patients, both aged 56 years, diagnosed with this condition. At onset, myoclonic epilepsy in elderly DS patients may resemble, in its clinical expression, the classical juvenile myoclonic epilepsy with the characteristic occurrence of jerks on awakening. It is clearly associated with an Alzheimer-type dementia, and may also occur in non-DS patients with Alzheimer's disease: hence the possible denomination of "senile myoclonic epilepsy". [Published with video sequences].
Logo recognition in video by line profile classification
NASA Astrophysics Data System (ADS)
den Hollander, Richard J. M.; Hanjalic, Alan
2003-12-01
We present an extension to earlier work on recognizing logos in video stills. The logo instances considered here are rigid planar objects observed at a distance in the scene, so the possible perspective transformation can be approximated by an affine transformation. For this reason we can classify the logos by matching (invariant) line profiles. We enhance our previous method by considering multiple line profiles instead of a single profile of the logo. The positions of the lines are based on maxima in the Hough transform space of the segmented logo foreground image. Experiments are performed on MPEG1 sport video sequences to show the performance of the proposed method.
supernovae: Photometric classification of supernovae
NASA Astrophysics Data System (ADS)
Charnock, Tom; Moss, Adam
2017-05-01
Supernovae classifies supernovae using their light curves directly as inputs to a deep recurrent neural network, which learns information from the sequence of observations. Observational time and filter fluxes are used as inputs; since the inputs are agnostic, additional data such as host galaxy information can also be included.
Digital Watermarking: From Concepts to Real-Time Video Applications
1999-01-01
includes still-image, video, audio, and geometry data among others - the fundamental concept of steganography can be transferred from the field of...size of the message, which should be as small as possible. Some commercially available algorithms for image watermarking forego the secure-watermarking... image compression.’ The image’s luminance component is divided into 8 x 8 pixel blocks. The algorithm selects a sequence of blocks and applies the
Test and Evaluation of Teleconferencing Video Codecs Transmitting at 1.5 Mbps.
1985-08-01
video teleconferencing codecs on the market as of November 1984 to facilitate the choice of an appropriate frame format and data compression algorithm...Engineer, computer company, male 5. Chapter Officer, national civic organization, female Group Y: 6. Marketing Representative, communication systems...both monitors to give the evaluators an idea what kind of pictures they will have to judge. Special suggestions were given regarding the sequences with
Adaptive metric learning with deep neural networks for video-based facial expression recognition
NASA Astrophysics Data System (ADS)
Liu, Xiaofeng; Ge, Yubin; Yang, Chao; Jia, Ping
2018-01-01
Video-based facial expression recognition has become increasingly important for many real-world applications. Although numerous efforts have been made for the single sequence, how to balance the complex distribution of intra- and interclass variations between sequences has remained a great difficulty in this area. We propose the adaptive (N+M)-tuplet clusters loss function and optimize it jointly with the softmax loss in the training phase. The variations introduced by personal attributes are alleviated using the similarity measurements of multiple samples in the feature space with far fewer comparisons than conventional deep metric learning approaches, which enables the metric calculations for large data applications (e.g., videos). Both the spatial and temporal relations are well explored by a unified framework that consists of an Inception-ResNet network with long short term memory and the two fully connected layer branches structure. Our proposed method has been evaluated with three well-known databases, and the experimental results show that our method outperforms many state-of-the-art approaches.
Tracking flow of leukocytes in blood for drug analysis
NASA Astrophysics Data System (ADS)
Basharat, Arslan; Turner, Wesley; Stephens, Gillian; Badillo, Benjamin; Lumpkin, Rick; Andre, Patrick; Perera, Amitha
2011-03-01
Modern microscopy techniques allow imaging of circulating blood components under vascular flow conditions. The resulting video sequences provide unique insights into the behavior of blood cells within the vasculature and can be used as a method to monitor and quantitate the recruitment of inflammatory cells at sites of vascular injury/inflammation and potentially serve as a pharmacodynamic biomarker, helping screen new therapies and individualize dose and combinations of drugs. However, manual analysis of these video sequences is intractable, requiring hours per 400 second video clip. In this paper, we present an automated technique to analyze the behavior and recruitment of human leukocytes in whole blood under physiological conditions of shear through a simple multi-channel fluorescence microscope in real-time. This technique detects and tracks the recruitment of leukocytes to a bioactive surface coated on a flow chamber. Rolling cells (cells which partially bind to the bioactive matrix) are detected, counted, and have their velocity measured and graphed. The challenges here include: high cell density, appearance similarity, and low (1Hz) frame rate. Our approach performs frame differencing based motion segmentation, track initialization and online tracking of individual leukocytes.
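The frame-differencing step named in the abstract above can be illustrated with a minimal sketch. It assumes grayscale frames stored as nested lists of intensities; the function names and the threshold of 20 are hypothetical choices for illustration, not values from the paper, and the authors' actual segmentation and tracking are likely more involved.

```python
def motion_mask(prev_frame, curr_frame, threshold=20):
    """Return a 2D list of booleans marking pixels whose intensity changed."""
    # A pixel counts as "moving" when its absolute intensity change between
    # two consecutive frames exceeds the (assumed) threshold.
    return [
        [abs(curr - prev) > threshold for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def moving_pixels(mask):
    """List (row, col) coordinates of pixels flagged as moving."""
    return [(r, c) for r, row in enumerate(mask) for c, flag in enumerate(row) if flag]


# Toy 3x3 frames: only the centre pixel brightens between the two frames.
previous = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
current = [[10, 10, 10], [10, 90, 10], [10, 10, 10]]

print(moving_pixels(motion_mask(previous, current)))  # prints [(1, 1)]
```

In a tracker, the flagged coordinates would typically be grouped into connected regions, and each region would seed or update a track.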
Learning to count begins in infancy: evidence from 18 month olds' visual preferences.
Slaughter, Virginia; Itakura, Shoji; Kutsuki, Aya; Siegal, Michael
2011-10-07
We used a preferential looking paradigm to evaluate infants' preferences for correct versus incorrect counting. Infants viewed a video depicting six fish. In the correct counting sequence, a hand pointed to each fish in turn, accompanied by verbal counting up to six. In the incorrect counting sequence, the hand moved between two of the six fish while there was still verbal counting to six, thereby violating the one-to-one correspondence principle of correct counting. Experiment 1 showed that Australian 18 month olds, but not 15 month olds, significantly preferred to watch the correct counting sequence. In experiment 2, Australian infants' preference for correct counting disappeared when the count words were replaced by beeps or by Japanese count words. In experiment 3, Japanese 18 month olds significantly preferred the correct counting video only when counting was in Japanese. These results show that infants start to acquire the abstract principles governing correct counting prior to producing any counting behaviour.
Learning to count begins in infancy: evidence from 18 month olds' visual preferences
Slaughter, Virginia; Itakura, Shoji; Kutsuki, Aya; Siegal, Michael
2011-01-01
We used a preferential looking paradigm to evaluate infants' preferences for correct versus incorrect counting. Infants viewed a video depicting six fish. In the correct counting sequence, a hand pointed to each fish in turn, accompanied by verbal counting up to six. In the incorrect counting sequence, the hand moved between two of the six fish while there was still verbal counting to six, thereby violating the one-to-one correspondence principle of correct counting. Experiment 1 showed that Australian 18 month olds, but not 15 month olds, significantly preferred to watch the correct counting sequence. In experiment 2, Australian infants' preference for correct counting disappeared when the count words were replaced by beeps or by Japanese count words. In experiment 3, Japanese 18 month olds significantly preferred the correct counting video only when counting was in Japanese. These results show that infants start to acquire the abstract principles governing correct counting prior to producing any counting behaviour. PMID:21325331
Dissecting children's observational learning of complex actions through selective video displays.
Flynn, Emma; Whiten, Andrew
2013-10-01
Children can learn how to use complex objects by watching others, yet the relative importance of different elements they may observe, such as the interactions of the individual parts of the apparatus, a model's movements, and desirable outcomes, remains unclear. In total, 140 3-year-olds and 140 5-year-olds participated in a study where they observed a video showing tools being used to extract a reward item from a complex puzzle box. Conditions varied according to the elements that could be seen in the video: (a) the whole display, including the model's hands, the tools, and the box; (b) the tools and the box but not the model's hands; (c) the model's hands and the tools but not the box; (d) only the end state with the box opened; and (e) no demonstration. Children's later attempts at the task were coded to establish whether they imitated the hierarchically organized sequence of the model's actions, the action details, and/or the outcome. Children's successful retrieval of the reward from the box and the replication of hierarchical sequence information were reduced in all but the whole display condition. Only once children had attempted the task and witnessed a second demonstration did the display focused on the tools and box prove to be better for hierarchical sequence information than the display focused on the tools and hands only. Copyright © 2013 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Bonanno, A.; Bozzo, G.; Sapia, P.
2017-11-01
In this work, we present a coherent sequence of experiments on electromagnetic (EM) induction and eddy currents, appropriate for university undergraduate students, based on a magnet falling through a drilled aluminum disk. The sequence, leveraging on the didactical interplay between the EM and mechanical aspects of the experiments, allows us to exploit the students’ awareness of mechanics to elicit their comprehension of EM phenomena. The proposed experiments feature two kinds of measurements: (i) kinematic measurements (performed by means of high-speed video analysis) give information on the system’s kinematics and, via appropriate numerical data processing, allow us to get dynamic information, in particular on energy dissipation; (ii) induced electromagnetic field (EMF) measurements (by using a homemade multi-coil sensor connected to a cheap data acquisition system) allow us to quantitatively determine the inductive effects of the moving magnet on its neighborhood. The comparison between experimental results and the predictions from an appropriate theoretical model (of the dissipative coupling between the moving magnet and the conducting disk) offers many educational hints on relevant topics related to EM induction, such as Maxwell’s displacement current, magnetic field flux variation, and the conceptual link between induced EMF and induced currents. Moreover, the didactical activity gives students the opportunity to be trained in video analysis, data acquisition and numerical data processing.
Space Shuttle Main Engine Propellant Path Leak Detection Using Sequential Image Processing
NASA Technical Reports Server (NTRS)
Smith, L. Montgomery; Malone, Jo Anne; Crawford, Roger A.
1995-01-01
Initial research in this study using theoretical radiation transport models established that the occurrence of a leak is accompanied by a sudden but sustained change in intensity in a given region of an image. In this phase, temporal processing of video images on a frame-by-frame basis was used to detect leaks within a given field of view. The leak detection algorithm developed in this study consists of a digital highpass filter cascaded with a moving average filter. The absolute value of the resulting discrete sequence is then taken and compared to a threshold value to produce the binary leak/no leak decision at each point in the image. Alternatively, averaging over the full frame of the output image produces a single time-varying mean value estimate that is indicative of the intensity and extent of a leak. Laboratory experiments were conducted in which artificially created leaks on a simulated SSME background were produced and recorded from a visible wavelength video camera. This data was processed frame-by-frame over the time interval of interest using an image processor implementation of the leak detection algorithm. In addition, a 20 second video sequence of an actual SSME failure was analyzed using this technique. The resulting output image sequences and plots of the full frame mean value versus time verify the effectiveness of the system.
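The filter cascade described in the abstract above can be sketched for a single pixel's intensity sequence. The abstract does not give the filter coefficients, so this sketch assumes a first-difference highpass filter and a causal moving average; the function name, window length and threshold are likewise hypothetical.

```python
def leak_decisions(intensity, avg_len=5, threshold=5.0):
    """Binary leak/no-leak decision per frame for one pixel's intensity sequence."""
    if not intensity:
        return []
    # Highpass stage (assumed first difference): responds to sudden changes.
    high = [0.0] + [b - a for a, b in zip(intensity, intensity[1:])]
    decisions = []
    for t in range(len(high)):
        # Moving-average stage over the most recent avg_len samples.
        window = high[max(0, t - avg_len + 1): t + 1]
        smooth = sum(window) / avg_len
        # Absolute value compared with a threshold gives the binary decision.
        decisions.append(abs(smooth) > threshold)
    return decisions


# A step change of 50 intensity units at frame 10 stands in for a leak onset.
pixel = [0.0] * 10 + [50.0] * 10
flags = leak_decisions(pixel)
print([t for t, flag in enumerate(flags) if flag])  # prints [10, 11, 12, 13, 14]
```

With these assumed filters the detection persists for `avg_len` frames after the onset. Applying the same function to every pixel and averaging the filtered magnitudes over the frame would give the full-frame mean value the abstract mentions.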
Khanduja, Sumeet; Sampangi, Raju; Hemlatha, B C; Singh, Satvir; Lall, Ashish
2018-01-01
Purpose: The purpose of this study is to describe the use of a commercial digital single-lens reflex (DSLR) camera for vitreoretinal surgery recording and compare it to a standard 3-chip charge-coupled device (CCD) camera. Methods: Simultaneous recording was done using a Sony A7s2 camera and a Sony high-definition 3-chip camera attached to each side of the microscope. The videos recorded from both camera systems were edited and sequences of similar time frames were selected. The three sequences selected for evaluation were (a) anterior segment surgery, (b) surgery under direct viewing system, and (c) surgery under indirect wide-angle viewing system. The videos of each sequence were evaluated and rated on a scale of 0-10 for color, contrast, and overall quality. Results: Most results were rated either 8/10 or 9/10 for both cameras. A noninferiority analysis comparing mean scores of the DSLR camera versus the CCD camera was performed and P values were obtained. The mean scores of the two cameras were comparable on all parameters assessed in the different videos except for color and contrast in posterior pole view and color on wide-angle view, which were rated significantly higher (better) for the DSLR camera. Conclusion: Commercial DSLRs are an affordable low-cost alternative for vitreoretinal surgery recording and may be used for documentation and teaching. PMID:29283133
Circuit for measuring time differences among events
Romrell, Delwin M.
1977-01-01
An electronic circuit has a plurality of input terminals. Application of a first input signal to any one of the terminals initiates a timing sequence. Later inputs to the same terminal are ignored but a later input to any other terminal of the plurality generates a signal which can be used to measure the time difference between the later input and the first input signal. Also, such time differences may be measured between the first input signal and an input signal to any other terminal of the plurality or the circuit may be reset at any time by an external reset signal.
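The behavior described in the abstract above can be modeled in software as a small state machine. This is only an illustrative sketch of the stated logic, not the patented hardware; the class name, method names and the use of explicit timestamps are assumptions.

```python
class TimeDifferenceRecorder:
    """Software model of the described timing logic (illustrative only)."""

    def __init__(self, n_terminals):
        self.n_terminals = n_terminals
        self.reset()

    def reset(self):
        """External reset: forget the first input and stop the timing sequence."""
        self.first_terminal = None
        self.start_time = None

    def pulse(self, terminal, time):
        """Register an input; return the time difference when one is produced."""
        if not 0 <= terminal < self.n_terminals:
            raise ValueError("unknown terminal")
        if self.first_terminal is None:
            # The first input on any terminal starts the timing sequence.
            self.first_terminal = terminal
            self.start_time = time
            return None
        if terminal == self.first_terminal:
            # Later inputs to the same terminal are ignored.
            return None
        # A later input on any other terminal yields a time difference.
        return time - self.start_time


recorder = TimeDifferenceRecorder(n_terminals=3)
recorder.pulse(0, 1.0)         # starts timing
recorder.pulse(0, 2.0)         # ignored: same terminal
print(recorder.pulse(2, 3.5))  # prints 2.5
```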
Are YouTube videos accurate and reliable on basic life support and cardiopulmonary resuscitation?
Yaylaci, Serpil; Serinken, Mustafa; Eken, Cenker; Karcioglu, Ozgur; Yilmaz, Atakan; Elicabuk, Hayri; Dal, Onur
2014-10-01
The objective of this study is to investigate reliability and accuracy of the information on YouTube videos related to CPR and BLS in accord with 2010 CPR guidelines. YouTube was queried using four search terms 'CPR', 'cardiopulmonary resuscitation', 'BLS' and 'basic life support' between 2011 and 2013. Sources that uploaded the videos, the record time, the number of viewers in the study period, and inclusion of humans or manikins were recorded. The videos were rated on whether or not they displayed the correct order of resuscitative efforts in full accord with 2010 CPR guidelines. Two hundred and nine videos meeting the inclusion criteria after the search in YouTube with four search terms ('CPR', 'cardiopulmonary resuscitation', 'BLS' and 'basic life support') comprised the study sample subjected to the analysis. Median score of the videos is 5 (IQR: 3.5-6). Only 11.5% (n = 24) of the videos were found to be compatible with 2010 CPR guidelines with regard to sequence of interventions. Videos uploaded by 'Guideline bodies' had significantly higher rates of download when compared with the videos uploaded by other sources. Sources of the videos and date of upload (year) were not shown to have any significant effect on the scores received (P = 0.615 and 0.513, respectively). The videos' number of downloads did not differ according to the videos compatible with the guidelines (P = 0.832). The videos downloaded more than 10,000 times had a higher score than the others (P = 0.001). The majority of YouTube video clips purporting to be about CPR are not relevant educational material. Of those that are focused on teaching CPR, only a small minority optimally meet the 2010 Resuscitation Guidelines. © 2014 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.
Enhanced sequencing coverage with digital droplet multiple displacement amplification
Sidore, Angus M.; Lan, Freeman; Lim, Shaun W.; Abate, Adam R.
2016-01-01
Sequencing small quantities of DNA is important for applications ranging from the assembly of uncultivable microbial genomes to the identification of cancer-associated mutations. To obtain sufficient quantities of DNA for sequencing, the small amount of starting material must be amplified significantly. However, existing methods often yield errors or non-uniform coverage, reducing sequencing data quality. Here, we describe digital droplet multiple displacement amplification, a method that enables massive amplification of low-input material while maintaining sequence accuracy and uniformity. The low-input material is compartmentalized as single molecules in millions of picoliter droplets. Because the molecules are isolated in compartments, they amplify to saturation without competing for resources; this yields uniform representation of all sequences in the final product and, in turn, enhances the quality of the sequence data. We demonstrate the ability to uniformly amplify the genomes of single Escherichia coli cells, comprising just 4.7 fg of starting DNA, and obtain sequencing coverage distributions that rival that of unamplified material. Digital droplet multiple displacement amplification provides a simple and effective method for amplifying minute amounts of DNA for accurate and uniform sequencing. PMID:26704978
NASA Astrophysics Data System (ADS)
Davies, Bob; Lienhart, Rainer W.; Yeo, Boon-Lock
1999-08-01
The metaphor of film and TV permeates the design of software to support video on the PC. Simply transplanting the non-interactive, sequential experience of film to the PC fails to exploit the virtues of the new context. Video on the PC should be interactive and non-sequential. This paper experiments with a variety of tools for using video on the PC that exploit the new context of the PC. Some features are more successful than others. Applications that use these tools are explored, including primarily the home video archive but also streaming video servers on the Internet. The ability to browse, edit, abstract and index large volumes of video content such as home video and corporate video is a problem without an appropriate solution in today's market. The currently available tools are complex, unfriendly video editors, requiring hours of work to prepare a short home video, far more work than a typical home user can be expected to provide. Our proposed solution treats video like a text document, providing functionality similar to a text editor. Users can browse, interact, edit and compose one or more video sequences with the same ease and convenience as handling text documents. With this level of text-like composition, we call what is normally a sequential medium a 'video document'. An important component of the proposed solution is shot detection, the ability to detect when a shot started or stopped. When combined with a spreadsheet of key frames, the shots become a grid of pictures that can be manipulated and viewed in the same way that a spreadsheet can be edited. Multiple video documents may be viewed, joined, manipulated, and seamlessly played back. Abstracts of unedited video content can be produced automatically to create novel video content for export to other venues. Edited and raw video content can be published to the net or burned to a CD-ROM with a self-installing viewer for Windows 98 and Windows NT 4.0.
Design of a highly integrated video acquisition module for smart video flight unit development
NASA Astrophysics Data System (ADS)
Lebre, V.; Gasti, W.
2017-11-01
CCD and APS devices are widely used in space missions as instrument sensors and/or in avionics units like star detectors/trackers. Therefore, numerous and varied designs of video acquisition chains have been produced. Basically, a classical video acquisition chain consists of two main functional blocks: the Proximity Electronics (PEC), including detector drivers, and the Analogue Processing Chain (APC) Electronics that embeds the ADC, a master sequencer and the host interface. Nowadays, low-power technologies make it possible to improve the integration, radiometric performance and power budget optimisation of video units and to standardize video unit design and development. To this end, ESA has initiated a development activity through a competitive process requesting the expertise of experienced actors in the field of high resolution electronics for earth observation and scientific missions. THALES ALENIA SPACE has been granted this activity as prime contractor through an ESA contract called HIVAC, which stands for Highly Integrated Video Acquisition Chain. This paper presents the main objectives of the ongoing HIVAC project and focuses on the functionalities and performance offered by the HIVAC board, currently under development, for future optical instruments.
Kalwitzki, M; Beyer, C; Meller, C
2010-11-01
Whilst preparing undergraduate students for a clinical course in paediatric dentistry, four consecutive classes (n = 107) were divided into two groups. Seven behaviour-modifying techniques were introduced: systematic desensitization, operant conditioning, modelling, Tell, Show, Do-principle, substitution, change of roles and the active involvement of the patient. The behaviour-modifying techniques that had been taught to group one (n = 57) through lecturing were taught to group two (n = 50) through video sequences and vice versa in the following semester. Immediately after the presentations, students were asked by means of a questionnaire about their perceptions of ease of using the different techniques and their intention for clinical application of each technique. After completion of the clinical course, they were asked about which behaviour-modifying techniques they had actually used when dealing with patients. Concerning the perception of ease of using the different techniques, there were considerable differences for six of the seven techniques (P < 0.05). Whilst some techniques seemed more difficult to apply clinically after lecturing, others seemed more difficult after video-based teaching. Concerning the intention for clinical application and the actual clinical application, there were higher percentages for all techniques taught after video-based teaching. However, the differences were significant only for two techniques in each case (P < 0.05). It is concluded that the use of video based teaching enhances the intention for application and the actual clinical application only for a limited number of behaviour-modifying techniques. © 2010 John Wiley & Sons A/S.
Timeline Resource Analysis Program (TRAP): User's manual and program document
NASA Technical Reports Server (NTRS)
Sessler, J. G.
1981-01-01
The Timeline Resource Analysis Program (TRAP), developed for scheduling and timelining problems, is described. Given an activity network, TRAP generates timeline plots, resource histograms, and tabular summaries of the network, schedules, and resource levels. It is written in ANSI FORTRAN for the Honeywell SIGMA 5 computer and operates in the interactive mode using the TEKTRONIX 4014-1 graphics terminal. The input network file may be a standard SIGMA 5 file or one generated using the Interactive Graphics Design System. The timeline plots can be displayed in two orderings: according to the sequence in which the tasks were read on input, and a waterfall sequence in which the tasks are ordered by start time. The input order is especially meaningful when the network consists of several interacting subnetworks. The waterfall sequence is helpful in assessing the project status at any point in time.
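The two timeline orderings described in the abstract above can be illustrated with a short sketch. It is written in Python rather than the program's FORTRAN, and the task tuples, function names and text rendering are hypothetical; it only shows the difference between input order and waterfall (start-time) order.

```python
# Each task is (name, start_time, duration); the values are made up.
tasks = [
    ("integrate", 5, 3),
    ("design", 0, 4),
    ("test", 8, 2),
    ("build", 3, 5),
]


def input_order(task_list):
    """Keep tasks in the sequence in which they were read on input."""
    return list(task_list)


def waterfall_order(task_list):
    """Order tasks by start time (stable for tasks that start together)."""
    return sorted(task_list, key=lambda task: task[1])


def timeline_rows(task_list):
    """Render each task as a text bar offset by its start time."""
    return [f"{name:<10}" + " " * start + "#" * duration
            for name, start, duration in task_list]


for row in timeline_rows(waterfall_order(tasks)):
    print(row)
```

Printing `timeline_rows(input_order(tasks))` instead shows the same bars in the order the tasks were supplied.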
Subjective Quality Assessment of Underwater Video for Scientific Applications
Moreno-Roldán, José-Miguel; Luque-Nieto, Miguel-Ángel; Poncela, Javier; Díaz-del-Río, Víctor; Otero, Pablo
2015-01-01
Underwater video services could be a key application in the better scientific knowledge of the vast oceanic resources in our planet. However, limitations in the capacity of currently available technology for underwater networks (UWSNs) raise the question of the feasibility of these services. When transmitting video, the main constraints are the limited bandwidth and the high propagation delays. At the same time the service performance depends on the needs of the target group. This paper considers the problems of estimations for the Mean Opinion Score (a standard quality measure) in UWSNs based on objective methods and addresses the topic of quality assessment in potential underwater video services from a subjective point of view. The experimental design and the results of a test planned according to standardized psychometric methods are presented. The subjects used in the quality assessment test were ocean scientists. Video sequences were recorded in actual exploration expeditions and were processed to simulate conditions similar to those that might be found in UWSNs. Our experimental results show how videos are considered to be useful for scientific purposes even in very low bitrate conditions. PMID:26694400
A semi-automatic annotation tool for cooking video
NASA Astrophysics Data System (ADS)
Bianco, Simone; Ciocca, Gianluigi; Napoletano, Paolo; Schettini, Raimondo; Margherita, Roberto; Marini, Gianluca; Gianforme, Giorgio; Pantaleo, Giuseppe
2013-03-01
In order to create a cooking assistant application to guide the users in the preparation of the dishes relevant to their profile diets and food preferences, it is necessary to accurately annotate the video recipes, identifying and tracking the foods of the cook. These videos present particular annotation challenges such as frequent occlusions, food appearance changes, etc. Manually annotating the videos is a time-consuming, tedious and error-prone task. Fully automatic tools that integrate computer vision algorithms to extract and identify the elements of interest are not error free, and false positive and false negative detections need to be corrected in a post-processing stage. We present an interactive, semi-automatic tool for the annotation of cooking videos that integrates computer vision techniques under the supervision of the user. The annotation accuracy is increased with respect to completely automatic tools and the human effort is reduced with respect to completely manual ones. The performance and usability of the proposed tool are evaluated on the basis of the time and effort required to annotate the same video sequences.
Strategies for combining physics videos and virtual laboratories in the training of physics teachers
NASA Astrophysics Data System (ADS)
Dickman, Adriana; Vertchenko, Lev; Martins, Maria Inés
2007-03-01
Among the multimedia resources used in physics education, the most prominent are virtual laboratories and videos. On one hand, computer simulations and applets have very attractive graphic interfaces, showing an incredible amount of detail and movement. On the other hand, videos offer the possibility of displaying high quality images, and are becoming more feasible with the increasing availability of digital resources. We believe it is important to discuss, throughout the teacher training program, both the functionality of information and communication technology (ICT) in physics education and the varied applications of these resources. In our work we suggest the introduction of ICT resources in a sequence integrating these important tools in the teacher training program, as opposed to the traditional approach, in which virtual laboratories and videos are introduced separately. In this perspective, when we introduce and utilize virtual laboratory techniques we also provide for their use in videos, taking advantage of graphic interfaces. Thus the students in our program learn to use instructional software in the production of videos for classroom use.
Subjective Quality Assessment of Underwater Video for Scientific Applications.
Moreno-Roldán, José-Miguel; Luque-Nieto, Miguel-Ángel; Poncela, Javier; Díaz-del-Río, Víctor; Otero, Pablo
2015-12-15
Underwater video services could be a key application in the better scientific knowledge of the vast oceanic resources in our planet. However, limitations in the capacity of currently available technology for underwater networks (UWSNs) raise the question of the feasibility of these services. When transmitting video, the main constraints are the limited bandwidth and the high propagation delays. At the same time the service performance depends on the needs of the target group. This paper considers the problems of estimations for the Mean Opinion Score (a standard quality measure) in UWSNs based on objective methods and addresses the topic of quality assessment in potential underwater video services from a subjective point of view. The experimental design and the results of a test planned according to standardized psychometric methods are presented. The subjects used in the quality assessment test were ocean scientists. Video sequences were recorded in actual exploration expeditions and were processed to simulate conditions similar to those that might be found in UWSNs. Our experimental results show how videos are considered to be useful for scientific purposes even in very low bitrate conditions.
Detection of goal events in soccer videos
NASA Astrophysics Data System (ADS)
Kim, Hyoung-Gook; Roeber, Steffen; Samour, Amjad; Sikora, Thomas
2005-01-01
In this paper, we present an automatic extraction of goal events in soccer videos by using audio track features alone without relying on expensive-to-compute video track features. The extracted goal events can be used for high-level indexing and selective browsing of soccer videos. The detection of soccer video highlights using audio contents comprises three steps: 1) extraction of audio features from a video sequence, 2) event candidate detection of highlight events based on the information provided by the feature extraction methods and the Hidden Markov Model (HMM), 3) goal event selection to finally determine the video intervals to be included in the summary. For this purpose we compared the performance of the well-known Mel-scale Frequency Cepstral Coefficients (MFCC) feature extraction method vs. the MPEG-7 Audio Spectrum Projection (ASP) feature extraction method based on three different decomposition methods, namely Principal Component Analysis (PCA), Independent Component Analysis (ICA) and Non-Negative Matrix Factorization (NMF). To evaluate our system we collected five soccer game videos from various sources. In total we have seven hours of soccer games consisting of eight gigabytes of data. One of five soccer games is used as the training data (e.g., announcers' excited speech, audience ambient speech noise, audience clapping, environmental sounds). Our goal event detection results are encouraging.
An Attention-Information-Based Spatial Adaptation Framework for Browsing Videos via Mobile Devices
NASA Astrophysics Data System (ADS)
Li, Houqiang; Wang, Yi; Chen, Chang Wen
2007-12-01
With the growing popularity of personal digital assistant devices and smart phones, more and more consumers are becoming quite enthusiastic to appreciate videos via mobile devices. However, limited display size of the mobile devices has been imposing significant barriers for users to enjoy browsing high-resolution videos. In this paper, we present an attention-information-based spatial adaptation framework to address this problem. The whole framework includes two major parts: video content generation and video adaptation system. During video compression, the attention information in video sequences will be detected using an attention model and embedded into bitstreams with proposed supplement-enhanced information (SEI) structure. Furthermore, we also develop an innovative scheme to adaptively adjust quantization parameters in order to simultaneously improve the quality of overall encoding and the quality of transcoding the attention areas. When the high-resolution bitstream is transmitted to mobile users, a fast transcoding algorithm we developed earlier will be applied to generate a new bitstream for attention areas in frames. The new low-resolution bitstream containing mostly attention information, instead of the high-resolution one, will be sent to users for display on the mobile devices. Experimental results show that the proposed spatial adaptation scheme is able to improve both subjective and objective video qualities.
Bayesian Modeling of Temporal Coherence in Videos for Entity Discovery and Summarization.
Mitra, Adway; Biswas, Soma; Bhattacharyya, Chiranjib
2017-03-01
A video is understood by users in terms of entities present in it. Entity Discovery is the task of building an appearance model for each entity (e.g., a person), and finding all its occurrences in the video. We represent a video as a sequence of tracklets, each spanning 10-20 frames, and associated with one entity. We pose Entity Discovery as tracklet clustering, and approach it by leveraging Temporal Coherence (TC): the property that temporally neighboring tracklets are likely to be associated with the same entity. Our major contributions are the first Bayesian nonparametric models for TC at tracklet-level. We extend Chinese Restaurant Process (CRP) to TC-CRP, and further to Temporally Coherent Chinese Restaurant Franchise (TC-CRF) to jointly model entities and temporal segments using mixture components and sparse distributions. For discovering persons in TV serial videos without meta-data like scripts, these methods show considerable improvement over state-of-the-art approaches to tracklet clustering in terms of clustering accuracy, cluster purity and entity coverage. The proposed methods can perform online tracklet clustering on streaming videos unlike existing approaches, and can automatically reject false tracklets. Finally we discuss entity-driven video summarization, where temporal segments of the video are selected based on the discovered entities, to create a semantically meaningful summary.
Video transmission on ATM networks. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Chen, Yun-Chung
1993-01-01
The broadband integrated services digital network (B-ISDN) is expected to provide high-speed and flexible multimedia applications. Multimedia includes data, graphics, image, voice, and video. Asynchronous transfer mode (ATM) is the adopted transport technique for B-ISDN and has the potential for providing a more efficient and integrated environment for multimedia. It is believed that most broadband applications will make heavy use of visual information. The prospect of widespread use of image and video communication has led to interest in coding algorithms for reducing bandwidth requirements and improving image quality. The major results of a study on the bridging of network transmission performance and video coding are as follows. Using two representative video sequences, several video source models are developed. The fitness of these models is validated through the use of statistical tests and network queuing performance. A dual leaky bucket algorithm is proposed as an effective network policing function. The concept of the dual leaky bucket algorithm can be applied to a prioritized coding approach to achieve transmission efficiency. A mapping of the performance/control parameters at the network level into equivalent parameters at the video coding level is developed. Based on that, a complete set of principles for the design of video codecs for network transmission is proposed.
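The abstract does not give the details of the proposed policing function, so the following Python sketch shows only the generic idea of a dual leaky bucket: one bucket polices the peak cell rate, a second polices the sustained rate with some burst tolerance, and a cell conforms only if both buckets can accept it. All class names, parameter names, and values are assumptions chosen for illustration.

```python
class LeakyBucket:
    """Single leaky bucket: drains at `rate` cells per tick, holds `depth` cells."""

    def __init__(self, rate, depth):
        self.rate = rate
        self.depth = depth
        self.level = 0.0
        self.last = 0.0

    def drained_level(self, now):
        # Level after draining since the last accepted cell, never below empty.
        return max(0.0, self.level - (now - self.last) * self.rate)

    def would_accept(self, now):
        return self.drained_level(now) + 1.0 <= self.depth

    def add(self, now):
        self.level = self.drained_level(now) + 1.0
        self.last = now


class DualLeakyBucket:
    """A cell conforms only if both the peak and the sustained bucket accept it."""

    def __init__(self, peak_rate, peak_depth, mean_rate, burst_depth):
        self.peak = LeakyBucket(peak_rate, peak_depth)
        self.mean = LeakyBucket(mean_rate, burst_depth)

    def conforms(self, now):
        if self.peak.would_accept(now) and self.mean.would_accept(now):
            self.peak.add(now)
            self.mean.add(now)
            return True
        return False  # non-conforming cell: neither bucket is charged


# Illustrative values: at most 1 cell per tick, 0.25 cells per tick sustained, burst of 2.
policer = DualLeakyBucket(peak_rate=1.0, peak_depth=1.0, mean_rate=0.25, burst_depth=2.0)
decisions = [policer.conforms(t) for t in (0, 1, 2, 3, 4)]
print(decisions)
```

In a prioritized coding setting, a non-conforming cell could be tagged as low priority rather than dropped; that policy choice is outside this sketch.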
A polygon soup representation for free viewpoint video
NASA Astrophysics Data System (ADS)
Colleu, T.; Pateux, S.; Morin, L.; Labit, C.
2010-02-01
This paper presents a polygon soup representation for multiview data. Starting from a sequence of multi-view video plus depth (MVD) data, the proposed representation takes into account, in a unified manner, different issues such as compactness, compression, and intermediate view synthesis. The representation is built in two steps. First, a set of 3D quads is extracted using a quadtree decomposition of the depth maps. Second, a selective elimination of the quads is performed in order to reduce inter-view redundancies and thus provide a compact representation. Moreover, the proposed methodology for extracting the representation helps reduce ghosting artifacts. Finally, an adapted compression technique is proposed that limits coding artifacts. The results presented on two real sequences show that the proposed representation provides a good trade-off between rendering quality and data compactness.
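The first step, quadtree decomposition of a depth map, can be sketched in Python as below. The splitting criterion used here (depth range within a tolerance `tol`) and the square, power-of-two map size are assumptions for illustration; the abstract does not state the authors' actual criterion, and the conversion of leaves into 3D quads is not shown.

```python
def quadtree(depth, x, y, size, tol, leaves):
    """Recursively split a square block until its depth range is within `tol`.

    `depth` is a list of rows; (x, y) is the block's top-left corner.
    Each leaf is appended to `leaves` as an (x, y, size) tuple.
    """
    values = [depth[j][i] for j in range(y, y + size) for i in range(x, x + size)]
    if size == 1 or max(values) - min(values) <= tol:
        leaves.append((x, y, size))
        return
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            quadtree(depth, x + dx, y + dy, half, tol, leaves)


# Toy 4x4 depth map with a single depth discontinuity in the top-left corner.
depth_map = [
    [10, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
leaves = []
quadtree(depth_map, 0, 0, 4, tol=1, leaves=leaves)
print(leaves)
```

Smooth regions end up as a few large blocks while the block containing the discontinuity is split down to single pixels, which is what makes the representation compact.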
Robot Sequencing and Visualization Program (RSVP)
NASA Technical Reports Server (NTRS)
Cooper, Brian K.; Maxwell, Scott A.; Hartman, Frank R.; Wright, John R.; Yen, Jeng; Toole, Nicholas T.; Gorjian, Zareh; Morrison, Jack C
2013-01-01
The Robot Sequencing and Visualization Program (RSVP) is being used in the Mars Science Laboratory (MSL) mission for downlink data visualization and command sequence generation. RSVP reads and writes downlink data products from the operations data server (ODS) and writes uplink data products to the ODS. The primary users of RSVP are members of the Rover Planner team (part of the Integrated Planning and Execution Team (IPE)), who use it to perform traversability/articulation analyses, take activity plan input from the Science and Mission Planning teams, and create a set of rover sequences to be sent to the rover every sol. The primary inputs to RSVP are downlink data products and activity plans in the ODS database. The primary outputs are command sequences to be placed in the ODS for further processing prior to uplink to each rover. RSVP is composed of two main subsystems. The first, called the Robot Sequence Editor (RoSE), understands the MSL activity and command dictionaries and takes care of converting incoming activity level inputs into command sequences. The Rover Planners use the RoSE component of RSVP to put together command sequences and to view and manage command level resources like time, power, temperature, etc. (via a transparent realtime connection to SEQGEN). The second component of RSVP is called HyperDrive, a set of high-fidelity computer graphics displays of the Martian surface in 3D and in stereo. The Rover Planners can explore the environment around the rover, create commands related to motion of all kinds, and see the simulated result of those commands via its underlying tight coupling with flight navigation, motor, and arm software. This software is the evolutionary replacement for the Rover Sequencing and Visualization software used to create command sequences (and visualize the Martian surface) for the Mars Exploration Rover mission.
Accidental Turbulent Discharge Rate Estimation from Videos
NASA Astrophysics Data System (ADS)
Ibarra, Eric; Shaffer, Franklin; Savaş, Ömer
2015-11-01
A technique to estimate the volumetric discharge rate in accidental oil releases using high speed video streams is described. The essence of the method is similar to PIV processing; however, the cross correlation is carried out on the visible features of the efflux, which are usually turbulent, opaque and immiscible. The key step in the process is to perform a pixelwise time filtering on the video stream, in which the parameters are commensurate with the scales of the large eddies. The velocity field extracted from the shell of visible features is then used to construct an approximate velocity profile within the discharge. The technique has been tested on laboratory experiments using both water and oil jets at Re ~ 10^5. The technique is accurate to 20%, which is sufficient for initial responders to deploy adequate resources for containment. The software package requires minimal user input and is intended for deployment on an ROV in the field. Supported by DOI via NETL.
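The cross-correlation step can be illustrated with a one-dimensional Python sketch: the displacement of a visible feature between two frames is taken as the shift that maximizes the correlation of two intensity profiles, and a velocity follows from the pixel size and frame interval. This is a simplified stand-in, not the authors' software; the profiles, `pixel_size`, and `frame_dt` are made-up values, and the pixelwise time filtering is omitted.

```python
def best_shift(profile_a, profile_b, max_shift):
    """Return the integer shift of profile_b relative to profile_a
    that maximizes their (unnormalized) cross-correlation."""

    def score(shift):
        return sum(
            profile_a[i] * profile_b[i + shift]
            for i in range(len(profile_a))
            if 0 <= i + shift < len(profile_b)
        )

    return max(range(-max_shift, max_shift + 1), key=score)


# Intensity along the jet axis in two consecutive frames (illustrative values).
frame_1 = [0, 0, 1, 3, 1, 0, 0, 0]
frame_2 = [0, 0, 0, 0, 1, 3, 1, 0]

pixel_size = 0.001   # metres per pixel (assumed)
frame_dt = 0.001     # seconds between frames (assumed)

shift = best_shift(frame_1, frame_2, max_shift=3)
velocity = shift * pixel_size / frame_dt
print(shift, velocity)
```

A volumetric rate would additionally require integrating an assumed velocity profile over the jet cross-section, which this sketch does not attempt.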
ERIC Educational Resources Information Center
Teri, Linda; Truax, Paula
1994-01-01
Primary caregivers (n=41) of memory-impaired patients rated a standardized stimulus of depression and their actual patient. They were able to correctly identify depression in both. Further, their mood was unassociated with video ratings and only moderately associated with patient ratings. The findings support reliance on caregiver input.…
ERIC Educational Resources Information Center
Stagg, Steven D.; Slavny, Rachel; Hand, Charlotte; Cardoso, Alice; Smith, Pamela
2014-01-01
Research investigating expressivity in children with autism spectrum disorder has reported flat affect or bizarre facial expressivity within this population; however, the impact expressivity may have on first impression formation has received little research input. We examined how videos of children with autism spectrum disorder were rated for…
High Performance Implementation of 3D Convolutional Neural Networks on a GPU.
Lan, Qiang; Wang, Zelong; Wen, Mei; Zhang, Chunyuan; Wang, Yijie
2017-01-01
Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural networks to video classification, which constitutes a 3D input and requires far larger amounts of memory and much more computation. FFT based methods can reduce the amount of computation, but this generally comes at the cost of an increased memory requirement. On the other hand, the Winograd Minimal Filtering Algorithm (WMFA) can reduce the number of operations required and thus can speed up the computation, without increasing the required memory. This strategy was shown to be successful for 2D neural networks. We implement the algorithm for 3D convolutional neural networks and apply it to a popular 3D convolutional neural network which is used to classify videos and compare it to cuDNN. For our highly optimized implementation of the algorithm, we observe a twofold speedup for most of the 3D convolution layers of our test network compared to the cuDNN version.
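To make the operation-count argument concrete, here is a minimal Python sketch of the one-dimensional Winograd minimal filtering case F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six. It is only the 1D building block; the nested 3D form and any GPU details of the paper's implementation are not shown, and the function names are illustrative.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter over four inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs using the Winograd F(2,3) identities: 4 multiplications.

    The filter-side sums depend only on g, so in practice they are
    precomputed once per filter and reused for every input tile.
    """
    g_sum = (g[0] + g[1] + g[2]) / 2.0
    g_alt = (g[0] - g[1] + g[2]) / 2.0
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * g_sum
    m3 = (d[2] - d[1]) * g_alt
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


inputs = [1.0, 2.0, 3.0, 4.0]
taps = [2.0, 4.0, 6.0]
print(direct_f23(inputs, taps), winograd_f23(inputs, taps))
```

Nesting the same transform along each axis is the usual way the multiplication savings compound in higher dimensions.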
High Performance Implementation of 3D Convolutional Neural Networks on a GPU
Wang, Zelong; Wen, Mei; Zhang, Chunyuan; Wang, Yijie
2017-01-01
Convolutional neural networks have proven to be highly successful in applications such as image classification, object tracking, and many other tasks based on 2D inputs. Recently, researchers have started to apply convolutional neural networks to video classification, which constitutes a 3D input and requires far larger amounts of memory and much more computation. FFT based methods can reduce the amount of computation, but this generally comes at the cost of an increased memory requirement. On the other hand, the Winograd Minimal Filtering Algorithm (WMFA) can reduce the number of operations required and thus can speed up the computation, without increasing the required memory. This strategy was shown to be successful for 2D neural networks. We implement the algorithm for 3D convolutional neural networks and apply it to a popular 3D convolutional neural network which is used to classify videos and compare it to cuDNN. For our highly optimized implementation of the algorithm, we observe a twofold speedup for most of the 3D convolution layers of our test network compared to the cuDNN version. PMID:29250109
Automatic Mrf-Based Registration of High Resolution Satellite Video Data
NASA Astrophysics Data System (ADS)
Platias, C.; Vakalopoulou, M.; Karantzalos, K.
2016-06-01
In this paper we propose a deformable registration framework for high resolution satellite video data able to automatically and accurately co-register satellite video frames and/or register them to a reference map/image. The proposed approach performs non-rigid registration, formulates a Markov Random Fields (MRF) model, while efficient linear programming is employed for reaching the lowest potential of the cost function. The developed approach has been applied and validated on satellite video sequences from Skybox Imaging and compared with a rigid, descriptor-based registration method. Regarding the computational performance, both the MRF-based and the descriptor-based methods were quite efficient, with the first one converging in some minutes and the second in some seconds. Regarding the registration accuracy the proposed MRF-based method significantly outperformed the descriptor-based one in all the performing experiments.
NASA Astrophysics Data System (ADS)
Sembiring, L.; Van Ormondt, M.; Van Dongeren, A. R.; Roelvink, J. A.
2017-07-01
Rip currents are one of the most dangerous coastal hazards for swimmers. In order to minimize the risk, an operational, process-based coastal model system can be utilized to provide forecasts of nearshore waves and currents that may endanger beachgoers. In this paper, an operational model for rip current prediction that utilizes nearshore bathymetry obtained from a video image technique is demonstrated. For the nearshore scale model, XBeach is used, with which tidal currents and wave-induced currents (including the effect of the wave groups) can be simulated simultaneously. Up-to-date bathymetry will be obtained using a video image technique, cBathy. The system will be tested for the Egmond aan Zee beach, located in the northern part of the Dutch coastline. This paper will test the applicability of bathymetry obtained from the video technique as input for the numerical modelling system by comparing simulation results using surveyed bathymetry and model results using video bathymetry. Results show that the video technique is able to produce bathymetry converging towards the ground truth observations. This bathymetry validation will be followed by an example of an operational forecasting type of simulation for predicting rip currents. Rip current flow fields simulated over measured and modeled bathymetries are compared in order to assess the performance of the proposed forecast system.
Video segmentation using keywords
NASA Astrophysics Data System (ADS)
Ton-That, Vinh; Vong, Chi-Tai; Nguyen-Dao, Xuan-Truong; Tran, Minh-Triet
2018-04-01
At the DAVIS-2016 Challenge, many state-of-the-art video segmentation methods achieved promising results, but they still depend heavily on annotated frames to distinguish between background and foreground. Creating these frames accurately takes a lot of time and effort. In this paper, we introduce a method to segment objects from video based on keywords given by the user. First, we use a real-time object detection system, YOLOv2, to identify regions in the first frame containing objects whose labels match the given keywords. Then, for each region identified in the previous step, we use Pyramid Scene Parsing Network to classify each pixel as foreground or background. These frames can be used as input frames for the Object Flow algorithm to perform segmentation on the entire video. We conduct experiments on a subset of the DAVIS-2016 dataset at half its original size, which show that our method can handle many popular classes in the PASCAL VOC 2012 dataset with acceptable accuracy, about 75.03%. We suggest wider testing and combination with other methods to improve this result in the future.
Multi-star processing and gyro filtering for the video inertial pointing system
NASA Technical Reports Server (NTRS)
Murphy, J. P.
1976-01-01
The video inertial pointing (VIP) system is being developed to satisfy the acquisition and pointing requirements of astronomical telescopes. The VIP system uses a single video sensor to provide star position information that can be used to generate three-axis pointing error signals (multi-star processing) and for input to a cathode ray tube (CRT) display of the star field. The pointing error signals are used to update the telescope's gyro stabilization system (gyro filtering). The CRT display facilitates target acquisition and positioning of the telescope by a remote operator. Linearized small angle equations are used for the multi-star processing, and a consideration of error performance and singularities leads to star pair location restrictions and equation selection criteria. A discrete steady-state Kalman filter which uses the integration of the gyros is developed and analyzed. The filter includes unit time delays representing asynchronous operations of the VIP microprocessor and video sensor. A digital simulation of a typical gyro stabilized gimbal is developed and used to validate the approach to the gyro filtering.
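The idea of a discrete steady-state Kalman filter driven by gyro integration can be sketched for a single axis in Python. This is a deliberately reduced scalar model, not the report's filter: the state is one pointing angle, the unit time delays are ignored, and the noise variances `q` and `r` are arbitrary assumptions.

```python
def steady_state_gain(q, r, iterations=200):
    """Iterate the scalar Riccati recursion (state transition 1, measurement 1)
    until the Kalman gain settles; q is process noise, r is measurement noise."""
    p = q
    gain = 0.0
    for _ in range(iterations):
        p_pred = p + q                    # covariance after propagation
        gain = p_pred / (p_pred + r)      # Kalman gain
        p = (1.0 - gain) * p_pred         # covariance after update
    return gain


def track(angle0, gyro_rates, star_angles, dt, gain):
    """Propagate the angle with integrated gyro rate, then correct it
    with the star-derived angle using a fixed (steady-state) gain."""
    estimate = angle0
    history = []
    for rate, star in zip(gyro_rates, star_angles):
        estimate += rate * dt                   # gyro integration
        estimate += gain * (star - estimate)    # star sensor update
        history.append(estimate)
    return history


gain = steady_state_gain(q=1.0, r=2.0)
history = track(0.0, [0.0] * 20, [1.0] * 20, dt=0.1, gain=gain)
print(gain, history[-1])
```

Using a precomputed constant gain avoids running the covariance recursion online, which is the usual motivation for a steady-state filter on a small processor.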
Adherent Raindrop Modeling, Detection and Removal in Video.
You, Shaodi; Tan, Robby T; Kawakami, Rei; Mukaigawa, Yasuhiro; Ikeuchi, Katsushi
2016-09-01
Raindrops adhered to a windscreen or window glass can significantly degrade the visibility of a scene. Modeling, detecting and removing raindrops will, therefore, benefit many computer vision applications, particularly outdoor surveillance systems and intelligent vehicle systems. In this paper, a method that automatically detects and removes adherent raindrops is introduced. The core idea is to exploit the local spatio-temporal derivatives of raindrops. To realize this idea, we first model adherent raindrops using the laws of physics, and detect raindrops based on these models in combination with motion and intensity temporal derivatives of the input video. Having detected the raindrops, we remove them and restore the images based on the observation that some areas of a raindrop completely occlude the scene, while other areas occlude it only partially. For partially occluding areas, we restore them by retrieving as much information about the scene as possible, namely, by solving a blending function on the detected partially occluding areas using the temporal intensity derivative. For completely occluding areas, we recover them by using a video completion technique. Experimental results using various real videos show the effectiveness of our method.
TRW Video News: Chandra X-ray Observatory
NASA Technical Reports Server (NTRS)
1999-01-01
This NASA Kennedy Space Center sponsored video release presents live footage of the Chandra X-ray Observatory prior to STS-93 as well as several short animations recreating some of its activities in space. These animations include a Space Shuttle fly-by with Chandra, two perspectives of Chandra's deployment from the Shuttle, the Chandra deployment orbit sequence, the Inertial Upper Stage (IUS) first stage burn, and finally a "beauty shot", which represents another animated view of Chandra in space.
Audiovisual focus of attention and its application to Ultra High Definition video compression
NASA Astrophysics Data System (ADS)
Rerabek, Martin; Nemoto, Hiromi; Lee, Jong-Seok; Ebrahimi, Touradj
2014-02-01
Using Focus of Attention (FoA) as a perceptual process in image and video compression is a well-known approach to increasing coding efficiency. It has been shown that foveated coding, in which compression quality varies across the image according to the region of interest, is more efficient than the alternative coding, in which all regions are compressed in a similar way. However, widespread use of such foveated compression has been prevented by two main conflicting causes, namely, the complexity and the efficiency of algorithms for FoA detection. One way around these is to use as much information as possible from the scene. Since most video sequences have an associated audio track, and moreover, in many cases there is a correlation between the audio and the visual content, audiovisual FoA can improve the efficiency of the detection algorithm while remaining of low complexity. This paper discusses a simple yet efficient audiovisual FoA algorithm based on correlation of dynamics between audio and video signal components. Results of the audiovisual FoA detection algorithm are subsequently taken into account for foveated coding and compression. This approach is implemented in an H.265/HEVC encoder producing a bitstream which is fully compliant with any H.265/HEVC decoder. The influence of audiovisual FoA on the perceived quality of high and ultra-high definition audiovisual sequences is explored and the amount of gain in compression efficiency is analyzed.
Subjective evaluation of HEVC in mobile devices
NASA Astrophysics Data System (ADS)
Garcia, Ray; Kalva, Hari
2013-03-01
Mobile compute environments provide a unique set of user needs and expectations that designers must consider. With increased multimedia use in mobile environments, video encoding methods within the smart phone market segment are key factors that contribute to positive user experience. Currently available display resolutions and expected cellular bandwidth are major factors the designer must consider when determining which encoding methods should be supported. The desired goal is to maximize the consumer experience, reduce cost, and reduce time to market. This paper presents a comparative evaluation of the quality of user experience when HEVC and AVC/H.264 video coding standards were used. The goal of the study was to evaluate any improvements in user experience when using HEVC. Subjective comparisons were made between H.264/AVC and HEVC encoding standards in accordance with the double-stimulus impairment scale (DSIS) as defined by ITU-R BT.500-13. Test environments are based on smart phone LCD resolutions and expected cellular bit rates, such as 200 kbps and 400 kbps. Subjective feedback shows both encoding methods are adequate at 400 kbps constant bit rate. However, a noticeable consumer experience gap was observed for 200 kbps. Significantly lower H.264 subjective quality was observed for video sequences that have multiple objects moving and no single point of visual attraction. Video sequences with single points of visual attraction or few moving objects tended to have higher H.264 subjective quality.
The Interplay of Representations and Patterns of Classroom Discourse in Science Teaching Sequences
ERIC Educational Resources Information Center
Tang, Kok-Sing
2016-01-01
The purpose of this study is to examine the relationship between the communicative approach of classroom talk and the modes of representations used by science teachers. Based on video data from two physics classrooms in Singapore, a recurring pattern in the relationship was observed as the teaching sequence of a lesson unfolded. It was found that…
ERIC Educational Resources Information Center
de Milliano, Ilona; van Gelderen, Amos; Sleegers, Peter
2016-01-01
This study examines the relationship between types and sequences of self-regulated reading activities in task-oriented reading with quality of task achievement of 51 low-achieving adolescents (Grade 8). The study used think aloud combined with video observations to analyse the students' approach of a content-area reading task in the stages of…
ERIC Educational Resources Information Center
Shahrill, Masitah; Clarke, David J.
2014-01-01
A teacher's practice cannot be characterised by a single lesson, hence comparison is best made with lesson sequences that better sample the diversity of a teacher's practice. In this study, we video recorded lesson sequences in four Year 8 mathematics classrooms, as well as interviewed each of the four teachers in Brunei Darussalam. Because of our…
NASA Astrophysics Data System (ADS)
Ciaramello, Francis M.; Hemami, Sheila S.
2007-02-01
For members of the Deaf Community in the United States, current communication tools include TTY/TTD services, video relay services, and text-based communication. With the growth of cellular technology, mobile sign language conversations are becoming a possibility. Proper coding techniques must be employed to compress American Sign Language (ASL) video for low-rate transmission while maintaining the quality of the conversation. In order to evaluate these techniques, an appropriate quality metric is needed. This paper demonstrates that traditional video quality metrics, such as PSNR, fail to predict subjective intelligibility scores. By considering the unique structure of ASL video, an appropriate objective metric is developed. Face and hand segmentation is performed using skin-color detection techniques. The distortions in the face and hand regions are optimally weighted and pooled across all frames to create an objective intelligibility score for a distorted sequence. The objective intelligibility metric performs significantly better than PSNR in terms of correlation with subjective responses.
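The contrast between PSNR and a region-weighted distortion measure can be shown with a small Python sketch. It is not the authors' metric: the weights, the masks, and the tiny four-pixel "frames" are hypothetical, and the paper's optimal weighting and pooling across frames are not reproduced.

```python
import math


def mse(ref, dist, mask=None):
    """Mean squared error over all pixels, or only where mask is truthy."""
    errors = [
        (r - d) ** 2
        for i, (r, d) in enumerate(zip(ref, dist))
        if mask is None or mask[i]
    ]
    return sum(errors) / len(errors)


def psnr(ref, dist, peak=255.0):
    """Peak signal-to-noise ratio in dB over the whole frame."""
    err = mse(ref, dist)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)


def region_weighted_distortion(ref, dist, face_mask, hand_mask, w_face=0.5, w_hand=0.5):
    """Weighted sum of face-region and hand-region MSE (weights are assumed)."""
    return w_face * mse(ref, dist, face_mask) + w_hand * mse(ref, dist, hand_mask)


# Flattened toy frames: pixel 0 is "face", pixel 1 is "hand", the rest background.
reference = [100, 100, 100, 100]
face_error = [110, 100, 100, 100]        # distortion falls on the face
background_error = [100, 100, 100, 110]  # same distortion falls on background
face_mask = [1, 0, 0, 0]
hand_mask = [0, 1, 0, 0]

print(psnr(reference, face_error), psnr(reference, background_error))
print(region_weighted_distortion(reference, face_error, face_mask, hand_mask),
      region_weighted_distortion(reference, background_error, face_mask, hand_mask))
```

Both distorted frames receive the same PSNR, while the region-weighted measure penalizes only the one whose error lands on the face, which is the kind of distinction an intelligibility metric for sign language video needs.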
SCTP as scalable video coding transport
NASA Astrophysics Data System (ADS)
Ortiz, Jordi; Graciá, Eduardo Martínez; Skarmeta, Antonio F.
2013-12-01
This study presents an evaluation of the Stream Control Transmission Protocol (SCTP) for the transport of the scalable video codec (SVC), proposed by MPEG as an extension to H.264/AVC. The two technologies fit together well. On the one hand, SVC makes it easy to split the bitstream into substreams carrying different video layers, each with different importance for the reconstruction of the complete video sequence at the receiver end. On the other hand, SCTP includes features, such as multi-streaming and multi-homing, that allow the SVC layers to be transported robustly and efficiently. Several transmission strategies based on baseline SCTP and its concurrent multipath transfer (CMT) extension are compared with the classical solutions based on the Transmission Control Protocol (TCP) and the Real-time Transport Protocol (RTP). Using ns-2 simulations, it is shown that CMT-SCTP outperforms TCP and RTP in error-prone networking environments. The comparison is established according to several performance measurements, including delay, throughput, packet loss, and peak signal-to-noise ratio of the received video.
Introduction: Intradural Spinal Surgery video supplement.
McCormick, Paul C
2014-09-01
This Neurosurgical Focus video supplement contains detailed narrated videos of a broad range of intradural pathology such as neoplasms, including intramedullary, extramedullary, and dumbbell tumors, vascular malformations, functional disorders, and rare conditions that are often overlooked or misdiagnosed such as arachnoid cysts, ventral spinal cord herniation, and dorsal arachnoid web. The intent of this supplement is to provide meaningful educational and instructional content at all levels of training and practice. As such, the selected video submissions each provide a comprehensive detailed narrative description and coordinated video that contains the entire spectrum of relevant information including imaging, operative setup and positioning, and exposure, as well as surgical strategies, techniques, and sequencing toward the safe and effective achievement of the operative objective. This level of detail often necessitated a longer video duration than is typical of oral presentations or standard video clips from peer reviewed publications. Unfortunately, space limitations precluded the inclusion of several other excellent video submissions in this supplement. While most videos in this supplement reflect standard operative approaches and techniques, there are also submissions that describe innovative exposures and techniques that have expanded surgical options such as ventral approaches, stereotactic guidance, and minimally invasive exposures. There is some redundancy in both the topics and techniques, both to underscore fundamental surgical principles and to provide complementary perspectives from different surgeons. It has been my privilege to serve as guest editor for this video supplement and I would like to extend my appreciation to Mark Bilsky, Bill Krauss, and Sander Connolly for reviewing the large number of submitted videos. Most of all, I would like to thank the authors for their skill and effort in the preparation of the outstanding videos that constitute this video supplement.
A new method for digital video documentation in surgical procedures and minimally invasive surgery.
Wurnig, P N; Hollaus, P H; Wurnig, C H; Wolf, R K; Ohtsuka, T; Pridun, N S
2003-02-01
Documentation of surgical procedures is limited to the accuracy of description, which depends on the vocabulary and the descriptive prowess of the surgeon. Even analog video recording could not solve the problem of documentation satisfactorily due to the abundance of recorded material. By capturing the video digitally, most problems are solved in the circumstances described in this article. We developed a cheap and useful digital video capturing system that consists of conventional computer components. Video images and clips can be captured intraoperatively and are immediately available. The system is a commercial personal computer specially configured for digital video capturing and is connected by wire to the video tower. Filming was done with a conventional endoscopic video camera. A total of 65 open and endoscopic procedures were documented in an orthopedic and a thoracic surgery unit. The median number of clips per surgical procedure was 6 (range, 1-17), and the median storage volume was 49 MB (range, 3-360 MB) in compressed form. The median duration of a video clip was 4 min 25 s (range, 45 s to 21 min). Median time for editing a video clip was 12 min for an advanced user (including cutting, title for the movie, and compression). The quality of the clips renders them suitable for presentations. This digital video documentation system allows easy capturing of intraoperative video sequences in high quality. All possibilities of documentation can be performed. With the use of an endoscopic video camera, no compromises with respect to sterility and surgical elbowroom are necessary. The cost is much lower than commercially available systems, and setting changes can be performed easily without trained specialists.
NASA Technical Reports Server (NTRS)
Appleberry, W. T.
1980-01-01
Rotary sequencer is assembled from conventional planetary differential gearset and latching mechanism utilizing inputs and outputs which are coaxial. Applications include automated production-line equipment in home appliances and in vehicles.
A method for automatically abstracting visual documents
NASA Technical Reports Server (NTRS)
Rorvig, Mark E.
1994-01-01
Visual documents--motion sequences on film, videotape, and digital recording--constitute a major source of information for the Space Agency, as well as all other government and private sector entities. This article describes a method for automatically selecting key frames from visual documents. These frames may in turn be used to represent the total image sequence of visual documents in visual libraries, hypermedia systems, and training. The algorithm reduces 51 minutes of video sequences to 134 frames, a reduction of information in the range of 700:1.
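The quoted reduction ratio can be checked with a short calculation. The abstract does not state the frame rate, so the 30 frames per second used here is an assumption, and the function name is hypothetical.

```python
def reduction_ratio(minutes, key_frames, fps=30):
    """Ratio of source frames to selected key frames.
    fps=30 is an assumed frame rate; the abstract does not give one."""
    total_frames = minutes * 60 * fps
    return total_frames / key_frames

ratio = reduction_ratio(51, 134)
print(f"about {ratio:.0f}:1")
```

Under that assumption, 51 minutes correspond to 91,800 frames, which gives a ratio of roughly 685:1, consistent with the "range of 700:1" stated in the abstract.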
Adaptive precompensators for flexible-link manipulator control
NASA Technical Reports Server (NTRS)
Tzes, Anthony P.; Yurkovich, Stephen
1989-01-01
The application of input precompensators to flexible manipulators is considered. Frequency domain compensators color the input around the flexible mode locations, resulting in a bandstop or notch filter in cascade with the system. Time domain compensators apply a sequence of impulses at prespecified times related to the modal frequencies. The resulting control corresponds to a feedforward term that convolves in real-time the desired reference input with a sequence of impulses and produces a vibration-free output. An adaptive precompensator can be implemented by combining a frequency domain identification scheme which is used to estimate online the modal frequencies and subsequently update the bandstop interval or the spacing between the impulses. The combined adaptive input preshaping scheme provides the most rapid slew that results in a vibration-free output. Experimental results are presented to verify the results.
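As an illustration of a time-domain precompensator of the kind described above, the following sketch builds a two-impulse zero-vibration (ZV) input shaper and convolves it with a reference signal. The ZV shaper is a standard textbook form chosen here for illustration; it is not necessarily the impulse sequence used by the authors, and the function names and sample values are hypothetical.

```python
import math

def zv_shaper(freq_hz, damping):
    """Two-impulse ZV shaper for one flexible mode.
    Returns [(time_s, amplitude), ...]; amplitudes sum to 1."""
    root = math.sqrt(1 - damping ** 2)
    wd = 2 * math.pi * freq_hz * root            # damped modal frequency (rad/s)
    k = math.exp(-damping * math.pi / root)
    return [(0.0, 1 / (1 + k)), (math.pi / wd, k / (1 + k))]

def shape(reference, dt, impulses):
    """Convolve a sampled reference input with the impulse sequence."""
    shaped = [0.0] * len(reference)
    for t, amp in impulses:
        shift = round(t / dt)                    # impulse delay in samples
        for i in range(shift, len(reference)):
            shaped[i] += amp * reference[i - shift]
    return shaped

# Hypothetical mode: 2 Hz, undamped, sampled at 20 Hz.
impulses = zv_shaper(freq_hz=2.0, damping=0.0)
step = [1.0] * 20
shaped_step = shape(step, dt=0.05, impulses=impulses)
```

In an adaptive scheme such as the one outlined in the abstract, an online estimate of the modal frequency would be passed to a function like `zv_shaper` again whenever it changes, which updates the spacing between the impulses.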
RBT-GA: a novel metaheuristic for solving the Multiple Sequence Alignment problem.
Taheri, Javid; Zomaya, Albert Y
2009-07-07
Multiple Sequence Alignment (MSA) has always been an active area of research in Bioinformatics. MSA is mainly focused on discovering biologically meaningful relationships among different sequences or proteins in order to investigate the underlying main characteristics/functions. This information is also used to generate phylogenetic trees. This paper presents a novel approach, namely RBT-GA, to solve the MSA problem using a hybrid solution methodology combining the Rubber Band Technique (RBT) and the Genetic Algorithm (GA) metaheuristic. RBT is inspired by the behavior of an elastic Rubber Band (RB) on a plate with several poles, which is analogous to locations in the input sequences that could potentially be biologically related. A GA attempts to mimic the evolutionary processes of life in order to locate optimal solutions in an often very complex landscape. RBT-GA is a population-based optimization algorithm designed to find the optimal alignment for a set of input protein sequences. In this novel technique, each alignment answer is modeled as a chromosome consisting of several poles in the RBT framework. These poles resemble locations in the input sequences that are most likely to be correlated and/or biologically related. A GA-based optimization process improves these chromosomes gradually, yielding a set of mostly optimal answers for the MSA problem. RBT-GA is tested with one of the well-known benchmark suites (BALiBASE 2.0) in this area. The obtained results show the superiority of the proposed technique even in the case of formidable sequences.
Exploring Techniques for Vision Based Human Activity Recognition: Methods, Systems, and Evaluation
Xu, Xin; Tang, Jinshan; Zhang, Xiaolong; Liu, Xiaoming; Zhang, Hong; Qiu, Yimin
2013-01-01
With the wide applications of vision based intelligent systems, image and video analysis technologies have attracted the attention of researchers in the computer vision field. In image and video analysis, human activity recognition is an important research direction. By interpreting and understanding human activities, we can recognize and predict the occurrence of crimes and help the police or other agencies react immediately. In the past, a large number of papers have been published on human activity recognition in video and image sequences. In this paper, we provide a comprehensive survey of the recent development of the techniques, including methods, systems, and quantitative evaluation of the performance of human activity recognition. PMID:23353144
NASA Technical Reports Server (NTRS)
Palsson, Olafur S. (Inventor); Harris, Randall L., Sr. (Inventor); Pope, Alan T. (Inventor)
2002-01-01
Apparatus and methods for modulating the control authority (i.e., control function) of a computer simulation or game input device (e.g., joystick, button control) using physiological information so as to affect the user's ability to impact or control the simulation or game with the input device. One aspect is to use the present invention, along with a computer simulation or game, to affect physiological state or physiological self-regulation according to some programmed criterion (e.g., increase, decrease, or maintain) in order to perform better at the game task. When the affected physiological state or physiological self-regulation is the target of self-regulation or biofeedback training, the simulation or game play reinforces therapeutic changes in the physiological signal(s).
NASA Astrophysics Data System (ADS)
Crone, T. J.; Mittelstaedt, E. L.; Fornari, D. J.
2014-12-01
Fluid flow rates through high-temperature mid-ocean ridge hydrothermal vents are likely quite sensitive to poroelastic forcing mechanisms such as tidal loading and tectonic activity. Because poroelastic deformation and flow perturbations are estimated to extend to considerable depths within young oceanic crust, observations of flow rate changes at seafloor vents have the potential to provide constraints on the flow geometry and permeability structure of the underlying hydrothermal systems, as well as the quantities of heat and chemicals they exchange with overlying ocean, and the potential biological productivity of ecosystems they host. To help provide flow rate measurements in these challenging environments, we have developed two new optical flow oriented technologies. The first is a new form of Optical Plume Velocimetry (OPV) which relies on single-frame temporal cross-correlation to obtain time-averaged image velocity fields from short video sequences. The second is the VentCam, a deep sea camera system that can collect high-frame-rate video sequences at focused hydrothermal vents suitable for analysis with OPV. During the July 2014 R/V Atlantis/Alvin expedition to Axial Seamount, we deployed the VentCam at the ~300C Phoenix vent within the ASHES vent field and positioned it with DSRV Alvin. We collected 24 seconds of video at 50 frames per second every half-hour for approximately 10 days beginning July 22nd. We are currently applying single-frame lag OPV to these videos to estimate relative and absolute fluid flow rates through this vent. To explore the relationship between focused and diffuse venting, we deployed a second optical flow camera, the Diffuse Effluent Measurement System (DEMS), adjacent to this vent at a fracture within the lava carapace where low-T (~30C) fluids were exiting. This system collected video sequences and diffuse flow measurements at overlapping time intervals. 
Here we present the preliminary results of our work with VentCam and OPV, and comparisons with results from the DEMS camera.
Real-time transmission of digital video using variable-length coding
NASA Technical Reports Server (NTRS)
Bizon, Thomas P.; Shalkhauser, Mary JO; Whyte, Wayne A., Jr.
1993-01-01
Huffman coding is a variable-length lossless compression technique where data with a high probability of occurrence is represented with short codewords, while 'not-so-likely' data is assigned longer codewords. Compression is achieved when the high-probability levels occur so frequently that their benefit outweighs any penalty paid when a less likely input occurs. One instance where Huffman coding is extremely effective occurs when data is highly predictable and differential coding can be applied (as with a digital video signal). For that reason, it is desirable to apply this compression technique to digital video transmission; however, special care must be taken in order to implement a communication protocol utilizing Huffman coding. This paper addresses several of the issues relating to the real-time transmission of Huffman-coded digital video over a constant-rate serial channel. Topics discussed include data rate conversion (from variable to a fixed rate), efficient data buffering, channel coding, recovery from communication errors, decoder synchronization, and decoder architectures. A description of the hardware developed to execute Huffman coding and serial transmission is also included. Although this paper focuses on matters relating to Huffman-coded digital video, the techniques discussed can easily be generalized for a variety of applications which require transmission of variable-length data.
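The combination of differential coding and Huffman coding described above can be sketched in a few lines. This is a minimal software illustration under assumed inputs (a short hypothetical row of pixel values); it does not model the hardware, buffering, or channel coding discussed in the paper.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code {symbol: bit string} from a sequence of symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:                       # degenerate single-symbol input
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table)
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)      # two least likely subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def differences(samples):
    """Differential coding: first sample, then successive differences."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def decode(bits, code):
    """Decode a bit string with a prefix code."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

pixels = [100, 100, 101, 101, 101, 102, 102, 102, 102, 103, 103, 110]  # hypothetical scan line
diffs = differences(pixels)
code = huffman_code(diffs)
bits = "".join(code[d] for d in diffs)
```

For this toy scan line the most frequent difference (0) receives a one-bit codeword, and the whole line encodes to 19 bits instead of 96 bits at a fixed 8 bits per sample. The variable number of bits per sample is what makes the rate conversion and buffering issues discussed in the paper necessary on a constant-rate channel.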
Bähr, Florian; Ritter, Alexander; Seidel, Gundula; Puta, Christian; Gabriel, Holger H. W.; Hamzei, Farsin
2018-01-01
Action observation (AO) allows access to a network that processes visuomotor and sensorimotor inputs and is believed to be involved in observational learning of motor skills. We conducted three consecutive experiments to examine the boosting effect of AO on the motor outcome of the untrained hand by either mirror visual feedback (MVF), video therapy (VT), or a combination of both. In the first experiment, healthy participants trained either with MVF or without mirror feedback while in the second experiment, participants either trained with VT or observed animal videos. In the third experiment, participants first observed video clips that were followed by either training with MVF or training without mirror feedback. The outcomes for the untrained hand were quantified by scores from five motor tasks. The results demonstrated that MVF and VT significantly increase the motor performance of the untrained hand by the use of AO. We found that MVF was the most effective approach to increase the performance of the target effector. On the contrary, the combination of MVF and VT turns out to be less effective looking from clinical perspective. The gathered results suggest that action-related motor competence with the untrained hand is acquired by both mirror-based and video-based AO. PMID:29849570
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nguyen, V; James, J; Wang, B
Purpose: To describe an in-house video goggle feedback system for motion management during simulation and treatment of radiation therapy patients. Methods: This video goggle system works by splitting and amplifying the video output signal directly from the Varian Real-Time Position Management (RPM) workstation or TrueBeam imaging workstation into two signals using a Distribution Amplifier. The first signal S[1] gets reconnected back to the monitor. The second signal S[2] gets connected to the input of a Video Scaler. The S[2] signal can be scaled, cropped and panned in real time to display only the relevant information to the patient. The output signal from the Video Scaler gets connected to an HDMI Extender Transmitter via a DVI-D to HDMI converter cable. The S[2] signal can be transported from the HDMI Extender Transmitter to the HDMI Extender Receiver located inside the treatment room via a Cat5e/6 cable. Inside the treatment room, the HDMI Extender Receiver is permanently mounted on the wall near the conduit where the Cat5e/6 cable is located. An HDMI cable is used to connect from the output of the HDMI Receiver to the video goggles. Results: This video goggle feedback system is currently being used at two institutions. At one institution, the system was just recently implemented for simulation and treatments on two breath-hold gated patients with 8+ total fractions over a two month period. At the other institution, the system was used to treat 100+ breath-hold gated patients on three Varian TrueBeam linacs and has been operational for twelve months. The average time to prepare the video goggle system for treatment is less than 1 minute. Conclusion: The video goggle system provides an efficient and reliable method to set up a video feedback signal for radiotherapy patients with motion management.
DOE Office of Scientific and Technical Information (OSTI.GOV)
O’Connor, J. Michael; Pretorius, P. Hendrik; Johnson, Karen
2013-12-15
Purpose: This technical note documents a method that the authors developed for combining a signal to synchronize a patient-monitoring device with a second physiological signal for inclusion into list-mode acquisition. Our specific application requires synchronizing an external patient motion-tracking system with a medical imaging system by multiplexing the tracking input with the ECG input. The authors believe that their methodology can be adapted for use in a variety of medical imaging modalities including single photon emission computed tomography (SPECT) and positron emission tomography (PET). Methods: The authors insert a unique pulse sequence into a single physiological input channel. This sequence is then recorded in the list-mode acquisition along with the R-wave pulse used for ECG gating. The specific form of our pulse sequence allows for recognition of the time point being synchronized even when portions of the pulse sequence are lost due to collisions with R-wave pulses. This was achieved by altering our software used in binning the list-mode data to recognize even a portion of our pulse sequence. Limitations on heart rates at which our pulse sequence could be reliably detected were investigated by simulating the mixing of the two signals as a function of heart rate and time point during the cardiac cycle at which our pulse sequence is mixed with the cardiac signal. Results: The authors have successfully achieved accurate temporal synchronization of our motion-tracking system with acquisition of SPECT projections used in 17 recent clinical research cases. In our simulation analysis the authors determined that synchronization to enable compensation for body and respiratory motion could be achieved for heart rates up to 125 beats-per-minute (bpm). Conclusions: Synchronization of list-mode acquisition with external patient monitoring devices such as those employed in motion-tracking can reliably be achieved using a simple method that can be implemented using minimal external hardware and software modification through a single input channel, while still recording cardiac gating signals.
Sajatovic, Martha; Herrmann, Lynn K; Van Doren, Jamie R; Tatsuoka, Curtis; Welter, Elisabeth; Perzynski, Adam T; Bukach, Ashley; Needham, Kelley; Liu, Hongyan; Berg, Anne T
2017-11-01
Epilepsy is a common neurological condition that is often associated with stigmatizing attitudes and negative stereotypes among the general public. This randomized controlled trial (RCT) tested two new communication approaches targeting epilepsy stigma versus an education-alone approach. Two brief stigma-reduction videos were developed, informed by community stakeholder input; one highlighted role competency in people with epilepsy; the other highlighted social inclusion of people with epilepsy. A control video was also developed. A Web-based survey using a prospective RCT design compared effects of experimental videos and control on acceptability, perceived impact, epilepsy knowledge, and epilepsy stigma. Epilepsy knowledge and stigma were measured with the Epilepsy Knowledge Questionnaire (EKQ) and Attitudes and Beliefs about Living with Epilepsy (ABLE), respectively. A total of 295 participants completed the study. Mean age was 23.1 (standard deviation = 3.27) years; 59.0% were male, and 71.4% were white. Overall, respondents felt videos impacted their epilepsy attitudes. EKQ scores were similar across videos, with a trend for higher knowledge in experimental videos versus control (p = 0.06). The role competency and control videos were associated with slightly better perceived impact on attitudes. There were no differences between videos on ABLE scores (p = 0.568). There were subgroup differences suggesting that men, younger individuals, whites, and those with personal epilepsy experience had more stigmatizing attitudes. This RCT tested communication strategies to improve knowledge and attitudes about epilepsy. Although this initial effort will require follow-up, we have demonstrated the acceptability, feasibility, and potential of novel communication strategies to target epilepsy stigma, and a Web-based approach for assessing them. Wiley Periodicals, Inc. © 2017 International League Against Epilepsy.
Self Occlusion and Disocclusion in Causal Video Object Segmentation
2015-12-18
computation is parameter-free in contrast to [4, 32, 10]. Taylor et al. [30] perform layer segmentation in longer video sequences leveraging occlusion cues...shows that our method recovers from errors in the first frame (short of failed detection). image ground truth Lee et al. [19] Grundman et al. [14...Ochs et al. [23] Taylor et al. [30] ours Figure 7. Sample Visual Results on FBMS-59. Comparison of various state-of-the-art methods. Only a single
OpenMP Parallelization and Optimization of Graph-based Machine Learning Algorithms
2016-05-01
composed of hyperspectral video sequences recording the release of chemical plumes at the Dugway Proving Ground. We use the 329 frames of the...video. Each frame is a hyperspectral image with dimension 128 × 320 × 129, where 129 is the dimension of the channel of each pixel. The total number of...j=1. Then we use the nested for-loop to calculate the values of WXY by the formula (1). We then put the corresponding value in an array which
An Adaptive Inpainting Algorithm Based on DCT Induced Wavelet Regularization
2013-01-01
research in image processing. Applications of image inpainting include old films restoration, video inpainting [4], de-interlacing of video sequences...show Fig. 1. Performance of various inpainting algorithms for a cartoon image with text. (a) the original test image; (b...the test image with text; inpainted images by (c) SF (PSNR=37.38 dB); (d) SF-LDCT (PSNR=37.37 dB); (e) MCA (PSNR=37.04 dB); and (f) the proposed
Efficient Use of Video for 3d Modelling of Cultural Heritage Objects
NASA Astrophysics Data System (ADS)
Alsadik, B.; Gerke, M.; Vosselman, G.
2015-03-01
Currently, there is a rapid development in the techniques of the automated image based modelling (IBM), especially in advanced structure-from-motion (SFM) and dense image matching methods, and camera technology. One possibility is to use video imaging to create 3D reality based models of cultural heritage architectures and monuments. Practically, video imaging is much easier to apply when compared to still image shooting in IBM techniques because the latter needs a thorough planning and proficiency. However, one is faced with mainly three problems when video image sequences are used for highly detailed modelling and dimensional survey of cultural heritage objects. These problems are: the low resolution of video images, the need to process a large number of short baseline video images and blur effects due to camera shake on a significant number of images. In this research, the feasibility of using video images for efficient 3D modelling is investigated. A method is developed to find the minimal significant number of video images in terms of object coverage and blur effect. This reduction in video images is convenient to decrease the processing time and to create a reliable textured 3D model compared with models produced by still imaging. Two experiments for modelling a building and a monument are tested using a video image resolution of 1920×1080 pixels. Internal and external validations of the produced models are applied to find out the final predicted accuracy and the model level of details. Related to the object complexity and video imaging resolution, the tests show an achievable average accuracy between 1 - 5 cm when using video imaging, which is suitable for visualization, virtual museums and low detailed documentation.
Self-organizing neural integration of pose-motion features for human action recognition
Parisi, German I.; Weber, Cornelius; Wermter, Stefan
2015-01-01
The visual recognition of complex, articulated human movements is fundamental for a wide range of artificial systems oriented toward human-robot communication, action classification, and action-driven perception. These challenging tasks may generally involve the processing of a huge amount of visual information and learning-based mechanisms for generalizing a set of training actions and classifying new samples. To operate in natural environments, a crucial property is the efficient and robust recognition of actions, also under noisy conditions caused by, for instance, systematic sensor errors and temporarily occluded persons. Studies of the mammalian visual system and its outperforming ability to process biological motion information suggest separate neural pathways for the distinct processing of pose and motion features at multiple levels and the subsequent integration of these visual cues for action perception. We present a neurobiologically-motivated approach to achieve noise-tolerant action recognition in real time. Our model consists of self-organizing Growing When Required (GWR) networks that obtain progressively generalized representations of sensory inputs and learn inherent spatio-temporal dependencies. During the training, the GWR networks dynamically change their topological structure to better match the input space. We first extract pose and motion features from video sequences and then cluster actions in terms of prototypical pose-motion trajectories. Multi-cue trajectories from matching action frames are subsequently combined to provide action dynamics in the joint feature space. Reported experiments show that our approach outperforms previous results on a dataset of full-body actions captured with a depth sensor, and ranks among the best results for a public benchmark of domestic daily actions. PMID:26106323
The use of open data from social media for the creation of 3D georeferenced modeling
NASA Astrophysics Data System (ADS)
Themistocleous, Kyriacos
2016-08-01
There is a great deal of open source video on the internet that is posted by users on social media sites. With the release of low-cost unmanned aerial vehicles, many hobbyists are uploading videos from different locations, especially in remote areas. Using open source data that is available on the internet, this study utilized structure from motion (SfM) as a range imaging technique to estimate three-dimensional landscape features from two-dimensional image sequences extracted from video, and applied image distortion correction and geo-referencing. This type of documentation may be necessary for cultural heritage sites that are inaccessible or difficult to document, where video from unmanned aerial vehicles (UAVs) is available. These 3D models can be viewed in Google Earth and used to create orthoimages, drawings, and digital terrain models for cultural heritage and archaeological purposes in remote or inaccessible areas.
A study on multiresolution lossless video coding using inter/intra frame adaptive prediction
NASA Astrophysics Data System (ADS)
Nakachi, Takayuki; Sawabe, Tomoko; Fujii, Tetsuro
2003-06-01
Lossless video coding is required in the fields of archiving and editing digital cinema or digital broadcasting contents. This paper combines a discrete wavelet transform and adaptive inter/intra-frame prediction in the wavelet transform domain to create multiresolution lossless video coding. The multiresolution structure offered by the wavelet transform facilitates interchange among several video source formats such as Super High Definition (SHD) images, HDTV, SDTV, and mobile applications. Adaptive inter/intra-frame prediction is an extension of JPEG-LS, a state-of-the-art lossless still image compression standard. Based on the image statistics of the wavelet transform domains in successive frames, inter/intra frame adaptive prediction is applied to the appropriate wavelet transform domain. This adaptation offers superior compression performance. This is achieved with low computational cost and no increase in additional information. Experiments on digital cinema test sequences confirm the effectiveness of the proposed algorithm.
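Since the abstract describes its prediction as an extension of JPEG-LS, the following sketch shows the fixed median edge detector (MED) predictor used in standard JPEG-LS, applied to a single small image. It illustrates intra-frame spatial prediction only; the paper's wavelet-domain inter/intra adaptation is not reproduced, and the zero-valued border handling is a simplifying assumption rather than the JPEG-LS edge rule.

```python
def med_predict(left, above, upper_left):
    """JPEG-LS median edge detector (MED) predictor."""
    if upper_left >= max(left, above):
        return min(left, above)
    if upper_left <= min(left, above):
        return max(left, above)
    return left + above - upper_left

def _neighbours(image, y, x):
    # Missing neighbours are treated as 0 (assumption for brevity).
    left = image[y][x - 1] if x > 0 else 0
    above = image[y - 1][x] if y > 0 else 0
    upper_left = image[y - 1][x - 1] if x > 0 and y > 0 else 0
    return left, above, upper_left

def residuals(image):
    """Prediction residuals in raster order; these would be entropy coded."""
    return [[image[y][x] - med_predict(*_neighbours(image, y, x))
             for x in range(len(image[0]))]
            for y in range(len(image))]

def reconstruct(res):
    """Invert residuals() exactly, which is what makes the scheme lossless."""
    image = [[0] * len(res[0]) for _ in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            image[y][x] = res[y][x] + med_predict(*_neighbours(image, y, x))
    return image

frame = [[10, 10, 12], [10, 11, 13], [11, 12, 14]]   # hypothetical samples
res = residuals(frame)
restored = reconstruct(res)
```

Because the decoder forms the same prediction from already reconstructed samples, adding the residuals back recovers the input exactly. An inter-frame variant would draw its prediction context from the previous frame as well.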
Three-dimensional face pose detection and tracking using monocular videos: tool and application.
Dornaika, Fadi; Raducanu, Bogdan
2009-08-01
Recently, we have proposed a real-time tracker that simultaneously tracks the 3-D head pose and facial actions in monocular video sequences that can be provided by low quality cameras. This paper has two main contributions. First, we propose an automatic 3-D face pose initialization scheme for the real-time tracker by adopting a 2-D face detector and an eigenface system. Second, we use the proposed methods (the initialization and tracking) for enhancing the human-machine interaction functionality of an AIBO robot. More precisely, we show how the orientation of the robot's camera (or any active vision system) can be controlled through the estimation of the user's head pose. Applications based on head-pose imitation such as telepresence, virtual reality, and video games can directly exploit the proposed techniques. Experiments on real videos confirm the robustness and usefulness of the proposed methods.