Detection of inter-frame forgeries in digital videos.
K, Sitara; Mehtre, B M
2018-05-26
Videos are acceptable as evidence in the court of law, provided its authenticity and integrity are scientifically validated. Videos recorded by surveillance systems are susceptible to malicious alterations of visual content by perpetrators locally or remotely. Such malicious alterations of video contents (called video forgeries) are categorized into inter-frame and intra-frame forgeries. In this paper, we propose inter-frame forgery detection techniques using tamper traces from spatio-temporal and compressed domains. Pristine videos containing frames that are recorded during sudden camera zooming event, may get wrongly classified as tampered videos leading to an increase in false positives. To address this issue, we propose a method for zooming detection and it is incorporated in video tampering detection. Frame shuffling detection, which was not explored so far is also addressed in our work. Our method is capable of differentiating various inter-frame tamper events and its localization in the temporal domain. The proposed system is tested on 23,586 videos of which 2346 are pristine and rest of them are candidates of inter-frame forged videos. Experimental results show that we have successfully detected frame shuffling with encouraging accuracy rates. We have achieved improved accuracy on forgery detection in frame insertion, frame deletion and frame duplication. Copyright © 2018. Published by Elsevier B.V.
NASA Astrophysics Data System (ADS)
Sharma, Kajal; Moon, Inkyu; Kim, Sung Gaun
2012-10-01
Estimating depth has long been a major issue in the field of computer vision and robotics. The Kinect sensor's active sensing strategy provides high-frame-rate depth maps and can recognize user gestures and human pose. This paper presents a technique to estimate the depth of features extracted from video frames, along with an improved feature-matching method. In this paper, we used the Kinect camera developed by Microsoft, which captured color and depth images for further processing. Feature detection and selection is an important task for robot navigation. Many feature-matching techniques have been proposed earlier, and this paper proposes an improved feature matching between successive video frames with the use of neural network methodology in order to reduce the computation time of feature matching. The features extracted are invariant to image scale and rotation, and different experiments were conducted to evaluate the performance of feature matching between successive video frames. The extracted features are assigned distance based on the Kinect technology that can be used by the robot in order to determine the path of navigation, along with obstacle detection applications.
Multi-Frame Convolutional Neural Networks for Object Detection in Temporal Data
2017-03-01
maximum 200 words) Given the problem of detecting objects in video , existing neural-network solutions rely on a post-processing step to combine...information across frames and strengthen conclusions. This technique has been successful for videos with simple, dominant objects but it cannot detect objects...Computer Science iii THIS PAGE INTENTIONALLY LEFT BLANK iv ABSTRACT Given the problem of detecting objects in video , existing neural-network solutions rely
A study on multiresolution lossless video coding using inter/intra frame adaptive prediction
NASA Astrophysics Data System (ADS)
Nakachi, Takayuki; Sawabe, Tomoko; Fujii, Tetsuro
2003-06-01
Lossless video coding is required in the fields of archiving and editing digital cinema or digital broadcasting contents. This paper combines a discrete wavelet transform and adaptive inter/intra-frame prediction in the wavelet transform domain to create multiresolution lossless video coding. The multiresolution structure offered by the wavelet transform facilitates interchange among several video source formats such as Super High Definition (SHD) images, HDTV, SDTV, and mobile applications. Adaptive inter/intra-frame prediction is an extension of JPEG-LS, a state-of-the-art lossless still image compression standard. Based on the image statistics of the wavelet transform domains in successive frames, inter/intra frame adaptive prediction is applied to the appropriate wavelet transform domain. This adaptation offers superior compression performance. This is achieved with low computational cost and no increase in additional information. Experiments on digital cinema test sequences confirm the effectiveness of the proposed algorithm.
A fuzzy measure approach to motion frame analysis for scene detection. M.S. Thesis - Houston Univ.
NASA Technical Reports Server (NTRS)
Leigh, Albert B.; Pal, Sankar K.
1992-01-01
This paper addresses a solution to the problem of scene estimation of motion video data in the fuzzy set theoretic framework. Using fuzzy image feature extractors, a new algorithm is developed to compute the change of information in each of two successive frames to classify scenes. This classification process of raw input visual data can be used to establish structure for correlation. The algorithm attempts to fulfill the need for nonlinear, frame-accurate access to video data for applications such as video editing and visual document archival/retrieval systems in multimedia environments.
Cerina, Luca; Iozzia, Luca; Mainardi, Luca
2017-11-14
In this paper, common time- and frequency-domain variability indexes obtained by pulse rate variability (PRV) series extracted from video-photoplethysmographic signal (vPPG) were compared with heart rate variability (HRV) parameters calculated from synchronized ECG signals. The dual focus of this study was to analyze the effect of different video acquisition frame-rates starting from 60 frames-per-second (fps) down to 7.5 fps and different video compression techniques using both lossless and lossy codecs on PRV parameters estimation. Video recordings were acquired through an off-the-shelf GigE Sony XCG-C30C camera on 60 young, healthy subjects (age 23±4 years) in the supine position. A fully automated, signal extraction method based on the Kanade-Lucas-Tomasi (KLT) algorithm for regions of interest (ROI) detection and tracking, in combination with a zero-phase principal component analysis (ZCA) signal separation technique was employed to convert the video frames sequence to a pulsatile signal. The frame-rate degradation was simulated on video recordings by directly sub-sampling the ROI tracking and signal extraction modules, to correctly mimic videos recorded at a lower speed. The compression of the videos was configured to avoid any frame rejection caused by codec quality leveling, FFV1 codec was used for lossless compression and H.264 with variable quality parameter as lossy codec. The results showed that a reduced frame-rate leads to inaccurate tracking of ROIs, increased time-jitter in the signals dynamics and local peak displacements, which degrades the performances in all the PRV parameters. The root mean square of successive differences (RMSSD) and the proportion of successive differences greater than 50 ms (PNN50) indexes in time-domain and the low frequency (LF) and high frequency (HF) power in frequency domain were the parameters which highly degraded with frame-rate reduction. Such a degradation can be partially mitigated by up-sampling the measured signal at a higher frequency (namely 60 Hz). Concerning the video compression, the results showed that compression techniques are suitable for the storage of vPPG recordings, although lossless or intra-frame compression are to be preferred over inter-frame compression methods. FFV1 performances are very close to the uncompressed (UNC) version with less than 45% disk size. H.264 showed a degradation of the PRV estimation directly correlated with the increase of the compression ratio.
Compressed-domain video indexing techniques using DCT and motion vector information in MPEG video
NASA Astrophysics Data System (ADS)
Kobla, Vikrant; Doermann, David S.; Lin, King-Ip; Faloutsos, Christos
1997-01-01
Development of various multimedia applications hinges on the availability of fast and efficient storage, browsing, indexing, and retrieval techniques. Given that video is typically stored efficiently in a compressed format, if we can analyze the compressed representation directly, we can avoid the costly overhead of decompressing and operating at the pixel level. Compressed domain parsing of video has been presented in earlier work where a video clip is divided into shots, subshots, and scenes. In this paper, we describe key frame selection, feature extraction, and indexing and retrieval techniques that are directly applicable to MPEG compressed video. We develop a frame-type independent representation of the various types of frames present in an MPEG video in which al frames can be considered equivalent. Features are derived from the available DCT, macroblock, and motion vector information and mapped to a low-dimensional space where they can be accessed with standard database techniques. The spatial information is used as primary index while the temporal information is used to enhance the robustness of the system during the retrieval process. The techniques presented enable fast archiving, indexing, and retrieval of video. Our operational prototype typically takes a fraction of a second to retrieve similar video scenes from our database, with over 95% success.
Tele-Assessment of the Berg Balance Scale: Effects of Transmission Characteristics.
Venkataraman, Kavita; Morgan, Michelle; Amis, Kristopher A; Landerman, Lawrence R; Koh, Gerald C; Caves, Kevin; Hoenig, Helen
2017-04-01
To compare Berg Balance Scale (BBS) rating using videos with differing transmission characteristics with direct in-person rating. Repeated-measures study for the assessment of the BBS in 8 configurations: in person, high-definition video with slow motion review, standard-definition videos with varying bandwidths and frame rates (768 kilobytes per second [kbps] videos at 8, 15, and 30 frames per second [fps], 30 fps videos at 128, 384, and 768 kbps). Medical center. Patients with limitations (N=45) in ≥1 of 3 specific aspects of motor function: fine motor coordination, gross motor coordination, and gait and balance. Not applicable. Ability to rate the BBS in person and using videos with differing bandwidths and frame rates in frontal and lateral views. Compared with in-person rating (7%), 18% (P=.29) of high-definition videos and 37% (P=.03) of standard-definition videos could not be rated. Interrater reliability for the high-definition videos was .96 (95% confidence interval, .94-.97). Rating failure proportions increased from 20% in videos with the highest bandwidth to 60% (P<.001) in videos with the lowest bandwidth, with no significant differences in proportions across frame rate categories. Both frontal and lateral views were critical for successful rating using videos, with 60% to 70% (P<.001) of videos unable to be rated on a single view. Although there is some loss of information when using videos to rate the BBS compared to in-person ratings, it is feasible to reliably rate the BBS remotely in standard clinical spaces. However, optimal video rating requires frontal and lateral views for each assessment, high-definition video with high bandwidth, and the ability to carry out slow motion review. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Development of high-speed video cameras
NASA Astrophysics Data System (ADS)
Etoh, Takeharu G.; Takehara, Kohsei; Okinaka, Tomoo; Takano, Yasuhide; Ruckelshausen, Arno; Poggemann, Dirk
2001-04-01
Presented in this paper is an outline of the R and D activities on high-speed video cameras, which have been done in Kinki University since more than ten years ago, and are currently proceeded as an international cooperative project with University of Applied Sciences Osnabruck and other organizations. Extensive marketing researches have been done, (1) on user's requirements on high-speed multi-framing and video cameras by questionnaires and hearings, and (2) on current availability of the cameras of this sort by search of journals and websites. Both of them support necessity of development of a high-speed video camera of more than 1 million fps. A video camera of 4,500 fps with parallel readout was developed in 1991. A video camera with triple sensors was developed in 1996. The sensor is the same one as developed for the previous camera. The frame rate is 50 million fps for triple-framing and 4,500 fps for triple-light-wave framing, including color image capturing. Idea on a video camera of 1 million fps with an ISIS, In-situ Storage Image Sensor, was proposed in 1993 at first, and has been continuously improved. A test sensor was developed in early 2000, and successfully captured images at 62,500 fps. Currently, design of a prototype ISIS is going on, and, hopefully, will be fabricated in near future. Epoch-making cameras in history of development of high-speed video cameras by other persons are also briefly reviewed.
Yang, Yang; Stanković, Vladimir; Xiong, Zixiang; Zhao, Wei
2009-03-01
Following recent works on the rate region of the quadratic Gaussian two-terminal source coding problem and limit-approaching code designs, this paper examines multiterminal source coding of two correlated, i.e., stereo, video sequences to save the sum rate over independent coding of both sequences. Two multiterminal video coding schemes are proposed. In the first scheme, the left sequence of the stereo pair is coded by H.264/AVC and used at the joint decoder to facilitate Wyner-Ziv coding of the right video sequence. The first I-frame of the right sequence is successively coded by H.264/AVC Intracoding and Wyner-Ziv coding. An efficient stereo matching algorithm based on loopy belief propagation is then adopted at the decoder to produce pixel-level disparity maps between the corresponding frames of the two decoded video sequences on the fly. Based on the disparity maps, side information for both motion vectors and motion-compensated residual frames of the right sequence are generated at the decoder before Wyner-Ziv encoding. In the second scheme, source splitting is employed on top of classic and Wyner-Ziv coding for compression of both I-frames to allow flexible rate allocation between the two sequences. Experiments with both schemes on stereo video sequences using H.264/AVC, LDPC codes for Slepian-Wolf coding of the motion vectors, and scalar quantization in conjunction with LDPC codes for Wyner-Ziv coding of the residual coefficients give a slightly lower sum rate than separate H.264/AVC coding of both sequences at the same video quality.
Kim, Seung-Cheol; Dong, Xiao-Bin; Kwon, Min-Woo; Kim, Eun-Soo
2013-05-06
A novel approach for fast generation of video holograms of three-dimensional (3-D) moving objects using a motion compensation-based novel-look-up-table (MC-N-LUT) method is proposed. Motion compensation has been widely employed in compression of conventional 2-D video data because of its ability to exploit high temporal correlation between successive video frames. Here, this concept of motion-compensation is firstly applied to the N-LUT based on its inherent property of shift-invariance. That is, motion vectors of 3-D moving objects are extracted between the two consecutive video frames, and with them motions of the 3-D objects at each frame are compensated. Then, through this process, 3-D object data to be calculated for its video holograms are massively reduced, which results in a dramatic increase of the computational speed of the proposed method. Experimental results with three kinds of 3-D video scenarios reveal that the average number of calculated object points and the average calculation time for one object point of the proposed method, have found to be reduced down to 86.95%, 86.53% and 34.99%, 32.30%, respectively compared to those of the conventional N-LUT and temporal redundancy-based N-LUT (TR-N-LUT) methods.
Video-rate scanning two-photon excitation fluorescence microscopy and ratio imaging with cameleons.
Fan, G Y; Fujisaki, H; Miyawaki, A; Tsay, R K; Tsien, R Y; Ellisman, M H
1999-01-01
A video-rate (30 frames/s) scanning two-photon excitation microscope has been successfully tested. The microscope, based on a Nikon RCM 8000, incorporates a femtosecond pulsed laser with wavelength tunable from 690 to 1050 nm, prechirper optics for laser pulse-width compression, resonant galvanometer for video-rate point scanning, and a pair of nonconfocal detectors for fast emission ratioing. An increase in fluorescent emission of 1.75-fold is consistently obtained with the use of the prechirper optics. The nonconfocal detectors provide another 2.25-fold increase in detection efficiency. Ratio imaging and optical sectioning can therefore be performed more efficiently without confocal optics. Faster frame rates, at 60, 120, and 240 frames/s, can be achieved with proportionally reduced scan lines per frame. Useful two-photon images can be acquired at video rate with a laser power as low as 2.7 mW at specimen with the genetically modified green fluorescent proteins. Preliminary results obtained using this system confirm that the yellow "cameleons" exhibit similar optical properties as under one-photon excitation conditions. Dynamic two-photon images of cardiac myocytes and ratio images of yellow cameleon-2.1, -3.1, and -3.1nu are also presented. PMID:10233058
VanWye, William R; Hoover, Donald L
2018-05-01
Qualitative analysis has its limitations as the speed of human movement often occurs more quickly than can be comprehended. Digital video allows for frame-by-frame analysis, and therefore likely more effective interventions for gait dysfunction. Although the use of digital video outside laboratory settings, just a decade ago, was challenging due to cost and time constraints, rapid use of smartphones and software applications has made this technology much more practical for clinical usage. A 35-year-old man presented for evaluation with the chief complaint of knee pain 24 months status-post triple arthrodesis following a work-related crush injury. In-clinic qualitative gait analysis revealed gait dysfunction, which was augmented by using a standard IPhone® 3GS camera. After video capture, an IPhone® application (Speed Up TV®, https://itunes.apple.com/us/app/speeduptv/id386986953?mt=8 ) allowed for frame-by-frame analysis. Corrective techniques were employed using in-clinic equipment to develop and apply a temporary heel-to-toe rocker sole (HTRS) to the patient's shoe. Post-intervention video revealed significantly improved gait efficiency with a decrease in pain. The patient was promptly fitted with a permanent HTRS orthosis. This intervention enabled the patient to successfully complete a work conditioning program and progress to job retraining. Video allows for multiple views, which can be further enhanced by using applications for frame-by-frame analysis and zoom capabilities. This is especially useful for less experienced observers of human motion, as well as for establishing comparative signs prior to implementation of training and/or permanent devices.
From image captioning to video summary using deep recurrent networks and unsupervised segmentation
NASA Astrophysics Data System (ADS)
Morosanu, Bogdan-Andrei; Lemnaru, Camelia
2018-04-01
Automatic captioning systems based on recurrent neural networks have been tremendously successful at providing realistic natural language captions for complex and varied image data. We explore methods for adapting existing models trained on large image caption data sets to a similar problem, that of summarising videos using natural language descriptions and frame selection. These architectures create internal high level representations of the input image that can be used to define probability distributions and distance metrics on these distributions. Specifically, we interpret each hidden unit inside a layer of the caption model as representing the un-normalised log probability of some unknown image feature of interest for the caption generation process. We can then apply well understood statistical divergence measures to express the difference between images and create an unsupervised segmentation of video frames, classifying consecutive images of low divergence as belonging to the same context, and those of high divergence as belonging to different contexts. To provide a final summary of the video, we provide a group of selected frames and a text description accompanying them, allowing a user to perform a quick exploration of large unlabeled video databases.
Constructing spherical panoramas of a bladder phantom from endoscopic video using bundle adjustment
NASA Astrophysics Data System (ADS)
Soper, Timothy D.; Chandler, John E.; Porter, Michael P.; Seibel, Eric J.
2011-03-01
The high recurrence rate of bladder cancer requires patients to undergo frequent surveillance screenings over their lifetime following initial diagnosis and resection. Our laboratory is developing panoramic stitching software that would compile several minutes of cystoscopic video into a single panoramic image, covering the entire bladder, for review by an urolgist at a later time or remote location. Global alignment of video frames is achieved by using a bundle adjuster that simultaneously recovers both the 3D structure of the bladder as well as the scope motion using only the video frames as input. The result of the algorithm is a complete 360° spherical panorama of the outer surface. The details of the software algorithms are presented here along with results from both a virtual cystoscopy as well from real endoscopic imaging of a bladder phantom. The software successfully stitched several hundred video frames into a single panoramic with subpixel accuracy and with no knowledge of the intrinsic camera properties, such as focal length and radial distortion. In the discussion, we outline future work in development of the software as well as identifying factors pertinent to clinical translation of this technology.
Selecting salient frames for spatiotemporal video modeling and segmentation.
Song, Xiaomu; Fan, Guoliang
2007-12-01
We propose a new statistical generative model for spatiotemporal video segmentation. The objective is to partition a video sequence into homogeneous segments that can be used as "building blocks" for semantic video segmentation. The baseline framework is a Gaussian mixture model (GMM)-based video modeling approach that involves a six-dimensional spatiotemporal feature space. Specifically, we introduce the concept of frame saliency to quantify the relevancy of a video frame to the GMM-based spatiotemporal video modeling. This helps us use a small set of salient frames to facilitate the model training by reducing data redundancy and irrelevance. A modified expectation maximization algorithm is developed for simultaneous GMM training and frame saliency estimation, and the frames with the highest saliency values are extracted to refine the GMM estimation for video segmentation. Moreover, it is interesting to find that frame saliency can imply some object behaviors. This makes the proposed method also applicable to other frame-related video analysis tasks, such as key-frame extraction, video skimming, etc. Experiments on real videos demonstrate the effectiveness and efficiency of the proposed method.
Vehicle counting system using real-time video processing
NASA Astrophysics Data System (ADS)
Crisóstomo-Romero, Pedro M.
2006-02-01
Transit studies are important for planning a road network with optimal vehicular flow. A vehicular count is essential. This article presents a vehicle counting system based on video processing. An advantage of such system is the greater detail than is possible to obtain, like shape, size and speed of vehicles. The system uses a video camera placed above the street to image transit in real-time. The video camera must be placed at least 6 meters above the street level to achieve proper acquisition quality. Fast image processing algorithms and small image dimensions are used to allow real-time processing. Digital filters, mathematical morphology, segmentation and other techniques allow identifying and counting all vehicles in the image sequences. The system was implemented under Linux in a 1.8 GHz Pentium 4 computer. A successful count was obtained with frame rates of 15 frames per second for images of size 240x180 pixels and 24 frames per second for images of size 180x120 pixels, thus being able to count vehicles whose speeds do not exceed 150 km/h.
WE-AB-BRA-12: Virtual Endoscope Tracking for Endoscopy-CT Image Registration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ingram, W; Rao, A; Wendt, R
Purpose: The use of endoscopy in radiotherapy will remain limited until we can register endoscopic video to CT using standard clinical equipment. In this phantom study we tested a registration method using virtual endoscopy to measure CT-space positions from endoscopic video. Methods: Our phantom is a contorted clay cylinder with 2-mm-diameter markers in the luminal surface. These markers are visible on both CT and endoscopic video. Virtual endoscope images were rendered from a polygonal mesh created by segmenting the phantom’s luminal surface on CT. We tested registration accuracy by tracking the endoscope’s 6-degree-of-freedom coordinates frame-to-frame in a video recorded asmore » it moved through the phantom, and using these coordinates to measure CT-space positions of markers visible in the final frame. To track the endoscope we used the Nelder-Mead method to search for coordinates that render the virtual frame most similar to the next recorded frame. We measured the endoscope’s initial-frame coordinates using a set of visible markers, and for image similarity we used a combination of mutual information and gradient alignment. CT-space marker positions were measured by projecting their final-frame pixel addresses through the virtual endoscope to intersect with the mesh. Registration error was quantified as the distance between this intersection and the marker’s manually-selected CT-space position. Results: Tracking succeeded for 6 of 8 videos, for which the mean registration error was 4.8±3.5mm (24 measurements total). The mean error in the axial direction (3.1±3.3mm) was larger than in the sagittal or coronal directions (2.0±2.3mm, 1.7±1.6mm). In the other 2 videos, the virtual endoscope got stuck in a false minimum. Conclusion: Our method can successfully track the position and orientation of an endoscope, and it provides accurate spatial mapping from endoscopic video to CT. This method will serve as a foundation for an endoscopy-CT registration framework that is clinically valuable and requires no specialized equipment.« less
Key Frame Extraction in the Summary Space.
Li, Xuelong; Zhao, Bin; Lu, Xiaoqiang; Xuelong Li; Bin Zhao; Xiaoqiang Lu; Lu, Xiaoqiang; Li, Xuelong; Zhao, Bin
2018-06-01
Key frame extraction is an efficient way to create the video summary which helps users obtain a quick comprehension of the video content. Generally, the key frames should be representative of the video content, meanwhile, diverse to reduce the redundancy. Based on the assumption that the video data are near a subspace of a high-dimensional space, a new approach, named as key frame extraction in the summary space, is proposed for key frame extraction in this paper. The proposed approach aims to find the representative frames of the video and filter out similar frames from the representative frame set. First of all, the video data are mapped to a high-dimensional space, named as summary space. Then, a new representation is learned for each frame by analyzing the intrinsic structure of the summary space. Specifically, the learned representation can reflect the representativeness of the frame, and is utilized to select representative frames. Next, the perceptual hash algorithm is employed to measure the similarity of representative frames. As a result, the key frame set is obtained after filtering out similar frames from the representative frame set. Finally, the video summary is constructed by assigning the key frames in temporal order. Additionally, the ground truth, created by filtering out similar frames from human-created summaries, is utilized to evaluate the quality of the video summary. Compared with several traditional approaches, the experimental results on 80 videos from two datasets indicate the superior performance of our approach.
Encrypting Digital Camera with Automatic Encryption Key Deletion
NASA Technical Reports Server (NTRS)
Oakley, Ernest C. (Inventor)
2007-01-01
A digital video camera includes an image sensor capable of producing a frame of video data representing an image viewed by the sensor, an image memory for storing video data such as previously recorded frame data in a video frame location of the image memory, a read circuit for fetching the previously recorded frame data, an encryption circuit having an encryption key input connected to receive the previously recorded frame data from the read circuit as an encryption key, an un-encrypted data input connected to receive the frame of video data from the image sensor and an encrypted data output port, and a write circuit for writing a frame of encrypted video data received from the encrypted data output port of the encryption circuit to the memory and overwriting the video frame location storing the previously recorded frame data.
Dense mesh sampling for video-based facial animation
NASA Astrophysics Data System (ADS)
Peszor, Damian; Wojciechowska, Marzena
2016-06-01
The paper describes an approach for selection of feature points on three-dimensional, triangle mesh obtained using various techniques from several video footages. This approach has a dual purpose. First, it allows to minimize the data stored for the purpose of facial animation, so that instead of storing position of each vertex in each frame, one could store only a small subset of vertices for each frame and calculate positions of others based on the subset. Second purpose is to select feature points that could be used for anthropometry-based retargeting of recorded mimicry to another model, with sampling density beyond that which can be achieved using marker-based performance capture techniques. Developed approach was successfully tested on artificial models, models constructed using structured light scanner, and models constructed from video footages using stereophotogrammetry.
Full-frame video stabilization with motion inpainting.
Matsushita, Yasuyuki; Ofek, Eyal; Ge, Weina; Tang, Xiaoou; Shum, Heung-Yeung
2006-07-01
Video stabilization is an important video enhancement technology which aims at removing annoying shaky motion from videos. We propose a practical and robust approach of video stabilization that produces full-frame stabilized videos with good visual quality. While most previous methods end up with producing smaller size stabilized videos, our completion method can produce full-frame videos by naturally filling in missing image parts by locally aligning image data of neighboring frames. To achieve this, motion inpainting is proposed to enforce spatial and temporal consistency of the completion in both static and dynamic image areas. In addition, image quality in the stabilized video is enhanced with a new practical deblurring algorithm. Instead of estimating point spread functions, our method transfers and interpolates sharper image pixels of neighboring frames to increase the sharpness of the frame. The proposed video completion and deblurring methods enabled us to develop a complete video stabilizer which can naturally keep the original image quality in the stabilized videos. The effectiveness of our method is confirmed by extensive experiments over a wide variety of videos.
Hierarchical video summarization
NASA Astrophysics Data System (ADS)
Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.
1998-12-01
We address the problem of key-frame summarization of vide in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
A novel key-frame extraction approach for both video summary and video index.
Lei, Shaoshuai; Xie, Gang; Yan, Gaowei
2014-01-01
Existing key-frame extraction methods are basically video summary oriented; yet the index task of key-frames is ignored. This paper presents a novel key-frame extraction approach which can be available for both video summary and video index. First a dynamic distance separability algorithm is advanced to divide a shot into subshots based on semantic structure, and then appropriate key-frames are extracted in each subshot by SVD decomposition. Finally, three evaluation indicators are proposed to evaluate the performance of the new approach. Experimental results show that the proposed approach achieves good semantic structure for semantics-based video index and meanwhile produces video summary consistent with human perception.
Parallel Key Frame Extraction for Surveillance Video Service in a Smart City.
Zheng, Ran; Yao, Chuanwei; Jin, Hai; Zhu, Lei; Zhang, Qin; Deng, Wei
2015-01-01
Surveillance video service (SVS) is one of the most important services provided in a smart city. It is very important for the utilization of SVS to provide design efficient surveillance video analysis techniques. Key frame extraction is a simple yet effective technique to achieve this goal. In surveillance video applications, key frames are typically used to summarize important video content. It is very important and essential to extract key frames accurately and efficiently. A novel approach is proposed to extract key frames from traffic surveillance videos based on GPU (graphics processing units) to ensure high efficiency and accuracy. For the determination of key frames, motion is a more salient feature in presenting actions or events, especially in surveillance videos. The motion feature is extracted in GPU to reduce running time. It is also smoothed to reduce noise, and the frames with local maxima of motion information are selected as the final key frames. The experimental results show that this approach can extract key frames more accurately and efficiently compared with several other methods.
First- and third-party ground truth for key frame extraction from consumer video clips
NASA Astrophysics Data System (ADS)
Costello, Kathleen; Luo, Jiebo
2007-02-01
Extracting key frames (KF) from video is of great interest in many applications, such as video summary, video organization, video compression, and prints from video. KF extraction is not a new problem. However, current literature has been focused mainly on sports or news video. In the consumer video space, the biggest challenges for key frame selection from consumer videos are the unconstrained content and lack of any preimposed structure. In this study, we conduct ground truth collection of key frames from video clips taken by digital cameras (as opposed to camcorders) using both first- and third-party judges. The goals of this study are: (1) to create a reference database of video clips reasonably representative of the consumer video space; (2) to identify associated key frames by which automated algorithms can be compared and judged for effectiveness; and (3) to uncover the criteria used by both first- and thirdparty human judges so these criteria can influence algorithm design. The findings from these ground truths will be discussed.
Video framerate, resolution and grayscale tradeoffs for undersea telemanipulator
NASA Technical Reports Server (NTRS)
Ranadive, V.; Sheridan, T. B.
1981-01-01
The product of Frame Rate (F) in frames per second, Resolution (R) in total pixels and grayscale in bits (G) equals the transmission band rate in bits per second. Thus for a fixed channel capacity there are tradeoffs between F, R and G in the actual sampling of the picture for a particular manual control task in the present case remote undersea manipulation. A manipulator was used in the MASTER/SLAVE mode to study these tradeoffs. Images were systematically degraded from 28 frames per second, 128 x 128 pixels and 16 levels (4 bits) grayscale, with various FRG combinations constructed from a real-time digitized (charge-injection) video camera. It was found that frame rate, resolution and grayscale could be independently reduced without preventing the operator from accomplishing his/her task. Threshold points were found beyond which degradation would prevent any successful performance. A general conclusion is that a well trained operator can perform familiar remote manipulator tasks with a considerably degrade picture, down to 50 K bits/ sec.
NASA Astrophysics Data System (ADS)
Buford, James A., Jr.; Cosby, David; Bunfield, Dennis H.; Mayhall, Anthony J.; Trimble, Darian E.
2007-04-01
AMRDEC has successfully tested hardware and software for Real-Time Scene Generation for IR and SAL Sensors on COTS PC based hardware and video cards. AMRDEC personnel worked with nVidia and Concurrent Computer Corporation to develop a Scene Generation system capable of frame rates of at least 120Hz while frame locked to an external source (such as a missile seeker) with no dropped frames. Latency measurements and image validation were performed using COTS and in-house developed hardware and software. Software for the Scene Generation system was developed using OpenSceneGraph.
Common and Innovative Visuals: A sparsity modeling framework for video.
Abdolhosseini Moghadam, Abdolreza; Kumar, Mrityunjay; Radha, Hayder
2014-05-02
Efficient video representation models are critical for many video analysis and processing tasks. In this paper, we present a framework based on the concept of finding the sparsest solution to model video frames. To model the spatio-temporal information, frames from one scene are decomposed into two components: (i) a common frame, which describes the visual information common to all the frames in the scene/segment, and (ii) a set of innovative frames, which depicts the dynamic behaviour of the scene. The proposed approach exploits and builds on recent results in the field of compressed sensing to jointly estimate the common frame and the innovative frames for each video segment. We refer to the proposed modeling framework by CIV (Common and Innovative Visuals). We show how the proposed model can be utilized to find scene change boundaries and extend CIV to videos from multiple scenes. Furthermore, the proposed model is robust to noise and can be used for various video processing applications without relying on motion estimation and detection or image segmentation. Results for object tracking, video editing (object removal, inpainting) and scene change detection are presented to demonstrate the efficiency and the performance of the proposed model.
High-frame-rate infrared and visible cameras for test range instrumentation
NASA Astrophysics Data System (ADS)
Ambrose, Joseph G.; King, B.; Tower, John R.; Hughes, Gary W.; Levine, Peter A.; Villani, Thomas S.; Esposito, Benjamin J.; Davis, Timothy J.; O'Mara, K.; Sjursen, W.; McCaffrey, Nathaniel J.; Pantuso, Francis P.
1995-09-01
Field deployable, high frame rate camera systems have been developed to support the test and evaluation activities at the White Sands Missile Range. The infrared cameras employ a 640 by 480 format PtSi focal plane array (FPA). The visible cameras employ a 1024 by 1024 format backside illuminated CCD. The monolithic, MOS architecture of the PtSi FPA supports commandable frame rate, frame size, and integration time. The infrared cameras provide 3 - 5 micron thermal imaging in selectable modes from 30 Hz frame rate, 640 by 480 frame size, 33 ms integration time to 300 Hz frame rate, 133 by 142 frame size, 1 ms integration time. The infrared cameras employ a 500 mm, f/1.7 lens. Video outputs are 12-bit digital video and RS170 analog video with histogram-based contrast enhancement. The 1024 by 1024 format CCD has a 32-port, split-frame transfer architecture. The visible cameras exploit this architecture to provide selectable modes from 30 Hz frame rate, 1024 by 1024 frame size, 32 ms integration time to 300 Hz frame rate, 1024 by 1024 frame size (with 2:1 vertical binning), 0.5 ms integration time. The visible cameras employ a 500 mm, f/4 lens, with integration time controlled by an electro-optical shutter. Video outputs are RS170 analog video (512 by 480 pixels), and 12-bit digital video.
NASA Astrophysics Data System (ADS)
Cunningham, Cindy C.; Peloquin, Tracy D.
1999-02-01
Since late 1996 the Forensic Identification Services Section of the Ontario Provincial Police has been actively involved in state-of-the-art image capture and the processing of video images extracted from crime scene videos. The benefits and problems of this technology for video analysis are discussed. All analysis is being conducted on SUN Microsystems UNIX computers, networked to a digital disk recorder that is used for video capture. The primary advantage of this system over traditional frame grabber technology is reviewed. Examples from actual cases are presented and the successes and limitations of this approach are explored. Suggestions to companies implementing security technology plans for various organizations (banks, stores, restaurants, etc.) will be made. Future directions for this work and new technologies are also discussed.
Heterogeneity image patch index and its application to consumer video summarization.
Dang, Chinh T; Radha, Hayder
2014-06-01
Automatic video summarization is indispensable for fast browsing and efficient management of large video libraries. In this paper, we introduce an image feature that we refer to as heterogeneity image patch (HIP) index. The proposed HIP index provides a new entropy-based measure of the heterogeneity of patches within any picture. By evaluating this index for every frame in a video sequence, we generate a HIP curve for that sequence. We exploit the HIP curve in solving two categories of video summarization applications: key frame extraction and dynamic video skimming. Under the key frame extraction frame-work, a set of candidate key frames is selected from abundant video frames based on the HIP curve. Then, a proposed patch-based image dissimilarity measure is used to create affinity matrix of these candidates. Finally, a set of key frames is extracted from the affinity matrix using a min–max based algorithm. Under video skimming, we propose a method to measure the distance between a video and its skimmed representation. The video skimming problem is then mapped into an optimization framework and solved by minimizing a HIP-based distance for a set of extracted excerpts. The HIP framework is pixel-based and does not require semantic information or complex camera motion estimation. Our simulation results are based on experiments performed on consumer videos and are compared with state-of-the-art methods. It is shown that the HIP approach outperforms other leading methods, while maintaining low complexity.
Informative-frame filtering in endoscopy videos
NASA Astrophysics Data System (ADS)
An, Yong Hwan; Hwang, Sae; Oh, JungHwan; Lee, JeongKyu; Tavanapong, Wallapak; de Groen, Piet C.; Wong, Johnny
2005-04-01
Advances in video technology are being incorporated into today"s healthcare practice. For example, colonoscopy is an important screening tool for colorectal cancer. Colonoscopy allows for the inspection of the entire colon and provides the ability to perform a number of therapeutic operations during a single procedure. During a colonoscopic procedure, a tiny video camera at the tip of the endoscope generates a video signal of the internal mucosa of the colon. The video data are displayed on a monitor for real-time analysis by the endoscopist. Other endoscopic procedures include upper gastrointestinal endoscopy, enteroscopy, bronchoscopy, cystoscopy, and laparoscopy. However, a significant number of out-of-focus frames are included in this type of videos since current endoscopes are equipped with a single, wide-angle lens that cannot be focused. The out-of-focus frames do not hold any useful information. To reduce the burdens of the further processes such as computer-aided image processing or human expert"s examinations, these frames need to be removed. We call an out-of-focus frame as non-informative frame and an in-focus frame as informative frame. We propose a new technique to classify the video frames into two classes, informative and non-informative frames using a combination of Discrete Fourier Transform (DFT), Texture Analysis, and K-Means Clustering. The proposed technique can evaluate the frames without any reference image, and does not need any predefined threshold value. Our experimental studies indicate that it achieves over 96% of four different performance metrics (i.e. precision, sensitivity, specificity, and accuracy).
Real-time CT-video registration for continuous endoscopic guidance
NASA Astrophysics Data System (ADS)
Merritt, Scott A.; Rai, Lav; Higgins, William E.
2006-03-01
Previous research has shown that CT-image-based guidance could be useful for the bronchoscopic assessment of lung cancer. This research drew upon the registration of bronchoscopic video images to CT-based endoluminal renderings of the airway tree. The proposed methods either were restricted to discrete single-frame registration, which took several seconds to complete, or required non-real-time buffering and processing of video sequences. We have devised a fast 2D/3D image registration method that performs single-frame CT-Video registration in under 1/15th of a second. This allows the method to be used for real-time registration at full video frame rates without significantly altering the physician's behavior. The method achieves its speed through a gradient-based optimization method that allows most of the computation to be performed off-line. During live registration, the optimization iteratively steps toward the locally optimal viewpoint at which a CT-based endoluminal view is most similar to a current bronchoscopic video frame. After an initial registration to begin the process (generally done in the trachea for bronchoscopy), subsequent registrations are performed in real-time on each incoming video frame. As each new bronchoscopic video frame becomes available, the current optimization is initialized using the previous frame's optimization result, allowing continuous guidance to proceed without manual re-initialization. Tests were performed using both synthetic and pre-recorded bronchoscopic video. The results show that the method is robust to initialization errors, that registration accuracy is high, and that continuous registration can proceed on real-time video at >15 frames per sec. with minimal user-intervention.
Video Super-Resolution via Bidirectional Recurrent Convolutional Networks.
Huang, Yan; Wang, Wei; Wang, Liang
2018-04-01
Super resolving a low-resolution video, namely video super-resolution (SR), is usually handled by either single-image SR or multi-frame SR. Single-Image SR deals with each video frame independently, and ignores intrinsic temporal dependency of video frames which actually plays a very important role in video SR. Multi-Frame SR generally extracts motion information, e.g., optical flow, to model the temporal dependency, but often shows high computational cost. Considering that recurrent neural networks (RNNs) can model long-term temporal dependency of video sequences well, we propose a fully convolutional RNN named bidirectional recurrent convolutional network for efficient multi-frame SR. Different from vanilla RNNs, 1) the commonly-used full feedforward and recurrent connections are replaced with weight-sharing convolutional connections. So they can greatly reduce the large number of network parameters and well model the temporal dependency in a finer level, i.e., patch-based rather than frame-based, and 2) connections from input layers at previous timesteps to the current hidden layer are added by 3D feedforward convolutions, which aim to capture discriminate spatio-temporal patterns for short-term fast-varying motions in local adjacent frames. Due to the cheap convolutional operations, our model has a low computational complexity and runs orders of magnitude faster than other multi-frame SR methods. With the powerful temporal dependency modeling, our model can super resolve videos with complex motions and achieve well performance.
Echocardiogram video summarization
NASA Astrophysics Data System (ADS)
Ebadollahi, Shahram; Chang, Shih-Fu; Wu, Henry D.; Takoma, Shin
2001-05-01
This work aims at developing innovative algorithms and tools for summarizing echocardiogram videos. Specifically, we summarize the digital echocardiogram videos by temporally segmenting them into the constituent views and representing each view by the most informative frame. For the segmentation we take advantage of the well-defined spatio- temporal structure of the echocardiogram videos. Two different criteria are used: presence/absence of color and the shape of the region of interest (ROI) in each frame of the video. The change in the ROI is due to different modes of echocardiograms present in one study. The representative frame is defined to be the frame corresponding to the end- diastole of the heart cycle. To locate the end-diastole we track the ECG of each frame to find the exact time the time- marker on the ECG crosses the peak of the end-diastole we track the ECG of each frame to find the exact time the time- marker on the ECG crosses the peak of the R-wave. The corresponding frame is chosen to be the key-frame. The entire echocardiogram video can be summarized into either a static summary, which is a storyboard type of summary and a dynamic summary, which is a concatenation of the selected segments of the echocardiogram video. To the best of our knowledge, this if the first automated system for summarizing the echocardiogram videos base don visual content.
Video Analysis of Rolling Cylinders
ERIC Educational Resources Information Center
Phommarach, S.; Wattanakasiwich, P.; Johnston, I.
2012-01-01
In this work, we studied the rolling motion of solid and hollow cylinders down an inclined plane at different angles. The motions were captured on video at 300 frames s[superscript -1], and the videos were analyzed frame by frame using video analysis software. Data from the real motion were compared with the theory of rolling down an inclined…
Dactyl Alphabet Gesture Recognition in a Video Sequence Using Microsoft Kinect
NASA Astrophysics Data System (ADS)
Artyukhin, S. G.; Mestetskiy, L. M.
2015-05-01
This paper presents an efficient framework for solving the problem of static gesture recognition based on data obtained from the web cameras and depth sensor Kinect (RGB-D - data). Each gesture given by a pair of images: color image and depth map. The database store gestures by it features description, genereated by frame for each gesture of the alphabet. Recognition algorithm takes as input a video sequence (a sequence of frames) for marking, put in correspondence with each frame sequence gesture from the database, or decide that there is no suitable gesture in the database. First, classification of the frame of the video sequence is done separately without interframe information. Then, a sequence of successful marked frames in equal gesture is grouped into a single static gesture. We propose a method combined segmentation of frame by depth map and RGB-image. The primary segmentation is based on the depth map. It gives information about the position and allows to get hands rough border. Then, based on the color image border is specified and performed analysis of the shape of the hand. Method of continuous skeleton is used to generate features. We propose a method of skeleton terminal branches, which gives the opportunity to determine the position of the fingers and wrist. Classification features for gesture is description of the position of the fingers relative to the wrist. The experiments were carried out with the developed algorithm on the example of the American Sign Language. American Sign Language gesture has several components, including the shape of the hand, its orientation in space and the type of movement. The accuracy of the proposed method is evaluated on the base of collected gestures consisting of 2700 frames.
Robust 3D DFT video watermarking
NASA Astrophysics Data System (ADS)
Deguillaume, Frederic; Csurka, Gabriela; O'Ruanaidh, Joseph J.; Pun, Thierry
1999-04-01
This paper proposes a new approach for digital watermarking and secure copyright protection of videos, the principal aim being to discourage illicit copying and distribution of copyrighted material. The method presented here is based on the discrete Fourier transform (DFT) of three dimensional chunks of video scene, in contrast with previous works on video watermarking where each video frame was marked separately, or where only intra-frame or motion compensation parameters were marked in MPEG compressed videos. Two kinds of information are hidden in the video: a watermark and a template. Both are encoded using an owner key to ensure the system security and are embedded in the 3D DFT magnitude of video chunks. The watermark is a copyright information encoded in the form of a spread spectrum signal. The template is a key based grid and is used to detect and invert the effect of frame-rate changes, aspect-ratio modification and rescaling of frames. The template search and matching is performed in the log-log-log map of the 3D DFT magnitude. The performance of the presented technique is evaluated experimentally and compared with a frame-by-frame 2D DFT watermarking approach.
Tidewater Community College Distance Learning Report.
ERIC Educational Resources Information Center
Tidewater Community Coll., Norfolk, VA.
This study of distance learning at Tidewater Community College (TCC) was conducted to determine enrollment patterns, retention, and success in distance learning courses and student perceptions. Distance learning was defined as students enrolled in one of three modes of course delivery: telecourse, online, and compressed video. The time frame for…
Unmanned Vehicle Guidance Using Video Camera/Vehicle Model
NASA Technical Reports Server (NTRS)
Sutherland, T.
1999-01-01
A video guidance sensor (VGS) system has flown on both STS-87 and STS-95 to validate a single camera/target concept for vehicle navigation. The main part of the image algorithm was the subtraction of two consecutive images using software. For a nominal size image of 256 x 256 pixels this subtraction can take a large portion of the time between successive frames in standard rate video leaving very little time for other computations. The purpose of this project was to integrate the software subtraction into hardware to speed up the subtraction process and allow for more complex algorithms to be performed, both in hardware and software.
VanderKnyff, Jeremy; Friedman, Daniela B; Tanner, Andrea
2015-01-01
Using a sample of YouTube videos posted on the YouTube channels of organ procurement organizations, a content analysis was conducted to identify the frames used to strategically communicate prodonation messages. A total of 377 videos were coded for general characteristics, format, speaker characteristics, organs discussed, structure, problem definition, and treatment. Principal components analysis identified message frames, and k-means cluster analysis established distinct groupings of videos on the basis of the strength of their relationship to message frames. Analysis of these frames and clusters found that organ procurement organizations present multiple, and sometimes competing, video types and message frames on YouTube. This study serves as important formative research that will inform future studies to measure the effectiveness of the distinct message frames and clusters identified.
Lewis, Carmen L; Pignone, Michael P; Sheridan, Stacey L; Downs, Stephen M; Kinsinger, Linda S
2003-11-01
To assess the effect of providing structured information about the benefits and harms of mammography in differing frames on women's perceptions of screening. Randomized control trial. General internal medicine academic practice. One hundred seventy-nine women aged 35 through 49. Women received 1 of 3 5-minute videos about the benefits and harms of screening mammography in women aged 40 to 49. These videos differed only in the way the probabilities of potential outcomes were framed (positive, neutral, or negative). We measured the change in accurate responses to questions about potential benefits and harms of mammography, and the change in the proportion of participants who perceived that the benefits of mammography were more important than the harms. Before seeing the videos, women's knowledge about the benefits and harms of mammography was inaccurate (82% responded incorrectly to all 3 knowledge questions). After seeing the videos, the proportion that answered correctly increased by 52%, 43%, and 30% for the 3 knowledge questions, respectively, but there were no differences between video frames. At baseline, most women thought the benefits of mammography outweighed the harms (79% positive frame, 80% neutral frame, and 85% negative frame). After the videos, these proportions were similar among the 3 groups (84%, 81%, 83%, P =.93). Women improved the accuracy of their responses to questions about the benefits and harms of mammography after seeing the videos, but this change was not affected by the framing of information. Women strongly perceived that the benefits of mammography outweighed the harms, and providing accurate information had no effect on these perceptions, regardless of how it was framed.
Real-time unmanned aircraft systems surveillance video mosaicking using GPU
NASA Astrophysics Data System (ADS)
Camargo, Aldo; Anderson, Kyle; Wang, Yi; Schultz, Richard R.; Fevig, Ronald A.
2010-04-01
Digital video mosaicking from Unmanned Aircraft Systems (UAS) is being used for many military and civilian applications, including surveillance, target recognition, border protection, forest fire monitoring, traffic control on highways, monitoring of transmission lines, among others. Additionally, NASA is using digital video mosaicking to explore the moon and planets such as Mars. In order to compute a "good" mosaic from video captured by a UAS, the algorithm must deal with motion blur, frame-to-frame jitter associated with an imperfectly stabilized platform, perspective changes as the camera tilts in flight, as well as a number of other factors. The most suitable algorithms use SIFT (Scale-Invariant Feature Transform) to detect the features consistent between video frames. Utilizing these features, the next step is to estimate the homography between two consecutives video frames, perform warping to properly register the image data, and finally blend the video frames resulting in a seamless video mosaick. All this processing takes a great deal of resources of resources from the CPU, so it is almost impossible to compute a real time video mosaic on a single processor. Modern graphics processing units (GPUs) offer computational performance that far exceeds current CPU technology, allowing for real-time operation. This paper presents the development of a GPU-accelerated digital video mosaicking implementation and compares it with CPU performance. Our tests are based on two sets of real video captured by a small UAS aircraft; one video comes from Infrared (IR) and Electro-Optical (EO) cameras. Our results show that we can obtain a speed-up of more than 50 times using GPU technology, so real-time operation at a video capture of 30 frames per second is feasible.
Automated UAV-based mapping for airborne reconnaissance and video exploitation
NASA Astrophysics Data System (ADS)
Se, Stephen; Firoozfam, Pezhman; Goldstein, Norman; Wu, Linda; Dutkiewicz, Melanie; Pace, Paul; Naud, J. L. Pierre
2009-05-01
Airborne surveillance and reconnaissance are essential for successful military missions. Such capabilities are critical for force protection, situational awareness, mission planning, damage assessment and others. UAVs gather huge amount of video data but it is extremely labour-intensive for operators to analyse hours and hours of received data. At MDA, we have developed a suite of tools towards automated video exploitation including calibration, visualization, change detection and 3D reconstruction. The on-going work is to improve the robustness of these tools and automate the process as much as possible. Our calibration tool extracts and matches tie-points in the video frames incrementally to recover the camera calibration and poses, which are then refined by bundle adjustment. Our visualization tool stabilizes the video, expands its field-of-view and creates a geo-referenced mosaic from the video frames. It is important to identify anomalies in a scene, which may include detecting any improvised explosive devices (IED). However, it is tedious and difficult to compare video clips to look for differences manually. Our change detection tool allows the user to load two video clips taken from two passes at different times and flags any changes between them. 3D models are useful for situational awareness, as it is easier to understand the scene by visualizing it in 3D. Our 3D reconstruction tool creates calibrated photo-realistic 3D models from video clips taken from different viewpoints, using both semi-automated and automated approaches. The resulting 3D models also allow distance measurements and line-of- sight analysis.
Video error concealment using block matching and frequency selective extrapolation algorithms
NASA Astrophysics Data System (ADS)
P. K., Rajani; Khaparde, Arti
2017-06-01
Error Concealment (EC) is a technique at the decoder side to hide the transmission errors. It is done by analyzing the spatial or temporal information from available video frames. It is very important to recover distorted video because they are used for various applications such as video-telephone, video-conference, TV, DVD, internet video streaming, video games etc .Retransmission-based and resilient-based methods, are also used for error removal. But these methods add delay and redundant data. So error concealment is the best option for error hiding. In this paper, the error concealment methods such as Block Matching error concealment algorithm is compared with Frequency Selective Extrapolation algorithm. Both the works are based on concealment of manually error video frames as input. The parameter used for objective quality measurement was PSNR (Peak Signal to Noise Ratio) and SSIM(Structural Similarity Index). The original video frames along with error video frames are compared with both the Error concealment algorithms. According to simulation results, Frequency Selective Extrapolation is showing better quality measures such as 48% improved PSNR and 94% increased SSIM than Block Matching Algorithm.
Flexible video conference system based on ASICs and DSPs
NASA Astrophysics Data System (ADS)
Hu, Qiang; Yu, Songyu
1995-02-01
In this paper, a video conference system we developed recently is presented. In this system the video codec is compatible with CCITT H.261, the audio codec is compatible with G.711 and G.722, the channel interface circuit is designed according to CCITT H.221. In this paper emphasis is given to the video codec, which is both flexible and robust. The video codec is based on LSI LOGIC Corporation's L64700 series video compression chipset. The main function blocks of H.261, such as DCT, motion estimation, VLC, VLD, are performed by this chipset, but the chipset is a nude chipset, no peripheral function, such as memory interface, is integrated into it, this results in great difficulty to implement the system. To implement the frame buffer controller, a DSP-TMS 320c25 and a group of GALs is used, SRAM is used as a current and previous frame buffer, the DSP is not only the controller of the frame buffer, it's also the controller of the whole video codec. Because of the use of the DSP, the architecture of the video codec is very flexible, many system parameters can be reconfigured for different applications. The architecture of the whole video codec is a streamline structure. In H.261, BCH(511,493) coding is recommended to work against random errors in transmission, but if burst error occurs, it causes serious result. To solve this problem, an interleaving method is used, that means the BCH code is interleaved before it's transmitted, in the receiver it is interleaved again and the bit stream is in the original order, but the error bits are distributed into several BCH words, and the BCH decoder is able to correct it. Considering that extreme conditions may occur, a function block is implemented which is somewhat like a watchdog, it assures that the receiver can recover from errors no matter what serious error occurs in transmission. In developing the video conference system, a new synchronization problem must be solved, the monitor on the receiver can't be easily synchronized with the camera on another side, a new method is described in detail which can solve this problem successfully.
Video conference quality assessment based on cooperative sensing of video and audio
NASA Astrophysics Data System (ADS)
Wang, Junxi; Chen, Jialin; Tian, Xin; Zhou, Cheng; Zhou, Zheng; Ye, Lu
2015-12-01
This paper presents a method to video conference quality assessment, which is based on cooperative sensing of video and audio. In this method, a proposed video quality evaluation method is used to assess the video frame quality. The video frame is divided into noise image and filtered image by the bilateral filters. It is similar to the characteristic of human visual, which could also be seen as a low-pass filtering. The audio frames are evaluated by the PEAQ algorithm. The two results are integrated to evaluate the video conference quality. A video conference database is built to test the performance of the proposed method. It could be found that the objective results correlate well with MOS. Then we can conclude that the proposed method is efficiency in assessing video conference quality.
Motion-Blur-Free High-Speed Video Shooting Using a Resonant Mirror
Inoue, Michiaki; Gu, Qingyi; Takaki, Takeshi; Ishii, Idaku; Tajima, Kenji
2017-01-01
This study proposes a novel concept of actuator-driven frame-by-frame intermittent tracking for motion-blur-free video shooting of fast-moving objects. The camera frame and shutter timings are controlled for motion blur reduction in synchronization with a free-vibration-type actuator vibrating with a large amplitude at hundreds of hertz so that motion blur can be significantly reduced in free-viewpoint high-frame-rate video shooting for fast-moving objects by deriving the maximum performance of the actuator. We develop a prototype of a motion-blur-free video shooting system by implementing our frame-by-frame intermittent tracking algorithm on a high-speed video camera system with a resonant mirror vibrating at 750 Hz. It can capture 1024 × 1024 images of fast-moving objects at 750 fps with an exposure time of 0.33 ms without motion blur. Several experimental results for fast-moving objects verify that our proposed method can reduce image degradation from motion blur without decreasing the camera exposure time. PMID:29109385
Organ donation on Web 2.0: content and audience analysis of organ donation videos on YouTube.
Tian, Yan
2010-04-01
This study examines the content of and audience response to organ donation videos on YouTube, a Web 2.0 platform, with framing theory. Positive frames were identified in both video content and audience comments. Analysis revealed a reciprocity relationship between media frames and audience frames. Videos covered content categories such as kidney, liver, organ donation registration process, and youth. Videos were favorably rated. No significant differences were found between videos produced by organizations and individuals in the United States and those produced in other countries. The findings provide insight into how new communication technologies are shaping health communication in ways that differ from traditional media. The implications of Web 2.0, characterized by user-generated content and interactivity, for health communication and health campaign practice are discussed.
2001-01-24
The potential for investigating combustion at the limits of flammability, and the implications for spacecraft fire safety, led to the Structures Of Flame Balls At Low Lewis-number (SOFBALL) experiment flown twice aboard the Space Shuttle in 1997. The success there led to reflight on STS-107 Research 1 mission plarned for 2002. This image is a video frame which shows MSL-1 flameballs which are intrinsically dim, thus requiring the use of image intensifiers on video cameras. The principal investigator is Dr. Paul Ronney of the University of Southern California, Los Angeles. Glenn Research in Cleveland, OH, manages the project.
Temporal compressive imaging for video
NASA Astrophysics Data System (ADS)
Zhou, Qun; Zhang, Linxia; Ke, Jun
2018-01-01
In many situations, imagers are required to have higher imaging speed, such as gunpowder blasting analysis and observing high-speed biology phenomena. However, measuring high-speed video is a challenge to camera design, especially, in infrared spectrum. In this paper, we reconstruct a high-frame-rate video from compressive video measurements using temporal compressive imaging (TCI) with a temporal compression ratio T=8. This means that, 8 unique high-speed temporal frames will be obtained from a single compressive frame using a reconstruction algorithm. Equivalently, the video frame rates is increased by 8 times. Two methods, two-step iterative shrinkage/threshold (TwIST) algorithm and the Gaussian mixture model (GMM) method, are used for reconstruction. To reduce reconstruction time and memory usage, each frame of size 256×256 is divided into patches of size 8×8. The influence of different coded mask to reconstruction is discussed. The reconstruction qualities using TwIST and GMM are also compared.
Colonoscopy video quality assessment using hidden Markov random fields
NASA Astrophysics Data System (ADS)
Park, Sun Young; Sargent, Dusty; Spofford, Inbar; Vosburgh, Kirby
2011-03-01
With colonoscopy becoming a common procedure for individuals aged 50 or more who are at risk of developing colorectal cancer (CRC), colon video data is being accumulated at an ever increasing rate. However, the clinically valuable information contained in these videos is not being maximally exploited to improve patient care and accelerate the development of new screening methods. One of the well-known difficulties in colonoscopy video analysis is the abundance of frames with no diagnostic information. Approximately 40% - 50% of the frames in a colonoscopy video are contaminated by noise, acquisition errors, glare, blur, and uneven illumination. Therefore, filtering out low quality frames containing no diagnostic information can significantly improve the efficiency of colonoscopy video analysis. To address this challenge, we present a quality assessment algorithm to detect and remove low quality, uninformative frames. The goal of our algorithm is to discard low quality frames while retaining all diagnostically relevant information. Our algorithm is based on a hidden Markov model (HMM) in combination with two measures of data quality to filter out uninformative frames. Furthermore, we present a two-level framework based on an embedded hidden Markov model (EHHM) to incorporate the proposed quality assessment algorithm into a complete, automated diagnostic image analysis system for colonoscopy video.
Artificial Neural Network applied to lightning flashes
NASA Astrophysics Data System (ADS)
Gin, R. B.; Guedes, D.; Bianchi, R.
2013-05-01
The development of video cameras enabled cientists to study lightning discharges comportment with more precision. The main goal of this project is to create a system able to detect images of lightning discharges stored in videos and classify them using an Artificial Neural Network (ANN)using C Language and OpenCV libraries. The developed system, can be split in two different modules: detection module and classification module. The detection module uses OpenCV`s computer vision libraries and image processing techniques to detect if there are significant differences between frames in a sequence, indicating that something, still not classified, occurred. Whenever there is a significant difference between two consecutive frames, two main algorithms are used to analyze the frame image: brightness and shape algorithms. These algorithms detect both shape and brightness of the event, removing irrelevant events like birds, as well as detecting the relevant events exact position, allowing the system to track it over time. The classification module uses a neural network to classify the relevant events as horizontal or vertical lightning, save the event`s images and calculates his number of discharges. The Neural Network was implemented using the backpropagation algorithm, and was trained with 42 training images , containing 57 lightning events (one image can have more than one lightning). TheANN was tested with one to five hidden layers, with up to 50 neurons each. The best configuration achieved a success rate of 95%, with one layer containing 20 neurons (33 test images with 42 events were used in this phase). This configuration was implemented in the developed system to analyze 20 video files, containing 63 lightning discharges previously manually detected. Results showed that all the lightning discharges were detected, many irrelevant events were unconsidered, and the event's number of discharges was correctly computed. The neural network used in this project achieved a success rate of 90%. The videos used in this experiment were acquired by seven video cameras installed in São Bernardo do Campo, Brazil, that continuously recorded lightning events during the summer. The cameras were disposed in a 360 loop, recording all data at a time resolution of 33ms. During this period, several convective storms were recorded.
Automatic video summarization driven by a spatio-temporal attention model
NASA Astrophysics Data System (ADS)
Barland, R.; Saadane, A.
2008-02-01
According to the literature, automatic video summarization techniques can be classified in two parts, following the output nature: "video skims", which are generated using portions of the original video and "key-frame sets", which correspond to the images, selected from the original video, having a significant semantic content. The difference between these two categories is reduced when we consider automatic procedures. Most of the published approaches are based on the image signal and use either pixel characterization or histogram techniques or image decomposition by blocks. However, few of them integrate properties of the Human Visual System (HVS). In this paper, we propose to extract keyframes for video summarization by studying the variations of salient information between two consecutive frames. For each frame, a saliency map is produced simulating the human visual attention by a bottom-up (signal-dependent) approach. This approach includes three parallel channels for processing three early visual features: intensity, color and temporal contrasts. For each channel, the variations of the salient information between two consecutive frames are computed. These outputs are then combined to produce the global saliency variation which determines the key-frames. Psychophysical experiments have been defined and conducted to analyze the relevance of the proposed key-frame extraction algorithm.
NASA Technical Reports Server (NTRS)
Ziemke, Robert A.
1990-01-01
The objective of the High Resolution, High Frame Rate Video Technology (HHVT) development effort is to provide technology advancements to remove constraints on the amount of high speed, detailed optical data recorded and transmitted for microgravity science and application experiments. These advancements will enable the development of video systems capable of high resolution, high frame rate video data recording, processing, and transmission. Techniques such as multichannel image scan, video parameter tradeoff, and the use of dual recording media were identified as methods of making the most efficient use of the near-term technology.
WCE video segmentation using textons
NASA Astrophysics Data System (ADS)
Gallo, Giovanni; Granata, Eliana
2010-03-01
Wireless Capsule Endoscopy (WCE) integrates wireless transmission with image and video technology. It has been used to examine the small intestine non invasively. Medical specialists look for signicative events in the WCE video by direct visual inspection manually labelling, in tiring and up to one hour long sessions, clinical relevant frames. This limits the WCE usage. To automatically discriminate digestive organs such as esophagus, stomach, small intestine and colon is of great advantage. In this paper we propose to use textons for the automatic discrimination of abrupt changes within a video. In particular, we consider, as features, for each frame hue, saturation, value, high-frequency energy content and the responses to a bank of Gabor filters. The experiments have been conducted on ten video segments extracted from WCE videos, in which the signicative events have been previously labelled by experts. Results have shown that the proposed method may eliminate up to 70% of the frames from further investigations. The direct analysis of the doctors may hence be concentrated only on eventful frames. A graphical tool showing sudden changes in the textons frequencies for each frame is also proposed as a visual aid to find clinically relevant segments of the video.
Spatial resampling of IDR frames for low bitrate video coding with HEVC
NASA Astrophysics Data System (ADS)
Hosking, Brett; Agrafiotis, Dimitris; Bull, David; Easton, Nick
2015-03-01
As the demand for higher quality and higher resolution video increases, many applications fail to meet this demand due to low bandwidth restrictions. One factor contributing to this problem is the high bitrate requirement of the intra-coded Instantaneous Decoding Refresh (IDR) frames featuring in all video coding standards. Frequent coding of IDR frames is essential for error resilience in order to prevent the occurrence of error propagation. However, as each one consumes a huge portion of the available bitrate, the quality of future coded frames is hindered by high levels of compression. This work presents a new technique, known as Spatial Resampling of IDR Frames (SRIF), and shows how it can increase the rate distortion performance by providing a higher and more consistent level of video quality at low bitrates.
Updegraff, John A; Brick, Cameron; Emanuel, Amber S; Mintzer, Roy E; Sherman, David K
2015-01-01
The present study examined how gain- and loss-framed informational videos about oral health influence self-reported flossing behavior over a 6-month period, as well as the roles of perceived susceptibility to oral health problems and approach/avoidance motivational orientation in moderating these effects. An age and ethnically diverse sample of 855 American adults were randomized to receive no health message, or either a gain-framed or loss-framed video presented on the Internet. Self-reported flossing was assessed longitudinally at 2 and 6 months. Among the entire sample, susceptibility interacted with frame to predict flossing. Participants who watched a video where the frame (gain/loss) matched perceived susceptibility (low/high) had significantly greater likelihood of flossing at recommended levels at the 6-month follow-up, compared with those who viewed a mismatched video or no video at all. However, young adults (18-24) showed stronger moderation by motivational orientation than by perceived susceptibility, in line with previous work largely conducted with young adult samples. Brief informational interventions can influence long-term health behavior, particularly when the gain- or loss-frame of the information matches the recipient's beliefs about their health outcome risks.
Blurry-frame detection and shot segmentation in colonoscopy videos
NASA Astrophysics Data System (ADS)
Oh, JungHwan; Hwang, Sae; Tavanapong, Wallapak; de Groen, Piet C.; Wong, Johnny
2003-12-01
Colonoscopy is an important screening procedure for colorectal cancer. During this procedure, the endoscopist visually inspects the colon. Human inspection, however, is not without error. We hypothesize that colonoscopy videos may contain additional valuable information missed by the endoscopist. Video segmentation is the first necessary step for the content-based video analysis and retrieval to provide efficient access to the important images and video segments from a large colonoscopy video database. Based on the unique characteristics of colonoscopy videos, we introduce a new scheme to detect and remove blurry frames, and segment the videos into shots based on the contents. Our experimental results show that the average precision and recall of the proposed scheme are over 90% for the detection of non-blurry images. The proposed method of blurry frame detection and shot segmentation is extensible to the videos captured from other endoscopic procedures such as upper gastrointestinal endoscopy, enteroscopy, cystoscopy, and laparoscopy.
Automatic treatment of flight test images using modern tools: SAAB and Aeritalia joint approach
NASA Astrophysics Data System (ADS)
Kaelldahl, A.; Duranti, P.
The use of onboard cine cameras, as well as that of on ground cinetheodolites, is very popular in flight tests. The high resolution of film and the high frame rate of cinecameras are still not exceeded by video technology. Video technology can successfully enter the flight test scenario once the availability of solid-state optical sensors dramatically reduces the dimensions, and weight of TV cameras, thus allowing to locate them in positions compatible with space or operational limitations (e.g., HUD cameras). A proper combination of cine and video cameras is the typical solution for a complex flight test program. The output of such devices is very helpful in many flight areas. Several sucessful applications of this technology are summarized. Analysis of the large amount of data produced (frames of images) requires a very long time. The analysis is normally carried out manually. In order to improve the situation, in the last few years, several flight test centers have devoted their attention to possible techniques which allow for quicker and more effective image treatment.
Finnish Mathematics Teaching from a Reform Perspective: A Video-Based Case-Study Analysis
ERIC Educational Resources Information Center
Andrews, Paul
2013-01-01
This article offers a qualitative analysis of videotaped mathematics lessons taught by four teachers in a provincial university city in Finland. My study is framed not only by Finnish success on Programme for International Student Assessment (PISA) but also by the objectives of current mathematics education reform, which are consistent with PISA's…
Network-based H.264/AVC whole frame loss visibility model and frame dropping methods.
Chang, Yueh-Lun; Lin, Ting-Lan; Cosman, Pamela C
2012-08-01
We examine the visual effect of whole frame loss by different decoders. Whole frame losses are introduced in H.264/AVC compressed videos which are then decoded by two different decoders with different common concealment effects: frame copy and frame interpolation. The videos are seen by human observers who respond to each glitch they spot. We found that about 39% of whole frame losses of B frames are not observed by any of the subjects, and over 58% of the B frame losses are observed by 20% or fewer of the subjects. Using simple predictive features which can be calculated inside a network node with no access to the original video and no pixel level reconstruction of the frame, we developed models which can predict the visibility of whole B frame losses. The models are then used in a router to predict the visual impact of a frame loss and perform intelligent frame dropping to relieve network congestion. Dropping frames based on their visual scores proves superior to random dropping of B frames.
The comparison and analysis of extracting video key frame
NASA Astrophysics Data System (ADS)
Ouyang, S. Z.; Zhong, L.; Luo, R. Q.
2018-05-01
Video key frame extraction is an important part of the large data processing. Based on the previous work in key frame extraction, we summarized four important key frame extraction algorithms, and these methods are largely developed by comparing the differences between each of two frames. If the difference exceeds a threshold value, take the corresponding frame as two different keyframes. After the research, the key frame extraction based on the amount of mutual trust is proposed, the introduction of information entropy, by selecting the appropriate threshold values into the initial class, and finally take a similar mean mutual information as a candidate key frame. On this paper, several algorithms is used to extract the key frame of tunnel traffic videos. Then, with the analysis to the experimental results and comparisons between the pros and cons of these algorithms, the basis of practical applications is well provided.
Reduction of capsule endoscopy reading times by unsupervised image mining.
Iakovidis, D K; Tsevas, S; Polydorou, A
2010-09-01
The screening of the small intestine has become painless and easy with wireless capsule endoscopy (WCE) that is a revolutionary, relatively non-invasive imaging technique performed by a wireless swallowable endoscopic capsule transmitting thousands of video frames per examination. The average time required for the visual inspection of a full 8-h WCE video ranges from 45 to 120min, depending on the experience of the examiner. In this paper, we propose a novel approach to WCE reading time reduction by unsupervised mining of video frames. The proposed methodology is based on a data reduction algorithm which is applied according to a novel scheme for the extraction of representative video frames from a full length WCE video. It can be used either as a video summarization or as a video bookmarking tool, providing the comparative advantage of being general, unbounded by the finiteness of a training set. The number of frames extracted is controlled by a parameter that can be tuned automatically. Comprehensive experiments on real WCE videos indicate that a significant reduction in the reading times is feasible. In the case of the WCE videos used this reduction reached 85% without any loss of abnormalities.
Development of a video tampering dataset for forensic investigation.
Ismael Al-Sanjary, Omar; Ahmed, Ahmed Abdullah; Sulong, Ghazali
2016-09-01
Forgery is an act of modifying a document, product, image or video, among other media. Video tampering detection research requires an inclusive database of video modification. This paper aims to discuss a comprehensive proposal to create a dataset composed of modified videos for forensic investigation, in order to standardize existing techniques for detecting video tampering. The primary purpose of developing and designing this new video library is for usage in video forensics, which can be consciously associated with reliable verification using dynamic and static camera recognition. To the best of the author's knowledge, there exists no similar library among the research community. Videos were sourced from YouTube and by exploring social networking sites extensively by observing posted videos and rating their feedback. The video tampering dataset (VTD) comprises a total of 33 videos, divided among three categories in video tampering: (1) copy-move, (2) splicing, and (3) swapping-frames. Compared to existing datasets, this is a higher number of tampered videos, and with longer durations. The duration of every video is 16s, with a 1280×720 resolution, and a frame rate of 30 frames per second. Moreover, all videos possess the same formatting quality (720p(HD).avi). Both temporal and spatial video features were considered carefully during selection of the videos, and there exists complete information related to the doctored regions in every modified video in the VTD dataset. This database has been made publically available for research on splicing, Swapping frames, and copy-move tampering, and, as such, various video tampering detection issues with ground truth. The database has been utilised by many international researchers and groups of researchers. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Shot boundary detection and label propagation for spatio-temporal video segmentation
NASA Astrophysics Data System (ADS)
Piramanayagam, Sankaranaryanan; Saber, Eli; Cahill, Nathan D.; Messinger, David
2015-02-01
This paper proposes a two stage algorithm for streaming video segmentation. In the first stage, shot boundaries are detected within a window of frames by comparing dissimilarity between 2-D segmentations of each frame. In the second stage, the 2-D segments are propagated across the window of frames in both spatial and temporal direction. The window is moved across the video to find all shot transitions and obtain spatio-temporal segments simultaneously. As opposed to techniques that operate on entire video, the proposed approach consumes significantly less memory and enables segmentation of lengthy videos. We tested our segmentation based shot detection method on the TRECVID 2007 video dataset and compared it with block-based technique. Cut detection results on the TRECVID 2007 dataset indicate that our algorithm has comparable results to the best of the block-based methods. The streaming video segmentation routine also achieves promising results on a challenging video segmentation benchmark database.
Multiple Sensor Camera for Enhanced Video Capturing
NASA Astrophysics Data System (ADS)
Nagahara, Hajime; Kanki, Yoshinori; Iwai, Yoshio; Yachida, Masahiko
A resolution of camera has been drastically improved under a current request for high-quality digital images. For example, digital still camera has several mega pixels. Although a video camera has the higher frame-rate, the resolution of a video camera is lower than that of still camera. Thus, the high-resolution is incompatible with the high frame rate of ordinary cameras in market. It is difficult to solve this problem by a single sensor, since it comes from physical limitation of the pixel transfer rate. In this paper, we propose a multi-sensor camera for capturing a resolution and frame-rate enhanced video. Common multi-CCDs camera, such as 3CCD color camera, has same CCD for capturing different spectral information. Our approach is to use different spatio-temporal resolution sensors in a single camera cabinet for capturing higher resolution and frame-rate information separately. We build a prototype camera which can capture high-resolution (2588×1958 pixels, 3.75 fps) and high frame-rate (500×500, 90 fps) videos. We also proposed the calibration method for the camera. As one of the application of the camera, we demonstrate an enhanced video (2128×1952 pixels, 90 fps) generated from the captured videos for showing the utility of the camera.
Synchronization of video recording and laser pulses including background light suppression
NASA Technical Reports Server (NTRS)
Kalshoven, Jr., James E. (Inventor); Tierney, Jr., Michael (Inventor); Dabney, Philip W. (Inventor)
2004-01-01
An apparatus for and a method of triggering a pulsed light source, in particular a laser light source, for predictable capture of the source by video equipment. A frame synchronization signal is derived from the video signal of a camera to trigger the laser and position the resulting laser light pulse in the appropriate field of the video frame and during the opening of the electronic shutter, if such shutter is included in the camera. Positioning of the laser pulse in the proper video field allows, after recording, for the viewing of the laser light image with a video monitor using the pause mode on a standard cassette-type VCR. This invention also allows for fine positioning of the laser pulse to fall within the electronic shutter opening. For cameras with externally controllable electronic shutters, the invention provides for background light suppression by increasing shutter speed during the frame in which the laser light image is captured. This results in the laser light appearing in one frame in which the background scene is suppressed with the laser light being uneffected, while in all other frames, the shutter speed is slower, allowing for the normal recording of the background scene. This invention also allows for arbitrary (manual or external) triggering of the laser with full video synchronization and background light suppression.
Marcos, Ma Shiela Angeli; David, Laura; Peñaflor, Eileen; Ticzon, Victor; Soriano, Maricor
2008-10-01
We introduce an automated benthic counting system in application for rapid reef assessment that utilizes computer vision on subsurface underwater reef video. Video acquisition was executed by lowering a submersible bullet-type camera from a motor boat while moving across the reef area. A GPS and echo sounder were linked to the video recorder to record bathymetry and location points. Analysis of living and non-living components was implemented through image color and texture feature extraction from the reef video frames and classification via Linear Discriminant Analysis. Compared to common rapid reef assessment protocols, our system can perform fine scale data acquisition and processing in one day. Reef video was acquired in Ngedarrak Reef, Koror, Republic of Palau. Overall success performance ranges from 60% to 77% for depths of 1 to 3 m. The development of an automated rapid reef classification system is most promising for reef studies that need fast and frequent data acquisition of percent cover of living and nonliving components.
2001-01-24
The potential for investigating combustion at the limits of flammability, and the implications for spacecraft fire safety, led to the Structures Of Flame Balls At Low Lewis-number (SOFBALL) experiment flown twice aboard the Space Shuttle in 1997. The success there led to on STS-107 Research 1 mission plarned for 2002. Shown here are video frames captured during the Microgravity Sciences Lab-1 mission in 1997. Flameballs are intrinsically dim, thus requiring the use of image intensifiers on video cameras. The principal investigator is Dr. Paul Ronney of the University of Southern California, Los Angeles. Glenn Research in Cleveland, OH, manages the project.
Two-Stream Transformer Networks for Video-based Face Alignment.
Liu, Hao; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie
2017-08-01
In this paper, we propose a two-stream transformer networks (TSTN) approach for video-based face alignment. Unlike conventional image-based face alignment approaches which cannot explicitly model the temporal dependency in videos and motivated by the fact that consistent movements of facial landmarks usually occur across consecutive frames, our TSTN aims to capture the complementary information of both the spatial appearance on still frames and the temporal consistency information across frames. To achieve this, we develop a two-stream architecture, which decomposes the video-based face alignment into spatial and temporal streams accordingly. Specifically, the spatial stream aims to transform the facial image to the landmark positions by preserving the holistic facial shape structure. Accordingly, the temporal stream encodes the video input as active appearance codes, where the temporal consistency information across frames is captured to help shape refinements. Experimental results on the benchmarking video-based face alignment datasets show very competitive performance of our method in comparisons to the state-of-the-arts.
Updegraff, John A.; Brick, Cameron; Emanuel, Amber S.; Mintzer, Roy E.; Sherman, David K.
2015-01-01
Objective The present study examined how gain- and loss-framed informational videos about oral health influence self-reported flossing behavior over a 6-month period, as well as the roles of perceived susceptibility to oral health problems and approach/avoidance motivational orientation in moderating these effects. Method An age and ethnically diverse sample of 855 American adults were randomized to receive no health message, or either a gain-framed or loss-framed video presented on the Internet. Self-reported flossing was assessed longitudinally at 2 and 6 months. Results Among the entire sample, susceptibility interacted with frame to predict flossing. Participants who watched a video where the frame (gain/loss) matched perceived susceptibility (low/high) had significantly greater likelihood of flossing at recommended levels at the 6-month follow-up, compared with those who viewed a mismatched video or no video at all. However, young adults (18–24) showed stronger moderation by motivational orientation than by perceived susceptibility, in line with previous work largely conducted with young adult samples. Conclusion Brief informational interventions can influence long-term health behavior, particularly when the gain- or loss-frame of the information matches the recipient’s beliefs about their health outcome risks. PMID:25020153
Miniaturized video-rate epi-third-harmonic-generation fiber-microscope.
Chia, Shih-Hsuan; Yu, Che-Hang; Lin, Chih-Han; Cheng, Nai-Chia; Liu, Tzu-Ming; Chan, Ming-Che; Chen, I-Hsiu; Sun, Chi-Kuang
2010-08-02
With a micro-electro-mechanical system (MEMS) mirror, we successfully developed a miniaturized epi-third-harmonic-generation (epi-THG) fiber-microscope with a video frame rate (31 Hz), which was designed for in vivo optical biopsy of human skin. With a large-mode-area (LMA) photonic crystal fiber (PCF) and a regular microscopic objective, the nonlinear distortion of the ultrafast pulses delivery could be much reduced while still achieving a 0.4 microm lateral resolution for epi-THG signals. In vivo real time virtual biopsy of the Asian skin with a video rate (31 Hz) and a sub-micron resolution was obtained. The result indicates that this miniaturized system was compact enough for the least invasive hand-held clinical use.
NASA Astrophysics Data System (ADS)
Nightingale, James; Wang, Qi; Grecos, Christos
2015-02-01
In recent years video traffic has become the dominant application on the Internet with global year-on-year increases in video-oriented consumer services. Driven by improved bandwidth in both mobile and fixed networks, steadily reducing hardware costs and the development of new technologies, many existing and new classes of commercial and industrial video applications are now being upgraded or emerging. Some of the use cases for these applications include areas such as public and private security monitoring for loss prevention or intruder detection, industrial process monitoring and critical infrastructure monitoring. The use of video is becoming commonplace in defence, security, commercial, industrial, educational and health contexts. Towards optimal performances, the design or optimisation in each of these applications should be context aware and task oriented with the characteristics of the video stream (frame rate, spatial resolution, bandwidth etc.) chosen to match the use case requirements. For example, in the security domain, a task-oriented consideration may be that higher resolution video would be required to identify an intruder than to simply detect his presence. Whilst in the same case, contextual factors such as the requirement to transmit over a resource-limited wireless link, may impose constraints on the selection of optimum task-oriented parameters. This paper presents a novel, conceptually simple and easily implemented method of assessing video quality relative to its suitability for a particular task and dynamically adapting videos streams during transmission to ensure that the task can be successfully completed. Firstly we defined two principle classes of tasks: recognition tasks and event detection tasks. These task classes are further subdivided into a set of task-related profiles, each of which is associated with a set of taskoriented attributes (minimum spatial resolution, minimum frame rate etc.). For example, in the detection class, profiles for intruder detection will require different temporal characteristics (frame rate) from those used for detection of high motion objects such as vehicles or aircrafts. We also define a set of contextual attributes that are associated with each instance of a running application that include resource constraints imposed by the transmission system employed and the hardware platforms used as source and destination of the video stream. Empirical results are presented and analysed to demonstrate the advantages of the proposed schemes.
NASA Technical Reports Server (NTRS)
2004-01-01
Ever wonder whether a still shot from a home video could serve as a "picture perfect" photograph worthy of being framed and proudly displayed on the mantle? Wonder no more. A critical imaging code used to enhance video footage taken from spaceborne imaging instruments is now available within a portable photography tool capable of producing an optimized, high-resolution image from multiple video frames.
Smartphone based automatic organ validation in ultrasound video.
Vaish, Pallavi; Bharath, R; Rajalakshmi, P
2017-07-01
Telesonography involves transmission of ultrasound video from remote areas to the doctors for getting diagnosis. Due to the lack of trained sonographers in remote areas, the ultrasound videos scanned by these untrained persons do not contain the proper information that is required by a physician. As compared to standard methods for video transmission, mHealth driven systems need to be developed for transmitting valid medical videos. To overcome this problem, we are proposing an organ validation algorithm to evaluate the ultrasound video based on the content present. This will guide the semi skilled person to acquire the representative data from patient. Advancement in smartphone technology allows us to perform high medical image processing on smartphone. In this paper we have developed an Application (APP) for a smartphone which can automatically detect the valid frames (which consist of clear organ visibility) in an ultrasound video and ignores the invalid frames (which consist of no-organ visibility), and produces a compressed sized video. This is done by extracting the GIST features from the Region of Interest (ROI) of the frame and then classifying the frame using SVM classifier with quadratic kernel. The developed application resulted with the accuracy of 94.93% in classifying valid and invalid images.
Research on compression performance of ultrahigh-definition videos
NASA Astrophysics Data System (ADS)
Li, Xiangqun; He, Xiaohai; Qing, Linbo; Tao, Qingchuan; Wu, Di
2017-11-01
With the popularization of high-definition (HD) images and videos (1920×1080 pixels and above), there are even 4K (3840×2160) television signals and 8 K (8192×4320) ultrahigh-definition videos. The demand for HD images and videos is increasing continuously, along with the increasing data volume. The storage and transmission cannot be properly solved only by virtue of the expansion capacity of hard disks and the update and improvement of transmission devices. Based on the full use of the coding standard high-efficiency video coding (HEVC), super-resolution reconstruction technology, and the correlation between the intra- and the interprediction, we first put forward a "division-compensation"-based strategy to further improve the compression performance of a single image and frame I. Then, by making use of the above thought and HEVC encoder and decoder, a video compression coding frame is designed. HEVC is used inside the frame. Last, with the super-resolution reconstruction technology, the reconstructed video quality is further improved. The experiment shows that by the proposed compression method for a single image (frame I) and video sequence here, the performance is superior to that of HEVC in a low bit rate environment.
Does framing the hot hand belief change decision-making behavior in volleyball?
Raab, Markus; MacMahon, Clare
2015-06-01
Previous discussions of the hot hand belief, wherein athletes believe that they have a greater chance of scoring after 2 or 3 hits (successes) compared with 2 or 3 misses, have focused on whether this is the case within game statistics. Researchers have argued that the perception of the hot hand in random sequences is a bias of the cognitive system. Yet most have failed to explore the impact of framing on the stability of the belief and the behavior based on it. The authors conducted 2 studies that manipulated the frame of a judgment task. In Study 1, framing was manipulated via instructions in a playmaker allocation paradigm in volleyball. In Study 2, the frame was manipulated by presenting videos for allocation decisions from either the actor or observer perspective. Both manipulations changed the hot hand belief and sequential choices. We found in both studies that the belief in continuation of positive or negative streaks is nonlinear and allocations to the same player after 3 successive hits are reduced. The authors argue that neither the hot hand belief nor hot hand behavior is stable, but rather, both are sensitive to decision frames. The results can inform coaches on the importance of how to provide information to athletes.
Bi, Sheng; Zeng, Xiao; Tang, Xin; Qin, Shujia; Lai, King Wai Chiu
2016-01-01
Compressive sensing (CS) theory has opened up new paths for the development of signal processing applications. Based on this theory, a novel single pixel camera architecture has been introduced to overcome the current limitations and challenges of traditional focal plane arrays. However, video quality based on this method is limited by existing acquisition and recovery methods, and the method also suffers from being time-consuming. In this paper, a multi-frame motion estimation algorithm is proposed in CS video to enhance the video quality. The proposed algorithm uses multiple frames to implement motion estimation. Experimental results show that using multi-frame motion estimation can improve the quality of recovered videos. To further reduce the motion estimation time, a block match algorithm is used to process motion estimation. Experiments demonstrate that using the block match algorithm can reduce motion estimation time by 30%. PMID:26950127
Tiny videos: a large data set for nonparametric video retrieval and frame classification.
Karpenko, Alexandre; Aarabi, Parham
2011-03-01
In this paper, we present a large database of over 50,000 user-labeled videos collected from YouTube. We develop a compact representation called "tiny videos" that achieves high video compression rates while retaining the overall visual appearance of the video as it varies over time. We show that frame sampling using affinity propagation-an exemplar-based clustering algorithm-achieves the best trade-off between compression and video recall. We use this large collection of user-labeled videos in conjunction with simple data mining techniques to perform related video retrieval, as well as classification of images and video frames. The classification results achieved by tiny videos are compared with the tiny images framework [24] for a variety of recognition tasks. The tiny images data set consists of 80 million images collected from the Internet. These are the largest labeled research data sets of videos and images available to date. We show that tiny videos are better suited for classifying scenery and sports activities, while tiny images perform better at recognizing objects. Furthermore, we demonstrate that combining the tiny images and tiny videos data sets improves classification precision in a wider range of categories.
Schemes for efficient transmission of encoded video streams on high-speed networks
NASA Astrophysics Data System (ADS)
Ramanathan, Srinivas; Vin, Harrick M.; Rangan, P. Venkat
1994-04-01
In this paper, we argue that significant performance benefits can accrue if integrated networks implement application-specific mechanisms that account for the diversities in media compression schemes. Towards this end, we propose a simple, yet effective, strategy called Frame Induced Packet Discarding (FIPD), in which, upon detection of loss of a threshold number (determined by an application's video encoding scheme) of packets belonging to a video frame, the network attempts to discard all the remaining packets of that frame. In order to analytically quantify the performance of FIPD so as to obtain fractional frame losses that can be guaranteed to video channels, we develop a finite state, discrete time markov chain model of the FIPD strategy. The fractional frame loss thus computed can serve as the criterion for admission control at the network. Performance evaluations demonstrate the utility of the FIPD strategy.
Toll, Benjamin A.; O’Malley, Stephanie S.; Katulak, Nicole A.; Wu, Ran; Dubin, Joel A.; Latimer, Amy; Meandzija, Boris; George, Tony P.; Jatlow, Peter; Cooney, Judith L.; Salovey, Peter
2008-01-01
Prospect theory suggests that because smoking cessation is a prevention behavior with a fairly certain outcome, gain-framed messages will be more persuasive than loss-framed messages when attempting to encourage smoking cessation. To test this hypothesis, the authors randomly assigned participants (N = 258) in a clinical trial to either a gain- or loss-framed condition, in which they received factually equivalent video and printed messages encouraging smoking cessation that emphasized either the benefits of quitting (gains) or the costs of continuing to smoke (losses), respectively. All participants received open label sustained-release bupropion (300 mg/day) for 7 weeks. In the intent-to-treat analysis, the difference between the experimental groups by either point prevalence or continuous abstinence was not statistically significant. Among 170 treatment completers, however, a significantly higher proportion of participants were continuously abstinent in the gain-framed condition as compared with the loss-framed condition. These data suggest that gain-framed messages may be more persuasive than loss-framed messages in promoting early success in smoking cessation for participants who are engaged in treatment. PMID:18072836
NASA Technical Reports Server (NTRS)
2000-01-01
Video Pics is a software program that generates high-quality photos from video. The software was developed under an SBIR contract with Marshall Space Flight Center by Redhawk Vision, Inc.--a subsidiary of Irvine Sensors Corporation. Video Pics takes information content from multiple frames of video and enhances the resolution of a selected frame. The resulting image has enhanced sharpness and clarity like that of a 35 mm photo. The images are generated as digital files and are compatible with image editing software.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Teuton, Jeremy R.; Griswold, Richard L.; Mehdi, Beata L.
Precise analysis of both (S)TEM images and video are time and labor intensive processes. As an example, determining when crystal growth and shrinkage occurs during the dynamic process of Li dendrite deposition and stripping involves manually scanning through each frame in the video to extract a specific set of frames/images. For large numbers of images, this process can be very time consuming, so a fast and accurate automated method is desirable. Given this need, we developed software that uses analysis of video compression statistics for detecting and characterizing events in large data sets. This software works by converting the datamore » into a series of images which it compresses into an MPEG-2 video using the open source “avconv” utility [1]. The software does not use the video itself, but rather analyzes the video statistics from the first pass of the video encoding that avconv records in the log file. This file contains statistics for each frame of the video including the frame quality, intra-texture and predicted texture bits, forward and backward motion vector resolution, among others. In all, avconv records 15 statistics for each frame. By combining different statistics, we have been able to detect events in various types of data. We have developed an interactive tool for exploring the data and the statistics that aids the analyst in selecting useful statistics for each analysis. Going forward, an algorithm for detecting and possibly describing events automatically can be written based on statistic(s) for each data type.« less
Scalable gastroscopic video summarization via similar-inhibition dictionary selection.
Wang, Shuai; Cong, Yang; Cao, Jun; Yang, Yunsheng; Tang, Yandong; Zhao, Huaici; Yu, Haibin
2016-01-01
This paper aims at developing an automated gastroscopic video summarization algorithm to assist clinicians to more effectively go through the abnormal contents of the video. To select the most representative frames from the original video sequence, we formulate the problem of gastroscopic video summarization as a dictionary selection issue. Different from the traditional dictionary selection methods, which take into account only the number and reconstruction ability of selected key frames, our model introduces the similar-inhibition constraint to reinforce the diversity of selected key frames. We calculate the attention cost by merging both gaze and content change into a prior cue to help select the frames with more high-level semantic information. Moreover, we adopt an image quality evaluation process to eliminate the interference of the poor quality images and a segmentation process to reduce the computational complexity. For experiments, we build a new gastroscopic video dataset captured from 30 volunteers with more than 400k images and compare our method with the state-of-the-arts using the content consistency, index consistency and content-index consistency with the ground truth. Compared with all competitors, our method obtains the best results in 23 of 30 videos evaluated based on content consistency, 24 of 30 videos evaluated based on index consistency and all videos evaluated based on content-index consistency. For gastroscopic video summarization, we propose an automated annotation method via similar-inhibition dictionary selection. Our model can achieve better performance compared with other state-of-the-art models and supplies more suitable key frames for diagnosis. The developed algorithm can be automatically adapted to various real applications, such as the training of young clinicians, computer-aided diagnosis or medical report generation. Copyright © 2015 Elsevier B.V. All rights reserved.
Software for Photometric and Astrometric Reduction of Video Meteors
NASA Astrophysics Data System (ADS)
Atreya, Prakash; Christou, Apostolos
2007-12-01
SPARVM is a Software for Photometric and Astrometric Reduction of Video Meteors being developed at Armagh Observatory. It is written in Interactive Data Language (IDL) and is designed to run primarily under Linux platform. The basic features of the software will be derivation of light curves, estimation of angular velocity and radiant position for single station data. For double station data, calculation of 3D coordinates of meteors, velocity, brightness, and estimation of meteoroid's orbit including uncertainties. Currently, the software supports extraction of time and date from video frames, estimation of position of cameras (Azimuth, Altitude), finding stellar sources in video frames and transformation of coordinates from video, frames to Horizontal coordinate system (Azimuth, Altitude), and Equatorial coordinate system (RA, Dec).
NASA Astrophysics Data System (ADS)
Bulan, Orhan; Bernal, Edgar A.; Loce, Robert P.; Wu, Wencheng
2013-03-01
Video cameras are widely deployed along city streets, interstate highways, traffic lights, stop signs and toll booths by entities that perform traffic monitoring and law enforcement. The videos captured by these cameras are typically compressed and stored in large databases. Performing a rapid search for a specific vehicle within a large database of compressed videos is often required and can be a time-critical life or death situation. In this paper, we propose video compression and decompression algorithms that enable fast and efficient vehicle or, more generally, event searches in large video databases. The proposed algorithm selects reference frames (i.e., I-frames) based on a vehicle having been detected at a specified position within the scene being monitored while compressing a video sequence. A search for a specific vehicle in the compressed video stream is performed across the reference frames only, which does not require decompression of the full video sequence as in traditional search algorithms. Our experimental results on videos captured in a local road show that the proposed algorithm significantly reduces the search space (thus reducing time and computational resources) in vehicle search tasks within compressed video streams, particularly those captured in light traffic volume conditions.
Hardware accelerator design for tracking in smart camera
NASA Astrophysics Data System (ADS)
Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Vohra, Anil
2011-10-01
Smart Cameras are important components in video analysis. For video analysis, smart cameras needs to detect interesting moving objects, track such objects from frame to frame, and perform analysis of object track in real time. Therefore, the use of real-time tracking is prominent in smart cameras. The software implementation of tracking algorithm on a general purpose processor (like PowerPC) could achieve low frame rate far from real-time requirements. This paper presents the SIMD approach based hardware accelerator designed for real-time tracking of objects in a scene. The system is designed and simulated using VHDL and implemented on Xilinx XUP Virtex-IIPro FPGA. Resulted frame rate is 30 frames per second for 250x200 resolution video in gray scale.
Simulation and Real-Time Verification of Video Algorithms on the TI C6400 Using Simulink
2004-08-20
SPONSOR/MONITOR’S ACRONYM(S) 11. SPONSOR/MONITOR’S REPORT NUMBER(S) 12 . DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release...plot estimates over time (scrolling data) Adjust detection threshold (click mouse on graph) Monitor video capture Input video frames Captured frames 12 ...Video App: Surveillance Recording 1 2 7 3 4 9 5 6 11 SL for video Explanation of GUI 12 Target Options8 Build Process 10 13 14 15 16 M-code snippet
Aerial video mosaicking using binary feature tracking
NASA Astrophysics Data System (ADS)
Minnehan, Breton; Savakis, Andreas
2015-05-01
Unmanned Aerial Vehicles are becoming an increasingly attractive platform for many applications, as their cost decreases and their capabilities increase. Creating detailed maps from aerial data requires fast and accurate video mosaicking methods. Traditional mosaicking techniques rely on inter-frame homography estimations that are cascaded through the video sequence. Computationally expensive keypoint matching algorithms are often used to determine the correspondence of keypoints between frames. This paper presents a video mosaicking method that uses an object tracking approach for matching keypoints between frames to improve both efficiency and robustness. The proposed tracking method matches local binary descriptors between frames and leverages the spatial locality of the keypoints to simplify the matching process. Our method is robust to cascaded errors by determining the homography between each frame and the ground plane rather than the prior frame. The frame-to-ground homography is calculated based on the relationship of each point's image coordinates and its estimated location on the ground plane. Robustness to moving objects is integrated into the homography estimation step through detecting anomalies in the motion of keypoints and eliminating the influence of outliers. The resulting mosaics are of high accuracy and can be computed in real time.
Video-based eye tracking for neuropsychiatric assessment.
Adhikari, Sam; Stark, David E
2017-01-01
This paper presents a video-based eye-tracking method, ideally deployed via a mobile device or laptop-based webcam, as a tool for measuring brain function. Eye movements and pupillary motility are tightly regulated by brain circuits, are subtly perturbed by many disease states, and are measurable using video-based methods. Quantitative measurement of eye movement by readily available webcams may enable early detection and diagnosis, as well as remote/serial monitoring, of neurological and neuropsychiatric disorders. We successfully extracted computational and semantic features for 14 testing sessions, comprising 42 individual video blocks and approximately 17,000 image frames generated across several days of testing. Here, we demonstrate the feasibility of collecting video-based eye-tracking data from a standard webcam in order to assess psychomotor function. Furthermore, we were able to demonstrate through systematic analysis of this data set that eye-tracking features (in particular, radial and tangential variance on a circular visual-tracking paradigm) predict performance on well-validated psychomotor tests. © 2017 New York Academy of Sciences.
Improved segmentation of occluded and adjoining vehicles in traffic surveillance videos
NASA Astrophysics Data System (ADS)
Juneja, Medha; Grover, Priyanka
2013-12-01
Occlusion in image processing refers to concealment of any part of the object or the whole object from view of an observer. Real time videos captured by static cameras on roads often encounter overlapping and hence, occlusion of vehicles. Occlusion in traffic surveillance videos usually occurs when an object which is being tracked is hidden by another object. This makes it difficult for the object detection algorithms to distinguish all the vehicles efficiently. Also morphological operations tend to join the close proximity vehicles resulting in formation of a single bounding box around more than one vehicle. Such problems lead to errors in further video processing, like counting of vehicles in a video. The proposed system brings forward efficient moving object detection and tracking approach to reduce such errors. The paper uses successive frame subtraction technique for detection of moving objects. Further, this paper implements the watershed algorithm to segment the overlapped and adjoining vehicles. The segmentation results have been improved by the use of noise and morphological operations.
Automated tracking of whiskers in videos of head fixed rodents.
Clack, Nathan G; O'Connor, Daniel H; Huber, Daniel; Petreanu, Leopoldo; Hires, Andrew; Peron, Simon; Svoboda, Karel; Myers, Eugene W
2012-01-01
We have developed software for fully automated tracking of vibrissae (whiskers) in high-speed videos (>500 Hz) of head-fixed, behaving rodents trimmed to a single row of whiskers. Performance was assessed against a manually curated dataset consisting of 1.32 million video frames comprising 4.5 million whisker traces. The current implementation detects whiskers with a recall of 99.998% and identifies individual whiskers with 99.997% accuracy. The average processing rate for these images was 8 Mpx/s/cpu (2.6 GHz Intel Core2, 2 GB RAM). This translates to 35 processed frames per second for a 640 px×352 px video of 4 whiskers. The speed and accuracy achieved enables quantitative behavioral studies where the analysis of millions of video frames is required. We used the software to analyze the evolving whisking strategies as mice learned a whisker-based detection task over the course of 6 days (8148 trials, 25 million frames) and measure the forces at the sensory follicle that most underlie haptic perception.
Automated Tracking of Whiskers in Videos of Head Fixed Rodents
Clack, Nathan G.; O'Connor, Daniel H.; Huber, Daniel; Petreanu, Leopoldo; Hires, Andrew; Peron, Simon; Svoboda, Karel; Myers, Eugene W.
2012-01-01
We have developed software for fully automated tracking of vibrissae (whiskers) in high-speed videos (>500 Hz) of head-fixed, behaving rodents trimmed to a single row of whiskers. Performance was assessed against a manually curated dataset consisting of 1.32 million video frames comprising 4.5 million whisker traces. The current implementation detects whiskers with a recall of 99.998% and identifies individual whiskers with 99.997% accuracy. The average processing rate for these images was 8 Mpx/s/cpu (2.6 GHz Intel Core2, 2 GB RAM). This translates to 35 processed frames per second for a 640 px×352 px video of 4 whiskers. The speed and accuracy achieved enables quantitative behavioral studies where the analysis of millions of video frames is required. We used the software to analyze the evolving whisking strategies as mice learned a whisker-based detection task over the course of 6 days (8148 trials, 25 million frames) and measure the forces at the sensory follicle that most underlie haptic perception. PMID:22792058
High-Speed Video Analysis in a Conceptual Physics Class
NASA Astrophysics Data System (ADS)
Desbien, Dwain M.
2011-09-01
The use of probe ware and computers has become quite common in introductory physics classrooms. Video analysis is also becoming more popular and is available to a wide range of students through commercially available and/or free software.2,3 Video analysis allows for the study of motions that cannot be easily measured in the traditional lab setting and also allows real-world situations to be analyzed. Many motions are too fast to easily be captured at the standard video frame rate of 30 frames per second (fps) employed by most video cameras. This paper will discuss using a consumer camera that can record high-frame-rate video in a college-level conceptual physics class. In particular this will involve the use of model rockets to determine the acceleration during the boost period right at launch and compare it to a simple model of the expected acceleration.
Omorczyk, Jarosław; Nosiadek, Leszek; Ambroży, Tadeusz; Nosiadek, Andrzej
2015-01-01
The main aim of this study was to verify the usefulness of selected simple methods of recording and fast biomechanical analysis performed by judges of artistic gymnastics in assessing a gymnast's movement technique. The study participants comprised six artistic gymnastics judges, who assessed back handsprings using two methods: a real-time observation method and a frame-by-frame video analysis method. They also determined flexion angles of knee and hip joints using the computer program. In the case of the real-time observation method, the judges gave a total of 5.8 error points with an arithmetic mean of 0.16 points for the flexion of the knee joints. In the high-speed video analysis method, the total amounted to 8.6 error points and the mean value amounted to 0.24 error points. For the excessive flexion of hip joints, the sum of the error values was 2.2 error points and the arithmetic mean was 0.06 error points during real-time observation. The sum obtained using frame-by-frame analysis method equaled 10.8 and the mean equaled 0.30 error points. Error values obtained through the frame-by-frame video analysis of movement technique were higher than those obtained through the real-time observation method. The judges were able to indicate the number of the frame in which the maximal joint flexion occurred with good accuracy. Using the real-time observation method as well as the high-speed video analysis performed without determining the exact angle for assessing movement technique were found to be insufficient tools for improving the quality of judging.
Retinal slit lamp video mosaicking.
De Zanet, Sandro; Rudolph, Tobias; Richa, Rogerio; Tappeiner, Christoph; Sznitman, Raphael
2016-06-01
To this day, the slit lamp remains the first tool used by an ophthalmologist to examine patient eyes. Imaging of the retina poses, however, a variety of problems, namely a shallow depth of focus, reflections from the optical system, a small field of view and non-uniform illumination. For ophthalmologists, the use of slit lamp images for documentation and analysis purposes, however, remains extremely challenging due to large image artifacts. For this reason, we propose an automatic retinal slit lamp video mosaicking, which enlarges the field of view and reduces amount of noise and reflections, thus enhancing image quality. Our method is composed of three parts: (i) viable content segmentation, (ii) global registration and (iii) image blending. Frame content is segmented using gradient boosting with custom pixel-wise features. Speeded-up robust features are used for finding pair-wise translations between frames with robust random sample consensus estimation and graph-based simultaneous localization and mapping for global bundle adjustment. Foreground-aware blending based on feathering merges video frames into comprehensive mosaics. Foreground is segmented successfully with an area under the curve of the receiver operating characteristic curve of 0.9557. Mosaicking results and state-of-the-art methods were compared and rated by ophthalmologists showing a strong preference for a large field of view provided by our method. The proposed method for global registration of retinal slit lamp images of the retina into comprehensive mosaics improves over state-of-the-art methods and is preferred qualitatively.
Dosis, Aristotelis; Bello, Fernando; Moorthy, Krishna; Munz, Yaron; Gillies, Duncan; Darzi, Ara
2004-01-01
Surgical dexterity in operating theatres has traditionally been assessed subjectively. Electromagnetic (EM) motion tracking systems such as the Imperial College Surgical Assessment Device (ICSAD) have been shown to produce valid and accurate objective measures of surgical skill. To allow for video integration we have modified the data acquisition and built it within the ROVIMAS analysis software. We then used ActiveX 9.0 DirectShow video capturing and the system clock as a time stamp for the synchronized concurrent acquisition of kinematic data and video frames. Interactive video/motion data browsing was implemented to allow the user to concentrate on frames exhibiting certain kinematic properties that could result in operative errors. We exploited video-data synchronization to calculate the camera visual hull by identifying all 3D vertices using the ICSAD electromagnetic sensors. We also concentrated on high velocity peaks as a means of identifying potential erroneous movements to be confirmed by studying the corresponding video frames. The outcome of the study clearly shows that the kinematic data are precisely synchronized with the video frames and that the velocity peaks correspond to large and sudden excursions of the instrument tip. We validated the camera visual hull by both video and geometrical kinematic analysis and we observed that graphs containing fewer sudden velocity peaks are less likely to have erroneous movements. This work presented further developments to the well-established ICSAD dexterity analysis system. Synchronized real-time motion and video acquisition provides a comprehensive assessment solution by combining quantitative motion analysis tools and qualitative targeted video scoring.
Algorithm for Video Summarization of Bronchoscopy Procedures
2011-01-01
Background The duration of bronchoscopy examinations varies considerably depending on the diagnostic and therapeutic procedures used. It can last more than 20 minutes if a complex diagnostic work-up is included. With wide access to videobronchoscopy, the whole procedure can be recorded as a video sequence. Common practice relies on an active attitude of the bronchoscopist who initiates the recording process and usually chooses to archive only selected views and sequences. However, it may be important to record the full bronchoscopy procedure as documentation when liability issues are at stake. Furthermore, an automatic recording of the whole procedure enables the bronchoscopist to focus solely on the performed procedures. Video recordings registered during bronchoscopies include a considerable number of frames of poor quality due to blurry or unfocused images. It seems that such frames are unavoidable due to the relatively tight endobronchial space, rapid movements of the respiratory tract due to breathing or coughing, and secretions which occur commonly in the bronchi, especially in patients suffering from pulmonary disorders. Methods The use of recorded bronchoscopy video sequences for diagnostic, reference and educational purposes could be considerably extended with efficient, flexible summarization algorithms. Thus, the authors developed a prototype system to create shortcuts (called summaries or abstracts) of bronchoscopy video recordings. Such a system, based on models described in previously published papers, employs image analysis methods to exclude frames or sequences of limited diagnostic or education value. Results The algorithm for the selection or exclusion of specific frames or shots from video sequences recorded during bronchoscopy procedures is based on several criteria, including automatic detection of "non-informative", frames showing the branching of the airways and frames including pathological lesions. Conclusions The paper focuses on the challenge of generating summaries of bronchoscopy video recordings. PMID:22185344
Murugesan, Yahini Prabha; Alsadoon, Abeer; Manoranjan, Paul; Prasad, P W C
2018-06-01
Augmented reality-based surgeries have not been successfully implemented in oral and maxillofacial areas due to limitations in geometric accuracy and image registration. This paper aims to improve the accuracy and depth perception of the augmented video. The proposed system consists of a rotational matrix and translation vector algorithm to reduce the geometric error and improve the depth perception by including 2 stereo cameras and a translucent mirror in the operating room. The results on the mandible/maxilla area show that the new algorithm improves the video accuracy by 0.30-0.40 mm (in terms of overlay error) and the processing rate to 10-13 frames/s compared to 7-10 frames/s in existing systems. The depth perception increased by 90-100 mm. The proposed system concentrates on reducing the geometric error. Thus, this study provides an acceptable range of accuracy with a shorter operating time, which provides surgeons with a smooth surgical flow. Copyright © 2018 John Wiley & Sons, Ltd.
Time synchronized video systems
NASA Technical Reports Server (NTRS)
Burnett, Ron
1994-01-01
The idea of synchronizing multiple video recordings to some type of 'range' time has been tried to varying degrees of success in the past. Combining this requirement with existing time code standards (SMPTE) and the new innovations in desktop multimedia however, have afforded an opportunity to increase the flexibility and usefulness of such efforts without adding costs over the traditional data recording and reduction systems. The concept described can use IRIG, GPS or a battery backed internal clock as the master time source. By converting that time source to Vertical Interval Time Code or Longitudinal Time Code, both in accordance with the SMPTE standards, the user will obtain a tape that contains machine/computer readable time code suitable for use with editing equipment that is available off-the-shelf. Accuracy on playback is then determined by the playback system chosen by the user. Accuracies of +/- 2 frames are common among inexpensive systems and complete frame accuracy is more a matter of the users' budget than the capability of the recording system.
The experiments and analysis of several selective video encryption methods
NASA Astrophysics Data System (ADS)
Zhang, Yue; Yang, Cheng; Wang, Lei
2013-07-01
This paper presents four methods for selective video encryption based on the MPEG-2 video compression,including the slices, the I-frames, the motion vectors, and the DCT coefficients. We use the AES encryption method for simulation experiment for the four methods on VS2010 Platform, and compare the video effects and the processing speed of each frame after the video encrypted. The encryption depth can be arbitrarily selected, and design the encryption depth by using the double limit counting method, so the accuracy can be increased.
Mehmood, Irfan; Sajjad, Muhammad; Baik, Sung Wook
2014-01-01
Wireless capsule endoscopy (WCE) has great advantages over traditional endoscopy because it is portable and easy to use, especially in remote monitoring health-services. However, during the WCE process, the large amount of captured video data demands a significant deal of computation to analyze and retrieve informative video frames. In order to facilitate efficient WCE data collection and browsing task, we present a resource- and bandwidth-aware WCE video summarization framework that extracts the representative keyframes of the WCE video contents by removing redundant and non-informative frames. For redundancy elimination, we use Jeffrey-divergence between color histograms and inter-frame Boolean series-based correlation of color channels. To remove non-informative frames, multi-fractal texture features are extracted to assist the classification using an ensemble-based classifier. Owing to the limited WCE resources, it is impossible for the WCE system to perform computationally intensive video summarization tasks. To resolve computational challenges, mobile-cloud architecture is incorporated, which provides resizable computing capacities by adaptively offloading video summarization tasks between the client and the cloud server. The qualitative and quantitative results are encouraging and show that the proposed framework saves information transmission cost and bandwidth, as well as the valuable time of data analysts in browsing remote sensing data. PMID:25225874
NASA Astrophysics Data System (ADS)
Li, Jia; Tian, Yonghong; Gao, Wen
2008-01-01
In recent years, the amount of streaming video has grown rapidly on the Web. Often, retrieving these streaming videos offers the challenge of indexing and analyzing the media in real time because the streams must be treated as effectively infinite in length, thus precluding offline processing. Generally speaking, captions are important semantic clues for video indexing and retrieval. However, existing caption detection methods often have difficulties to make real-time detection for streaming video, and few of them concern on the differentiation of captions from scene texts and scrolling texts. In general, these texts have different roles in streaming video retrieval. To overcome these difficulties, this paper proposes a novel approach which explores the inter-frame correlation analysis and wavelet-domain modeling for real-time caption detection in streaming video. In our approach, the inter-frame correlation information is used to distinguish caption texts from scene texts and scrolling texts. Moreover, wavelet-domain Generalized Gaussian Models (GGMs) are utilized to automatically remove non-text regions from each frame and only keep caption regions for further processing. Experiment results show that our approach is able to offer real-time caption detection with high recall and low false alarm rate, and also can effectively discern caption texts from the other texts even in low resolutions.
Mehmood, Irfan; Sajjad, Muhammad; Baik, Sung Wook
2014-09-15
Wireless capsule endoscopy (WCE) has great advantages over traditional endoscopy because it is portable and easy to use, especially in remote monitoring health-services. However, during the WCE process, the large amount of captured video data demands a significant deal of computation to analyze and retrieve informative video frames. In order to facilitate efficient WCE data collection and browsing task, we present a resource- and bandwidth-aware WCE video summarization framework that extracts the representative keyframes of the WCE video contents by removing redundant and non-informative frames. For redundancy elimination, we use Jeffrey-divergence between color histograms and inter-frame Boolean series-based correlation of color channels. To remove non-informative frames, multi-fractal texture features are extracted to assist the classification using an ensemble-based classifier. Owing to the limited WCE resources, it is impossible for the WCE system to perform computationally intensive video summarization tasks. To resolve computational challenges, mobile-cloud architecture is incorporated, which provides resizable computing capacities by adaptively offloading video summarization tasks between the client and the cloud server. The qualitative and quantitative results are encouraging and show that the proposed framework saves information transmission cost and bandwidth, as well as the valuable time of data analysts in browsing remote sensing data.
A video wireless capsule endoscopy system powered wirelessly: design, analysis and experiment
NASA Astrophysics Data System (ADS)
Pan, Guobing; Xin, Wenhui; Yan, Guozheng; Chen, Jiaoliao
2011-06-01
Wireless capsule endoscopy (WCE), as a relatively new technology, has brought about a revolution in the diagnosis of gastrointestinal (GI) tract diseases. However, the existing WCE systems are not widely applied in clinic because of the low frame rate and low image resolution. A video WCE system based on a wireless power supply is developed in this paper. This WCE system consists of a video capsule endoscope (CE), a wireless power transmission device, a receiving box and an image processing station. Powered wirelessly, the video CE has the abilities of imaging the GI tract and transmitting the images wirelessly at a frame rate of 30 frames per second (f/s). A mathematical prototype was built to analyze the power transmission system, and some experiments were performed to test the capability of energy transferring. The results showed that the wireless electric power supply system had the ability to transfer more than 136 mW power, which was enough for the working of a video CE. In in vitro experiments, the video CE produced clear images of the small intestine of a pig with the resolution of 320 × 240, and transmitted NTSC format video outside the body. Because of the wireless power supply, the video WCE system with high frame rate and high resolution becomes feasible, and provides a novel solution for the diagnosis of the GI tract in clinic.
An adaptive enhancement algorithm for infrared video based on modified k-means clustering
NASA Astrophysics Data System (ADS)
Zhang, Linze; Wang, Jingqi; Wu, Wen
2016-09-01
In this paper, we have proposed a video enhancement algorithm to improve the output video of the infrared camera. Sometimes the video obtained by infrared camera is very dark since there is no clear target. In this case, infrared video should be divided into frame images by frame extraction, in order to carry out the image enhancement. For the first frame image, which can be divided into k sub images by using K-means clustering according to the gray interval it occupies before k sub images' histogram equalization according to the amount of information per sub image, we used a method to solve a problem that final cluster centers close to each other in some cases; and for the other frame images, their initial cluster centers can be determined by the final clustering centers of the previous ones, and the histogram equalization of each sub image will be carried out after image segmentation based on K-means clustering. The histogram equalization can make the gray value of the image to the whole gray level, and the gray level of each sub image is determined by the ratio of pixels to a frame image. Experimental results show that this algorithm can improve the contrast of infrared video where night target is not obvious which lead to a dim scene, and reduce the negative effect given by the overexposed pixels adaptively in a certain range.
(abstract) Synthesis of Speaker Facial Movements to Match Selected Speech Sequences
NASA Technical Reports Server (NTRS)
Scott, Kenneth C.
1994-01-01
We are developing a system for synthesizing image sequences the simulate the facial motion of a speaker. To perform this synthesis, we are pursuing two major areas of effort. We are developing the necessary computer graphics technology to synthesize a realistic image sequence of a person speaking selected speech sequences. Next, we are developing a model that expresses the relation between spoken phonemes and face/mouth shape. A subject is video taped speaking an arbitrary text that contains expression of the full list of desired database phonemes. The subject is video taped from the front speaking normally, recording both audio and video detail simultaneously. Using the audio track, we identify the specific video frames on the tape relating to each spoken phoneme. From this range we digitize the video frame which represents the extreme of mouth motion/shape. Thus, we construct a database of images of face/mouth shape related to spoken phonemes. A selected audio speech sequence is recorded which is the basis for synthesizing a matching video sequence; the speaker need not be the same as used for constructing the database. The audio sequence is analyzed to determine the spoken phoneme sequence and the relative timing of the enunciation of those phonemes. Synthesizing an image sequence corresponding to the spoken phoneme sequence is accomplished using a graphics technique known as morphing. Image sequence keyframes necessary for this processing are based on the spoken phoneme sequence and timing. We have been successful in synthesizing the facial motion of a native English speaker for a small set of arbitrary speech segments. Our future work will focus on advancement of the face shape/phoneme model and independent control of facial features.
Heterogeneous CPU-GPU moving targets detection for UAV video
NASA Astrophysics Data System (ADS)
Li, Maowen; Tang, Linbo; Han, Yuqi; Yu, Chunlei; Zhang, Chao; Fu, Huiquan
2017-07-01
Moving targets detection is gaining popularity in civilian and military applications. On some monitoring platform of motion detection, some low-resolution stationary cameras are replaced by moving HD camera based on UAVs. The pixels of moving targets in the HD Video taken by UAV are always in a minority, and the background of the frame is usually moving because of the motion of UAVs. The high computational cost of the algorithm prevents running it at higher resolutions the pixels of frame. Hence, to solve the problem of moving targets detection based UAVs video, we propose a heterogeneous CPU-GPU moving target detection algorithm for UAV video. More specifically, we use background registration to eliminate the impact of the moving background and frame difference to detect small moving targets. In order to achieve the effect of real-time processing, we design the solution of heterogeneous CPU-GPU framework for our method. The experimental results show that our method can detect the main moving targets from the HD video taken by UAV, and the average process time is 52.16ms per frame which is fast enough to solve the problem.
Video segmentation using keywords
NASA Astrophysics Data System (ADS)
Ton-That, Vinh; Vong, Chi-Tai; Nguyen-Dao, Xuan-Truong; Tran, Minh-Triet
2018-04-01
At DAVIS-2016 Challenge, many state-of-art video segmentation methods achieve potential results, but they still much depend on annotated frames to distinguish between background and foreground. It takes a lot of time and efforts to create these frames exactly. In this paper, we introduce a method to segment objects from video based on keywords given by user. First, we use a real-time object detection system - YOLOv2 to identify regions containing objects that have labels match with the given keywords in the first frame. Then, for each region identified from the previous step, we use Pyramid Scene Parsing Network to assign each pixel as foreground or background. These frames can be used as input frames for Object Flow algorithm to perform segmentation on entire video. We conduct experiments on a subset of DAVIS-2016 dataset in half the size of its original size, which shows that our method can handle many popular classes in PASCAL VOC 2012 dataset with acceptable accuracy, about 75.03%. We suggest widely testing by combining other methods to improve this result in the future.
Novel Integration of Frame Rate Up Conversion and HEVC Coding Based on Rate-Distortion Optimization.
Guo Lu; Xiaoyun Zhang; Li Chen; Zhiyong Gao
2018-02-01
Frame rate up conversion (FRUC) can improve the visual quality by interpolating new intermediate frames. However, high frame rate videos by FRUC are confronted with more bitrate consumption or annoying artifacts of interpolated frames. In this paper, a novel integration framework of FRUC and high efficiency video coding (HEVC) is proposed based on rate-distortion optimization, and the interpolated frames can be reconstructed at encoder side with low bitrate cost and high visual quality. First, joint motion estimation (JME) algorithm is proposed to obtain robust motion vectors, which are shared between FRUC and video coding. What's more, JME is embedded into the coding loop and employs the original motion search strategy in HEVC coding. Then, the frame interpolation is formulated as a rate-distortion optimization problem, where both the coding bitrate consumption and visual quality are taken into account. Due to the absence of original frames, the distortion model for interpolated frames is established according to the motion vector reliability and coding quantization error. Experimental results demonstrate that the proposed framework can achieve 21% ~ 42% reduction in BDBR, when compared with the traditional methods of FRUC cascaded with coding.
High-Order Model and Dynamic Filtering for Frame Rate Up-Conversion.
Bao, Wenbo; Zhang, Xiaoyun; Chen, Li; Ding, Lianghui; Gao, Zhiyong
2018-08-01
This paper proposes a novel frame rate up-conversion method through high-order model and dynamic filtering (HOMDF) for video pixels. Unlike the constant brightness and linear motion assumptions in traditional methods, the intensity and position of the video pixels are both modeled with high-order polynomials in terms of time. Then, the key problem of our method is to estimate the polynomial coefficients that represent the pixel's intensity variation, velocity, and acceleration. We propose to solve it with two energy objectives: one minimizes the auto-regressive prediction error of intensity variation by its past samples, and the other minimizes video frame's reconstruction error along the motion trajectory. To efficiently address the optimization problem for these coefficients, we propose the dynamic filtering solution inspired by video's temporal coherence. The optimal estimation of these coefficients is reformulated into a dynamic fusion of the prior estimate from pixel's temporal predecessor and the maximum likelihood estimate from current new observation. Finally, frame rate up-conversion is implemented using motion-compensated interpolation by pixel-wise intensity variation and motion trajectory. Benefited from the advanced model and dynamic filtering, the interpolated frame has much better visual quality. Extensive experiments on the natural and synthesized videos demonstrate the superiority of HOMDF over the state-of-the-art methods in both subjective and objective comparisons.
Enhanced video indirect ophthalmoscopy (VIO) via robust mosaicing.
Estrada, Rolando; Tomasi, Carlo; Cabrera, Michelle T; Wallace, David K; Freedman, Sharon F; Farsiu, Sina
2011-10-01
Indirect ophthalmoscopy (IO) is the standard of care for evaluation of the neonatal retina. When recorded on video from a head-mounted camera, IO images have low quality and narrow Field of View (FOV). We present an image fusion methodology for converting a video IO recording into a single, high quality, wide-FOV mosaic that seamlessly blends the best frames in the video. To this end, we have developed fast and robust algorithms for automatic evaluation of video quality, artifact detection and removal, vessel mapping, registration, and multi-frame image fusion. Our experiments show the effectiveness of the proposed methods.
As time passes by: Observed motion-speed and psychological time during video playback.
Nyman, Thomas Jonathan; Karlsson, Eric Per Anders; Antfolk, Jan
2017-01-01
Research shows that psychological time (i.e., the subjective experience and assessment of the passage of time) is malleable and that the central nervous system re-calibrates temporal information in accordance with situational factors so that psychological time flows slower or faster. Observed motion-speed (e.g., the visual perception of a rolling ball) is an important situational factor which influences the production of time estimates. The present study examines previous findings showing that observed slow and fast motion-speed during video playback respectively results in over- and underproductions of intervals of time. Here, we investigated through three separate experiments: a) the main effect of observed motion-speed during video playback on a time production task and b) the interactive effect of the frame rate (frames per second; fps) and motion-speed during video playback on a time production task. No main effect of video playback-speed or interactive effect between video playback-speed and frame rate was found on time production.
As time passes by: Observed motion-speed and psychological time during video playback
Karlsson, Eric Per Anders; Antfolk, Jan
2017-01-01
Research shows that psychological time (i.e., the subjective experience and assessment of the passage of time) is malleable and that the central nervous system re-calibrates temporal information in accordance with situational factors so that psychological time flows slower or faster. Observed motion-speed (e.g., the visual perception of a rolling ball) is an important situational factor which influences the production of time estimates. The present study examines previous findings showing that observed slow and fast motion-speed during video playback respectively results in over- and underproductions of intervals of time. Here, we investigated through three separate experiments: a) the main effect of observed motion-speed during video playback on a time production task and b) the interactive effect of the frame rate (frames per second; fps) and motion-speed during video playback on a time production task. No main effect of video playback-speed or interactive effect between video playback-speed and frame rate was found on time production. PMID:28614353
Keyhole imaging method for dynamic objects behind the occlusion area
NASA Astrophysics Data System (ADS)
Hao, Conghui; Chen, Xi; Dong, Liquan; Zhao, Yuejin; Liu, Ming; Kong, Lingqin; Hui, Mei; Liu, Xiaohua; Wu, Hong
2018-01-01
A method of keyhole imaging based on camera array is realized to obtain the video image behind a keyhole in shielded space at a relatively long distance. We get the multi-angle video images by using a 2×2 CCD camera array to take the images behind the keyhole in four directions. The multi-angle video images are saved in the form of frame sequences. This paper presents a method of video frame alignment. In order to remove the non-target area outside the aperture, we use the canny operator and morphological method to realize the edge detection of images and fill the images. The image stitching of four images is accomplished on the basis of the image stitching algorithm of two images. In the image stitching algorithm of two images, the SIFT method is adopted to accomplish the initial matching of images, and then the RANSAC algorithm is applied to eliminate the wrong matching points and to obtain a homography matrix. A method of optimizing transformation matrix is proposed in this paper. Finally, the video image with larger field of view behind the keyhole can be synthesized with image frame sequence in which every single frame is stitched. The results show that the screen of the video is clear and natural, the brightness transition is smooth. There is no obvious artificial stitching marks in the video, and it can be applied in different engineering environment .
Indexing and retrieval of MPEG compressed video
NASA Astrophysics Data System (ADS)
Kobla, Vikrant; Doermann, David S.
1998-04-01
To keep pace with the increased popularity of digital video as an archival medium, the development of techniques for fast and efficient analysis of ideo streams is essential. In particular, solutions to the problems of storing, indexing, browsing, and retrieving video data from large multimedia databases are necessary to a low access to these collections. Given that video is often stored efficiently in a compressed format, the costly overhead of decompression can be reduced by analyzing the compressed representation directly. In earlier work, we presented compressed domain parsing techniques which identified shots, subshots, and scenes. In this article, we present efficient key frame selection, feature extraction, indexing, and retrieval techniques that are directly applicable to MPEG compressed video. We develop a frame type independent representation which normalizes spatial and temporal features including frame type, frame size, macroblock encoding, and motion compensation vectors. Features for indexing are derived directly from this representation and mapped to a low- dimensional space where they can be accessed using standard database techniques. Spatial information is used as primary index into the database and temporal information is used to rank retrieved clips and enhance the robustness of the system. The techniques presented enable efficient indexing, querying, and retrieval of compressed video as demonstrated by our system which typically takes a fraction of a second to retrieve similar video scenes from a database, with over 95 percent recall.
Dynamic Textures Modeling via Joint Video Dictionary Learning.
Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng
2017-04-06
Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes which could be modeled in a dynamic textures (DT) framework. At first, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on such transition matrix and dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both sparse properties and the temporal correlations of consecutive video frames. Moreover, such learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. Especially, it performs significantly better in dealing with DT synthesis and recognition on heavily corrupted data.
Gibbs, Ann E.; Cochran, Susan A.; Tierney, Peter W.
2013-01-01
Underwater video footage was collected in nearshore waters (<60-meter depth) off the Hawaiian Islands from 2002 to 2011 as part of the U.S. Geological Survey (USGS) Coastal and Marine Geology Program's Pacific Coral Reef Project, to improve seafloor characterization and for the development and ground-truthing of benthic-habitat maps. This report includes nearly 53 hours of digital underwater video footage collected during four USGS cruises and more than 10,200 still images extracted from the videos, including still frames from every 10 seconds along transect lines, and still frames showing both an overview and a near-bottom view from fixed stations. Environmental Systems Research Institute (ESRI) shapefiles of individual video and still-image locations, and Google Earth kml files with explanatory text and links to the video and still images, are included. This report documents the various camera systems and methods used to collect the videos, and the techniques and software used to convert the analog video tapes into digital data in order to process the images for optimum viewing and to extract the still images, along with a brief summary of each survey cruise.
A video event trigger for high frame rate, high resolution video technology
NASA Astrophysics Data System (ADS)
Williams, Glenn L.
1991-12-01
When video replaces film the digitized video data accumulates very rapidly, leading to a difficult and costly data storage problem. One solution exists for cases when the video images represent continuously repetitive 'static scenes' containing negligible activity, occasionally interrupted by short events of interest. Minutes or hours of redundant video frames can be ignored, and not stored, until activity begins. A new, highly parallel digital state machine generates a digital trigger signal at the onset of a video event. High capacity random access memory storage coupled with newly available fuzzy logic devices permits the monitoring of a video image stream for long term or short term changes caused by spatial translation, dilation, appearance, disappearance, or color change in a video object. Pretrigger and post-trigger storage techniques are then adaptable for archiving the digital stream from only the significant video images.
A video event trigger for high frame rate, high resolution video technology
NASA Technical Reports Server (NTRS)
Williams, Glenn L.
1991-01-01
When video replaces film the digitized video data accumulates very rapidly, leading to a difficult and costly data storage problem. One solution exists for cases when the video images represent continuously repetitive 'static scenes' containing negligible activity, occasionally interrupted by short events of interest. Minutes or hours of redundant video frames can be ignored, and not stored, until activity begins. A new, highly parallel digital state machine generates a digital trigger signal at the onset of a video event. High capacity random access memory storage coupled with newly available fuzzy logic devices permits the monitoring of a video image stream for long term or short term changes caused by spatial translation, dilation, appearance, disappearance, or color change in a video object. Pretrigger and post-trigger storage techniques are then adaptable for archiving the digital stream from only the significant video images.
Feasibility of video codec algorithms for software-only playback
NASA Astrophysics Data System (ADS)
Rodriguez, Arturo A.; Morse, Ken
1994-05-01
Software-only video codecs can provide good playback performance in desktop computers with a 486 or 68040 CPU running at 33 MHz without special hardware assistance. Typically, playback of compressed video can be categorized into three tasks: the actual decoding of the video stream, color conversion, and the transfer of decoded video data from system RAM to video RAM. By current standards, good playback performance is the decoding and display of video streams of 320 by 240 (or larger) compressed frames at 15 (or greater) frames-per- second. Software-only video codecs have evolved by modifying and tailoring existing compression methodologies to suit video playback in desktop computers. In this paper we examine the characteristics used to evaluate software-only video codec algorithms, namely: image fidelity (i.e., image quality), bandwidth (i.e., compression) ease-of-decoding (i.e., playback performance), memory consumption, compression to decompression asymmetry, scalability, and delay. We discuss the tradeoffs among these variables and the compromises that can be made to achieve low numerical complexity for software-only playback. Frame- differencing approaches are described since software-only video codecs typically employ them to enhance playback performance. To complement other papers that appear in this session of the Proceedings, we review methods derived from binary pattern image coding since these methods are amenable for software-only playback. In particular, we introduce a novel approach called pixel distribution image coding.
Method of determining the necessary number of observations for video stream documents recognition
NASA Astrophysics Data System (ADS)
Arlazarov, Vladimir V.; Bulatov, Konstantin; Manzhikov, Temudzhin; Slavin, Oleg; Janiszewski, Igor
2018-04-01
This paper discusses a task of document recognition on a sequence of video frames. In order to optimize the processing speed an estimation is performed of stability of recognition results obtained from several video frames. Considering identity document (Russian internal passport) recognition on a mobile device it is shown that significant decrease is possible of the number of observations necessary for obtaining precise recognition result.
Luo, Xiongbiao; Mori, Kensaku
2014-06-01
Endoscope 3-D motion tracking, which seeks to synchronize pre- and intra-operative images in endoscopic interventions, is usually performed as video-volume registration that optimizes the similarity between endoscopic video and pre-operative images. The tracking performance, in turn, depends significantly on whether a similarity measure can successfully characterize the difference between video sequences and volume rendering images driven by pre-operative images. The paper proposes a discriminative structural similarity measure, which uses the degradation of structural information and takes image correlation or structure, luminance, and contrast into consideration, to boost video-volume registration. By applying the proposed similarity measure to endoscope tracking, it was demonstrated to be more accurate and robust than several available similarity measures, e.g., local normalized cross correlation, normalized mutual information, modified mean square error, or normalized sum squared difference. Based on clinical data evaluation, the tracking error was reduced significantly from at least 14.6 mm to 4.5 mm. The processing time was accelerated more than 30 frames per second using graphics processing unit.
Immersive video for virtual tourism
NASA Astrophysics Data System (ADS)
Hernandez, Luis A.; Taibo, Javier; Seoane, Antonio J.
2001-11-01
This paper describes a new panoramic, 360 degree(s) video system and its use in a real application for virtual tourism. The development of this system has required to design new hardware for multi-camera recording, and software for video processing in order to elaborate the panorama frames and to playback the resulting high resolution video footage on a regular PC. The system makes use of new VR display hardware, such as WindowVR, in order to make the view dependent on the viewer's spatial orientation and so enhance immersiveness. There are very few examples of similar technologies and the existing ones are extremely expensive and/or impossible to be implemented on personal computers with acceptable quality. The idea of the system starts from the concept of Panorama picture, developed in technologies such as QuickTimeVR. This idea is extended to the concept of panorama frame that leads to panorama video. However, many problems are to be solved to implement this simple scheme. Data acquisition involves simultaneously footage recording in every direction, and latter processing to convert every set of frames in a single high resolution panorama frame. Since there is no common hardware capable of 4096x512 video playback at 25 fps rate, it must be stripped in smaller pieces which the system must manage to get the right frames of the right parts as the user movement demands it. As the system must be immersive, the physical interface to watch the 360 degree(s) video is a WindowVR, that is, a flat screen with an orientation tracker that the user holds in his hands, moving it like if it were a virtual window through which the city and its activity is being shown.
NASA Astrophysics Data System (ADS)
Zingoni, Andrea; Diani, Marco; Corsini, Giovanni
2016-10-01
We developed an algorithm for automatically detecting small and poorly contrasted (dim) moving objects in real-time, within video sequences acquired through a steady infrared camera. The algorithm is suitable for different situations since it is independent of the background characteristics and of changes in illumination. Unlike other solutions, small objects of any size (up to single-pixel), either hotter or colder than the background, can be successfully detected. The algorithm is based on accurately estimating the background at the pixel level and then rejecting it. A novel approach permits background estimation to be robust to changes in the scene illumination and to noise, and not to be biased by the transit of moving objects. Care was taken in avoiding computationally costly procedures, in order to ensure the real-time performance even using low-cost hardware. The algorithm was tested on a dataset of 12 video sequences acquired in different conditions, providing promising results in terms of detection rate and false alarm rate, independently of background and objects characteristics. In addition, the detection map was produced frame by frame in real-time, using cheap commercial hardware. The algorithm is particularly suitable for applications in the fields of video-surveillance and computer vision. Its reliability and speed permit it to be used also in critical situations, like in search and rescue, defence and disaster monitoring.
Hardware/Software Issues for Video Guidance Systems: The Coreco Frame Grabber
NASA Technical Reports Server (NTRS)
Bales, John W.
1996-01-01
The F64 frame grabber is a high performance video image acquisition and processing board utilizing the TMS320C40 and TMS34020 processors. The hardware is designed for the ISA 16 bit bus and supports multiple digital or analog cameras. It has an acquisition rate of 40 million pixels per second, with a variable sampling frequency of 510 kHz to MO MHz. The board has a 4MB frame buffer memory expandable to 32 MB, and has a simultaneous acquisition and processing capability. It supports both VGA and RGB displays, and accepts all analog and digital video input standards.
Data compression techniques applied to high resolution high frame rate video technology
NASA Technical Reports Server (NTRS)
Hartz, William G.; Alexovich, Robert E.; Neustadter, Marc S.
1989-01-01
An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of methods of video data compression, described in the open literature, was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and assessment of image degradation and video data parameters. An assessment is made of present and near term future technology for implementation of video data compression in high speed imaging system. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementation of video data compression, are presented. Case studies of three microgravity experiments are presented and specific compression techniques and implementations are recommended.
A spatiotemporal decomposition strategy for personal home video management
NASA Astrophysics Data System (ADS)
Yi, Haoran; Kozintsev, Igor; Polito, Marzia; Wu, Yi; Bouguet, Jean-Yves; Nefian, Ara; Dulong, Carole
2007-01-01
With the advent and proliferation of low cost and high performance digital video recorder devices, an increasing number of personal home video clips are recorded and stored by the consumers. Compared to image data, video data is lager in size and richer in multimedia content. Efficient access to video content is expected to be more challenging than image mining. Previously, we have developed a content-based image retrieval system and the benchmarking framework for personal images. In this paper, we extend our personal image retrieval system to include personal home video clips. A possible initial solution to video mining is to represent video clips by a set of key frames extracted from them thus converting the problem into an image search one. Here we report that a careful selection of key frames may improve the retrieval accuracy. However, because video also has temporal dimension, its key frame representation is inherently limited. The use of temporal information can give us better representation for video content at semantic object and concept levels than image-only based representation. In this paper we propose a bottom-up framework to combine interest point tracking, image segmentation and motion-shape factorization to decompose the video into spatiotemporal regions. We show an example application of activity concept detection using the trajectories extracted from the spatio-temporal regions. The proposed approach shows good potential for concise representation and indexing of objects and their motion in real-life consumer video.
Innovative Video Diagnostic Equipment for Material Science
NASA Technical Reports Server (NTRS)
Capuano, G.; Titomanlio, D.; Soellner, W.; Seidel, A.
2012-01-01
Materials science experiments under microgravity increasingly rely on advanced optical systems to determine the physical properties of the samples under investigation. This includes video systems with high spatial and temporal resolution. The acquisition, handling, storage and transmission to ground of the resulting video data are very challenging. Since the available downlink data rate is limited, the capability to compress the video data significantly without compromising the data quality is essential. We report on the development of a Digital Video System (DVS) for EML (Electro Magnetic Levitator) which provides real-time video acquisition, high compression using advanced Wavelet algorithms, storage and transmission of a continuous flow of video with different characteristics in terms of image dimensions and frame rates. The DVS is able to operate with the latest generation of high-performance cameras acquiring high resolution video images up to 4Mpixels@60 fps or high frame rate video images up to about 1000 fps@512x512pixels.
High-performance software-only H.261 video compression on PC
NASA Astrophysics Data System (ADS)
Kasperovich, Leonid
1996-03-01
This paper describes an implementation of a software H.261 codec for PC, that takes an advantage of the fast computational algorithms for DCT-based video compression, which have been presented by the author at the February's 1995 SPIE/IS&T meeting. The motivation for developing the H.261 prototype system is to demonstrate a feasibility of real time software- only videoconferencing solution to operate across a wide range of network bandwidth, frame rate, and resolution of the input video. As the bandwidths of current network technology will be increased, the higher frame rate and resolution of video to be transmitted is allowed, that requires, in turn, a software codec to be able to compress pictures of CIF (352 X 288) resolution at up to 30 frame/sec. Running on Pentium 133 MHz PC the codec presented is capable to compress video in CIF format at 21 - 23 frame/sec. This result is comparable to the known hardware-based H.261 solutions, but it doesn't require any specific hardware. The methods to achieve high performance, the program optimization technique for Pentium microprocessor along with the performance profile, showing the actual contribution of the different encoding/decoding stages to the overall computational process, are presented.
Visual content highlighting via automatic extraction of embedded captions on MPEG compressed video
NASA Astrophysics Data System (ADS)
Yeo, Boon-Lock; Liu, Bede
1996-03-01
Embedded captions in TV programs such as news broadcasts, documentaries and coverage of sports events provide important information on the underlying events. In digital video libraries, such captions represent a highly condensed form of key information on the contents of the video. In this paper we propose a scheme to automatically detect the presence of captions embedded in video frames. The proposed method operates on reduced image sequences which are efficiently reconstructed from compressed MPEG video and thus does not require full frame decompression. The detection, extraction and analysis of embedded captions help to capture the highlights of visual contents in video documents for better organization of video, to present succinctly the important messages embedded in the images, and to facilitate browsing, searching and retrieval of relevant clips.
NASA Technical Reports Server (NTRS)
Papanyan, Valeri; Oshle, Edward; Adamo, Daniel
2008-01-01
Measurement of the jettisoned object departure trajectory and velocity vector in the International Space Station (ISS) reference frame is vitally important for prompt evaluation of the object s imminent orbit. We report on the first successful application of photogrammetric analysis of the ISS imagery for the prompt computation of the jettisoned object s position and velocity vectors. As post-EVA analyses examples, we present the Floating Potential Probe (FPP) and the Russian "Orlan" Space Suit jettisons, as well as the near-real-time (provided in several hours after the separation) computations of the Video Stanchion Support Assembly Flight Support Assembly (VSSA-FSA) and Early Ammonia Servicer (EAS) jettisons during the US astronauts space-walk. Standard close-range photogrammetry analysis was used during this EVA to analyze two on-board camera image sequences down-linked from the ISS. In this approach the ISS camera orientations were computed from known coordinates of several reference points on the ISS hardware. Then the position of the jettisoned object for each time-frame was computed from its image in each frame of the video-clips. In another, "quick-look" approach used in near-real time, orientation of the cameras was computed from their position (from the ISS CAD model) and operational data (pan and tilt) then location of the jettisoned object was calculated only for several frames of the two synchronized movies. Keywords: Photogrammetry, International Space Station, jettisons, image analysis.
Li, Hao; Lu, Jing; Shi, Guohua; Zhang, Yudong
2010-01-01
With the use of adaptive optics (AO), high-resolution microscopic imaging of living human retina in the single cell level has been achieved. In an adaptive optics confocal scanning laser ophthalmoscope (AOSLO) system, with a small field size (about 1 degree, 280 μm), the motion of the eye severely affects the stabilization of the real-time video images and results in significant distortions of the retina images. In this paper, Scale-Invariant Feature Transform (SIFT) is used to abstract stable point features from the retina images. Kanade-Lucas-Tomasi(KLT) algorithm is applied to track the features. With the tracked features, the image distortion in each frame is removed by the second-order polynomial transformation, and 10 successive frames are co-added to enhance the image quality. Features of special interest in an image can also be selected manually and tracked by KLT. A point on a cone is selected manually, and the cone is tracked from frame to frame. PMID:21258443
A new user-assisted segmentation and tracking technique for an object-based video editing system
NASA Astrophysics Data System (ADS)
Yu, Hong Y.; Hong, Sung-Hoon; Lee, Mike M.; Choi, Jae-Gark
2004-03-01
This paper presents a semi-automatic segmentation method which can be used to generate video object plane (VOP) for object based coding scheme and multimedia authoring environment. Semi-automatic segmentation can be considered as a user-assisted segmentation technique. A user can initially mark objects of interest around the object boundaries and then the user-guided and selected objects are continuously separated from the unselected areas through time evolution in the image sequences. The proposed segmentation method consists of two processing steps: partially manual intra-frame segmentation and fully automatic inter-frame segmentation. The intra-frame segmentation incorporates user-assistance to define the meaningful complete visual object of interest to be segmentation and decides precise object boundary. The inter-frame segmentation involves boundary and region tracking to obtain temporal coherence of moving object based on the object boundary information of previous frame. The proposed method shows stable efficient results that could be suitable for many digital video applications such as multimedia contents authoring, content based coding and indexing. Based on these results, we have developed objects based video editing system with several convenient editing functions.
NASA Astrophysics Data System (ADS)
Ciaramello, Frank M.; Hemami, Sheila S.
2009-02-01
Communication of American Sign Language (ASL) over mobile phones would be very beneficial to the Deaf community. ASL video encoded to achieve the rates provided by current cellular networks must be heavily compressed and appropriate assessment techniques are required to analyze the intelligibility of the compressed video. As an extension to a purely spatial measure of intelligibility, this paper quantifies the effect of temporal compression artifacts on sign language intelligibility. These artifacts can be the result of motion-compensation errors that distract the observer or frame rate reductions. They reduce the the perception of smooth motion and disrupt the temporal coherence of the video. Motion-compensation errors that affect temporal coherence are identified by measuring the block-level correlation between co-located macroblocks in adjacent frames. The impact of frame rate reductions was quantified through experimental testing. A subjective study was performed in which fluent ASL participants rated the intelligibility of sequences encoded at a range of 5 different frame rates and with 3 different levels of distortion. The subjective data is used to parameterize an objective intelligibility measure which is highly correlated with subjective ratings at multiple frame rates.
Video attention deviation estimation using inter-frame visual saliency map analysis
NASA Astrophysics Data System (ADS)
Feng, Yunlong; Cheung, Gene; Le Callet, Patrick; Ji, Yusheng
2012-01-01
A viewer's visual attention during video playback is the matching of his eye gaze movement to the changing video content over time. If the gaze movement matches the video content (e.g., follow a rolling soccer ball), then the viewer keeps his visual attention. If the gaze location moves from one video object to another, then the viewer shifts his visual attention. A video that causes a viewer to shift his attention often is a "busy" video. Determination of which video content is busy is an important practical problem; a busy video is difficult for encoder to deploy region of interest (ROI)-based bit allocation, and hard for content provider to insert additional overlays like advertisements, making the video even busier. One way to determine the busyness of video content is to conduct eye gaze experiments with a sizable group of test subjects, but this is time-consuming and costineffective. In this paper, we propose an alternative method to determine the busyness of video-formally called video attention deviation (VAD): analyze the spatial visual saliency maps of the video frames across time. We first derive transition probabilities of a Markov model for eye gaze using saliency maps of a number of consecutive frames. We then compute steady state probability of the saccade state in the model-our estimate of VAD. We demonstrate that the computed steady state probability for saccade using saliency map analysis matches that computed using actual gaze traces for a range of videos with different degrees of busyness. Further, our analysis can also be used to segment video into shorter clips of different degrees of busyness by computing the Kullback-Leibler divergence using consecutive motion compensated saliency maps.
Video-based convolutional neural networks for activity recognition from robot-centric videos
NASA Astrophysics Data System (ADS)
Ryoo, M. S.; Matthies, Larry
2016-05-01
In this evaluation paper, we discuss convolutional neural network (CNN)-based approaches for human activity recognition. In particular, we investigate CNN architectures designed to capture temporal information in videos and their applications to the human activity recognition problem. There have been multiple previous works to use CNN-features for videos. These include CNNs using 3-D XYT convolutional filters, CNNs using pooling operations on top of per-frame image-based CNN descriptors, and recurrent neural networks to learn temporal changes in per-frame CNN descriptors. We experimentally compare some of these different representatives CNNs while using first-person human activity videos. We especially focus on videos from a robots viewpoint, captured during its operations and human-robot interactions.
Extended image differencing for change detection in UAV video mosaics
NASA Astrophysics Data System (ADS)
Saur, Günter; Krüger, Wolfgang; Schumann, Arne
2014-03-01
Change detection is one of the most important tasks when using unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. We address changes of short time scale, i.e. the observations are taken in time distances from several minutes up to a few hours. Each observation is a short video sequence acquired by the UAV in near-nadir view and the relevant changes are, e.g., recently parked or moved vehicles. In this paper we extend our previous approach of image differencing for single video frames to video mosaics. A precise image-to-image registration combined with a robust matching approach is needed to stitch the video frames to a mosaic. Additionally, this matching algorithm is applied to mosaic pairs in order to align them to a common geometry. The resulting registered video mosaic pairs are the input of the change detection procedure based on extended image differencing. A change mask is generated by an adaptive threshold applied to a linear combination of difference images of intensity and gradient magnitude. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples for non-relevant changes are stereo disparity at 3D structures of the scene, changed size of shadows, and compression or transmission artifacts. The special effects of video mosaicking such as geometric distortions and artifacts at moving objects have to be considered, too. In our experiments we analyze the influence of these effects on the change detection results by considering several scenes. The results show that for video mosaics this task is more difficult than for single video frames. Therefore, we extended the image registration by estimating an elastic transformation using a thin plate spline approach. The results for mosaics are comparable to that of single video frames and are useful for interactive image exploitation due to a larger scene coverage.
A hybrid frame concealment algorithm for H.264/AVC.
Yan, Bo; Gharavi, Hamid
2010-01-01
In packet-based video transmissions, packets loss due to channel errors may result in the loss of the whole video frame. Recently, many error concealment algorithms have been proposed in order to combat channel errors; however, most of the existing algorithms can only deal with the loss of macroblocks and are not able to conceal the whole missing frame. In order to resolve this problem, in this paper, we have proposed a new hybrid motion vector extrapolation (HMVE) algorithm to recover the whole missing frame, and it is able to provide more accurate estimation for the motion vectors of the missing frame than other conventional methods. Simulation results show that it is highly effective and significantly outperforms other existing frame recovery methods.
Video-based measurements for wireless capsule endoscope tracking
NASA Astrophysics Data System (ADS)
Spyrou, Evaggelos; Iakovidis, Dimitris K.
2014-01-01
The wireless capsule endoscope is a swallowable medical device equipped with a miniature camera enabling the visual examination of the gastrointestinal (GI) tract. It wirelessly transmits thousands of images to an external video recording system, while its location and orientation are being tracked approximately by external sensor arrays. In this paper we investigate a video-based approach to tracking the capsule endoscope without requiring any external equipment. The proposed method involves extraction of speeded up robust features from video frames, registration of consecutive frames based on the random sample consensus algorithm, and estimation of the displacement and rotation of interest points within these frames. The results obtained by the application of this method on wireless capsule endoscopy videos indicate its effectiveness and improved performance over the state of the art. The findings of this research pave the way for a cost-effective localization and travel distance measurement of capsule endoscopes in the GI tract, which could contribute in the planning of more accurate surgical interventions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Feltus, M.A.; Morlang, G.M.
1996-06-01
The use of neutron radiography for visualization of fluid flow through flow visualization modules has been very successful. Current experiments at the Penn State Breazeale Reactor serve to verify the mixing and transport of soluble boron under natural flow conditions as would be experienced in a pressurized water reactor. Different flow geometries have been modeled including holes, slots, and baffles. Flow modules are constructed of aluminum box material 1 1/2 inches by 4 inches in varying lengths. An experimental flow system was built which pumps fluid to a head tank and natural circulation flow occurs from the head tank throughmore » the flow visualization module to be radiographed. The entire flow system is mounted on a portable assembly to allow placement of the flow visualization module in front of the neutron beam port. A neutron-transparent fluorinert fluid is used to simulate water at different densities. Boron is modeled by gadolinium oxide powder as a tracer element, which is placed in a mixing assembly and injected into the system by remote operated electric valve, once the reactor is at power. The entire sequence is recorded on real-time video. Still photographs are made frame-by-frame from the video tape. Computers are used to digitally enhance the video and still photographs. The data obtained from the enhancement will be used for verification of simple geometry predictions using the TRAC and RELAP thermal-hydraulic codes. A detailed model of a reactor vessel inlet plenum, downcomer region, flow distribution area and core inlet is being constructed to model the AP600 plenum. Successive radiography experiments of each section of the model under identical conditions will provide a complete vessel/core model for comparison with the thermal-hydraulic codes.« less
NASA Astrophysics Data System (ADS)
Wróżyński, Rafał; Pyszny, Krzysztof; Sojka, Mariusz; Przybyła, Czesław; Murat-Błażejewska, Sadżide
2017-06-01
The article describes how the Structure-from-Motion (SfM) method can be used to calculate the volume of anthropogenic microtopography. In the proposed workflow, data is obtained using mass-market devices such as a compact camera (Canon G9) and a smartphone (iPhone5). The volume is computed using free open source software (VisualSFMv0.5.23, CMPMVSv0.6.0., MeshLab) on a PCclass computer. The input data is acquired from video frames. To verify the method laboratory tests on the embankment of a known volume has been carried out. Models of the test embankment were built using two independent measurements made with those two devices. No significant differences were found between the models in a comparative analysis. The volumes of the models differed from the actual volume just by 0.7‰ and 2‰. After a successful laboratory verification, field measurements were carried out in the same way. While building the model from the data acquired with a smartphone, it was observed that a series of frames, approximately 14% of all the frames, was rejected. The missing frames caused the point cloud to be less dense in the place where they had been rejected. This affected the model's volume differed from the volume acquired with a camera by 7%. In order to improve the homogeneity, the frame extraction frequency was increased in the place where frames have been previously missing. A uniform model was thereby obtained with point cloud density evenly distributed. There was a 1.5% difference between the embankment's volume and the volume calculated from the camera-recorded video. The presented method permits the number of input frames to be increased and the model's accuracy to be enhanced without making an additional measurement, which may not be possible in the case of temporary features.
Error-free holographic frames encryption with CA pixel-permutation encoding algorithm
NASA Astrophysics Data System (ADS)
Li, Xiaowei; Xiao, Dan; Wang, Qiong-Hua
2018-01-01
The security of video data is necessary in network security transmission hence cryptography is technique to make video data secure and unreadable to unauthorized users. In this paper, we propose a holographic frames encryption technique based on the cellular automata (CA) pixel-permutation encoding algorithm. The concise pixel-permutation algorithm is used to address the drawbacks of the traditional CA encoding methods. The effectiveness of the proposed video encoding method is demonstrated by simulation examples.
Can You See Me Now Visualizing Battlefield Facial Recognition Technology in 2035
2010-04-01
County Sheriff’s Department, use certain measurements such as the distance between eyes, the length of the nose, or the shape of the ears. 8 However...captures multiple frames of video and composites them into an appropriately high-resolution image that can be processed by the facial recognition software...stream of data. High resolution video systems, such as those described below will be able to capture orders of magnitude more data in one video frame
Variable disparity-motion estimation based fast three-view video coding
NASA Astrophysics Data System (ADS)
Bae, Kyung-Hoon; Kim, Seung-Cheol; Hwang, Yong Seok; Kim, Eun-Soo
2009-02-01
In this paper, variable disparity-motion estimation (VDME) based 3-view video coding is proposed. In the encoding, key-frame coding (KFC) based motion estimation and variable disparity estimation (VDE) for effectively fast three-view video encoding are processed. These proposed algorithms enhance the performance of 3-D video encoding/decoding system in terms of accuracy of disparity estimation and computational overhead. From some experiments, stereo sequences of 'Pot Plant' and 'IVO', it is shown that the proposed algorithm's PSNRs is 37.66 and 40.55 dB, and the processing time is 0.139 and 0.124 sec/frame, respectively.
Automated frame selection process for high-resolution microendoscopy
NASA Astrophysics Data System (ADS)
Ishijima, Ayumu; Schwarz, Richard A.; Shin, Dongsuk; Mondrik, Sharon; Vigneswaran, Nadarajah; Gillenwater, Ann M.; Anandasabapathy, Sharmila; Richards-Kortum, Rebecca
2015-04-01
We developed an automated frame selection algorithm for high-resolution microendoscopy video sequences. The algorithm rapidly selects a representative frame with minimal motion artifact from a short video sequence, enabling fully automated image analysis at the point-of-care. The algorithm was evaluated by quantitative comparison of diagnostically relevant image features and diagnostic classification results obtained using automated frame selection versus manual frame selection. A data set consisting of video sequences collected in vivo from 100 oral sites and 167 esophageal sites was used in the analysis. The area under the receiver operating characteristic curve was 0.78 (automated selection) versus 0.82 (manual selection) for oral sites, and 0.93 (automated selection) versus 0.92 (manual selection) for esophageal sites. The implementation of fully automated high-resolution microendoscopy at the point-of-care has the potential to reduce the number of biopsies needed for accurate diagnosis of precancer and cancer in low-resource settings where there may be limited infrastructure and personnel for standard histologic analysis.
Video Completion in Digital Stabilization Task Using Pseudo-Panoramic Technique
NASA Astrophysics Data System (ADS)
Favorskaya, M. N.; Buryachenko, V. V.; Zotin, A. G.; Pakhirka, A. I.
2017-05-01
Video completion is a necessary stage after stabilization of a non-stationary video sequence, if it is desirable to make the resolution of the stabilized frames equalled the resolution of the original frames. Usually the cropped stabilized frames lose 10-20% of area that means the worse visibility of the reconstructed scenes. The extension of a view of field may appear due to the pan-tilt-zoom unwanted camera movement. Our approach deals with a preparing of pseudo-panoramic key frame during a stabilization stage as a pre-processing step for the following inpainting. It is based on a multi-layered representation of each frame including the background and objects, moving differently. The proposed algorithm involves four steps, such as the background completion, local motion inpainting, local warping, and seamless blending. Our experiments show that a necessity of a seamless stitching occurs often than a local warping step. Therefore, a seamless blending was investigated in details including four main categories, such as feathering-based, pyramid-based, gradient-based, and optimal seam-based blending.
Real-time image sequence segmentation using curve evolution
NASA Astrophysics Data System (ADS)
Zhang, Jun; Liu, Weisong
2001-04-01
In this paper, we describe a novel approach to image sequence segmentation and its real-time implementation. This approach uses the 3D structure tensor to produce a more robust frame difference signal and uses curve evolution to extract whole objects. Our algorithm is implemented on a standard PC running the Windows operating system with video capture from a USB camera that is a standard Windows video capture device. Using the Windows standard video I/O functionalities, our segmentation software is highly portable and easy to maintain and upgrade. In its current implementation on a Pentium 400, the system can perform segmentation at 5 frames/sec with a frame resolution of 160 by 120.
An unsupervised method for summarizing egocentric sport videos
NASA Astrophysics Data System (ADS)
Habibi Aghdam, Hamed; Jahani Heravi, Elnaz; Puig, Domenec
2015-12-01
People are getting more interested to record their sport activities using head-worn or hand-held cameras. This type of videos which is called egocentric sport videos has different motion and appearance patterns compared with life-logging videos. While a life-logging video can be defined in terms of well-defined human-object interactions, notwithstanding, it is not trivial to describe egocentric sport videos using well-defined activities. For this reason, summarizing egocentric sport videos based on human-object interaction might fail to produce meaningful results. In this paper, we propose an unsupervised method for summarizing egocentric videos by identifying the key-frames of the video. Our method utilizes both appearance and motion information and it automatically finds the number of the key-frames. Our blind user study on the new dataset collected from YouTube shows that in 93:5% cases, the users choose the proposed method as their first video summary choice. In addition, our method is within the top 2 choices of the users in 99% of studies.
Reconstructing Interlaced High-Dynamic-Range Video Using Joint Learning.
Inchang Choi; Seung-Hwan Baek; Kim, Min H
2017-11-01
For extending the dynamic range of video, it is a common practice to capture multiple frames sequentially with different exposures and combine them to extend the dynamic range of each video frame. However, this approach results in typical ghosting artifacts due to fast and complex motion in nature. As an alternative, video imaging with interlaced exposures has been introduced to extend the dynamic range. However, the interlaced approach has been hindered by jaggy artifacts and sensor noise, leading to concerns over image quality. In this paper, we propose a data-driven approach for jointly solving two specific problems of deinterlacing and denoising that arise in interlaced video imaging with different exposures. First, we solve the deinterlacing problem using joint dictionary learning via sparse coding. Since partial information of detail in differently exposed rows is often available via interlacing, we make use of the information to reconstruct details of the extended dynamic range from the interlaced video input. Second, we jointly solve the denoising problem by tailoring sparse coding to better handle additive noise in low-/high-exposure rows, and also adopt multiscale homography flow to temporal sequences for denoising. We anticipate that the proposed method will allow for concurrent capture of higher dynamic range video frames without suffering from ghosting artifacts. We demonstrate the advantages of our interlaced video imaging compared with the state-of-the-art high-dynamic-range video methods.
Indexed Captioned Searchable Videos: A Learning Companion for STEM Coursework
NASA Astrophysics Data System (ADS)
Tuna, Tayfun; Subhlok, Jaspal; Barker, Lecia; Shah, Shishir; Johnson, Olin; Hovey, Christopher
2017-02-01
Videos of classroom lectures have proven to be a popular and versatile learning resource. A key shortcoming of the lecture video format is accessing the content of interest hidden in a video. This work meets this challenge with an advanced video framework featuring topical indexing, search, and captioning (ICS videos). Standard optical character recognition (OCR) technology was enhanced with image transformations for extraction of text from video frames to support indexing and search. The images and text on video frames is analyzed to divide lecture videos into topical segments. The ICS video player integrates indexing, search, and captioning in video playback providing instant access to the content of interest. This video framework has been used by more than 70 courses in a variety of STEM disciplines and assessed by more than 4000 students. Results presented from the surveys demonstrate the value of the videos as a learning resource and the role played by videos in a students learning process. Survey results also establish the value of indexing and search features in a video platform for education. This paper reports on the development and evaluation of ICS videos framework and over 5 years of usage experience in several STEM courses.
A novel multiple description scalable coding scheme for mobile wireless video transmission
NASA Astrophysics Data System (ADS)
Zheng, Haifeng; Yu, Lun; Chen, Chang Wen
2005-03-01
We proposed in this paper a novel multiple description scalable coding (MDSC) scheme based on in-band motion compensation temporal filtering (IBMCTF) technique in order to achieve high video coding performance and robust video transmission. The input video sequence is first split into equal-sized groups of frames (GOFs). Within a GOF, each frame is hierarchically decomposed by discrete wavelet transform. Since there is a direct relationship between wavelet coefficients and what they represent in the image content after wavelet decomposition, we are able to reorganize the spatial orientation trees to generate multiple bit-streams and employed SPIHT algorithm to achieve high coding efficiency. We have shown that multiple bit-stream transmission is very effective in combating error propagation in both Internet video streaming and mobile wireless video. Furthermore, we adopt the IBMCTF scheme to remove the redundancy for inter-frames along the temporal direction using motion compensated temporal filtering, thus high coding performance and flexible scalability can be provided in this scheme. In order to make compressed video resilient to channel error and to guarantee robust video transmission over mobile wireless channels, we add redundancy to each bit-stream and apply error concealment strategy for lost motion vectors. Unlike traditional multiple description schemes, the integration of these techniques enable us to generate more than two bit-streams that may be more appropriate for multiple antenna transmission of compressed video. Simulate results on standard video sequences have shown that the proposed scheme provides flexible tradeoff between coding efficiency and error resilience.
Characterization, adaptive traffic shaping, and multiplexing of real-time MPEG II video
NASA Astrophysics Data System (ADS)
Agrawal, Sanjay; Barry, Charles F.; Binnai, Vinay; Kazovsky, Leonid G.
1997-01-01
We obtain network traffic model for real-time MPEG-II encoded digital video by analyzing video stream samples from real-time encoders from NUKO Information Systems. MPEG-II sample streams include a resolution intensive movie, City of Joy, an action intensive movie, Aliens, a luminance intensive (black and white) movie, Road To Utopia, and a chrominance intensive (color) movie, Dick Tracy. From our analysis we obtain a heuristic model for the encoded video traffic which uses a 15-stage Markov process to model the I,B,P frame sequences within a group of pictures (GOP). A jointly-correlated Gaussian process is used to model the individual frame sizes. Scene change arrivals are modeled according to a gamma process. Simulations show that our MPEG-II traffic model generates, I,B,P frame sequences and frame sizes that closely match the sample MPEG-II stream traffic characteristics as they relate to latency and buffer occupancy in network queues. To achieve high multiplexing efficiency we propose a traffic shaping scheme which sets preferred 1-frame generation times among a group of encoders so as to minimize the overall variation in total offered traffic while still allowing the individual encoders to react to scene changes. Simulations show that our scheme results in multiplexing gains of up to 10% enabling us to multiplex twenty 6 Mbps MPEG-II video streams instead of 18 streams over an ATM/SONET OC3 link without latency or cell loss penalty. This scheme is due for a patent.
Multi-frame knowledge based text enhancement for mobile phone captured videos
NASA Astrophysics Data System (ADS)
Ozarslan, Suleyman; Eren, P. Erhan
2014-02-01
In this study, we explore automated text recognition and enhancement using mobile phone captured videos of store receipts. We propose a method which includes Optical Character Resolution (OCR) enhanced by our proposed Row Based Multiple Frame Integration (RB-MFI), and Knowledge Based Correction (KBC) algorithms. In this method, first, the trained OCR engine is used for recognition; then, the RB-MFI algorithm is applied to the output of the OCR. The RB-MFI algorithm determines and combines the most accurate rows of the text outputs extracted by using OCR from multiple frames of the video. After RB-MFI, KBC algorithm is applied to these rows to correct erroneous characters. Results of the experiments show that the proposed video-based approach which includes the RB-MFI and the KBC algorithm increases the word character recognition rate to 95%, and the character recognition rate to 98%.
Evaluation of experimental UAV video change detection
NASA Astrophysics Data System (ADS)
Bartelsen, J.; Saur, G.; Teutsch, C.
2016-10-01
During the last ten years, the availability of images acquired from unmanned aerial vehicles (UAVs) has been continuously increasing due to the improvements and economic success of flight and sensor systems. From our point of view, reliable and automatic image-based change detection may contribute to overcoming several challenging problems in military reconnaissance, civil security, and disaster management. Changes within a scene can be caused by functional activities, i.e., footprints or skid marks, excavations, or humidity penetration; these might be recognizable in aerial images, but are almost overlooked when change detection is executed manually. With respect to the circumstances, these kinds of changes may be an indication of sabotage, terroristic activity, or threatening natural disasters. Although image-based change detection is possible from both ground and aerial perspectives, in this paper we primarily address the latter. We have applied an extended approach to change detection as described by Saur and Kruger,1 and Saur et al.2 and have built upon the ideas of Saur and Bartelsen.3 The commercial simulation environment Virtual Battle Space 3 (VBS3) is used to simulate aerial "before" and "after" image acquisition concerning flight path, weather conditions and objects within the scene and to obtain synthetic videos. Video frames, which depict the same part of the scene, including "before" and "after" changes and not necessarily from the same perspective, are registered pixel-wise against each other by a photogrammetric concept, which is based on a homography. The pixel-wise registration is used to apply an automatic difference analysis, which, to a limited extent, is able to suppress typical errors caused by imprecise frame registration, sensor noise, vegetation and especially parallax effects. The primary concern of this paper is to seriously evaluate the possibilities and limitations of our current approach for image-based change detection with respect to the flight path, viewpoint change and parametrization. Hence, based on synthetic "before" and "after" videos of a simulated scene, we estimated the precision and recall of automatically detected changes. In addition and based on our approach, we illustrate the results showing the change detection in short, but real video sequences. Future work will improve the photogrammetric approach for frame registration, and extensive real video material, capable of change detection, will be acquired.
Jin, Meihua; Jung, Ji-Young; Lee, Jung-Ryun
2016-10-12
With the arrival of the era of Internet of Things (IoT), Wi-Fi Direct is becoming an emerging wireless technology that allows one to communicate through a direct connection between the mobile devices anytime, anywhere. In Wi-Fi Direct-based IoT networks, all devices are categorized by group of owner (GO) and client. Since portability is emphasized in Wi-Fi Direct devices, it is essential to control the energy consumption of a device very efficiently. In order to avoid unnecessary power consumed by GO, Wi-Fi Direct standard defines two power-saving methods: Opportunistic and Notice of Absence (NoA) power-saving methods. In this paper, we suggest an algorithm to enhance the energy efficiency of Wi-Fi Direct power-saving, considering the characteristics of multimedia video traffic. Proposed algorithm utilizes the statistical distribution for the size of video frames and adjusts the lengths of awake intervals in a beacon interval dynamically. In addition, considering the inter-dependency among video frames, the proposed algorithm ensures that a video frame having high priority is transmitted with higher probability than other frames having low priority. Simulation results show that the proposed method outperforms the traditional NoA method in terms of average delay and energy efficiency.
Jin, Meihua; Jung, Ji-Young; Lee, Jung-Ryun
2016-01-01
With the arrival of the era of Internet of Things (IoT), Wi-Fi Direct is becoming an emerging wireless technology that allows one to communicate through a direct connection between the mobile devices anytime, anywhere. In Wi-Fi Direct-based IoT networks, all devices are categorized by group of owner (GO) and client. Since portability is emphasized in Wi-Fi Direct devices, it is essential to control the energy consumption of a device very efficiently. In order to avoid unnecessary power consumed by GO, Wi-Fi Direct standard defines two power-saving methods: Opportunistic and Notice of Absence (NoA) power-saving methods. In this paper, we suggest an algorithm to enhance the energy efficiency of Wi-Fi Direct power-saving, considering the characteristics of multimedia video traffic. Proposed algorithm utilizes the statistical distribution for the size of video frames and adjusts the lengths of awake intervals in a beacon interval dynamically. In addition, considering the inter-dependency among video frames, the proposed algorithm ensures that a video frame having high priority is transmitted with higher probability than other frames having low priority. Simulation results show that the proposed method outperforms the traditional NoA method in terms of average delay and energy efficiency. PMID:27754315
NASA Astrophysics Data System (ADS)
Huang, Feng; Sun, Lifeng; Zhong, Yuzhuo
2006-01-01
Robust transmission of live video over ad hoc wireless networks presents new challenges: high bandwidth requirements are coupled with delay constraints; even a single packet loss causes error propagation until a complete video frame is coded in the intra-mode; ad hoc wireless networks suffer from bursty packet losses that drastically degrade the viewing experience. Accordingly, we propose a novel UMD coder capable of quickly recovering from losses and ensuring continuous playout. It uses 'peg' frames to prevent error propagation in the High-Resolution (HR) description and improve the robustness of key frames. The Low-Resolution (LR) coder works independent of the HR one, but they can also help each other recover from losses. Like many UMD coders, our UMD coder is drift-free, disruption-tolerant and able to make good use of the asymmetric available bandwidths of multiple paths. The simulation results under different conditions show that the proposed UMD coder has the highest decoded quality and lowest probability of pause when compared with concurrent UMDC techniques. The coder also has a comparable decoded quality, lower startup delay and lower probability of pause than a state-of-the-art FEC-based scheme. To provide robustness for video multicast applications, we propose non-end-to-end UMDC-based video distribution over a multi-tree multicast network. The multiplicity of parents decorrelates losses and the non-end-to-end feature increases the throughput of UMDC video data. We deploy an application-level service of LR description reconstruction in some intermediate nodes of the LR multicast tree. The principle behind this is to reconstruct the disrupted LR frames by the correctly received HR frames. As a result, the viewing experience at the downstream nodes benefits from the protection reconstruction at the upstream nodes.
Development Of A Dynamic Radiographic Capability Using High-Speed Video
NASA Astrophysics Data System (ADS)
Bryant, Lawrence E.
1985-02-01
High-speed video equipment can be used to optically image up to 2,000 full frames per second or 12,000 partial frames per second. X-ray image intensifiers have historically been used to image radiographic images at 30 frames per second. By combining these two types of equipment, it is possible to perform dynamic x-ray imaging of up to 2,000 full frames per second. The technique has been demonstrated using conventional, industrial x-ray sources such as 150 Kv and 300 Kv constant potential x-ray generators, 2.5 MeV Van de Graaffs, and linear accelerators. A crude form of this high-speed radiographic imaging has been shown to be possible with a cobalt 60 source. Use of a maximum aperture lens makes best use of the available light output from the image intensifier. The x-ray image intensifier input and output fluors decay rapidly enough to allow the high frame rate imaging. Data are presented on the maximum possible video frame rates versus x-ray penetration of various thicknesses of aluminum and steel. Photographs illustrate typical radiographic setups using the high speed imaging method. Video recordings show several demonstrations of this technique with the played-back x-ray images slowed down up to 100 times as compared to the actual event speed. Typical applications include boiling type action of liquids in metal containers, compressor operation with visualization of crankshaft, connecting rod and piston movement and thermal battery operation. An interesting aspect of this technique combines both the optical and x-ray capabilities to observe an object or event with both external and internal details with one camera in a visual mode and the other camera in an x-ray mode. This allows both kinds of video images to appear side by side in a synchronized presentation.
Tuong, William; Larsen, Elizabeth R; Armstrong, April W
2014-04-01
This systematic review examines the effectiveness of videos in modifying health behaviors. We searched PubMed (1975-2012), PsycINFO (1975-2012), EMBASE (1975-2012), and CINAHL (1983-2012) for controlled clinical trials that examined the effectiveness of video interventions in changing health behaviors. Twenty-eight studies comprised of 12,703 subjects were included in the systematic review. Video interventions were variably effective for modifying health behaviors depending on the target behaviors to be influenced. Video interventions appear to be effective in breast self-examination, prostate cancer screening, sunscreen adherence, self-care in patients with heart failure, HIV testing, treatment adherence, and female condom use. However, videos have not shown to be effective in influencing addiction behaviors when they are not tailored. Compared to loss-framing, gain-framed messages may be more effective in promoting certain types of health behavior change. Also, video modeling may facilitate learning of new behaviors and can be an important consideration in future video interventions.
A study of video frame rate on the perception of moving imagery detail
NASA Technical Reports Server (NTRS)
Haines, Richard F.; Chuang, Sherry L.
1993-01-01
The rate at which each frame of color moving video imagery is displayed was varied in small steps to determine what is the minimal acceptable frame rate for life scientists viewing white rats within a small enclosure. Two, twenty five second-long scenes (slow and fast animal motions) were evaluated by nine NASA principal investigators and animal care technicians. The mean minimum acceptable frame rate across these subjects was 3.9 fps both for the slow and fast moving animal scenes. The highest single trial frame rate averaged across all subjects for the slow and the fast scene was 6.2 and 4.8, respectively. Further research is called for in which frame rate, image size, and color/gray scale depth are covaried during the same observation period.
Innovative Solution to Video Enhancement
NASA Technical Reports Server (NTRS)
2001-01-01
Through a licensing agreement, Intergraph Government Solutions adapted a technology originally developed at NASA's Marshall Space Flight Center for enhanced video imaging by developing its Video Analyst(TM) System. Marshall's scientists developed the Video Image Stabilization and Registration (VISAR) technology to help FBI agents analyze video footage of the deadly 1996 Olympic Summer Games bombing in Atlanta, Georgia. VISAR technology enhanced nighttime videotapes made with hand-held camcorders, revealing important details about the explosion. Intergraph's Video Analyst System is a simple, effective, and affordable tool for video enhancement and analysis. The benefits associated with the Video Analyst System include support of full-resolution digital video, frame-by-frame analysis, and the ability to store analog video in digital format. Up to 12 hours of digital video can be stored and maintained for reliable footage analysis. The system also includes state-of-the-art features such as stabilization, image enhancement, and convolution to help improve the visibility of subjects in the video without altering underlying footage. Adaptable to many uses, Intergraph#s Video Analyst System meets the stringent demands of the law enforcement industry in the areas of surveillance, crime scene footage, sting operations, and dash-mounted video cameras.
A Kinect based sign language recognition system using spatio-temporal features
NASA Astrophysics Data System (ADS)
Memiş, Abbas; Albayrak, Songül
2013-12-01
This paper presents a sign language recognition system that uses spatio-temporal features on RGB video images and depth maps for dynamic gestures of Turkish Sign Language. Proposed system uses motion differences and accumulation approach for temporal gesture analysis. Motion accumulation method, which is an effective method for temporal domain analysis of gestures, produces an accumulated motion image by combining differences of successive video frames. Then, 2D Discrete Cosine Transform (DCT) is applied to accumulated motion images and temporal domain features transformed into spatial domain. These processes are performed on both RGB images and depth maps separately. DCT coefficients that represent sign gestures are picked up via zigzag scanning and feature vectors are generated. In order to recognize sign gestures, K-Nearest Neighbor classifier with Manhattan distance is performed. Performance of the proposed sign language recognition system is evaluated on a sign database that contains 1002 isolated dynamic signs belongs to 111 words of Turkish Sign Language (TSL) in three different categories. Proposed sign language recognition system has promising success rates.
An improved KCF tracking algorithm based on multi-feature and multi-scale
NASA Astrophysics Data System (ADS)
Wu, Wei; Wang, Ding; Luo, Xin; Su, Yang; Tian, Weiye
2018-02-01
The purpose of visual tracking is to associate the target object in a continuous video frame. In recent years, the method based on the kernel correlation filter has become the research hotspot. However, the algorithm still has some problems such as video capture equipment fast jitter, tracking scale transformation. In order to improve the ability of scale transformation and feature description, this paper has carried an innovative algorithm based on the multi feature fusion and multi-scale transform. The experimental results show that our method solves the problem that the target model update when is blocked or its scale transforms. The accuracy of the evaluation (OPE) is 77.0%, 75.4% and the success rate is 69.7%, 66.4% on the VOT and OTB datasets. Compared with the optimal one of the existing target-based tracking algorithms, the accuracy of the algorithm is improved by 6.7% and 6.3% respectively. The success rates are improved by 13.7% and 14.2% respectively.
NASA Astrophysics Data System (ADS)
Liang, Yu-Li
Multimedia data is increasingly important in scientific discovery and people's daily lives. Content of massive multimedia is often diverse and noisy, and motion between frames is sometimes crucial in analyzing those data. Among all, still images and videos are commonly used formats. Images are compact in size but do not contain motion information. Videos record motion but are sometimes too big to be analyzed. Sequential images, which are a set of continuous images with low frame rate, stand out because they are smaller than videos and still maintain motion information. This thesis investigates features in different types of noisy sequential images, and the proposed solutions that intelligently combined multiple features to successfully retrieve visual information from on-line videos and cloudy satellite images. The first task is detecting supraglacial lakes above ice sheet in sequential satellite images. The dynamics of supraglacial lakes on the Greenland ice sheet deeply affect glacier movement, which is directly related to sea level rise and global environment change. Detecting lakes above ice is suffering from diverse image qualities and unexpected clouds. A new method is proposed to efficiently extract prominent lake candidates with irregular shapes, heterogeneous backgrounds, and in cloudy images. The proposed system fully automatize the procedure that track lakes with high accuracy. We further cooperated with geoscientists to examine the tracked lakes and found new scientific findings. The second one is detecting obscene content in on-line video chat services, such as Chatroulette, that randomly match pairs of users in video chat sessions. A big problem encountered in such systems is the presence of flashers and obscene content. Because of various obscene content and unstable qualities of videos capture by home web-camera, detecting misbehaving users is a highly challenging task. We propose SafeVchat, which is the first solution that achieves satisfactory detection rate by using facial features and skin color model. To harness all the features in the scene, we further developed another system using multiple types of local descriptors along with Bag-of-Visual Word framework. In addition, an investigation of new contour feature in detecting obscene content is presented.
Dynamic frame resizing with convolutional neural network for efficient video compression
NASA Astrophysics Data System (ADS)
Kim, Jaehwan; Park, Youngo; Choi, Kwang Pyo; Lee, JongSeok; Jeon, Sunyoung; Park, JeongHoon
2017-09-01
In the past, video codecs such as vc-1 and H.263 used a technique to encode reduced-resolution video and restore original resolution from the decoder for improvement of coding efficiency. The techniques of vc-1 and H.263 Annex Q are called dynamic frame resizing and reduced-resolution update mode, respectively. However, these techniques have not been widely used due to limited performance improvements that operate well only under specific conditions. In this paper, video frame resizing (reduced/restore) technique based on machine learning is proposed for improvement of coding efficiency. The proposed method features video of low resolution made by convolutional neural network (CNN) in encoder and reconstruction of original resolution using CNN in decoder. The proposed method shows improved subjective performance over all the high resolution videos which are dominantly consumed recently. In order to assess subjective quality of the proposed method, Video Multi-method Assessment Fusion (VMAF) which showed high reliability among many subjective measurement tools was used as subjective metric. Moreover, to assess general performance, diverse bitrates are tested. Experimental results showed that BD-rate based on VMAF was improved by about 51% compare to conventional HEVC. Especially, VMAF values were significantly improved in low bitrate. Also, when the method is subjectively tested, it had better subjective visual quality in similar bit rate.
NASA Astrophysics Data System (ADS)
Zhang, Chao; Zhang, Qian; Zheng, Chi; Qiu, Guoping
2018-04-01
Video foreground segmentation is one of the key problems in video processing. In this paper, we proposed a novel and fully unsupervised approach for foreground object co-localization and segmentation of unconstrained videos. We firstly compute both the actual edges and motion boundaries of the video frames, and then align them by their HOG feature maps. Then, by filling the occlusions generated by the aligned edges, we obtained more precise masks about the foreground object. Such motion-based masks could be derived as the motion-based likelihood. Moreover, the color-base likelihood is adopted for the segmentation process. Experimental Results show that our approach outperforms most of the State-of-the-art algorithms.
Shaking video stabilization with content completion
NASA Astrophysics Data System (ADS)
Peng, Yi; Ye, Qixiang; Liu, Yanmei; Jiao, Jianbin
2009-01-01
A new stabilization algorithm to counterbalance the shaking motion in a video based on classical Kandade-Lucas- Tomasi (KLT) method is presented in this paper. Feature points are evaluated with law of large numbers and clustering algorithm to reduce the side effect of moving foreground. Analysis on the change of motion direction is also carried out to detect the existence of shaking. For video clips with detected shaking, an affine transformation is performed to warp the current frame to the reference one. In addition, the missing content of a frame during the stabilization is completed with optical flow analysis and mosaicking operation. Experiments on video clips demonstrate the effectiveness of the proposed algorithm.
Video Image Stabilization and Registration (VISAR) Software
NASA Technical Reports Server (NTRS)
1999-01-01
Two scientists at NASA's Marshall Space Flight Center,atmospheric scientist Paul Meyer and solar physicist Dr. David Hathaway, developed promising new software, called Video Image Stabilization and Registration (VISAR). VISAR may help law enforcement agencies catch criminals by improving the quality of video recorded at crime scenes. In this photograph, the single frame at left, taken at night, was brightened in order to enhance details and reduce noise or snow. To further overcome the video defects in one frame, Law enforcement officials can use VISAR software to add information from multiple frames to reveal a person. Images from less than a second of videotape were added together to create the clarified image at right. VISAR stabilizes camera motion in the horizontal and vertical as well as rotation and zoom effects producing clearer images of moving objects, smoothes jagged edges, enhances still images, and reduces video noise or snow. VISAR could also have applications in medical and meteorological imaging. It could steady images of ultrasounds, which are infamous for their grainy, blurred quality. The software can be used for defense application by improving recornaissance video imagery made by military vehicles, aircraft, and ships traveling in harsh, rugged environments.
Video Coding and the Application Level Framing Protocol Architecture
1992-06-01
missing ADU can be sent to the decoder when and if it arrives. The need for out-of- order processing arises for two reasons. First, ADUs may be reordered...by the network. Second, an ADU which is lost and then successfully retransmitted will arrive out of order. In either case, out-of- order processing makes...the code do not allow at least some out-of- order processing , one of the strong points of the ALF architecture is eliminated. 2.3.4 Header Data
NASA Technical Reports Server (NTRS)
Johnston, James C.; Rosenthal, Bruce N.; Bonner, Mary JO; Hahn, Richard C.; Herbach, Bruce
1989-01-01
A series of ground-based telepresence experiments have been performed to determine the minimum video frame rate and resolution required for the successive performance of materials science experiments in space. The approach used is to simulate transmission between earth and space station with transmission between laboratories on earth. The experiments include isothermal dendrite growth, physical vapor transport, and glass melting. Modifications of existing apparatus, software developed, and the establishment of an inhouse network are reviewed.
A combined stereo-photogrammetry and underwater-video system to study group composition of dolphins
NASA Astrophysics Data System (ADS)
Bräger, S.; Chong, A.; Dawson, S.; Slooten, E.; Würsig, B.
1999-11-01
One reason for the paucity of knowledge of dolphin social structure is the difficulty of measuring individual dolphins. In Hector's dolphins, Cephalorhynchus hectori, total body length is a function of age, and sex can be determined by individual colouration pattern. We developed a novel system combining stereo-photogrammetry and underwater-video to record dolphin group composition. The system consists of two downward-looking single-lens-reflex (SLR) cameras and a Hi8 video camera in an underwater housing mounted on a small boat. Bow-riding Hector's dolphins were photographed and video-taped at close range in coastal waters around the South Island of New Zealand. Three-dimensional, stereoscopic measurements of the distance between the blowhole and the anterior margin of the dorsal fin (BH-DF) were calibrated by a suspended frame with reference points. Growth functions derived from measurements of 53 dead Hector's dolphins (29 female : 24 male) provided the necessary reference data. For the analysis, the measurements were synchronised with corresponding underwater-video of the genital area. A total of 27 successful measurements (8 with corresponding sex) were obtained, showing how this new system promises to be potentially useful for cetacean studies.
Guo, Yang-Yang; He, Dong-Jian; Liu, Cong
2018-06-25
Insect behaviour is an important research topic in plant protection. To study insect behaviour accurately, it is necessary to observe and record their flight trajectory quantitatively and precisely in three dimensions (3D). The goal of this research was to analyse frames extracted from videos using Kernelized Correlation Filters (KCF) and Background Subtraction (BS) (KCF-BS) to plot the 3D trajectory of cabbage butterfly (P. rapae). Considering the experimental environment with a wind tunnel, a quadrature binocular vision insect video capture system was designed and applied in this study. The KCF-BS algorithm was used to track the butterfly in video frames and obtain coordinates of the target centroid in two videos. Finally the 3D trajectory was calculated according to the matching relationship in the corresponding frames of two angles in the video. To verify the validity of the KCF-BS algorithm, Compressive Tracking (CT) and Spatio-Temporal Context Learning (STC) algorithms were performed. The results revealed that the KCF-BS tracking algorithm performed more favourably than CT and STC in terms of accuracy and robustness.
MPCM: a hardware coder for super slow motion video sequences
NASA Astrophysics Data System (ADS)
Alcocer, Estefanía; López-Granado, Otoniel; Gutierrez, Roberto; Malumbres, Manuel P.
2013-12-01
In the last decade, the improvements in VLSI levels and image sensor technologies have led to a frenetic rush to provide image sensors with higher resolutions and faster frame rates. As a result, video devices were designed to capture real-time video at high-resolution formats with frame rates reaching 1,000 fps and beyond. These ultrahigh-speed video cameras are widely used in scientific and industrial applications, such as car crash tests, combustion research, materials research and testing, fluid dynamics, and flow visualization that demand real-time video capturing at extremely high frame rates with high-definition formats. Therefore, data storage capability, communication bandwidth, processing time, and power consumption are critical parameters that should be carefully considered in their design. In this paper, we propose a fast FPGA implementation of a simple codec called modulo-pulse code modulation (MPCM) which is able to reduce the bandwidth requirements up to 1.7 times at the same image quality when compared with PCM coding. This allows current high-speed cameras to capture in a continuous manner through a 40-Gbit Ethernet point-to-point access.
Automated video-based assessment of surgical skills for training and evaluation in medical schools.
Zia, Aneeq; Sharma, Yachna; Bettadapura, Vinay; Sarin, Eric L; Ploetz, Thomas; Clements, Mark A; Essa, Irfan
2016-09-01
Routine evaluation of basic surgical skills in medical schools requires considerable time and effort from supervising faculty. For each surgical trainee, a supervisor has to observe the trainees in person. Alternatively, supervisors may use training videos, which reduces some of the logistical overhead. All these approaches however are still incredibly time consuming and involve human bias. In this paper, we present an automated system for surgical skills assessment by analyzing video data of surgical activities. We compare different techniques for video-based surgical skill evaluation. We use techniques that capture the motion information at a coarser granularity using symbols or words, extract motion dynamics using textural patterns in a frame kernel matrix, and analyze fine-grained motion information using frequency analysis. We were successfully able to classify surgeons into different skill levels with high accuracy. Our results indicate that fine-grained analysis of motion dynamics via frequency analysis is most effective in capturing the skill relevant information in surgical videos. Our evaluations show that frequency features perform better than motion texture features, which in-turn perform better than symbol-/word-based features. Put succinctly, skill classification accuracy is positively correlated with motion granularity as demonstrated by our results on two challenging video datasets.
Two schemes for rapid generation of digital video holograms using PC cluster
NASA Astrophysics Data System (ADS)
Park, Hanhoon; Song, Joongseok; Kim, Changseob; Park, Jong-Il
2017-12-01
Computer-generated holography (CGH), which is a process of generating digital holograms, is computationally expensive. Recently, several methods/systems of parallelizing the process using graphic processing units (GPUs) have been proposed. Indeed, use of multiple GPUs or a personal computer (PC) cluster (each PC with GPUs) enabled great improvements in the process speed. However, extant literature has less often explored systems involving rapid generation of multiple digital holograms and specialized systems for rapid generation of a digital video hologram. This study proposes a system that uses a PC cluster and is able to more efficiently generate a video hologram. The proposed system is designed to simultaneously generate multiple frames and accelerate the generation by parallelizing the CGH computations across a number of frames, as opposed to separately generating each individual frame while parallelizing the CGH computations within each frame. The proposed system also enables the subprocesses for generating each frame to execute in parallel through multithreading. With these two schemes, the proposed system significantly reduced the data communication time for generating a digital hologram when compared with that of the state-of-the-art system.
Dikbas, Salih; Altunbasak, Yucel
2013-08-01
In this paper, a new low-complexity true-motion estimation (TME) algorithm is proposed for video processing applications, such as motion-compensated temporal frame interpolation (MCTFI) or motion-compensated frame rate up-conversion (MCFRUC). Regular motion estimation, which is often used in video coding, aims to find the motion vectors (MVs) to reduce the temporal redundancy, whereas TME aims to track the projected object motion as closely as possible. TME is obtained by imposing implicit and/or explicit smoothness constraints on the block-matching algorithm. To produce better quality-interpolated frames, the dense motion field at interpolation time is obtained for both forward and backward MVs; then, bidirectional motion compensation using forward and backward MVs is applied by mixing both elegantly. Finally, the performance of the proposed algorithm for MCTFI is demonstrated against recently proposed methods and smoothness constraint optical flow employed by a professional video production suite. Experimental results show that the quality of the interpolated frames using the proposed method is better when compared with the MCFRUC techniques.
Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder.
Song, Jingkuan; Zhang, Hanwang; Li, Xiangpeng; Gao, Lianli; Wang, Meng; Hong, Richang
2018-07-01
Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization, which have not adequately explored the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework dubbed self-supervised video hashing (SSVH), which is able to capture the temporal nature of videos in an end-to-end learning to hash fashion. We specifically address two central problems: 1) how to design an encoder-decoder architecture to generate binary codes for videos and 2) how to equip the binary codes with the ability of accurate video retrieval. We design a hierarchical binary auto-encoder to model the temporal dependencies in videos with multiple granularities, and embed the videos into binary codes with less computations than the stacked architecture. Then, we encourage the binary codes to simultaneously reconstruct the visual content and neighborhood structure of the videos. Experiments on two real-world data sets show that our SSVH method can significantly outperform the state-of-the-art methods and achieve the current best performance on the task of unsupervised video retrieval.
Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder
NASA Astrophysics Data System (ADS)
Song, Jingkuan; Zhang, Hanwang; Li, Xiangpeng; Gao, Lianli; Wang, Meng; Hong, Richang
2018-07-01
Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization, which have not adequately explored the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework dubbed Self-Supervised Video Hashing (SSVH), that is able to capture the temporal nature of videos in an end-to-end learning-to-hash fashion. We specifically address two central problems: 1) how to design an encoder-decoder architecture to generate binary codes for videos; and 2) how to equip the binary codes with the ability of accurate video retrieval. We design a hierarchical binary autoencoder to model the temporal dependencies in videos with multiple granularities, and embed the videos into binary codes with less computations than the stacked architecture. Then, we encourage the binary codes to simultaneously reconstruct the visual content and neighborhood structure of the videos. Experiments on two real-world datasets (FCVID and YFCC) show that our SSVH method can significantly outperform the state-of-the-art methods and achieve the currently best performance on the task of unsupervised video retrieval.
Resnick, H; Acierno, R; Holmes, M; Kilpatrick, D G; Jager, N
1999-01-01
Violent sexual assault such as rape typically results in extremely high levels of acute distress. The intensity of these acute psychological reactions may play a role in later recovery, with higher levels of immediate distress associated with poorer outcome. Unfortunately, post-rape forensic evidence collection procedures may serve to increase, rather than reduce initial distress, potentially exacerbating future psychopathology. To address these concerns, an acute time-frame hospital-based video intervention was developed to: (a) minimize anxiety during forensic rape exams, and (b) prevent post-rape posttraumatic stress disorder (PTSD), panic, and anxiety. Preliminary data indicated that (1) psychological distress at the time of the exam was strongly related to PTSD symptomatology 6 weeks post-rape, and (2) the video intervention successfully reduced distress during forensic exams.
NASA Astrophysics Data System (ADS)
Bartolini, Franco; Pasquini, Cristina; Piva, Alessandro
2001-04-01
The recent development of video compression algorithms allowed the diffusion of systems for the transmission of video sequences over data networks. However, the transmission over error prone mobile communication channels is yet an open issue. In this paper, a system developed for the real time transmission of H263 video coded sequences over TETRA mobile networks is presented. TETRA is an open digital trunked radio standard defined by the European Telecommunications Standardization Institute developed for professional mobile radio users, providing full integration of voice and data services. Experimental tests demonstrate that, in spite of the low frame rate allowed by the SW only implementation of the decoder and by the low channel rate a video compression technique such as that complying with the H263 standard, is still preferable to a simpler but less effective frame based compression system.
NASA Astrophysics Data System (ADS)
Aishwariya, A.; Pallavi Sudhir, Gulavani; Garg, Nemesa; Karthikeyan, B.
2017-11-01
A body worn camera is small video camera worn on the body, typically used by police officers to record arrests, evidence from crime scenes. It helps preventing and resolving complaints brought by members of the public; and strengthening police transparency, performance, and accountability. The main constants of this type of the system are video format, resolution, frames rate, and audio quality. This system records the video in .mp4 format with 1080p resolution and 30 frames per second. One more important aspect to while designing this system is amount of power the system requires as battery management becomes very critical. The main design challenges are Size of the Video, Audio for the video. Combining both audio and video and saving it in .mp4 format, Battery, size that is required for 8 hours of continuous recording, Security. For prototyping this system is implemented using Raspberry Pi model B.
Magnetic Braking: A Video Analysis
NASA Astrophysics Data System (ADS)
Molina-Bolívar, J. A.; Abella-Palacios, A. J.
2012-10-01
This paper presents a laboratory exercise that introduces students to the use of video analysis software and the Lenz's law demonstration. Digital techniques have proved to be very useful for the understanding of physical concepts. In particular, the availability of affordable digital video offers students the opportunity to actively engage in kinematics in introductory-level physics.1,2 By using digital videos frame advance features and "marking" the position of a moving object in each frame, students are able to more precisely determine the position of an object at much smaller time increments than would be possible with common time devices. Once the student collects data consisting of positions and times, these values may be manipulated to determine velocity and acceleration. There are a variety of commercial and free applications that can be used for video analysis. Because the relevant technology has become inexpensive, video analysis has become a prevalent tool in introductory physics courses.
Online tracking of outdoor lighting variations for augmented reality with moving cameras.
Liu, Yanli; Granier, Xavier
2012-04-01
In augmented reality, one of key tasks to achieve a convincing visual appearance consistency between virtual objects and video scenes is to have a coherent illumination along the whole sequence. As outdoor illumination is largely dependent on the weather, the lighting condition may change from frame to frame. In this paper, we propose a full image-based approach for online tracking of outdoor illumination variations from videos captured with moving cameras. Our key idea is to estimate the relative intensities of sunlight and skylight via a sparse set of planar feature-points extracted from each frame. To address the inevitable feature misalignments, a set of constraints are introduced to select the most reliable ones. Exploiting the spatial and temporal coherence of illumination, the relative intensities of sunlight and skylight are finally estimated by using an optimization process. We validate our technique on a set of real-life videos and show that the results with our estimations are visually coherent along the video sequences.
A Macintosh-Based Scientific Images Video Analysis System
NASA Technical Reports Server (NTRS)
Groleau, Nicolas; Friedland, Peter (Technical Monitor)
1994-01-01
A set of experiments was designed at MIT's Man-Vehicle Laboratory in order to evaluate the effects of zero gravity on the human orientation system. During many of these experiments, the movements of the eyes are recorded on high quality video cassettes. The images must be analyzed off-line to calculate the position of the eyes at every moment in time. To this aim, I have implemented a simple inexpensive computerized system which measures the angle of rotation of the eye from digitized video images. The system is implemented on a desktop Macintosh computer, processes one play-back frame per second and exhibits adequate levels of accuracy and precision. The system uses LabVIEW, a digital output board, and a video input board to control a VCR, digitize video images, analyze them, and provide a user friendly interface for the various phases of the process. The system uses the Concept Vi LabVIEW library (Graftek's Image, Meudon la Foret, France) for image grabbing and displaying as well as translation to and from LabVIEW arrays. Graftek's software layer drives an Image Grabber board from Neotech (Eastleigh, United Kingdom). A Colour Adapter box from Neotech provides adequate video signal synchronization. The system also requires a LabVIEW driven digital output board (MacADIOS II from GW Instruments, Cambridge, MA) controlling a slightly modified VCR remote control used mainly to advance the video tape frame by frame.
Feathering effect detection and artifact agglomeration index-based video deinterlacing technique
NASA Astrophysics Data System (ADS)
Martins, André Luis; Rodrigues, Evandro Luis Linhari; de Paiva, Maria Stela Veludo
2018-03-01
Several video deinterlacing techniques have been developed, and each one presents a better performance in certain conditions. Occasionally, even the most modern deinterlacing techniques create frames with worse quality than primitive deinterlacing processes. This paper validates that the final image quality can be improved by combining different types of deinterlacing techniques. The proposed strategy is able to select between two types of deinterlaced frames and, if necessary, make the local correction of the defects. This decision is based on an artifact agglomeration index obtained from a feathering effect detection map. Starting from a deinterlaced frame produced by the "interfield average" method, the defective areas are identified, and, if deemed appropriate, these areas are replaced by pixels generated through the "edge-based line average" method. Test results have proven that the proposed technique is able to produce video frames with higher quality than applying a single deinterlacing technique through getting what is good from intra- and interfield methods.
Stereo and IMU-Assisted Visual Odometry for Small Robots
NASA Technical Reports Server (NTRS)
2012-01-01
This software performs two functions: (1) taking stereo image pairs as input, it computes stereo disparity maps from them by cross-correlation to achieve 3D (three-dimensional) perception; (2) taking a sequence of stereo image pairs as input, it tracks features in the image sequence to estimate the motion of the cameras between successive image pairs. A real-time stereo vision system with IMU (inertial measurement unit)-assisted visual odometry was implemented on a single 750 MHz/520 MHz OMAP3530 SoC (system on chip) from TI (Texas Instruments). Frame rates of 46 fps (frames per second) were achieved at QVGA (Quarter Video Graphics Array i.e. 320 240), or 8 fps at VGA (Video Graphics Array 640 480) resolutions, while simultaneously tracking up to 200 features, taking full advantage of the OMAP3530's integer DSP (digital signal processor) and floating point ARM processors. This is a substantial advancement over previous work as the stereo implementation produces 146 Mde/s (millions of disparities evaluated per second) in 2.5W, yielding a stereo energy efficiency of 58.8 Mde/J, which is 3.75 better than prior DSP stereo while providing more functionality.
ERIC Educational Resources Information Center
Koleza, Eugenia; Pappas, John
2008-01-01
In this article, we present the results of a qualitative research project on the effect of motion analysis activities in a Video-Based Laboratory (VBL) on students' understanding of position, velocity and frames of reference. The participants in our research were 48 pre-service teachers enrolled in Education Departments with no previous strong…
Bandwidth characteristics of multimedia data traffic on a local area network
NASA Technical Reports Server (NTRS)
Chuang, Shery L.; Doubek, Sharon; Haines, Richard F.
1993-01-01
Limited spacecraft communication links call for users to investigate the potential use of video compression and multimedia technologies to optimize bandwidth allocations. The objective was to determine the transmission characteristics of multimedia data - motion video, text or bitmap graphics, and files transmitted independently and simultaneously over an ethernet local area network. Commercial desktop video teleconferencing hardware and software and Intel's proprietary Digital Video Interactive (DVI) video compression algorithm were used, and typical task scenarios were selected. The transmission time, packet size, number of packets, and network utilization of the data were recorded. Each data type - compressed motion video, text and/or bitmapped graphics, and a compressed image file - was first transmitted independently and its characteristics recorded. The results showed that an average bandwidth of 7.4 kilobits per second (kbps) was used to transmit graphics; an average bandwidth of 86.8 kbps was used to transmit an 18.9-kilobyte (kB) image file; a bandwidth of 728.9 kbps was used to transmit compressed motion video at 15 frames per second (fps); and a bandwidth of 75.9 kbps was used to transmit compressed motion video at 1.5 fps. Average packet sizes were 933 bytes for graphics, 498.5 bytes for the image file, 345.8 bytes for motion video at 15 fps, and 341.9 bytes for motion video at 1.5 fps. Simultaneous transmission of multimedia data types was also characterized. The multimedia packets used transmission bandwidths of 341.4 kbps and 105.8kbps. Bandwidth utilization varied according to the frame rate (frames per second) setting for the transmission of motion video. Packet size did not vary significantly between the data types. When these characteristics are applied to Space Station Freedom (SSF), the packet sizes fall within the maximum specified by the Consultative Committee for Space Data Systems (CCSDS). The uplink of imagery to SSF may be performed at minimal frame rates and/or within seconds of delay, depending on the user's allocated bandwidth. Further research to identify the acceptable delay interval and its impact on human performance is required. Additional studies in network performance using various video compression algorithms and integrated multimedia techniques are needed to determine the optimal design approach for utilizing SSF's data communications system.
NASA Astrophysics Data System (ADS)
Jeong, Mira; Nam, Jae-Yeal; Ko, Byoung Chul
2017-09-01
In this paper, we focus on pupil center detection in various video sequences that include head poses and changes in illumination. To detect the pupil center, we first find four eye landmarks in each eye by using cascade local regression based on a regression forest. Based on the rough location of the pupil, a fast radial symmetric transform is applied using the previously found pupil location to rearrange the fine pupil center. As the final step, the pupil displacement is estimated between the previous frame and the current frame to maintain the level of accuracy against a false locating result occurring in a particular frame. We generated a new face dataset, called Keimyung University pupil detection (KMUPD), with infrared camera. The proposed method was successfully applied to the KMUPD dataset, and the results indicate that its pupil center detection capability is better than that of other methods and with a shorter processing time.
Informative frame detection from wireless capsule video endoscopic images
NASA Astrophysics Data System (ADS)
Bashar, Md. Khayrul; Mori, Kensaku; Suenaga, Yasuhito; Kitasaka, Takayuki; Mekada, Yoshito
2008-03-01
Wireless capsule endoscopy (WCE) is a new clinical technology permitting the visualization of the small bowel, the most difficult segment of the digestive tract. The major drawback of this technology is the high amount of time for video diagnosis. In this study, we propose a method for informative frame detection by isolating useless frames that are substantially covered by turbid fluids or their contamination with other materials, e.g., faecal, semi-processed or unabsorbed foods etc. Such materials and fluids present a wide range of colors, from brown to yellow, and/or bubble-like texture patterns. The detection scheme, therefore, consists of two stages: highly contaminated non-bubbled (HCN) frame detection and significantly bubbled (SB) frame detection. Local color moments in the Ohta color space are used to characterize HCN frames, which are isolated by the Support Vector Machine (SVM) classifier in Stage-1. The rest of the frames go to the Stage-2, where Laguerre gauss Circular Harmonic Functions (LG-CHFs) extract the characteristics of the bubble-structures in a multi-resolution framework. An automatic segmentation method is designed to extract the bubbled regions based on local absolute energies of the CHF responses, derived from the grayscale version of the original color image. Final detection of the informative frames is obtained by using threshold operation on the extracted regions. An experiment with 20,558 frames from the three videos shows the excellent average detection accuracy (96.75%) by the proposed method, when compared with the Gabor based- (74.29%) and discrete wavelet based features (62.21%).
NASA Astrophysics Data System (ADS)
Manjanaik, N.; Parameshachari, B. D.; Hanumanthappa, S. N.; Banu, Reshma
2017-08-01
Intra prediction process of H.264 video coding standard used to code first frame i.e. Intra frame of video to obtain good coding efficiency compare to previous video coding standard series. More benefit of intra frame coding is to reduce spatial pixel redundancy with in current frame, reduces computational complexity and provides better rate distortion performance. To code Intra frame it use existing process Rate Distortion Optimization (RDO) method. This method increases computational complexity, increases in bit rate and reduces picture quality so it is difficult to implement in real time applications, so the many researcher has been developed fast mode decision algorithm for coding of intra frame. The previous work carried on Intra frame coding in H.264 standard using fast decision mode intra prediction algorithm based on different techniques was achieved increased in bit rate, degradation of picture quality(PSNR) for different quantization parameters. Many previous approaches of fast mode decision algorithms on intra frame coding achieved only reduction of computational complexity or it save encoding time and limitation was increase in bit rate with loss of quality of picture. In order to avoid increase in bit rate and loss of picture quality a better approach was developed. In this paper developed a better approach i.e. Gaussian pulse for Intra frame coding using diagonal down left intra prediction mode to achieve higher coding efficiency in terms of PSNR and bitrate. In proposed method Gaussian pulse is multiplied with each 4x4 frequency domain coefficients of 4x4 sub macro block of macro block of current frame before quantization process. Multiplication of Gaussian pulse for each 4x4 integer transformed coefficients at macro block levels scales the information of the coefficients in a reversible manner. The resulting signal would turn abstract. Frequency samples are abstract in a known and controllable manner without intermixing of coefficients, it avoids picture getting bad hit for higher values of quantization parameters. The proposed work was implemented using MATLAB and JM 18.6 reference software. The proposed work measure the performance parameters PSNR, bit rate and compression of intra frame of yuv video sequences in QCIF resolution under different values of quantization parameter with Gaussian value for diagonal down left intra prediction mode. The simulation results of proposed algorithm are tabulated and compared with previous algorithm i.e. Tian et al method. The proposed algorithm achieved reduced in bit rate averagely 30.98% and maintain consistent picture quality for QCIF sequences compared to previous algorithm i.e. Tian et al method.
Farnbacher, Michael J; Krause, Horst H; Hagel, Alexander F; Raithel, Martin; Neurath, Markus F; Schneider, Thomas
2014-03-01
OBJECTIVE. Colon capsule endoscopy (CCE) proved to be highly sensitive in detection of colorectal polyps (CP). Major limitation is the time-consuming video reading. The aim of this prospective, double-center study was to assess the theoretical time-saving potential and its possible impact on the reliability of "QuickView" (QV), in the presentation of CP as compared to normal mode (NM). METHODS. During NM reading of 65 CCE videos (mean patient´s age 56 years), all frames showing CPs were collected and compared to the number of frames presented by QV at increasing QV settings (10, 20, ... 80%). Reliability of QV in presenting polyps <6 mm and ≥6 mm (significant polyp), and identifying patients for subsequent therapeutic colonoscopy, capsule egestion rate, cleansing level, and estimated time-saving potential were assessed. RESULTS. At a 30% QV setting, the QV video presented 89% of the significant polyps and 86% of any polyps with ≥1 frame (per-polyp analysis) identified in NM before. At a 10% QV setting, 98% of the 52 patients with significant polyps could be identified (per-patient analysis) by QV video analysis. Capsule excretion rate was 74% and colon cleanliness was adequate in 85%. QV´s presentation rate correlates to the QV setting, the polyp size, and the number of frames per finding. CONCLUSIONS. Depending on its setting, the reliability of QV in presenting CP as compared to NM reading is notable. However, if no significant polyp is presented by QV, NM reading must be performed afterwards. The reduction of frames to be analyzed in QV might speed up identification of candidates for therapeutic colonoscopy.
NASA Astrophysics Data System (ADS)
Tornow, Ralf P.; Milczarek, Aleksandra; Odstrcilik, Jan; Kolar, Radim
2017-07-01
A parallel video ophthalmoscope was developed to acquire short video sequences (25 fps, 250 frames) of both eyes simultaneously with exact synchronization. Video sequences were registered off-line to compensate for eye movements. From registered video sequences dynamic parameters like cardiac cycle induced reflection changes and eye movements can be calculated and compared between eyes.
Query by example video based on fuzzy c-means initialized by fixed clustering center
NASA Astrophysics Data System (ADS)
Hou, Sujuan; Zhou, Shangbo; Siddique, Muhammad Abubakar
2012-04-01
Currently, the high complexity of video contents has posed the following major challenges for fast retrieval: (1) efficient similarity measurements, and (2) efficient indexing on the compact representations. A video-retrieval strategy based on fuzzy c-means (FCM) is presented for querying by example. Initially, the query video is segmented and represented by a set of shots, each shot can be represented by a key frame, and then we used video processing techniques to find visual cues to represent the key frame. Next, because the FCM algorithm is sensitive to the initializations, here we initialized the cluster center by the shots of query video so that users could achieve appropriate convergence. After an FCM cluster was initialized by the query video, each shot of query video was considered a benchmark point in the aforesaid cluster, and each shot in the database possessed a class label. The similarity between the shots in the database with the same class label and benchmark point can be transformed into the distance between them. Finally, the similarity between the query video and the video in database was transformed into the number of similar shots. Our experimental results demonstrated the performance of this proposed approach.
Macias, Elsa; Lloret, Jaime; Suarez, Alvaro; Garcia, Miguel
2012-01-01
Current mobile phones come with several sensors and powerful video cameras. These video cameras can be used to capture good quality scenes, which can be complemented with the information gathered by the sensors also embedded in the phones. For example, the surroundings of a beach recorded by the camera of the mobile phone, jointly with the temperature of the site can let users know via the Internet if the weather is nice enough to swim. In this paper, we present a system that tags the video frames of the video recorded from mobile phones with the data collected by the embedded sensors. The tagged video is uploaded to a video server, which is placed on the Internet and is accessible by any user. The proposed system uses a semantic approach with the stored information in order to make easy and efficient video searches. Our experimental results show that it is possible to tag video frames in real time and send the tagged video to the server with very low packet delay variations. As far as we know there is not any other application developed as the one presented in this paper. PMID:22438753
Macias, Elsa; Lloret, Jaime; Suarez, Alvaro; Garcia, Miguel
2012-01-01
Current mobile phones come with several sensors and powerful video cameras. These video cameras can be used to capture good quality scenes, which can be complemented with the information gathered by the sensors also embedded in the phones. For example, the surroundings of a beach recorded by the camera of the mobile phone, jointly with the temperature of the site can let users know via the Internet if the weather is nice enough to swim. In this paper, we present a system that tags the video frames of the video recorded from mobile phones with the data collected by the embedded sensors. The tagged video is uploaded to a video server, which is placed on the Internet and is accessible by any user. The proposed system uses a semantic approach with the stored information in order to make easy and efficient video searches. Our experimental results show that it is possible to tag video frames in real time and send the tagged video to the server with very low packet delay variations. As far as we know there is not any other application developed as the one presented in this paper.
1999-06-01
Two scientists at NASA's Marshall Space Flight Center,atmospheric scientist Paul Meyer and solar physicist Dr. David Hathaway, developed promising new software, called Video Image Stabilization and Registration (VISAR). VISAR may help law enforcement agencies catch criminals by improving the quality of video recorded at crime scenes. In this photograph, the single frame at left, taken at night, was brightened in order to enhance details and reduce noise or snow. To further overcome the video defects in one frame, Law enforcement officials can use VISAR software to add information from multiple frames to reveal a person. Images from less than a second of videotape were added together to create the clarified image at right. VISAR stabilizes camera motion in the horizontal and vertical as well as rotation and zoom effects producing clearer images of moving objects, smoothes jagged edges, enhances still images, and reduces video noise or snow. VISAR could also have applications in medical and meteorological imaging. It could steady images of ultrasounds, which are infamous for their grainy, blurred quality. The software can be used for defense application by improving recornaissance video imagery made by military vehicles, aircraft, and ships traveling in harsh, rugged environments.
Fast Video Encryption Using the H.264 Error Propagation Property for Smart Mobile Devices
Chung, Yongwha; Lee, Sungju; Jeon, Taewoong; Park, Daihee
2015-01-01
In transmitting video data securely over Video Sensor Networks (VSNs), since mobile handheld devices have limited resources in terms of processor clock speed and battery size, it is necessary to develop an efficient method to encrypt video data to meet the increasing demand for secure connections. Selective encryption methods can reduce the amount of computation needed while satisfying high-level security requirements. This is achieved by selecting an important part of the video data and encrypting it. In this paper, to ensure format compliance and security, we propose a special encryption method for H.264, which encrypts only the DC/ACs of I-macroblocks and the motion vectors of P-macroblocks. In particular, the proposed new selective encryption method exploits the error propagation property in an H.264 decoder and improves the collective performance by analyzing the tradeoff between the visual security level and the processing speed compared to typical selective encryption methods (i.e., I-frame, P-frame encryption, and combined I-/P-frame encryption). Experimental results show that the proposed method can significantly reduce the encryption workload without any significant degradation of visual security. PMID:25850068
Remote driving with reduced bandwidth communication
NASA Technical Reports Server (NTRS)
Depiero, Frederick W.; Noell, Timothy E.; Gee, Timothy F.
1993-01-01
Oak Ridge National Laboratory has developed a real-time video transmission system for low bandwidth remote operations. The system supports both continuous transmission of video for remote driving and progressive transmission of still images. Inherent in the system design is a spatiotemporal limitation to the effects of channel errors. The average data rate of the system is 64,000 bits/s, a compression of approximately 1000:1 for the black and white National Television Standard Code video. The image quality of the transmissions is maintained at a level that supports teleoperation of a high mobility multipurpose wheeled vehicle at speeds up to 15 mph on a moguled dirt track. Video compression is achieved by using Laplacian image pyramids and a combination of classical techniques. Certain subbands of the image pyramid are transmitted by using interframe differencing with a periodic refresh to aid in bandwidth reduction. Images are also foveated to concentrate image detail in a steerable region. The system supports dynamic video quality adjustments between frame rate, image detail, and foveation rate. A typical configuration for the system used during driving has a frame rate of 4 Hz, a compression per frame of 125:1, and a resulting latency of less than 1s.
NASA Astrophysics Data System (ADS)
Davies, Bob; Lienhart, Rainer W.; Yeo, Boon-Lock
1999-08-01
The metaphor of film and TV permeates the design of software to support video on the PC. Simply transplanting the non- interactive, sequential experience of film to the PC fails to exploit the virtues of the new context. Video ont eh PC should be interactive and non-sequential. This paper experiments with a variety of tools for using video on the PC that exploits the new content of the PC. Some feature are more successful than others. Applications that use these tools are explored, including primarily the home video archive but also streaming video servers on the Internet. The ability to browse, edit, abstract and index large volumes of video content such as home video and corporate video is a problem without appropriate solution in today's market. The current tools available are complex, unfriendly video editors, requiring hours of work to prepare a short home video, far more work that a typical home user can be expected to provide. Our proposed solution treats video like a text document, providing functionality similar to a text editor. Users can browse, interact, edit and compose one or more video sequences with the same ease and convenience as handling text documents. With this level of text-like composition, we call what is normally a sequential medium a 'video document'. An important component of the proposed solution is shot detection, the ability to detect when a short started or stopped. When combined with a spreadsheet of key frames, the host become a grid of pictures that can be manipulated and viewed in the same way that a spreadsheet can be edited. Multiple video documents may be viewed, joined, manipulated, and seamlessly played back. Abstracts of unedited video content can be produce automatically to create novel video content for export to other venues. Edited and raw video content can be published to the net or burned to a CD-ROM with a self-installing viewer for Windows 98 and Windows NT 4.0.
47 CFR Appendix - Technical Appendix 1
Code of Federal Regulations, 2010 CFR
2010-10-01
... display program material that has been encoded in any and all of the video formats contained in Table A3... frame rate of the transmitted video format. 2. Output Formats Equipment shall support 4:3 center cut-out... for composite video (yellow). Output shall produce video with ITU-R BT.500-11 quality scale of Grade 4...
Correction of Line Interleaving Displacement in Frame Captured Aerial Video Imagery
B. Cooke; A. Saucier
1995-01-01
Scientists with the USDA Forest Service are currently assessing the usefulness of aerial video imagery for various purposes including midcycle inventory updates. The potential of video image data for these purposes may be compromised by scan line interleaving displacement problems. Interleaving displacement problems cause features in video raster datasets to have...
Idbeaa, Tarik; Abdul Samad, Salina; Husain, Hafizah
2016-01-01
This paper presents a novel secure and robust steganographic technique in the compressed video domain namely embedding-based byte differencing (EBBD). Unlike most of the current video steganographic techniques which take into account only the intra frames for data embedding, the proposed EBBD technique aims to hide information in both intra and inter frames. The information is embedded into a compressed video by simultaneously manipulating the quantized AC coefficients (AC-QTCs) of luminance components of the frames during MPEG-2 encoding process. Later, during the decoding process, the embedded information can be detected and extracted completely. Furthermore, the EBBD basically deals with two security concepts: data encryption and data concealing. Hence, during the embedding process, secret data is encrypted using the simplified data encryption standard (S-DES) algorithm to provide better security to the implemented system. The security of the method lies in selecting candidate AC-QTCs within each non-overlapping 8 × 8 sub-block using a pseudo random key. Basic performance of this steganographic technique verified through experiments on various existing MPEG-2 encoded videos over a wide range of embedded payload rates. Overall, the experimental results verify the excellent performance of the proposed EBBD with a better trade-off in terms of imperceptibility and payload, as compared with previous techniques while at the same time ensuring minimal bitrate increase and negligible degradation of PSNR values. PMID:26963093
NASA Astrophysics Data System (ADS)
Gilles, Antonin; Gioia, Patrick; Cozot, Rémi; Morin, Luce
2015-09-01
The hybrid point-source/wave-field method is a newly proposed approach for Computer-Generated Hologram (CGH) calculation, based on the slicing of the scene into several depth layers parallel to the hologram plane. The complex wave scattered by each depth layer is then computed using either a wave-field or a point-source approach according to a threshold criterion on the number of points within the layer. Finally, the complex waves scattered by all the depth layers are summed up in order to obtain the final CGH. Although outperforming both point-source and wave-field methods without producing any visible artifact, this approach has not yet been used for animated holograms, and the possible exploitation of temporal redundancies has not been studied. In this paper, we propose a fast computation of video holograms by taking into account those redundancies. Our algorithm consists of three steps. First, intensity and depth data of the current 3D video frame are extracted and compared with those of the previous frame in order to remove temporally redundant data. Then the CGH pattern for this compressed frame is generated using the hybrid point-source/wave-field approach. The resulting CGH pattern is finally transmitted to the video output and stored in the previous frame buffer. Experimental results reveal that our proposed method is able to produce video holograms at interactive rates without producing any visible artifact.
Robust video transmission with distributed source coded auxiliary channel.
Wang, Jiajun; Majumdar, Abhik; Ramchandran, Kannan
2009-12-01
We propose a novel solution to the problem of robust, low-latency video transmission over lossy channels. Predictive video codecs, such as MPEG and H.26x, are very susceptible to prediction mismatch between encoder and decoder or "drift" when there are packet losses. These mismatches lead to a significant degradation in the decoded quality. To address this problem, we propose an auxiliary codec system that sends additional information alongside an MPEG or H.26x compressed video stream to correct for errors in decoded frames and mitigate drift. The proposed system is based on the principles of distributed source coding and uses the (possibly erroneous) MPEG/H.26x decoder reconstruction as side information at the auxiliary decoder. The distributed source coding framework depends upon knowing the statistical dependency (or correlation) between the source and the side information. We propose a recursive algorithm to analytically track the correlation between the original source frame and the erroneous MPEG/H.26x decoded frame. Finally, we propose a rate-distortion optimization scheme to allocate the rate used by the auxiliary encoder among the encoding blocks within a video frame. We implement the proposed system and present extensive simulation results that demonstrate significant gains in performance both visually and objectively (on the order of 2 dB in PSNR over forward error correction based solutions and 1.5 dB in PSNR over intrarefresh based solutions for typical scenarios) under tight latency constraints.
Idbeaa, Tarik; Abdul Samad, Salina; Husain, Hafizah
2016-01-01
This paper presents a novel secure and robust steganographic technique in the compressed video domain namely embedding-based byte differencing (EBBD). Unlike most of the current video steganographic techniques which take into account only the intra frames for data embedding, the proposed EBBD technique aims to hide information in both intra and inter frames. The information is embedded into a compressed video by simultaneously manipulating the quantized AC coefficients (AC-QTCs) of luminance components of the frames during MPEG-2 encoding process. Later, during the decoding process, the embedded information can be detected and extracted completely. Furthermore, the EBBD basically deals with two security concepts: data encryption and data concealing. Hence, during the embedding process, secret data is encrypted using the simplified data encryption standard (S-DES) algorithm to provide better security to the implemented system. The security of the method lies in selecting candidate AC-QTCs within each non-overlapping 8 × 8 sub-block using a pseudo random key. Basic performance of this steganographic technique verified through experiments on various existing MPEG-2 encoded videos over a wide range of embedded payload rates. Overall, the experimental results verify the excellent performance of the proposed EBBD with a better trade-off in terms of imperceptibility and payload, as compared with previous techniques while at the same time ensuring minimal bitrate increase and negligible degradation of PSNR values.
Birgiolas, Justas; Jernigan, Christopher M.; Gerkin, Richard C.; Smith, Brian H.; Crook, Sharon M.
2017-01-01
Many scientifically and agriculturally important insects use antennae to detect the presence of volatile chemical compounds and extend their proboscis during feeding. The ability to rapidly obtain high-resolution measurements of natural antenna and proboscis movements and assess how they change in response to chemical, developmental, and genetic manipulations can aid the understanding of insect behavior. By extending our previous work on assessing aggregate insect swarm or animal group movements from natural and laboratory videos using the video analysis software SwarmSight, we developed a novel, free, and open-source software module, SwarmSight Appendage Tracking (SwarmSight.org) for frame-by-frame tracking of insect antenna and proboscis positions from conventional web camera videos using conventional computers. The software processes frames about 120 times faster than humans, performs at better than human accuracy, and, using 30 frames per second (fps) videos, can capture antennal dynamics up to 15 Hz. The software was used to track the antennal response of honey bees to two odors and found significant mean antennal retractions away from the odor source about 1 s after odor presentation. We observed antenna position density heat map cluster formation and cluster and mean angle dependence on odor concentration. PMID:29364251
NASA Astrophysics Data System (ADS)
Dolloff, John; Hottel, Bryant; Edwards, David; Theiss, Henry; Braun, Aaron
2017-05-01
This paper presents an overview of the Full Motion Video-Geopositioning Test Bed (FMV-GTB) developed to investigate algorithm performance and issues related to the registration of motion imagery and subsequent extraction of feature locations along with predicted accuracy. A case study is included corresponding to a video taken from a quadcopter. Registration of the corresponding video frames is performed without the benefit of a priori sensor attitude (pointing) information. In particular, tie points are automatically measured between adjacent frames using standard optical flow matching techniques from computer vision, an a priori estimate of sensor attitude is then computed based on supplied GPS sensor positions contained in the video metadata and a photogrammetric/search-based structure from motion algorithm, and then a Weighted Least Squares adjustment of all a priori metadata across the frames is performed. Extraction of absolute 3D feature locations, including their predicted accuracy based on the principles of rigorous error propagation, is then performed using a subset of the registered frames. Results are compared to known locations (check points) over a test site. Throughout this entire process, no external control information (e.g. surveyed points) is used other than for evaluation of solution errors and corresponding accuracy.
OPSO - The OpenGL based Field Acquisition and Telescope Guiding System
NASA Astrophysics Data System (ADS)
Škoda, P.; Fuchs, J.; Honsa, J.
2006-07-01
We present OPSO, a modular pointing and auto-guiding system for the coudé spectrograph of the Ondřejov observatory 2m telescope. The current field and slit viewing CCD cameras with image intensifiers are giving only standard TV video output. To allow the acquisition and guiding of very faint targets, we have designed an image enhancing system working in real time on TV frames grabbed by BT878-based video capture card. Its basic capabilities include the sliding averaging of hundreds of frames with bad pixel masking and removal of outliers, display of median of set of frames, quick zooming, contrast and brightness adjustment, plotting of horizontal and vertical cross cuts of seeing disk within given intensity range and many more. From the programmer's point of view, the system consists of three tasks running in parallel on a Linux PC. One C task controls the video capturing over Video for Linux (v4l2) interface and feeds the frames into the large block of shared memory, where the core image processing is done by another C program calling the OpenGL library. The GUI is, however, dynamically built in Python from XML description of widgets prepared in Glade. All tasks are exchanging information by IPC calls using the shared memory segments.
MobileASL: intelligibility of sign language video over mobile phones.
Cavender, Anna; Vanam, Rahul; Barney, Dane K; Ladner, Richard E; Riskin, Eve A
2008-01-01
For Deaf people, access to the mobile telephone network in the United States is currently limited to text messaging, forcing communication in English as opposed to American Sign Language (ASL), the preferred language. Because ASL is a visual language, mobile video phones have the potential to give Deaf people access to real-time mobile communication in their preferred language. However, even today's best video compression techniques can not yield intelligible ASL at limited cell phone network bandwidths. Motivated by this constraint, we conducted one focus group and two user studies with members of the Deaf Community to determine the intelligibility effects of video compression techniques that exploit the visual nature of sign language. Inspired by eye tracking results that show high resolution foveal vision is maintained around the face, we studied region-of-interest encodings (where the face is encoded at higher quality) as well as reduced frame rates (where fewer, better quality, frames are displayed every second). At all bit rates studied here, participants preferred moderate quality increases in the face region, sacrificing quality in other regions. They also preferred slightly lower frame rates because they yield better quality frames for a fixed bit rate. The limited processing power of cell phones is a serious concern because a real-time video encoder and decoder will be needed. Choosing less complex settings for the encoder can reduce encoding time, but will affect video quality. We studied the intelligibility effects of this tradeoff and found that we can significantly speed up encoding time without severely affecting intelligibility. These results show promise for real-time access to the current low-bandwidth cell phone network through sign-language-specific encoding techniques.
Interactive CT-Video Registration for the Continuous Guidance of Bronchoscopy
Merritt, Scott A.; Khare, Rahul; Bascom, Rebecca
2014-01-01
Bronchoscopy is a major step in lung cancer staging. To perform bronchoscopy, the physician uses a procedure plan, derived from a patient’s 3D computed-tomography (CT) chest scan, to navigate the bronchoscope through the lung airways. Unfortunately, physicians vary greatly in their ability to perform bronchoscopy. As a result, image-guided bronchoscopy systems, drawing upon the concept of CT-based virtual bronchoscopy (VB), have been proposed. These systems attempt to register the bronchoscope’s live position within the chest to a CT-based virtual chest space. Recent methods, which register the bronchoscopic video to CT-based endoluminal airway renderings, show promise but do not enable continuous real-time guidance. We present a CT-video registration method inspired by computer-vision innovations in the fields of image alignment and image-based rendering. In particular, motivated by the Lucas–Kanade algorithm, we propose an inverse-compositional framework built around a gradient-based optimization procedure. We next propose an implementation of the framework suitable for image-guided bronchoscopy. Laboratory tests, involving both single frames and continuous video sequences, demonstrate the robustness and accuracy of the method. Benchmark timing tests indicate that the method can run continuously at 300 frames/s, well beyond the real-time bronchoscopic video rate of 30 frames/s. This compares extremely favorably to the ≥1 s/frame speeds of other methods and indicates the method’s potential for real-time continuous registration. A human phantom study confirms the method’s efficacy for real-time guidance in a controlled setting, and, hence, points the way toward the first interactive CT-video registration approach for image-guided bronchoscopy. Along this line, we demonstrate the method’s efficacy in a complete guidance system by presenting a clinical study involving lung cancer patients. PMID:23508260
ERIC Educational Resources Information Center
Gross, M. Scott
2005-01-01
This article provides a review of some of the currently available literature surrounding the academic study of video games. Many of these theoretical methods have been used to study film and television and are discussed here in order to frame the need for further examination of video games. Suggestions for the use of video games in the classroom…
Countermeasures for unintentional and intentional video watermarking attacks
NASA Astrophysics Data System (ADS)
Deguillaume, Frederic; Csurka, Gabriela; Pun, Thierry
2000-05-01
These last years, the rapidly growing digital multimedia market has revealed an urgent need for effective copyright protection mechanisms. Therefore, digital audio, image and video watermarking has recently become a very active area of research, as a solution to this problem. Many important issues have been pointed out, one of them being the robustness to non-intentional and intentional attacks. This paper studies some attacks and proposes countermeasures applied to videos. General attacks are lossy copying/transcoding such as MPEG compression and digital/analog (D/A) conversion, changes of frame-rate, changes of display format, and geometrical distortions. More specific attacks are sequence edition, and statistical attacks such as averaging or collusion. Averaging attack consists of averaging locally consecutive frames to cancel the watermark. This attack works well for schemes which embed random independent marks into frames. In the collusion attack the watermark is estimated from single frames (based on image denoising), and averaged over different scenes for better accuracy. The estimated watermark is then subtracted from each frame. Collusion requires that the same mark is embedded into all frames. The proposed countermeasures first ensures robustness to general attacks by spread spectrum encoding in the frequency domain and by the use of an additional template. Secondly, a Bayesian criterion, evaluating the probability of a correctly decoded watermark, is used for rejection of outliers, and to implement an algorithm against statistical attacks. The idea is to embed randomly chosen marks among a finite set of marks, into subsequences of videos which are long enough to resist averaging attacks, but short enough to avoid collusion attacks. The Bayesian criterion is needed to select the correct mark at the decoding step. Finally, the paper presents experimental results showing the robustness of the proposed method.
High-Speed Video Analysis of Damped Harmonic Motion
ERIC Educational Resources Information Center
Poonyawatpornkul, J.; Wattanakasiwich, P.
2013-01-01
In this paper, we acquire and analyse high-speed videos of a spring-mass system oscillating in glycerin at different temperatures. Three cases of damped harmonic oscillation are investigated and analysed by using high-speed video at a rate of 120 frames s[superscript -1] and Tracker Video Analysis (Tracker) software. We present empirical data for…
Initial results from a video-laser rangefinder device
Neil A. Clark
2000-01-01
Three hundred and nine width measurements at various heights to 10 m on a metal light pole were calculated from video images captured with a prototype video-laser rangefinder instrument. Data were captured at distances from 6 to 15 m. The endpoints for the width measurements were manually selected to the nearest pixel from individual video frames.Chi-square...
Real-time video compressing under DSP/BIOS
NASA Astrophysics Data System (ADS)
Chen, Qiu-ping; Li, Gui-ju
2009-10-01
This paper presents real-time MPEG-4 Simple Profile video compressing based on the DSP processor. The programming framework of video compressing is constructed using TMS320C6416 Microprocessor, TDS510 simulator and PC. It uses embedded real-time operating system DSP/BIOS and the API functions to build periodic function, tasks and interruptions etcs. Realize real-time video compressing. To the questions of data transferring among the system. Based on the architecture of the C64x DSP, utilized double buffer switched and EDMA data transfer controller to transit data from external memory to internal, and realize data transition and processing at the same time; the architecture level optimizations are used to improve software pipeline. The system used DSP/BIOS to realize multi-thread scheduling. The whole system realizes high speed transition of a great deal of data. Experimental results show the encoder can realize real-time encoding of 768*576, 25 frame/s video images.
Robust Video Stabilization Using Particle Keypoint Update and l1-Optimized Camera Path
Jeon, Semi; Yoon, Inhye; Jang, Jinbeum; Yang, Seungji; Kim, Jisung; Paik, Joonki
2017-01-01
Acquisition of stabilized video is an important issue for various type of digital cameras. This paper presents an adaptive camera path estimation method using robust feature detection to remove shaky artifacts in a video. The proposed algorithm consists of three steps: (i) robust feature detection using particle keypoints between adjacent frames; (ii) camera path estimation and smoothing; and (iii) rendering to reconstruct a stabilized video. As a result, the proposed algorithm can estimate the optimal homography by redefining important feature points in the flat region using particle keypoints. In addition, stabilized frames with less holes can be generated from the optimal, adaptive camera path that minimizes a temporal total variation (TV). The proposed video stabilization method is suitable for enhancing the visual quality for various portable cameras and can be applied to robot vision, driving assistant systems, and visual surveillance systems. PMID:28208622
Kang, Seok; Ha, Jae-Sik; Velasco, Teresa
2017-05-01
This study investigated videos about Attention Deficit Hyperactivity Disorder (ADHD) on YouTube in terms of issues, sources, and episodic-thematic aspects. A total of 685 videos uploaded onto YouTube between 2006 and 2014 were content analyzed. Results demonstrated that the top three key issues about ADHD were symptom, child, and treatment. Doctor, patient, and supporter were the three most interviewed sources. Videos from the public sector including the government, company representative, and public organizations were relatively rare compared to other sources suggesting the potential for a greater role for the government and public sector contributions to YouTube to provide credible information relevant to public awareness, campaigns, and policy announcements. Meanwhile, many personal videos in the episodic frame advocated social solutions. This result implies that YouTube videos about health information from the private sectors have the potential to affect change at the social level.
Laws of reflection and Snell's law revisited by video modeling
NASA Astrophysics Data System (ADS)
Rodrigues, M.; Simeão Carvalho, P.
2014-07-01
Video modelling is being used, nowadays, as a tool for teaching and learning several topics in Physics. Most of these topics are related to kinematics. In this work we show how video modelling can be used for demonstrations and experimental teaching in optics, namely the laws of reflection and the well-known Snell's Law of light. Videos were recorded with a photo camera at 30 frames/s, and analysed with the open source software Tracker. Data collected from several frames was treated with the Data Tool module, and graphs were built to obtain relations between incident, reflected and refraction angles, as well as to determine the refractive index of Perspex. These videos can be freely distributed in the web and explored with students within the classroom, or as a homework assignment to improve student's understanding on specific contents. They present a large didactic potential for teaching basic optics in high school with an interactive methodology.
Global-constrained hidden Markov model applied on wireless capsule endoscopy video segmentation
NASA Astrophysics Data System (ADS)
Wan, Yiwen; Duraisamy, Prakash; Alam, Mohammad S.; Buckles, Bill
2012-06-01
Accurate analysis of wireless capsule endoscopy (WCE) videos is vital but tedious. Automatic image analysis can expedite this task. Video segmentation of WCE into the four parts of the gastrointestinal tract is one way to assist a physician. The segmentation approach described in this paper integrates pattern recognition with statiscal analysis. Iniatially, a support vector machine is applied to classify video frames into four classes using a combination of multiple color and texture features as the feature vector. A Poisson cumulative distribution, for which the parameter depends on the length of segments, models a prior knowledge. A priori knowledge together with inter-frame difference serves as the global constraints driven by the underlying observation of each WCE video, which is fitted by Gaussian distribution to constrain the transition probability of hidden Markov model.Experimental results demonstrated effectiveness of the approach.
Automated videography for residential communications
NASA Astrophysics Data System (ADS)
Kurtz, Andrew F.; Neustaedter, Carman; Blose, Andrew C.
2010-02-01
The current widespread use of webcams for personal video communication over the Internet suggests that opportunities exist to develop video communications systems optimized for domestic use. We discuss both prior and existing technologies, and the results of user studies that indicate potential needs and expectations for people relative to personal video communications. In particular, users anticipate an easily used, high image quality video system, which enables multitasking communications during the course of real-world activities and provides appropriate privacy controls. To address these needs, we propose a potential approach premised on automated capture of user activity. We then describe a method that adapts cinematography principles, with a dual-camera videography system, to automatically control image capture relative to user activity, using semantic or activity-based cues to determine user position and motion. In particular, we discuss an approach to automatically manage shot framing, shot selection, and shot transitions, with respect to one or more local users engaged in real-time, unscripted events, while transmitting the resulting video to a remote viewer. The goal is to tightly frame subjects (to provide more detail), while minimizing subject loss and repeated abrupt shot framing changes in the images as perceived by a remote viewer. We also discuss some aspects of the system and related technologies that we have experimented with thus far. In summary, the method enables users to participate in interactive video-mediated communications while engaged in other activities.
NASA Astrophysics Data System (ADS)
Duplaga, M.; Leszczuk, M. I.; Papir, Z.; Przelaskowski, A.
2008-12-01
Wider dissemination of medical digital video libraries is affected by two correlated factors, resource effective content compression that directly influences its diagnostic credibility. It has been proved that it is possible to meet these contradictory requirements halfway for long-lasting and low motion surgery recordings at compression ratios close to 100 (bronchoscopic procedures were a case study investigated). As the main supporting assumption, it has been accepted that the content can be compressed as far as clinicians are not able to sense a loss of video diagnostic fidelity (a visually lossless compression). Different market codecs were inspected by means of the combined subjective and objective tests toward their usability in medical video libraries. Subjective tests involved a panel of clinicians who had to classify compressed bronchoscopic video content according to its quality under the bubble sort algorithm. For objective tests, two metrics (hybrid vector measure and hosaka Plots) were calculated frame by frame and averaged over a whole sequence.
Feasibility study of a real-time operating system for a multichannel MPEG-4 encoder
NASA Astrophysics Data System (ADS)
Lehtoranta, Olli; Hamalainen, Timo D.
2005-03-01
Feasibility of DSP/BIOS real-time operating system for a multi-channel MPEG-4 encoder is studied. Performances of two MPEG-4 encoder implementations with and without the operating system are compared in terms of encoding frame rate and memory requirements. The effects of task switching frequency and number of parallel video channels to the encoding frame rate are measured. The research is carried out on a 200 MHz TMS320C6201 fixed point DSP using QCIF (176x144 pixels) video format. Compared to a traditional DSP implementation without an operating system, inclusion of DSP/BIOS reduces total system throughput only by 1 QCIF frames/s. The operating system has 6 KB data memory overhead and program memory requirement of 15.7 KB. Hence, the overhead is considered low enough for resource critical mobile video applications.
Impact of nutrition messages on children's food choice: pilot study.
Bannon, Katie; Schwartz, Marlene B
2006-03-01
This pilot study tested the influence of nutrition message framing on snack choice among kindergarteners. Three classrooms were randomly assigned to watch one of the following 60s videos: (a) a gain-framed nutrition message (i.e. the positive benefits of eating apples) (n=14); (b) a loss-framed message (i.e. the negative consequences of not eating apples) (n=18); or (c) a control scene (children playing a game) (n=18). Following this, the children were offered a choice between animal crackers and an apple for their snack. Among the children who saw one of the nutrition message videos, 56% chose apples rather than animal crackers; in the control condition only 33% chose apples. This difference was statistically significant (chi2=7.56, p<0.01). These results suggest that videos containing nutritional messages may have a positive influence on children's short-term food choices.
Mode shape analysis using a commercially available peak store video frame buffer
NASA Technical Reports Server (NTRS)
Snow, Walter L.; Childers, Brooks A.
1994-01-01
Time exposure photography, sometimes coupled with strobe illumination, is an accepted method for motion analysis that bypasses frame by frame analysis and resynthesis of data. Garden variety video cameras can now exploit this technique using a unique frame buffer that is a non-integrating memory that compares incoming data with that already stored. The device continuously outputs an analog video signal of the stored contents which can then be redigitized and analyzed using conventional equipment. Historically, photographic time exposures have been used to record the displacement envelope of harmonically oscillating structures to show mode shape. Mode shape analysis is crucial, for example, in aeroelastic testing of wind tunnel models. Aerodynamic, inertial, and elastic forces can couple together leading to catastrophic failure of a poorly designed aircraft. This paper will explore the usefulness of the peak store device as a videometric tool and in particular discuss methods for analyzing a targeted vibrating plate using the 'peak store' in conjunction with calibration methods familiar to the close-range videometry community. Results for the first three normal modes will be presented.
Mode shape analysis using a commercially available "peak-store" video frame buffer
NASA Astrophysics Data System (ADS)
Snow, Walter L.; Childers, Brooks A.
1994-10-01
Time exposure photography, sometimes coupled with strobe illumination, is an accepted method for motion analysis that bypasses frame by frame analysis and re synthesis of data. Garden variety video cameras can now exploit this technique using a unique frame buffer that is a non integrating memory that compares incoming data with that already stored. The device continuously outputs an analog video signal of the stored contents which can then be redigitized and analyzed using conventional equipment. Historically, photographic time exposures have been used to record the displacement envelope of harmonically oscillating structures to show mode shape. Mode shape analysis is crucial, for example, in aeroelastic testing of wind tunnel models. Aerodynamic, inertial, and elastic forces can couple together leading to catastrophic failure of a poorly designed aircraft. This paper will explore the usefulness of the peak store device as a videometric tool and in particular discuss methods for analyzing a targeted vibrating plate using the `peak store' in conjunction with calibration methods familiar to the close-range videometry community. Results for the first three normal modes will be presented.
Underwater image mosaicking and visual odometry
NASA Astrophysics Data System (ADS)
Sadjadi, Firooz; Tangirala, Sekhar; Sorber, Scott
2017-05-01
This paper summarizes the results of studies in underwater odometery using a video camera for estimating the velocity of an unmanned underwater vehicle (UUV). Underwater vehicles are usually equipped with sonar and Inertial Measurement Unit (IMU) - an integrated sensor package that combines multiple accelerometers and gyros to produce a three dimensional measurement of both specific force and angular rate with respect to an inertial reference frame for navigation. In this study, we investigate the use of odometry information obtainable from a video camera mounted on a UUV to extract vehicle velocity relative to the ocean floor. A key challenge with this process is the seemingly bland (i.e. featureless) nature of video data obtained underwater which could make conventional approaches to image-based motion estimation difficult. To address this problem, we perform image enhancement, followed by frame to frame image transformation, registration and mosaicking/stitching. With this approach the velocity components associated with the moving sensor (vehicle) are readily obtained from (i) the components of the transform matrix at each frame; (ii) information about the height of the vehicle above the seabed; and (iii) the sensor resolution. Preliminary results are presented.
Space Shuttle Main Engine Propellant Path Leak Detection Using Sequential Image Processing
NASA Technical Reports Server (NTRS)
Smith, L. Montgomery; Malone, Jo Anne; Crawford, Roger A.
1995-01-01
Initial research in this study using theoretical radiation transport models established that the occurrence of a leak is accompanies by a sudden but sustained change in intensity in a given region of an image. In this phase, temporal processing of video images on a frame-by-frame basis was used to detect leaks within a given field of view. The leak detection algorithm developed in this study consists of a digital highpass filter cascaded with a moving average filter. The absolute value of the resulting discrete sequence is then taken and compared to a threshold value to produce the binary leak/no leak decision at each point in the image. Alternatively, averaging over the full frame of the output image produces a single time-varying mean value estimate that is indicative of the intensity and extent of a leak. Laboratory experiments were conducted in which artificially created leaks on a simulated SSME background were produced and recorded from a visible wavelength video camera. This data was processed frame-by-frame over the time interval of interest using an image processor implementation of the leak detection algorithm. In addition, a 20 second video sequence of an actual SSME failure was analyzed using this technique. The resulting output image sequences and plots of the full frame mean value versus time verify the effectiveness of the system.
NASA Astrophysics Data System (ADS)
Klein, P.; Gröber, S.; Kuhn, J.; Fleischhauer, A.; Müller, A.
2015-01-01
The selection and application of coordinate systems is an important issue in physics. However, considering different frames of references in a given problem sometimes seems un-intuitive and is difficult for students. We present a concrete problem of projectile motion which vividly demonstrates the value of considering different frames of references. We use this example to explore the effectiveness of video-based motion analysis (VBMA) as an instructional technique at university level in enhancing students’ understanding of the abstract concept of coordinate systems. A pilot study with 47 undergraduate students indicates that VBMA instruction improves conceptual understanding of this issue.
Effects of a rater training on rating accuracy in a physical examination skills assessment
Weitz, Gunther; Vinzentius, Christian; Twesten, Christoph; Lehnert, Hendrik; Bonnemeier, Hendrik; König, Inke R.
2014-01-01
Background: The accuracy and reproducibility of medical skills assessment is generally low. Rater training has little or no effect. Our knowledge in this field, however, relies on studies involving video ratings of overall clinical performances. We hypothesised that a rater training focussing on the frame of reference could improve accuracy in grading the curricular assessment of a highly standardised physical head-to-toe examination. Methods: Twenty-one raters assessed the performance of 242 third-year medical students. Eleven raters had been randomly assigned to undergo a brief frame-of-reference training a few days before the assessment. 218 encounters were successfully recorded on video and re-assessed independently by three additional observers. Accuracy was defined as the concordance between the raters' grade and the median of the observers' grade. After the assessment, both students and raters filled in a questionnaire about their views on the assessment. Results: Rater training did not have a measurable influence on accuracy. However, trained raters rated significantly more stringently than untrained raters, and their overall stringency was closer to the stringency of the observers. The questionnaire indicated a higher awareness of the halo effect in the trained raters group. Although the self-assessment of the students mirrored the assessment of the raters in both groups, the students assessed by trained raters felt more discontent with their grade. Conclusions: While training had some marginal effects, it failed to have an impact on the individual accuracy. These results in real-life encounters are consistent with previous studies on rater training using video assessments of clinical performances. The high degree of standardisation in this study was not suitable to harmonize the trained raters’ grading. The data support the notion that the process of appraising medical performance is highly individual. A frame-of-reference training as applied does not effectively adjust the physicians' judgement on medical students in real-live assessments. PMID:25489341
Effects of a rater training on rating accuracy in a physical examination skills assessment.
Weitz, Gunther; Vinzentius, Christian; Twesten, Christoph; Lehnert, Hendrik; Bonnemeier, Hendrik; König, Inke R
2014-01-01
The accuracy and reproducibility of medical skills assessment is generally low. Rater training has little or no effect. Our knowledge in this field, however, relies on studies involving video ratings of overall clinical performances. We hypothesised that a rater training focussing on the frame of reference could improve accuracy in grading the curricular assessment of a highly standardised physical head-to-toe examination. Twenty-one raters assessed the performance of 242 third-year medical students. Eleven raters had been randomly assigned to undergo a brief frame-of-reference training a few days before the assessment. 218 encounters were successfully recorded on video and re-assessed independently by three additional observers. Accuracy was defined as the concordance between the raters' grade and the median of the observers' grade. After the assessment, both students and raters filled in a questionnaire about their views on the assessment. Rater training did not have a measurable influence on accuracy. However, trained raters rated significantly more stringently than untrained raters, and their overall stringency was closer to the stringency of the observers. The questionnaire indicated a higher awareness of the halo effect in the trained raters group. Although the self-assessment of the students mirrored the assessment of the raters in both groups, the students assessed by trained raters felt more discontent with their grade. While training had some marginal effects, it failed to have an impact on the individual accuracy. These results in real-life encounters are consistent with previous studies on rater training using video assessments of clinical performances. The high degree of standardisation in this study was not suitable to harmonize the trained raters' grading. The data support the notion that the process of appraising medical performance is highly individual. A frame-of-reference training as applied does not effectively adjust the physicians' judgement on medical students in real-live assessments.
Content fragile watermarking for H.264/AVC video authentication
NASA Astrophysics Data System (ADS)
Ait Sadi, K.; Guessoum, A.; Bouridane, A.; Khelifi, F.
2017-04-01
Discrete cosine transform is exploited in this work to generate the authentication data that are treated as a fragile watermark. This watermark is embedded in the motion vectors. The advances in multimedia technologies and digital processing tools have brought with them new challenges for the source and content authentication. To ensure the integrity of the H.264/AVC video stream, we introduce an approach based on a content fragile video watermarking method using an independent authentication of each group of pictures (GOPs) within the video. This technique uses robust visual features extracted from the video pertaining to the set of selected macroblocs (MBs) which hold the best partition mode in a tree-structured motion compensation process. An additional security degree is offered by the proposed method through using a more secured keyed function HMAC-SHA-256 and randomly choosing candidates from already selected MBs. In here, the watermark detection and verification processes are blind, whereas the tampered frames detection is not since it needs the original frames within the tampered GOPs. The proposed scheme achieves an accurate authentication technique with a high fragility and fidelity whilst maintaining the original bitrate and the perceptual quality. Furthermore, its ability to detect the tampered frames in case of spatial, temporal and colour manipulations is confirmed.
Intelligent keyframe extraction for video printing
NASA Astrophysics Data System (ADS)
Zhang, Tong
2004-10-01
Nowadays most digital cameras have the functionality of taking short video clips, with the length of video ranging from several seconds to a couple of minutes. The purpose of this research is to develop an algorithm which extracts an optimal set of keyframes from each short video clip so that the user could obtain proper video frames to print out. In current video printing systems, keyframes are normally obtained by evenly sampling the video clip over time. Such an approach, however, may not reflect highlights or regions of interest in the video. Keyframes derived in this way may also be improper for video printing in terms of either content or image quality. In this paper, we present an intelligent keyframe extraction approach to derive an improved keyframe set by performing semantic analysis of the video content. For a video clip, a number of video and audio features are analyzed to first generate a candidate keyframe set. These features include accumulative color histogram and color layout differences, camera motion estimation, moving object tracking, face detection and audio event detection. Then, the candidate keyframes are clustered and evaluated to obtain a final keyframe set. The objective is to automatically generate a limited number of keyframes to show different views of the scene; to show different people and their actions in the scene; and to tell the story in the video shot. Moreover, frame extraction for video printing, which is a rather subjective problem, is considered in this work for the first time, and a semi-automatic approach is proposed.
Evaluation of Skybox Video and Still Image products
NASA Astrophysics Data System (ADS)
d'Angelo, P.; Kuschk, G.; Reinartz, P.
2014-11-01
The SkySat-1 satellite lauched by Skybox Imaging on November 21 in 2013 opens a new chapter in civilian earth observation as it is the first civilian satellite to image a target in high definition panchromatic video for up to 90 seconds. The small satellite with a mass of 100 kg carries a telescope with 3 frame sensors. Two products are available: Panchromatic video with a resolution of around 1 meter and a frame size of 2560 × 1080 pixels at 30 frames per second. Additionally, the satellite can collect still imagery with a swath of 8 km in the panchromatic band, and multispectral images with 4 bands. Using super-resolution techniques, sub-meter accuracy is reached for the still imagery. The paper provides an overview of the satellite design and imaging products. The still imagery product consists of 3 stripes of frame images with a footprint of approximately 2.6 × 1.1 km. Using bundle block adjustment, the frames are registered, and their accuracy is evaluated. Image quality of the panchromatic, multispectral and pansharpened products are evaluated. The video product used in this evaluation consists of a 60 second gazing acquisition of Las Vegas. A DSM is generated by dense stereo matching. Multiple techniques such as pairwise matching or multi image matching are used and compared. As no ground truth height reference model is availble to the authors, comparisons on flat surface and compare differently matched DSMs are performed. Additionally, visual inspection of DSM and DSM profiles show a detailed reconstruction of small features and large skyscrapers.
Khan, Tareq; Shrestha, Ravi; Imtiaz, Md. Shamin
2015-01-01
Presented is a new power-efficient colour generation algorithm for wireless capsule endoscopy (WCE) application. In WCE, transmitting colour image data from the human intestine through radio frequency (RF) consumes a huge amount of power. The conventional way is to transmit all R, G and B components of all frames. Using the proposed dictionary-based colour generation scheme, instead of sending all R, G and B frames, first one colour frame is sent followed by a series of grey-scale frames. At the receiver end, the colour information is extracted from the colour frame and then added to colourise the grey-scale frames. After a certain number of grey-scale frames, another colour frame is sent followed by the same number of grey-scale frames. This process is repeated until the end of the video sequence to maintain the colour similarity. As a result, over 50% of RF transmission power can be saved using the proposed scheme, which will eventually lead to a battery life extension of the capsule by 4–7 h. The reproduced colour images have been evaluated both statistically and subjectively by professional gastroenterologists. The algorithm is finally implemented using a WCE prototype and the performance is validated using an ex-vivo trial. PMID:26609405
Multiframe video coding for improved performance over wireless channels.
Budagavi, M; Gibson, J D
2001-01-01
We propose and evaluate a multi-frame extension to block motion compensation (BMC) coding of videoconferencing-type video signals for wireless channels. The multi-frame BMC (MF-BMC) coder makes use of the redundancy that exists across multiple frames in typical videoconferencing sequences to achieve additional compression over that obtained by using the single frame BMC (SF-BMC) approach, such as in the base-level H.263 codec. The MF-BMC approach also has an inherent ability of overcoming some transmission errors and is thus more robust when compared to the SF-BMC approach. We model the error propagation process in MF-BMC coding as a multiple Markov chain and use Markov chain analysis to infer that the use of multiple frames in motion compensation increases robustness. The Markov chain analysis is also used to devise a simple scheme which randomizes the selection of the frame (amongst the multiple previous frames) used in BMC to achieve additional robustness. The MF-BMC coders proposed are a multi-frame extension of the base level H.263 coder and are found to be more robust than the base level H.263 coder when subjected to simulated errors commonly encountered on wireless channels.
Joint Transform Correlation for face tracking: elderly fall detection application
NASA Astrophysics Data System (ADS)
Katz, Philippe; Aron, Michael; Alfalou, Ayman
2013-03-01
In this paper, an iterative tracking algorithm based on a non-linear JTC (Joint Transform Correlator) architecture and enhanced by a digital image processing method is proposed and validated. This algorithm is based on the computation of a correlation plane where the reference image is updated at each frame. For that purpose, we use the JTC technique in real time to track a patient (target image) in a room fitted with a video camera. The correlation plane is used to localize the target image in the current video frame (frame i). Then, the reference image to be exploited in the next frame (frame i+1) is updated according to the previous one (frame i). In an effort to validate our algorithm, our work is divided into two parts: (i) a large study based on different sequences with several situations and different JTC parameters is achieved in order to quantify their effects on the tracking performances (decimation, non-linearity coefficient, size of the correlation plane, size of the region of interest...). (ii) the tracking algorithm is integrated into an application of elderly fall detection. The first reference image is a face detected by means of Haar descriptors, and then localized into the new video image thanks to our tracking method. In order to avoid a bad update of the reference frame, a method based on a comparison of image intensity histograms is proposed and integrated in our algorithm. This step ensures a robust tracking of the reference frame. This article focuses on face tracking step optimisation and evalutation. A supplementary step of fall detection, based on vertical acceleration and position, will be added and studied in further work.
Deep Spatial-Temporal Joint Feature Representation for Video Object Detection.
Zhao, Baojun; Zhao, Boya; Tang, Linbo; Han, Yuqi; Wang, Wenzheng
2018-03-04
With the development of deep neural networks, many object detection frameworks have shown great success in the fields of smart surveillance, self-driving cars, and facial recognition. However, the data sources are usually videos, and the object detection frameworks are mostly established on still images and only use the spatial information, which means that the feature consistency cannot be ensured because the training procedure loses temporal information. To address these problems, we propose a single, fully-convolutional neural network-based object detection framework that involves temporal information by using Siamese networks. In the training procedure, first, the prediction network combines the multiscale feature map to handle objects of various sizes. Second, we introduce a correlation loss by using the Siamese network, which provides neighboring frame features. This correlation loss represents object co-occurrences across time to aid the consistent feature generation. Since the correlation loss should use the information of the track ID and detection label, our video object detection network has been evaluated on the large-scale ImageNet VID dataset where it achieves a 69.5% mean average precision (mAP).
Robust Pedestrian Tracking and Recognition from FLIR Video: A Unified Approach via Sparse Coding
Li, Xin; Guo, Rui; Chen, Chao
2014-01-01
Sparse coding is an emerging method that has been successfully applied to both robust object tracking and recognition in the vision literature. In this paper, we propose to explore a sparse coding-based approach toward joint object tracking-and-recognition and explore its potential in the analysis of forward-looking infrared (FLIR) video to support nighttime machine vision systems. A key technical contribution of this work is to unify existing sparse coding-based approaches toward tracking and recognition under the same framework, so that they can benefit from each other in a closed-loop. On the one hand, tracking the same object through temporal frames allows us to achieve improved recognition performance through dynamical updating of template/dictionary and combining multiple recognition results; on the other hand, the recognition of individual objects facilitates the tracking of multiple objects (i.e., walking pedestrians), especially in the presence of occlusion within a crowded environment. We report experimental results on both the CASIAPedestrian Database and our own collected FLIR video database to demonstrate the effectiveness of the proposed joint tracking-and-recognition approach. PMID:24961216
Emotional Responses to Environmental Messages and Future Behavioral Intentions
ERIC Educational Resources Information Center
Perrin, Jeffrey L.
2011-01-01
The present research investigated effects of message framing (losses-framed or gains-framed), message modality (video with text or text-only) and emotional arousal on environmentally responsible behavioral intentions. The sample consisted of 161 college students. The present research did not find a significant difference in behavioral intentions…
Mantokoudis, Georgios; Dähler, Claudia; Dubach, Patrick; Kompis, Martin; Caversaccio, Marco D; Senn, Pascal
2013-01-01
To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280 × 720, 640 × 480, 320 × 240, 160 × 120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0-500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. Higher frame rate (>7 fps), higher camera resolution (>640 × 480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). Webcameras have the potential to improve telecommunication of hearing-impaired individuals.
Evaluation of the effectiveness of color attributes for video indexing
NASA Astrophysics Data System (ADS)
Chupeau, Bertrand; Forest, Ronan
2001-10-01
Color features are reviewed and their effectiveness assessed in the application framework of key-frame clustering for abstracting unconstrained video. Existing color spaces and associated quantization schemes are first studied. Description of global color distribution by means of histograms is then detailed. In our work, 12 combinations of color space and quantization were selected, together with 12 histogram metrics. Their respective effectiveness with respect to picture similarity measurement was evaluated through a query-by-example scenario. For that purpose, a set of still-picture databases was built by extracting key frames from several video clips, including news, documentaries, sports and cartoons. Classical retrieval performance evaluation criteria were adapted to the specificity of our testing methodology.
Evaluation of the effectiveness of color attributes for video indexing
NASA Astrophysics Data System (ADS)
Chupeau, Bertrand; Forest, Ronan
2001-01-01
Color features are reviewed and their effectiveness assessed in the application framework of key-frame clustering for abstracting unconstrained video. Existing color spaces and associated quantization schemes are first studied. Description of global color distribution by means of histograms is then detailed. In our work, twelve combinations of color space and quantization were selected, together with twelve histogram metrics. Their respective effectiveness with respect to picture similarity measurement was evaluated through a query-be-example scenario. For that purpose, a set of still-picture databases was built by extracting key-frames from several video clips, including news, documentaries, sports and cartoons. Classical retrieval performance evaluation criteria were adapted to the specificity of our testing methodology.
Evaluation of the effectiveness of color attributes for video indexing
NASA Astrophysics Data System (ADS)
Chupeau, Bertrand; Forest, Ronan
2000-12-01
Color features are reviewed and their effectiveness assessed in the application framework of key-frame clustering for abstracting unconstrained video. Existing color spaces and associated quantization schemes are first studied. Description of global color distribution by means of histograms is then detailed. In our work, twelve combinations of color space and quantization were selected, together with twelve histogram metrics. Their respective effectiveness with respect to picture similarity measurement was evaluated through a query-be-example scenario. For that purpose, a set of still-picture databases was built by extracting key-frames from several video clips, including news, documentaries, sports and cartoons. Classical retrieval performance evaluation criteria were adapted to the specificity of our testing methodology.
Quantitative evaluation of low-cost frame-grabber boards for personal computers.
Kofler, J M; Gray, J E; Fuelberth, J T; Taubel, J P
1995-11-01
Nine moderately priced frame-grabber boards for both Macintosh (Apple Computers, Cupertino, CA) and IBM-compatible computers were evaluated using a Society of Motion Pictures and Television Engineers (SMPTE) pattern and a video signal generator for dynamic range, gray-scale reproducibility, and spatial integrity of the captured image. The degradation of the video information ranged from minor to severe. Some boards are of reasonable quality for applications in diagnostic imaging and education. However, price and quality are not necessarily directly related.
OpenMP Parallelization and Optimization of Graph-based Machine Learning Algorithms
2016-05-01
composed of hyper - spectral video sequences recording the release of chemical plumes at the Dugway Proving Ground. We use the 329 frames of the...video. Each frame is a hyper - spectral image with dimension 128 × 320 × 129, where 129 is the dimension of the channel of each pixel. The total number of...j=1 . Then we use the nested for- loop to calculate the values of WXY by the formula (1). We then put the corresponding value in an array which
Content-based video retrieval by example video clip
NASA Astrophysics Data System (ADS)
Dimitrova, Nevenka; Abdel-Mottaleb, Mohamed
1997-01-01
This paper presents a novel approach for video retrieval from a large archive of MPEG or Motion JPEG compressed video clips. We introduce a retrieval algorithm that takes a video clip as a query and searches the database for clips with similar contents. Video clips are characterized by a sequence of representative frame signatures, which are constructed from DC coefficients and motion information (`DC+M' signatures). The similarity between two video clips is determined by using their respective signatures. This method facilitates retrieval of clips for the purpose of video editing, broadcast news retrieval, or copyright violation detection.
Optimal frame-by-frame result combination strategy for OCR in video stream
NASA Astrophysics Data System (ADS)
Bulatov, Konstantin; Lynchenko, Aleksander; Krivtsov, Valeriy
2018-04-01
This paper describes the problem of combining classification results of multiple observations of one object. This task can be regarded as a particular case of a decision-making using a combination of experts votes with calculated weights. The accuracy of various methods of combining the classification results depending on different models of input data is investigated on the example of frame-by-frame character recognition in a video stream. Experimentally it is shown that the strategy of choosing a single most competent expert in case of input data without irrelevant observations has an advantage (in this case irrelevant means with character localization and segmentation errors). At the same time this work demonstrates the advantage of combining several most competent experts according to multiplication rule or voting if irrelevant samples are present in the input data.
Detection of Abnormal Events via Optical Flow Feature Analysis
Wang, Tian; Snoussi, Hichem
2015-01-01
In this paper, a novel algorithm is proposed to detect abnormal events in video streams. The algorithm is based on the histogram of the optical flow orientation descriptor and the classification method. The details of the histogram of the optical flow orientation descriptor are illustrated for describing movement information of the global video frame or foreground frame. By combining one-class support vector machine and kernel principal component analysis methods, the abnormal events in the current frame can be detected after a learning period characterizing normal behaviors. The difference abnormal detection results are analyzed and explained. The proposed detection method is tested on benchmark datasets, then the experimental results show the effectiveness of the algorithm. PMID:25811227
Holographic optical coherence imaging of tumor spheroids
NASA Astrophysics Data System (ADS)
Yu, P.; Mustata, M.; Turek, J. J.; French, P. M. W.; Melloch, M. R.; Nolte, D. D.
2003-07-01
We present depth-resolved coherence-domain images of living tissue using a dynamic holographic semiconductor film. An AlGaAs photorefractive quantum-well device is used in an adaptive interferometer that records coherent backscattered (image-bearing) light from inside rat osteogenic sarcoma tumor spheroids up to 1 mm in diameter in vitro. The data consist of sequential holographic image frames at successive depths through the tumor represented as a visual video "fly-through." The images from the tumor spheroids reveal heterogeneous structures presumably caused by necrosis and microcalcifications characteristic of human tumors in their early avascular growth.
NASA Astrophysics Data System (ADS)
Barnett, Barry S.; Bovik, Alan C.
1995-04-01
This paper presents a real time full motion video conferencing system based on the Visual Pattern Image Sequence Coding (VPISC) software codec. The prototype system hardware is comprised of two personal computers, two camcorders, two frame grabbers, and an ethernet connection. The prototype system software has a simple structure. It runs under the Disk Operating System, and includes a user interface, a video I/O interface, an event driven network interface, and a free running or frame synchronous video codec that also acts as the controller for the video and network interfaces. Two video coders have been tested in this system. Simple implementations of Visual Pattern Image Coding and VPISC have both proven to support full motion video conferencing with good visual quality. Future work will concentrate on expanding this prototype to support the motion compensated version of VPISC, as well as encompassing point-to-point modem I/O and multiple network protocols. The application will be ported to multiple hardware platforms and operating systems. The motivation for developing this prototype system is to demonstrate the practicality of software based real time video codecs. Furthermore, software video codecs are not only cheaper, but are more flexible system solutions because they enable different computer platforms to exchange encoded video information without requiring on-board protocol compatible video codex hardware. Software based solutions enable true low cost video conferencing that fits the `open systems' model of interoperability that is so important for building portable hardware and software applications.
Robust video super-resolution with registration efficiency adaptation
NASA Astrophysics Data System (ADS)
Zhang, Xinfeng; Xiong, Ruiqin; Ma, Siwei; Zhang, Li; Gao, Wen
2010-07-01
Super-Resolution (SR) is a technique to construct a high-resolution (HR) frame by fusing a group of low-resolution (LR) frames describing the same scene. The effectiveness of the conventional super-resolution techniques, when applied on video sequences, strongly relies on the efficiency of motion alignment achieved by image registration. Unfortunately, such efficiency is limited by the motion complexity in the video and the capability of adopted motion model. In image regions with severe registration errors, annoying artifacts usually appear in the produced super-resolution video. This paper proposes a robust video super-resolution technique that adapts itself to the spatially-varying registration efficiency. The reliability of each reference pixel is measured by the corresponding registration error and incorporated into the optimization objective function of SR reconstruction. This makes the SR reconstruction highly immune to the registration errors, as outliers with higher registration errors are assigned lower weights in the objective function. In particular, we carefully design a mechanism to assign weights according to registration errors. The proposed superresolution scheme has been tested with various video sequences and experimental results clearly demonstrate the effectiveness of the proposed method.
Topical video object discovery from key frames by modeling word co-occurrence prior.
Zhao, Gangqiang; Yuan, Junsong; Hua, Gang; Yang, Jiong
2015-12-01
A topical video object refers to an object, that is, frequently highlighted in a video. It could be, e.g., the product logo and the leading actor/actress in a TV commercial. We propose a topic model that incorporates a word co-occurrence prior for efficient discovery of topical video objects from a set of key frames. Previous work using topic models, such as latent Dirichelet allocation (LDA), for video object discovery often takes a bag-of-visual-words representation, which ignored important co-occurrence information among the local features. We show that such data driven co-occurrence information from bottom-up can conveniently be incorporated in LDA with a Gaussian Markov prior, which combines top-down probabilistic topic modeling with bottom-up priors in a unified model. Our experiments on challenging videos demonstrate that the proposed approach can discover different types of topical objects despite variations in scale, view-point, color and lighting changes, or even partial occlusions. The efficacy of the co-occurrence prior is clearly demonstrated when compared with topic models without such priors.
A no-reference video quality assessment metric based on ROI
NASA Astrophysics Data System (ADS)
Jia, Lixiu; Zhong, Xuefei; Tu, Yan; Niu, Wenjuan
2015-01-01
A no reference video quality assessment metric based on the region of interest (ROI) was proposed in this paper. In the metric, objective video quality was evaluated by integrating the quality of two compressed artifacts, i.e. blurring distortion and blocking distortion. The Gaussian kernel function was used to extract the human density maps of the H.264 coding videos from the subjective eye tracking data. An objective bottom-up ROI extraction model based on magnitude discrepancy of discrete wavelet transform between two consecutive frames, center weighted color opponent model, luminance contrast model and frequency saliency model based on spectral residual was built. Then only the objective saliency maps were used to compute the objective blurring and blocking quality. The results indicate that the objective ROI extraction metric has a higher the area under the curve (AUC) value. Comparing with the conventional video quality assessment metrics which measured all the video quality frames, the metric proposed in this paper not only decreased the computation complexity, but improved the correlation between subjective mean opinion score (MOS) and objective scores.
Practical low-cost visual communication using binary images for deaf sign language.
Manoranjan, M D; Robinson, J A
2000-03-01
Deaf sign language transmitted by video requires a temporal resolution of 8 to 10 frames/s for effective communication. Conventional videoconferencing applications, when operated over low bandwidth telephone lines, provide very low temporal resolution of pictures, of the order of less than a frame per second, resulting in jerky movement of objects. This paper presents a practical solution for sign language communication, offering adequate temporal resolution of images using moving binary sketches or cartoons, implemented on standard personal computer hardware with low-cost cameras and communicating over telephone lines. To extract cartoon points an efficient feature extraction algorithm adaptive to the global statistics of the image is proposed. To improve the subjective quality of the binary images, irreversible preprocessing techniques, such as isolated point removal and predictive filtering, are used. A simple, efficient and fast recursive temporal prefiltering scheme, using histograms of successive frames, reduces the additive and multiplicative noise from low-cost cameras. An efficient three-dimensional (3-D) compression scheme codes the binary sketches. Subjective tests performed on the system confirm that it can be used for sign language communication over telephone lines.
NASA Astrophysics Data System (ADS)
Baca, Michael J.
1990-09-01
A system to display images generated by the Naval Postgraduate School Infrared Search and Target Designation (a modified AN/SAR-8 Advanced Development Model) in near real time was developed using a 33 MHz NIC computer as the central controller. This computer was enhanced with a Data Translation DT2861 Frame Grabber for image processing and an interface board designed and constructed at NPS to provide synchronization between the IRSTD and Frame Grabber. Images are displayed in false color in a video raster format on a 512 by 480 pixel resolution monitor. Using FORTRAN, programs have been written to acquire, unscramble, expand and display a 3 deg sector of data. The time line for acquisition, processing and display has been analyzed and repetition periods of less than four seconds for successive screen displays have been achieved. This represents a marked improvement over previous methods necessitating slower Direct Memory Access transfers of data into the Frame Grabber. Recommendations are made for further improvements to enhance the speed and utility of images produced.
On scalable lossless video coding based on sub-pixel accurate MCTF
NASA Astrophysics Data System (ADS)
Yea, Sehoon; Pearlman, William A.
2006-01-01
We propose two approaches to scalable lossless coding of motion video. They achieve SNR-scalable bitstream up to lossless reconstruction based upon the subpixel-accurate MCTF-based wavelet video coding. The first approach is based upon a two-stage encoding strategy where a lossy reconstruction layer is augmented by a following residual layer in order to obtain (nearly) lossless reconstruction. The key advantages of our approach include an 'on-the-fly' determination of bit budget distribution between the lossy and the residual layers, freedom to use almost any progressive lossy video coding scheme as the first layer and an added feature of near-lossless compression. The second approach capitalizes on the fact that we can maintain the invertibility of MCTF with an arbitrary sub-pixel accuracy even in the presence of an extra truncation step for lossless reconstruction thanks to the lifting implementation. Experimental results show that the proposed schemes achieve compression ratios not obtainable by intra-frame coders such as Motion JPEG-2000 thanks to their inter-frame coding nature. Also they are shown to outperform the state-of-the-art non-scalable inter-frame coder H.264 (JM) lossless mode, with the added benefit of bitstream embeddedness.
Ebe, Kazuyu; Sugimoto, Satoru; Utsunomiya, Satoru; Kagamu, Hiroshi; Aoyama, Hidefumi; Court, Laurence; Tokuyama, Katsuichi; Baba, Ryuta; Ogihara, Yoshisada; Ichikawa, Kosuke; Toyama, Joji
2015-08-01
To develop and evaluate a new video image-based QA system, including in-house software, that can display a tracking state visually and quantify the positional accuracy of dynamic tumor tracking irradiation in the Vero4DRT system. Sixteen trajectories in six patients with pulmonary cancer were obtained with the ExacTrac in the Vero4DRT system. Motion data in the cranio-caudal direction (Y direction) were used as the input for a programmable motion table (Quasar). A target phantom was placed on the motion table, which was placed on the 2D ionization chamber array (MatriXX). Then, the 4D modeling procedure was performed on the target phantom during a reproduction of the patient's tumor motion. A substitute target with the patient's tumor motion was irradiated with 6-MV x-rays under the surrogate infrared system. The 2D dose images obtained from the MatriXX (33 frames/s; 40 s) were exported to in-house video-image analyzing software. The absolute differences in the Y direction between the center of the exposed target and the center of the exposed field were calculated. Positional errors were observed. The authors' QA results were compared to 4D modeling function errors and gimbal motion errors obtained from log analyses in the ExacTrac to verify the accuracy of their QA system. The patients' tumor motions were evaluated in the wave forms, and the peak-to-peak distances were also measured to verify their reproducibility. Thirteen of sixteen trajectories (81.3%) were successfully reproduced with Quasar. The peak-to-peak distances ranged from 2.7 to 29.0 mm. Three trajectories (18.7%) were not successfully reproduced due to the limited motions of the Quasar. Thus, 13 of 16 trajectories were summarized. The mean number of video images used for analysis was 1156. The positional errors (absolute mean difference + 2 standard deviation) ranged from 0.54 to 1.55 mm. The error values differed by less than 1 mm from 4D modeling function errors and gimbal motion errors in the ExacTrac log analyses (n = 13). The newly developed video image-based QA system, including in-house software, can analyze more than a thousand images (33 frames/s). Positional errors are approximately equivalent to those in ExacTrac log analyses. This system is useful for the visual illustration of the progress of the tracking state and for the quantification of positional accuracy during dynamic tumor tracking irradiation in the Vero4DRT system.
Tackling action-based video abstraction of animated movies for video browsing
NASA Astrophysics Data System (ADS)
Ionescu, Bogdan; Ott, Laurent; Lambert, Patrick; Coquin, Didier; Pacureanu, Alexandra; Buzuloiu, Vasile
2010-07-01
We address the issue of producing automatic video abstracts in the context of the video indexing of animated movies. For a quick browse of a movie's visual content, we propose a storyboard-like summary, which follows the movie's events by retaining one key frame for each specific scene. To capture the shot's visual activity, we use histograms of cumulative interframe distances, and the key frames are selected according to the distribution of the histogram's modes. For a preview of the movie's exciting action parts, we propose a trailer-like video highlight, whose aim is to show only the most interesting parts of the movie. Our method is based on a relatively standard approach, i.e., highlighting action through the analysis of the movie's rhythm and visual activity information. To suit every type of movie content, including predominantly static movies or movies without exciting parts, the concept of action depends on the movie's average rhythm. The efficiency of our approach is confirmed through several end-user studies.
NASA Astrophysics Data System (ADS)
Al-Hayani, Nazar; Al-Jawad, Naseer; Jassim, Sabah A.
2014-05-01
Video compression and encryption became very essential in a secured real time video transmission. Applying both techniques simultaneously is one of the challenges where the size and the quality are important in multimedia transmission. In this paper we proposed a new technique for video compression and encryption. Both encryption and compression are based on edges extracted from the high frequency sub-bands of wavelet decomposition. The compression algorithm based on hybrid of: discrete wavelet transforms, discrete cosine transform, vector quantization, wavelet based edge detection, and phase sensing. The compression encoding algorithm treats the video reference and non-reference frames in two different ways. The encryption algorithm utilized A5 cipher combined with chaotic logistic map to encrypt the significant parameters and wavelet coefficients. Both algorithms can be applied simultaneously after applying the discrete wavelet transform on each individual frame. Experimental results show that the proposed algorithms have the following features: high compression, acceptable quality, and resistance to the statistical and bruteforce attack with low computational processing.
A Secure and Robust Object-Based Video Authentication System
NASA Astrophysics Data System (ADS)
He, Dajun; Sun, Qibin; Tian, Qi
2004-12-01
An object-based video authentication system, which combines watermarking, error correction coding (ECC), and digital signature techniques, is presented for protecting the authenticity between video objects and their associated backgrounds. In this system, a set of angular radial transformation (ART) coefficients is selected as the feature to represent the video object and the background, respectively. ECC and cryptographic hashing are applied to those selected coefficients to generate the robust authentication watermark. This content-based, semifragile watermark is then embedded into the objects frame by frame before MPEG4 coding. In watermark embedding and extraction, groups of discrete Fourier transform (DFT) coefficients are randomly selected, and their energy relationships are employed to hide and extract the watermark. The experimental results demonstrate that our system is robust to MPEG4 compression, object segmentation errors, and some common object-based video processing such as object translation, rotation, and scaling while securely preventing malicious object modifications. The proposed solution can be further incorporated into public key infrastructure (PKI).
NASA Astrophysics Data System (ADS)
Eaton, Adam; Vincely, Vinoin; Lloyd, Paige; Hugenberg, Kurt; Vishwanath, Karthik
2017-03-01
Video Photoplethysmography (VPPG) is a numerical technique to process standard RGB video data of exposed human skin and extracting the heart-rate (HR) from the skin areas. Being a non-contact technique, VPPG has the potential to provide estimates of subject's heart-rate, respiratory rate, and even the heart rate variability of human subjects with potential applications ranging from infant monitors, remote healthcare and psychological experiments, particularly given the non-contact and sensor-free nature of the technique. Though several previous studies have reported successful correlations in HR obtained using VPPG algorithms to HR measured using the gold-standard electrocardiograph, others have reported that these correlations are dependent on controlling for duration of the video-data analyzed, subject motion, and ambient lighting. Here, we investigate the ability of two commonly used VPPG-algorithms in extraction of human heart-rates under three different laboratory conditions. We compare the VPPG HR values extracted across these three sets of experiments to the gold-standard values acquired by using an electrocardiogram or a commercially available pulseoximeter. The two VPPG-algorithms were applied with and without KLT-facial feature tracking and detection algorithms from the Computer Vision MATLAB® toolbox. Results indicate that VPPG based numerical approaches have the ability to provide robust estimates of subject HR values and are relatively insensitive to the devices used to record the video data. However, they are highly sensitive to conditions of video acquisition including subject motion, the location, size and averaging techniques applied to regions-of-interest as well as to the number of video frames used for data processing.
A deep learning pipeline for Indian dance style classification
NASA Astrophysics Data System (ADS)
Dewan, Swati; Agarwal, Shubham; Singh, Navjyoti
2018-04-01
In this paper, we address the problem of dance style classification to classify Indian dance or any dance in general. We propose a 3-step deep learning pipeline. First, we extract 14 essential joint locations of the dancer from each video frame, this helps us to derive any body region location within the frame, we use this in the second step which forms the main part of our pipeline. Here, we divide the dancer into regions of important motion in each video frame. We then extract patches centered at these regions. Main discriminative motion is captured in these patches. We stack the features from all such patches of a frame into a single vector and form our hierarchical dance pose descriptor. Finally, in the third step, we build a high level representation of the dance video using the hierarchical descriptors and train it using a Recurrent Neural Network (RNN) for classification. Our novelty also lies in the way we use multiple representations for a single video. This helps us to: (1) Overcome the RNN limitation of learning small sequences over big sequences such as dance; (2) Extract more data from the available dataset for effective deep learning by training multiple representations. Our contributions in this paper are three-folds: (1) We provide a deep learning pipeline for classification of any form of dance; (2) We prove that a segmented representation of a dance video works well with sequence learning techniques for recognition purposes; (3) We extend and refine the ICD dataset and provide a new dataset for evaluation of dance. Our model performs comparable or better in some cases than the state-of-the-art on action recognition benchmarks.
Subjective quality evaluation of low-bit-rate video
NASA Astrophysics Data System (ADS)
Masry, Mark; Hemami, Sheila S.; Osberger, Wilfried M.; Rohaly, Ann M.
2001-06-01
A subjective quality evaluation was performed to qualify vie4wre responses to visual defects that appear in low bit rate video at full and reduced frame rates. The stimuli were eight sequences compressed by three motion compensated encoders - Sorenson Video, H.263+ and a Wavelet based coder - operating at five bit/frame rate combinations. The stimulus sequences exhibited obvious coding artifacts whose nature differed across the three coders. The subjective evaluation was performed using the Single Stimulus Continuos Quality Evaluation method of UTI-R Rec. BT.500-8. Viewers watched concatenated coded test sequences and continuously registered the perceived quality using a slider device. Data form 19 viewers was colleted. An analysis of their responses to the presence of various artifacts across the range of possible coding conditions and content is presented. The effects of blockiness and blurriness on perceived quality are examined. The effects of changes in frame rate on perceived quality are found to be related to the nature of the motion in the sequence.
Multi-frame super-resolution with quality self-assessment for retinal fundus videos.
Köhler, Thomas; Brost, Alexander; Mogalle, Katja; Zhang, Qianyi; Köhler, Christiane; Michelson, Georg; Hornegger, Joachim; Tornow, Ralf P
2014-01-01
This paper proposes a novel super-resolution framework to reconstruct high-resolution fundus images from multiple low-resolution video frames in retinal fundus imaging. Natural eye movements during an examination are used as a cue for super-resolution in a robust maximum a-posteriori scheme. In order to compensate heterogeneous illumination on the fundus, we integrate retrospective illumination correction for photometric registration to the underlying imaging model. Our method utilizes quality self-assessment to provide objective quality scores for reconstructed images as well as to select regularization parameters automatically. In our evaluation on real data acquired from six human subjects with a low-cost video camera, the proposed method achieved considerable enhancements of low-resolution frames and improved noise and sharpness characteristics by 74%. In terms of image analysis, we demonstrate the importance of our method for the improvement of automatic blood vessel segmentation as an example application, where the sensitivity was increased by 13% using super-resolution reconstruction.
Automated multiple target detection and tracking in UAV videos
NASA Astrophysics Data System (ADS)
Mao, Hongwei; Yang, Chenhui; Abousleman, Glen P.; Si, Jennie
2010-04-01
In this paper, a novel system is presented to detect and track multiple targets in Unmanned Air Vehicles (UAV) video sequences. Since the output of the system is based on target motion, we first segment foreground moving areas from the background in each video frame using background subtraction. To stabilize the video, a multi-point-descriptor-based image registration method is performed where a projective model is employed to describe the global transformation between frames. For each detected foreground blob, an object model is used to describe its appearance and motion information. Rather than immediately classifying the detected objects as targets, we track them for a certain period of time and only those with qualified motion patterns are labeled as targets. In the subsequent tracking process, a Kalman filter is assigned to each tracked target to dynamically estimate its position in each frame. Blobs detected at a later time are used as observations to update the state of the tracked targets to which they are associated. The proposed overlap-rate-based data association method considers the splitting and merging of the observations, and therefore is able to maintain tracks more consistently. Experimental results demonstrate that the system performs well on real-world UAV video sequences. Moreover, careful consideration given to each component in the system has made the proposed system feasible for real-time applications.
Detection of distorted frames in retinal video-sequences via machine learning
NASA Astrophysics Data System (ADS)
Kolar, Radim; Liberdova, Ivana; Odstrcilik, Jan; Hracho, Michal; Tornow, Ralf P.
2017-07-01
This paper describes detection of distorted frames in retinal sequences based on set of global features extracted from each frame. The feature vector is consequently used in classification step, in which three types of classifiers are tested. The best classification accuracy 96% has been achieved with support vector machine approach.
Kloefkorn, Heidi E.; Pettengill, Travis R.; Turner, Sara M. F.; Streeter, Kristi A.; Gonzalez-Rothi, Elisa J.; Fuller, David D.; Allen, Kyle D.
2016-01-01
While rodent gait analysis can quantify the behavioral consequences of disease, significant methodological differences exist between analysis platforms and little validation has been performed to understand or mitigate these sources of variance. By providing the algorithms used to quantify gait, open-source gait analysis software can be validated and used to explore methodological differences. Our group is introducing, for the first time, a fully-automated, open-source method for the characterization of rodent spatiotemporal gait patterns, termed Automated Gait Analysis Through Hues and Areas (AGATHA). This study describes how AGATHA identifies gait events, validates AGATHA relative to manual digitization methods, and utilizes AGATHA to detect gait compensations in orthopaedic and spinal cord injury models. To validate AGATHA against manual digitization, results from videos of rodent gait, recorded at 1000 frames per second (fps), were compared. To assess one common source of variance (the effects of video frame rate), these 1000 fps videos were re-sampled to mimic several lower fps and compared again. While spatial variables were indistinguishable between AGATHA and manual digitization, low video frame rates resulted in temporal errors for both methods. At frame rates over 125 fps, AGATHA achieved a comparable accuracy and precision to manual digitization for all gait variables. Moreover, AGATHA detected unique gait changes in each injury model. These data demonstrate AGATHA is an accurate and precise platform for the analysis of rodent spatiotemporal gait patterns. PMID:27554674
Kloefkorn, Heidi E; Pettengill, Travis R; Turner, Sara M F; Streeter, Kristi A; Gonzalez-Rothi, Elisa J; Fuller, David D; Allen, Kyle D
2017-03-01
While rodent gait analysis can quantify the behavioral consequences of disease, significant methodological differences exist between analysis platforms and little validation has been performed to understand or mitigate these sources of variance. By providing the algorithms used to quantify gait, open-source gait analysis software can be validated and used to explore methodological differences. Our group is introducing, for the first time, a fully-automated, open-source method for the characterization of rodent spatiotemporal gait patterns, termed Automated Gait Analysis Through Hues and Areas (AGATHA). This study describes how AGATHA identifies gait events, validates AGATHA relative to manual digitization methods, and utilizes AGATHA to detect gait compensations in orthopaedic and spinal cord injury models. To validate AGATHA against manual digitization, results from videos of rodent gait, recorded at 1000 frames per second (fps), were compared. To assess one common source of variance (the effects of video frame rate), these 1000 fps videos were re-sampled to mimic several lower fps and compared again. While spatial variables were indistinguishable between AGATHA and manual digitization, low video frame rates resulted in temporal errors for both methods. At frame rates over 125 fps, AGATHA achieved a comparable accuracy and precision to manual digitization for all gait variables. Moreover, AGATHA detected unique gait changes in each injury model. These data demonstrate AGATHA is an accurate and precise platform for the analysis of rodent spatiotemporal gait patterns.
Object tracking using multiple camera video streams
NASA Astrophysics Data System (ADS)
Mehrubeoglu, Mehrube; Rojas, Diego; McLauchlan, Lifford
2010-05-01
Two synchronized cameras are utilized to obtain independent video streams to detect moving objects from two different viewing angles. The video frames are directly correlated in time. Moving objects in image frames from the two cameras are identified and tagged for tracking. One advantage of such a system involves overcoming effects of occlusions that could result in an object in partial or full view in one camera, when the same object is fully visible in another camera. Object registration is achieved by determining the location of common features in the moving object across simultaneous frames. Perspective differences are adjusted. Combining information from images from multiple cameras increases robustness of the tracking process. Motion tracking is achieved by determining anomalies caused by the objects' movement across frames in time in each and the combined video information. The path of each object is determined heuristically. Accuracy of detection is dependent on the speed of the object as well as variations in direction of motion. Fast cameras increase accuracy but limit the speed and complexity of the algorithm. Such an imaging system has applications in traffic analysis, surveillance and security, as well as object modeling from multi-view images. The system can easily be expanded by increasing the number of cameras such that there is an overlap between the scenes from at least two cameras in proximity. An object can then be tracked long distances or across multiple cameras continuously, applicable, for example, in wireless sensor networks for surveillance or navigation.
Practical life log video indexing based on content and context
NASA Astrophysics Data System (ADS)
Tancharoen, Datchakorn; Yamasaki, Toshihiko; Aizawa, Kiyoharu
2006-01-01
Today, multimedia information has gained an important role in daily life and people can use imaging devices to capture their visual experiences. In this paper, we present our personal Life Log system to record personal experiences in form of wearable video and environmental data; in addition, an efficient retrieval system is demonstrated to recall the desirable media. We summarize the practical video indexing techniques based on Life Log content and context to detect talking scenes by using audio/visual cues and semantic key frames from GPS data. Voice annotation is also demonstrated as a practical indexing method. Moreover, we apply body media sensors to record continuous life style and use body media data to index the semantic key frames. In the experiments, we demonstrated various video indexing results which provided their semantic contents and showed Life Log visualizations to examine personal life effectively.
Extraction of Blebs in Human Embryonic Stem Cell Videos.
Guan, Benjamin X; Bhanu, Bir; Talbot, Prue; Weng, Nikki Jo-Hao
2016-01-01
Blebbing is an important biological indicator in determining the health of human embryonic stem cells (hESC). Especially, areas of a bleb sequence in a video are often used to distinguish two cell blebbing behaviors in hESC: dynamic and apoptotic blebbings. This paper analyzes various segmentation methods for bleb extraction in hESC videos and introduces a bio-inspired score function to improve the performance in bleb extraction. Full bleb formation consists of bleb expansion and retraction. Blebs change their size and image properties dynamically in both processes and between frames. Therefore, adaptive parameters are needed for each segmentation method. A score function derived from the change of bleb area and orientation between consecutive frames is proposed which provides adaptive parameters for bleb extraction in videos. In comparison to manual analysis, the proposed method provides an automated fast and accurate approach for bleb sequence extraction.
User-assisted video segmentation system for visual communication
NASA Astrophysics Data System (ADS)
Wu, Zhengping; Chen, Chun
2002-01-01
Video segmentation plays an important role for efficient storage and transmission in visual communication. In this paper, we introduce a novel video segmentation system using point tracking and contour formation techniques. Inspired by the results from the study of the human visual system, we intend to solve the video segmentation problem into three separate phases: user-assisted feature points selection, feature points' automatic tracking, and contour formation. This splitting relieves the computer of ill-posed automatic segmentation problems, and allows a higher level of flexibility of the method. First, the precise feature points can be found using a combination of user assistance and an eigenvalue-based adjustment. Second, the feature points in the remaining frames are obtained using motion estimation and point refinement. At last, contour formation is used to extract the object, and plus a point insertion process to provide the feature points for next frame's tracking.
New architecture for dynamic frame-skipping transcoder.
Fung, Kai-Tat; Chan, Yui-Lam; Siu, Wan-Chi
2002-01-01
Transcoding is a key technique for reducing the bit rate of a previously compressed video signal. A high transcoding ratio may result in an unacceptable picture quality when the full frame rate of the incoming video bitstream is used. Frame skipping is often used as an efficient scheme to allocate more bits to the representative frames, so that an acceptable quality for each frame can be maintained. However, the skipped frame must be decompressed completely, which might act as a reference frame to nonskipped frames for reconstruction. The newly quantized discrete cosine transform (DCT) coefficients of the prediction errors need to be re-computed for the nonskipped frame with reference to the previous nonskipped frame; this can create undesirable complexity as well as introduce re-encoding errors. In this paper, we propose new algorithms and a novel architecture for frame-rate reduction to improve picture quality and to reduce complexity. The proposed architecture is mainly performed on the DCT domain to achieve a transcoder with low complexity. With the direct addition of DCT coefficients and an error compensation feedback loop, re-encoding errors are reduced significantly. Furthermore, we propose a frame-rate control scheme which can dynamically adjust the number of skipped frames according to the incoming motion vectors and re-encoding errors due to transcoding such that the decoded sequence can have a smooth motion as well as better transcoded pictures. Experimental results show that, as compared to the conventional transcoder, the new architecture for frame-skipping transcoder is more robust, produces fewer requantization errors, and has reduced computational complexity.
Pre-processing SAR image stream to facilitate compression for transport on bandwidth-limited-link
Rush, Bobby G.; Riley, Robert
2015-09-29
Pre-processing is applied to a raw VideoSAR (or similar near-video rate) product to transform the image frame sequence into a product that resembles more closely the type of product for which conventional video codecs are designed, while sufficiently maintaining utility and visual quality of the product delivered by the codec.
NASA Astrophysics Data System (ADS)
Yang, Yongchao; Dorn, Charles; Mancini, Tyler; Talken, Zachary; Kenyon, Garrett; Farrar, Charles; Mascareñas, David
2017-02-01
Experimental or operational modal analysis traditionally requires physically-attached wired or wireless sensors for vibration measurement of structures. This instrumentation can result in mass-loading on lightweight structures, and is costly and time-consuming to install and maintain on large civil structures, especially for long-term applications (e.g., structural health monitoring) that require significant maintenance for cabling (wired sensors) or periodic replacement of the energy supply (wireless sensors). Moreover, these sensors are typically placed at a limited number of discrete locations, providing low spatial sensing resolution that is hardly sufficient for modal-based damage localization, or model correlation and updating for larger-scale structures. Non-contact measurement methods such as scanning laser vibrometers provide high-resolution sensing capacity without the mass-loading effect; however, they make sequential measurements that require considerable acquisition time. As an alternative non-contact method, digital video cameras are relatively low-cost, agile, and provide high spatial resolution, simultaneous, measurements. Combined with vision based algorithms (e.g., image correlation, optical flow), video camera based measurements have been successfully used for vibration measurements and subsequent modal analysis, based on techniques such as the digital image correlation (DIC) and the point-tracking. However, they typically require speckle pattern or high-contrast markers to be placed on the surface of structures, which poses challenges when the measurement area is large or inaccessible. This work explores advanced computer vision and video processing algorithms to develop a novel video measurement and vision-based operational (output-only) modal analysis method that alleviate the need of structural surface preparation associated with existing vision-based methods and can be implemented in a relatively efficient and autonomous manner with little user supervision and calibration. First a multi-scale image processing method is applied on the frames of the video of a vibrating structure to extract the local pixel phases that encode local structural vibration, establishing a full-field spatiotemporal motion matrix. Then a high-spatial dimensional, yet low-modal-dimensional, over-complete model is used to represent the extracted full-field motion matrix using modal superposition, which is physically connected and manipulated by a family of unsupervised learning models and techniques, respectively. Thus, the proposed method is able to blindly extract modal frequencies, damping ratios, and full-field (as many points as the pixel number of the video frame) mode shapes from line of sight video measurements of the structure. The method is validated by laboratory experiments on a bench-scale building structure and a cantilever beam. Its ability for output (video measurements)-only identification and visualization of the weakly-excited mode is demonstrated and several issues with its implementation are discussed.
NASA Astrophysics Data System (ADS)
Ershov, Egor; Karnaukhov, Victor; Mozerov, Mikhail
2016-02-01
Two consecutive frames of a lateral navigation camera video sequence can be considered as an appropriate approximation to epipolar stereo. To overcome edge-aware inaccuracy caused by occlusion, we propose a model that matches the current frame to the next and to the previous ones. The positive disparity of matching to the previous frame has its symmetric negative disparity to the next frame. The proposed algorithm performs probabilistic choice for each matched pixel between the positive disparity and its symmetric disparity cost. A disparity map obtained by optimization over the cost volume composed of the proposed probabilistic choice is more accurate than the traditional left-to-right and right-to-left disparity maps cross-check. Also, our algorithm needs two times less computational operations per pixel than the cross-check technique. The effectiveness of our approach is demonstrated on synthetic data and real video sequences, with ground-truth value.
High-Speed Videography Overview
NASA Astrophysics Data System (ADS)
Miller, C. E.
1989-02-01
The field of high-speed videography (HSV) has continued to mature in recent years, due to the introduction of a mixture of new technology and extensions of existing technology. Recent low frame-rate innovations have the potential to dramatically expand the areas of information gathering and motion analysis at all frame-rates. Progress at the 0 - rate is bringing the battle of film versus video to the field of still photography. The pressure to push intermediate frame rates higher continues, although the maximum achievable frame rate has remained stable for several years. Higher maximum recording rates appear technologically practical, but economic factors impose severe limitations to development. The application of diverse photographic techniques to video-based systems is under-exploited. The basics of HSV apply to other fields, such as machine vision and robotics. Present motion analysis systems continue to function mainly as an instant replay replacement for high-speed movie film cameras. The interrelationship among lighting, shuttering and spatial resolution is examined.
High Resolution, High Frame Rate Video Technology
NASA Technical Reports Server (NTRS)
1990-01-01
Papers and working group summaries presented at the High Resolution, High Frame Rate Video (HHV) Workshop are compiled. HHV system is intended for future use on the Space Shuttle and Space Station Freedom. The Workshop was held for the dual purpose of: (1) allowing potential scientific users to assess the utility of the proposed system for monitoring microgravity science experiments; and (2) letting technical experts from industry recommend improvements to the proposed near-term HHV system. The following topics are covered: (1) State of the art in the video system performance; (2) Development plan for the HHV system; (3) Advanced technology for image gathering, coding, and processing; (4) Data compression applied to HHV; (5) Data transmission networks; and (6) Results of the users' requirements survey conducted by NASA.
Static hand gesture recognition from a video
NASA Astrophysics Data System (ADS)
Rokade, Rajeshree S.; Doye, Dharmpal
2011-10-01
A sign language (also signed language) is a language which, instead of acoustically conveyed sound patterns, uses visually transmitted sign patterns to convey meaning- "simultaneously combining hand shapes, orientation and movement of the hands". Sign languages commonly develop in deaf communities, which can include interpreters, friends and families of deaf people as well as people who are deaf or hard of hearing themselves. In this paper, we proposed a novel system for recognition of static hand gestures from a video, based on Kohonen neural network. We proposed algorithm to separate out key frames, which include correct gestures from a video sequence. We segment, hand images from complex and non uniform background. Features are extracted by applying Kohonen on key frames and recognition is done.
A passive terahertz video camera based on lumped element kinetic inductance detectors
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rowe, Sam, E-mail: sam.rowe@astro.cf.ac.uk; Pascale, Enzo; Doyle, Simon
We have developed a passive 350 GHz (850 μm) video-camera to demonstrate lumped element kinetic inductance detectors (LEKIDs)—designed originally for far-infrared astronomy—as an option for general purpose terrestrial terahertz imaging applications. The camera currently operates at a quasi-video frame rate of 2 Hz with a noise equivalent temperature difference per frame of ∼0.1 K, which is close to the background limit. The 152 element superconducting LEKID array is fabricated from a simple 40 nm aluminum film on a silicon dielectric substrate and is read out through a single microwave feedline with a cryogenic low noise amplifier and room temperature frequencymore » domain multiplexing electronics.« less
Video Encryption and Decryption on Quantum Computers
NASA Astrophysics Data System (ADS)
Yan, Fei; Iliyasu, Abdullah M.; Venegas-Andraca, Salvador E.; Yang, Huamin
2015-08-01
A method for video encryption and decryption on quantum computers is proposed based on color information transformations on each frame encoding the content of the encoding the content of the video. The proposed method provides a flexible operation to encrypt quantum video by means of the quantum measurement in order to enhance the security of the video. To validate the proposed approach, a tetris tile-matching puzzle game video is utilized in the experimental simulations. The results obtained suggest that the proposed method enhances the security and speed of quantum video encryption and decryption, both properties required for secure transmission and sharing of video content in quantum communication.
Mantokoudis, Georgios; Dähler, Claudia; Dubach, Patrick; Kompis, Martin; Caversaccio, Marco D.; Senn, Pascal
2013-01-01
Objective To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Methods Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280×720, 640×480, 320×240, 160×120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0–500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. Results Higher frame rate (>7 fps), higher camera resolution (>640×480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). Conclusion Webcameras have the potential to improve telecommunication of hearing-impaired individuals. PMID:23359119
Face landmark point tracking using LK pyramid optical flow
NASA Astrophysics Data System (ADS)
Zhang, Gang; Tang, Sikan; Li, Jiaquan
2018-04-01
LK pyramid optical flow is an effective method to implement object tracking in a video. It is used for face landmark point tracking in a video in the paper. The landmark points, i.e. outer corner of left eye, inner corner of left eye, inner corner of right eye, outer corner of right eye, tip of a nose, left corner of mouth, right corner of mouth, are considered. It is in the first frame that the landmark points are marked by hand. For subsequent frames, performance of tracking is analyzed. Two kinds of conditions are considered, i.e. single factors such as normalized case, pose variation and slowly moving, expression variation, illumination variation, occlusion, front face and rapidly moving, pose face and rapidly moving, and combination of the factors such as pose and illumination variation, pose and expression variation, pose variation and occlusion, illumination and expression variation, expression variation and occlusion. Global measures and local ones are introduced to evaluate performance of tracking under different factors or combination of the factors. The global measures contain the number of images aligned successfully, average alignment error, the number of images aligned before failure, and the local ones contain the number of images aligned successfully for components of a face, average alignment error for the components. To testify performance of tracking for face landmark points under different cases, tests are carried out for image sequences gathered by us. Results show that the LK pyramid optical flow method can implement face landmark point tracking under normalized case, expression variation, illumination variation which does not affect facial details, pose variation, and that different factors or combination of the factors have different effect on performance of alignment for different landmark points.
Slow speed—fast motion: time-lapse recordings in physics education
NASA Astrophysics Data System (ADS)
Vollmer, Michael; Möllmann, Klaus-Peter
2018-05-01
Video analysis with a 30 Hz frame rate is the standard tool in physics education. The development of affordable high-speed-cameras has extended the capabilities of the tool for much smaller time scales to the 1 ms range, using frame rates of typically up to 1000 frames s-1, allowing us to study transient physics phenomena happening too fast for the naked eye. Here we want to extend the range of phenomena which may be studied by video analysis in the opposite direction by focusing on much longer time scales ranging from minutes, hours to many days or even months. We discuss this time-lapse method, needed equipment and give a few hints of how to produce respective recordings for two specific experiments.
A system for endobronchial video analysis
NASA Astrophysics Data System (ADS)
Byrnes, Patrick D.; Higgins, William E.
2017-03-01
Image-guided bronchoscopy is a critical component in the treatment of lung cancer and other pulmonary disorders. During bronchoscopy, a high-resolution endobronchial video stream facilitates guidance through the lungs and allows for visual inspection of a patient's airway mucosal surfaces. Despite the detailed information it contains, little effort has been made to incorporate recorded video into the clinical workflow. Follow-up procedures often required in cancer assessment or asthma treatment could significantly benefit from effectively parsed and summarized video. Tracking diagnostic regions of interest (ROIs) could potentially better equip physicians to detect early airway-wall cancer or improve asthma treatments, such as bronchial thermoplasty. To address this need, we have developed a system for the postoperative analysis of recorded endobronchial video. The system first parses an input video stream into endoscopic shots, derives motion information, and selects salient representative key frames. Next, a semi-automatic method for CT-video registration creates data linkages between a CT-derived airway-tree model and the input video. These data linkages then enable the construction of a CT-video chest model comprised of a bronchoscopy path history (BPH) - defining all airway locations visited during a procedure - and texture-mapping information for rendering registered video frames onto the airwaytree model. A suite of analysis tools is included to visualize and manipulate the extracted data. Video browsing and retrieval is facilitated through a video table of contents (TOC) and a search query interface. The system provides a variety of operational modes and additional functionality, including the ability to define regions of interest. We demonstrate the potential of our system using two human case study examples.
A semi-automatic 2D-to-3D video conversion with adaptive key-frame selection
NASA Astrophysics Data System (ADS)
Ju, Kuanyu; Xiong, Hongkai
2014-11-01
To compensate the deficit of 3D content, 2D to 3D video conversion (2D-to-3D) has recently attracted more attention from both industrial and academic communities. The semi-automatic 2D-to-3D conversion which estimates corresponding depth of non-key-frames through key-frames is more desirable owing to its advantage of balancing labor cost and 3D effects. The location of key-frames plays a role on quality of depth propagation. This paper proposes a semi-automatic 2D-to-3D scheme with adaptive key-frame selection to keep temporal continuity more reliable and reduce the depth propagation errors caused by occlusion. The potential key-frames would be localized in terms of clustered color variation and motion intensity. The distance of key-frame interval is also taken into account to keep the accumulated propagation errors under control and guarantee minimal user interaction. Once their depth maps are aligned with user interaction, the non-key-frames depth maps would be automatically propagated by shifted bilateral filtering. Considering that depth of objects may change due to the objects motion or camera zoom in/out effect, a bi-directional depth propagation scheme is adopted where a non-key frame is interpolated from two adjacent key frames. The experimental results show that the proposed scheme has better performance than existing 2D-to-3D scheme with fixed key-frame interval.
JPEG XS-based frame buffer compression inside HEVC for power-aware video compression
NASA Astrophysics Data System (ADS)
Willème, Alexandre; Descampe, Antonin; Rouvroy, Gaël.; Pellegrin, Pascal; Macq, Benoit
2017-09-01
With the emergence of Ultra-High Definition video, reference frame buffers (FBs) inside HEVC-like encoders and decoders have to sustain huge bandwidth. The power consumed by these external memory accesses accounts for a significant share of the codec's total consumption. This paper describes a solution to significantly decrease the FB's bandwidth, making HEVC encoder more suitable for use in power-aware applications. The proposed prototype consists in integrating an embedded lightweight, low-latency and visually lossless codec at the FB interface inside HEVC in order to store each reference frame as several compressed bitstreams. As opposed to previous works, our solution compresses large picture areas (ranging from a CTU to a frame stripe) independently in order to better exploit the spatial redundancy found in the reference frame. This work investigates two data reuse schemes namely Level-C and Level-D. Our approach is made possible thanks to simplified motion estimation mechanisms further reducing the FB's bandwidth and inducing very low quality degradation. In this work, we integrated JPEG XS, the upcoming standard for lightweight low-latency video compression, inside HEVC. In practice, the proposed implementation is based on HM 16.8 and on XSM 1.1.2 (JPEG XS Test Model). Through this paper, the architecture of our HEVC with JPEG XS-based frame buffer compression is described. Then its performance is compared to HM encoder. Compared to previous works, our prototype provides significant external memory bandwidth reduction. Depending on the reuse scheme, one can expect bandwidth and FB size reduction ranging from 50% to 83.3% without significant quality degradation.
Adaptive Distributed Video Coding with Correlation Estimation using Expectation Propagation
Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel
2013-01-01
Distributed video coding (DVC) is rapidly increasing in popularity by the way of shifting the complexity from encoder to decoder, whereas no compression performance degrades, at least in theory. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at decoder based on the received syndromes of Wyner-Ziv (WZ) frame and side information (SI) frame generated from other frames available only at decoder. However, the ultimate decoding performances of DVC are based on the assumption that the perfect knowledge of correlation statistic between WZ and SI frames should be available at decoder. Therefore, the ability of obtaining a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation where estimation starts before decoding and on-the-fly (OTF) estimation where estimation can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamical, OTF estimation methods usually outperforms pre-estimation techniques with the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly low complexity comparing with sampling method. PMID:23750314
Adaptive distributed video coding with correlation estimation using expectation propagation
NASA Astrophysics Data System (ADS)
Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel
2012-10-01
Distributed video coding (DVC) is rapidly increasing in popularity by the way of shifting the complexity from encoder to decoder, whereas no compression performance degrades, at least in theory. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at decoder based on the received syndromes of Wyner-Ziv (WZ) frame and side information (SI) frame generated from other frames available only at decoder. However, the ultimate decoding performances of DVC are based on the assumption that the perfect knowledge of correlation statistic between WZ and SI frames should be available at decoder. Therefore, the ability of obtaining a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation where estimation starts before decoding and on-the-fly (OTF) estimation where estimation can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamical, OTF estimation methods usually outperforms pre-estimation techniques with the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly low complexity comparing with sampling method.
Adaptive Distributed Video Coding with Correlation Estimation using Expectation Propagation.
Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel
2012-10-15
Distributed video coding (DVC) is rapidly increasing in popularity by the way of shifting the complexity from encoder to decoder, whereas no compression performance degrades, at least in theory. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at decoder based on the received syndromes of Wyner-Ziv (WZ) frame and side information (SI) frame generated from other frames available only at decoder. However, the ultimate decoding performances of DVC are based on the assumption that the perfect knowledge of correlation statistic between WZ and SI frames should be available at decoder. Therefore, the ability of obtaining a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation where estimation starts before decoding and on-the-fly (OTF) estimation where estimation can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamical, OTF estimation methods usually outperforms pre-estimation techniques with the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly low complexity comparing with sampling method.
TLE Balloon experiment campaign carried out on 25 August 2006 in Japan
NASA Astrophysics Data System (ADS)
Takahashi, Y.; Chikada, S.; Yoshida, A.; Adachi, T.; Sakanoi, T.
2006-12-01
The balloon observation campaign for TLE and lightning study was carried out 25 August 2006 in Japan by Tohoku University, supported by JAXA. The balloon was successfully launched at 18:33 LT at Sanriku Balloon Center of JAXA located in the east coast of northern part of Japan (Iwate prefecture). Three types of scientific payloads were installed at the 1 m-cubic gondola, that is, 3-axis VLF electric filed antenna and receiver (VLFR), 4 video frame CCD cameras (CCDI) and 2-color photometer (PM). The video images were stored in 4 HD video recorders, which have 20GB memories respectively, at 30 frames/sec and VLFR and PM data were put into digital data recorder with 30 GB memory at sampling rate of 100 kHz. The balloon floated at the altitude of 13 km until about 20:30 LT, going eastward and went up to 26 km at a distance of 130 km from the coast. And it went back westward at the altitude of 26 km until midnight. The total observation period is about 5 hours. Most of the equipments worked properly except for one video recorder. Some thunderstorms existed within the direct FOV from the balloon in the range of 400-600 km and more than about 400 lightning flashes were recorded as video images. We confirmed that, at least, one sprite halo was captured by CCDI which occurred in the oceanic thunderstorm at a distance of about 500 km from balloon. This is the first TLE image obtained by a balloon-borne camera. Simultaneous measurements of VLF sferics and lightning/TLE images will clarify the role of intracloud (IC) currents in producing and/or modulating TLEs as well as cloud-to-ground discharges (CG). Especially the effect of horizontal components will be investigated in detail, which cannot be detected on the ground, to explain the unsolved properties of TLEs, such as long time delay of TLE from the timing of stroke and large horizontal displacement between CG and TLEs.
Photometric Calibration of Consumer Video Cameras
NASA Technical Reports Server (NTRS)
Suggs, Robert; Swift, Wesley, Jr.
2007-01-01
Equipment and techniques have been developed to implement a method of photometric calibration of consumer video cameras for imaging of objects that are sufficiently narrow or sufficiently distant to be optically equivalent to point or line sources. Heretofore, it has been difficult to calibrate consumer video cameras, especially in cases of image saturation, because they exhibit nonlinear responses with dynamic ranges much smaller than those of scientific-grade video cameras. The present method not only takes this difficulty in stride but also makes it possible to extend effective dynamic ranges to several powers of ten beyond saturation levels. The method will likely be primarily useful in astronomical photometry. There are also potential commercial applications in medical and industrial imaging of point or line sources in the presence of saturation.This development was prompted by the need to measure brightnesses of debris in amateur video images of the breakup of the Space Shuttle Columbia. The purpose of these measurements is to use the brightness values to estimate relative masses of debris objects. In most of the images, the brightness of the main body of Columbia was found to exceed the dynamic ranges of the cameras. A similar problem arose a few years ago in the analysis of video images of Leonid meteors. The present method is a refined version of the calibration method developed to solve the Leonid calibration problem. In this method, one performs an endto- end calibration of the entire imaging system, including not only the imaging optics and imaging photodetector array but also analog tape recording and playback equipment (if used) and any frame grabber or other analog-to-digital converter (if used). To automatically incorporate the effects of nonlinearity and any other distortions into the calibration, the calibration images are processed in precisely the same manner as are the images of meteors, space-shuttle debris, or other objects that one seeks to analyze. The light source used to generate the calibration images is an artificial variable star comprising a Newtonian collimator illuminated by a light source modulated by a rotating variable neutral- density filter. This source acts as a point source, the brightness of which varies at a known rate. A video camera to be calibrated is aimed at this source. Fixed neutral-density filters are inserted in or removed from the light path as needed to make the video image of the source appear to fluctuate between dark and saturated bright. The resulting video-image data are analyzed by use of custom software that determines the integrated signal in each video frame and determines the system response curve (measured output signal versus input brightness). These determinations constitute the calibration, which is thereafter used in automatic, frame-by-frame processing of the data from the video images to be analyzed.
Improved Discrete Approximation of Laplacian of Gaussian
NASA Technical Reports Server (NTRS)
Shuler, Robert L., Jr.
2004-01-01
An improved method of computing a discrete approximation of the Laplacian of a Gaussian convolution of an image has been devised. The primary advantage of the method is that without substantially degrading the accuracy of the end result, it reduces the amount of information that must be processed and thus reduces the amount of circuitry needed to perform the Laplacian-of- Gaussian (LOG) operation. Some background information is necessary to place the method in context. The method is intended for application to the LOG part of a process of real-time digital filtering of digitized video data that represent brightnesses in pixels in a square array. The particular filtering process of interest is one that converts pixel brightnesses to binary form, thereby reducing the amount of information that must be performed in subsequent correlation processing (e.g., correlations between images in a stereoscopic pair for determining distances or correlations between successive frames of the same image for detecting motions). The Laplacian is often included in the filtering process because it emphasizes edges and textures, while the Gaussian is often included because it smooths out noise that might not be consistent between left and right images or between successive frames of the same image.
Method and System for Temporal Filtering in Video Compression Systems
NASA Technical Reports Server (NTRS)
Lu, Ligang; He, Drake; Jagmohan, Ashish; Sheinin, Vadim
2011-01-01
Three related innovations combine improved non-linear motion estimation, video coding, and video compression. The first system comprises a method in which side information is generated using an adaptive, non-linear motion model. This method enables extrapolating and interpolating a visual signal, including determining the first motion vector between the first pixel position in a first image to a second pixel position in a second image; determining a second motion vector between the second pixel position in the second image and a third pixel position in a third image; determining a third motion vector between the first pixel position in the first image and the second pixel position in the second image, the second pixel position in the second image, and the third pixel position in the third image using a non-linear model; and determining a position of the fourth pixel in a fourth image based upon the third motion vector. For the video compression element, the video encoder has low computational complexity and high compression efficiency. The disclosed system comprises a video encoder and a decoder. The encoder converts the source frame into a space-frequency representation, estimates the conditional statistics of at least one vector of space-frequency coefficients with similar frequencies, and is conditioned on previously encoded data. It estimates an encoding rate based on the conditional statistics and applies a Slepian-Wolf code with the computed encoding rate. The method for decoding includes generating a side-information vector of frequency coefficients based on previously decoded source data and encoder statistics and previous reconstructions of the source frequency vector. It also performs Slepian-Wolf decoding of a source frequency vector based on the generated side-information and the Slepian-Wolf code bits. The video coding element includes receiving a first reference frame having a first pixel value at a first pixel position, a second reference frame having a second pixel value at a second pixel position, and a third reference frame having a third pixel value at a third pixel position. It determines a first motion vector between the first pixel position and the second pixel position, a second motion vector between the second pixel position and the third pixel position, and a fourth pixel value for a fourth frame based upon a linear or nonlinear combination of the first pixel value, the second pixel value, and the third pixel value. A stationary filtering process determines the estimated pixel values. The parameters of the filter may be predetermined constants.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ingram, W; Yang, J; Beadle, B
Purpose: Endoscopic examinations are routine procedures for head-and-neck cancer patients. Our goal is to develop a method to map the recorded video to CT, providing valuable information for radiotherapy treatment planning and toxicity analysis. Methods: We map video frames to CT via virtual endoscopic images rendered at the real endoscope’s CT-space coordinates. We developed two complementary methods to find these coordinates by maximizing real-to-virtual image similarity:(1)Endoscope Tracking: moves the virtual endoscope frame-by-frame until the desired frame is reached. Utilizes prior knowledge of endoscope coordinates, but sensitive to local optima. (2)Location Search: moves the virtual endoscope along possible paths through themore » volume to find the desired frame. More robust, but more computationally expensive. We tested these methods on clay phantoms with embedded markers for point mapping and protruding bolus material for contour mapping, and we assessed them qualitatively on three patient exams. For mapped points we calculated 3D-distance errors, and for mapped contours we calculated mean absolute distances (MAD) from CT contours. Results: In phantoms, Endoscope Tracking had average point error=0.66±0.50cm and average bolus MAD=0.74±0.37cm for the first 80% of each video. After that the virtual endoscope got lost, increasing these values to 4.73±1.69cm and 4.06±0.30cm. Location Search had point error=0.49±0.44cm and MAD=0.53±0.28cm. Point errors were larger where the endoscope viewed the surface at shallow angles<10 degrees (1.38±0.62cm and 1.22±0.69cm for Endoscope Tracking and Location Search, respectively). In patients, Endoscope Tracking did not make it past the nasal cavity. However, Location Search found coordinates near the correct location for 70% of test frames. Its performance was best near the epiglottis and in the nasal cavity. Conclusion: Location Search is a robust and accurate technique to map endoscopic video to CT. Endoscope Tracking is sensitive to erratic camera motion and local optima, but could be used in conjunction with anchor points found using Location Search.« less
Standardization of Freeze Frame TV Codecs
1990-06-01
Kodak SV9600 Still Video Transceiver Colorado Video, Inc.286 Digital Transceiver Image Data Corp. CP-200 Photophone Interand Corp. DISCON Imagephone...error recovery Proprietary Proprby retransmission errorIMAGE BUILD-UP Sequential Sequential PHOTOPHONE Video Teleconferenc- DISCON Imaqephone GENERIC...and information transfer is effected among terminals. An indication of the function and power of these commands can be obtained by reviewing Table
ERIC Educational Resources Information Center
Danielowich, Robert M.
2014-01-01
Since many studies that use video to support teacher learning are situated in strongly guided contexts and encourage particular kinds of thinking, we still know very little about how more loosely guided contexts can support teachers to think about the dilemmas of practice associated with their own goals by reflecting about video. This study…
ERIC Educational Resources Information Center
Cain, William Christopher
2017-01-01
The following study was framed around a simple question: when a group of people is engaged in video conferencing, "what sort of things can they do to improve their group dynamics?" This is an important question for current and future educational practice because web-based video conferencing has increasingly become an important tool for…
Annotations of Mexican bullfighting videos for semantic index
NASA Astrophysics Data System (ADS)
Montoya Obeso, Abraham; Oropesa Morales, Lester Arturo; Fernando Vázquez, Luis; Cocolán Almeda, Sara Ivonne; Stoian, Andrei; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Montiel Perez, Jesús Yalja; de la O Torres, Saul; Ramírez Acosta, Alejandro Alvaro
2015-09-01
The video annotation is important for web indexing and browsing systems. Indeed, in order to evaluate the performance of video query and mining techniques, databases with concept annotations are required. Therefore, it is necessary generate a database with a semantic indexing that represents the digital content of the Mexican bullfighting atmosphere. This paper proposes a scheme to make complex annotations in a video in the frame of multimedia search engine project. Each video is partitioned using our segmentation algorithm that creates shots of different length and different number of frames. In order to make complex annotations about the video, we use ELAN software. The annotations are done in two steps: First, we take note about the whole content in each shot. Second, we describe the actions as parameters of the camera like direction, position and deepness. As a consequence, we obtain a more complete descriptor of every action. In both cases we use the concepts of the TRECVid 2014 dataset. We also propose new concepts. This methodology allows to generate a database with the necessary information to create descriptors and algorithms capable to detect actions to automatically index and classify new bullfighting multimedia content.
Automatic generation of pictorial transcripts of video programs
NASA Astrophysics Data System (ADS)
Shahraray, Behzad; Gibbon, David C.
1995-03-01
An automatic authoring system for the generation of pictorial transcripts of video programs which are accompanied by closed caption information is presented. A number of key frames, each of which represents the visual information in a segment of the video (i.e., a scene), are selected automatically by performing a content-based sampling of the video program. The textual information is recovered from the closed caption signal and is initially segmented based on its implied temporal relationship with the video segments. The text segmentation boundaries are then adjusted, based on lexical analysis and/or caption control information, to account for synchronization errors due to possible delays in the detection of scene boundaries or the transmission of the caption information. The closed caption text is further refined through linguistic processing for conversion to lower- case with correct capitalization. The key frames and the related text generate a compact multimedia presentation of the contents of the video program which lends itself to efficient storage and transmission. This compact representation can be viewed on a computer screen, or used to generate the input to a commercial text processing package to generate a printed version of the program.
Fuzzy Filtering Method for Color Videos Corrupted by Additive Noise
Ponomaryov, Volodymyr I.; Montenegro-Monroy, Hector; Nino-de-Rivera, Luis
2014-01-01
A novel method for the denoising of color videos corrupted by additive noise is presented in this paper. The proposed technique consists of three principal filtering steps: spatial, spatiotemporal, and spatial postprocessing. In contrast to other state-of-the-art algorithms, during the first spatial step, the eight gradient values in different directions for pixels located in the vicinity of a central pixel as well as the R, G, and B channel correlation between the analogous pixels in different color bands are taken into account. These gradient values give the information about the level of contamination then the designed fuzzy rules are used to preserve the image features (textures, edges, sharpness, chromatic properties, etc.). In the second step, two neighboring video frames are processed together. Possible local motions between neighboring frames are estimated using block matching procedure in eight directions to perform interframe filtering. In the final step, the edges and smoothed regions in a current frame are distinguished for final postprocessing filtering. Numerous simulation results confirm that this novel 3D fuzzy method performs better than other state-of-the-art techniques in terms of objective criteria (PSNR, MAE, NCD, and SSIM) as well as subjective perception via the human vision system in the different color videos. PMID:24688428
Constrained motion estimation-based error resilient coding for HEVC
NASA Astrophysics Data System (ADS)
Guo, Weihan; Zhang, Yongfei; Li, Bo
2018-04-01
Unreliable communication channels might lead to packet losses and bit errors in the videos transmitted through it, which will cause severe video quality degradation. This is even worse for HEVC since more advanced and powerful motion estimation methods are introduced to further remove the inter-frame dependency and thus improve the coding efficiency. Once a Motion Vector (MV) is lost or corrupted, it will cause distortion in the decoded frame. More importantly, due to motion compensation, the error will propagate along the motion prediction path, accumulate over time, and significantly degrade the overall video presentation quality. To address this problem, we study the problem of encoder-sider error resilient coding for HEVC and propose a constrained motion estimation scheme to mitigate the problem of error propagation to subsequent frames. The approach is achieved by cutting off MV dependencies and limiting the block regions which are predicted by temporal motion vector. The experimental results show that the proposed method can effectively suppress the error propagation caused by bit errors of motion vector and can improve the robustness of the stream in the bit error channels. When the bit error probability is 10-5, an increase of the decoded video quality (PSNR) by up to1.310dB and on average 0.762 dB can be achieved, compared to the reference HEVC.
Research of real-time video processing system based on 6678 multi-core DSP
NASA Astrophysics Data System (ADS)
Li, Xiangzhen; Xie, Xiaodan; Yin, Xiaoqiang
2017-10-01
In the information age, the rapid development in the direction of intelligent video processing, complex algorithm proposed the powerful challenge on the performance of the processor. In this article, through the FPGA + TMS320C6678 frame structure, the image to fog, merge into an organic whole, to stabilize the image enhancement, its good real-time, superior performance, break through the traditional function of video processing system is simple, the product defects such as single, solved the video application in security monitoring, video, etc. Can give full play to the video monitoring effectiveness, improve enterprise economic benefits.
Tracking cells in Life Cell Imaging videos using topological alignments.
Mosig, Axel; Jäger, Stefan; Wang, Chaofeng; Nath, Sumit; Ersoy, Ilker; Palaniappan, Kannap-pan; Chen, Su-Shing
2009-07-16
With the increasing availability of live cell imaging technology, tracking cells and other moving objects in live cell videos has become a major challenge for bioimage informatics. An inherent problem for most cell tracking algorithms is over- or under-segmentation of cells - many algorithms tend to recognize one cell as several cells or vice versa. We propose to approach this problem through so-called topological alignments, which we apply to address the problem of linking segmentations of two consecutive frames in the video sequence. Starting from the output of a conventional segmentation procedure, we align pairs of consecutive frames through assigning sets of segments in one frame to sets of segments in the next frame. We achieve this through finding maximum weighted solutions to a generalized "bipartite matching" between two hierarchies of segments, where we derive weights from relative overlap scores of convex hulls of sets of segments. For solving the matching task, we rely on an integer linear program. Practical experiments demonstrate that the matching task can be solved efficiently in practice, and that our method is both effective and useful for tracking cells in data sets derived from a so-called Large Scale Digital Cell Analysis System (LSDCAS). The source code of the implementation is available for download from http://www.picb.ac.cn/patterns/Software/topaln.
Investigation of kinematic features for dismount detection and tracking
NASA Astrophysics Data System (ADS)
Narayanaswami, Ranga; Tyurina, Anastasia; Diel, David; Mehra, Raman K.; Chinn, Janice M.
2012-05-01
With recent changes in threats and methods of warfighting and the use of unmanned aircrafts, ISR (Intelligence, Surveillance and Reconnaissance) activities have become critical to the military's efforts to maintain situational awareness and neutralize the enemy's activities. The identification and tracking of dismounts from surveillance video is an important step in this direction. Our approach combines advanced ultra fast registration techniques to identify moving objects with a classification algorithm based on both static and kinematic features of the objects. Our objective was to push the acceptable resolution beyond the capability of industry standard feature extraction methods such as SIFT (Scale Invariant Feature Transform) based features and inspired by it, SURF (Speeded-Up Robust Feature). Both of these methods utilize single frame images. We exploited the temporal component of the video signal to develop kinematic features. Of particular interest were the easily distinguishable frequencies characteristic of bipedal human versus quadrupedal animal motion. We examine limits of performance, frame rates and resolution required for human, animal and vehicles discrimination. A few seconds of video signal with the acceptable frame rate allow us to lower resolution requirements for individual frames as much as by a factor of five, which translates into the corresponding increase of the acceptable standoff distance between the sensor and the object of interest.
State of the art in video system performance
NASA Technical Reports Server (NTRS)
Lewis, Michael J.
1990-01-01
The closed circuit television (CCTV) system that is onboard the Space Shuttle has the following capabilities: camera, video signal switching and routing unit (VSU); and Space Shuttle video tape recorder. However, this system is inadequate for use with many experiments that require video imaging. In order to assess the state-of-the-art in video technology and data storage systems, a survey was conducted of the High Resolution, High Frame Rate Video Technology (HHVT) products. The performance of the state-of-the-art solid state cameras and image sensors, video recording systems, data transmission devices, and data storage systems versus users' requirements are shown graphically.
2016-09-01
Characteristics of Silver Carp (Hypophthalmichthys molitrix) Using Video Analyses and Principles of Projectile Physics by Glenn R. Parsons, Ehlana Stell...2002) estimated maximum swim speeds of videotaped, captive, and free-ranging dolphins, Delphinidae, by timed sequential analyses of video frames... videos to estimate the swim speeds and leap characteristics of carp as they exit the waters’ surface. We used both direct estimates of swim speeds as
NASA Astrophysics Data System (ADS)
Aghamaleki, Javad Abbasi; Behrad, Alireza
2018-01-01
Double compression detection is a crucial stage in digital image and video forensics. However, the detection of double compressed videos is challenging when the video forger uses the same quantization matrix and synchronized group of pictures (GOP) structure during the recompression history to conceal tampering effects. A passive approach is proposed for detecting double compressed MPEG videos with the same quantization matrix and synchronized GOP structure. To devise the proposed algorithm, the effects of recompression on P frames are mathematically studied. Then, based on the obtained guidelines, a feature vector is proposed to detect double compressed frames on the GOP level. Subsequently, sparse representations of the feature vectors are used for dimensionality reduction and enrich the traces of recompression. Finally, a support vector machine classifier is employed to detect and localize double compression in temporal domain. The experimental results show that the proposed algorithm achieves the accuracy of more than 95%. In addition, the comparisons of the results of the proposed method with those of other methods reveal the efficiency of the proposed algorithm.
Spatial correlation-based side information refinement for distributed video coding
NASA Astrophysics Data System (ADS)
Taieb, Mohamed Haj; Chouinard, Jean-Yves; Wang, Demin
2013-12-01
Distributed video coding (DVC) architecture designs, based on distributed source coding principles, have benefitted from significant progresses lately, notably in terms of achievable rate-distortion performances. However, a significant performance gap still remains when compared to prediction-based video coding schemes such as H.264/AVC. This is mainly due to the non-ideal exploitation of the video sequence temporal correlation properties during the generation of side information (SI). In fact, the decoder side motion estimation provides only an approximation of the true motion. In this paper, a progressive DVC architecture is proposed, which exploits the spatial correlation of the video frames to improve the motion-compensated temporal interpolation (MCTI). Specifically, Wyner-Ziv (WZ) frames are divided into several spatially correlated groups that are then sent progressively to the receiver. SI refinement (SIR) is performed as long as these groups are being decoded, thus providing more accurate SI for the next groups. It is shown that the proposed progressive SIR method leads to significant improvements over the Discover DVC codec as well as other SIR schemes recently introduced in the literature.
Backwards compatible high dynamic range video compression
NASA Astrophysics Data System (ADS)
Dolzhenko, Vladimir; Chesnokov, Vyacheslav; Edirisinghe, Eran A.
2014-02-01
This paper presents a two layer CODEC architecture for high dynamic range video compression. The base layer contains the tone mapped video stream encoded with 8 bits per component which can be decoded using conventional equipment. The base layer content is optimized for rendering on low dynamic range displays. The enhancement layer contains the image difference, in perceptually uniform color space, between the result of inverse tone mapped base layer content and the original video stream. Prediction of the high dynamic range content reduces the redundancy in the transmitted data while still preserves highlights and out-of-gamut colors. Perceptually uniform colorspace enables using standard ratedistortion optimization algorithms. We present techniques for efficient implementation and encoding of non-uniform tone mapping operators with low overhead in terms of bitstream size and number of operations. The transform representation is based on human vision system model and suitable for global and local tone mapping operators. The compression techniques include predicting the transform parameters from previously decoded frames and from already decoded data for current frame. Different video compression techniques are compared: backwards compatible and non-backwards compatible using AVC and HEVC codecs.
Measuring zebrafish turning rate.
Mwaffo, Violet; Butail, Sachit; di Bernardo, Mario; Porfiri, Maurizio
2015-06-01
Zebrafish is becoming a popular animal model in preclinical research, and zebrafish turning rate has been proposed for the analysis of activity in several domains. The turning rate is often estimated from the trajectory of the fish centroid that is output by commercial or custom-made target tracking software run on overhead videos of fish swimming. However, the accuracy of such indirect methods with respect to the turning rate associated with changes in heading during zebrafish locomotion is largely untested. Here, we compare two indirect methods for the turning rate estimation using the centroid velocity or position data, with full shape tracking for three different video sampling rates. We use tracking data from the overhead video recorded at 60, 30, and 15 frames per second of zebrafish swimming in a shallow water tank. Statistical comparisons of absolute turning rate across methods and sampling rates indicate that, while indirect methods are indistinguishable from full shape tracking, the video sampling rate significantly influences the turning rate measurement. The results of this study can aid in the selection of the video capture frame rate, an experimental design parameter in zebrafish behavioral experiments where activity is an important measure.
Solar Technical Assistance Team Success Stories Video - Text Version |
State, Local, and Tribal Governments | NREL Solar Technical Assistance Team Success Stories Video - Text Version Solar Technical Assistance Team Success Stories Video - Text Version Below is the text version for the Solar Technical Assistance Team Success Stories video. My name is Christopher
Ross, Sandy A; Rice, Clinton; Von Behren, Kristyn; Meyer, April; Alexander, Rachel; Murfin, Scott
2015-01-01
The purpose of this study was to establish intra-rater, intra-session, and inter-rater, reliability of sagittal plane hip, knee, and ankle angles with and without reflective markers using the GAITRite walkway and single video camera between student physical therapists and an experienced physical therapist. This study included thirty-two healthy participants age 20-59, stratified by age and gender. Participants performed three successful walks with and without markers applied to anatomical landmarks. GAITRite software was used to digitize sagittal hip, knee, and ankle angles at two phases of gait: (1) initial contact; and (2) mid-stance. Intra-rater reliability was more consistent for the experienced physical therapist, regardless of joint or phase of gait. Intra-session reliability was variable, the experienced physical therapist showed moderate to high reliability (intra-class correlation coefficient (ICC) = 0.50-0.89) and the student physical therapist showed very poor to high reliability (ICC = 0.07-0.85). Inter-rater reliability was highest during mid-stance at the knee with markers (ICC = 0.86) and lowest during mid-stance at the hip without markers (ICC = 0.25). Reliability of a single camera system, especially at the knee joint shows promise. Depending on the specific type of reliability, error can be attributed to the testers (e.g. lack of digitization practice and marker placement), participants (e.g. loose fitting clothing) and camera systems (e.g. frame rate and resolution). However, until the camera technology can be upgraded to a higher frame rate and resolution, and the software can be linked to the GAITRite walkway, the clinical utility for pre/post measures is limited.
Joint Machine Learning and Game Theory for Rate Control in High Efficiency Video Coding.
Gao, Wei; Kwong, Sam; Jia, Yuheng
2017-08-25
In this paper, a joint machine learning and game theory modeling (MLGT) framework is proposed for inter frame coding tree unit (CTU) level bit allocation and rate control (RC) optimization in High Efficiency Video Coding (HEVC). First, a support vector machine (SVM) based multi-classification scheme is proposed to improve the prediction accuracy of CTU-level Rate-Distortion (R-D) model. The legacy "chicken-and-egg" dilemma in video coding is proposed to be overcome by the learning-based R-D model. Second, a mixed R-D model based cooperative bargaining game theory is proposed for bit allocation optimization, where the convexity of the mixed R-D model based utility function is proved, and Nash bargaining solution (NBS) is achieved by the proposed iterative solution search method. The minimum utility is adjusted by the reference coding distortion and frame-level Quantization parameter (QP) change. Lastly, intra frame QP and inter frame adaptive bit ratios are adjusted to make inter frames have more bit resources to maintain smooth quality and bit consumption in the bargaining game optimization. Experimental results demonstrate that the proposed MLGT based RC method can achieve much better R-D performances, quality smoothness, bit rate accuracy, buffer control results and subjective visual quality than the other state-of-the-art one-pass RC methods, and the achieved R-D performances are very close to the performance limits from the FixedQP method.
Slow Speed--Fast Motion: Time-Lapse Recordings in Physics Education
ERIC Educational Resources Information Center
Vollmer, Michael; Möllmann, Klaus-Peter
2018-01-01
Video analysis with a 30 Hz frame rate is the standard tool in physics education. The development of affordable high-speed-cameras has extended the capabilities of the tool for much smaller time scales to the 1 ms range, using frame rates of typically up to 1000 frames s[superscript -1], allowing us to study transient physics phenomena happening…
Karayiannis, Nicolaos B; Sami, Abdul; Frost, James D; Wise, Merrill S; Mizrahi, Eli M
2005-04-01
This paper presents an automated procedure developed to extract quantitative information from video recordings of neonatal seizures in the form of motor activity signals. This procedure relies on optical flow computation to select anatomical sites located on the infants' body parts. Motor activity signals are extracted by tracking selected anatomical sites during the seizure using adaptive block matching. A block of pixels is tracked throughout a sequence of frames by searching for the most similar block of pixels in subsequent frames; this search is facilitated by employing various update strategies to account for the changing appearance of the block. The proposed procedure is used to extract temporal motor activity signals from video recordings of neonatal seizures and other events not associated with seizures.
Note: Sound recovery from video using SVD-based information extraction
NASA Astrophysics Data System (ADS)
Zhang, Dashan; Guo, Jie; Lei, Xiujun; Zhu, Chang'an
2016-08-01
This note reports an efficient singular value decomposition (SVD)-based vibration extraction approach that recovers sound information in silent high-speed video. A high-speed camera of which frame rates are in the range of 2 kHz-10 kHz is applied to film the vibrating objects. Sub-images cut from video frames are transformed into column vectors and then reconstructed to a new matrix. The SVD of the new matrix produces orthonormal image bases (OIBs) and image projections onto specific OIB can be recovered as understandable acoustical signals. Standard frequencies of 256 Hz and 512 Hz tuning forks are extracted offline from their vibrating surfaces and a 3.35 s speech signal is recovered online from a piece of paper that is stimulated by sound waves within 1 min.
Video Clip of a Rover Rock-Drilling Demonstration at JPL
2013-02-20
This frame from a video clip shows moments during a demonstration of drilling into a rock at NASA JPL, Pasadena, Calif., with a test double of the Mars rover Curiosity. The drill combines hammering and rotation motions of the bit.
Li, Yixian; Qi, Lehua; Song, Yongshan; Chao, Xujiang
2017-06-01
The components of carbon/carbon (C/C) composites have significant influence on the thermal and mechanical properties, so a quantitative characterization of component is necessary to study the microstructure of C/C composites, and further to improve the macroscopic properties of C/C composites. Considering the extinction crosses of the pyrocarbon matrix have significant moving features, the polarized light microscope (PLM) video is used to characterize C/C composites quantitatively because it contains sufficiently dynamic and structure information. Then the optical flow method is introduced to compute the optical flow field between the adjacent frames, and segment the components of C/C composites from PLM image by image processing. Meanwhile the matrix with different textures is re-segmented by the length difference of motion vectors, and then the component fraction of each component and extinction angle of pyrocarbon matrix are calculated directly. Finally, the C/C composites are successfully characterized from three aspects of carbon fiber, pyrocarbon, and pores by a series of image processing operators based on PLM video, and the errors of component fractions are less than 15%. © 2017 Wiley Periodicals, Inc.
VLSI design of lossless frame recompression using multi-orientation prediction
NASA Astrophysics Data System (ADS)
Lee, Yu-Hsuan; You, Yi-Lun; Chen, Yi-Guo
2016-01-01
Pursuing an experience of high-end visual quality drives human to demand a higher display resolution and a higher frame rate. Hence, a lot of powerful coding tools are aggregated together in emerging video coding standards to improve coding efficiency. This also makes video coding standards suffer from two design challenges: heavy computation and tremendous memory bandwidth. The first issue can be properly solved by a careful hardware architecture design with advanced semiconductor processes. Nevertheless, the second one becomes a critical design bottleneck for a modern video coding system. In this article, a lossless frame recompression using multi-orientation prediction technique is proposed to overcome this bottleneck. This work is realised into a silicon chip with the technology of TSMC 0.18 µm CMOS process. Its encoding capability can reach full-HD (1920 × 1080)@48 fps. The chip power consumption is 17.31 mW@100 MHz. Core area and chip area are 0.83 × 0.83 mm2 and 1.20 × 1.20 mm2, respectively. Experiment results demonstrate that this work exhibits an outstanding performance on lossless compression ratio with a competitive hardware performance.
Joint Multi-Leaf Segmentation, Alignment, and Tracking for Fluorescence Plant Videos.
Yin, Xi; Liu, Xiaoming; Chen, Jin; Kramer, David M
2018-06-01
This paper proposes a novel framework for fluorescence plant video processing. The plant research community is interested in the leaf-level photosynthetic analysis within a plant. A prerequisite for such analysis is to segment all leaves, estimate their structures, and track them over time. We identify this as a joint multi-leaf segmentation, alignment, and tracking problem. First, leaf segmentation and alignment are applied on the last frame of a plant video to find a number of well-aligned leaf candidates. Second, leaf tracking is applied on the remaining frames with leaf candidate transformation from the previous frame. We form two optimization problems with shared terms in their objective functions for leaf alignment and tracking respectively. A quantitative evaluation framework is formulated to evaluate the performance of our algorithm with four metrics. Two models are learned to predict the alignment accuracy and detect tracking failure respectively in order to provide guidance for subsequent plant biology analysis. The limitation of our algorithm is also studied. Experimental results show the effectiveness, efficiency, and robustness of the proposed method.
Bellaïche, Yohanns; Bosveld, Floris; Graner, François; Mikula, Karol; Remesíková, Mariana; Smísek, Michal
2011-01-01
In this paper, we present a novel algorithm for tracking cells in time lapse confocal microscopy movie of a Drosophila epithelial tissue during pupal morphogenesis. We consider a 2D + time video as a 3D static image, where frames are stacked atop each other, and using a spatio-temporal segmentation algorithm we obtain information about spatio-temporal 3D tubes representing evolutions of cells. The main idea for tracking is the usage of two distance functions--first one from the cells in the initial frame and second one from segmented boundaries. We track the cells backwards in time. The first distance function attracts the subsequently constructed cell trajectories to the cells in the initial frame and the second one forces them to be close to centerlines of the segmented tubular structures. This makes our tracking algorithm robust against noise and missing spatio-temporal boundaries. This approach can be generalized to a 3D + time video analysis, where spatio-temporal tubes are 4D objects.
Frame prediction using recurrent convolutional encoder with residual learning
NASA Astrophysics Data System (ADS)
Yue, Boxuan; Liang, Jun
2018-05-01
The prediction for the frame of a video is difficult but in urgent need in auto-driving. Conventional methods can only predict some abstract trends of the region of interest. The boom of deep learning makes the prediction for frames possible. In this paper, we propose a novel recurrent convolutional encoder and DE convolutional decoder structure to predict frames. We introduce the residual learning in the convolution encoder structure to solve the gradient issues. The residual learning can transform the gradient back propagation to an identity mapping. It can reserve the whole gradient information and overcome the gradient issues in Recurrent Neural Networks (RNN) and Convolutional Neural Networks (CNN). Besides, compared with the branches in CNNs and the gated structures in RNNs, the residual learning can save the training time significantly. In the experiments, we use UCF101 dataset to train our networks, the predictions are compared with some state-of-the-art methods. The results show that our networks can predict frames fast and efficiently. Furthermore, our networks are used for the driving video to verify the practicability.
Video image position determination
Christensen, Wynn; Anderson, Forrest L.; Kortegaard, Birchard L.
1991-01-01
An optical beam position controller in which a video camera captures an image of the beam in its video frames, and conveys those images to a processing board which calculates the centroid coordinates for the image. The image coordinates are used by motor controllers and stepper motors to position the beam in a predetermined alignment. In one embodiment, system noise, used in conjunction with Bernoulli trials, yields higher resolution centroid coordinates.
Different strategies for spatial updating in yaw and pitch path integration
Goeke, Caspar M.; König, Peter; Gramann, Klaus
2013-01-01
Research in spatial navigation revealed the existence of discrete strategies defined by the use of distinct reference frames during virtual path integration. The present study investigated the distribution of these navigation strategies as a function of gender, video gaming experience, and self-estimates of spatial navigation abilities in a population of 300 subjects. Participants watched videos of virtual passages through a star-field with one turn in either the horizontal (yaw) or the vertical (pitch) axis. At the end of a passage they selected one out of four homing arrows to indicate the initial starting location. To solve the task, participants could employ two discrete strategies, navigating within either an egocentric or an allocentric reference frame. The majority of valid subjects (232/260) consistently used the same strategy in more than 75% of all trials. With that approach 33.1% of all participants were classified as Turners (using an egocentric reference frame on both axes) and 46.5% as Non-turners (using an allocentric reference frame on both axes). 9.2% of all participants consistently used an egocentric reference frame in the yaw plane but an allocentric reference frame in the pitch plane (Switcher). Investigating the influence of gender on navigation strategies revealed that females predominantly used the Non-turner strategy while males used both the Turner and the Non-turner strategy with comparable probabilities. Other than expected, video gaming experience did not influence strategy use. Based on a strong quantitative basis with the sample size about an order of magnitude larger than in typical psychophysical studies these results demonstrate that most people reliably use one out of three possible navigation strategies (Turners, Non-turners, Switchers) for spatial updating and provides a sound estimate of how those strategies are distributed within the general population. PMID:23412683
Depth Extraction from Videos Using Geometric Context and Occlusion Boundaries (Open Access)
2014-09-05
RAZA ET AL .: DEPTH EXTRACTION FROM VIDEOS 1 Depth Extraction from Videos Using Geometric Context and Occlusion Boundaries S. Hussain Raza1...electronic forms. ar X iv :1 51 0. 07 31 7v 1 [ cs .C V ] 2 5 O ct 2 01 5 2 RAZA ET AL .: DEPTH EXTRACTION FROM VIDEOS Frame Ground Truth Depth...temporal segmentation using the method proposed by Grundmann et al . [4]. estimation and triangulation to estimate depth maps [17, 27](see Figure 1). In
Synthesis of Speaker Facial Movement to Match Selected Speech Sequences
NASA Technical Reports Server (NTRS)
Scott, K. C.; Kagels, D. S.; Watson, S. H.; Rom, H.; Wright, J. R.; Lee, M.; Hussey, K. J.
1994-01-01
A system is described which allows for the synthesis of a video sequence of a realistic-appearing talking human head. A phonic based approach is used to describe facial motion; image processing rather than physical modeling techniques are used to create video frames.
Video quality assesment using M-SVD
NASA Astrophysics Data System (ADS)
Tao, Peining; Eskicioglu, Ahmet M.
2007-01-01
Objective video quality measurement is a challenging problem in a variety of video processing application ranging from lossy compression to printing. An ideal video quality measure should be able to mimic the human observer. We present a new video quality measure, M-SVD, to evaluate distorted video sequences based on singular value decomposition. A computationally efficient approach is developed for full-reference (FR) video quality assessment. This measure is tested on the Video Quality Experts Group (VQEG) phase I FR-TV test data set. Our experiments show the graphical measure displays the amount of distortion as well as the distribution of error in all frames of the video sequence while the numerical measure has a good correlation with perceived video quality outperforms PSNR and other objective measures by a clear margin.
Modeling of video compression effects on target acquisition performance
NASA Astrophysics Data System (ADS)
Cha, Jae H.; Preece, Bradley; Espinola, Richard L.
2009-05-01
The effect of video compression on image quality was investigated from the perspective of target acquisition performance modeling. Human perception tests were conducted recently at the U.S. Army RDECOM CERDEC NVESD, measuring identification (ID) performance on simulated military vehicle targets at various ranges. These videos were compressed with different quality and/or quantization levels utilizing motion JPEG, motion JPEG2000, and MPEG-4 encoding. To model the degradation on task performance, the loss in image quality is fit to an equivalent Gaussian MTF scaled by the Structural Similarity Image Metric (SSIM). Residual compression artifacts are treated as 3-D spatio-temporal noise. This 3-D noise is found by taking the difference of the uncompressed frame, with the estimated equivalent blur applied, and the corresponding compressed frame. Results show good agreement between the experimental data and the model prediction. This method has led to a predictive performance model for video compression by correlating various compression levels to particular blur and noise input parameters for NVESD target acquisition performance model suite.
NASA Astrophysics Data System (ADS)
Boon, Choong S.; Guleryuz, Onur G.; Kawahara, Toshiro; Suzuki, Yoshinori
2006-08-01
We consider the mobile service scenario where video programming is broadcast to low-resolution wireless terminals. In such a scenario, broadcasters utilize simultaneous data services and bi-directional communications capabilities of the terminals in order to offer substantially enriched viewing experiences to users by allowing user participation and user tuned content. While users immediately benefit from this service when using their phones in mobile environments, the service is less appealing in stationary environments where a regular television provides competing programming at much higher display resolutions. We propose a fast super-resolution technique that allows the mobile terminals to show a much enhanced version of the broadcast video on nearby high-resolution devices, extending the appeal and usefulness of the broadcast service. The proposed single frame super-resolution algorithm uses recent sparse recovery results to provide high quality and high-resolution video reconstructions based solely on individual decoded frames provided by the low-resolution broadcast.
Architecture design of motion estimation for ITU-T H.263
NASA Astrophysics Data System (ADS)
Ku, Chung-Wei; Lin, Gong-Sheng; Chen, Liang-Gee; Lee, Yung-Ping
1997-01-01
Digitalized video and audio system has become the trend of the progress in multimedia, because it provides great performance in quality and feasibility of processing. However, as the huge amount of information is needed while the bandwidth is limitted, data compression plays an important role in the system. Say, for a 176 x 144 monochromic sequence with 10 frames/sec frame rate, the bandwidth is about 2Mbps. This wastes much channel resource and limits the applications. MPEG (moving picttre ezpert groip) standardizes the video codec scheme, and it performs high compression ratio while providing good quality. MPEG-i is used for the frame size about 352 x 240 and 30 frames per second, and MPEG-2 provides scalibility and can be applied on scenes with higher definition, say HDTV (high definition television). On the other hand, some applications concerns the very low bit-rate, such as videophone and video-conferencing. Because the channel bandwidth is much limitted in telephone network, a very high compression ratio must be required. ITU-T announced the H.263 video coding standards to meet the above requirements.8 According to the simulation results of TMN-5,22 it outperforms 11.263 with little overhead of complexity. Since wireless communication is the trend in the near future, low power design of the video codec is an important issue for portable visual telephone. Motion estimation is the most computation consuming parts in the whole video codec. About 60% of the computation is spent on this parts for the encoder. Several architectures were proposed for efficient processing of block matching algorithms. In this paper, in order to meet the requirements of 11.263 and the expectation of low power consumption, a modified sandwich architecture in21 is proposed. Based on the parallel processing philosophy, low power is expected and the generation of either one motion vector or four motion vectors with half-pixel accuracy is achieved concurrently. In addition, we will present our solution how to solve the other addition modes in 11.263 with the proposed architecture.
Nutation of Helianthus Annuus in a microgravity environment
NASA Technical Reports Server (NTRS)
Brown, A. H.
1981-01-01
An experiment to gather evidence to decide between the Darwinian concept of endogenously motivated nutation and the more mechanistic concept of gravity dependent nutation is described. If nutation persists in weightlessness, parameters describing the motion will be measured by recording in time lapse mode the video images of a population of seedlings that were grown at 1-g, but which will be observed at virtual zero gravity. Later, the plant images will be displayed on a video monitor in a laboratory, photographed on 16 millimeter film, and analyzed frame by frame to determine the kinetics of nutation for each specimen tested.
Players and thinkers and learners
NASA Astrophysics Data System (ADS)
Frye, Jonathan
2012-12-01
The stronghold that games have on our society has made it imperative that educators understand the impact that video games can have. Owens (2012) presented two frames for how the press discussed the popular game Spore, which incorporates elements of science topics. One frame suggested that the game teaches children about intelligent design, while the other implied the game merely made students excited about science topics. While this debate is nothing new, having foundations in several theoretical perspectives; educators must identify their own perceptions of video games and how even commercial games can be used as tools for teaching.
Holo-Chidi video concentrator card
NASA Astrophysics Data System (ADS)
Nwodoh, Thomas A.; Prabhakar, Aditya; Benton, Stephen A.
2001-12-01
The Holo-Chidi Video Concentrator Card is a frame buffer for the Holo-Chidi holographic video processing system. Holo- Chidi is designed at the MIT Media Laboratory for real-time computation of computer generated holograms and the subsequent display of the holograms at video frame rates. The Holo-Chidi system is made of two sets of cards - the set of Processor cards and the set of Video Concentrator Cards (VCCs). The Processor cards are used for hologram computation, data archival/retrieval from a host system, and for higher-level control of the VCCs. The VCC formats computed holographic data from multiple hologram computing Processor cards, converting the digital data to analog form to feed the acousto-optic-modulators of the Media lab's Mark-II holographic display system. The Video Concentrator card is made of: a High-Speed I/O (HSIO) interface whence data is transferred from the hologram computing Processor cards, a set of FIFOs and video RAM used as buffer for data for the hololines being displayed, a one-chip integrated microprocessor and peripheral combination that handles communication with other VCCs and furnishes the card with a USB port, a co-processor which controls display data formatting, and D-to-A converters that convert digital fringes to analog form. The co-processor is implemented with an SRAM-based FPGA with over 500,000 gates and controls all the signals needed to format the data from the multiple Processor cards into the format required by Mark-II. A VCC has three HSIO ports through which up to 500 Megabytes of computed holographic data can flow from the Processor Cards to the VCC per second. A Holo-Chidi system with three VCCs has enough frame buffering capacity to hold up to thirty two 36Megabyte hologram frames at a time. Pre-computed holograms may also be loaded into the VCC from a host computer through the low- speed USB port. Both the microprocessor and the co- processor in the VCC can access the main system memory used to store control programs and data for the VCC. The Card also generates the control signals used by the scanning mirrors of Mark-II. In this paper we discuss the design of the VCC and its implementation in the Holo-Chidi system.
Reliability of a Qualitative Video Analysis for Running.
Pipkin, Andrew; Kotecki, Kristy; Hetzel, Scott; Heiderscheit, Bryan
2016-07-01
Study Design Reliability study. Background Video analysis of running gait is frequently performed in orthopaedic and sports medicine practices to assess biomechanical factors that may contribute to injury. However, the reliability of a whole-body assessment has not been determined. Objective To determine the intrarater and interrater reliability of the qualitative assessment of specific running kinematics from a 2-dimensional video. Methods Running-gait analysis was performed on videos recorded from 15 individuals (8 male, 7 female) running at a self-selected pace (3.17 ± 0.40 m/s, 8:28 ± 1:04 min/mi) using a high-speed camera (120 frames per second). These videos were independently rated on 2 occasions by 3 experienced physical therapists using a standardized qualitative assessment. Fifteen sagittal and frontal plane kinematic variables were rated on a 3- or 5-point categorical scale at specific events of the gait cycle, including initial contact (n = 3) and midstance (n = 9), or across the full gait cycle (n = 3). The video frame number corresponding to each gait event was also recorded. Intrarater and interrater reliability values were calculated for gait-event detection (intraclass correlation coefficient [ICC] and standard error of measurement [SEM]) and the individual kinematic variables (weighted kappa [κw]). Results Gait-event detection was highly reproducible within raters (ICC = 0.94-1.00; SEM, 0.3-1.0 frames) and between raters (ICC = 0.77-1.00; SEM, 0.4-1.9 frames). Eleven of the 15 kinematic variables demonstrated substantial (κw = 0.60-0.799) or excellent (κw>0.80) intrarater agreement, with the exception of foot-to-center-of-mass position (κw = 0.59), forefoot position (κw = 0.58), ankle dorsiflexion at midstance (κw = 0.49), and center-of-mass vertical excursion (κw = 0.36). Interrater agreement for the kinematic measures varied more widely (κw = 0.00-0.85), with 5 variables showing substantial or excellent reliability. Conclusion The qualitative assessment of specific kinematic measures during running can be reliably performed with the use of a high-speed video camera. Detection of specific gait events was highly reproducible, as were common kinematic variables such as rearfoot position, foot-strike pattern, tibial inclination angle, knee flexion angle, and forward trunk lean. Other variables should be used with caution. J Orthop Sports Phys Ther 2016;46(7):556-561. Epub 6 Jun 2016. doi:10.2519/jospt.2016.6280.
The effect of visualizing the flow of multimedia content among and inside devices.
Lee, Dong-Seok
2009-05-01
This study introduces a user interface, referred to as the flow interface, which provides a graphical representation of the movement of content among and inside audio/video devices. The proposed interface provides a different frame of reference with content-oriented visualization of the generation, manipulation, storage, and display of content as well as input and output. The flow interface was applied to a VCR/DVD recorder combo, one of the most complicated consumer products. A between-group experiment was performed to determine whether the flow interface helps users to perform various tasks and to examine the learning effect of the flow interface, particularly in regard to hooking up and recording tasks. The results showed that participants with access to the flow interface performed better in terms of success rate and elapsed time. In addition, the participants indicated that they could easily understand the flow interface. The potential of the flow interface for application to other audio video devices, and design issues requiring further consideration, are discussed.
Shojaedini, Seyed Vahab; Heydari, Masoud
2014-10-01
Shape and movement features of sperms are important parameters for infertility study and treatment. In this article, a new method is introduced for characterization sperms in microscopic videos. In this method, first a hypothesis framework is defined to distinguish sperms from other particles in captured video. Then decision about each hypothesis is done in following steps: Selecting some primary regions as candidates for sperms by watershed-based segmentation, pruning of some false candidates during successive frames using graph theory concept and finally confirming correct sperms by using their movement trajectories. Performance of the proposed method is evaluated on real captured images belongs to semen with high density of sperms. The obtained results show the proposed method may detect 97% of sperms in presence of 5% false detections and track 91% of moving sperms. Furthermore, it can be shown that better characterization of sperms in proposed algorithm doesn't lead to extracting more false sperms compared to some present approaches.
An ultrahigh-speed color video camera operating at 1,000,000 fps with 288 frame memories
NASA Astrophysics Data System (ADS)
Kitamura, K.; Arai, T.; Yonai, J.; Hayashida, T.; Kurita, T.; Maruyama, H.; Namiki, J.; Yanagi, T.; Yoshida, T.; van Kuijk, H.; Bosiers, Jan T.; Saita, A.; Kanayama, S.; Hatade, K.; Kitagawa, S.; Etoh, T. Goji
2008-11-01
We developed an ultrahigh-speed color video camera that operates at 1,000,000 fps (frames per second) and had capacity to store 288 frame memories. In 2005, we developed an ultrahigh-speed, high-sensitivity portable color camera with a 300,000-pixel single CCD (ISIS-V4: In-situ Storage Image Sensor, Version 4). Its ultrahigh-speed shooting capability of 1,000,000 fps was made possible by directly connecting CCD storages, which record video images, to the photodiodes of individual pixels. The number of consecutive frames was 144. However, longer capture times were demanded when the camera was used during imaging experiments and for some television programs. To increase ultrahigh-speed capture times, we used a beam splitter and two ultrahigh-speed 300,000-pixel CCDs. The beam splitter was placed behind the pick up lens. One CCD was located at each of the two outputs of the beam splitter. The CCD driving unit was developed to separately drive two CCDs, and the recording period of the two CCDs was sequentially switched. This increased the recording capacity to 288 images, an increase of a factor of two over that of conventional ultrahigh-speed camera. A problem with the camera was that the incident light on each CCD was reduced by a factor of two by using the beam splitter. To improve the light sensitivity, we developed a microlens array for use with the ultrahigh-speed CCDs. We simulated the operation of the microlens array in order to optimize its shape and then fabricated it using stamping technology. Using this microlens increased the light sensitivity of the CCDs by an approximate factor of two. By using a beam splitter in conjunction with the microlens array, it was possible to make an ultrahigh-speed color video camera that has 288 frame memories but without decreasing the camera's light sensitivity.
NASA Astrophysics Data System (ADS)
Ozeki, Yasuyuki; Otsuka, Yoichi; Sato, Shuya; Hashimoto, Hiroyuki; Umemura, Wataru; Sumimura, Kazuhiko; Nishizawa, Norihiko; Fukui, Kiichi; Itoh, Kazuyoshi
2013-02-01
We have developed a video-rate stimulated Raman scattering (SRS) microscope with frame-by-frame wavenumber tunability. The system uses a 76-MHz picosecond Ti:sapphire laser and a subharmonically synchronized, 38-MHz Yb fiber laser. The Yb fiber laser pulses are spectrally sliced by a fast wavelength-tunable filter, which consists of a galvanometer scanner, a 4-f optical system and a reflective grating. The spectral resolution of the filter is ~ 3 cm-1. The wavenumber was scanned from 2800 to 3100 cm-1 with an arbitrary waveform synchronized to the frame trigger. For imaging, we introduced a 8-kHz resonant scanner and a galvanometer scanner. We were able to acquire SRS images of 500 x 480 pixels at a frame rate of 30.8 frames/s. Then these images were processed by principal component analysis followed by a modified algorithm of independent component analysis. This algorithm allows blind separation of constituents with overlapping Raman bands from SRS spectral images. The independent component (IC) spectra give spectroscopic information, and IC images can be used to produce pseudo-color images. We demonstrate various label-free imaging modalities such as 2D spectral imaging of the rat liver, two-color 3D imaging of a vessel in the rat liver, and spectral imaging of several sections of intestinal villi in the mouse. Various structures in the tissues such as lipid droplets, cytoplasm, fibrous texture, nucleus, and water-rich region were successfully visualized.
Real-time motion-based H.263+ frame rate control
NASA Astrophysics Data System (ADS)
Song, Hwangjun; Kim, JongWon; Kuo, C.-C. Jay
1998-12-01
Most existing H.263+ rate control algorithms, e.g. the one adopted in the test model of the near-term (TMN8), focus on the macroblock layer rate control and low latency under the assumptions of with a constant frame rate and through a constant bit rate (CBR) channel. These algorithms do not accommodate the transmission bandwidth fluctuation efficiently, and the resulting video quality can be degraded. In this work, we propose a new H.263+ rate control scheme which supports the variable bit rate (VBR) channel through the adjustment of the encoding frame rate and quantization parameter. A fast algorithm for the encoding frame rate control based on the inherent motion information within a sliding window in the underlying video is developed to efficiently pursue a good tradeoff between spatial and temporal quality. The proposed rate control algorithm also takes the time-varying bandwidth characteristic of the Internet into account and is able to accommodate the change accordingly. Experimental results are provided to demonstrate the superior performance of the proposed scheme.
Akkas, Oguz; Lee, Cheng Hsien; Hu, Yu Hen; Harris Adamson, Carisa; Rempel, David; Radwin, Robert G
2017-12-01
Two computer vision algorithms were developed to automatically estimate exertion time, duty cycle (DC) and hand activity level (HAL) from videos of workers performing 50 industrial tasks. The average DC difference between manual frame-by-frame analysis and the computer vision DC was -5.8% for the Decision Tree (DT) algorithm, and 1.4% for the Feature Vector Training (FVT) algorithm. The average HAL difference was 0.5 for the DT algorithm and 0.3 for the FVT algorithm. A sensitivity analysis, conducted to examine the influence that deviations in DC have on HAL, found it remained unaffected when DC error was less than 5%. Thus, a DC error less than 10% will impact HAL less than 0.5 HAL, which is negligible. Automatic computer vision HAL estimates were therefore comparable to manual frame-by-frame estimates. Practitioner Summary: Computer vision was used to automatically estimate exertion time, duty cycle and hand activity level from videos of workers performing industrial tasks.
NASA Astrophysics Data System (ADS)
Jackson, Christopher Robert
"Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm selects sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient sequential processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors. This thesis describes two hardware implementations of the LRF algorithm to achieve real-time image processing. The first was created with a VIRTEX-7 field programmable gate array (FPGA). The other developed using the graphics processing unit (GPU) of a NVIDIA GeForce GTX 690 video card. The novelty in the FPGA approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link video output. We also describe a custom hardware simulation environment we have built to test the FPGA LRF implementation. The advantage of the GPU approach is significantly improved development time, integration of image stabilization into the system, and comparable atmospheric turbulence mitigation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ebe, Kazuyu, E-mail: nrr24490@nifty.com; Tokuyama, Katsuichi; Baba, Ryuta
Purpose: To develop and evaluate a new video image-based QA system, including in-house software, that can display a tracking state visually and quantify the positional accuracy of dynamic tumor tracking irradiation in the Vero4DRT system. Methods: Sixteen trajectories in six patients with pulmonary cancer were obtained with the ExacTrac in the Vero4DRT system. Motion data in the cranio–caudal direction (Y direction) were used as the input for a programmable motion table (Quasar). A target phantom was placed on the motion table, which was placed on the 2D ionization chamber array (MatriXX). Then, the 4D modeling procedure was performed on themore » target phantom during a reproduction of the patient’s tumor motion. A substitute target with the patient’s tumor motion was irradiated with 6-MV x-rays under the surrogate infrared system. The 2D dose images obtained from the MatriXX (33 frames/s; 40 s) were exported to in-house video-image analyzing software. The absolute differences in the Y direction between the center of the exposed target and the center of the exposed field were calculated. Positional errors were observed. The authors’ QA results were compared to 4D modeling function errors and gimbal motion errors obtained from log analyses in the ExacTrac to verify the accuracy of their QA system. The patients’ tumor motions were evaluated in the wave forms, and the peak-to-peak distances were also measured to verify their reproducibility. Results: Thirteen of sixteen trajectories (81.3%) were successfully reproduced with Quasar. The peak-to-peak distances ranged from 2.7 to 29.0 mm. Three trajectories (18.7%) were not successfully reproduced due to the limited motions of the Quasar. Thus, 13 of 16 trajectories were summarized. The mean number of video images used for analysis was 1156. The positional errors (absolute mean difference + 2 standard deviation) ranged from 0.54 to 1.55 mm. The error values differed by less than 1 mm from 4D modeling function errors and gimbal motion errors in the ExacTrac log analyses (n = 13). Conclusions: The newly developed video image-based QA system, including in-house software, can analyze more than a thousand images (33 frames/s). Positional errors are approximately equivalent to those in ExacTrac log analyses. This system is useful for the visual illustration of the progress of the tracking state and for the quantification of positional accuracy during dynamic tumor tracking irradiation in the Vero4DRT system.« less
Construction of a multimodal CT-video chest model
NASA Astrophysics Data System (ADS)
Byrnes, Patrick D.; Higgins, William E.
2014-03-01
Bronchoscopy enables a number of minimally invasive chest procedures for diseases such as lung cancer and asthma. For example, using the bronchoscope's continuous video stream as a guide, a physician can navigate through the lung airways to examine general airway health, collect tissue samples, or administer a disease treatment. In addition, physicians can now use new image-guided intervention (IGI) systems, which draw upon both three-dimensional (3D) multi-detector computed tomography (MDCT) chest scans and bronchoscopic video, to assist with bronchoscope navigation. Unfortunately, little use is made of the acquired video stream, a potentially invaluable source of information. In addition, little effort has been made to link the bronchoscopic video stream to the detailed anatomical information given by a patient's 3D MDCT chest scan. We propose a method for constructing a multimodal CT-video model of the chest. After automatically computing a patient's 3D MDCT-based airway-tree model, the method next parses the available video data to generate a positional linkage between a sparse set of key video frames and airway path locations. Next, a fusion/mapping of the video's color mucosal information and MDCT-based endoluminal surfaces is performed. This results in the final multimodal CT-video chest model. The data structure constituting the model provides a history of those airway locations visited during bronchoscopy. It also provides for quick visual access to relevant sections of the airway wall by condensing large portions of endoscopic video into representative frames containing important structural and textural information. When examined with a set of interactive visualization tools, the resulting fused data structure provides a rich multimodal data source. We demonstrate the potential of the multimodal model with both phantom and human data.
Modeling laser-plasma acceleration in the laboratory frame
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
2011-01-01
A simulation of laser-plasma acceleration in the laboratory frame. Both the laser and the wakefield buckets must be resolved over the entire domain of the plasma, requiring many cells and many time steps. While researchers often use a simulation window that moves with the pulse, this reduces only the multitude of cells, not the multitude of time steps. For an artistic impression of how to solve the simulation by using the boosted-frame method, watch the video "Modeling laser-plasma acceleration in the wakefield frame".
ETHOWATCHER: validation of a tool for behavioral and video-tracking analysis in laboratory animals.
Crispim Junior, Carlos Fernando; Pederiva, Cesar Nonato; Bose, Ricardo Chessini; Garcia, Vitor Augusto; Lino-de-Oliveira, Cilene; Marino-Neto, José
2012-02-01
We present a software (ETHOWATCHER(®)) developed to support ethography, object tracking and extraction of kinematic variables from digital video files of laboratory animals. The tracking module allows controlled segmentation of the target from the background, extracting image attributes used to calculate the distance traveled, orientation, length, area and a path graph of the experimental animal. The ethography module allows recording of catalog-based behaviors from environment or from video files continuously or frame-by-frame. The output reports duration, frequency and latency of each behavior and the sequence of events in a time-segmented format, set by the user. Validation tests were conducted on kinematic measurements and on the detection of known behavioral effects of drugs. This software is freely available at www.ethowatcher.ufsc.br. Copyright © 2011 Elsevier Ltd. All rights reserved.
Constructing storyboards based on hierarchical clustering analysis
NASA Astrophysics Data System (ADS)
Hasebe, Satoshi; Sami, Mustafa M.; Muramatsu, Shogo; Kikuchi, Hisakazu
2005-07-01
There are growing needs for quick preview of video contents for the purpose of improving accessibility of video archives as well as reducing network traffics. In this paper, a storyboard that contains a user-specified number of keyframes is produced from a given video sequence. It is based on hierarchical cluster analysis of feature vectors that are derived from wavelet coefficients of video frames. Consistent use of extracted feature vectors is the key to avoid a repetition of computationally-intensive parsing of the same video sequence. Experimental results suggest that a significant reduction in computational time is gained by this strategy.
A data set for evaluating the performance of multi-class multi-object video tracking
NASA Astrophysics Data System (ADS)
Chakraborty, Avishek; Stamatescu, Victor; Wong, Sebastien C.; Wigley, Grant; Kearney, David
2017-05-01
One of the challenges in evaluating multi-object video detection, tracking and classification systems is having publically available data sets with which to compare different systems. However, the measures of performance for tracking and classification are different. Data sets that are suitable for evaluating tracking systems may not be appropriate for classification. Tracking video data sets typically only have ground truth track IDs, while classification video data sets only have ground truth class-label IDs. The former identifies the same object over multiple frames, while the latter identifies the type of object in individual frames. This paper describes an advancement of the ground truth meta-data for the DARPA Neovision2 Tower data set to allow both the evaluation of tracking and classification. The ground truth data sets presented in this paper contain unique object IDs across 5 different classes of object (Car, Bus, Truck, Person, Cyclist) for 24 videos of 871 image frames each. In addition to the object IDs and class labels, the ground truth data also contains the original bounding box coordinates together with new bounding boxes in instances where un-annotated objects were present. The unique IDs are maintained during occlusions between multiple objects or when objects re-enter the field of view. This will provide: a solid foundation for evaluating the performance of multi-object tracking of different types of objects, a straightforward comparison of tracking system performance using the standard Multi Object Tracking (MOT) framework, and classification performance using the Neovision2 metrics. These data have been hosted publically.
Coiled coil interactions for the targeting of liposomes for nucleic acid delivery
NASA Astrophysics Data System (ADS)
Oude Blenke, Erik E.; van den Dikkenberg, Joep; van Kolck, Bartjan; Kros, Alexander; Mastrobattista, Enrico
2016-04-01
Coiled coil interactions are strong protein-protein interactions that are involved in many biological processes, including intracellular trafficking and membrane fusion. A synthetic heterodimeric coiled-coil forming peptide pair, known as E3 (EIAALEK)3 and K3 (KIAALKE)3 was used to functionalize liposomes encapsulating a splice correcting oligonucleotide or siRNA. These peptide-functionalized vesicles are highly stable in solution but start to cluster when vesicles modified with complementary peptides are mixed together, demonstrating that the peptides quickly coil and crosslink the vesicles. When one of the peptides was anchored to the cell membrane using a hydrophobic cholesterol anchor, vesicles functionalized with the complementary peptide could be docked to these cells, whereas non-functionalized cells did not show any vesicle tethering. Although the anchored peptides do not have a downstream signaling pathway, microscopy pictures revealed that after four hours, the majority of the docked vesicles were internalized by endocytosis. Finally, for the first time, it was shown that the coiled coil assembly at the interface between the vesicles and the cell membrane induces active uptake and leads to cytosolic delivery of the nucleic acid cargo. Both the siRNA and the splice correcting oligonucleotide were functionally delivered, resulting respectively in the silencing or recovery of luciferase expression in the appropriate cell lines. These results demonstrate that the docking to the cell by coiled coil interaction can induce active uptake and achieve the successful intracellular delivery of otherwise membrane impermeable nucleic acids in a highly specific manner.Coiled coil interactions are strong protein-protein interactions that are involved in many biological processes, including intracellular trafficking and membrane fusion. A synthetic heterodimeric coiled-coil forming peptide pair, known as E3 (EIAALEK)3 and K3 (KIAALKE)3 was used to functionalize liposomes encapsulating a splice correcting oligonucleotide or siRNA. These peptide-functionalized vesicles are highly stable in solution but start to cluster when vesicles modified with complementary peptides are mixed together, demonstrating that the peptides quickly coil and crosslink the vesicles. When one of the peptides was anchored to the cell membrane using a hydrophobic cholesterol anchor, vesicles functionalized with the complementary peptide could be docked to these cells, whereas non-functionalized cells did not show any vesicle tethering. Although the anchored peptides do not have a downstream signaling pathway, microscopy pictures revealed that after four hours, the majority of the docked vesicles were internalized by endocytosis. Finally, for the first time, it was shown that the coiled coil assembly at the interface between the vesicles and the cell membrane induces active uptake and leads to cytosolic delivery of the nucleic acid cargo. Both the siRNA and the splice correcting oligonucleotide were functionally delivered, resulting respectively in the silencing or recovery of luciferase expression in the appropriate cell lines. These results demonstrate that the docking to the cell by coiled coil interaction can induce active uptake and achieve the successful intracellular delivery of otherwise membrane impermeable nucleic acids in a highly specific manner. Electronic supplementary information (ESI) available: Two videos of the experiment are shown in Fig. 5, demonstrating the distinctive characteristics of the peptide pair in a mixed population of cells are available in online. Video S1 shows the experiment in the bright field channel including the green channel (calcein-AM stained unfunctionalized cells) and orange channel (rhodamine labeled liposomes). Video S2 shows the exact same frames but combining the fluorescent channels only, including the blue channel for Hoechst nuclear staining. Both videos consist of 31 frames at a frame rate of 5 fps. The labeled liposomes are injected after frame 1. The videos span a total timeframe of 15 minutes. See DOI: 10.1039/c6nr00711b
Selective encryption for H.264/AVC video coding
NASA Astrophysics Data System (ADS)
Shi, Tuo; King, Brian; Salama, Paul
2006-02-01
Due to the ease with which digital data can be manipulated and due to the ongoing advancements that have brought us closer to pervasive computing, the secure delivery of video and images has become a challenging problem. Despite the advantages and opportunities that digital video provide, illegal copying and distribution as well as plagiarism of digital audio, images, and video is still ongoing. In this paper we describe two techniques for securing H.264 coded video streams. The first technique, SEH264Algorithm1, groups the data into the following blocks of data: (1) a block that contains the sequence parameter set and the picture parameter set, (2) a block containing a compressed intra coded frame, (3) a block containing the slice header of a P slice, all the headers of the macroblock within the same P slice, and all the luma and chroma DC coefficients belonging to the all the macroblocks within the same slice, (4) a block containing all the ac coefficients, and (5) a block containing all the motion vectors. The first three are encrypted whereas the last two are not. The second method, SEH264Algorithm2, relies on the use of multiple slices per coded frame. The algorithm searches the compressed video sequence for start codes (0x000001) and then encrypts the next N bits of data.
An improved multi-paths optimization method for video stabilization
NASA Astrophysics Data System (ADS)
Qin, Tao; Zhong, Sheng
2018-03-01
For video stabilization, the difference between original camera motion path and the optimized one is proportional to the cropping ratio and warping ratio. A good optimized path should preserve the moving tendency of the original one meanwhile the cropping ratio and warping ratio of each frame should be kept in a proper range. In this paper we use an improved warping-based motion representation model, and propose a gauss-based multi-paths optimization method to get a smoothing path and obtain a stabilized video. The proposed video stabilization method consists of two parts: camera motion path estimation and path smoothing. We estimate the perspective transform of adjacent frames according to warping-based motion representation model. It works well on some challenging videos where most previous 2D methods or 3D methods fail for lacking of long features trajectories. The multi-paths optimization method can deal well with parallax, as we calculate the space-time correlation of the adjacent grid, and then a kernel of gauss is used to weigh the motion of adjacent grid. Then the multi-paths are smoothed while minimize the crop ratio and the distortion. We test our method on a large variety of consumer videos, which have casual jitter and parallax, and achieve good results.
A hardware architecture for real-time shadow removal in high-contrast video
NASA Astrophysics Data System (ADS)
Verdugo, Pablo; Pezoa, Jorge E.; Figueroa, Miguel
2017-09-01
Broadcasting an outdoor sports event at daytime is a challenging task due to the high contrast that exists between areas in the shadow and light conditions within the same scene. Commercial cameras typically do not handle the high dynamic range of such scenes in a proper manner, resulting in broadcast streams with very little shadow detail. We propose a hardware architecture for real-time shadow removal in high-resolution video, which reduces the shadow effect and simultaneously improves shadow details. The algorithm operates only on the shadow portions of each video frame, thus improving the results and producing more realistic images than algorithms that operate on the entire frame, such as simplified Retinex and histogram shifting. The architecture receives an input in the RGB color space, transforms it into the YIQ space, and uses color information from both spaces to produce a mask of the shadow areas present in the image. The mask is then filtered using a connected components algorithm to eliminate false positives and negatives. The hardware uses pixel information at the edges of the mask to estimate the illumination ratio between light and shadow in the image, which is then used to correct the shadow area. Our prototype implementation simultaneously processes up to 7 video streams of 1920×1080 pixels at 60 frames per second on a Xilinx Kintex-7 XC7K325T FPGA.
Satellite markers: a simple method for ground truth car pose on stereo video
NASA Astrophysics Data System (ADS)
Gil, Gustavo; Savino, Giovanni; Piantini, Simone; Pierini, Marco
2018-04-01
Artificial prediction of future location of other cars in the context of advanced safety systems is a must. The remote estimation of car pose and particularly its heading angle is key to predict its future location. Stereo vision systems allow to get the 3D information of a scene. Ground truth in this specific context is associated with referential information about the depth, shape and orientation of the objects present in the traffic scene. Creating 3D ground truth is a measurement and data fusion task associated with the combination of different kinds of sensors. The novelty of this paper is the method to generate ground truth car pose only from video data. When the method is applied to stereo video, it also provides the extrinsic camera parameters for each camera at frame level which are key to quantify the performance of a stereo vision system when it is moving because the system is subjected to undesired vibrations and/or leaning. We developed a video post-processing technique which employs a common camera calibration tool for the 3D ground truth generation. In our case study, we focus in accurate car heading angle estimation of a moving car under realistic imagery. As outcomes, our satellite marker method provides accurate car pose at frame level, and the instantaneous spatial orientation for each camera at frame level.
Gravel, P; Tremblay, M; Leblond, H; Rossignol, S; de Guise, J A
2010-07-15
A computer-aided method for the tracking of morphological markers in fluoroscopic images of a rat walking on a treadmill is presented and validated. The markers correspond to bone articulations in a hind leg and are used to define the hip, knee, ankle and metatarsophalangeal joints. The method allows a user to identify, using a computer mouse, about 20% of the marker positions in a video and interpolate their trajectories from frame-to-frame. This results in a seven-fold speed improvement in detecting markers. This also eliminates confusion problems due to legs crossing and blurred images. The video images are corrected for geometric distortions from the X-ray camera, wavelet denoised, to preserve the sharpness of minute bone structures, and contrast enhanced. From those images, the marker positions across video frames are extracted, corrected for rat "solid body" motions on the treadmill, and used to compute the positional and angular gait patterns. Robust Bootstrap estimates of those gait patterns and their prediction and confidence bands are finally generated. The gait patterns are invaluable tools to study the locomotion of healthy animals or the complex process of locomotion recovery in animals with injuries. The method could, in principle, be adapted to analyze the locomotion of other animals as long as a fluoroscopic imager and a treadmill are available. Copyright 2010 Elsevier B.V. All rights reserved.
SarcOptiM for ImageJ: high-frequency online sarcomere length computing on stimulated cardiomyocytes.
Pasqualin, Côme; Gannier, François; Yu, Angèle; Malécot, Claire O; Bredeloux, Pierre; Maupoil, Véronique
2016-08-01
Accurate measurement of cardiomyocyte contraction is a critical issue for scientists working on cardiac physiology and physiopathology of diseases implying contraction impairment. Cardiomyocytes contraction can be quantified by measuring sarcomere length, but few tools are available for this, and none is freely distributed. We developed a plug-in (SarcOptiM) for the ImageJ/Fiji image analysis platform developed by the National Institutes of Health. SarcOptiM computes sarcomere length via fast Fourier transform analysis of video frames captured or displayed in ImageJ and thus is not tied to a dedicated video camera. It can work in real time or offline, the latter overcoming rotating motion or displacement-related artifacts. SarcOptiM includes a simulator and video generator of cardiomyocyte contraction. Acquisition parameters, such as pixel size and camera frame rate, were tested with both experimental recordings of rat ventricular cardiomyocytes and synthetic videos. It is freely distributed, and its source code is available. It works under Windows, Mac, or Linux operating systems. The camera speed is the limiting factor, since the algorithm can compute online sarcomere shortening at frame rates >10 kHz. In conclusion, SarcOptiM is a free and validated user-friendly tool for studying cardiomyocyte contraction in all species, including human. Copyright © 2016 the American Physiological Society.
Correction of projective distortion in long-image-sequence mosaics without prior information
NASA Astrophysics Data System (ADS)
Yang, Chenhui; Mao, Hongwei; Abousleman, Glen; Si, Jennie
2010-04-01
Image mosaicking is the process of piecing together multiple video frames or still images from a moving camera to form a wide-area or panoramic view of the scene being imaged. Mosaics have widespread applications in many areas such as security surveillance, remote sensing, geographical exploration, agricultural field surveillance, virtual reality, digital video, and medical image analysis, among others. When mosaicking a large number of still images or video frames, the quality of the resulting mosaic is compromised by projective distortion. That is, during the mosaicking process, the image frames that are transformed and pasted to the mosaic become significantly scaled down and appear out of proportion with respect to the mosaic. As more frames continue to be transformed, important target information in the frames can be lost since the transformed frames become too small, which eventually leads to the inability to continue further. Some projective distortion correction techniques make use of prior information such as GPS information embedded within the image, or camera internal and external parameters. Alternatively, this paper proposes a new algorithm to reduce the projective distortion without using any prior information whatsoever. Based on the analysis of the projective distortion, we approximate the projective matrix that describes the transformation between image frames using an affine model. Using singular value decomposition, we can deduce the affine model scaling factor that is usually very close to 1. By resetting the image scale of the affine model to 1, the transformed image size remains unchanged. Even though the proposed correction introduces some error in the image matching, this error is typically acceptable and more importantly, the final mosaic preserves the original image size after transformation. We demonstrate the effectiveness of this new correction algorithm on two real-world unmanned air vehicle (UAV) sequences. The proposed method is shown to be effective and suitable for real-time implementation.
NASA Astrophysics Data System (ADS)
Seeto, Wen Jun; Lipke, Elizabeth Ann
2016-03-01
Tracking of rolling cells via in vitro experiment is now commonly performed using customized computer programs. In most cases, two critical challenges continue to limit analysis of cell rolling data: long computation times due to the complexity of tracking algorithms and difficulty in accurately correlating a given cell with itself from one frame to the next, which is typically due to errors caused by cells that either come close in proximity to each other or come in contact with each other. In this paper, we have developed a sophisticated, yet simple and highly effective, rolling cell tracking system to address these two critical problems. This optical cell tracking analysis (OCTA) system first employs ImageJ for cell identification in each frame of a cell rolling video. A custom MATLAB code was written to use the geometric and positional information of all cells as the primary parameters for matching each individual cell with itself between consecutive frames and to avoid errors when tracking cells that come within close proximity to one another. Once the cells are matched, rolling velocity can be obtained for further analysis. The use of ImageJ for cell identification eliminates the need for high level MATLAB image processing knowledge. As a result, only fundamental MATLAB syntax is necessary for cell matching. OCTA has been implemented in the tracking of endothelial colony forming cell (ECFC) rolling under shear. The processing time needed to obtain tracked cell data from a 2 min ECFC rolling video recorded at 70 frames per second with a total of over 8000 frames is less than 6 min using a computer with an Intel® Core™ i7 CPU 2.80 GHz (8 CPUs). This cell tracking system benefits cell rolling analysis by substantially reducing the time required for post-acquisition data processing of high frame rate video recordings and preventing tracking errors when individual cells come in close proximity to one another.
High-Level Event Recognition in Unconstrained Videos
2013-01-01
frames per- forms well for urban soundscapes but not for polyphonic music. In place of GMM, Lu et al. [78] adopted spectral clustering to generate...Aucouturier JJ, Defreville B, Pachet F (2007) The bag-of-frames approach to audio pattern recognition: a sufficientmodel for urban soundscapes but not
Compressive Video Acquisition, Fusion and Processing
2010-12-14
architecture that comprises an optical computer (comprising a digital micromirror device, two lenses, a single photon detector, and an analog-to-digital (A/D...of the missing video frames with reasonable accuracy. Also, the similar nature of the four curves suggests that the actual values of (Ωx,Ωh,Γ) are not
ERIC Educational Resources Information Center
Mathews, Sarah A.; Lovett, Maria K.
2017-01-01
Video participatory research (VPR) is an emergent methodology that bridges visual methods with the epistemology of participatory research. This approach is motivated by the "crisis of representation" or "reflective turn" (Gubrium & Harper, 2013) that promotes research conducted with or by participants, conceptualizing…
Geospatial Video Monitoring of Benthic Habitats Using the Shallow-Water Positioning System (SWaPS)
2007-01-01
established from the video frames collected using SWaPS. C) Cover contours for the seagrass Thalassia testudinum. A B C surveyed using a spatial grid...distributions of seagrass species within this area are clearly influenced by their tolerance to salinity patterns. Thalassia testudinum, a species
Incremental principal component pursuit for video background modeling
Rodriquez-Valderrama, Paul A.; Wohlberg, Brendt
2017-03-14
An incremental Principal Component Pursuit (PCP) algorithm for video background modeling that is able to process one frame at a time while adapting to changes in background, with a computational complexity that allows for real-time processing, having a low memory footprint and is robust to translational and rotational jitter.
Siamese convolutional networks for tracking the spine motion
NASA Astrophysics Data System (ADS)
Liu, Yuan; Sui, Xiubao; Sun, Yicheng; Liu, Chengwei; Hu, Yong
2017-09-01
Deep learning models have demonstrated great success in various computer vision tasks such as image classification and object tracking. However, tracking the lumbar spine by digitalized video fluoroscopic imaging (DVFI), which can quantitatively analyze the motion mode of spine to diagnose lumbar instability, has not yet been well developed due to the lack of steady and robust tracking method. In this paper, we propose a novel visual tracking algorithm of the lumbar vertebra motion based on a Siamese convolutional neural network (CNN) model. We train a full-convolutional neural network offline to learn generic image features. The network is trained to learn a similarity function that compares the labeled target in the first frame with the candidate patches in the current frame. The similarity function returns a high score if the two images depict the same object. Once learned, the similarity function is used to track a previously unseen object without any adapting online. In the current frame, our tracker is performed by evaluating the candidate rotated patches sampled around the previous frame target position and presents a rotated bounding box to locate the predicted target precisely. Results indicate that the proposed tracking method can detect the lumbar vertebra steadily and robustly. Especially for images with low contrast and cluttered background, the presented tracker can still achieve good tracking performance. Further, the proposed algorithm operates at high speed for real time tracking.
Achieving real-time capsule endoscopy (CE) video visualization through panoramic imaging
NASA Astrophysics Data System (ADS)
Yi, Steven; Xie, Jean; Mui, Peter; Leighton, Jonathan A.
2013-02-01
In this paper, we mainly present a novel and real-time capsule endoscopy (CE) video visualization concept based on panoramic imaging. Typical CE videos run about 8 hours and are manually reviewed by physicians to locate diseases such as bleedings and polyps. To date, there is no commercially available tool capable of providing stabilized and processed CE video that is easy to analyze in real time. The burden on physicians' disease finding efforts is thus big. In fact, since the CE camera sensor has a limited forward looking view and low image frame rate (typical 2 frames per second), and captures very close range imaging on the GI tract surface, it is no surprise that traditional visualization method based on tracking and registration often fails to work. This paper presents a novel concept for real-time CE video stabilization and display. Instead of directly working on traditional forward looking FOV (field of view) images, we work on panoramic images to bypass many problems facing traditional imaging modalities. Methods on panoramic image generation based on optical lens principle leading to real-time data visualization will be presented. In addition, non-rigid panoramic image registration methods will be discussed.
Epistemic Frames for Epistemic Games
ERIC Educational Resources Information Center
Shaffer, David W.
2006-01-01
This paper, develops the concept of "epistemic frames" as a mechanism through which students can use experiences in video games, computer games, and other interactive learning environments to help them deal more effectively with situations outside of the original context of learning. Building on ideas of "islands of expertise" [Crowley, K., &…
Players and Thinkers and Learners
ERIC Educational Resources Information Center
Frye, Jonathan
2012-01-01
The stronghold that games have on our society has made it imperative that educators understand the impact that video games can have. Owens (2012) presented two frames for how the press discussed the popular game "Spore," which incorporates elements of science topics. One frame suggested that the game teaches children about intelligent design,…
Automatic video segmentation and indexing
NASA Astrophysics Data System (ADS)
Chahir, Youssef; Chen, Liming
1999-08-01
Indexing is an important aspect of video database management. Video indexing involves the analysis of video sequences, which is a computationally intensive process. However, effective management of digital video requires robust indexing techniques. The main purpose of our proposed video segmentation is twofold. Firstly, we develop an algorithm that identifies camera shot boundary. The approach is based on the use of combination of color histograms and block-based technique. Next, each temporal segment is represented by a color reference frame which specifies the shot similarities and which is used in the constitution of scenes. Experimental results using a variety of videos selected in the corpus of the French Audiovisual National Institute are presented to demonstrate the effectiveness of performing shot detection, the content characterization of shots and the scene constitution.
A novel sub-shot segmentation method for user-generated video
NASA Astrophysics Data System (ADS)
Lei, Zhuo; Zhang, Qian; Zheng, Chi; Qiu, Guoping
2018-04-01
With the proliferation of the user-generated videos, temporal segmentation is becoming a challengeable problem. Traditional video temporal segmentation methods like shot detection are not able to work on unedited user-generated videos, since they often only contain one single long shot. We propose a novel temporal segmentation framework for user-generated video. It finds similar frames with a tree partitioning min-Hash technique, constructs sparse temporal constrained affinity sub-graphs, and finally divides the video into sub-shot-level segments with a dense-neighbor-based clustering method. Experimental results show that our approach outperforms all the other related works. Furthermore, it is indicated that the proposed approach is able to segment user-generated videos at an average human level.
High efficiency video coding for ultrasound video communication in m-health systems.
Panayides, A; Antoniou, Z; Pattichis, M S; Pattichis, C S; Constantinides, A G
2012-01-01
Emerging high efficiency video compression methods and wider availability of wireless network infrastructure will significantly advance existing m-health applications. For medical video communications, the emerging video compression and network standards support low-delay and high-resolution video transmission, at the clinically acquired resolution and frame rates. Such advances are expected to further promote the adoption of m-health systems for remote diagnosis and emergency incidents in daily clinical practice. This paper compares the performance of the emerging high efficiency video coding (HEVC) standard to the current state-of-the-art H.264/AVC standard. The experimental evaluation, based on five atherosclerotic plaque ultrasound videos encoded at QCIF, CIF, and 4CIF resolutions demonstrates that 50% reductions in bitrate requirements is possible for equivalent clinical quality.
Video Segmentation Descriptors for Event Recognition
2014-12-08
Velastin, 3D Extended Histogram of Oriented Gradients (3DHOG) for Classification of Road Users in Urban Scenes , BMVC, 2009. [3] M.-Y. Chen and A. Hauptmann...computed on 3D volume outputted by the hierarchical segmentation . Each video is described as follows. Each supertube is temporally divided in n-frame...strength of these descriptors is their adaptability to the scene variations since they are grounded on a video segmentation . This makes them naturally robust
Neil A. Clark; Sang-Mook Lee
2004-01-01
This paper demonstrates how a digital video camera with a long lens can be used with pulse laser ranging in order to collect very large-scale tree crown measurements. The long focal length of the camera lens provides the magnification required for precise viewing of distant points with the trade-off of spatial coverage. Multiple video frames are mosaicked into a single...
Large-scale machine learning and evaluation platform for real-time traffic surveillance
NASA Astrophysics Data System (ADS)
Eichel, Justin A.; Mishra, Akshaya; Miller, Nicholas; Jankovic, Nicholas; Thomas, Mohan A.; Abbott, Tyler; Swanson, Douglas; Keller, Joel
2016-09-01
In traffic engineering, vehicle detectors are trained on limited datasets, resulting in poor accuracy when deployed in real-world surveillance applications. Annotating large-scale high-quality datasets is challenging. Typically, these datasets have limited diversity; they do not reflect the real-world operating environment. There is a need for a large-scale, cloud-based positive and negative mining process and a large-scale learning and evaluation system for the application of automatic traffic measurements and classification. The proposed positive and negative mining process addresses the quality of crowd sourced ground truth data through machine learning review and human feedback mechanisms. The proposed learning and evaluation system uses a distributed cloud computing framework to handle data-scaling issues associated with large numbers of samples and a high-dimensional feature space. The system is trained using AdaBoost on 1,000,000 Haar-like features extracted from 70,000 annotated video frames. The trained real-time vehicle detector achieves an accuracy of at least 95% for 1/2 and about 78% for 19/20 of the time when tested on ˜7,500,000 video frames. At the end of 2016, the dataset is expected to have over 1 billion annotated video frames.
Algorithm for Automatic Behavior Quantification of Laboratory Mice Using High-Frame-Rate Videos
NASA Astrophysics Data System (ADS)
Nie, Yuman; Takaki, Takeshi; Ishii, Idaku; Matsuda, Hiroshi
In this paper, we propose an algorithm for automatic behavior quantification in laboratory mice to quantify several model behaviors. The algorithm can detect repetitive motions of the fore- or hind-limbs at several or dozens of hertz, which are too rapid for the naked eye, from high-frame-rate video images. Multiple repetitive motions can always be identified from periodic frame-differential image features in four segmented regions — the head, left side, right side, and tail. Even when a mouse changes its posture and orientation relative to the camera, these features can still be extracted from the shift- and orientation-invariant shape of the mouse silhouette by using the polar coordinate system and adjusting the angle coordinate according to the head and tail positions. The effectiveness of the algorithm is evaluated by analyzing long-term 240-fps videos of four laboratory mice for six typical model behaviors: moving, rearing, immobility, head grooming, left-side scratching, and right-side scratching. The time durations for the model behaviors determined by the algorithm have detection/correction ratios greater than 80% for all the model behaviors. This shows good quantification results for actual animal testing.
NASA Astrophysics Data System (ADS)
Schultz, C. J.; Lang, T. J.; Leake, S.; Runco, M.; Blakeslee, R. J.
2017-12-01
Video and still frame images from cameras aboard the International Space Station (ISS) are used to inspire, educate, and provide a unique vantage point from low-Earth orbit that is second to none; however, these cameras have overlooked capabilities for contributing to scientific analysis of the Earth and near-space environment. The goal of this project is to study how georeferenced video/images from available ISS camera systems can be useful for scientific analysis, using lightning properties as a demonstration. Camera images from the crew cameras and high definition video from the Chiba University Meteor Camera were combined with lightning data from the National Lightning Detection Network (NLDN), ISS-Lightning Imaging Sensor (ISS-LIS), the Geostationary Lightning Mapper (GLM) and lightning mapping arrays. These cameras provide significant spatial resolution advantages ( 10 times or better) over ISS-LIS and GLM, but with lower temporal resolution. Therefore, they can serve as a complementarity analysis tool for studying lightning and thunderstorm processes from space. Lightning sensor data, Visible Infrared Imaging Radiometer Suite (VIIRS) derived city light maps, and other geographic databases were combined with the ISS attitude and position data to reverse geolocate each image or frame. An open-source Python toolkit has been developed to assist with this effort. Next, the locations and sizes of all flashes in each frame or image were computed and compared with flash characteristics from all available lightning datasets. This allowed for characterization of cloud features that are below the 4-km and 8-km resolution of ISS-LIS and GLM which may reduce the light that reaches the ISS-LIS or GLM sensor. In the case of video, consecutive frames were overlaid to determine the rate of change of the light escaping cloud top. Characterization of the rate of change in geometry, more generally the radius, of light escaping cloud top was integrated with the NLDN, ISS-LIS and GLM to understand how the peak rate of change and the peak area of each flash aligned with each lightning system in time. Flash features like leaders could be inferred from the video frames as well. Testing is being done to see if leader speeds may be accurately calculated under certain circumstances.
On mobile wireless ad hoc IP video transports
NASA Astrophysics Data System (ADS)
Kazantzidis, Matheos
2006-05-01
Multimedia transports in wireless, ad-hoc, multi-hop or mobile networks must be capable of obtaining information about the network and adaptively tune sending and encoding parameters to the network response. Obtaining meaningful metrics to guide a stable congestion control mechanism in the transport (i.e. passive, simple, end-to-end and network technology independent) is a complex problem. Equally difficult is obtaining a reliable QoS metrics that agrees with user perception in a client/server or distributed environment. Existing metrics, objective or subjective, are commonly used after or before to test or report on a transmission and require access to both original and transmitted frames. In this paper, we propose that an efficient and successful video delivery and the optimization of overall network QoS requires innovation in a) a direct measurement of available and bottleneck capacity for its congestion control and b) a meaningful subjective QoS metric that is dynamically reported to video sender. Once these are in place, a binomial -stable, fair and TCP friendly- algorithm can be used to determine the sending rate and other packet video parameters. An adaptive mpeg codec can then continually test and fit its parameters and temporal-spatial data-error control balance using the perceived QoS dynamic feedback. We suggest a new measurement based on a packet dispersion technique that is independent of underlying network mechanisms. We then present a binomial control based on direct measurements. We implement a QoS metric that is known to agree with user perception (MPQM) in a client/server, distributed environment by using predetermined table lookups and characterization of video content.
First video rate imagery from a 32-channel 22-GHz aperture synthesis passive millimetre wave imager
NASA Astrophysics Data System (ADS)
Salmon, Neil A.; Macpherson, Rod; Harvey, Andy; Hall, Peter; Hayward, Steve; Wilkinson, Peter; Taylor, Chris
2011-11-01
The first video rate imagery from a proof-of-concept 32-channel 22 GHz aperture synthesis imager is reported. This imager has been brought into operation over the first half of year 2011. Receiver noise temperatures have been measured to be ~453 K, close to original specifications, and the measured radiometric sensitivity agrees with the theoretical predictions for aperture synthesis imagers (2 K for a 40 ms integration time). The short term (few seconds) magnitude stability in the cross-correlations expressed as a fraction was measured to have a mean of 3.45×10-4 with a standard deviation of ~2.30×10-4, whilst the figure for the phase was found to have a mean of essentially zero with a standard deviation of 0.0181°. The susceptibility of the system to aliasing for point sources in the scene was examined and found to be well understood. The system was calibrated and security-relevant indoor near-field and out-door far-field imagery was created, at frame rates ranging from 1 to 200 frames per second. The results prove that an aperture synthesis imager can generate imagery in the near-field regime, successfully coping with the curved wave-fronts. The original objective of the project, to deliver a Technology Readiness Level (TRL) 4 laboratory demonstrator for aperture synthesis passive millimetre wave (PMMW) imaging, has been achieved. The project was co-funded by the Technology Strategy Board and the Royal Society of the United Kingdom.
Multilevel wireless capsule endoscopy video segmentation
NASA Astrophysics Data System (ADS)
Hwang, Sae; Celebi, M. Emre
2010-03-01
Wireless Capsule Endoscopy (WCE) is a relatively new technology (FDA approved in 2002) allowing doctors to view most of the small intestine. WCE transmits more than 50,000 video frames per examination and the visual inspection of the resulting video is a highly time-consuming task even for the experienced gastroenterologist. Typically, a medical clinician spends one or two hours to analyze a WCE video. To reduce the assessment time, it is critical to develop a technique to automatically discriminate digestive organs and shots each of which consists of the same or similar shots. In this paper a multi-level WCE video segmentation methodology is presented to reduce the examination time.
Machine learning for cardiac ultrasound time series data
NASA Astrophysics Data System (ADS)
Yuan, Baichuan; Chitturi, Sathya R.; Iyer, Geoffrey; Li, Nuoyu; Xu, Xiaochuan; Zhan, Ruohan; Llerena, Rafael; Yen, Jesse T.; Bertozzi, Andrea L.
2017-03-01
We consider the problem of identifying frames in a cardiac ultrasound video associated with left ventricular chamber end-systolic (ES, contraction) and end-diastolic (ED, expansion) phases of the cardiac cycle. Our procedure involves a simple application of non-negative matrix factorization (NMF) to a series of frames of a video from a single patient. Rank-2 NMF is performed to compute two end-members. The end members are shown to be close representations of the actual heart morphology at the end of each phase of the heart function. Moreover, the entire time series can be represented as a linear combination of these two end-member states thus providing a very low dimensional representation of the time dynamics of the heart. Unlike previous work, our methods do not require any electrocardiogram (ECG) information in order to select the end-diastolic frame. Results are presented for a data set of 99 patients including both healthy and diseased examples.
NASA Technical Reports Server (NTRS)
Childers, Brooks A.; Snow, Walter L.
1990-01-01
Considerations for acquiring and analyzing 30 Hz video frames from charge coupled device (CCD) cameras mounted in the wing tips of a Beech T-34 aircraft are described. Particular attention is given to the characterization and correction of optical distortions inherent in the data.
Automated mosaicking of sub-canopy video incorporating ancillary data
E. Kee; N.E. Clark; A.L. Abbott
2002-01-01
This work investigates the process of mosaicking overlapping video frames of individual tree stems in sub-canopy scenes captured with a portable multisensor instrument. The robust commercial computer vision systems that are in use today typically rely on precisely controlled conditions. Inconsistent lighting as well as image distortion caused by varying interior and...
High-performance electronic image stabilisation for shift and rotation correction
NASA Astrophysics Data System (ADS)
Parker, Steve C. J.; Hickman, D. L.; Wu, F.
2014-06-01
A novel low size, weight and power (SWaP) video stabiliser called HALO™ is presented that uses a SoC to combine the high processing bandwidth of an FPGA, with the signal processing flexibility of a CPU. An image based architecture is presented that can adapt the tiling of frames to cope with changing scene dynamics. A real-time implementation is then discussed that can generate several hundred optical flow vectors per video frame, to accurately calculate the unwanted rigid body translation and rotation of camera shake. The performance of the HALO™ stabiliser is comprehensively benchmarked against the respected Deshaker 3.0 off-line stabiliser plugin to VirtualDub. Eight different videos are used for benchmarking, simulating: battlefield, surveillance, security and low-level flight applications in both visible and IR wavebands. The results show that HALO™ rivals the performance of Deshaker within its operating envelope. Furthermore, HALO™ may be easily reconfigured to adapt to changing operating conditions or requirements; and can be used to host other video processing functionality like image distortion correction, fusion and contrast enhancement.
Cheng, Xuemin; Hao, Qun; Xie, Mengdi
2016-04-07
Video stabilization is an important technology for removing undesired motion in videos. This paper presents a comprehensive motion estimation method for electronic image stabilization techniques, integrating the speeded up robust features (SURF) algorithm, modified random sample consensus (RANSAC), and the Kalman filter, and also taking camera scaling and conventional camera translation and rotation into full consideration. Using SURF in sub-pixel space, feature points were located and then matched. The false matched points were removed by modified RANSAC. Global motion was estimated by using the feature points and modified cascading parameters, which reduced the accumulated errors in a series of frames and improved the peak signal to noise ratio (PSNR) by 8.2 dB. A specific Kalman filter model was established by considering the movement and scaling of scenes. Finally, video stabilization was achieved with filtered motion parameters using the modified adjacent frame compensation. The experimental results proved that the target images were stabilized even when the vibrating amplitudes of the video become increasingly large.
A web-based system for home monitoring of patients with Parkinson's disease using wearable sensors.
Chen, Bor-Rong; Patel, Shyamal; Buckley, Thomas; Rednic, Ramona; McClure, Douglas J; Shih, Ludy; Tarsy, Daniel; Welsh, Matt; Bonato, Paolo
2011-03-01
This letter introduces MercuryLive, a platform to enable home monitoring of patients with Parkinson's disease (PD) using wearable sensors. MercuryLive contains three tiers: a resource-aware data collection engine that relies upon wearable sensors, web services for live streaming and storage of sensor data, and a web-based graphical user interface client with video conferencing capability. Besides, the platform has the capability of analyzing sensor (i.e., accelerometer) data to reliably estimate clinical scores capturing the severity of tremor, bradykinesia, and dyskinesia. Testing results showed an average data latency of less than 400 ms and video latency of about 200 ms with video frame rate of about 13 frames/s when 800 kb/s of bandwidth were available and we used a 40% video compression, and data feature upload requiring 1 min of extra time following a 10 min interactive session. These results indicate that the proposed platform is suitable to monitor patients with PD to facilitate the titration of medications in the late stages of the disease.
NASA Astrophysics Data System (ADS)
Kuroda, R.; Sugawa, S.
2017-02-01
Ultra-high speed (UHS) CMOS image sensors with on-chop analog memories placed on the periphery of pixel array for the visualization of UHS phenomena are overviewed in this paper. The developed UHS CMOS image sensors consist of 400H×256V pixels and 128 memories/pixel, and the readout speed of 1Tpixel/sec is obtained, leading to 10 Mfps full resolution video capturing with consecutive 128 frames, and 20 Mfps half resolution video capturing with consecutive 256 frames. The first development model has been employed in the high speed video camera and put in practical use in 2012. By the development of dedicated process technologies, photosensitivity improvement and power consumption reduction were simultaneously achieved, and the performance improved version has been utilized in the commercialized high-speed video camera since 2015 that offers 10 Mfps with ISO16,000 photosensitivity. Due to the improved photosensitivity, clear images can be captured and analyzed even under low light condition, such as under a microscope as well as capturing of UHS light emission phenomena.
HiSeasNet: Oceanographic Ships Join the Grid
NASA Astrophysics Data System (ADS)
Berger, Jonathan; Orcutt, John; Foley, Steven; Bohlen, Steven
2006-05-01
HiSeasNet, the communications network providing full-period Internet access for the U.S. academic ocean research fleet, is an enabling technology that is changing the way oceanography is done in the 21st century. With the installation in March 2006 of a system on the research vessel (R/V) Seward Johnson and the planned installation on the R/V Marcus Langseth later this year, all but two of the Universities National Oceanographic Laboratories System (UNOLS) fleet of large/global and intermediate/ocean vessels will be equipped with HiSeasNet capability. HiSeasNet is a full-service Internet Protocol (IP) satellite network utilizing Cisco technology. In addition to the familiar IP services-such as e-mail, telnet, ssh, rlogin, Web traffic, and ftp-HiSeasNet can move real-time audio and video traffic across the satellite links. Phone systems onboard research ships can be connected to their home institutions' phone exchanges. Video teleconferencing with the current 96 kilobits per second circuits supports compressed video frame rates at about 10 frames per second, allowing for effective conversations and demonstrations with ship-to-shore video.
Low Cost Efficient Deliverying Video Surveillance Service to Moving Guard for Smart Home.
Gualotuña, Tatiana; Macías, Elsa; Suárez, Álvaro; C, Efraín R Fonseca; Rivadeneira, Andrés
2018-03-01
Low-cost video surveillance systems are attractive for Smart Home applications (especially in emerging economies). Those systems use the flexibility of the Internet of Things to operate the video camera only when an intrusion is detected. We are the only ones that focus on the design of protocols based on intelligent agents to communicate the video of an intrusion in real time to the guards by wireless or mobile networks. The goal is to communicate, in real time, the video to the guards who can be moving towards the smart home. However, this communication suffers from sporadic disruptions that difficults the control and drastically reduces user satisfaction and operativity of the system. In a novel way, we have designed a generic software architecture based on design patterns that can be adapted to any hardware in a simple way. The implanted hardware is of very low economic cost; the software frameworks are free. In the experimental tests we have shown that it is possible to communicate to the moving guard, intrusion notifications (by e-mail and by instant messaging), and the first video frames in less than 20 s. In addition, we automatically recovered the frames of video lost in the disruptions in a transparent way to the user, we supported vertical handover processes and we could save energy of the smartphone's battery. However, the most important thing was that the high satisfaction of the people who have used the system.
Low Cost Efficient Deliverying Video Surveillance Service to Moving Guard for Smart Home
Gualotuña, Tatiana; Fonseca C., Efraín R.; Rivadeneira, Andrés
2018-01-01
Low-cost video surveillance systems are attractive for Smart Home applications (especially in emerging economies). Those systems use the flexibility of the Internet of Things to operate the video camera only when an intrusion is detected. We are the only ones that focus on the design of protocols based on intelligent agents to communicate the video of an intrusion in real time to the guards by wireless or mobile networks. The goal is to communicate, in real time, the video to the guards who can be moving towards the smart home. However, this communication suffers from sporadic disruptions that difficults the control and drastically reduces user satisfaction and operativity of the system. In a novel way, we have designed a generic software architecture based on design patterns that can be adapted to any hardware in a simple way. The implanted hardware is of very low economic cost; the software frameworks are free. In the experimental tests we have shown that it is possible to communicate to the moving guard, intrusion notifications (by e-mail and by instant messaging), and the first video frames in less than 20 s. In addition, we automatically recovered the frames of video lost in the disruptions in a transparent way to the user, we supported vertical handover processes and we could save energy of the smartphone's battery. However, the most important thing was that the high satisfaction of the people who have used the system. PMID:29494551
High-frame-rate digital radiographic videography
NASA Astrophysics Data System (ADS)
King, Nicholas S. P.; Cverna, Frank H.; Albright, Kevin L.; Jaramillo, Steven A.; Yates, George J.; McDonald, Thomas E.; Flynn, Michael J.; Tashman, Scott
1994-10-01
High speed x-ray imaging can be an important tool for observing internal processes in a wide range of applications. In this paper we describe preliminary implementation of a system having the eventual goal of observing the internal dynamics of bone and joint reactions during loading. Two Los Alamos National Laboratory (LANL) gated and image intensified camera systems were used to record images from an x-ray image convertor tube to demonstrate the potential of high frame-rate digital radiographic videography in the analysis of bone and joint dynamics of the human body. Preliminary experiments were done at LANL to test the systems. Initial high frame-rate imaging (from 500 to 1000 frames/s) of a swinging pendulum mounted to the face of an X-ray image convertor tube demonstrated high contrast response and baseline sensitivity. The systems were then evaluated at the Motion Analysis Laboratory of Henry Ford Health Systems Bone and Joint Center. Imaging of a 9 inch acrylic disk with embedded lead markers rotating at approximately 1000 RPM, demonstrated the system response to a high velocity/high contrast target. By gating the P-20 phosphor image from the X-ray image convertor with a second image intensifier (II) and using a 100 microsecond wide optical gate through the second II, enough prompt light decay from the x-ray image convertor phosphor had taken place to achieve reduction of most of the motion blurring. Measurement of the marker velocity was made by using video frames acquired at 500 frames/s. The data obtained from both experiments successfully demonstrated the feasibility of the technique. Several key areas for improvement are discussed along with salient test results and experiment details.
NASA Astrophysics Data System (ADS)
He, Qiang; Schultz, Richard R.; Chu, Chee-Hung Henry
2008-04-01
The concept surrounding super-resolution image reconstruction is to recover a highly-resolved image from a series of low-resolution images via between-frame subpixel image registration. In this paper, we propose a novel and efficient super-resolution algorithm, and then apply it to the reconstruction of real video data captured by a small Unmanned Aircraft System (UAS). Small UAS aircraft generally have a wingspan of less than four meters, so that these vehicles and their payloads can be buffeted by even light winds, resulting in potentially unstable video. This algorithm is based on a coarse-to-fine strategy, in which a coarsely super-resolved image sequence is first built from the original video data by image registration and bi-cubic interpolation between a fixed reference frame and every additional frame. It is well known that the median filter is robust to outliers. If we calculate pixel-wise medians in the coarsely super-resolved image sequence, we can restore a refined super-resolved image. The primary advantage is that this is a noniterative algorithm, unlike traditional approaches based on highly-computational iterative algorithms. Experimental results show that our coarse-to-fine super-resolution algorithm is not only robust, but also very efficient. In comparison with five well-known super-resolution algorithms, namely the robust super-resolution algorithm, bi-cubic interpolation, projection onto convex sets (POCS), the Papoulis-Gerchberg algorithm, and the iterated back projection algorithm, our proposed algorithm gives both strong efficiency and robustness, as well as good visual performance. This is particularly useful for the application of super-resolution to UAS surveillance video, where real-time processing is highly desired.
Cho, Minwoo; Kim, Jee Hyun; Kong, Hyoun Joong; Hong, Kyoung Sup; Kim, Sungwan
2018-05-01
The colonoscopy adenoma detection rate depends largely on physician experience and skill, and overlooked colorectal adenomas could develop into cancer. This study assessed a system that detects polyps and summarizes meaningful information from colonoscopy videos. One hundred thirteen consecutive patients had colonoscopy videos prospectively recorded at the Seoul National University Hospital. Informative video frames were extracted using a MATLAB support vector machine (SVM) model and classified as bleeding, polypectomy, tool, residue, thin wrinkle, folded wrinkle, or common. Thin wrinkle, folded wrinkle, and common frames were reanalyzed using SVM for polyp detection. The SVM model was applied hierarchically for effective classification and optimization of the SVM. The mean classification accuracy according to type was over 93%; sensitivity was over 87%. The mean sensitivity for polyp detection was 82.1%, and the positive predicted value (PPV) was 39.3%. Polyps detected using the system were larger (6.3 ± 6.4 vs. 4.9 ± 2.5 mm; P = 0.003) with a more pedunculated morphology (Yamada type III, 10.2 vs. 0%; P < 0.001; Yamada type IV, 2.8 vs. 0%; P < 0.001) than polyps missed by the system. There were no statistically significant differences in polyp distribution or histology between the groups. Informative frames and suspected polyps were presented on a timeline. This summary was evaluated using the system usability scale questionnaire; 89.3% of participants expressed positive opinions. We developed and verified a system to extract meaningful information from colonoscopy videos. Although further improvement and validation of the system is needed, the proposed system is useful for physicians and patients.
Sarikaya, Duygu; Corso, Jason J; Guru, Khurshid A
2017-07-01
Video understanding of robot-assisted surgery (RAS) videos is an active research area. Modeling the gestures and skill level of surgeons presents an interesting problem. The insights drawn may be applied in effective skill acquisition, objective skill assessment, real-time feedback, and human-robot collaborative surgeries. We propose a solution to the tool detection and localization open problem in RAS video understanding, using a strictly computer vision approach and the recent advances of deep learning. We propose an architecture using multimodal convolutional neural networks for fast detection and localization of tools in RAS videos. To the best of our knowledge, this approach will be the first to incorporate deep neural networks for tool detection and localization in RAS videos. Our architecture applies a region proposal network (RPN) and a multimodal two stream convolutional network for object detection to jointly predict objectness and localization on a fusion of image and temporal motion cues. Our results with an average precision of 91% and a mean computation time of 0.1 s per test frame detection indicate that our study is superior to conventionally used methods for medical imaging while also emphasizing the benefits of using RPN for precision and efficiency. We also introduce a new data set, ATLAS Dione, for RAS video understanding. Our data set provides video data of ten surgeons from Roswell Park Cancer Institute, Buffalo, NY, USA, performing six different surgical tasks on the daVinci Surgical System (dVSS) with annotations of robotic tools per frame.
Qian, Zhi-Ming; Wang, Shuo Hong; Cheng, Xi En; Chen, Yan Qiu
2016-06-23
Fish tracking is an important step for video based analysis of fish behavior. Due to severe body deformation and mutual occlusion of multiple swimming fish, accurate and robust fish tracking from video image sequence is a highly challenging problem. The current tracking methods based on motion information are not accurate and robust enough to track the waving body and handle occlusion. In order to better overcome these problems, we propose a multiple fish tracking method based on fish head detection. The shape and gray scale characteristics of the fish image are employed to locate the fish head position. For each detected fish head, we utilize the gray distribution of the head region to estimate the fish head direction. Both the position and direction information from fish detection are then combined to build a cost function of fish swimming. Based on the cost function, global optimization method can be applied to associate the target between consecutive frames. Results show that our method can accurately detect the position and direction information of fish head, and has a good tracking performance for dozens of fish. The proposed method can successfully obtain the motion trajectories for dozens of fish so as to provide more precise data to accommodate systematic analysis of fish behavior.
Demirkus, Meltem; Precup, Doina; Clark, James J; Arbel, Tal
2016-06-01
Recent literature shows that facial attributes, i.e., contextual facial information, can be beneficial for improving the performance of real-world applications, such as face verification, face recognition, and image search. Examples of face attributes include gender, skin color, facial hair, etc. How to robustly obtain these facial attributes (traits) is still an open problem, especially in the presence of the challenges of real-world environments: non-uniform illumination conditions, arbitrary occlusions, motion blur and background clutter. What makes this problem even more difficult is the enormous variability presented by the same subject, due to arbitrary face scales, head poses, and facial expressions. In this paper, we focus on the problem of facial trait classification in real-world face videos. We have developed a fully automatic hierarchical and probabilistic framework that models the collective set of frame class distributions and feature spatial information over a video sequence. The experiments are conducted on a large real-world face video database that we have collected, labelled and made publicly available. The proposed method is flexible enough to be applied to any facial classification problem. Experiments on a large, real-world video database McGillFaces [1] of 18,000 video frames reveal that the proposed framework outperforms alternative approaches, by up to 16.96 and 10.13%, for the facial attributes of gender and facial hair, respectively.
Real-Time Detection of Sporadic Meteors in the Intensified TV Imaging Systems.
Vítek, Stanislav; Nasyrova, Maria
2017-12-29
The automatic observation of the night sky through wide-angle video systems with the aim of detecting meteor and fireballs is currently among routine astronomical observations. The observation is usually done in multi-station or network mode, so it is possible to estimate the direction and the speed of the body flight. The high velocity of the meteorite flying through the atmosphere determines the important features of the camera systems, namely the high frame rate. Thanks to high frame rates, such imaging systems produce a large amount of data, of which only a small fragment has scientific potential. This paper focuses on methods for the real-time detection of fast moving objects in the video sequences recorded by intensified TV systems with frame rates of about 60 frames per second. The goal of our effort is to remove all unnecessary data during the daytime and make free hard-drive capacity for the next observation. The processing of data from the MAIA (Meteor Automatic Imager and Analyzer) system is demonstrated in the paper.
Color image processing and object tracking workstation
NASA Technical Reports Server (NTRS)
Klimek, Robert B.; Paulick, Michael J.
1992-01-01
A system is described for automatic and semiautomatic tracking of objects on film or video tape which was developed to meet the needs of the microgravity combustion and fluid science experiments at NASA Lewis. The system consists of individual hardware parts working under computer control to achieve a high degree of automation. The most important hardware parts include 16 mm film projector, a lens system, a video camera, an S-VHS tapedeck, a frame grabber, and some storage and output devices. Both the projector and tapedeck have a computer interface enabling remote control. Tracking software was developed to control the overall operation. In the automatic mode, the main tracking program controls the projector or the tapedeck frame incrementation, grabs a frame, processes it, locates the edge of the objects being tracked, and stores the coordinates in a file. This process is performed repeatedly until the last frame is reached. Three representative applications are described. These applications represent typical uses and include tracking the propagation of a flame front, tracking the movement of a liquid-gas interface with extremely poor visibility, and characterizing a diffusion flame according to color and shape.
Real-Time Detection of Sporadic Meteors in the Intensified TV Imaging Systems
2017-01-01
The automatic observation of the night sky through wide-angle video systems with the aim of detecting meteor and fireballs is currently among routine astronomical observations. The observation is usually done in multi-station or network mode, so it is possible to estimate the direction and the speed of the body flight. The high velocity of the meteorite flying through the atmosphere determines the important features of the camera systems, namely the high frame rate. Thanks to high frame rates, such imaging systems produce a large amount of data, of which only a small fragment has scientific potential. This paper focuses on methods for the real-time detection of fast moving objects in the video sequences recorded by intensified TV systems with frame rates of about 60 frames per second. The goal of our effort is to remove all unnecessary data during the daytime and make free hard-drive capacity for the next observation. The processing of data from the MAIA (Meteor Automatic Imager and Analyzer) system is demonstrated in the paper. PMID:29286294
VLSI-based video event triggering for image data compression
NASA Astrophysics Data System (ADS)
Williams, Glenn L.
1994-02-01
Long-duration, on-orbit microgravity experiments require a combination of high resolution and high frame rate video data acquisition. The digitized high-rate video stream presents a difficult data storage problem. Data produced at rates of several hundred million bytes per second may require a total mission video data storage requirement exceeding one terabyte. A NASA-designed, VLSI-based, highly parallel digital state machine generates a digital trigger signal at the onset of a video event. High capacity random access memory storage coupled with newly available fuzzy logic devices permits the monitoring of a video image stream for long term (DC-like) or short term (AC-like) changes caused by spatial translation, dilation, appearance, disappearance, or color change in a video object. Pre-trigger and post-trigger storage techniques are then adaptable to archiving only the significant video images.
VLSI-based Video Event Triggering for Image Data Compression
NASA Technical Reports Server (NTRS)
Williams, Glenn L.
1994-01-01
Long-duration, on-orbit microgravity experiments require a combination of high resolution and high frame rate video data acquisition. The digitized high-rate video stream presents a difficult data storage problem. Data produced at rates of several hundred million bytes per second may require a total mission video data storage requirement exceeding one terabyte. A NASA-designed, VLSI-based, highly parallel digital state machine generates a digital trigger signal at the onset of a video event. High capacity random access memory storage coupled with newly available fuzzy logic devices permits the monitoring of a video image stream for long term (DC-like) or short term (AC-like) changes caused by spatial translation, dilation, appearance, disappearance, or color change in a video object. Pre-trigger and post-trigger storage techniques are then adaptable to archiving only the significant video images.
Enk, D; Enk, E
1995-11-01
Various in vitro models have been introduced for comparative examinations of post-dural-puncture trauma and measurement of liquor leakage through puncture sites. These models allow simulation of subarachnoid, but not of peridural, pressure. A new two-chamber-model realizes the simulation of both subarachnoid and peridural pressure and allows observation of in vitro punctures with video-documentation. Frame grabbing and (computer-aided) image analysis show new aspects of spinal puncture effects. Therefore, post-dural-puncture trauma and retraction can be objectively visualized by this method, which has not previously been demonstrated. Two-chamber-model consists of two short aluminium cylinders. Native human dura patches (8X8 mm) from fresh cadavers are put (correctly oriented) between two special polyamide seals. Mounted between the upper and lower cylinder, these seals stretch the dura patch, which remains flexible and even in all directions. After filling of the lower (subarachnoid) and upper (peridural) chamber with Ringer lactate solution, positive or negative physiological pressure can be adjusted by way of two (Ringer lactate solution filled) infusion lines in each chamber. Puncturing is performed at an angle of 57 degrees to the dura. The model allows examination with epi-illumination and transmitted (polarized) light. In vitro punctures are observed through an inverted camera lens with an CCD-Hi8 video camera (Canon UC1HI) looking into the peridural chamber and documented by means of an S-VHS video recorder (Panasonic NV-FS200EG). After true-colour frame grabbing by a video digitizer (Fast Screen Machine II), single video frames can be optimized and analysed with a 486-66 MHz computer and conventional software (Corel Draw 3.0, Photostyler 1.1a, DDL Aequitas 1.00b). Punctures demonstrated in this paper have been done under simulation of a transdural gradient of 20 cm water similar to the situation of a recumbent patient (15 cm water in the subarachnoid and -5 cm water in the peridural chamber). The punctures were followed by short-time observation for up to 10 minutes. By making it possible to obtains a picture of the puncture site at 20-ms intervals (because of the PAL norm of 50 half-frames/s), video-documentation has become accepted as superior to conventional photography. When the Ringer lactate solution in the subarachnoid chamber is stained with methylene blue, transdural leakage can easily be observed. The result of this documentation technique demonstrate that not dural puncture can be atraumatic, when a 29-G Quincke needle is used. Calculation on the difference between a digitized video frame before and after the puncture clearly illustrates the dural trauma. Owing to their non-cutting tip, as expected, pencil-point needles leave diffuse changes across the dura patch, whereas a more local trauma was observed after puncturing with cutting-tip needles. The same computer calculation between two video frames allows examination of post-puncture-dural retraction of the puncture site. In this connection, we found that relevant dural retraction is a phenomenon limited to the first minute after puncture. Thin spinal needles with so-called modern tips (e.g. Whitacre, Atraucan) can minimize the post-dural-puncture trauma, whereas thicker, conventional, spinal needles (Quincke) leave considerable dural defects. The two-chamber-model presented allows easy simulation of physiological subarachnoid and peridural pressure. The Ringer lactate solution in the subarachnoid chamber corresponds to the liquor, whereas that in the peridural chamber corresponds to the intercellular (peridural) space. The tension of the dural patch between the polyamide seals is similar to the situation in an anotomical model observed by spinaloscopy (in an earlier study). With the video documentation and computer-aided analysis technique introduced, dural trauma and retraction of the puncture site can be examined and demo
Ultrasound investigation of fetal human upper respiratory anatomy.
Wolfson, V P; Laitman, J T
1990-07-01
Although the human upper respiratory-upper digestive tract is an area of vital importance, relatively little is known about either the structural or functional changes that occur in the region during the fetal period. While investigations in our laboratory have begun to chart these changes through the use of postmortem materials, in vivo studies have been rarely attempted. This study combines ultrasonography with new applications of video editing to examine aspects of prenatal upper respiratory development. Structures of the fetal upper respiratory-digestive tract and their movements were studied through the use of ultrasonography and detailed frame-by-frame analysis. Twenty-five living fetuses, aged 18-36 weeks gestation, were studied in utero during routine diagnostic ultrasound examination. These real-time linear array sonograms were videotaped during each study. Videotapes were next analyzed for anatomical structures and movement patterns, played back through the ultrasound machine in normal speed, and then examined with a frame-by-frame video editor (FFVE) to identify structures and movements. Still images were photographed directly from the video monitor using a 35 mm camera. Results show that upper respiratory and digestive structures, as well as their movements, could be seen clearly during normal speed and repeat frame-by-frame analysis. Major structures that could be identified in the majority of subjects included trachea in 20 of 25 fetuses (80%); larynx, 76%; pharynx, 76%. Smaller structures were more variable, but were nevertheless observed on both sagittal and coronal section: piriform sinuses, 76%; thyroid cartilage, 36%; cricoid cartilage, 32%; and epiglottis, 16%. Movements of structures could also be seen and were those typically observed in connection with swallowing: fluttering tongue movements, changes in pharyngeal shape, and passage of a bolus via the piriform sinuses to esophagus. Fetal swallows had minimal laryngeal motion. This study represents the first time that the appearance of upper airway and digestive tract structures have been quantified in conjunction with their movements in the living fetus.
Using ARINC 818 Avionics Digital Video Bus (ADVB) for military displays
NASA Astrophysics Data System (ADS)
Alexander, Jon; Keller, Tim
2007-04-01
ARINC 818 Avionics Digital Video Bus (ADVB) is a new digital video interface and protocol standard developed especially for high bandwidth uncompressed digital video. The first draft of this standard, released in January of 2007, has been advanced by ARINC and the aerospace community to meet the acute needs of commercial aviation for higher performance digital video. This paper analyzes ARINC 818 for use in military display systems found in avionics, helicopters, and ground vehicles. The flexibility of ARINC 818 for the diverse resolutions, grayscales, pixel formats, and frame rates of military displays is analyzed as well as the suitability of ARINC 818 to support requirements for military video systems including bandwidth, latency, and reliability. Implementation issues relevant to military displays are presented.
High-Speed Video Observations of a Natural Lightning Stepped Leader
NASA Astrophysics Data System (ADS)
Jordan, D. M.; Hill, J. D.; Uman, M. A.; Yoshida, S.; Kawasaki, Z.
2010-12-01
High-speed video images of one branch of a natural negative lightning stepped leader were obtained at a frame rate of 300 kfps (3.33 us exposure) on June 18th, 2010 at the International Center for Lightning Research and Testing (ICLRT) located on the Camp Blanding Army National Guard Base in north-central Florida. The images were acquired using a 20 mm Nikon lens mounted on a Photron SA1.1 high-speed camera. A total of 225 frames (about 0.75 ms) of the downward stepped leader were captured, followed by 45 frames of the leader channel re-illumination by the return stroke and subsequent decay following the ground attachment of the primary leader channel. Luminous characteristics of dart-stepped leader propagation in triggered lightning obtained by Biagi et al. [2009, 2010] and of long laboratory spark formation [e.g., Bazelyan and Raizer, 1998; Gallimberti et al., 2002] are evident in the frames of the natural lightning stepped leader. Space stems/leaders are imaged in twelve different frames at various distances in front of the descending leader tip, which branches into two distinct components 125 frames after the channel enters the field of view. In each case, the space stem/leader appears to connect to the leader tip above in the subsequent frame, forming a new step. Each connection is associated with significant isolated brightening of the channel at the connection point followed by typically three or four frames of upward propagating re-illumination of the existing leader channel. In total, at least 80 individual steps were imaged.
NASA Astrophysics Data System (ADS)
Crone, T. J.; Mittelstaedt, E. L.; Fornari, D. J.
2014-12-01
Fluid flow rates through high-temperature mid-ocean ridge hydrothermal vents are likely quite sensitive to poroelastic forcing mechanisms such as tidal loading and tectonic activity. Because poroelastic deformation and flow perturbations are estimated to extend to considerable depths within young oceanic crust, observations of flow rate changes at seafloor vents have the potential to provide constraints on the flow geometry and permeability structure of the underlying hydrothermal systems, as well as the quantities of heat and chemicals they exchange with overlying ocean, and the potential biological productivity of ecosystems they host. To help provide flow rate measurements in these challenging environments, we have developed two new optical flow oriented technologies. The first is a new form of Optical Plume Velocimetry (OPV) which relies on single-frame temporal cross-correlation to obtain time-averaged image velocity fields from short video sequences. The second is the VentCam, a deep sea camera system that can collect high-frame-rate video sequences at focused hydrothermal vents suitable for analysis with OPV. During the July 2014 R/V Atlantis/Alvin expedition to Axial Seamount, we deployed the VentCam at the ~300C Phoenix vent within the ASHES vent field and positioned it with DSRV Alvin. We collected 24 seconds of video at 50 frames per second every half-hour for approximately 10 days beginning July 22nd. We are currently applying single-frame lag OPV to these videos to estimate relative and absolute fluid flow rates through this vent. To explore the relationship between focused and diffuse venting, we deployed a second optical flow camera, the Diffuse Effluent Measurement System (DEMS), adjacent to this vent at a fracture within the lava carapace where low-T (~30C) fluids were exiting. This system collected video sequences and diffuse flow measurements at overlapping time intervals. Here we present the preliminary results of our work with VentCam and OPV, and comparisons with results from the DEMS camera.
Coding visual features extracted from video sequences.
Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2014-05-01
Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
A content-based news video retrieval system: NVRS
NASA Astrophysics Data System (ADS)
Liu, Huayong; He, Tingting
2009-10-01
This paper focus on TV news programs and design a content-based news video browsing and retrieval system, NVRS, which is convenient for users to fast browsing and retrieving news video by different categories such as political, finance, amusement, etc. Combining audiovisual features and caption text information, the system automatically segments a complete news program into separate news stories. NVRS supports keyword-based news story retrieval, category-based news story browsing and generates key-frame-based video abstract for each story. Experiments show that the method of story segmentation is effective and the retrieval is also efficient.
Using dynamic mode decomposition for real-time background/foreground separation in video
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kutz, Jose Nathan; Grosek, Jacob; Brunton, Steven
The technique of dynamic mode decomposition (DMD) is disclosed herein for the purpose of robustly separating video frames into background (low-rank) and foreground (sparse) components in real-time. Foreground/background separation is achieved at the computational cost of just one singular value decomposition (SVD) and one linear equation solve, thus producing results orders of magnitude faster than robust principal component analysis (RPCA). Additional techniques, including techniques for analyzing the video for multi-resolution time-scale components, and techniques for reusing computations to allow processing of streaming video in real time, are also described herein.
Application of M-JPEG compression hardware to dynamic stimulus production.
Mulligan, J B
1997-01-01
Inexpensive circuit boards have appeared on the market which transform a normal micro-computer's disk drive into a video disk capable of playing extended video sequences in real time. This technology enables the performance of experiments which were previously impossible, or at least prohibitively expensive. The new technology achieves this capability using special-purpose hardware to compress and decompress individual video frames, enabling a video stream to be transferred over relatively low-bandwidth disk interfaces. This paper will describe the use of such devices for visual psychophysics and present the technical issues that must be considered when evaluating individual products.
Advanced High-Definition Video Cameras
NASA Technical Reports Server (NTRS)
Glenn, William
2007-01-01
A product line of high-definition color video cameras, now under development, offers a superior combination of desirable characteristics, including high frame rates, high resolutions, low power consumption, and compactness. Several of the cameras feature a 3,840 2,160-pixel format with progressive scanning at 30 frames per second. The power consumption of one of these cameras is about 25 W. The size of the camera, excluding the lens assembly, is 2 by 5 by 7 in. (about 5.1 by 12.7 by 17.8 cm). The aforementioned desirable characteristics are attained at relatively low cost, largely by utilizing digital processing in advanced field-programmable gate arrays (FPGAs) to perform all of the many functions (for example, color balance and contrast adjustments) of a professional color video camera. The processing is programmed in VHDL so that application-specific integrated circuits (ASICs) can be fabricated directly from the program. ["VHDL" signifies VHSIC Hardware Description Language C, a computing language used by the United States Department of Defense for describing, designing, and simulating very-high-speed integrated circuits (VHSICs).] The image-sensor and FPGA clock frequencies in these cameras have generally been much higher than those used in video cameras designed and manufactured elsewhere. Frequently, the outputs of these cameras are converted to other video-camera formats by use of pre- and post-filters.
Violent Interaction Detection in Video Based on Deep Learning
NASA Astrophysics Data System (ADS)
Zhou, Peipei; Ding, Qinghai; Luo, Haibo; Hou, Xinglin
2017-06-01
Violent interaction detection is of vital importance in some video surveillance scenarios like railway stations, prisons or psychiatric centres. Existing vision-based methods are mainly based on hand-crafted features such as statistic features between motion regions, leading to a poor adaptability to another dataset. En lightened by the development of convolutional networks on common activity recognition, we construct a FightNet to represent the complicated visual violence interaction. In this paper, a new input modality, image acceleration field is proposed to better extract the motion attributes. Firstly, each video is framed as RGB images. Secondly, optical flow field is computed using the consecutive frames and acceleration field is obtained according to the optical flow field. Thirdly, the FightNet is trained with three kinds of input modalities, i.e., RGB images for spatial networks, optical flow images and acceleration images for temporal networks. By fusing results from different inputs, we conclude whether a video tells a violent event or not. To provide researchers a common ground for comparison, we have collected a violent interaction dataset (VID), containing 2314 videos with 1077 fight ones and 1237 no-fight ones. By comparison with other algorithms, experimental results demonstrate that the proposed model for violent interaction detection shows higher accuracy and better robustness.
ERIC Educational Resources Information Center
Kim, Sujin; Song, Kim; Coppersmith, Sarah
2018-01-01
This qualitative case study was framed by an experiential learning approach organized around video resources and linguistically and culturally responsive content teaching. The study explored an overarching research question: How did teacher-learners in a grant project interact with a multimedia learning platform that combined teaching video and…
Legal and Ethical Issues in the Use of Video in Education Research. Working Paper Series.
ERIC Educational Resources Information Center
Arafeh, Sousan; McLaughlin, Mary
The National Center for Education Statistics (NCES), through the Education Statistics Services Institute, supported the research in this report to help frame future discussions about the use of video research techniques in educational settings. This paper addresses the context of technological, legal, and ethical change facing researchers who use…
Design Issues in Video Disc Map Display.
1984-10-01
such items as the equipment used by ETL in its work with discs and selected images from a disc. % %. I 4 11. VIDEO DISC TECHNOLOGY AND VOCABULARY 0...The term video refers to a television image. The standard home television set is equipped with a receiver, which is capable of picking up a signal...plays for one hour per side and is played at a constant linear velocity. The industria )y-formatted disc has 54,000 frames per side in concentric tracks
Next-Generation Image and Sound Processing Strategies: Exploiting the Biological Model
2007-05-01
several video game clips which were recorded while observers interactively played the games. The feature vectors may be derived from either: the...phase, we use a different video game clip to test the model. Frames from the test clip are passed in parallel to a bottom-up saliency model, as well as... video games (Figure 6). We found that the TD model alone predicts where humans look about twice as well as does the BU model alone; in addition, a
Digital television system design study
NASA Technical Reports Server (NTRS)
Huth, G. K.
1976-01-01
The use of digital techniques for transmission of pictorial data is discussed for multi-frame images (television). Video signals are processed in a manner which includes quantization and coding such that they are separable from the noise introduced into the channel. The performance of digital television systems is determined by the nature of the processing techniques (i.e., whether the video signal itself or, instead, something related to the video signal is quantized and coded) and to the quantization and coding schemes employed.
NASA Astrophysics Data System (ADS)
Chen, Xinyuan; Song, Li; Yang, Xiaokang
2016-09-01
Video denoising can be described as the problem of mapping from a specific length of noisy frames to clean one. We propose a deep architecture based on Recurrent Neural Network (RNN) for video denoising. The model learns a patch-based end-to-end mapping between the clean and noisy video sequences. It takes the corrupted video sequences as the input and outputs the clean one. Our deep network, which we refer to as deep Recurrent Neural Networks (deep RNNs or DRNNs), stacks RNN layers where each layer receives the hidden state of the previous layer as input. Experiment shows (i) the recurrent architecture through temporal domain extracts motion information and does favor to video denoising, and (ii) deep architecture have large enough capacity for expressing mapping relation between corrupted videos as input and clean videos as output, furthermore, (iii) the model has generality to learned different mappings from videos corrupted by different types of noise (e.g., Poisson-Gaussian noise). By training on large video databases, we are able to compete with some existing video denoising methods.
RAPID: A random access picture digitizer, display, and memory system
NASA Technical Reports Server (NTRS)
Yakimovsky, Y.; Rayfield, M.; Eskenazi, R.
1976-01-01
RAPID is a system capable of providing convenient digital analysis of video data in real-time. It has two modes of operation. The first allows for continuous digitization of an EIA RS-170 video signal. Each frame in the video signal is digitized and written in 1/30 of a second into RAPID's internal memory. The second mode leaves the content of the internal memory independent of the current input video. In both modes of operation the image contained in the memory is used to generate an EIA RS-170 composite video output signal representing the digitized image in the memory so that it can be displayed on a monitor.
Yoo, Sun K; Kim, D K; Jung, S M; Kim, E-K; Lim, J S; Kim, J H
2004-01-01
A Web-based, realtime, tele-ultrasound consultation system was designed. The system employed ActiveX control, MPEG-4 coding of full-resolution ultrasound video (640 x 480 pixels at 30 frames/s) and H.320 videoconferencing. It could be used via a Web browser. The system was evaluated over three types of commercial line: a cable connection, ADSL and VDSL. Three radiologists assessed the quality of compressed and uncompressed ultrasound video-sequences from 16 cases (10 abnormal livers, four abnormal kidneys and two abnormal gallbladders). The radiologists' scores showed that, at a given frame rate, increasing the bit rate was associated with increasing quality; however, at a certain threshold bit rate the quality did not increase significantly. The peak signal to noise ratio (PSNR) was also measured between the compressed and uncompressed images. In most cases, the PSNR increased as the bit rate increased, and increased as the number of dropped frames increased. There was a threshold bit rate, at a given frame rate, at which the PSNR did not improve significantly. Taking into account both sets of threshold values, a bit rate of more than 0.6 Mbit/s, at 30 frames/s, is suggested as the threshold for the maintenance of diagnostic image quality.
Real life identification of partially occluded weapons in video frames
NASA Astrophysics Data System (ADS)
Hempelmann, Christian F.; Arslan, Abdullah N.; Attardo, Salvatore; Blount, Grady P.; Sirakov, Nikolay M.
2016-05-01
We empirically test the capacity of an improved system to identify not just images of individual guns, but partially occluded guns and their parts appearing in a videoframe. This approach combines low-level geometrical information gleaned from the visual images and high-level semantic information stored in an ontology enriched with meronymic part-whole relations. The main improvements of the system are handling occlusion, new algorithms, and an emerging meronomy. Well-known and commonly deployed in ontologies, actual meronomies need to be engineered and populated with unique solutions. Here, this includes adjacency of weapon parts and essentiality of parts to the threat of and the diagnosticity for a weapon. In this study video sequences are processed frame by frame. The extraction method separates colors and removes the background. Then image subtraction of the next frame determines moving targets, before morphological closing is applied to the current frame in order to clean up noise and fill gaps. Next, the method calculates for each object the boundary coordinates and uses them to create a finite numerical sequence as a descriptor. Parts identification is done by cyclic sequence alignment and matching against the nodes of the weapons ontology. From the identified parts, the most-likely weapon will be determined by using the weapon ontology.
Mission Specialist Hawley works with the SWUIS experiment
2013-11-18
STS093-350-022 (22-27 July 1999) --- Astronaut Steven A. Hawley, mission specialist, works with the Southwest Ultraviolet Imaging System (SWUIS) experiment onboard the Earth-orbiting Space Shuttle Columbia. The SWUIS is based around a Maksutov-design Ultraviolet (UV) telescope and a UV-sensitive, image-intensified Charge-Coupled Device (CCD) camera that frames at video frame rates.
Optimal erasure protection for scalably compressed video streams with limited retransmission.
Taubman, David; Thie, Johnson
2005-08-01
This paper shows how the priority encoding transmission (PET) framework may be leveraged to exploit both unequal error protection and limited retransmission for RD-optimized delivery of streaming media. Previous work on scalable media protection with PET has largely ignored the possibility of retransmission. Conversely, the PET framework has not been harnessed by the substantial body of previous work on RD optimized hybrid forward error correction/automatic repeat request schemes. We limit our attention to sources which can be modeled as independently compressed frames (e.g., video frames), where each element in the scalable representation of each frame can be transmitted in one or both of two transmission slots. An optimization algorithm determines the level of protection which should be assigned to each element in each slot, subject to transmission bandwidth constraints. To balance the protection assigned to elements which are being transmitted for the first time with those which are being retransmitted, the proposed algorithm formulates a collection of hypotheses concerning its own behavior in future transmission slots. We show how the PET framework allows for a decoupled optimization algorithm with only modest complexity. Experimental results obtained with Motion JPEG2000 compressed video demonstrate that substantial performance benefits can be obtained using the proposed framework.
Efficient Lane Boundary Detection with Spatial-Temporal Knowledge Filtering
Nan, Zhixiong; Wei, Ping; Xu, Linhai; Zheng, Nanning
2016-01-01
Lane boundary detection technology has progressed rapidly over the past few decades. However, many challenges that often lead to lane detection unavailability remain to be solved. In this paper, we propose a spatial-temporal knowledge filtering model to detect lane boundaries in videos. To address the challenges of structure variation, large noise and complex illumination, this model incorporates prior spatial-temporal knowledge with lane appearance features to jointly identify lane boundaries. The model first extracts line segments in video frames. Two novel filters—the Crossing Point Filter (CPF) and the Structure Triangle Filter (STF)—are proposed to filter out the noisy line segments. The two filters introduce spatial structure constraints and temporal location constraints into lane detection, which represent the spatial-temporal knowledge about lanes. A straight line or curve model determined by a state machine is used to fit the line segments to finally output the lane boundaries. We collected a challenging realistic traffic scene dataset. The experimental results on this dataset and other standard dataset demonstrate the strength of our method. The proposed method has been successfully applied to our autonomous experimental vehicle. PMID:27529248
Video enhancement workbench: an operational real-time video image processing system
NASA Astrophysics Data System (ADS)
Yool, Stephen R.; Van Vactor, David L.; Smedley, Kirk G.
1993-01-01
Video image sequences can be exploited in real-time, giving analysts rapid access to information for military or criminal investigations. Video-rate dynamic range adjustment subdues fluctuations in image intensity, thereby assisting discrimination of small or low- contrast objects. Contrast-regulated unsharp masking enhances differentially shadowed or otherwise low-contrast image regions. Real-time removal of localized hotspots, when combined with automatic histogram equalization, may enhance resolution of objects directly adjacent. In video imagery corrupted by zero-mean noise, real-time frame averaging can assist resolution and location of small or low-contrast objects. To maximize analyst efficiency, lengthy video sequences can be screened automatically for low-frequency, high-magnitude events. Combined zoom, roam, and automatic dynamic range adjustment permit rapid analysis of facial features captured by video cameras recording crimes in progress. When trying to resolve small objects in murky seawater, stereo video places the moving imagery in an optimal setting for human interpretation.
Gradual cut detection using low-level vision for digital video
NASA Astrophysics Data System (ADS)
Lee, Jae-Hyun; Choi, Yeun-Sung; Jang, Ok-bae
1996-09-01
Digital video computing and organization is one of the important issues in multimedia system, signal compression, or database. Video should be segmented into shots to be used for identification and indexing. This approach requires a suitable method to automatically locate cut points in order to separate shot in a video. Automatic cut detection to isolate shots in a video has received considerable attention due to many practical applications; our video database, browsing, authoring system, retrieval and movie. Previous studies are based on a set of difference mechanisms and they measured the content changes between video frames. But they could not detect more special effects which include dissolve, wipe, fade-in, fade-out, and structured flashing. In this paper, a new cut detection method for gradual transition based on computer vision techniques is proposed. And then, experimental results applied to commercial video are presented and evaluated.
Context indexing of digital cardiac ultrasound records in PACS
NASA Astrophysics Data System (ADS)
Lobodzinski, S. Suave; Meszaros, Georg N.
1998-07-01
Recent wide adoption of the DICOM 3.0 standard by ultrasound equipment vendors created a need for practical clinical implementations of cardiac imaging study visualization, management and archiving, DICOM 3.0 defines only a logical and physical format for exchanging image data (still images, video, patient and study demographics). All DICOM compliant imaging studies must presently be archived on a 650 Mb recordable compact disk. This is a severe limitation for ultrasound applications where studies of 3 to 10 minutes long are a common practice. In addition, DICOM digital echocardiography objects require physiological signal indexing, content segmentation and characterization. Since DICOM 3.0 is an interchange standard only, it does not define how to database composite video objects. The goal of this research was therefore to address the issues of efficient storage, retrieval and management of DICOM compliant cardiac video studies in a distributed PACS environment. Our Web based implementation has the advantage of accommodating both DICOM defined entity-relation modules (equipment data, patient data, video format, etc.) in standard relational database tables and digital indexed video with its attributes in an object relational database. Object relational data model facilitates content indexing of full motion cardiac imaging studies through bi-directional hyperlink generation that tie searchable video attributes and related objects to individual video frames in the temporal domain. Benefits realized from use of bi-directionally hyperlinked data models in an object relational database include: (1) real time video indexing during image acquisition, (2) random access and frame accurate instant playback of previously recorded full motion imaging data, and (3) time savings from faster and more accurate access to data through multiple navigation mechanisms such as multidimensional queries on an index, queries on a hyperlink attribute, free search and browsing.
Overlaid caption extraction in news video based on SVM
NASA Astrophysics Data System (ADS)
Liu, Manman; Su, Yuting; Ji, Zhong
2007-11-01
Overlaid caption in news video often carries condensed semantic information which is key cues for content-based video indexing and retrieval. However, it is still a challenging work to extract caption from video because of its complex background and low resolution. In this paper, we propose an effective overlaid caption extraction approach for news video. We first scan the video key frames using a small window, and then classify the blocks into the text and non-text ones via support vector machine (SVM), with statistical features extracted from the gray level co-occurrence matrices, the LH and HL sub-bands wavelet coefficients and the orientated edge intensity ratios. Finally morphological filtering and projection profile analysis are employed to localize and refine the candidate caption regions. Experiments show its high performance on four 30-minute news video programs.
A method for automatically abstracting visual documents
NASA Technical Reports Server (NTRS)
Rorvig, Mark E.
1994-01-01
Visual documents--motion sequences on film, videotape, and digital recording--constitute a major source of information for the Space Agency, as well as all other government and private sector entities. This article describes a method for automatically selecting key frames from visual documents. These frames may in turn be used to represent the total image sequence of visual documents in visual libraries, hypermedia systems, and training algorithm reduces 51 minutes of video sequences to 134 frames; a reduction of information in the range of 700:1.
Implementation of smart phone video plethysmography and dependence on lighting parameters.
Fletcher, Richard Ribón; Chamberlain, Daniel; Paggi, Nicholas; Deng, Xinyue
2015-08-01
The remote measurement of heart rate (HR) and heart rate variability (HRV) via a digital camera (video plethysmography) has emerged as an area of great interest for biomedical and health applications. While a few implementations of video plethysmography have been demonstrated on smart phones under controlled lighting conditions, it has been challenging to create a general scalable solution due to the large variability in smart phone hardware performance, software architecture, and the variable response to lighting parameters. In this context, we present a selfcontained smart phone implementation of video plethysmography for Android OS, which employs both stochastic and deterministic algorithms, and we use this to study the effect of lighting parameters (illuminance, color spectrum) on the accuracy of the remote HR measurement. Using two different phone models, we present the median HR error for five different video plethysmography algorithms under three different types of lighting (natural sunlight, compact fluorescent, and halogen incandescent) and variations in brightness. For most algorithms, we found the optimum light brightness to be in the range 1000-4000 lux and the optimum lighting types to be compact fluorescent and natural light. Moderate errors were found for most algorithms with some devices under conditions of low-brightness (<;500 lux) and highbrightness (>4000 lux). Our analysis also identified camera frame rate jitter as a major source of variability and error across different phone models, but this can be largely corrected through non-linear resampling. Based on testing with six human subjects, our real-time Android implementation successfully predicted the measured HR with a median error of -0.31 bpm, and an inter-quartile range of 2.1bpm.
Automated Wing Twist And Bending Measurements Under Aerodynamic Load
NASA Technical Reports Server (NTRS)
Burner, A. W.; Martinson, S. D.
1996-01-01
An automated system to measure the change in wing twist and bending under aerodynamic load in a wind tunnel is described. The basic instrumentation consists of a single CCD video camera and a frame grabber interfaced to a computer. The technique is based upon a single view photogrammetric determination of two dimensional coordinates of wing targets with a fixed (and known) third dimensional coordinate, namely the spanwise location. The measurement technique has been used successfully at the National Transonic Facility, the Transonic Dynamics Tunnel, and the Unitary Plan Wind Tunnel at NASA Langley Research Center. The advantages and limitations (including targeting) of the technique are discussed. A major consideration in the development was that use of the technique must not appreciably reduce wind tunnel productivity.
Linear array of photodiodes to track a human speaker for video recording
NASA Astrophysics Data System (ADS)
DeTone, D.; Neal, H.; Lougheed, R.
2012-12-01
Communication and collaboration using stored digital media has garnered more interest by many areas of business, government and education in recent years. This is due primarily to improvements in the quality of cameras and speed of computers. An advantage of digital media is that it can serve as an effective alternative when physical interaction is not possible. Video recordings that allow for viewers to discern a presenter's facial features, lips and hand motions are more effective than videos that do not. To attain this, one must maintain a video capture in which the speaker occupies a significant portion of the captured pixels. However, camera operators are costly, and often do an imperfect job of tracking presenters in unrehearsed situations. This creates motivation for a robust, automated system that directs a video camera to follow a presenter as he or she walks anywhere in the front of a lecture hall or large conference room. Such a system is presented. The system consists of a commercial, off-the-shelf pan/tilt/zoom (PTZ) color video camera, a necklace of infrared LEDs and a linear photodiode array detector. Electronic output from the photodiode array is processed to generate the location of the LED necklace, which is worn by a human speaker. The computer controls the video camera movements to record video of the speaker. The speaker's vertical position and depth are assumed to remain relatively constant- the video camera is sent only panning (horizontal) movement commands. The LED necklace is flashed at 70Hz at a 50% duty cycle to provide noise-filtering capability. The benefit to using a photodiode array versus a standard video camera is its higher frame rate (4kHz vs. 60Hz). The higher frame rate allows for the filtering of infrared noise such as sunlight and indoor lighting-a capability absent from other tracking technologies. The system has been tested in a large lecture hall and is shown to be effective.
Obesity in the new media: a content analysis of obesity videos on YouTube.
Yoo, Jina H; Kim, Junghyun
2012-01-01
This study examines (1) how the topics of obesity are framed and (2) how obese persons are portrayed on YouTube video clips. The analysis of 417 obesity videos revealed that a newer medium like YouTube, similar to traditional media, appeared to assign responsibility and solutions for obesity mainly to individuals and their behaviors, although there was a tendency that some video categories have started to show other causal claims or solutions. However, due to the prevailing emphasis on personal causes and solutions, numerous YouTube videos had a theme of weight-based teasing, or showed obese persons engaging in stereotypical eating behaviors. We discuss a potential impact of YouTube videos on shaping viewers' perceptions about obesity and further reinforcing stigmatization of obese persons.
NASA Astrophysics Data System (ADS)
Sa, Qila; Wang, Zhihui
2018-03-01
At present, content-based video retrieval (CBVR) is the most mainstream video retrieval method, using the video features of its own to perform automatic identification and retrieval. This method involves a key technology, i.e. shot segmentation. In this paper, the method of automatic video shot boundary detection with K-means clustering and improved adaptive dual threshold comparison is proposed. First, extract the visual features of every frame and divide them into two categories using K-means clustering algorithm, namely, one with significant change and one with no significant change. Then, as to the classification results, utilize the improved adaptive dual threshold comparison method to determine the abrupt as well as gradual shot boundaries.Finally, achieve automatic video shot boundary detection system.
Development of optical packet and circuit integrated ring network testbed.
Furukawa, Hideaki; Harai, Hiroaki; Miyazawa, Takaya; Shinada, Satoshi; Kawasaki, Wataru; Wada, Naoya
2011-12-12
We developed novel integrated optical packet and circuit switch-node equipment. Compared with our previous equipment, a polarization-independent 4 × 4 semiconductor optical amplifier switch subsystem, gain-controlled optical amplifiers, and one 100 Gbps optical packet transponder and seven 10 Gbps optical path transponders with 10 Gigabit Ethernet (10GbE) client-interfaces were newly installed in the present system. The switch and amplifiers can provide more stable operation without equipment adjustments for the frequent polarization-rotations and dynamic packet-rate changes of optical packets. We constructed an optical packet and circuit integrated ring network testbed consisting of two switch nodes for accelerating network development, and we demonstrated 66 km fiber transmission and switching operation of multiplexed 14-wavelength 10 Gbps optical paths and 100 Gbps optical packets encapsulating 10GbE frames. Error-free (frame error rate < 1×10(-4)) operation was achieved with optical packets of various packet lengths and packet rates, and stable operation of the network testbed was confirmed. In addition, 4K uncompressed video streaming over OPS links was successfully demonstrated. © 2011 Optical Society of America
Infrared video based gas leak detection method using modified FAST features
NASA Astrophysics Data System (ADS)
Wang, Min; Hong, Hanyu; Huang, Likun
2018-03-01
In order to detect the invisible leaking gas that is usually dangerous and easily leads to fire or explosion in time, many new technologies have arisen in the recent years, among which the infrared video based gas leak detection is widely recognized as a viable tool. However, all the moving regions of a video frame can be detected as leaking gas regions by the existing infrared video based gas leak detection methods, without discriminating the property of each detected region, e.g., a walking person in a video frame may be also detected as gas by the current gas leak detection methods.To solve this problem, we propose a novel infrared video based gas leak detection method in this paper, which is able to effectively suppress strong motion disturbances.Firstly, the Gaussian mixture model(GMM) is used to establish the background model.Then due to the observation that the shapes of gas regions are different from most rigid moving objects, we modify the Features From Accelerated Segment Test (FAST) algorithm and use the modified FAST (mFAST) features to describe each connected component. In view of the fact that the statistical property of the mFAST features extracted from gas regions is different from that of other motion regions, we propose the Pixel-Per-Points (PPP) condition to further select candidate connected components.Experimental results show that the algorithm is able to effectively suppress most strong motion disturbances and achieve real-time leaking gas detection.
NASA Astrophysics Data System (ADS)
Cicala, L.; Angelino, C. V.; Ruatta, G.; Baccaglini, E.; Raimondo, N.
2015-08-01
Unmanned Aerial Vehicles (UAVs) are often employed to collect high resolution images in order to perform image mosaicking and/or 3D reconstruction. Images are usually stored on board and then processed with on-ground desktop software. In such a way the computational load, and hence the power consumption, is moved on ground, leaving on board only the task of storing data. Such an approach is important in the case of small multi-rotorcraft UAVs because of their low endurance due to the short battery life. Images can be stored on board with either still image or video data compression. Still image system are preferred when low frame rates are involved, because video coding systems are based on motion estimation and compensation algorithms which fail when the motion vectors are significantly long and when the overlapping between subsequent frames is very small. In this scenario, UAVs attitude and position metadata from the Inertial Navigation System (INS) can be employed to estimate global motion parameters without video analysis. A low complexity image analysis can be still performed in order to refine the motion field estimated using only the metadata. In this work, we propose to use this refinement step in order to improve the position and attitude estimation produced by the navigation system in order to maximize the encoder performance. Experiments are performed on both simulated and real world video sequences.
NASA Astrophysics Data System (ADS)
Morishima, Shigeo; Nakamura, Satoshi
2004-12-01
We introduce a multimodal English-to-Japanese and Japanese-to-English translation system that also translates the speaker's speech motion by synchronizing it to the translated speech. This system also introduces both a face synthesis technique that can generate any viseme lip shape and a face tracking technique that can estimate the original position and rotation of a speaker's face in an image sequence. To retain the speaker's facial expression, we substitute only the speech organ's image with the synthesized one, which is made by a 3D wire-frame model that is adaptable to any speaker. Our approach provides translated image synthesis with an extremely small database. The tracking motion of the face from a video image is performed by template matching. In this system, the translation and rotation of the face are detected by using a 3D personal face model whose texture is captured from a video frame. We also propose a method to customize the personal face model by using our GUI tool. By combining these techniques and the translated voice synthesis technique, an automatic multimodal translation can be achieved that is suitable for video mail or automatic dubbing systems into other languages.
MPEG-7 audio-visual indexing test-bed for video retrieval
NASA Astrophysics Data System (ADS)
Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian
2003-12-01
This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine and accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, end-user will be able to ask to point on movie shots in the database that have been produced in a specific year, that contain the face of a specific actor who tells a specific word and in which there is no motion activity. Video streaming is performed over the high bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.
Study on a High Compression Processing for Video-on-Demand e-learning System
NASA Astrophysics Data System (ADS)
Nomura, Yoshihiko; Matsuda, Ryutaro; Sakamoto, Ryota; Sugiura, Tokuhiro; Matsui, Hirokazu; Kato, Norihiko
The authors proposed a high-quality and small-capacity lecture-video-file creating system for distance e-learning system. Examining the feature of the lecturing scene, the authors ingeniously employ two kinds of image-capturing equipment having complementary characteristics : one is a digital video camera with a low resolution and a high frame rate, and the other is a digital still camera with a high resolution and a very low frame rate. By managing the two kinds of image-capturing equipment, and by integrating them with image processing, we can produce course materials with the greatly reduced file capacity : the course materials satisfy the requirements both for the temporal resolution to see the lecturer's point-indicating actions and for the high spatial resolution to read the small written letters. As a result of a comparative experiment, the e-lecture using the proposed system was confirmed to be more effective than an ordinary lecture from the viewpoint of educational effect.
The AAPM/RSNA physics tutorial for residents: digital fluoroscopy.
Pooley, R A; McKinney, J M; Miller, D A
2001-01-01
A digital fluoroscopy system is most commonly configured as a conventional fluoroscopy system (tube, table, image intensifier, video system) in which the analog video signal is converted to and stored as digital data. Other methods of acquiring the digital data (eg, digital or charge-coupled device video and flat-panel detectors) will become more prevalent in the future. Fundamental concepts related to digital imaging in general include binary numbers, pixels, and gray levels. Digital image data allow the convenient use of several image processing techniques including last image hold, gray-scale processing, temporal frame averaging, and edge enhancement. Real-time subtraction of digital fluoroscopic images after injection of contrast material has led to widespread use of digital subtraction angiography (DSA). Additional image processing techniques used with DSA include road mapping, image fade, mask pixel shift, frame summation, and vessel size measurement. Peripheral angiography performed with an automatic moving table allows imaging of the peripheral vasculature with a single contrast material injection.
Study of atmospheric discharges caracteristics using with a standard video camera
NASA Astrophysics Data System (ADS)
Ferraz, E. C.; Saba, M. M. F.
In this study is showed some preliminary statistics on lightning characteristics such as: flash multiplicity, number of ground contact points, formation of new and altered channels and presence of continuous current in the strokes that form the flash. The analysis is based on the images of a standard video camera (30 frames.s-1). The results obtained for some flashes will be compared to the images of a high-speed CCD camera (1000 frames.s-1). The camera observing site is located in São José dos Campos (23°S,46° W) at an altitude of 630m. This observational site has nearly 360° field of view at a height of 25m. It is possible to visualize distant thunderstorms occurring within a radius of 25km from the site. The room, situated over a metal structure, has water and power supplies, a telephone line and a small crane on the roof. KEY WORDS: Video images, Lightning, Multiplicity, Stroke.
A portable wireless power transmission system for video capsule endoscopes.
Shi, Yu; Yan, Guozheng; Zhu, Bingquan; Liu, Gang
2015-01-01
Wireless power transmission (WPT) technology can solve the energy shortage problem of the video capsule endoscope (VCE) powered by button batteries, but the fixed platform limited its clinical application. This paper presents a portable WPT system for VCE. Besides portability, power transfer efficiency and stability are considered as the main indexes of optimization design of the system, which consists of the transmitting coil structure, portable control box, operating frequency, magnetic core and winding of receiving coil. Upon the above principles, the correlation parameters are measured, compared and chosen. Finally, through experiments on the platform, the methods are tested and evaluated. In the gastrointestinal tract of small pig, the VCE is supplied with sufficient energy by the WPT system, and the energy conversion efficiency is 2.8%. The video obtained is clear with a resolution of 320×240 and a frame rate of 30 frames per second. The experiments verify the feasibility of design scheme, and further improvement direction is discussed.
Video see-through augmented reality for oral and maxillofacial surgery.
Wang, Junchen; Suenaga, Hideyuki; Yang, Liangjing; Kobayashi, Etsuko; Sakuma, Ichiro
2017-06-01
Oral and maxillofacial surgery has not been benefitting from image guidance techniques owing to the limitations in image registration. A real-time markerless image registration method is proposed by integrating a shape matching method into a 2D tracking framework. The image registration is performed by matching the patient's teeth model with intraoperative video to obtain its pose. The resulting pose is used to overlay relevant models from the same CT space on the camera video for augmented reality. The proposed system was evaluated on mandible/maxilla phantoms, a volunteer and clinical data. Experimental results show that the target overlay error is about 1 mm, and the frame rate of registration update yields 3-5 frames per second with a 4 K camera. The significance of this work lies in its simplicity in clinical setting and the seamless integration into the current medical procedure with satisfactory response time and overlay accuracy. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Nonchronological video synopsis and indexing.
Pritch, Yael; Rav-Acha, Alex; Peleg, Shmuel
2008-11-01
The amount of captured video is growing with the increased numbers of video cameras, especially the increase of millions of surveillance cameras that operate 24 hours a day. Since video browsing and retrieval is time consuming, most captured video is never watched or examined. Video synopsis is an effective tool for browsing and indexing of such a video. It provides a short video representation, while preserving the essential activities of the original video. The activity in the video is condensed into a shorter period by simultaneously showing multiple activities, even when they originally occurred at different times. The synopsis video is also an index into the original video by pointing to the original time of each activity. Video Synopsis can be applied to create a synopsis of an endless video streams, as generated by webcams and by surveillance cameras. It can address queries like "Show in one minute the synopsis of this camera broadcast during the past day''. This process includes two major phases: (i) An online conversion of the endless video stream into a database of objects and activities (rather than frames). (ii) A response phase, generating the video synopsis as a response to the user's query.
Linear momentum, angular momentum and energy in the linear collision between two balls
NASA Astrophysics Data System (ADS)
Hanisch, C.; Hofmann, F.; Ziese, M.
2018-01-01
In an experiment of the basic physics laboratory, kinematical motion processes were analysed. The motion was recorded with a standard video camera having frame rates from 30 to 240 fps the videos were processed using video analysis software. Video detection was used to analyse the symmetric one-dimensional collision between two balls. Conservation of linear and angular momentum lead to a crossover from rolling to sliding directly after the collision. By variation of the rolling radius the system could be tuned from a regime in which the balls move away from each other after the collision to a situation in which they re-collide.
Butail, Sachit; Salerno, Philip; Bollt, Erik M; Porfiri, Maurizio
2015-12-01
Traditional approaches for the analysis of collective behavior entail digitizing the position of each individual, followed by evaluation of pertinent group observables, such as cohesion and polarization. Machine learning may enable considerable advancements in this area by affording the classification of these observables directly from images. While such methods have been successfully implemented in the classification of individual behavior, their potential in the study collective behavior is largely untested. In this paper, we compare three methods for the analysis of collective behavior: simple tracking (ST) without resolving occlusions, machine learning with real data (MLR), and machine learning with synthetic data (MLS). These methods are evaluated on videos recorded from an experiment studying the effect of ambient light on the shoaling tendency of Giant danios. In particular, we compute average nearest-neighbor distance (ANND) and polarization using the three methods and compare the values with manually-verified ground-truth data. To further assess possible dependence on sampling rate for computing ANND, the comparison is also performed at a low frame rate. Results show that while ST is the most accurate at higher frame rate for both ANND and polarization, at low frame rate for ANND there is no significant difference in accuracy between the three methods. In terms of computational speed, MLR and MLS take significantly less time to process an image, with MLS better addressing constraints related to generation of training data. Finally, all methods are able to successfully detect a significant difference in ANND as the ambient light intensity is varied irrespective of the direction of intensity change.
Design considerations for computationally constrained two-way real-time video communication
NASA Astrophysics Data System (ADS)
Bivolarski, Lazar M.; Saunders, Steven E.; Ralston, John D.
2009-08-01
Today's video codecs have evolved primarily to meet the requirements of the motion picture and broadcast industries, where high-complexity studio encoding can be utilized to create highly-compressed master copies that are then broadcast one-way for playback using less-expensive, lower-complexity consumer devices for decoding and playback. Related standards activities have largely ignored the computational complexity and bandwidth constraints of wireless or Internet based real-time video communications using devices such as cell phones or webcams. Telecommunications industry efforts to develop and standardize video codecs for applications such as video telephony and video conferencing have not yielded image size, quality, and frame-rate performance that match today's consumer expectations and market requirements for Internet and mobile video services. This paper reviews the constraints and the corresponding video codec requirements imposed by real-time, 2-way mobile video applications. Several promising elements of a new mobile video codec architecture are identified, and more comprehensive computational complexity metrics and video quality metrics are proposed in order to support the design, testing, and standardization of these new mobile video codecs.
ERIC Educational Resources Information Center
d'Alessio, Matthew; Lundquist, Loraine
2013-01-01
Each year our physical science class for pre-service elementary teachers launches water-powered rockets based on the activity from NASA. We analyze the rocket flight using data from frame-by-frame video analysis of the launches. Before developing the methods presented in this paper, we noticed our students were mired in calculation details while…
Identification and annotation of erotic film based on content analysis
NASA Astrophysics Data System (ADS)
Wang, Donghui; Zhu, Miaoliang; Yuan, Xin; Qian, Hui
2005-02-01
The paper brings forward a new method for identifying and annotating erotic films based on content analysis. First, the film is decomposed to video and audio stream. Then, the video stream is segmented into shots and key frames are extracted from each shot. We filter the shots that include potential erotic content by finding the nude human body in key frames. A Gaussian model in YCbCr color space for detecting skin region is presented. An external polygon that covered the skin regions is used for the approximation of the human body. Last, we give the degree of the nudity by calculating the ratio of skin area to whole body area with weighted parameters. The result of the experiment shows the effectiveness of our method.
Video shot boundary detection using region-growing-based watershed method
NASA Astrophysics Data System (ADS)
Wang, Jinsong; Patel, Nilesh; Grosky, William
2004-10-01
In this paper, a novel shot boundary detection approach is presented, based on the popular region growing segmentation method - Watershed segmentation. In image processing, gray-scale pictures could be considered as topographic reliefs, in which the numerical value of each pixel of a given image represents the elevation at that point. Watershed method segments images by filling up basins with water starting at local minima, and at points where water coming from different basins meet, dams are built. In our method, each frame in the video sequences is first transformed from the feature space into the topographic space based on a density function. Low-level features are extracted from frame to frame. Each frame is then treated as a point in the feature space. The density of each point is defined as the sum of the influence functions of all neighboring data points. The height function that is originally used in Watershed segmentation is then replaced by inverting the density at the point. Thus, all the highest density values are transformed into local minima. Subsequently, Watershed segmentation is performed in the topographic space. The intuitive idea under our method is that frames within a shot are highly agglomerative in the feature space and have higher possibilities to be merged together, while those frames between shots representing the shot changes are not, hence they have less density values and are less likely to be clustered by carefully extracting the markers and choosing the stopping criterion.
Virtual viewpoint synthesis in multi-view video system
NASA Astrophysics Data System (ADS)
Li, Fang; Yang, Shiqiang
2005-07-01
In this paper, we present a virtual viewpoint video synthesis algorithm to satisfy the following three aims: low computing consuming; real time interpolation and acceptable video quality. In contrast with previous technologies, this method obtain incompletely 3D structure using neighbor video sources instead of getting total 3D information with all video sources, so that the computation is reduced greatly. So we demonstrate our interactive multi-view video synthesis algorithm in a personal computer. Furthermore, adopting the method of choosing feature points to build the correspondence between the frames captured by neighbor cameras, we need not require camera calibration. Finally, our method can be used when the angle between neighbor cameras is 25-30 degrees that it is much larger than common computer vision experiments. In this way, our method can be applied into many applications such as sports live, video conference, etc.
NASA Astrophysics Data System (ADS)
Kim, Kyung-Su; Lee, Hae-Yeoun; Im, Dong-Hyuck; Lee, Heung-Kyu
Commercial markets employ digital right management (DRM) systems to protect valuable high-definition (HD) quality videos. DRM system uses watermarking to provide copyright protection and ownership authentication of multimedia contents. We propose a real-time video watermarking scheme for HD video in the uncompressed domain. Especially, our approach is in aspect of practical perspectives to satisfy perceptual quality, real-time processing, and robustness requirements. We simplify and optimize human visual system mask for real-time performance and also apply dithering technique for invisibility. Extensive experiments are performed to prove that the proposed scheme satisfies the invisibility, real-time processing, and robustness requirements against video processing attacks. We concentrate upon video processing attacks that commonly occur in HD quality videos to display on portable devices. These attacks include not only scaling and low bit-rate encoding, but also malicious attacks such as format conversion and frame rate change.
A novel Kalman filter based video image processing scheme for two-photon fluorescence microscopy
NASA Astrophysics Data System (ADS)
Sun, Wenqing; Huang, Xia; Li, Chunqiang; Xiao, Chuan; Qian, Wei
2016-03-01
Two-photon fluorescence microscopy (TPFM) is a perfect optical imaging equipment to monitor the interaction between fast moving viruses and hosts. However, due to strong unavoidable background noises from the culture, videos obtained by this technique are too noisy to elaborate this fast infection process without video image processing. In this study, we developed a novel scheme to eliminate background noises, recover background bacteria images and improve video qualities. In our scheme, we modified and implemented the following methods for both host and virus videos: correlation method, round identification method, tree-structured nonlinear filters, Kalman filters, and cell tracking method. After these procedures, most of noises were eliminated and host images were recovered with their moving directions and speed highlighted in the videos. From the analysis of the processed videos, 93% bacteria and 98% viruses were correctly detected in each frame on average.
Algorithms for High-Speed Noninvasive Eye-Tracking System
NASA Technical Reports Server (NTRS)
Talukder, Ashit; Morookian, John-Michael; Lambert, James
2010-01-01
Two image-data-processing algorithms are essential to the successful operation of a system of electronic hardware and software that noninvasively tracks the direction of a person s gaze in real time. The system was described in High-Speed Noninvasive Eye-Tracking System (NPO-30700) NASA Tech Briefs, Vol. 31, No. 8 (August 2007), page 51. To recapitulate from the cited article: Like prior commercial noninvasive eyetracking systems, this system is based on (1) illumination of an eye by a low-power infrared light-emitting diode (LED); (2) acquisition of video images of the pupil, iris, and cornea in the reflected infrared light; (3) digitization of the images; and (4) processing the digital image data to determine the direction of gaze from the centroids of the pupil and cornea in the images. Most of the prior commercial noninvasive eyetracking systems rely on standard video cameras, which operate at frame rates of about 30 Hz. Such systems are limited to slow, full-frame operation. The video camera in the present system includes a charge-coupled-device (CCD) image detector plus electronic circuitry capable of implementing an advanced control scheme that effects readout from a small region of interest (ROI), or subwindow, of the full image. Inasmuch as the image features of interest (the cornea and pupil) typically occupy a small part of the camera frame, this ROI capability can be exploited to determine the direction of gaze at a high frame rate by reading out from the ROI that contains the cornea and pupil (but not from the rest of the image) repeatedly. One of the present algorithms exploits the ROI capability. The algorithm takes horizontal row slices and takes advantage of the symmetry of the pupil and cornea circles and of the gray-scale contrasts of the pupil and cornea with respect to other parts of the eye. The algorithm determines which horizontal image slices contain the pupil and cornea, and, on each valid slice, the end coordinates of the pupil and cornea. Information from multiple slices is then combined to robustly locate the centroids of the pupil and cornea images. The other of the two present algorithms is a modified version of an older algorithm for estimating the direction of gaze from the centroids of the pupil and cornea. The modification lies in the use of the coordinates of the centroids, rather than differences between the coordinates of the centroids, in a gaze-mapping equation. The equation locates a gaze point, defined as the intersection of the gaze axis with a surface of interest, which is typically a computer display screen (see figure). The expected advantage of the modification is to make the gaze computation less dependent on some simplifying assumptions that are sometimes not accurate
Self-Affirmation Theory and Performance Feedback: When Scoring High Makes You Feel Low.
Velez, John A; Hanus, Michael D
2016-12-01
Video games have a wide variety of benefits for players. The current study examines how video games can also increase players' willingness to internalize important but threatening self-information. Research suggests that negative information regarding a valued self-image evokes defensive strategies aimed at dismissing or discrediting the source of information. Self-Affirmation Theory proposes that affirming or bolstering an important self-image unrelated to the previous threat can be an effective strategy for reducing defensiveness. Participants in the current study completed a fictitious intelligence test and received negative or no feedback, followed by 15 minutes of video game play that resulted in positive or no feedback. Results suggest that participants who valued video game success as part of their identity exhibited less defensive strategies in the form of increased test credibility ratings and lower self-perceptions of intelligence. This suggests that performing well on a video game is an affirmational resource for players whose identities are contingent upon such success. However, results also indicate that players who did not value video game success but received positive video game feedback exhibited more defensive reactions to the negative intelligence test feedback. This suggests that while players who value video game success as part of their identity may reap benefits from video game play after a self-threat, those who do not value such success may experience more harmful effects.
Hydra Rendezvous and Docking Sensor
NASA Technical Reports Server (NTRS)
Roe, Fred; Carrington, Connie
2007-01-01
The U.S. technology to support a CEV AR&D activity is mature and was developed by NASA and supporting industry during an extensive research and development program conducted during the 1990's and early 2000 time frame at the Marshall Space Flight Center. Development and demonstration of a rendezvous/docking sensor was identified early in the AR&D Program as the critical enabling technology that allows automated proxinity operations and docking. A first generation rendezvous/docking sensor, the Video Guidance Sensor (VGS) was developed and successfully flown on STS 87 and again on STS 95, proving the concept of a video-based sensor. Advances in both video and signal processing technologies and the lessons learned from the two successful flight experiments provided a baseline for the development of a new generation of video based rendezvous/docking sensor. The Advanced Video Guidance Sensor (AVGS) has greatly increased performance and additional capability for longer-range operation. A Demonstration Automatic Rendezvous Technology (DART) flight experiment was flown in April 2005 using AVGS as the primary proximity operations sensor. Because of the absence of a docking mechanism on the target satellite, this mission did not demonstrate the ability of the sensor to coltrold ocking. Mission results indicate that the rendezvous sensor operated successfully in "spot mode" (2 km acquisition of the target, bearing data only) but was never commanded to "acquire and track" the docking target. Parts obsolescence issues prevent the construction of current design AVGS units to support the NASA Exploration initiative. This flight proven AR&D technology is being modularized and upgraded with additional capabilities through the Hydra project at the Marshall Space Flight Center. Hydra brings a unique engineering approach and sensor architecture to the table, to solve the continuing issues of parts obsolescence and multiple sensor integration. This paper presents an approach to sensor hardware trades, to address the needs of future vehicles that may rendezvous and dock with the International Space Station (ISS). It will also discuss approaches for upgrading AVGS to address parts obsolescence, and concepts for modularizing the sensor to provide configuration flexibility for multiple vehicle applications. Options for complementary sensors to be integrated into the multi-head Hydra system will also be presented. Complementary sensor options include ULTOR, a digital image correlator system that could provide relative six-degree-of-freedom information independently from AVGS, and time-of-flight sensors, which determine the range between vehicles by timing pulses that travel from the sensor to the target and back. Common targets and integrated targets, suitable for use with the multi-sensor options in Hydra, will also be addressed.
Technical and economic feasibility of integrated video service by satellite
NASA Technical Reports Server (NTRS)
Price, Kent M.; Garlow, R. K.; Henderson, T. R.; Kwan, Robert K.; White, L. W.
1992-01-01
The trends and roles of satellite based video services in the year 2010 time frame are examined based on an overall network and service model for that period. Emphasis is placed on point to point and multipoint service, but broadcast could also be accommodated. An estimate of the video traffic is made and the service and general network requirements are identified. User charges are then estimated based on several usage scenarios. In order to accommodate these traffic needs, a 28 spot beam satellite architecture with on-board processing and signal mixing is suggested.
A single frame: imaging live cells twenty-five years ago.
Fink, Rachel
2011-07-01
In the mid-1980s live-cell imaging was changed by the introduction of video techniques, allowing new ways to collect and store data. The increased resolution obtained by manipulating video signals, the ability to use time-lapse videocassette recorders to study events that happen over long time intervals, and the introduction of fluorescent probes and sensitive video cameras opened research avenues previously unavailable. The author gives a personal account of this evolution, focusing on cell migration studies at the Marine Biological Laboratory 25 years ago. Copyright © 2011 Wiley-Liss, Inc.
Practical system for generating digital mixed reality video holograms.
Song, Joongseok; Kim, Changseob; Park, Hanhoon; Park, Jong-Il
2016-07-10
We propose a practical system that can effectively mix the depth data of real and virtual objects by using a Z buffer and can quickly generate digital mixed reality video holograms by using multiple graphic processing units (GPUs). In an experiment, we verify that real objects and virtual objects can be merged naturally in free viewing angles, and the occlusion problem is well handled. Furthermore, we demonstrate that the proposed system can generate mixed reality video holograms at 7.6 frames per second. Finally, the system performance is objectively verified by users' subjective evaluations.
Improved optical flow motion estimation for digital image stabilization
NASA Astrophysics Data System (ADS)
Lai, Lijun; Xu, Zhiyong; Zhang, Xuyao
2015-11-01
Optical flow is the instantaneous motion vector at each pixel in the image frame at a time instant. The gradient-based approach for optical flow computation can't work well when the video motion is too large. To alleviate such problem, we incorporate this algorithm into a pyramid multi-resolution coarse-to-fine search strategy. Using pyramid strategy to obtain multi-resolution images; Using iterative relationship from the highest level to the lowest level to obtain inter-frames' affine parameters; Subsequence frames compensate back to the first frame to obtain stabilized sequence. The experiment results demonstrate that the promoted method has good performance in global motion estimation.
Program Helps Generate And Manage Graphics
NASA Technical Reports Server (NTRS)
Truong, L. V.
1994-01-01
Living Color Frame Maker (LCFM) computer program generates computer-graphics frames. Graphical frames saved as text files, in readable and disclosed format, easily retrieved and manipulated by user programs for wide range of real-time visual information applications. LCFM implemented in frame-based expert system for visual aids in management of systems. Monitoring, diagnosis, and/or control, diagrams of circuits or systems brought to "life" by use of designated video colors and intensities to symbolize status of hardware components (via real-time feedback from sensors). Status of systems can be displayed. Written in C++ using Borland C++ 2.0 compiler for IBM PC-series computers and compatible computers running MS-DOS.
High Speed Videometric Monitoring of Rock Breakage
NASA Astrophysics Data System (ADS)
Allemand, J.; Shortis, M. R.; Elmouttie, M. K.
2018-05-01
Estimation of rock breakage characteristics plays an important role in optimising various industrial and mining processes used for rock comminution. Although little research has been undertaken into 3D photogrammetric measurement of the progeny kinematics, there is promising potential to improve the efficacy of rock breakage characterisation. In this study, the observation of progeny kinematics was conducted using a high speed, stereo videometric system based on laboratory experiments with a drop weight impact testing system. By manually tracking individual progeny through the captured video sequences, observed progeny coordinates can be used to determine 3D trajectories and velocities, supporting the idea that high speed video can be used for rock breakage characterisation purposes. An analysis of the results showed that the high speed videometric system successfully observed progeny trajectories and showed clear projection of the progeny away from the impact location. Velocities of the progeny could also be determined based on the trajectories and the video frame rate. These results were obtained despite the limitations of the photogrammetric system and experiment processes observed in this study. Accordingly there is sufficient evidence to conclude that high speed videometric systems are capable of observing progeny kinematics from drop weight impact tests. With further optimisation of the systems and processes used, there is potential for improving the efficacy of rock breakage characterisation from measurements with high speed videometric systems.
ERIC Educational Resources Information Center
Preston, Hilary
2006-01-01
This essay investigates the collaboration between dance and choreographic practice and film/video medium in a contemporary context. By looking specifically at dance made for the camera and the proliferation of dance-film/video, critical issues will be explored that have surfaced in response to this burgeoning form. Presenting a view of avant-garde…
2D to 3D conversion implemented in different hardware
NASA Astrophysics Data System (ADS)
Ramos-Diaz, Eduardo; Gonzalez-Huitron, Victor; Ponomaryov, Volodymyr I.; Hernandez-Fragoso, Araceli
2015-02-01
Conversion of available 2D data for release in 3D content is a hot topic for providers and for success of the 3D applications, in general. It naturally completely relies on virtual view synthesis of a second view given by original 2D video. Disparity map (DM) estimation is a central task in 3D generation but still follows a very difficult problem for rendering novel images precisely. There exist different approaches in DM reconstruction, among them manually and semiautomatic methods that can produce high quality DMs but they demonstrate hard time consuming and are computationally expensive. In this paper, several hardware implementations of designed frameworks for an automatic 3D color video generation based on 2D real video sequence are proposed. The novel framework includes simultaneous processing of stereo pairs using the following blocks: CIE L*a*b* color space conversions, stereo matching via pyramidal scheme, color segmentation by k-means on an a*b* color plane, and adaptive post-filtering, DM estimation using stereo matching between left and right images (or neighboring frames in a video), adaptive post-filtering, and finally, the anaglyph 3D scene generation. Novel technique has been implemented on DSP TMS320DM648, Matlab's Simulink module over a PC with Windows 7, and using graphic card (NVIDIA Quadro K2000) demonstrating that the proposed approach can be applied in real-time processing mode. The time values needed, mean Similarity Structural Index Measure (SSIM) and Bad Matching Pixels (B) values for different hardware implementations (GPU, Single CPU, and DSP) are exposed in this paper.
High dynamic range adaptive real-time smart camera: an overview of the HDR-ARTiST project
NASA Astrophysics Data System (ADS)
Lapray, Pierre-Jean; Heyrman, Barthélémy; Ginhac, Dominique
2015-04-01
Standard cameras capture only a fraction of the information that is visible to the human visual system. This is specifically true for natural scenes including areas of low and high illumination due to transitions between sunlit and shaded areas. When capturing such a scene, many cameras are unable to store the full Dynamic Range (DR) resulting in low quality video where details are concealed in shadows or washed out by sunlight. The imaging technique that can overcome this problem is called HDR (High Dynamic Range) imaging. This paper describes a complete smart camera built around a standard off-the-shelf LDR (Low Dynamic Range) sensor and a Virtex-6 FPGA board. This smart camera called HDR-ARtiSt (High Dynamic Range Adaptive Real-time Smart camera) is able to produce a real-time HDR live video color stream by recording and combining multiple acquisitions of the same scene while varying the exposure time. This technique appears as one of the most appropriate and cheapest solution to enhance the dynamic range of real-life environments. HDR-ARtiSt embeds real-time multiple captures, HDR processing, data display and transfer of a HDR color video for a full sensor resolution (1280 1024 pixels) at 60 frames per second. The main contributions of this work are: (1) Multiple Exposure Control (MEC) dedicated to the smart image capture with alternating three exposure times that are dynamically evaluated from frame to frame, (2) Multi-streaming Memory Management Unit (MMMU) dedicated to the memory read/write operations of the three parallel video streams, corresponding to the different exposure times, (3) HRD creating by combining the video streams using a specific hardware version of the Devebecs technique, and (4) Global Tone Mapping (GTM) of the HDR scene for display on a standard LCD monitor.
NASA Astrophysics Data System (ADS)
Crone, T. J.; Knuth, F.; Marburg, A.
2016-12-01
A broad array of Earth science problems can be investigated using high-definition video imagery from the seafloor, ranging from those that are geological and geophysical in nature, to those that are biological and water-column related. A high-definition video camera was installed as part of the Ocean Observatory Initiative's core instrument suite on the Cabled Array, a real-time fiber optic data and power system that stretches from the Oregon Coast to Axial Seamount on the Juan de Fuca Ridge. This camera runs a 14-minute pan-tilt-zoom routine 8 times per day, focusing on locations of scientific interest on and near the Mushroom vent in the ASHES hydrothermal field inside the Axial caldera. The system produces 13 GB of lossless HD video every 3 hours, and at the time of this writing it has generated 2100 recordings totaling 28.5 TB since it began streaming data into the OOI archive in August of 2015. Because of the large size of this dataset, downloading the entirety of the video for long timescale investigations is not practical. We are developing a set of user-side tools for downloading single frames and frame ranges from the OOI HD camera raw data archive to aid users interested in using these data for their research. We use these tools to download about one year's worth of partial frame sets to investigate several questions regarding the hydrothermal system at ASHES, including the variability of bacterial "floc" in the water-column, and changes in high temperature fluid fluxes using optical flow techniques. We show that while these user-side tools can facilitate rudimentary scientific investigations using the HD camera data, a server-side computing environment that allows users to explore this dataset without downloading any raw video will be required for more advanced investigations to flourish.
Video-tracker trajectory analysis: who meets whom, when and where
NASA Astrophysics Data System (ADS)
Jäger, U.; Willersinn, D.
2010-04-01
Unveiling unusual or hostile events by observing manifold moving persons in a crowd is a challenging task for human operators, especially when sitting in front of monitor walls for hours. Typically, hostile events are rare. Thus, due to tiredness and negligence the operator may miss important events. In such situations, an automatic alarming system is able to support the human operator. The system incorporates a processing chain consisting of (1) people tracking, (2) event detection, (3) data retrieval, and (4) display of relevant video sequence overlaid by highlighted regions of interest. In this paper we focus on the event detection stage of the processing chain mentioned above. In our case, the selected event of interest is the encounter of people. Although being based on a rather simple trajectory analysis, this kind of event embodies great practical importance because it paves the way to answer the question "who meets whom, when and where". This, in turn, forms the basis to detect potential situations where e.g. money, weapons, drugs etc. are handed over from one person to another in crowded environments like railway stations, airports or busy streets and places etc.. The input to the trajectory analysis comes from a multi-object video-based tracking system developed at IOSB which is able to track multiple individuals within a crowd in real-time [1]. From this we calculate the inter-distances between all persons on a frame-to-frame basis. We use a sequence of simple rules based on the individuals' kinematics to detect the event mentioned above to output the frame number, the persons' IDs from the tracker and the pixel coordinates of the meeting position. Using this information, a data retrieval system may extract the corresponding part of the recorded video image sequence and finally allows for replaying the selected video clip with a highlighted region of interest to attract the operator's attention for further visual inspection.
Real-time 3D video compression for tele-immersive environments
NASA Astrophysics Data System (ADS)
Yang, Zhenyu; Cui, Yi; Anwar, Zahid; Bocchino, Robert; Kiyanclar, Nadir; Nahrstedt, Klara; Campbell, Roy H.; Yurcik, William
2006-01-01
Tele-immersive systems can improve productivity and aid communication by allowing distributed parties to exchange information via a shared immersive experience. The TEEVE research project at the University of Illinois at Urbana-Champaign and the University of California at Berkeley seeks to foster the development and use of tele-immersive environments by a holistic integration of existing components that capture, transmit, and render three-dimensional (3D) scenes in real time to convey a sense of immersive space. However, the transmission of 3D video poses significant challenges. First, it is bandwidth-intensive, as it requires the transmission of multiple large-volume 3D video streams. Second, existing schemes for 2D color video compression such as MPEG, JPEG, and H.263 cannot be applied directly because the 3D video data contains depth as well as color information. Our goal is to explore from a different angle of the 3D compression space with factors including complexity, compression ratio, quality, and real-time performance. To investigate these trade-offs, we present and evaluate two simple 3D compression schemes. For the first scheme, we use color reduction to compress the color information, which we then compress along with the depth information using zlib. For the second scheme, we use motion JPEG to compress the color information and run-length encoding followed by Huffman coding to compress the depth information. We apply both schemes to 3D videos captured from a real tele-immersive environment. Our experimental results show that: (1) the compressed data preserves enough information to communicate the 3D images effectively (min. PSNR > 40) and (2) even without inter-frame motion estimation, very high compression ratios (avg. > 15) are achievable at speeds sufficient to allow real-time communication (avg. ~ 13 ms per 3D video frame).
Time and flow-direction responses of shear-styress-sensitive liquid crystal coatings
NASA Technical Reports Server (NTRS)
Reda, Daniel C.; Muraqtore, J. J.; Heinick, James T.
1994-01-01
Time and flow-direction responses of shear-stress liquid crystal coatings were exploresd experimentally. For the time-response experiments, coatings were exposed to transient, compressible flows created during the startup and off-design operation of an injector-driven supersonic wind tunnel. Flow transients were visualized with a focusing schlieren system and recorded with a 100 frame/s color video camera.
1991-08-15
Conversely, displays Atr con- past experience to the experimental stimuli. structed %xith normal density- controlled KDE cues but %ith 5. Excluding...frame. This 3Ndisplays, gray background is displayed’ on ail introduces 50% -scintillation (density control lion even frames (labelled 1:0). Other non ...video tapes were prepared, each of whsich contained all the experimental ASL signs but distributed 1 2 3 4 into dliffereint. filter groups . Eight
Real-time video quality monitoring
NASA Astrophysics Data System (ADS)
Liu, Tao; Narvekar, Niranjan; Wang, Beibei; Ding, Ran; Zou, Dekun; Cash, Glenn; Bhagavathy, Sitaram; Bloom, Jeffrey
2011-12-01
The ITU-T Recommendation G.1070 is a standardized opinion model for video telephony applications that uses video bitrate, frame rate, and packet-loss rate to measure the video quality. However, this model was original designed as an offline quality planning tool. It cannot be directly used for quality monitoring since the above three input parameters are not readily available within a network or at the decoder. And there is a great room for the performance improvement of this quality metric. In this article, we present a real-time video quality monitoring solution based on this Recommendation. We first propose a scheme to efficiently estimate the three parameters from video bitstreams, so that it can be used as a real-time video quality monitoring tool. Furthermore, an enhanced algorithm based on the G.1070 model that provides more accurate quality prediction is proposed. Finally, to use this metric in real-world applications, we present an example emerging application of real-time quality measurement to the management of transmitted videos, especially those delivered to mobile devices.
Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video
Lee, Gil-beom; Lee, Myeong-jin; Lee, Woo-Kyung; Park, Joo-heon; Kim, Tae-Hwan
2017-01-01
Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object’s vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos. PMID:28327515
Video Recording With a GoPro in Hand and Upper Extremity Surgery.
Vara, Alexander D; Wu, John; Shin, Alexander Y; Sobol, Gregory; Wiater, Brett
2016-10-01
Video recordings of surgical procedures are an excellent tool for presentations, analyzing self-performance, illustrating publications, and educating surgeons and patients. Recording the surgeon's perspective with high-resolution video in the operating room or clinic has become readily available and advances in software improve the ease of editing these videos. A GoPro HERO 4 Silver or Black was mounted on a head strap and worn over the surgical scrub cap, above the loupes of the operating surgeon. Five live surgical cases were recorded with the camera. The videos were uploaded to a computer and subsequently edited with iMovie or the GoPro software. The optimal settings for both the Silver and Black editions, when operating room lights are used, were determined to be a narrow view, 1080p, 60 frames per second (fps), spot meter on, protune on with auto white balance, exposure compensation at -0.5, and without a polarizing lens. When the operating room lights were not used, it was determined that the standard settings for a GoPro camera were ideal for positioning and editing (4K, 15 frames per second, spot meter and protune off). The GoPro HERO 4 provides high-quality, the surgeon perspective, and a cost-effective video recording of upper extremity surgical procedures. Challenges include finding the optimal settings for each surgical procedure and the length of recording due to battery life limitations. Copyright © 2016 American Society for Surgery of the Hand. Published by Elsevier Inc. All rights reserved.
Development, reliability and validation of an infant mammalian penetration-aspiration scale
Holman, Shaina Devi; Campbell-Malone, Regina; Ding, Peng; Gierbolini-Norat, Estela M.; Griffioen, Anne M.; Inokuchi, Haruhi; Lukasik, Stacey L.; German, Rebecca Z.
2012-01-01
A penetration-aspiration scale exists for assessing airway protection in adult videofluoroscopy and fiberoptic endoscopic swallowing studies, however no such scale exists for animal models. The aim of this study was threefold to 1) develop a Penetration-Aspiration Scale (PAS) for infant mammals, 2) test the scale’s intra- and inter-rater reliability, and 3) to validate the use of the scale for distinguishing between abnormal and normal animals. After discussion and reviewing many videos, the result was a 7-Point Infant Mammal PAS. Reliability was tested by having 5 judges score 90 swallows recorded with videofluoroscopy across two time points. In these videos, the frame rate was either 30 or 60 frames per second and the animals were either normal, had a unilateral superior laryngeal nerve (SLN) lesion, or had hard palate local anesthesia. The scale was validated by having one judge score videos of both normal and SLN lesioned pigs and testing the difference using a t-test. Raters had a high intra-rater (average kappa of 0.82, intraclass correlation coefficient (ICC)= 0.92) and high inter-rater reliability (average kappa of 0.68, ICC= 0.66). There was a significant difference in reliability for videos captured at 30 and 60 frames per second for scores of 3 and 7 (p<0.001). The scale was also validated for distinguishing between normal and abnormal pigs (p<0.001). Given the increasing number of animal studies using videofluoroscopy to study dysphagia, this scale provides a valid and reliable measure of airway protection during swallowing in infant pigs that will give these animal models increased translational significance. PMID:23129423
Ghosh, Tonmoy; Fattah, Shaikh Anowarul; Wahid, Khan A
2018-01-01
Wireless capsule endoscopy (WCE) is the most advanced technology to visualize whole gastrointestinal (GI) tract in a non-invasive way. But the major disadvantage here, it takes long reviewing time, which is very laborious as continuous manual intervention is necessary. In order to reduce the burden of the clinician, in this paper, an automatic bleeding detection method for WCE video is proposed based on the color histogram of block statistics, namely CHOBS. A single pixel in WCE image may be distorted due to the capsule motion in the GI tract. Instead of considering individual pixel values, a block surrounding to that individual pixel is chosen for extracting local statistical features. By combining local block features of three different color planes of RGB color space, an index value is defined. A color histogram, which is extracted from those index values, provides distinguishable color texture feature. A feature reduction technique utilizing color histogram pattern and principal component analysis is proposed, which can drastically reduce the feature dimension. For bleeding zone detection, blocks are classified using extracted local features that do not incorporate any computational burden for feature extraction. From extensive experimentation on several WCE videos and 2300 images, which are collected from a publicly available database, a very satisfactory bleeding frame and zone detection performance is achieved in comparison to that obtained by some of the existing methods. In the case of bleeding frame detection, the accuracy, sensitivity, and specificity obtained from proposed method are 97.85%, 99.47%, and 99.15%, respectively, and in the case of bleeding zone detection, 95.75% of precision is achieved. The proposed method offers not only low feature dimension but also highly satisfactory bleeding detection performance, which even can effectively detect bleeding frame and zone in a continuous WCE video data.
The Lancashire telemedicine ambulance.
Curry, G R; Harrop, N
1998-01-01
An emergency ambulance was equipped with three video-cameras and a system for transmitting slow-scan video-pictures through a cellular telephone link to a hospital accident and emergency department. Video-pictures were trasmitted at a resolution of 320 x 240 pixels and a frame rate of 15 pictures/min. In addition, a helmet-mounted camera was used with a wireless transmission link to the ambulance and thence the hospital. Speech was transmitted by a second hand-held cellular telephone. The equipment was installed in 1996-7 and video-recordings of actual ambulance journeys were made in July 1997. The technical feasibility of the telemedicine ambulance has been demonstrated and further clinical assessment is now in progress.
An Acoustic Charge Transport Imager for High Definition Television
NASA Technical Reports Server (NTRS)
Hunt, William D.; Brennan, Kevin; May, Gary; Glenn, William E.; Richardson, Mike; Solomon, Richard
1999-01-01
This project, over its term, included funding to a variety of companies and organizations. In addition to Georgia Tech these included Florida Atlantic University with Dr. William E. Glenn as the P.I., Kodak with Mr. Mike Richardson as the P.I. and M.I.T./Polaroid with Dr. Richard Solomon as the P.I. The focus of the work conducted by these organizations was the development of camera hardware for High Definition Television (HDTV). The focus of the research at Georgia Tech was the development of new semiconductor technology to achieve a next generation solid state imager chip that would operate at a high frame rate (I 70 frames per second), operate at low light levels (via the use of avalanche photodiodes as the detector element) and contain 2 million pixels. The actual cost required to create this new semiconductor technology was probably at least 5 or 6 times the investment made under this program and hence we fell short of achieving this rather grand goal. We did, however, produce a number of spin-off technologies as a result of our efforts. These include, among others, improved avalanche photodiode structures, significant advancement of the state of understanding of ZnO/GaAs structures and significant contributions to the analysis of general GaAs semiconductor devices and the design of Surface Acoustic Wave resonator filters for wireless communication. More of these will be described in the report. The work conducted at the partner sites resulted in the development of 4 prototype HDTV cameras. The HDTV camera developed by Kodak uses the Kodak KAI-2091M high- definition monochrome image sensor. This progressively-scanned charge-coupled device (CCD) can operate at video frame rates and has 9 gm square pixels. The photosensitive area has a 16:9 aspect ratio and is consistent with the "Common Image Format" (CIF). It features an active image area of 1928 horizontal by 1084 vertical pixels and has a 55% fill factor. The camera is designed to operate in continuous mode with an output data rate of 5MHz, which gives a maximum frame rate of 4 frames per second. The MIT/Polaroid group developed two cameras under this program. The cameras have effectively four times the current video spatial resolution and at 60 frames per second are double the normal video frame rate.
3-D Velocimetry of Strombolian Explosions
NASA Astrophysics Data System (ADS)
Taddeucci, J.; Gaudin, D.; Orr, T. R.; Scarlato, P.; Houghton, B. F.; Del Bello, E.
2014-12-01
Using two synchronized high-speed cameras we were able to reconstruct the three-dimensional displacement and velocity field of bomb-sized pyroclasts in Strombolian explosions at Stromboli Volcano. Relatively low-intensity Strombolian-style activity offers a rare opportunity to observe volcanic processes that remain hidden from view during more violent explosive activity. Such processes include the ejection and emplacement of bomb-sized clasts along pure or drag-modified ballistic trajectories, in-flight bomb collision, and gas liberation dynamics. High-speed imaging of Strombolian activity has already opened new windows for the study of the abovementioned processes, but to date has only utilized two-dimensional analysis with limited motion detection and ability to record motion towards or away from the observer. To overcome this limitation, we deployed two synchronized high-speed video cameras at Stromboli. The two cameras, located sixty meters apart, filmed Strombolian explosions at 500 and 1000 frames per second and with different resolutions. Frames from the two cameras were pre-processed and combined into a single video showing frames alternating from one to the other camera. Bomb-sized pyroclasts were then manually identified and tracked in the combined video, together with fixed reference points located as close as possible to the vent. The results from manual tracking were fed to a custom software routine that, knowing the relative position of the vent and cameras, and the field of view of the latter, provided the position of each bomb relative to the reference points. By tracking tens of bombs over five to ten frames at different intervals during one explosion, we were able to reconstruct the three-dimensional evolution of the displacement and velocity fields of bomb-sized pyroclasts during individual Strombolian explosions. Shifting jet directivity and dispersal angle clearly appear from the three-dimensional analysis.
NASA Astrophysics Data System (ADS)
El-Shafai, W.; El-Bakary, E. M.; El-Rabaie, S.; Zahran, O.; El-Halawany, M.; Abd El-Samie, F. E.
2017-06-01
Three-Dimensional Multi-View Video (3D-MVV) transmission over wireless networks suffers from Macro-Blocks losses due to either packet dropping or fading-motivated bit errors. Thus, the robust performance of 3D-MVV transmission schemes over wireless channels becomes a recent considerable hot research issue due to the restricted resources and the presence of severe channel errors. The 3D-MVV is composed of multiple video streams shot by several cameras around a single object, simultaneously. Therefore, it is an urgent task to achieve high compression ratios to meet future bandwidth constraints. Unfortunately, the highly-compressed 3D-MVV data becomes more sensitive and vulnerable to packet losses, especially in the case of heavy channel faults. Thus, in this paper, we suggest the application of a chaotic Baker interleaving approach with equalization and convolution coding for efficient Singular Value Decomposition (SVD) watermarked 3D-MVV transmission over an Orthogonal Frequency Division Multiplexing wireless system. Rayleigh fading and Additive White Gaussian Noise are considered in the real scenario of 3D-MVV transmission. The SVD watermarked 3D-MVV frames are primarily converted to their luminance and chrominance components, which are then converted to binary data format. After that, chaotic interleaving is applied prior to the modulation process. It is used to reduce the channel effects on the transmitted bit streams and it also adds a degree of encryption to the transmitted 3D-MVV frames. To test the performance of the proposed framework; several simulation experiments on different SVD watermarked 3D-MVV frames have been executed. The experimental results show that the received SVD watermarked 3D-MVV frames still have high Peak Signal-to-Noise Ratios and watermark extraction is possible in the proposed framework.
Motion-seeded object-based attention for dynamic visual imagery
NASA Astrophysics Data System (ADS)
Huber, David J.; Khosla, Deepak; Kim, Kyungnam
2017-05-01
This paper† describes a novel system that finds and segments "objects of interest" from dynamic imagery (video) that (1) processes each frame using an advanced motion algorithm that pulls out regions that exhibit anomalous motion, and (2) extracts the boundary of each object of interest using a biologically-inspired segmentation algorithm based on feature contours. The system uses a series of modular, parallel algorithms, which allows many complicated operations to be carried out by the system in a very short time, and can be used as a front-end to a larger system that includes object recognition and scene understanding modules. Using this method, we show 90% accuracy with fewer than 0.1 false positives per frame of video, which represents a significant improvement over detection using a baseline attention algorithm.
Ho, B T; Tsai, M J; Wei, J; Ma, M; Saipetch, P
1996-01-01
A new method of video compression for angiographic images has been developed to achieve high compression ratio (~20:1) while eliminating block artifacts which leads to loss of diagnostic accuracy. This method adopts motion picture experts group's (MPEGs) motion compensated prediction to takes advantage of frame to frame correlation. However, in contrast to MPEG, the error images arising from mismatches in the motion estimation are encoded by discrete wavelet transform (DWT) rather than block discrete cosine transform (DCT). Furthermore, the authors developed a classification scheme which label each block in an image as intra, error, or background type and encode it accordingly. This hybrid coding can significantly improve the compression efficiency in certain eases. This method can be generalized for any dynamic image sequences applications sensitive to block artifacts.
Instantaneous phase-shifting Fizeau interferometry with high-speed pixelated phase-mask camera
NASA Astrophysics Data System (ADS)
Yatagai, Toyohiko; Jackin, Boaz Jessie; Ono, Akira; Kiyohara, Kosuke; Noguchi, Masato; Yoshii, Minoru; Kiyohara, Motosuke; Niwa, Hayato; Ikuo, Kazuyuki; Onuma, Takashi
2015-08-01
A Fizeou interferometer with instantaneous phase-shifting ability using a Wollaston prism is designed. to measure dynamic phase change of objects, a high-speed video camera of 10-5s of shutter speed is used with a pixelated phase-mask of 1024 × 1024 elements. The light source used is a laser of wavelength 532 nm which is split into orthogonal polarization states by passing through a Wollaston prism. By adjusting the tilt of the reference surface it is possible to make the reference and object beam with orthogonal polarizations states to coincide and interfere. Then the pixelated phase-mask camera calculate the phase changes and hence the optical path length difference. Vibration of speakers and turbulence of air flow were successfully measured in 7,000 frames/sec.
NASA Technical Reports Server (NTRS)
Udomkesmalee, Suraphol; Padgett, Curtis; Zhu, David; Lung, Gerald; Howard, Ayanna
2000-01-01
A three-dimensional microelectronic device (3DANN-R) capable of performing general image convolution at the speed of 1012 operations/second (ops) in a volume of less than 1.5 cubic centimeter has been successfully built under the BMDO/JPL VIGILANTE program. 3DANN-R was developed in partnership with Irvine Sensors Corp., Costa Mesa, California. 3DANN-R is a sugar-cube-sized, low power image convolution engine that in its core computation circuitry is capable of performing 64 image convolutions with large (64x64) windows at video frame rates. This paper explores potential applications of 3DANN-R such as target recognition, SAR and hyperspectral data processing, and general machine vision using real data and discuss technical challenges for providing deployable systems for BMDO surveillance and interceptor programs.
An Automatic Portable Telecine Camera.
1978-08-01
five television frames to achieve synchronous operation, that is about 0.2 second. 6.3 Video recorder noise imnunity The synchronisation pulse separator...display is filmed by a modified 16 am cine camera driven by a control unit in which the camera supply voltage is derived from the field synchronisation ...pulses of the video signal. Automatic synchronisation of the camera mechanism is achieved over a wide range of television field frequencies and the
2010-07-01
imagery, persistent sensor array I. Introduction New device fabrication technologies and heterogeneous embedded processors have led to the emergence of a...geometric occlusions between target and sensor , motion blur, urban scene complexity, and high data volumes. In practical terms the targets are small...distributed airborne narrow-field-of-view video sensor networks. Airborne camera arrays combined with com- putational photography techniques enable the
Software for Real-Time Analysis of Subsonic Test Shot Accuracy
2014-03-01
used the C++ programming language, the Open Source Computer Vision ( OpenCV ®) software library, and Microsoft Windows® Application Programming...video for comparison through OpenCV image analysis tools. Based on the comparison, the software then computed the coordinates of each shot relative to...DWB researchers wanted to use the Open Source Computer Vision ( OpenCV ) software library for capturing and analyzing frames of video. OpenCV contains
Synchronizing A Stroboscope With A Video Camera
NASA Technical Reports Server (NTRS)
Rhodes, David B.; Franke, John M.; Jones, Stephen B.; Dismond, Harriet R.
1993-01-01
Circuit synchronizes flash of light from stroboscope with frame and field periods of video camera. Sync stripper sends vertical-synchronization signal to delay generator, which generates trigger signal. Flashlamp power supply accepts delayed trigger signal and sends pulse of power to flash lamp. Designed for use in making short-exposure images that "freeze" flow in wind tunnel. Also used for making longer-exposure images obtained by use of continuous intense illumination.
Jersey number detection in sports video for athlete identification
NASA Astrophysics Data System (ADS)
Ye, Qixiang; Huang, Qingming; Jiang, Shuqiang; Liu, Yang; Gao, Wen
2005-07-01
Athlete identification is important for sport video content analysis since users often care about the video clips with their preferred athletes. In this paper, we propose a method for athlete identification by combing the segmentation, tracking and recognition procedures into a coarse-to-fine scheme for jersey number (digital characters on sport shirt) detection. Firstly, image segmentation is employed to separate the jersey number regions with its background. And size/pipe-like attributes of digital characters are used to filter out candidates. Then, a K-NN (K nearest neighbor) classifier is employed to classify a candidate into a digit in "0-9" or negative. In the recognition procedure, we use the Zernike moment features, which are invariant to rotation and scale for digital shape recognition. Synthetic training samples with different fonts are used to represent the pattern of digital characters with non-rigid deformation. Once a character candidate is detected, a SSD (smallest square distance)-based tracking procedure is started. The recognition procedure is performed every several frames in the tracking process. After tracking tens of frames, the overall recognition results are combined to determine if a candidate is a true jersey number or not by a voting procedure. Experiments on several types of sports video shows encouraging result.
Persistent aerial video registration and fast multi-view mosaicing.
Molina, Edgardo; Zhu, Zhigang
2014-05-01
Capturing aerial imagery at high resolutions often leads to very low frame rate video streams, well under full motion video standards, due to bandwidth, storage, and cost constraints. Low frame rates make registration difficult when an aircraft is moving at high speeds or when global positioning system (GPS) contains large errors or it fails. We present a method that takes advantage of persistent cyclic video data collections to perform an online registration with drift correction. We split the persistent aerial imagery collection into individual cycles of the scene, identify and correct the registration errors on the first cycle in a batch operation, and then use the corrected base cycle as a reference pass to register and correct subsequent passes online. A set of multi-view panoramic mosaics is then constructed for each aerial pass for representation, presentation and exploitation of the 3D dynamic scene. These sets of mosaics are all in alignment to the reference cycle allowing their direct use in change detection, tracking, and 3D reconstruction/visualization algorithms. Stereo viewing with adaptive baselines and varying view angles is realized by choosing a pair of mosaics from a set of multi-view mosaics. Further, the mosaics for the second pass and later can be generated and visualized online as their is no further batch error correction.
Super-resolution imaging applied to moving object tracking
NASA Astrophysics Data System (ADS)
Swalaganata, Galandaru; Ratna Sulistyaningrum, Dwi; Setiyono, Budi
2017-10-01
Moving object tracking in a video is a method used to detect and analyze changes that occur in an object that being observed. Visual quality and the precision of the tracked target are highly wished in modern tracking system. The fact that the tracked object does not always seem clear causes the tracking result less precise. The reasons are low quality video, system noise, small object, and other factors. In order to improve the precision of the tracked object especially for small object, we propose a two step solution that integrates a super-resolution technique into tracking approach. First step is super-resolution imaging applied into frame sequences. This step was done by cropping the frame in several frame or all of frame. Second step is tracking the result of super-resolution images. Super-resolution image is a technique to obtain high-resolution images from low-resolution images. In this research single frame super-resolution technique is proposed for tracking approach. Single frame super-resolution was a kind of super-resolution that it has the advantage of fast computation time. The method used for tracking is Camshift. The advantages of Camshift was simple calculation based on HSV color that use its histogram for some condition and color of the object varies. The computational complexity and large memory requirements required for the implementation of super-resolution and tracking were reduced and the precision of the tracked target was good. Experiment showed that integrate a super-resolution imaging into tracking technique can track the object precisely with various background, shape changes of the object, and in a good light conditions.
Video laryngoscopy in pre-hospital critical care - a quality improvement study.
Rhode, Marianne Grønnebæk; Vandborg, Mads Partridge; Bladt, Vibeke; Rognås, Leif
2016-06-13
Pre-hospital endotracheal intubation is challenging and repeated endotracheal intubation is associated with increased morbidity and mortality. We investigated whether the introduction of the McGrath MAC video laryngoscope as the primary device for pre-hospital endotracheal intubation could improve first-pass success rate in our anaesthesiologist-staffed pre-hospital critical care services. We also investigated the incidence of failed pre-hospital endotracheal intubation, the use of airway adjuncts and back-up devices and problems encountered using the McGrath MAC video laryngoscope. Prospective quality improvement study collecting data from all adult pre-hospital endotracheal intubation performed by four anaesthesiologist-staffed pre-hospital critical care teams between December 15(th) 2013 and December 15(th) 2014. We registered data from 273 consecutive patients. When using the McGrath MAC video laryngoscope the overall pre-hospital endotracheal intubation first-pass success rate was 80.8 %. Following rapid sequence intubation (RSI) it was 88.9 %. This was not significantly different from previously reported first-pass success rates in our system (p = 0.27 and p = 0.41). During the last nine months of the study period the overall first-pass success rate was 80.1 (p = 0.47) but the post-RSI first-pass success rate improved to 94.4 % (0.048). The overall pre-hospital endotracheal intubation success rate with the McGrath MAC video laryngoscope was 98.9 % (p = 0.17). Gastric content, blood or secretion in the airway resulted in reduced vision when using the McGrath MAC video laryngoscope. In this study of video laryngoscope implementation in a Scandinavian anaesthesiologist-staffed pre-hospital critical care service, overall pre-hospital endotracheal first pass success rate did not change. The post-RSI first-pass success rate was significantly higher during the last nine months of our 12-month study compared with our results from before introducing McGrath MAC video laryngoscope. The implementation of the Standard Operating Procedure and check list for pre-hospital anaesthesia during the study period may have influenced the first-pass success rate and constitutes a potential confounder. The potential limitations of the McGrath MAC video laryngoscope when there are gastric content, blood and secretions in the airways need to be further investigated before the McGrath MAC video laryngoscope can be recommended as the primary device in all pre-hospital endotracheal intubations.
Comparative Analysis of THOR-NT ATD vs. Hybrid III ATD in Laboratory Vertical Shock Testing
2013-09-01
were taken both pretest and post - test for each test event (figure 5). Figure 5. Rigid fixture placed on the drop table with ATD seated: Hybrid III...6 3. Experimental Procedure 6 3.1 Test Setup...frames per second and with a Vision Research Phantom V9.1 (Wayne, NJ) high-speed video camera, sampling 1000 frames per second. 3. Experimental
Ultra High Definition Video from the International Space Station (Reel 1)
2015-06-15
The view of life in space is getting a major boost with the introduction of 4K Ultra High-Definition (UHD) video, providing an unprecedented look at what it's like to live and work aboard the International Space Station. This important new capability will allow researchers to acquire high resolution - high frame rate video to provide new insight into the vast array of experiments taking place every day. It will also bestow the most breathtaking views of planet Earth and space station activities ever acquired for consumption by those still dreaming of making the trip to outer space.
Sporadic frame dropping impact on quality perception
NASA Astrophysics Data System (ADS)
Pastrana-Vidal, Ricardo R.; Gicquel, Jean Charles; Colomes, Catherine; Cherifi, Hocine
2004-06-01
Over the past few years there has been an increasing interest in real time video services over packet networks. When considering quality, it is essential to quantify user perception of the received sequence. Severe motion discontinuities are one of the most common degradations in video streaming. The end-user perceives a jerky motion when the discontinuities are uniformly distributed over time and an instantaneous fluidity break is perceived when the motion loss is isolated or irregularly distributed. Bit rate adaptation techniques, transmission errors in the packet networks or restitution strategy could be the origin of this perceived jerkiness. In this paper we present a psychovisual experiment performed to quantify the effect of sporadically dropped pictures on the overall perceived quality. First, the perceptual detection thresholds of generated temporal discontinuities were measured. Then, the quality function was estimated in relation to a single frame dropping for different durations. Finally, a set of tests was performed to quantify the effect of several impairments distributed over time. We have found that the detection thresholds are content, duration and motion dependent. The assessment results show how quality is impaired by a single burst of dropped frames in a 10 sec sequence. The effect of several bursts of discarded frames, irregularly distributed over the time is also discussed.
Real-time lens distortion correction: speed, accuracy and efficiency
NASA Astrophysics Data System (ADS)
Bax, Michael R.; Shahidi, Ramin
2014-11-01
Optical lens systems suffer from nonlinear geometrical distortion. Optical imaging applications such as image-enhanced endoscopy and image-based bronchoscope tracking require correction of this distortion for accurate localization, tracking, registration, and measurement of image features. Real-time capability is desirable for interactive systems and live video. The use of a texture-mapping graphics accelerator, which is standard hardware on current motherboard chipsets and add-in video graphics cards, to perform distortion correction is proposed. Mesh generation for image tessellation, an error analysis, and performance results are presented. It is shown that distortion correction using commodity graphics hardware is substantially faster than using the main processor and can be performed at video frame rates (faster than 30 frames per second), and that the polar-based method of mesh generation proposed here is more accurate than a conventional grid-based approach. Using graphics hardware to perform distortion correction is not only fast and accurate but also efficient as it frees the main processor for other tasks, which is an important issue in some real-time applications.
NASA Astrophysics Data System (ADS)
Adi, K.; Widodo, A. P.; Widodo, C. E.; Pamungkas, A.; Putranto, A. B.
2018-05-01
Traffic monitoring on road needs to be done, the counting of the number of vehicles passing the road is necessary. It is more emphasized for highway transportation management in order to prevent efforts. Therefore, it is necessary to develop a system that is able to counting the number of vehicles automatically. Video processing method is able to counting the number of vehicles automatically. This research has development a system of vehicle counting on toll road. This system includes processes of video acquisition, frame extraction, and image processing for each frame. Video acquisition is conducted in the morning, at noon, in the afternoon, and in the evening. This system employs of background subtraction and morphology methods on gray scale images for vehicle counting. The best vehicle counting results were obtained in the morning with a counting accuracy of 86.36 %, whereas the lowest accuracy was in the evening, at 21.43 %. Differences in morning and evening results are caused by different illumination in the morning and evening. This will cause the values in the image pixels to be different.
Online Least Squares One-Class Support Vector Machines-Based Abnormal Visual Event Detection
Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem
2013-01-01
The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method. PMID:24351629
Online least squares one-class support vector machines-based abnormal visual event detection.
Wang, Tian; Chen, Jie; Zhou, Yi; Snoussi, Hichem
2013-12-12
The abnormal event detection problem is an important subject in real-time video surveillance. In this paper, we propose a novel online one-class classification algorithm, online least squares one-class support vector machine (online LS-OC-SVM), combined with its sparsified version (sparse online LS-OC-SVM). LS-OC-SVM extracts a hyperplane as an optimal description of training objects in a regularized least squares sense. The online LS-OC-SVM learns a training set with a limited number of samples to provide a basic normal model, then updates the model through remaining data. In the sparse online scheme, the model complexity is controlled by the coherence criterion. The online LS-OC-SVM is adopted to handle the abnormal event detection problem. Each frame of the video is characterized by the covariance matrix descriptor encoding the moving information, then is classified into a normal or an abnormal frame. Experiments are conducted, on a two-dimensional synthetic distribution dataset and a benchmark video surveillance dataset, to demonstrate the promising results of the proposed online LS-OC-SVM method.