Joint source-channel coding for motion-compensated DCT-based SNR scalable video.
Kondi, Lisimachos P; Ishtiaq, Faisal; Katsaggelos, Aggelos K
2002-01-01
In this paper, we develop an approach toward joint source-channel coding for motion-compensated DCT-based scalable video coding and transmission. A framework for the optimal selection of the source and channel coding rates over all scalable layers is presented such that the overall distortion is minimized. The algorithm utilizes universal rate distortion characteristics which are obtained experimentally and show the sensitivity of the source encoder and decoder to channel errors. The proposed algorithm allocates the available bit rate between scalable layers and, within each layer, between source and channel coding. We present the results of this rate allocation algorithm for video transmission over a wireless channel using the H.263 Version 2 signal-to-noise ratio (SNR) scalable codec for source coding and rate-compatible punctured convolutional (RCPC) codes for channel coding. We discuss the performance of the algorithm with respect to the channel conditions, coding methodologies, layer rates, and number of layers.
A robust coding scheme for packet video
NASA Technical Reports Server (NTRS)
Chen, Y. C.; Sayood, Khalid; Nelson, D. J.
1991-01-01
We present a layered packet video coding algorithm based on a progressive transmission scheme. The algorithm provides good compression and can handle significant packet loss with graceful degradation in the reconstruction sequence. Simulation results for various conditions are presented.
A robust coding scheme for packet video
NASA Technical Reports Server (NTRS)
Chen, Yun-Chung; Sayood, Khalid; Nelson, Don J.
1992-01-01
A layered packet video coding algorithm based on a progressive transmission scheme is presented. The algorithm provides good compression and can handle significant packet loss with graceful degradation in the reconstruction sequence. Simulation results for various conditions are presented.
Implementation issues in source coding
NASA Technical Reports Server (NTRS)
Sayood, Khalid; Chen, Yun-Chung; Hadenfeldt, A. C.
1989-01-01
An edge preserving image coding scheme which can be operated in both a lossy and a lossless manner was developed. The technique is an extension of the lossless encoding algorithm developed for the Mars observer spectral data. It can also be viewed as a modification of the DPCM algorithm. A packet video simulator was also developed from an existing modified packet network simulator. The coding scheme for this system is a modification of the mixture block coding (MBC) scheme described in the last report. Coding algorithms for packet video were also investigated.
A contourlet transform based algorithm for real-time video encoding
NASA Astrophysics Data System (ADS)
Katsigiannis, Stamos; Papaioannou, Georgios; Maroulis, Dimitris
2012-06-01
In recent years, real-time video communication over the internet has been widely utilized for applications like video conferencing. Streaming live video over heterogeneous IP networks, including wireless networks, requires video coding algorithms that can support various levels of quality in order to adapt to the network end-to-end bandwidth and transmitter/receiver resources. In this work, a scalable video coding and compression algorithm based on the Contourlet Transform is proposed. The algorithm allows for multiple levels of detail, without re-encoding the video frames, by just dropping the encoded information referring to higher resolution than needed. Compression is achieved by means of lossy and lossless methods, as well as variable bit rate encoding schemes. Furthermore, due to the transformation utilized, it does not suffer from blocking artifacts that occur with many widely adopted compression algorithms. Another highly advantageous characteristic of the algorithm is the suppression of noise induced by low-quality sensors usually encountered in web-cameras, due to the manipulation of the transform coefficients at the compression stage. The proposed algorithm is designed to introduce minimal coding delay, thus achieving real-time performance. Performance is enhanced by utilizing the vast computational capabilities of modern GPUs, providing satisfactory encoding and decoding times at relatively low cost. These characteristics make this method suitable for applications like video-conferencing that demand real-time performance, along with the highest visual quality possible for each user. Through the presented performance and quality evaluation of the algorithm, experimental results show that the proposed algorithm achieves better or comparable visual quality relative to other compression and encoding methods tested, while maintaining a satisfactory compression ratio. Especially at low bitrates, it provides more human-eye friendly images compared to algorithms utilizing block-based coding, like the MPEG family, as it introduces fuzziness and blurring instead of artificial block artifacts.
Wavelet-based audio embedding and audio/video compression
NASA Astrophysics Data System (ADS)
Mendenhall, Michael J.; Claypoole, Roger L., Jr.
2001-12-01
Watermarking, traditionally used for copyright protection, is used in a new and exciting way. An efficient wavelet-based watermarking technique embeds audio information into a video signal. Several effective compression techniques are applied to compress the resulting audio/video signal in an embedded fashion. This wavelet-based compression algorithm incorporates bit-plane coding, index coding, and Huffman coding. To demonstrate the potential of this audio embedding and audio/video compression algorithm, we embed an audio signal into a video signal and then compress. Results show that overall compression rates of 15:1 can be achieved. The video signal is reconstructed with a median PSNR of nearly 33 dB. Finally, the audio signal is extracted from the compressed audio/video signal without error.
NASA Astrophysics Data System (ADS)
Liu, Mei-Feng; Zhong, Guo-Yun; He, Xiao-Hai; Qing, Lin-Bo
2016-09-01
Currently, most video resources on line are encoded in the H.264/AVC format. More fluent video transmission can be obtained if these resources are encoded in the newest international video coding standard: high efficiency video coding (HEVC). In order to improve the video transmission and storage on line, a transcoding method from H.264/AVC to HEVC is proposed. In this transcoding algorithm, the coding information of intraprediction, interprediction, and motion vector (MV) in H.264/AVC video stream are used to accelerate the coding in HEVC. It is found through experiments that the region of interprediction in HEVC overlaps that in H.264/AVC. Therefore, the intraprediction for the region in HEVC, which is interpredicted in H.264/AVC, can be skipped to reduce coding complexity. Several macroblocks in H.264/AVC are combined into one PU in HEVC when the MV difference between two of the macroblocks in H.264/AVC is lower than a threshold. This method selects only one coding unit depth and one prediction unit (PU) mode to reduce the coding complexity. An MV interpolation method of combined PU in HEVC is proposed according to the areas and distances between the center of one macroblock in H.264/AVC and that of the PU in HEVC. The predicted MV accelerates the motion estimation for HEVC coding. The simulation results show that our proposed algorithm achieves significant coding time reduction with a little loss in bitrates distortion rate, compared to the existing transcoding algorithms and normal HEVC coding.
Variable disparity-motion estimation based fast three-view video coding
NASA Astrophysics Data System (ADS)
Bae, Kyung-Hoon; Kim, Seung-Cheol; Hwang, Yong Seok; Kim, Eun-Soo
2009-02-01
In this paper, variable disparity-motion estimation (VDME) based 3-view video coding is proposed. In the encoding, key-frame coding (KFC) based motion estimation and variable disparity estimation (VDE) for effectively fast three-view video encoding are processed. These proposed algorithms enhance the performance of 3-D video encoding/decoding system in terms of accuracy of disparity estimation and computational overhead. From some experiments, stereo sequences of 'Pot Plant' and 'IVO', it is shown that the proposed algorithm's PSNRs is 37.66 and 40.55 dB, and the processing time is 0.139 and 0.124 sec/frame, respectively.
A hardware-oriented concurrent TZ search algorithm for High-Efficiency Video Coding
NASA Astrophysics Data System (ADS)
Doan, Nghia; Kim, Tae Sung; Rhee, Chae Eun; Lee, Hyuk-Jae
2017-12-01
High-Efficiency Video Coding (HEVC) is the latest video coding standard, in which the compression performance is double that of its predecessor, the H.264/AVC standard, while the video quality remains unchanged. In HEVC, the test zone (TZ) search algorithm is widely used for integer motion estimation because it effectively searches the good-quality motion vector with a relatively small amount of computation. However, the complex computation structure of the TZ search algorithm makes it difficult to implement it in the hardware. This paper proposes a new integer motion estimation algorithm which is designed for hardware execution by modifying the conventional TZ search to allow parallel motion estimations of all prediction unit (PU) partitions. The algorithm consists of the three phases of zonal, raster, and refinement searches. At the beginning of each phase, the algorithm obtains the search points required by the original TZ search for all PU partitions in a coding unit (CU). Then, all redundant search points are removed prior to the estimation of the motion costs, and the best search points are then selected for all PUs. Compared to the conventional TZ search algorithm, experimental results show that the proposed algorithm significantly decreases the Bjøntegaard Delta bitrate (BD-BR) by 0.84%, and it also reduces the computational complexity by 54.54%.
3D video coding: an overview of present and upcoming standards
NASA Astrophysics Data System (ADS)
Merkle, Philipp; Müller, Karsten; Wiegand, Thomas
2010-07-01
An overview of existing and upcoming 3D video coding standards is given. Various different 3D video formats are available, each with individual pros and cons. The 3D video formats can be separated into two classes: video-only formats (such as stereo and multiview video) and depth-enhanced formats (such as video plus depth and multiview video plus depth). Since all these formats exist of at least two video sequences and possibly additional depth data, efficient compression is essential for the success of 3D video applications and technologies. For the video-only formats the H.264 family of coding standards already provides efficient and widely established compression algorithms: H.264/AVC simulcast, H.264/AVC stereo SEI message, and H.264/MVC. For the depth-enhanced formats standardized coding algorithms are currently being developed. New and specially adapted coding approaches are necessary, as the depth or disparity information included in these formats has significantly different characteristics than video and is not displayed directly, but used for rendering. Motivated by evolving market needs, MPEG has started an activity to develop a generic 3D video standard within the 3DVC ad-hoc group. Key features of the standard are efficient and flexible compression of depth-enhanced 3D video representations and decoupling of content creation and display requirements.
NASA Astrophysics Data System (ADS)
da Silva, Thaísa Leal; Agostini, Luciano Volcan; da Silva Cruz, Luis A.
2014-05-01
Intra prediction is a very important tool in current video coding standards. High-efficiency video coding (HEVC) intra prediction presents relevant gains in encoding efficiency when compared to previous standards, but with a very important increase in the computational complexity since 33 directional angular modes must be evaluated. Motivated by this high complexity, this article presents a complexity reduction algorithm developed to reduce the HEVC intra mode decision complexity targeting multiview videos. The proposed algorithm presents an efficient fast intra prediction compliant with singleview and multiview video encoding. This fast solution defines a reduced subset of intra directions according to the video texture and it exploits the relationship between prediction units (PUs) of neighbor depth levels of the coding tree. This fast intra coding procedure is used to develop an inter-view prediction method, which exploits the relationship between the intra mode directions of adjacent views to further accelerate the intra prediction process in multiview video encoding applications. When compared to HEVC simulcast, our method achieves a complexity reduction of up to 47.77%, at the cost of an average BD-PSNR loss of 0.08 dB.
Video transmission on ATM networks. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Chen, Yun-Chung
1993-01-01
The broadband integrated services digital network (B-ISDN) is expected to provide high-speed and flexible multimedia applications. Multimedia includes data, graphics, image, voice, and video. Asynchronous transfer mode (ATM) is the adopted transport techniques for B-ISDN and has the potential for providing a more efficient and integrated environment for multimedia. It is believed that most broadband applications will make heavy use of visual information. The prospect of wide spread use of image and video communication has led to interest in coding algorithms for reducing bandwidth requirements and improving image quality. The major results of a study on the bridging of network transmission performance and video coding are: Using two representative video sequences, several video source models are developed. The fitness of these models are validated through the use of statistical tests and network queuing performance. A dual leaky bucket algorithm is proposed as an effective network policing function. The concept of the dual leaky bucket algorithm can be applied to a prioritized coding approach to achieve transmission efficiency. A mapping of the performance/control parameters at the network level into equivalent parameters at the video coding level is developed. Based on that, a complete set of principles for the design of video codecs for network transmission is proposed.
Mixture block coding with progressive transmission in packet video. Appendix 1: Item 2. M.S. Thesis
NASA Technical Reports Server (NTRS)
Chen, Yun-Chung
1989-01-01
Video transmission will become an important part of future multimedia communication because of dramatically increasing user demand for video, and rapid evolution of coding algorithm and VLSI technology. Video transmission will be part of the broadband-integrated services digital network (B-ISDN). Asynchronous transfer mode (ATM) is a viable candidate for implementation of B-ISDN due to its inherent flexibility, service independency, and high performance. According to the characteristics of ATM, the information has to be coded into discrete cells which travel independently in the packet switching network. A practical realization of an ATM video codec called Mixture Block Coding with Progressive Transmission (MBCPT) is presented. This variable bit rate coding algorithm shows how a constant quality performance can be obtained according to user demand. Interactions between codec and network are emphasized including packetization, service synchronization, flow control, and error recovery. Finally, some simulation results based on MBCPT coding with error recovery are presented.
Medical Ultrasound Video Coding with H.265/HEVC Based on ROI Extraction
Wu, Yueying; Liu, Pengyu; Gao, Yuan; Jia, Kebin
2016-01-01
High-efficiency video compression technology is of primary importance to the storage and transmission of digital medical video in modern medical communication systems. To further improve the compression performance of medical ultrasound video, two innovative technologies based on diagnostic region-of-interest (ROI) extraction using the high efficiency video coding (H.265/HEVC) standard are presented in this paper. First, an effective ROI extraction algorithm based on image textural features is proposed to strengthen the applicability of ROI detection results in the H.265/HEVC quad-tree coding structure. Second, a hierarchical coding method based on transform coefficient adjustment and a quantization parameter (QP) selection process is designed to implement the otherness encoding for ROIs and non-ROIs. Experimental results demonstrate that the proposed optimization strategy significantly improves the coding performance by achieving a BD-BR reduction of 13.52% and a BD-PSNR gain of 1.16 dB on average compared to H.265/HEVC (HM15.0). The proposed medical video coding algorithm is expected to satisfy low bit-rate compression requirements for modern medical communication systems. PMID:27814367
Medical Ultrasound Video Coding with H.265/HEVC Based on ROI Extraction.
Wu, Yueying; Liu, Pengyu; Gao, Yuan; Jia, Kebin
2016-01-01
High-efficiency video compression technology is of primary importance to the storage and transmission of digital medical video in modern medical communication systems. To further improve the compression performance of medical ultrasound video, two innovative technologies based on diagnostic region-of-interest (ROI) extraction using the high efficiency video coding (H.265/HEVC) standard are presented in this paper. First, an effective ROI extraction algorithm based on image textural features is proposed to strengthen the applicability of ROI detection results in the H.265/HEVC quad-tree coding structure. Second, a hierarchical coding method based on transform coefficient adjustment and a quantization parameter (QP) selection process is designed to implement the otherness encoding for ROIs and non-ROIs. Experimental results demonstrate that the proposed optimization strategy significantly improves the coding performance by achieving a BD-BR reduction of 13.52% and a BD-PSNR gain of 1.16 dB on average compared to H.265/HEVC (HM15.0). The proposed medical video coding algorithm is expected to satisfy low bit-rate compression requirements for modern medical communication systems.
Selective encryption for H.264/AVC video coding
NASA Astrophysics Data System (ADS)
Shi, Tuo; King, Brian; Salama, Paul
2006-02-01
Due to the ease with which digital data can be manipulated and due to the ongoing advancements that have brought us closer to pervasive computing, the secure delivery of video and images has become a challenging problem. Despite the advantages and opportunities that digital video provide, illegal copying and distribution as well as plagiarism of digital audio, images, and video is still ongoing. In this paper we describe two techniques for securing H.264 coded video streams. The first technique, SEH264Algorithm1, groups the data into the following blocks of data: (1) a block that contains the sequence parameter set and the picture parameter set, (2) a block containing a compressed intra coded frame, (3) a block containing the slice header of a P slice, all the headers of the macroblock within the same P slice, and all the luma and chroma DC coefficients belonging to the all the macroblocks within the same slice, (4) a block containing all the ac coefficients, and (5) a block containing all the motion vectors. The first three are encrypted whereas the last two are not. The second method, SEH264Algorithm2, relies on the use of multiple slices per coded frame. The algorithm searches the compressed video sequence for start codes (0x000001) and then encrypts the next N bits of data.
Mode-dependent templates and scan order for H.264/AVC-based intra lossless coding.
Gu, Zhouye; Lin, Weisi; Lee, Bu-Sung; Lau, Chiew Tong; Sun, Ming-Ting
2012-09-01
In H.264/advanced video coding (AVC), lossless coding and lossy coding share the same entropy coding module. However, the entropy coders in the H.264/AVC standard were original designed for lossy video coding and do not yield adequate performance for lossless video coding. In this paper, we analyze the problem with the current lossless coding scheme and propose a mode-dependent template (MD-template) based method for intra lossless coding. By exploring the statistical redundancy of the prediction residual in the H.264/AVC intra prediction modes, more zero coefficients are generated. By designing a new scan order for each MD-template, the scanned coefficients sequence fits the H.264/AVC entropy coders better. A fast implementation algorithm is also designed. With little computation increase, experimental results confirm that the proposed fast algorithm achieves about 7.2% bit saving compared with the current H.264/AVC fidelity range extensions high profile.
NASA Technical Reports Server (NTRS)
Sayood, K.; Chen, Y. C.; Wang, X.
1992-01-01
During this reporting period we have worked on three somewhat different problems. These are modeling of video traffic in packet networks, low rate video compression, and the development of a lossy + lossless image compression algorithm, which might have some application in browsing algorithms. The lossy + lossless scheme is an extension of work previously done under this grant. It provides a simple technique for incorporating browsing capability. The low rate coding scheme is also a simple variation on the standard discrete cosine transform (DCT) coding approach. In spite of its simplicity, the approach provides surprisingly high quality reconstructions. The modeling approach is borrowed from the speech recognition literature, and seems to be promising in that it provides a simple way of obtaining an idea about the second order behavior of a particular coding scheme. Details about these are presented.
NASA Astrophysics Data System (ADS)
Sanchez, Gustavo; Marcon, César; Agostini, Luciano Volcan
2018-01-01
The 3D-high efficiency video coding has introduced tools to obtain higher efficiency in 3-D video coding, and most of them are related to the depth maps coding. Among these tools, the depth modeling mode-1 (DMM-1) focuses on better encoding edges regions of depth maps. The large memory required for storing all wedgelet patterns is one of the bottlenecks in the DMM-1 hardware design of both encoder and decoder since many patterns must be stored. Three algorithms to reduce the DMM-1 memory requirements and a hardware design targeting the most efficient among these algorithms are presented. Experimental results demonstrate that the proposed solutions surpass related works reducing up to 78.8% of the wedgelet memory, without degrading the encoding efficiency. Synthesis results demonstrate that the proposed algorithm reduces almost 75% of the power dissipation when compared to the standard approach.
Progressive video coding for noisy channels
NASA Astrophysics Data System (ADS)
Kim, Beong-Jo; Xiong, Zixiang; Pearlman, William A.
1998-10-01
We extend the work of Sherwood and Zeger to progressive video coding for noisy channels. By utilizing a 3D extension of the set partitioning in hierarchical trees (SPIHT) algorithm, we cascade the resulting 3D SPIHT video coder with a rate-compatible punctured convolutional channel coder for transmission of video over a binary symmetric channel. Progressive coding is achieved by increasing the target rate of the 3D embedded SPIHT video coder as the channel condition improves. The performance of our proposed coding system is acceptable at low transmission rate and bad channel conditions. Its low complexity makes it suitable for emerging applications such as video over wireless channels.
Quality Scalability Aware Watermarking for Visual Content.
Bhowmik, Deepayan; Abhayaratne, Charith
2016-11-01
Scalable coding-based content adaptation poses serious challenges to traditional watermarking algorithms, which do not consider the scalable coding structure and hence cannot guarantee correct watermark extraction in media consumption chain. In this paper, we propose a novel concept of scalable blind watermarking that ensures more robust watermark extraction at various compression ratios while not effecting the visual quality of host media. The proposed algorithm generates scalable and robust watermarked image code-stream that allows the user to constrain embedding distortion for target content adaptations. The watermarked image code-stream consists of hierarchically nested joint distortion-robustness coding atoms. The code-stream is generated by proposing a new wavelet domain blind watermarking algorithm guided by a quantization based binary tree. The code-stream can be truncated at any distortion-robustness atom to generate the watermarked image with the desired distortion-robustness requirements. A blind extractor is capable of extracting watermark data from the watermarked images. The algorithm is further extended to incorporate a bit-plane discarding-based quantization model used in scalable coding-based content adaptation, e.g., JPEG2000. This improves the robustness against quality scalability of JPEG2000 compression. The simulation results verify the feasibility of the proposed concept, its applications, and its improved robustness against quality scalable content adaptation. Our proposed algorithm also outperforms existing methods showing 35% improvement. In terms of robustness to quality scalable video content adaptation using Motion JPEG2000 and wavelet-based scalable video coding, the proposed method shows major improvement for video watermarking.
NASA Astrophysics Data System (ADS)
Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong
2017-07-01
The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the lowdelayP condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BDPSNR is observed for 18 sequences when the target complexity is around 40%.
Low-delay predictive audio coding for the HIVITS HDTV codec
NASA Astrophysics Data System (ADS)
McParland, A. K.; Gilchrist, N. H. C.
1995-01-01
The status of work relating to predictive audio coding, as part of the European project on High Quality Video Telephone and HD(TV) Systems (HIVITS), is reported. The predictive coding algorithm is developed, along with six-channel audio coding and decoding hardware. Demonstrations of the audio codec operating in conjunction with the video codec, are given.
NASA Astrophysics Data System (ADS)
Chen, Zhenzhong; Han, Junwei; Ngan, King Ngi
2005-10-01
MPEG-4 treats a scene as a composition of several objects or so-called video object planes (VOPs) that are separately encoded and decoded. Such a flexible video coding framework makes it possible to code different video object with different distortion scale. It is necessary to analyze the priority of the video objects according to its semantic importance, intrinsic properties and psycho-visual characteristics such that the bit budget can be distributed properly to video objects to improve the perceptual quality of the compressed video. This paper aims to provide an automatic video object priority definition method based on object-level visual attention model and further propose an optimization framework for video object bit allocation. One significant contribution of this work is that the human visual system characteristics are incorporated into the video coding optimization process. Another advantage is that the priority of the video object can be obtained automatically instead of fixing weighting factors before encoding or relying on the user interactivity. To evaluate the performance of the proposed approach, we compare it with traditional verification model bit allocation and the optimal multiple video object bit allocation algorithms. Comparing with traditional bit allocation algorithms, the objective quality of the object with higher priority is significantly improved under this framework. These results demonstrate the usefulness of this unsupervised subjective quality lifting framework.
NASA Astrophysics Data System (ADS)
Tsang, Sik-Ho; Chan, Yui-Lam; Siu, Wan-Chi
2017-01-01
Weighted prediction (WP) is an efficient video coding tool that was introduced since the establishment of the H.264/AVC video coding standard, for compensating the temporal illumination change in motion estimation and compensation. WP parameters, including a multiplicative weight and an additive offset for each reference frame, are required to be estimated and transmitted to the decoder by slice header. These parameters cause extra bits in the coded video bitstream. High efficiency video coding (HEVC) provides WP parameter prediction to reduce the overhead. Therefore, WP parameter prediction is crucial to research works or applications, which are related to WP. Prior art has been suggested to further improve the WP parameter prediction by implicit prediction of image characteristics and derivation of parameters. By exploiting both temporal and interlayer redundancies, we propose three WP parameter prediction algorithms, enhanced implicit WP parameter, enhanced direct WP parameter derivation, and interlayer WP parameter, to further improve the coding efficiency of HEVC. Results show that our proposed algorithms can achieve up to 5.83% and 5.23% bitrate reduction compared to the conventional scalable HEVC in the base layer for SNR scalability and 2× spatial scalability, respectively.
Side-information-dependent correlation channel estimation in hash-based distributed video coding.
Deligiannis, Nikos; Barbarien, Joeri; Jacobs, Marc; Munteanu, Adrian; Skodras, Athanassios; Schelkens, Peter
2012-04-01
In the context of low-cost video encoding, distributed video coding (DVC) has recently emerged as a potential candidate for uplink-oriented applications. This paper builds on a concept of correlation channel (CC) modeling, which expresses the correlation noise as being statistically dependent on the side information (SI). Compared with classical side-information-independent (SII) noise modeling adopted in current DVC solutions, it is theoretically proven that side-information-dependent (SID) modeling improves the Wyner-Ziv coding performance. Anchored in this finding, this paper proposes a novel algorithm for online estimation of the SID CC parameters based on already decoded information. The proposed algorithm enables bit-plane-by-bit-plane successive refinement of the channel estimation leading to progressively improved accuracy. Additionally, the proposed algorithm is included in a novel DVC architecture that employs a competitive hash-based motion estimation technique to generate high-quality SI at the decoder. Experimental results corroborate our theoretical gains and validate the accuracy of the channel estimation algorithm. The performance assessment of the proposed architecture shows remarkable and consistent coding gains over a germane group of state-of-the-art distributed and standard video codecs, even under strenuous conditions, i.e., large groups of pictures and highly irregular motion content.
NASA Astrophysics Data System (ADS)
Manjanaik, N.; Parameshachari, B. D.; Hanumanthappa, S. N.; Banu, Reshma
2017-08-01
Intra prediction process of H.264 video coding standard used to code first frame i.e. Intra frame of video to obtain good coding efficiency compare to previous video coding standard series. More benefit of intra frame coding is to reduce spatial pixel redundancy with in current frame, reduces computational complexity and provides better rate distortion performance. To code Intra frame it use existing process Rate Distortion Optimization (RDO) method. This method increases computational complexity, increases in bit rate and reduces picture quality so it is difficult to implement in real time applications, so the many researcher has been developed fast mode decision algorithm for coding of intra frame. The previous work carried on Intra frame coding in H.264 standard using fast decision mode intra prediction algorithm based on different techniques was achieved increased in bit rate, degradation of picture quality(PSNR) for different quantization parameters. Many previous approaches of fast mode decision algorithms on intra frame coding achieved only reduction of computational complexity or it save encoding time and limitation was increase in bit rate with loss of quality of picture. In order to avoid increase in bit rate and loss of picture quality a better approach was developed. In this paper developed a better approach i.e. Gaussian pulse for Intra frame coding using diagonal down left intra prediction mode to achieve higher coding efficiency in terms of PSNR and bitrate. In proposed method Gaussian pulse is multiplied with each 4x4 frequency domain coefficients of 4x4 sub macro block of macro block of current frame before quantization process. Multiplication of Gaussian pulse for each 4x4 integer transformed coefficients at macro block levels scales the information of the coefficients in a reversible manner. The resulting signal would turn abstract. Frequency samples are abstract in a known and controllable manner without intermixing of coefficients, it avoids picture getting bad hit for higher values of quantization parameters. The proposed work was implemented using MATLAB and JM 18.6 reference software. The proposed work measure the performance parameters PSNR, bit rate and compression of intra frame of yuv video sequences in QCIF resolution under different values of quantization parameter with Gaussian value for diagonal down left intra prediction mode. The simulation results of proposed algorithm are tabulated and compared with previous algorithm i.e. Tian et al method. The proposed algorithm achieved reduced in bit rate averagely 30.98% and maintain consistent picture quality for QCIF sequences compared to previous algorithm i.e. Tian et al method.
A complexity-scalable software-based MPEG-2 video encoder.
Chen, Guo-bin; Lu, Xin-ning; Wang, Xing-guo; Liu, Ji-lin
2004-05-01
With the development of general-purpose processors (GPP) and video signal processing algorithms, it is possible to implement a software-based real-time video encoder on GPP, and its low cost and easy upgrade attract developers' interests to transfer video encoding from specialized hardware to more flexible software. In this paper, the encoding structure is set up first to support complexity scalability; then a lot of high performance algorithms are used on the key time-consuming modules in coding process; finally, at programming level, processor characteristics are considered to improve data access efficiency and processing parallelism. Other programming methods such as lookup table are adopted to reduce the computational complexity. Simulation results showed that these ideas could not only improve the global performance of video coding, but also provide great flexibility in complexity regulation.
Audiovisual focus of attention and its application to Ultra High Definition video compression
NASA Astrophysics Data System (ADS)
Rerabek, Martin; Nemoto, Hiromi; Lee, Jong-Seok; Ebrahimi, Touradj
2014-02-01
Using Focus of Attention (FoA) as a perceptual process in image and video compression belongs to well-known approaches to increase coding efficiency. It has been shown that foveated coding, when compression quality varies across the image according to region of interest, is more efficient than the alternative coding, when all region are compressed in a similar way. However, widespread use of such foveated compression has been prevented due to two main conflicting causes, namely, the complexity and the efficiency of algorithms for FoA detection. One way around these is to use as much information as possible from the scene. Since most video sequences have an associated audio, and moreover, in many cases there is a correlation between the audio and the visual content, audiovisual FoA can improve efficiency of the detection algorithm while remaining of low complexity. This paper discusses a simple yet efficient audiovisual FoA algorithm based on correlation of dynamics between audio and video signal components. Results of audiovisual FoA detection algorithm are subsequently taken into account for foveated coding and compression. This approach is implemented into H.265/HEVC encoder producing a bitstream which is fully compliant to any H.265/HEVC decoder. The influence of audiovisual FoA in the perceived quality of high and ultra-high definition audiovisual sequences is explored and the amount of gain in compression efficiency is analyzed.
Visual saliency-based fast intracoding algorithm for high efficiency video coding
NASA Astrophysics Data System (ADS)
Zhou, Xin; Shi, Guangming; Zhou, Wei; Duan, Zhemin
2017-01-01
Intraprediction has been significantly improved in high efficiency video coding over H.264/AVC with quad-tree-based coding unit (CU) structure from size 64×64 to 8×8 and more prediction modes. However, these techniques cause a dramatic increase in computational complexity. An intracoding algorithm is proposed that consists of perceptual fast CU size decision algorithm and fast intraprediction mode decision algorithm. First, based on the visual saliency detection, an adaptive and fast CU size decision method is proposed to alleviate intraencoding complexity. Furthermore, a fast intraprediction mode decision algorithm with step halving rough mode decision method and early modes pruning algorithm is presented to selectively check the potential modes and effectively reduce the complexity of computation. Experimental results show that our proposed fast method reduces the computational complexity of the current HM to about 57% in encoding time with only 0.37% increases in BD rate. Meanwhile, the proposed fast algorithm has reasonable peak signal-to-noise ratio losses and nearly the same subjective perceptual quality.
FBCOT: a fast block coding option for JPEG 2000
NASA Astrophysics Data System (ADS)
Taubman, David; Naman, Aous; Mathew, Reji
2017-09-01
Based on the EBCOT algorithm, JPEG 2000 finds application in many fields, including high performance scientific, geospatial and video coding applications. Beyond digital cinema, JPEG 2000 is also attractive for low-latency video communications. The main obstacle for some of these applications is the relatively high computational complexity of the block coder, especially at high bit-rates. This paper proposes a drop-in replacement for the JPEG 2000 block coding algorithm, achieving much higher encoding and decoding throughputs, with only modest loss in coding efficiency (typically < 0.5dB). The algorithm provides only limited quality/SNR scalability, but offers truly reversible transcoding to/from any standard JPEG 2000 block bit-stream. The proposed FAST block coder can be used with EBCOT's post-compression RD-optimization methodology, allowing a target compressed bit-rate to be achieved even at low latencies, leading to the name FBCOT (Fast Block Coding with Optimized Truncation).
Low-complexity transcoding algorithm from H.264/AVC to SVC using data mining
NASA Astrophysics Data System (ADS)
Garrido-Cantos, Rosario; De Cock, Jan; Martínez, Jose Luis; Van Leuven, Sebastian; Cuenca, Pedro; Garrido, Antonio
2013-12-01
Nowadays, networks and terminals with diverse characteristics of bandwidth and capabilities coexist. To ensure a good quality of experience, this diverse environment demands adaptability of the video stream. In general, video contents are compressed to save storage capacity and to reduce the bandwidth required for its transmission. Therefore, if these compressed video streams were compressed using scalable video coding schemes, they would be able to adapt to those heterogeneous networks and a wide range of terminals. Since the majority of the multimedia contents are compressed using H.264/AVC, they cannot benefit from that scalability. This paper proposes a low-complexity algorithm to convert an H.264/AVC bitstream without scalability to scalable bitstreams with temporal scalability in baseline and main profiles by accelerating the mode decision task of the scalable video coding encoding stage using machine learning tools. The results show that when our technique is applied, the complexity is reduced by 87% while maintaining coding efficiency.
NASA Astrophysics Data System (ADS)
Nightingale, James; Wang, Qi; Grecos, Christos
2011-03-01
Users of the next generation wireless paradigm known as multihomed mobile networks expect satisfactory quality of service (QoS) when accessing streamed multimedia content. The recent H.264 Scalable Video Coding (SVC) extension to the Advanced Video Coding standard (AVC), offers the facility to adapt real-time video streams in response to the dynamic conditions of multiple network paths encountered in multihomed wireless mobile networks. Nevertheless, preexisting streaming algorithms were mainly proposed for AVC delivery over multipath wired networks and were evaluated by software simulation. This paper introduces a practical, hardware-based testbed upon which we implement and evaluate real-time H.264 SVC streaming algorithms in a realistic multihomed wireless mobile networks environment. We propose an optimised streaming algorithm with multi-fold technical contributions. Firstly, we extended the AVC packet prioritisation schemes to reflect the three-dimensional granularity of SVC. Secondly, we designed a mechanism for evaluating the effects of different streamer 'read ahead window' sizes on real-time performance. Thirdly, we took account of the previously unconsidered path switching and mobile networks tunnelling overheads encountered in real-world deployments. Finally, we implemented a path condition monitoring and reporting scheme to facilitate the intelligent path switching. The proposed system has been experimentally shown to offer a significant improvement in PSNR of the received stream compared with representative existing algorithms.
[Development of a video image system for wireless capsule endoscopes based on DSP].
Yang, Li; Peng, Chenglin; Wu, Huafeng; Zhao, Dechun; Zhang, Jinhua
2008-02-01
A video image recorder to record video picture for wireless capsule endoscopes was designed. TMS320C6211 DSP of Texas Instruments Inc. is the core processor of this system. Images are periodically acquired from Composite Video Broadcast Signal (CVBS) source and scaled by video decoder (SAA7114H). Video data is transported from high speed buffer First-in First-out (FIFO) to Digital Signal Processor (DSP) under the control of Complex Programmable Logic Device (CPLD). This paper adopts JPEG algorithm for image coding, and the compressed data in DSP was stored to Compact Flash (CF) card. TMS320C6211 DSP is mainly used for image compression and data transporting. Fast Discrete Cosine Transform (DCT) algorithm and fast coefficient quantization algorithm are used to accelerate operation speed of DSP and decrease the executing code. At the same time, proper address is assigned for each memory, which has different speed;the memory structure is also optimized. In addition, this system uses plenty of Extended Direct Memory Access (EDMA) to transport and process image data, which results in stable and high performance.
Visual Perception Based Rate Control Algorithm for HEVC
NASA Astrophysics Data System (ADS)
Feng, Zeqi; Liu, PengYu; Jia, Kebin
2018-01-01
For HEVC, rate control is an indispensably important video coding technology to alleviate the contradiction between video quality and the limited encoding resources during video communication. However, the rate control benchmark algorithm of HEVC ignores subjective visual perception. For key focus regions, bit allocation of LCU is not ideal and subjective quality is unsatisfied. In this paper, a visual perception based rate control algorithm for HEVC is proposed. First bit allocation weight of LCU level is optimized based on the visual perception of luminance and motion to ameliorate video subjective quality. Then λ and QP are adjusted in combination with the bit allocation weight to improve rate distortion performance. Experimental results show that the proposed algorithm reduces average 0.5% BD-BR and maximum 1.09% BD-BR at no cost in bitrate accuracy compared with HEVC (HM15.0). The proposed algorithm devotes to improving video subjective quality under various video applications.
Subjective evaluation of next-generation video compression algorithms: a case study
NASA Astrophysics Data System (ADS)
De Simone, Francesca; Goldmann, Lutz; Lee, Jong-Seok; Ebrahimi, Touradj; Baroncini, Vittorio
2010-08-01
This paper describes the details and the results of the subjective quality evaluation performed at EPFL, as a contribution to the effort of the Joint Collaborative Team on Video Coding (JCT-VC) for the definition of the next-generation video coding standard. The performance of 27 coding technologies have been evaluated with respect to two H.264/MPEG-4 AVC anchors, considering high definition (HD) test material. The test campaign involved a total of 494 naive observers and took place over a period of four weeks. While similar tests have been conducted as part of the standardization process of previous video coding technologies, the test campaign described in this paper is by far the most extensive in the history of video coding standardization. The obtained subjective quality scores show high consistency and support an accurate comparison of the performance of the different coding solutions.
Scalable video transmission over Rayleigh fading channels using LDPC codes
NASA Astrophysics Data System (ADS)
Bansal, Manu; Kondi, Lisimachos P.
2005-03-01
In this paper, we investigate an important problem of efficiently utilizing the available resources for video transmission over wireless channels while maintaining a good decoded video quality and resilience to channel impairments. Our system consists of the video codec based on 3-D set partitioning in hierarchical trees (3-D SPIHT) algorithm and employs two different schemes using low-density parity check (LDPC) codes for channel error protection. The first method uses the serial concatenation of the constant-rate LDPC code and rate-compatible punctured convolutional (RCPC) codes. Cyclic redundancy check (CRC) is used to detect transmission errors. In the other scheme, we use the product code structure consisting of a constant rate LDPC/CRC code across the rows of the `blocks' of source data and an erasure-correction systematic Reed-Solomon (RS) code as the column code. In both the schemes introduced here, we use fixed-length source packets protected with unequal forward error correction coding ensuring a strictly decreasing protection across the bitstream. A Rayleigh flat-fading channel with additive white Gaussian noise (AWGN) is modeled for the transmission. The rate-distortion optimization algorithm is developed and carried out for the selection of source coding and channel coding rates using Lagrangian optimization. The experimental results demonstrate the effectiveness of this system under different wireless channel conditions and both the proposed methods (LDPC+RCPC/CRC and RS+LDPC/CRC) outperform the more conventional schemes such as those employing RCPC/CRC.
Digital codec for real-time processing of broadcast quality video signals at 1.8 bits/pixel
NASA Technical Reports Server (NTRS)
Shalkhauser, Mary JO; Whyte, Wayne A., Jr.
1989-01-01
The authors present the hardware implementation of a digital television bandwidth compression algorithm which processes standard NTSC (National Television Systems Committee) composite color television signals and produces broadcast-quality video in real time at an average of 1.8 b/pixel. The sampling rate used with this algorithm results in 768 samples over the active portion of each video line by 512 active video lines per video frame. The algorithm is based on differential pulse code modulation (DPCM), but additionally utilizes a nonadaptive predictor, nonuniform quantizer, and multilevel Huffman coder to reduce the data rate substantially below that achievable with straight DPCM. The nonadaptive predictor and multilevel Huffman coder combine to set this technique apart from prior-art DPCM encoding algorithms. The authors describe the data compression algorithm and the hardware implementation of the codec and provide performance results.
Zero-block mode decision algorithm for H.264/AVC.
Lee, Yu-Ming; Lin, Yinyi
2009-03-01
In the previous paper , we proposed a zero-block intermode decision algorithm for H.264 video coding based upon the number of zero-blocks of 4 x 4 DCT coefficients between the current macroblock and the co-located macroblock. The proposed algorithm can achieve significant improvement in computation, but the computation performance is limited for high bit-rate coding. To improve computation efficiency, in this paper, we suggest an enhanced zero-block decision algorithm, which uses an early zero-block detection method to compute the number of zero-blocks instead of direct DCT and quantization (DCT/Q) calculation and incorporates two adequate decision methods into semi-stationary and nonstationary regions of a video sequence. In addition, the zero-block decision algorithm is also applied to the intramode prediction in the P frame. The enhanced zero-block decision algorithm brings out a reduction of average 27% of total encoding time compared to the zero-block decision algorithm.
Deep linear autoencoder and patch clustering-based unified one-dimensional coding of image and video
NASA Astrophysics Data System (ADS)
Li, Honggui
2017-09-01
This paper proposes a unified one-dimensional (1-D) coding framework of image and video, which depends on deep learning neural network and image patch clustering. First, an improved K-means clustering algorithm for image patches is employed to obtain the compact inputs of deep artificial neural network. Second, for the purpose of best reconstructing original image patches, deep linear autoencoder (DLA), a linear version of the classical deep nonlinear autoencoder, is introduced to achieve the 1-D representation of image blocks. Under the circumstances of 1-D representation, DLA is capable of attaining zero reconstruction error, which is impossible for the classical nonlinear dimensionality reduction methods. Third, a unified 1-D coding infrastructure for image, intraframe, interframe, multiview video, three-dimensional (3-D) video, and multiview 3-D video is built by incorporating different categories of videos into the inputs of patch clustering algorithm. Finally, it is shown in the results of simulation experiments that the proposed methods can simultaneously gain higher compression ratio and peak signal-to-noise ratio than those of the state-of-the-art methods in the situation of low bitrate transmission.
Yang, Yang; Stanković, Vladimir; Xiong, Zixiang; Zhao, Wei
2009-03-01
Following recent works on the rate region of the quadratic Gaussian two-terminal source coding problem and limit-approaching code designs, this paper examines multiterminal source coding of two correlated, i.e., stereo, video sequences to save the sum rate over independent coding of both sequences. Two multiterminal video coding schemes are proposed. In the first scheme, the left sequence of the stereo pair is coded by H.264/AVC and used at the joint decoder to facilitate Wyner-Ziv coding of the right video sequence. The first I-frame of the right sequence is successively coded by H.264/AVC Intracoding and Wyner-Ziv coding. An efficient stereo matching algorithm based on loopy belief propagation is then adopted at the decoder to produce pixel-level disparity maps between the corresponding frames of the two decoded video sequences on the fly. Based on the disparity maps, side information for both motion vectors and motion-compensated residual frames of the right sequence are generated at the decoder before Wyner-Ziv encoding. In the second scheme, source splitting is employed on top of classic and Wyner-Ziv coding for compression of both I-frames to allow flexible rate allocation between the two sequences. Experiments with both schemes on stereo video sequences using H.264/AVC, LDPC codes for Slepian-Wolf coding of the motion vectors, and scalar quantization in conjunction with LDPC codes for Wyner-Ziv coding of the residual coefficients give a slightly lower sum rate than separate H.264/AVC coding of both sequences at the same video quality.
NASA Astrophysics Data System (ADS)
Abdellah, Skoudarli; Mokhtar, Nibouche; Amina, Serir
2015-11-01
The H.264/AVC video coding standard is used in a wide range of applications from video conferencing to high-definition television according to its high compression efficiency. This efficiency is mainly acquired from the newly allowed prediction schemes including variable block modes. However, these schemes require a high complexity to select the optimal mode. Consequently, complexity reduction in the H.264/AVC encoder has recently become a very challenging task in the video compression domain, especially when implementing the encoder in real-time applications. Fast mode decision algorithms play an important role in reducing the overall complexity of the encoder. In this paper, we propose an adaptive fast intermode algorithm based on motion activity, temporal stationarity, and spatial homogeneity. This algorithm predicts the motion activity of the current macroblock from its neighboring blocks and identifies temporal stationary regions and spatially homogeneous regions using adaptive threshold values based on content video features. Extensive experimental work has been done in high profile, and results show that the proposed source-coding algorithm effectively reduces the computational complexity by 53.18% on average compared with the reference software encoder, while maintaining the high-coding efficiency of H.264/AVC by incurring only 0.097 dB in total peak signal-to-noise ratio and 0.228% increment on the total bit rate.
Adaptive format conversion for scalable video coding
NASA Astrophysics Data System (ADS)
Wan, Wade K.; Lim, Jae S.
2001-12-01
The enhancement layer in many scalable coding algorithms is composed of residual coding information. There is another type of information that can be transmitted instead of (or in addition to) residual coding. Since the encoder has access to the original sequence, it can utilize adaptive format conversion (AFC) to generate the enhancement layer and transmit the different format conversion methods as enhancement data. This paper investigates the use of adaptive format conversion information as enhancement data in scalable video coding. Experimental results are shown for a wide range of base layer qualities and enhancement bitrates to determine when AFC can improve video scalability. Since the parameters needed for AFC are small compared to residual coding, AFC can provide video scalability at low enhancement layer bitrates that are not possible with residual coding. In addition, AFC can also be used in addition to residual coding to improve video scalability at higher enhancement layer bitrates. Adaptive format conversion has not been studied in detail, but many scalable applications may benefit from it. An example of an application that AFC is well-suited for is the migration path for digital television where AFC can provide immediate video scalability as well as assist future migrations.
An efficient interpolation filter VLSI architecture for HEVC standard
NASA Astrophysics Data System (ADS)
Zhou, Wei; Zhou, Xin; Lian, Xiaocong; Liu, Zhenyu; Liu, Xiaoxiang
2015-12-01
The next-generation video coding standard of High-Efficiency Video Coding (HEVC) is especially efficient for coding high-resolution video such as 8K-ultra-high-definition (UHD) video. Fractional motion estimation in HEVC presents a significant challenge in clock latency and area cost as it consumes more than 40 % of the total encoding time and thus results in high computational complexity. With aims at supporting 8K-UHD video applications, an efficient interpolation filter VLSI architecture for HEVC is proposed in this paper. Firstly, a new interpolation filter algorithm based on the 8-pixel interpolation unit is proposed in this paper. It can save 19.7 % processing time on average with acceptable coding quality degradation. Based on the proposed algorithm, an efficient interpolation filter VLSI architecture, composed of a reused data path of interpolation, an efficient memory organization, and a reconfigurable pipeline interpolation filter engine, is presented to reduce the implement hardware area and achieve high throughput. The final VLSI implementation only requires 37.2k gates in a standard 90-nm CMOS technology at an operating frequency of 240 MHz. The proposed architecture can be reused for either half-pixel interpolation or quarter-pixel interpolation, which can reduce the area cost for about 131,040 bits RAM. The processing latency of our proposed VLSI architecture can support the real-time processing of 4:2:0 format 7680 × 4320@78fps video sequences.
An efficient CU partition algorithm for HEVC based on improved Sobel operator
NASA Astrophysics Data System (ADS)
Sun, Xuebin; Chen, Xiaodong; Xu, Yong; Sun, Gang; Yang, Yunsheng
2018-04-01
As the latest video coding standard, High Efficiency Video Coding (HEVC) achieves over 50% bit rate reduction with similar video quality compared with previous standards H.264/AVC. However, the higher compression efficiency is attained at the cost of significantly increasing computational load. In order to reduce the complexity, this paper proposes a fast coding unit (CU) partition technique to speed up the process. To detect the edge features of each CU, a more accurate improved Sobel filtering is developed and performed By analyzing the textural features of CU, an early CU splitting termination is proposed to decide whether a CU should be decomposed into four lower-dimensions CUs or not. Compared with the reference software HM16.7, experimental results indicate the proposed algorithm can lessen the encoding time up to 44.09% on average, with a negligible bit rate increase of 0.24%, and quality losses lower 0.03 dB, respectively. In addition, the proposed algorithm gets a better trade-off between complexity and rate-distortion among the other proposed works.
The Simple Video Coder: A free tool for efficiently coding social video data.
Barto, Daniel; Bird, Clark W; Hamilton, Derek A; Fink, Brandi C
2017-08-01
Videotaping of experimental sessions is a common practice across many disciplines of psychology, ranging from clinical therapy, to developmental science, to animal research. Audio-visual data are a rich source of information that can be easily recorded; however, analysis of the recordings presents a major obstacle to project completion. Coding behavior is time-consuming and often requires ad-hoc training of a student coder. In addition, existing software is either prohibitively expensive or cumbersome, which leaves researchers with inadequate tools to quickly process video data. We offer the Simple Video Coder-free, open-source software for behavior coding that is flexible in accommodating different experimental designs, is intuitive for students to use, and produces outcome measures of event timing, frequency, and duration. Finally, the software also offers extraction tools to splice video into coded segments suitable for training future human coders or for use as input for pattern classification algorithms.
Model-based video segmentation for vision-augmented interactive games
NASA Astrophysics Data System (ADS)
Liu, Lurng-Kuo
2000-04-01
This paper presents an architecture and algorithms for model based video object segmentation and its applications to vision augmented interactive game. We are especially interested in real time low cost vision based applications that can be implemented in software in a PC. We use different models for background and a player object. The object segmentation algorithm is performed in two different levels: pixel level and object level. At pixel level, the segmentation algorithm is formulated as a maximizing a posteriori probability (MAP) problem. The statistical likelihood of each pixel is calculated and used in the MAP problem. Object level segmentation is used to improve segmentation quality by utilizing the information about the spatial and temporal extent of the object. The concept of an active region, which is defined based on motion histogram and trajectory prediction, is introduced to indicate the possibility of a video object region for both background and foreground modeling. It also reduces the overall computation complexity. In contrast with other applications, the proposed video object segmentation system is able to create background and foreground models on the fly even without introductory background frames. Furthermore, we apply different rate of self-tuning on the scene model so that the system can adapt to the environment when there is a scene change. We applied the proposed video object segmentation algorithms to several prototype virtual interactive games. In our prototype vision augmented interactive games, a player can immerse himself/herself inside a game and can virtually interact with other animated characters in a real time manner without being constrained by helmets, gloves, special sensing devices, or background environment. The potential applications of the proposed algorithms including human computer gesture interface and object based video coding such as MPEG-4 video coding.
Fast image interpolation for motion estimation using graphics hardware
NASA Astrophysics Data System (ADS)
Kelly, Francis; Kokaram, Anil
2004-05-01
Motion estimation and compensation is the key to high quality video coding. Block matching motion estimation is used in most video codecs, including MPEG-2, MPEG-4, H.263 and H.26L. Motion estimation is also a key component in the digital restoration of archived video and for post-production and special effects in the movie industry. Sub-pixel accurate motion vectors can improve the quality of the vector field and lead to more efficient video coding. However sub-pixel accuracy requires interpolation of the image data. Image interpolation is a key requirement of many image processing algorithms. Often interpolation can be a bottleneck in these applications, especially in motion estimation due to the large number pixels involved. In this paper we propose using commodity computer graphics hardware for fast image interpolation. We use the full search block matching algorithm to illustrate the problems and limitations of using graphics hardware in this way.
A constrained joint source/channel coder design and vector quantization of nonstationary sources
NASA Technical Reports Server (NTRS)
Sayood, Khalid; Chen, Y. C.; Nori, S.; Araj, A.
1993-01-01
The emergence of broadband ISDN as the network for the future brings with it the promise of integration of all proposed services in a flexible environment. In order to achieve this flexibility, asynchronous transfer mode (ATM) has been proposed as the transfer technique. During this period a study was conducted on the bridging of network transmission performance and video coding. The successful transmission of variable bit rate video over ATM networks relies on the interaction between the video coding algorithm and the ATM networks. Two aspects of networks that determine the efficiency of video transmission are the resource allocation algorithm and the congestion control algorithm. These are explained in this report. Vector quantization (VQ) is one of the more popular compression techniques to appear in the last twenty years. Numerous compression techniques, which incorporate VQ, have been proposed. While the LBG VQ provides excellent compression, there are also several drawbacks to the use of the LBG quantizers including search complexity and memory requirements, and a mismatch between the codebook and the inputs. The latter mainly stems from the fact that the VQ is generally designed for a specific rate and a specific class of inputs. In this work, an adaptive technique is proposed for vector quantization of images and video sequences. This technique is an extension of the recursively indexed scalar quantization (RISQ) algorithm.
Investigating the structure preserving encryption of high efficiency video coding (HEVC)
NASA Astrophysics Data System (ADS)
Shahid, Zafar; Puech, William
2013-02-01
This paper presents a novel method for the real-time protection of new emerging High Efficiency Video Coding (HEVC) standard. Structure preserving selective encryption is being performed in CABAC entropy coding module of HEVC, which is significantly different from CABAC entropy coding of H.264/AVC. In CABAC of HEVC, exponential Golomb coding is replaced by truncated Rice (TR) up to a specific value for binarization of transform coefficients. Selective encryption is performed using AES cipher in cipher feedback mode on a plaintext of binstrings in a context aware manner. The encrypted bitstream has exactly the same bit-rate and is format complaint. Experimental evaluation and security analysis of the proposed algorithm is performed on several benchmark video sequences containing different combinations of motion, texture and objects.
Lee, Chaewoo
2014-01-01
The advancement in wideband wireless network supports real time services such as IPTV and live video streaming. However, because of the sharing nature of the wireless medium, efficient resource allocation has been studied to achieve a high level of acceptability and proliferation of wireless multimedia. Scalable video coding (SVC) with adaptive modulation and coding (AMC) provides an excellent solution for wireless video streaming. By assigning different modulation and coding schemes (MCSs) to video layers, SVC can provide good video quality to users in good channel conditions and also basic video quality to users in bad channel conditions. For optimal resource allocation, a key issue in applying SVC in the wireless multicast service is how to assign MCSs and the time resources to each SVC layer in the heterogeneous channel condition. We formulate this problem with integer linear programming (ILP) and provide numerical results to show the performance under 802.16 m environment. The result shows that our methodology enhances the overall system throughput compared to an existing algorithm. PMID:25276862
A novel multiple description scalable coding scheme for mobile wireless video transmission
NASA Astrophysics Data System (ADS)
Zheng, Haifeng; Yu, Lun; Chen, Chang Wen
2005-03-01
We proposed in this paper a novel multiple description scalable coding (MDSC) scheme based on in-band motion compensation temporal filtering (IBMCTF) technique in order to achieve high video coding performance and robust video transmission. The input video sequence is first split into equal-sized groups of frames (GOFs). Within a GOF, each frame is hierarchically decomposed by discrete wavelet transform. Since there is a direct relationship between wavelet coefficients and what they represent in the image content after wavelet decomposition, we are able to reorganize the spatial orientation trees to generate multiple bit-streams and employed SPIHT algorithm to achieve high coding efficiency. We have shown that multiple bit-stream transmission is very effective in combating error propagation in both Internet video streaming and mobile wireless video. Furthermore, we adopt the IBMCTF scheme to remove the redundancy for inter-frames along the temporal direction using motion compensated temporal filtering, thus high coding performance and flexible scalability can be provided in this scheme. In order to make compressed video resilient to channel error and to guarantee robust video transmission over mobile wireless channels, we add redundancy to each bit-stream and apply error concealment strategy for lost motion vectors. Unlike traditional multiple description schemes, the integration of these techniques enable us to generate more than two bit-streams that may be more appropriate for multiple antenna transmission of compressed video. Simulate results on standard video sequences have shown that the proposed scheme provides flexible tradeoff between coding efficiency and error resilience.
Video data compression using artificial neural network differential vector quantization
NASA Technical Reports Server (NTRS)
Krishnamurthy, Ashok K.; Bibyk, Steven B.; Ahalt, Stanley C.
1991-01-01
An artificial neural network vector quantizer is developed for use in data compression applications such as Digital Video. Differential Vector Quantization is used to preserve edge features, and a new adaptive algorithm, known as Frequency-Sensitive Competitive Learning, is used to develop the vector quantizer codebook. To develop real time performance, a custom Very Large Scale Integration Application Specific Integrated Circuit (VLSI ASIC) is being developed to realize the associative memory functions needed in the vector quantization algorithm. By using vector quantization, the need for Huffman coding can be eliminated, resulting in superior performance against channel bit errors than methods that use variable length codes.
Feasibility of video codec algorithms for software-only playback
NASA Astrophysics Data System (ADS)
Rodriguez, Arturo A.; Morse, Ken
1994-05-01
Software-only video codecs can provide good playback performance in desktop computers with a 486 or 68040 CPU running at 33 MHz without special hardware assistance. Typically, playback of compressed video can be categorized into three tasks: the actual decoding of the video stream, color conversion, and the transfer of decoded video data from system RAM to video RAM. By current standards, good playback performance is the decoding and display of video streams of 320 by 240 (or larger) compressed frames at 15 (or greater) frames-per- second. Software-only video codecs have evolved by modifying and tailoring existing compression methodologies to suit video playback in desktop computers. In this paper we examine the characteristics used to evaluate software-only video codec algorithms, namely: image fidelity (i.e., image quality), bandwidth (i.e., compression) ease-of-decoding (i.e., playback performance), memory consumption, compression to decompression asymmetry, scalability, and delay. We discuss the tradeoffs among these variables and the compromises that can be made to achieve low numerical complexity for software-only playback. Frame- differencing approaches are described since software-only video codecs typically employ them to enhance playback performance. To complement other papers that appear in this session of the Proceedings, we review methods derived from binary pattern image coding since these methods are amenable for software-only playback. In particular, we introduce a novel approach called pixel distribution image coding.
Performance evaluation of the intra compression in the video coding standards
NASA Astrophysics Data System (ADS)
Abramowski, Andrzej
2015-09-01
The article presents a comparison of the Intra prediction algorithms in the current state-of-the-art video coding standards, including MJPEG 2000, VP8, VP9, H.264/AVC and H.265/HEVC. The effectiveness of techniques employed by each standard is evaluated in terms of compression efficiency and average encoding time. The compression efficiency is measured using BD-PSNR and BD-RATE metrics with H.265/HEVC results as an anchor. Tests are performed on a set of video sequences, composed of sequences gathered by Joint Collaborative Team on Video Coding during the development of the H.265/HEVC standard and 4K sequences provided by Ultra Video Group. According to results, H.265/HEVC provides significant bit-rate savings at the expense of computational complexity, while VP9 may be regarded as a compromise between the efficiency and required encoding time.
Adaptive distributed source coding.
Varodayan, David; Lin, Yao-Chung; Girod, Bernd
2012-05-01
We consider distributed source coding in the presence of hidden variables that parameterize the statistical dependence among sources. We derive the Slepian-Wolf bound and devise coding algorithms for a block-candidate model of this problem. The encoder sends, in addition to syndrome bits, a portion of the source to the decoder uncoded as doping bits. The decoder uses the sum-product algorithm to simultaneously recover the source symbols and the hidden statistical dependence variables. We also develop novel techniques based on density evolution (DE) to analyze the coding algorithms. We experimentally confirm that our DE analysis closely approximates practical performance. This result allows us to efficiently optimize parameters of the algorithms. In particular, we show that the system performs close to the Slepian-Wolf bound when an appropriate doping rate is selected. We then apply our coding and analysis techniques to a reduced-reference video quality monitoring system and show a bit rate saving of about 75% compared with fixed-length coding.
NASA Astrophysics Data System (ADS)
Wang, Jianhua; Cheng, Lianglun; Wang, Tao; Peng, Xiaodong
2016-03-01
Table look-up operation plays a very important role during the decoding processing of context-based adaptive variable length decoding (CAVLD) in H.264/advanced video coding (AVC). However, frequent table look-up operation can result in big table memory access, and then lead to high table power consumption. Aiming to solve the problem of big table memory access of current methods, and then reduce high power consumption, a memory-efficient table look-up optimized algorithm is presented for CAVLD. The contribution of this paper lies that index search technology is introduced to reduce big memory access for table look-up, and then reduce high table power consumption. Specifically, in our schemes, we use index search technology to reduce memory access by reducing the searching and matching operations for code_word on the basis of taking advantage of the internal relationship among length of zero in code_prefix, value of code_suffix and code_lengh, thus saving the power consumption of table look-up. The experimental results show that our proposed table look-up algorithm based on index search can lower about 60% memory access consumption compared with table look-up by sequential search scheme, and then save much power consumption for CAVLD in H.264/AVC.
Digital CODEC for real-time processing of broadcast quality video signals at 1.8 bits/pixel
NASA Technical Reports Server (NTRS)
Shalkhauser, Mary JO; Whyte, Wayne A.
1991-01-01
Advances in very large scale integration and recent work in the field of bandwidth efficient digital modulation techniques have combined to make digital video processing technically feasible an potentially cost competitive for broadcast quality television transmission. A hardware implementation was developed for DPCM (differential pulse code midulation)-based digital television bandwidth compression algorithm which processes standard NTSC composite color television signals and produces broadcast quality video in real time at an average of 1.8 bits/pixel. The data compression algorithm and the hardware implementation of the codec are described, and performance results are provided.
Simulation and Real-Time Verification of Video Algorithms on the TI C6400 Using Simulink
2004-08-20
SPONSOR/MONITOR’S ACRONYM(S) 11. SPONSOR/MONITOR’S REPORT NUMBER(S) 12 . DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release...plot estimates over time (scrolling data) Adjust detection threshold (click mouse on graph) Monitor video capture Input video frames Captured frames 12 ...Video App: Surveillance Recording 1 2 7 3 4 9 5 6 11 SL for video Explanation of GUI 12 Target Options8 Build Process 10 13 14 15 16 M-code snippet
Evaluation of H.264 and H.265 full motion video encoding for small UAS platforms
NASA Astrophysics Data System (ADS)
McGuinness, Christopher D.; Walker, David; Taylor, Clark; Hill, Kerry; Hoffman, Marc
2016-05-01
Of all the steps in the image acquisition and formation pipeline, compression is the only process that degrades image quality. A selected compression algorithm succeeds or fails to provide sufficient quality at the requested compression rate depending on how well the algorithm is suited to the input data. Applying an algorithm designed for one type of data to a different type often results in poor compression performance. This is mostly the case when comparing the performance of H.264, designed for standard definition data, to HEVC (High Efficiency Video Coding), which the Joint Collaborative Team on Video Coding (JCT-VC) designed for high-definition data. This study focuses on evaluating how HEVC compares to H.264 when compressing data from small UAS platforms. To compare the standards directly, we assess two open-source traditional software solutions: x264 and x265. These software-only comparisons allow us to establish a baseline of how much improvement can generally be expected of HEVC over H.264. Then, specific solutions leveraging different types of hardware are selected to understand the limitations of commercial-off-the-shelf (COTS) options. Algorithmically, regardless of the implementation, HEVC is found to provide similar quality video as H.264 at 40% lower data rates for video resolutions greater than 1280x720, roughly 1 Megapixel (MPx). For resolutions less than 1MPx, H.264 is an adequate solution though a small (roughly 20%) compression boost is earned by employing HEVC. New low cost, size, weight, and power (CSWAP) HEVC implementations are being developed and will be ideal for small UAS systems.
Lossless Video Sequence Compression Using Adaptive Prediction
NASA Technical Reports Server (NTRS)
Li, Ying; Sayood, Khalid
2007-01-01
We present an adaptive lossless video compression algorithm based on predictive coding. The proposed algorithm exploits temporal, spatial, and spectral redundancies in a backward adaptive fashion with extremely low side information. The computational complexity is further reduced by using a caching strategy. We also study the relationship between the operational domain for the coder (wavelet or spatial) and the amount of temporal and spatial redundancy in the sequence being encoded. Experimental results show that the proposed scheme provides significant improvements in compression efficiencies.
A study on multiresolution lossless video coding using inter/intra frame adaptive prediction
NASA Astrophysics Data System (ADS)
Nakachi, Takayuki; Sawabe, Tomoko; Fujii, Tetsuro
2003-06-01
Lossless video coding is required in the fields of archiving and editing digital cinema or digital broadcasting contents. This paper combines a discrete wavelet transform and adaptive inter/intra-frame prediction in the wavelet transform domain to create multiresolution lossless video coding. The multiresolution structure offered by the wavelet transform facilitates interchange among several video source formats such as Super High Definition (SHD) images, HDTV, SDTV, and mobile applications. Adaptive inter/intra-frame prediction is an extension of JPEG-LS, a state-of-the-art lossless still image compression standard. Based on the image statistics of the wavelet transform domains in successive frames, inter/intra frame adaptive prediction is applied to the appropriate wavelet transform domain. This adaptation offers superior compression performance. This is achieved with low computational cost and no increase in additional information. Experiments on digital cinema test sequences confirm the effectiveness of the proposed algorithm.
Binary video codec for data reduction in wireless visual sensor networks
NASA Astrophysics Data System (ADS)
Khursheed, Khursheed; Ahmad, Naeem; Imran, Muhammad; O'Nils, Mattias
2013-02-01
Wireless Visual Sensor Networks (WVSN) is formed by deploying many Visual Sensor Nodes (VSNs) in the field. Typical applications of WVSN include environmental monitoring, health care, industrial process monitoring, stadium/airports monitoring for security reasons and many more. The energy budget in the outdoor applications of WVSN is limited to the batteries and the frequent replacement of batteries is usually not desirable. So the processing as well as the communication energy consumption of the VSN needs to be optimized in such a way that the network remains functional for longer duration. The images captured by VSN contain huge amount of data and require efficient computational resources for processing the images and wide communication bandwidth for the transmission of the results. Image processing algorithms must be designed and developed in such a way that they are computationally less complex and must provide high compression rate. For some applications of WVSN, the captured images can be segmented into bi-level images and hence bi-level image coding methods will efficiently reduce the information amount in these segmented images. But the compression rate of the bi-level image coding methods is limited by the underlined compression algorithm. Hence there is a need for designing other intelligent and efficient algorithms which are computationally less complex and provide better compression rate than that of bi-level image coding methods. Change coding is one such algorithm which is computationally less complex (require only exclusive OR operations) and provide better compression efficiency compared to image coding but it is effective for applications having slight changes between adjacent frames of the video. The detection and coding of the Region of Interest (ROIs) in the change frame efficiently reduce the information amount in the change frame. But, if the number of objects in the change frames is higher than a certain level then the compression efficiency of both the change coding and ROI coding becomes worse than that of image coding. This paper explores the compression efficiency of the Binary Video Codec (BVC) for the data reduction in WVSN. We proposed to implement all the three compression techniques i.e. image coding, change coding and ROI coding at the VSN and then select the smallest bit stream among the results of the three compression techniques. In this way the compression performance of the BVC will never become worse than that of image coding. We concluded that the compression efficiency of BVC is always better than that of change coding and is always better than or equal that of ROI coding and image coding.
Depth assisted compression of full parallax light fields
NASA Astrophysics Data System (ADS)
Graziosi, Danillo B.; Alpaslan, Zahir Y.; El-Ghoroury, Hussein S.
2015-03-01
Full parallax light field displays require high pixel density and huge amounts of data. Compression is a necessary tool used by 3D display systems to cope with the high bandwidth requirements. One of the formats adopted by MPEG for 3D video coding standards is the use of multiple views with associated depth maps. Depth maps enable the coding of a reduced number of views, and are used by compression and synthesis software to reconstruct the light field. However, most of the developed coding and synthesis tools target linearly arranged cameras with small baselines. Here we propose to use the 3D video coding format for full parallax light field coding. We introduce a view selection method inspired by plenoptic sampling followed by transform-based view coding and view synthesis prediction to code residual views. We determine the minimal requirements for view sub-sampling and present the rate-distortion performance of our proposal. We also compare our method with established video compression techniques, such as H.264/AVC, H.264/MVC, and the new 3D video coding algorithm, 3DV-ATM. Our results show that our method not only has an improved rate-distortion performance, it also preserves the structure of the perceived light fields better.
Human Motion Capture Data Tailored Transform Coding.
Junhui Hou; Lap-Pui Chau; Magnenat-Thalmann, Nadia; Ying He
2015-07-01
Human motion capture (mocap) is a widely used technique for digitalizing human movements. With growing usage, compressing mocap data has received increasing attention, since compact data size enables efficient storage and transmission. Our analysis shows that mocap data have some unique characteristics that distinguish themselves from images and videos. Therefore, directly borrowing image or video compression techniques, such as discrete cosine transform, does not work well. In this paper, we propose a novel mocap-tailored transform coding algorithm that takes advantage of these features. Our algorithm segments the input mocap sequences into clips, which are represented in 2D matrices. Then it computes a set of data-dependent orthogonal bases to transform the matrices to frequency domain, in which the transform coefficients have significantly less dependency. Finally, the compression is obtained by entropy coding of the quantized coefficients and the bases. Our method has low computational cost and can be easily extended to compress mocap databases. It also requires neither training nor complicated parameter setting. Experimental results demonstrate that the proposed scheme significantly outperforms state-of-the-art algorithms in terms of compression performance and speed.
NASA Astrophysics Data System (ADS)
Bartolini, Franco; Pasquini, Cristina; Piva, Alessandro
2001-04-01
The recent development of video compression algorithms allowed the diffusion of systems for the transmission of video sequences over data networks. However, the transmission over error prone mobile communication channels is yet an open issue. In this paper, a system developed for the real time transmission of H263 video coded sequences over TETRA mobile networks is presented. TETRA is an open digital trunked radio standard defined by the European Telecommunications Standardization Institute developed for professional mobile radio users, providing full integration of voice and data services. Experimental tests demonstrate that, in spite of the low frame rate allowed by the SW only implementation of the decoder and by the low channel rate a video compression technique such as that complying with the H263 standard, is still preferable to a simpler but less effective frame based compression system.
Comparison of H.265/HEVC encoders
NASA Astrophysics Data System (ADS)
Trochimiuk, Maciej
2016-09-01
The H.265/HEVC is the state-of-the-art video compression standard, which allows the bitrate reduction up to 50% compared with its predecessor, H.264/AVC, maintaining equal perceptual video quality. The growth in coding efficiency was achieved by increasing the number of available intra- and inter-frame prediction features and improvements in existing ones, such as entropy encoding and filtering. Nevertheless, to achieve real-time performance of the encoder, simplifications in algorithm are inevitable. Some features and coding modes shall be skipped, to reduce time needed to evaluate modes forwarded to rate-distortion optimisation. Thus, the potential acceleration of the encoding process comes at the expense of coding efficiency. In this paper, a trade-off between video quality and encoding speed of various H.265/HEVC encoders is discussed.
NASA Astrophysics Data System (ADS)
Lei, Ted Chih-Wei; Tseng, Fan-Shuo
2017-07-01
This paper addresses the problem of high-computational complexity decoding in traditional Wyner-Ziv video coding (WZVC). The key focus is the migration of two traditionally high-computationally complex encoder algorithms, namely motion estimation and mode decision. In order to reduce the computational burden in this process, the proposed architecture adopts the partial boundary matching algorithm and four flexible types of block mode decision at the decoder. This approach does away with the need for motion estimation and mode decision at the encoder. The experimental results show that the proposed padding block-based WZVC not only decreases decoder complexity to approximately one hundredth that of the state-of-the-art DISCOVER decoding but also outperforms DISCOVER codec by up to 3 to 4 dB.
Temporal compressive imaging for video
NASA Astrophysics Data System (ADS)
Zhou, Qun; Zhang, Linxia; Ke, Jun
2018-01-01
In many situations, imagers are required to have higher imaging speed, such as gunpowder blasting analysis and observing high-speed biology phenomena. However, measuring high-speed video is a challenge to camera design, especially, in infrared spectrum. In this paper, we reconstruct a high-frame-rate video from compressive video measurements using temporal compressive imaging (TCI) with a temporal compression ratio T=8. This means that, 8 unique high-speed temporal frames will be obtained from a single compressive frame using a reconstruction algorithm. Equivalently, the video frame rates is increased by 8 times. Two methods, two-step iterative shrinkage/threshold (TwIST) algorithm and the Gaussian mixture model (GMM) method, are used for reconstruction. To reduce reconstruction time and memory usage, each frame of size 256×256 is divided into patches of size 8×8. The influence of different coded mask to reconstruction is discussed. The reconstruction qualities using TwIST and GMM are also compared.
Novel Integration of Frame Rate Up Conversion and HEVC Coding Based on Rate-Distortion Optimization.
Guo Lu; Xiaoyun Zhang; Li Chen; Zhiyong Gao
2018-02-01
Frame rate up conversion (FRUC) can improve the visual quality by interpolating new intermediate frames. However, high frame rate videos by FRUC are confronted with more bitrate consumption or annoying artifacts of interpolated frames. In this paper, a novel integration framework of FRUC and high efficiency video coding (HEVC) is proposed based on rate-distortion optimization, and the interpolated frames can be reconstructed at encoder side with low bitrate cost and high visual quality. First, joint motion estimation (JME) algorithm is proposed to obtain robust motion vectors, which are shared between FRUC and video coding. What's more, JME is embedded into the coding loop and employs the original motion search strategy in HEVC coding. Then, the frame interpolation is formulated as a rate-distortion optimization problem, where both the coding bitrate consumption and visual quality are taken into account. Due to the absence of original frames, the distortion model for interpolated frames is established according to the motion vector reliability and coding quantization error. Experimental results demonstrate that the proposed framework can achieve 21% ~ 42% reduction in BDBR, when compared with the traditional methods of FRUC cascaded with coding.
Comparison of compression efficiency between HEVC/H.265 and VP9 based on subjective assessments
NASA Astrophysics Data System (ADS)
Řeřábek, Martin; Ebrahimi, Touradj
2014-09-01
Current increasing effort of broadcast providers to transmit UHD (Ultra High Definition) content is likely to increase demand for ultra high definition televisions (UHDTVs). To compress UHDTV content, several alternative encoding mechanisms exist. In addition to internationally recognized standards, open access proprietary options, such as VP9 video encoding scheme, have recently appeared and are gaining popularity. One of the main goals of these encoders is to efficiently compress video sequences beyond HDTV resolution for various scenarios, such as broadcasting or internet streaming. In this paper, a broadcast scenario rate-distortion performance analysis and mutual comparison of one of the latest video coding standards H.265/HEVC with recently released proprietary video coding scheme VP9 is presented. Also, currently one of the most popular and widely spread encoder H.264/AVC has been included into the evaluation to serve as a comparison baseline. The comparison is performed by means of subjective evaluations showing actual differences between encoding algorithms in terms of perceived quality. The results indicate a general dominance of HEVC based encoding algorithm in comparison to other alternatives, while VP9 and AVC showing similar performance.
Locally adaptive vector quantization: Data compression with feature preservation
NASA Technical Reports Server (NTRS)
Cheung, K. M.; Sayano, M.
1992-01-01
A study of a locally adaptive vector quantization (LAVQ) algorithm for data compression is presented. This algorithm provides high-speed one-pass compression and is fully adaptable to any data source and does not require a priori knowledge of the source statistics. Therefore, LAVQ is a universal data compression algorithm. The basic algorithm and several modifications to improve performance are discussed. These modifications are nonlinear quantization, coarse quantization of the codebook, and lossless compression of the output. Performance of LAVQ on various images using irreversible (lossy) coding is comparable to that of the Linde-Buzo-Gray algorithm, but LAVQ has a much higher speed; thus this algorithm has potential for real-time video compression. Unlike most other image compression algorithms, LAVQ preserves fine detail in images. LAVQ's performance as a lossless data compression algorithm is comparable to that of Lempel-Ziv-based algorithms, but LAVQ uses far less memory during the coding process.
Real-time demonstration hardware for enhanced DPCM video compression algorithm
NASA Technical Reports Server (NTRS)
Bizon, Thomas P.; Whyte, Wayne A., Jr.; Marcopoli, Vincent R.
1992-01-01
The lack of available wideband digital links as well as the complexity of implementation of bandwidth efficient digital video CODECs (encoder/decoder) has worked to keep the cost of digital television transmission too high to compete with analog methods. Terrestrial and satellite video service providers, however, are now recognizing the potential gains that digital video compression offers and are proposing to incorporate compression systems to increase the number of available program channels. NASA is similarly recognizing the benefits of and trend toward digital video compression techniques for transmission of high quality video from space and therefore, has developed a digital television bandwidth compression algorithm to process standard National Television Systems Committee (NTSC) composite color television signals. The algorithm is based on differential pulse code modulation (DPCM), but additionally utilizes a non-adaptive predictor, non-uniform quantizer and multilevel Huffman coder to reduce the data rate substantially below that achievable with straight DPCM. The non-adaptive predictor and multilevel Huffman coder combine to set this technique apart from other DPCM encoding algorithms. All processing is done on a intra-field basis to prevent motion degradation and minimize hardware complexity. Computer simulations have shown the algorithm will produce broadcast quality reconstructed video at an average transmission rate of 1.8 bits/pixel. Hardware implementation of the DPCM circuit, non-adaptive predictor and non-uniform quantizer has been completed, providing realtime demonstration of the image quality at full video rates. Video sampling/reconstruction circuits have also been constructed to accomplish the analog video processing necessary for the real-time demonstration. Performance results for the completed hardware compare favorably with simulation results. Hardware implementation of the multilevel Huffman encoder/decoder is currently under development along with implementation of a buffer control algorithm to accommodate the variable data rate output of the multilevel Huffman encoder. A video CODEC of this type could be used to compress NTSC color television signals where high quality reconstruction is desirable (e.g., Space Station video transmission, transmission direct-to-the-home via direct broadcast satellite systems or cable television distribution to system headends and direct-to-the-home).
Spatial Pyramid Covariance based Compact Video Code for Robust Face Retrieval in TV-series.
Li, Yan; Wang, Ruiping; Cui, Zhen; Shan, Shiguang; Chen, Xilin
2016-10-10
We address the problem of face video retrieval in TV-series which searches video clips based on the presence of specific character, given one face track of his/her. This is tremendously challenging because on one hand, faces in TV-series are captured in largely uncontrolled conditions with complex appearance variations, and on the other hand retrieval task typically needs efficient representation with low time and space complexity. To handle this problem, we propose a compact and discriminative representation for the huge body of video data, named Compact Video Code (CVC). Our method first models the face track by its sample (i.e., frame) covariance matrix to capture the video data variations in a statistical manner. To incorporate discriminative information and obtain more compact video signature suitable for retrieval, the high-dimensional covariance representation is further encoded as a much lower-dimensional binary vector, which finally yields the proposed CVC. Specifically, each bit of the code, i.e., each dimension of the binary vector, is produced via supervised learning in a max margin framework, which aims to make a balance between the discriminability and stability of the code. Besides, we further extend the descriptive granularity of covariance matrix from traditional pixel-level to more general patchlevel, and proceed to propose a novel hierarchical video representation named Spatial Pyramid Covariance (SPC) along with a fast calculation method. Face retrieval experiments on two challenging TV-series video databases, i.e., the Big Bang Theory and Prison Break, demonstrate the competitiveness of the proposed CVC over state-of-the-art retrieval methods. In addition, as a general video matching algorithm, CVC is also evaluated in traditional video face recognition task on a standard Internet database, i.e., YouTube Celebrities, showing its quite promising performance by using an extremely compact code with only 128 bits.
Study of efficient video compression algorithms for space shuttle applications
NASA Technical Reports Server (NTRS)
Poo, Z.
1975-01-01
Results are presented of a study on video data compression techniques applicable to space flight communication. This study is directed towards monochrome (black and white) picture communication with special emphasis on feasibility of hardware implementation. The primary factors for such a communication system in space flight application are: picture quality, system reliability, power comsumption, and hardware weight. In terms of hardware implementation, these are directly related to hardware complexity, effectiveness of the hardware algorithm, immunity of the source code to channel noise, and data transmission rate (or transmission bandwidth). A system is recommended, and its hardware requirement summarized. Simulations of the study were performed on the improved LIM video controller which is computer-controlled by the META-4 CPU.
Region-of-interest determination and bit-rate conversion for H.264 video transcoding
NASA Astrophysics Data System (ADS)
Huang, Shu-Fen; Chen, Mei-Juan; Tai, Kuang-Han; Li, Mian-Shiuan
2013-12-01
This paper presents a video bit-rate transcoder for baseline profile in H.264/AVC standard to fit the available channel bandwidth for the client when transmitting video bit-streams via communication channels. To maintain visual quality for low bit-rate video efficiently, this study analyzes the decoded information in the transcoder and proposes a Bayesian theorem-based region-of-interest (ROI) determination algorithm. In addition, a curve fitting scheme is employed to find the models of video bit-rate conversion. The transcoded video will conform to the target bit-rate by re-quantization according to our proposed models. After integrating the ROI detection method and the bit-rate transcoding models, the ROI-based transcoder allocates more coding bits to ROI regions and reduces the complexity of the re-encoding procedure for non-ROI regions. Hence, it not only keeps the coding quality but improves the efficiency of the video transcoding for low target bit-rates and makes the real-time transcoding more practical. Experimental results show that the proposed framework gets significantly better visual quality.
2D-pattern matching image and video compression: theory, algorithms, and experiments.
Alzina, Marc; Szpankowski, Wojciech; Grama, Ananth
2002-01-01
In this paper, we propose a lossy data compression framework based on an approximate two-dimensional (2D) pattern matching (2D-PMC) extension of the Lempel-Ziv (1977, 1978) lossless scheme. This framework forms the basis upon which higher level schemes relying on differential coding, frequency domain techniques, prediction, and other methods can be built. We apply our pattern matching framework to image and video compression and report on theoretical and experimental results. Theoretically, we show that the fixed database model used for video compression leads to suboptimal but computationally efficient performance. The compression ratio of this model is shown to tend to the generalized entropy. For image compression, we use a growing database model for which we provide an approximate analysis. The implementation of 2D-PMC is a challenging problem from the algorithmic point of view. We use a range of techniques and data structures such as k-d trees, generalized run length coding, adaptive arithmetic coding, and variable and adaptive maximum distortion level to achieve good compression ratios at high compression speeds. We demonstrate bit rates in the range of 0.25-0.5 bpp for high-quality images and data rates in the range of 0.15-0.5 Mbps for a baseline video compression scheme that does not use any prediction or interpolation. We also demonstrate that this asymmetric compression scheme is capable of extremely fast decompression making it particularly suitable for networked multimedia applications.
Sparse/DCT (S/DCT) two-layered representation of prediction residuals for video coding.
Kang, Je-Won; Gabbouj, Moncef; Kuo, C-C Jay
2013-07-01
In this paper, we propose a cascaded sparse/DCT (S/DCT) two-layer representation of prediction residuals, and implement this idea on top of the state-of-the-art high efficiency video coding (HEVC) standard. First, a dictionary is adaptively trained to contain featured patterns of residual signals so that a high portion of energy in a structured residual can be efficiently coded via sparse coding. It is observed that the sparse representation alone is less effective in the R-D performance due to the side information overhead at higher bit rates. To overcome this problem, the DCT representation is cascaded at the second stage. It is applied to the remaining signal to improve coding efficiency. The two representations successfully complement each other. It is demonstrated by experimental results that the proposed algorithm outperforms the HEVC reference codec HM5.0 in the Common Test Condition.
Wide-Range Motion Estimation Architecture with Dual Search Windows for High Resolution Video Coding
NASA Astrophysics Data System (ADS)
Dung, Lan-Rong; Lin, Meng-Chun
This paper presents a memory-efficient motion estimation (ME) technique for high-resolution video compression. The main objective is to reduce the external memory access, especially for limited local memory resource. The reduction of memory access can successfully save the notorious power consumption. The key to reduce the memory accesses is based on center-biased algorithm in that the center-biased algorithm performs the motion vector (MV) searching with the minimum search data. While considering the data reusability, the proposed dual-search-windowing (DSW) approaches use the secondary windowing as an option per searching necessity. By doing so, the loading of search windows can be alleviated and hence reduce the required external memory bandwidth. The proposed techniques can save up to 81% of external memory bandwidth and require only 135 MBytes/sec, while the quality degradation is less than 0.2dB for 720p HDTV clips coded at 8Mbits/sec.
Using game theory for perceptual tuned rate control algorithm in video coding
NASA Astrophysics Data System (ADS)
Luo, Jiancong; Ahmad, Ishfaq
2005-03-01
This paper proposes a game theoretical rate control technique for video compression. Using a cooperative gaming approach, which has been utilized in several branches of natural and social sciences because of its enormous potential for solving constrained optimization problems, we propose a dual-level scheme to optimize the perceptual quality while guaranteeing "fairness" in bit allocation among macroblocks. At the frame level, the algorithm allocates target bits to frames based on their coding complexity. At the macroblock level, the algorithm distributes bits to macroblocks by defining a bargaining game. Macroblocks play cooperatively to compete for shares of resources (bits) to optimize their quantization scales while considering the Human Visual System"s perceptual property. Since the whole frame is an entity perceived by viewers, macroblocks compete cooperatively under a global objective of achieving the best quality with the given bit constraint. The major advantage of the proposed approach is that the cooperative game leads to an optimal and fair bit allocation strategy based on the Nash Bargaining Solution. Another advantage is that it allows multi-objective optimization with multiple decision makers (macroblocks). The simulation results testify the algorithm"s ability to achieve accurate bit rate with good perceptual quality, and to maintain a stable buffer level.
Dikbas, Salih; Altunbasak, Yucel
2013-08-01
In this paper, a new low-complexity true-motion estimation (TME) algorithm is proposed for video processing applications, such as motion-compensated temporal frame interpolation (MCTFI) or motion-compensated frame rate up-conversion (MCFRUC). Regular motion estimation, which is often used in video coding, aims to find the motion vectors (MVs) to reduce the temporal redundancy, whereas TME aims to track the projected object motion as closely as possible. TME is obtained by imposing implicit and/or explicit smoothness constraints on the block-matching algorithm. To produce better quality-interpolated frames, the dense motion field at interpolation time is obtained for both forward and backward MVs; then, bidirectional motion compensation using forward and backward MVs is applied by mixing both elegantly. Finally, the performance of the proposed algorithm for MCTFI is demonstrated against recently proposed methods and smoothness constraint optical flow employed by a professional video production suite. Experimental results show that the quality of the interpolated frames using the proposed method is better when compared with the MCFRUC techniques.
Reference View Selection in DIBR-Based Multiview Coding.
Maugey, Thomas; Petrazzuoli, Giovanni; Frossard, Pascal; Cagnazzo, Marco; Pesquet-Popescu, Beatrice
2016-04-01
Augmented reality, interactive navigation in 3D scenes, multiview video, and other emerging multimedia applications require large sets of images, hence larger data volumes and increased resources compared with traditional video services. The significant increase in the number of images in multiview systems leads to new challenging problems in data representation and data transmission to provide high quality of experience on resource-constrained environments. In order to reduce the size of the data, different multiview video compression strategies have been proposed recently. Most of them use the concept of reference or key views that are used to estimate other images when there is high correlation in the data set. In such coding schemes, the two following questions become fundamental: 1) how many reference views have to be chosen for keeping a good reconstruction quality under coding cost constraints? And 2) where to place these key views in the multiview data set? As these questions are largely overlooked in the literature, we study the reference view selection problem and propose an algorithm for the optimal selection of reference views in multiview coding systems. Based on a novel metric that measures the similarity between the views, we formulate an optimization problem for the positioning of the reference views, such that both the distortion of the view reconstruction and the coding rate cost are minimized. We solve this new problem with a shortest path algorithm that determines both the optimal number of reference views and their positions in the image set. We experimentally validate our solution in a practical multiview distributed coding system and in the standardized 3D-HEVC multiview coding scheme. We show that considering the 3D scene geometry in the reference view, positioning problem brings significant rate-distortion improvements and outperforms the traditional coding strategy that simply selects key frames based on the distance between cameras.
Multi-pass encoding of hyperspectral imagery with spectral quality control
NASA Astrophysics Data System (ADS)
Wasson, Steven; Walker, William
2015-05-01
Multi-pass encoding is a technique employed in the field of video compression that maximizes the quality of an encoded video sequence within the constraints of a specified bit rate. This paper presents research where multi-pass encoding is extended to the field of hyperspectral image compression. Unlike video, which is primarily intended to be viewed by a human observer, hyperspectral imagery is processed by computational algorithms that generally attempt to classify the pixel spectra within the imagery. As such, these algorithms are more sensitive to distortion in the spectral dimension of the image than they are to perceptual distortion in the spatial dimension. The compression algorithm developed for this research, which uses the Karhunen-Loeve transform for spectral decorrelation followed by a modified H.264/Advanced Video Coding (AVC) encoder, maintains a user-specified spectral quality level while maximizing the compression ratio throughout the encoding process. The compression performance may be considered near-lossless in certain scenarios. For qualitative purposes, this paper presents the performance of the compression algorithm for several Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and Hyperion datasets using spectral angle as the spectral quality assessment function. Specifically, the compression performance is illustrated in the form of rate-distortion curves that plot spectral angle versus bits per pixel per band (bpppb).
NASA Astrophysics Data System (ADS)
Han, Yishi; Luo, Zhixiao; Wang, Jianhua; Min, Zhixuan; Qin, Xinyu; Sun, Yunlong
2014-09-01
In general, context-based adaptive variable length coding (CAVLC) decoding in H.264/AVC standard requires frequent access to the unstructured variable length coding tables (VLCTs) and significant memory accesses are consumed. Heavy memory accesses will cause high power consumption and time delays, which are serious problems for applications in portable multimedia devices. We propose a method for high-efficiency CAVLC decoding by using a program instead of all the VLCTs. The decoded codeword from VLCTs can be obtained without any table look-up and memory access. The experimental results show that the proposed algorithm achieves 100% memory access saving and 40% decoding time saving without degrading video quality. Additionally, the proposed algorithm shows a better performance compared with conventional CAVLC decoding, such as table look-up by sequential search, table look-up by binary search, Moon's method, and Kim's method.
A MPEG-4 encoder based on TMS320C6416
NASA Astrophysics Data System (ADS)
Li, Gui-ju; Liu, Wei-ning
2013-08-01
Engineering and products need to achieve real-time video encoding by DSP, but the high computational complexity and huge amount of data requires that system has high data throughput. In this paper, a real-time MPEG-4 video encoder is designed based on TMS320C6416 platform. The kernel is the DSP of TMS320C6416T and FPGA chip f as the organization and management of video data. In order to control the flow of input and output data. Encoded stream is output using the synchronous serial port. The system has the clock frequency of 1GHz and has up to 8000 MIPS speed processing capacity when running at full speed. Due to the low coding efficiency of MPEG-4 video encoder transferred directly to DSP platform, it is needed to improve the program structure, data structures and algorithms combined with TMS320C6416T characteristics. First: Design the image storage architecture by balancing the calculation spending, storage space cost and EDMA read time factors. Open up a more buffer in memory, each buffer cache 16 lines of video data to be encoded, reconstruction image and reference image including search range. By using the variable alignment mode of the DSP, modifying the definition of structure variables and change the look-up table which occupy larger space with a direct calculation array to save memory space. After the program structure optimization, the program code, all variables, buffering buffers and the interpolation image including the search range can be placed in memory. Then, as to the time-consuming process modules and some functions which are called many times, the corresponding modules are written in parallel assembly language of TMS320C6416T which can increase the running speed. Besides, the motion estimation algorithm is improved by using a cross-hexagon search algorithm, The search speed can be increased obviously. Finally, the execution time, signal-to-noise ratio and compression ratio of a real-time image acquisition sequence is given. The experimental results show that the designed encoder in this paper can accomplish real-time encoding of a 768× 576, 25 frames per second grayscale video. The code rate is 1.5M bits per second.
NASA Astrophysics Data System (ADS)
Various papers on communications for the information age are presented. Among the general topics considered are: telematic services and terminals, satellite communications, telecommunications mangaement network, control of integrated broadband networks, advances in digital radio systems, the intelligent network, broadband networks and services deployment, future switch architectures, performance analysis of computer networks, advances in spread spectrum, optical high-speed LANs, and broadband switching and networks. Also addressed are: multiple access protocols, video coding techniques, modulation and coding, photonic switching, SONET terminals and applications, standards for video coding, digital switching, progress in MANs, mobile and portable radio, software design for improved maintainability, multipath propagation and advanced countermeasure, data communication, network control and management, fiber in the loop, network algorithm and protocols, and advances in computer communications.
Pattern-based integer sample motion search strategies in the context of HEVC
NASA Astrophysics Data System (ADS)
Maier, Georg; Bross, Benjamin; Grois, Dan; Marpe, Detlev; Schwarz, Heiko; Veltkamp, Remco C.; Wiegand, Thomas
2015-09-01
The H.265/MPEG-H High Efficiency Video Coding (HEVC) standard provides a significant increase in coding efficiency compared to its predecessor, the H.264/MPEG-4 Advanced Video Coding (AVC) standard, which however comes at the cost of a high computational burden for a compliant encoder. Motion estimation (ME), which is a part of the inter-picture prediction process, typically consumes a high amount of computational resources, while significantly increasing the coding efficiency. In spite of the fact that both H.265/MPEG-H HEVC and H.264/MPEG-4 AVC standards allow processing motion information on a fractional sample level, the motion search algorithms based on the integer sample level remain to be an integral part of ME. In this paper, a flexible integer sample ME framework is proposed, thereby allowing to trade off significant reduction of ME computation time versus coding efficiency penalty in terms of bit rate overhead. As a result, through extensive experimentation, an integer sample ME algorithm that provides a good trade-off is derived, incorporating a combination and optimization of known predictive, pattern-based and early termination techniques. The proposed ME framework is implemented on a basis of the HEVC Test Model (HM) reference software, further being compared to the state-of-the-art fast search algorithm, which is a native part of HM. It is observed that for high resolution sequences, the integer sample ME process can be speed-up by factors varying from 3.2 to 7.6, resulting in the bit-rate overhead of 1.5% and 0.6% for Random Access (RA) and Low Delay P (LDP) configurations, respectively. In addition, the similar speed-up is observed for sequences with mainly Computer-Generated Imagery (CGI) content while trading off the bit rate overhead of up to 5.2%.
Implementation of MPEG-2 encoder to multiprocessor system using multiple MVPs (TMS320C80)
NASA Astrophysics Data System (ADS)
Kim, HyungSun; Boo, Kenny; Chung, SeokWoo; Choi, Geon Y.; Lee, YongJin; Jeon, JaeHo; Park, Hyun Wook
1997-05-01
This paper presents the efficient algorithm mapping for the real-time MPEG-2 encoding on the KAIST image computing system (KICS), which has a parallel architecture using five multimedia video processors (MVPs). The MVP is a general purpose digital signal processor (DSP) of Texas Instrument. It combines one floating-point processor and four fixed- point DSPs on a single chip. The KICS uses the MVP as a primary processing element (PE). Two PEs form a cluster, and there are two processing clusters in the KICS. Real-time MPEG-2 encoder is implemented through the spatial and the functional partitioning strategies. Encoding process of spatially partitioned half of the video input frame is assigned to ne processing cluster. Two PEs perform the functionally partitioned MPEG-2 encoding tasks in the pipelined operation mode. One PE of a cluster carries out the transform coding part and the other performs the predictive coding part of the MPEG-2 encoding algorithm. One MVP among five MVPs is used for system control and interface with host computer. This paper introduces an implementation of the MPEG-2 algorithm with a parallel processing architecture.
NASA Astrophysics Data System (ADS)
Rodríguez-Sánchez, Rafael; Martínez, José Luis; Cock, Jan De; Fernández-Escribano, Gerardo; Pieters, Bart; Sánchez, José L.; Claver, José M.; de Walle, Rik Van
2013-12-01
The H.264/AVC video coding standard introduces some improved tools in order to increase compression efficiency. Moreover, the multi-view extension of H.264/AVC, called H.264/MVC, adopts many of them. Among the new features, variable block-size motion estimation is one which contributes to high coding efficiency. Furthermore, it defines a different prediction structure that includes hierarchical bidirectional pictures, outperforming traditional Group of Pictures patterns in both scenarios: single-view and multi-view. However, these video coding techniques have high computational complexity. Several techniques have been proposed in the literature over the last few years which are aimed at accelerating the inter prediction process, but there are no works focusing on bidirectional prediction or hierarchical prediction. In this article, with the emergence of many-core processors or accelerators, a step forward is taken towards an implementation of an H.264/AVC and H.264/MVC inter prediction algorithm on a graphics processing unit. The results show a negligible rate distortion drop with a time reduction of up to 98% for the complete H.264/AVC encoder.
Watermarking textures in video games
NASA Astrophysics Data System (ADS)
Liu, Huajian; Berchtold, Waldemar; Schäfer, Marcel; Lieb, Patrick; Steinebach, Martin
2014-02-01
Digital watermarking is a promising solution to video game piracy. In this paper, based on the analysis of special challenges and requirements in terms of watermarking textures in video games, a novel watermarking scheme for DDS textures in video games is proposed. To meet the performance requirements in video game applications, the proposed algorithm embeds the watermark message directly in the compressed stream in DDS files and can be straightforwardly applied in watermark container technique for real-time embedding. Furthermore, the embedding approach achieves high watermark payload to handle collusion secure fingerprinting codes with extreme length. Hence, the scheme is resistant to collusion attacks, which is indispensable in video game applications. The proposed scheme is evaluated in aspects of transparency, robustness, security and performance. Especially, in addition to classical objective evaluation, the visual quality and playing experience of watermarked games is assessed subjectively in game playing.
NASA Astrophysics Data System (ADS)
González, Diego; Botella, Guillermo; García, Carlos; Prieto, Manuel; Tirado, Francisco
2013-12-01
This contribution focuses on the optimization of matching-based motion estimation algorithms widely used for video coding standards using an Altera custom instruction-based paradigm and a combination of synchronous dynamic random access memory (SDRAM) with on-chip memory in Nios II processors. A complete profile of the algorithms is achieved before the optimization, which locates code leaks, and afterward, creates a custom instruction set, which is then added to the specific design, enhancing the original system. As well, every possible memory combination between on-chip memory and SDRAM has been tested to achieve the best performance. The final throughput of the complete designs are shown. This manuscript outlines a low-cost system, mapped using very large scale integration technology, which accelerates software algorithms by converting them into custom hardware logic blocks and showing the best combination between on-chip memory and SDRAM for the Nios II processor.
Madan, Christopher R
2014-01-01
When studying animal behaviour within an open environment, movement-related data are often important for behavioural analyses. Therefore, simple and efficient techniques are needed to present and analyze the data of such movements. However, it is challenging to present both spatial and temporal information of movements within a two-dimensional image representation. To address this challenge, we developed the spectral time-lapse (STL) algorithm that re-codes an animal’s position at every time point with a time-specific color, and overlays it with a reference frame of the video, to produce a summary image. We additionally incorporated automated motion tracking, such that the animal’s position can be extracted and summary statistics such as path length and duration can be calculated, as well as instantaneous velocity and acceleration. Here we describe the STL algorithm and offer a freely available MATLAB toolbox that implements the algorithm and allows for a large degree of end-user control and flexibility. PMID:25580219
Constellation labeling optimization for bit-interleaved coded APSK
NASA Astrophysics Data System (ADS)
Xiang, Xingyu; Mo, Zijian; Wang, Zhonghai; Pham, Khanh; Blasch, Erik; Chen, Genshe
2016-05-01
This paper investigates the constellation and mapping optimization for amplitude phase shift keying (APSK) modulation, which is deployed in Digital Video Broadcasting Satellite - Second Generation (DVB-S2) and Digital Video Broadcasting - Satellite services to Handhelds (DVB-SH) broadcasting standards due to its merits of power and spectral efficiency together with the robustness against nonlinear distortion. The mapping optimization is performed for 32-APSK according to combined cost functions related to Euclidean distance and mutual information. A Binary switching algorithm and its modified version are used to minimize the cost function and the estimated error between the original and received data. The optimized constellation mapping is tested by combining DVB-S2 standard Low-Density Parity-Check (LDPC) codes in both Bit-Interleaved Coded Modulation (BICM) and BICM with iterative decoding (BICM-ID) systems. The simulated results validate the proposed constellation labeling optimization scheme which yields better performance against conventional 32-APSK constellation defined in DVB-S2 standard.
NASA Astrophysics Data System (ADS)
Park, Nam In; Kim, Seon Man; Kim, Hong Kook; Kim, Ji Woon; Kim, Myeong Bo; Yun, Su Won
In this paper, we propose a video-zoom driven audio-zoom algorithm in order to provide audio zooming effects in accordance with the degree of video-zoom. The proposed algorithm is designed based on a super-directive beamformer operating with a 4-channel microphone system, in conjunction with a soft masking process that considers the phase differences between microphones. Thus, the audio-zoom processed signal is obtained by multiplying an audio gain derived from a video-zoom level by the masked signal. After all, a real-time audio-zoom system is implemented on an ARM-CORETEX-A8 having a clock speed of 600 MHz after different levels of optimization are performed such as algorithmic level, C-code, and memory optimizations. To evaluate the complexity of the proposed real-time audio-zoom system, test data whose length is 21.3 seconds long is sampled at 48 kHz. As a result, it is shown from the experiments that the processing time for the proposed audio-zoom system occupies 14.6% or less of the ARM clock cycles. It is also shown from the experimental results performed in a semi-anechoic chamber that the signal with the front direction can be amplified by approximately 10 dB compared to the other directions.
Improving Video Based Heart Rate Monitoring.
Lin, Jian; Rozado, David; Duenser, Andreas
2015-01-01
Non-contact measurements of cardiac pulse can provide robust measurement of heart rate (HR) without the annoyance of attaching electrodes to the body. In this paper we explore a novel and reliable method to carry out video-based HR estimation and propose various performance improvement over existing approaches. The investigated method uses Independent Component Analysis (ICA) to detect the underlying HR signal from a mixed source signal present in the RGB channels of the image. The original ICA algorithm was implemented and several modifications were explored in order to determine which one could be optimal for accurate HR estimation. Using statistical analysis, we compared the cardiac pulse rate estimation from the different methods under comparison on the extracted videos to a commercially available oximeter. We found that some of these methods are quite effective and efficient in terms of improving accuracy and latency of the system. We have made the code of our algorithms openly available to the scientific community so that other researchers can explore how to integrate video-based HR monitoring in novel health technology applications. We conclude by noting that recent advances in video-based HR monitoring permit computers to be aware of a user's psychophysiological status in real time.
Novel memory architecture for video signal processor
NASA Astrophysics Data System (ADS)
Hung, Jen-Sheng; Lin, Chia-Hsing; Jen, Chein-Wei
1993-11-01
An on-chip memory architecture for video signal processor (VSP) is proposed. This memory structure is a two-level design for the different data locality in video applications. The upper level--Memory A provides enough storage capacity to reduce the impact on the limitation of chip I/O bandwidth, and the lower level--Memory B provides enough data parallelism and flexibility to meet the requirements of multiple reconfigurable pipeline function units in a single VSP chip. The needed memory size is decided by the memory usage analysis for video algorithms and the number of function units. Both levels of memory adopted a dual-port memory scheme to sustain the simultaneous read and write operations. Especially, Memory B uses multiple one-read-one-write memory banks to emulate the real multiport memory. Therefore, one can change the configuration of Memory B to several sets of memories with variable read/write ports by adjusting the bus switches. Then the numbers of read ports and write ports in proposed memory can meet requirement of data flow patterns in different video coding algorithms. We have finished the design of a prototype memory design using 1.2- micrometers SPDM SRAM technology and will fabricated it through TSMC, in Taiwan.
Efficient burst image compression using H.265/HEVC
NASA Astrophysics Data System (ADS)
Roodaki-Lavasani, Hoda; Lainema, Jani
2014-02-01
New imaging use cases are emerging as more powerful camera hardware is entering consumer markets. One family of such use cases is based on capturing multiple pictures instead of just one when taking a photograph. That kind of a camera operation allows e.g. selecting the most successful shot from a sequence of images, showing what happened right before or after the shot was taken or combining the shots by computational means to improve either visible characteristics of the picture (such as dynamic range or focus) or the artistic aspects of the photo (e.g. by superimposing pictures on top of each other). Considering that photographic images are typically of high resolution and quality and the fact that these kind of image bursts can consist of at least tens of individual pictures, an efficient compression algorithm is desired. However, traditional video coding approaches fail to provide the random access properties these use cases require to achieve near-instantaneous access to the pictures in the coded sequence. That feature is critical to allow users to browse the pictures in an arbitrary order or imaging algorithms to extract desired pictures from the sequence quickly. This paper proposes coding structures that provide such random access properties while achieving coding efficiency superior to existing image coders. The results indicate that using HEVC video codec with a single reference picture fixed for the whole sequence can achieve nearly as good compression as traditional IPPP coding structures. It is also shown that the selection of the reference frame can further improve the coding efficiency.
ISO-IEC MPEG-2 software video codec
NASA Astrophysics Data System (ADS)
Eckart, Stefan; Fogg, Chad E.
1995-04-01
Part 5 of the International Standard ISO/IEC 13818 `Generic Coding of Moving Pictures and Associated Audio' (MPEG-2) is a Technical Report, a sample software implementation of the procedures in parts 1, 2 and 3 of the standard (systems, video, and audio). This paper focuses on the video software, which gives an example of a fully compliant implementation of the standard and of a good video quality encoder, and serves as a tool for compliance testing. The implementation and some of the development aspects of the codec are described. The encoder is based on Test Model 5 (TM5), one of the best, published, non-proprietary coding models, which was used during MPEG-2 collaborative stage to evaluate proposed algorithms and to verify the syntax. The most important part of the Test Model is controlling the quantization parameter based on the image content and bit rate constraints under both signal-to-noise and psycho-optical aspects. The decoder has been successfully tested for compliance with the MPEG-2 standard, using the ISO/IEC MPEG verification and compliance bitstream test suites as stimuli.
Fast bi-directional prediction selection in H.264/MPEG-4 AVC temporal scalable video coding.
Lin, Hung-Chih; Hang, Hsueh-Ming; Peng, Wen-Hsiao
2011-12-01
In this paper, we propose a fast algorithm that efficiently selects the temporal prediction type for the dyadic hierarchical-B prediction structure in the H.264/MPEG-4 temporal scalable video coding (SVC). We make use of the strong correlations in prediction type inheritance to eliminate the superfluous computations for the bi-directional (BI) prediction in the finer partitions, 16×8/8×16/8×8 , by referring to the best temporal prediction type of 16 × 16. In addition, we carefully examine the relationship in motion bit-rate costs and distortions between the BI and the uni-directional temporal prediction types. As a result, we construct a set of adaptive thresholds to remove the unnecessary BI calculations. Moreover, for the block partitions smaller than 8 × 8, either the forward prediction (FW) or the backward prediction (BW) is skipped based upon the information of their 8 × 8 partitions. Hence, the proposed schemes can efficiently reduce the extensive computational burden in calculating the BI prediction. As compared to the JSVM 9.11 software, our method saves the encoding time from 48% to 67% for a large variety of test videos over a wide range of coding bit-rates and has only a minor coding performance loss. © 2011 IEEE
Impact of packet losses in scalable 3D holoscopic video coding
NASA Astrophysics Data System (ADS)
Conti, Caroline; Nunes, Paulo; Ducla Soares, Luís.
2014-05-01
Holoscopic imaging became a prospective glassless 3D technology to provide more natural 3D viewing experiences to the end user. Additionally, holoscopic systems also allow new post-production degrees of freedom, such as controlling the plane of focus or the viewing angle presented to the user. However, to successfully introduce this technology into the consumer market, a display scalable coding approach is essential to achieve backward compatibility with legacy 2D and 3D displays. Moreover, to effectively transmit 3D holoscopic content over error-prone networks, e.g., wireless networks or the Internet, error resilience techniques are required to mitigate the impact of data impairments in the user quality perception. Therefore, it is essential to deeply understand the impact of packet losses in terms of decoding video quality for the specific case of 3D holoscopic content, notably when a scalable approach is used. In this context, this paper studies the impact of packet losses when using a three-layer display scalable 3D holoscopic video coding architecture previously proposed, where each layer represents a different level of display scalability (i.e., L0 - 2D, L1 - stereo or multiview, and L2 - full 3D holoscopic). For this, a simple error concealment algorithm is used, which makes use of inter-layer redundancy between multiview and 3D holoscopic content and the inherent correlation of the 3D holoscopic content to estimate lost data. Furthermore, a study of the influence of 2D views generation parameters used in lower layers on the performance of the used error concealment algorithm is also presented.
Optimization of multicast optical networks with genetic algorithm
NASA Astrophysics Data System (ADS)
Lv, Bo; Mao, Xiangqiao; Zhang, Feng; Qin, Xi; Lu, Dan; Chen, Ming; Chen, Yong; Cao, Jihong; Jian, Shuisheng
2007-11-01
In this letter, aiming to obtain the best multicast performance of optical network in which the video conference information is carried by specified wavelength, we extend the solutions of matrix games with the network coding theory and devise a new method to solve the complex problems of multicast network switching. In addition, an experimental optical network has been testified with best switching strategies by employing the novel numerical solution designed with an effective way of genetic algorithm. The result shows that optimal solutions with genetic algorithm are accordance with the ones with the traditional fictitious play method.
Parallel processing approach to transform-based image coding
NASA Astrophysics Data System (ADS)
Normile, James O.; Wright, Dan; Chu, Ken; Yeh, Chia L.
1991-06-01
This paper describes a flexible parallel processing architecture designed for use in real time video processing. The system consists of floating point DSP processors connected to each other via fast serial links, each processor has access to a globally shared memory. A multiple bus architecture in combination with a dual ported memory allows communication with a host control processor. The system has been applied to prototyping of video compression and decompression algorithms. The decomposition of transform based algorithms for decompression into a form suitable for parallel processing is described. A technique for automatic load balancing among the processors is developed and discussed, results ar presented with image statistics and data rates. Finally techniques for accelerating the system throughput are analyzed and results from the application of one such modification described.
Robust video transmission with distributed source coded auxiliary channel.
Wang, Jiajun; Majumdar, Abhik; Ramchandran, Kannan
2009-12-01
We propose a novel solution to the problem of robust, low-latency video transmission over lossy channels. Predictive video codecs, such as MPEG and H.26x, are very susceptible to prediction mismatch between encoder and decoder or "drift" when there are packet losses. These mismatches lead to a significant degradation in the decoded quality. To address this problem, we propose an auxiliary codec system that sends additional information alongside an MPEG or H.26x compressed video stream to correct for errors in decoded frames and mitigate drift. The proposed system is based on the principles of distributed source coding and uses the (possibly erroneous) MPEG/H.26x decoder reconstruction as side information at the auxiliary decoder. The distributed source coding framework depends upon knowing the statistical dependency (or correlation) between the source and the side information. We propose a recursive algorithm to analytically track the correlation between the original source frame and the erroneous MPEG/H.26x decoded frame. Finally, we propose a rate-distortion optimization scheme to allocate the rate used by the auxiliary encoder among the encoding blocks within a video frame. We implement the proposed system and present extensive simulation results that demonstrate significant gains in performance both visually and objectively (on the order of 2 dB in PSNR over forward error correction based solutions and 1.5 dB in PSNR over intrarefresh based solutions for typical scenarios) under tight latency constraints.
Sub-band/transform compression of video sequences
NASA Technical Reports Server (NTRS)
Sauer, Ken; Bauer, Peter
1992-01-01
The progress on compression of video sequences is discussed. The overall goal of the research was the development of data compression algorithms for high-definition television (HDTV) sequences, but most of our research is general enough to be applicable to much more general problems. We have concentrated on coding algorithms based on both sub-band and transform approaches. Two very fundamental issues arise in designing a sub-band coder. First, the form of the signal decomposition must be chosen to yield band-pass images with characteristics favorable to efficient coding. A second basic consideration, whether coding is to be done in two or three dimensions, is the form of the coders to be applied to each sub-band. Computational simplicity is of essence. We review the first portion of the year, during which we improved and extended some of the previous grant period's results. The pyramid nonrectangular sub-band coder limited to intra-frame application is discussed. Perhaps the most critical component of the sub-band structure is the design of bandsplitting filters. We apply very simple recursive filters, which operate at alternating levels on rectangularly sampled, and quincunx sampled images. We will also cover the techniques we have studied for the coding of the resulting bandpass signals. We discuss adaptive three-dimensional coding which takes advantage of the detection algorithm developed last year. To this point, all the work on this project has been done without the benefit of motion compensation (MC). Motion compensation is included in many proposed codecs, but adds significant computational burden and hardware expense. We have sought to find a lower-cost alternative featuring a simple adaptation to motion in the form of the codec. In sequences of high spatial detail and zooming or panning, it appears that MC will likely be necessary for the proposed quality and bit rates.
Sparse Representation with Spatio-Temporal Online Dictionary Learning for Efficient Video Coding.
Dai, Wenrui; Shen, Yangmei; Tang, Xin; Zou, Junni; Xiong, Hongkai; Chen, Chang Wen
2016-07-27
Classical dictionary learning methods for video coding suer from high computational complexity and interfered coding eciency by disregarding its underlying distribution. This paper proposes a spatio-temporal online dictionary learning (STOL) algorithm to speed up the convergence rate of dictionary learning with a guarantee of approximation error. The proposed algorithm incorporates stochastic gradient descents to form a dictionary of pairs of 3-D low-frequency and highfrequency spatio-temporal volumes. In each iteration of the learning process, it randomly selects one sample volume and updates the atoms of dictionary by minimizing the expected cost, rather than optimizes empirical cost over the complete training data like batch learning methods, e.g. K-SVD. Since the selected volumes are supposed to be i.i.d. samples from the underlying distribution, decomposition coecients attained from the trained dictionary are desirable for sparse representation. Theoretically, it is proved that the proposed STOL could achieve better approximation for sparse representation than K-SVD and maintain both structured sparsity and hierarchical sparsity. It is shown to outperform batch gradient descent methods (K-SVD) in the sense of convergence speed and computational complexity, and its upper bound for prediction error is asymptotically equal to the training error. With lower computational complexity, extensive experiments validate that the STOL based coding scheme achieves performance improvements than H.264/AVC or HEVC as well as existing super-resolution based methods in ratedistortion performance and visual quality.
Reading color barcodes using visual snakes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schaub, Hanspeter
2004-05-01
Statistical pressure snakes are used to track a mono-color target in an unstructured environment using a video camera. The report discusses an algorithm to extract a bar code signal that is embedded within the target. The target is assumed to be rectangular in shape, with the bar code printed in a slightly different saturation and value in HSV color space. Thus, the visual snake, which primarily weighs hue tracking errors, will not be deterred by the presence of the color bar codes in the target. The bar code is generate with the standard 3 of 9 method. Using this method,more » the numeric bar codes reveal if the target is right-side-up or up-side-down.« less
Qi, Jin; Yang, Zhiyong
2014-01-01
Real-time human activity recognition is essential for human-robot interactions for assisted healthy independent living. Most previous work in this area is performed on traditional two-dimensional (2D) videos and both global and local methods have been used. Since 2D videos are sensitive to changes of lighting condition, view angle, and scale, researchers begun to explore applications of 3D information in human activity understanding in recently years. Unfortunately, features that work well on 2D videos usually don't perform well on 3D videos and there is no consensus on what 3D features should be used. Here we propose a model of human activity recognition based on 3D movements of body joints. Our method has three steps, learning dictionaries of sparse codes of 3D movements of joints, sparse coding, and classification. In the first step, space-time volumes of 3D movements of body joints are obtained via dense sampling and independent component analysis is then performed to construct a dictionary of sparse codes for each activity. In the second step, the space-time volumes are projected to the dictionaries and a set of sparse histograms of the projection coefficients are constructed as feature representations of the activities. Finally, the sparse histograms are used as inputs to a support vector machine to recognize human activities. We tested this model on three databases of human activities and found that it outperforms the state-of-the-art algorithms. Thus, this model can be used for real-time human activity recognition in many applications.
Applications of just-noticeable depth difference model in joint multiview video plus depth coding
NASA Astrophysics Data System (ADS)
Liu, Chao; An, Ping; Zuo, Yifan; Zhang, Zhaoyang
2014-10-01
A new multiview just-noticeable-depth-difference(MJNDD) Model is presented and applied to compress the joint multiview video plus depth. Many video coding algorithms remove spatial and temporal redundancies and statistical redundancies but they are not capable of removing the perceptual redundancies. Since the final receptor of video is the human eyes, we can remove the perception redundancy to gain higher compression efficiency according to the properties of human visual system (HVS). Traditional just-noticeable-distortion (JND) model in pixel domain contains luminance contrast and spatial-temporal masking effects, which describes the perception redundancy quantitatively. Whereas HVS is very sensitive to depth information, a new multiview-just-noticeable-depth-difference(MJNDD) model is proposed by combining traditional JND model with just-noticeable-depth-difference (JNDD) model. The texture video is divided into background and foreground areas using depth information. Then different JND threshold values are assigned to these two parts. Later the MJNDD model is utilized to encode the texture video on JMVC. When encoding the depth video, JNDD model is applied to remove the block artifacts and protect the edges. Then we use VSRS3.5 (View Synthesis Reference Software) to generate the intermediate views. Experimental results show that our model can endure more noise and the compression efficiency is improved by 25.29 percent at average and by 54.06 percent at most compared to JMVC while maintaining the subject quality. Hence it can gain high compress ratio and low bit rate.
Xie, Jianwen; Douglas, Pamela K; Wu, Ying Nian; Brody, Arthur L; Anderson, Ariana E
2017-04-15
Brain networks in fMRI are typically identified using spatial independent component analysis (ICA), yet other mathematical constraints provide alternate biologically-plausible frameworks for generating brain networks. Non-negative matrix factorization (NMF) would suppress negative BOLD signal by enforcing positivity. Spatial sparse coding algorithms (L1 Regularized Learning and K-SVD) would impose local specialization and a discouragement of multitasking, where the total observed activity in a single voxel originates from a restricted number of possible brain networks. The assumptions of independence, positivity, and sparsity to encode task-related brain networks are compared; the resulting brain networks within scan for different constraints are used as basis functions to encode observed functional activity. These encodings are then decoded using machine learning, by using the time series weights to predict within scan whether a subject is viewing a video, listening to an audio cue, or at rest, in 304 fMRI scans from 51 subjects. The sparse coding algorithm of L1 Regularized Learning outperformed 4 variations of ICA (p<0.001) for predicting the task being performed within each scan using artifact-cleaned components. The NMF algorithms, which suppressed negative BOLD signal, had the poorest accuracy compared to the ICA and sparse coding algorithms. Holding constant the effect of the extraction algorithm, encodings using sparser spatial networks (containing more zero-valued voxels) had higher classification accuracy (p<0.001). Lower classification accuracy occurred when the extracted spatial maps contained more CSF regions (p<0.001). The success of sparse coding algorithms suggests that algorithms which enforce sparsity, discourage multitasking, and promote local specialization may capture better the underlying source processes than those which allow inexhaustible local processes such as ICA. Negative BOLD signal may capture task-related activations. Copyright © 2017 Elsevier B.V. All rights reserved.
New scene change control scheme based on pseudoskipped picture
NASA Astrophysics Data System (ADS)
Lee, Youngsun; Lee, Jinwhan; Chang, Hyunsik; Nam, Jae Y.
1997-01-01
A new scene change control scheme which improves the video coding performance for sequences that have many scene changed pictures is proposed in this paper. The scene changed pictures except intra-coded picture usually need more bits than normal pictures in order to maintain constant picture quality. The major idea of this paper is how to obtain extra bits which are needed to encode scene changed pictures. We encode a B picture which is located before a scene changed picture like a skipped picture. We call such a B picture as a pseudo-skipped picture. By generating the pseudo-skipped picture like a skipped picture. We call such a B picture as a pseudo-skipped picture. By generating the pseudo-skipped picture, we can save some bits and they are added to the originally allocated target bits to encode the scene changed picture. The simulation results show that the proposed algorithm improves encoding performance about 0.5 to approximately 2.0 dB of PSNR compared to MPEG-2 TM5 rate controls scheme. In addition, the suggested algorithm is compatible with MPEG-2 video syntax and the picture repetition is not recognizable.
Seeling, Patrick; Reisslein, Martin
2014-01-01
Video encoding for multimedia services over communication networks has significantly advanced in recent years with the development of the highly efficient and flexible H.264/AVC video coding standard and its SVC extension. The emerging H.265/HEVC video coding standard as well as 3D video coding further advance video coding for multimedia communications. This paper first gives an overview of these new video coding standards and then examines their implications for multimedia communications by studying the traffic characteristics of long videos encoded with the new coding standards. We review video coding advances from MPEG-2 and MPEG-4 Part 2 to H.264/AVC and its SVC and MVC extensions as well as H.265/HEVC. For single-layer (nonscalable) video, we compare H.265/HEVC and H.264/AVC in terms of video traffic and statistical multiplexing characteristics. Our study is the first to examine the H.265/HEVC traffic variability for long videos. We also illustrate the video traffic characteristics and statistical multiplexing of scalable video encoded with the SVC extension of H.264/AVC as well as 3D video encoded with the MVC extension of H.264/AVC.
2014-01-01
Video encoding for multimedia services over communication networks has significantly advanced in recent years with the development of the highly efficient and flexible H.264/AVC video coding standard and its SVC extension. The emerging H.265/HEVC video coding standard as well as 3D video coding further advance video coding for multimedia communications. This paper first gives an overview of these new video coding standards and then examines their implications for multimedia communications by studying the traffic characteristics of long videos encoded with the new coding standards. We review video coding advances from MPEG-2 and MPEG-4 Part 2 to H.264/AVC and its SVC and MVC extensions as well as H.265/HEVC. For single-layer (nonscalable) video, we compare H.265/HEVC and H.264/AVC in terms of video traffic and statistical multiplexing characteristics. Our study is the first to examine the H.265/HEVC traffic variability for long videos. We also illustrate the video traffic characteristics and statistical multiplexing of scalable video encoded with the SVC extension of H.264/AVC as well as 3D video encoded with the MVC extension of H.264/AVC. PMID:24701145
Reliable video transmission over fading channels via channel state estimation
NASA Astrophysics Data System (ADS)
Kumwilaisak, Wuttipong; Kim, JongWon; Kuo, C.-C. Jay
2000-04-01
Transmission of continuous media such as video over time- varying wireless communication channels can benefit from the use of adaptation techniques in both source and channel coding. An adaptive feedback-based wireless video transmission scheme is investigated in this research with special emphasis on feedback-based adaptation. To be more specific, an interactive adaptive transmission scheme is developed by letting the receiver estimate the channel state information and send it back to the transmitter. By utilizing the feedback information, the transmitter is capable of adapting the level of protection by changing the flexible RCPC (rate-compatible punctured convolutional) code ratio depending on the instantaneous channel condition. The wireless channel is modeled as a fading channel, where the long-term and short- term fading effects are modeled as the log-normal fading and the Rayleigh flat fading, respectively. Then, its state (mainly the long term fading portion) is tracked and predicted by using an adaptive LMS (least mean squares) algorithm. By utilizing the delayed feedback on the channel condition, the adaptation performance of the proposed scheme is first evaluated in terms of the error probability and the throughput. It is then extended to incorporate variable size packets of ITU-T H.263+ video with the error resilience option. Finally, the end-to-end performance of wireless video transmission is compared against several non-adaptive protection schemes.
Costa, Marcus V C; Carvalho, Joao L A; Berger, Pedro A; Zaghetto, Alexandre; da Rocha, Adson F; Nascimento, Francisco A O
2009-01-01
We present a new preprocessing technique for two-dimensional compression of surface electromyographic (S-EMG) signals, based on correlation sorting. We show that the JPEG2000 coding system (originally designed for compression of still images) and the H.264/AVC encoder (video compression algorithm operating in intraframe mode) can be used for compression of S-EMG signals. We compare the performance of these two off-the-shelf image compression algorithms for S-EMG compression, with and without the proposed preprocessing step. Compression of both isotonic and isometric contraction S-EMG signals is evaluated. The proposed methods were compared with other S-EMG compression algorithms from the literature.
Doulamis, A D; Doulamis, N D; Kollias, S D
2003-01-01
Multimedia services and especially digital video is expected to be the major traffic component transmitted over communication networks [such as internet protocol (IP)-based networks]. For this reason, traffic characterization and modeling of such services are required for an efficient network operation. The generated models can be used as traffic rate predictors, during the network operation phase (online traffic modeling), or as video generators for estimating the network resources, during the network design phase (offline traffic modeling). In this paper, an adaptable neural-network architecture is proposed covering both cases. The scheme is based on an efficient recursive weight estimation algorithm, which adapts the network response to current conditions. In particular, the algorithm updates the network weights so that 1) the network output, after the adaptation, is approximately equal to current bit rates (current traffic statistics) and 2) a minimal degradation over the obtained network knowledge is provided. It can be shown that the proposed adaptable neural-network architecture simulates a recursive nonlinear autoregressive model (RNAR) similar to the notation used in the linear case. The algorithm presents low computational complexity and high efficiency in tracking traffic rates in contrast to conventional retraining schemes. Furthermore, for the problem of offline traffic modeling, a novel correlation mechanism is proposed for capturing the burstness of the actual MPEG video traffic. The performance of the model is evaluated using several real-life MPEG coded video sources of long duration and compared with other linear/nonlinear techniques used for both cases. The results indicate that the proposed adaptable neural-network architecture presents better performance than other examined techniques.
Pillai, Dinesh; Song, Xiaoyan; Pastor, William; Ottolini, Mary; Powell, David; Wiedermann, Bernhard L; DeBiasi, Roberta L
2011-12-01
Variable treatment exists for children with bacterial pneumonia complications such as pleural effusion and empyema. Subspecialists at an urban academic tertiary children's hospital created a literature-based diagnosis and management algorithm for complicated pneumonia in children. We proposed that algorithm implementation would reduce use of computed tomography (CT) for diagnosis of pleural infection, thereby decreasing radiation exposure, without increased adverse outcomes. A cross-sectional study was undertaken in children (3 months to 20 years old) with principal or secondary diagnosis codes for empyema and/or pleural effusion in conjunction with bacterial pneumonia. Study cohorts consisted of subjects admitted 15 months before (cohort 1, n = 83) and after (cohort 2, n = 87) algorithm implementation. Data were collected using clinical and financial data systems. Imaging studies and procedures were identified using Current Procedural Terminology codes. Statistical analysis included χ test, linear and ordinal regression, and analysis of variance. Age (P = 0.56), sex (P = 0.30), diagnoses (P = 0.12), and severity level (P = 0.84) were similar between cohorts. There was a significant decrease in CT use in cohort 2 (cohort 1, 60% vs cohort 2, 17.2%; P = 0.001) and reduction in readmission rate (7.7% vs 0%; P = 0.01) and video-assisted thoracoscopic surgery procedures (44.6% vs 28.7; P = 0.03), without concomitant increases in vancomycin use (34.9% vs 34.5%; P = 0.95) or hospital length of stay (6.4 vs 7.6 days; P = 0.4). Among patients who received video-assisted thoracoscopic surgery drainage (n = 57), there were no significant differences between cohorts in median time from admission to video-assisted thoracoscopic surgery (2 days; P = 0.29) or median duration of chest tube drainage (3 vs 4 days; P = 0.10). There was a statistically nonsignificant trend for higher rate of pathogen identification in cohort 2 (cohort 1, 33% vs cohort 2, 54.1%; P = 0.12); Streptococcus pneumonia was the most commonly identified pathogen in both cohorts (37.5% vs 27%; P = 0.23). Implementation of an institutional complicated pneumonia management algorithm reduced CT scan use/radiation exposure, VATS procedures, and readmission rate in children with a diagnosis of pleural infection, without associated increases in length of stay or vancomycin use. This algorithm provides the framework for future prospective quality improvement studies in pediatric patients with complicated pneumonia.
NASA Astrophysics Data System (ADS)
Lee, Feifei; Kotani, Koji; Chen, Qiu; Ohmi, Tadahiro
2010-02-01
In this paper, a fast search algorithm for MPEG-4 video clips from video database is proposed. An adjacent pixel intensity difference quantization (APIDQ) histogram is utilized as the feature vector of VOP (video object plane), which had been reliably applied to human face recognition previously. Instead of fully decompressed video sequence, partially decoded data, namely DC sequence of the video object are extracted from the video sequence. Combined with active search, a temporal pruning algorithm, fast and robust video search can be realized. The proposed search algorithm has been evaluated by total 15 hours of video contained of TV programs such as drama, talk, news, etc. to search for given 200 MPEG-4 video clips which each length is 15 seconds. Experimental results show the proposed algorithm can detect the similar video clip in merely 80ms, and Equal Error Rate (ERR) of 2 % in drama and news categories are achieved, which are more accurately and robust than conventional fast video search algorithm.
Overview of the H.264/AVC video coding standard
NASA Astrophysics Data System (ADS)
Luthra, Ajay; Topiwala, Pankaj N.
2003-11-01
H.264/MPEG-4 AVC is the latest coding standard jointly developed by the Video Coding Experts Group (VCEG) of ITU-T and Moving Picture Experts Group (MPEG) of ISO/IEC. It uses state of the art coding tools and provides enhanced coding efficiency for a wide range of applications including video telephony, video conferencing, TV, storage (DVD and/or hard disk based), streaming video, digital video creation, digital cinema and others. In this paper an overview of this standard is provided. Some comparisons with the existing standards, MPEG-2 and MPEG-4 Part 2, are also provided.
Performance evaluation of MPEG internet video coding
NASA Astrophysics Data System (ADS)
Luo, Jiajia; Wang, Ronggang; Fan, Kui; Wang, Zhenyu; Li, Ge; Wang, Wenmin
2016-09-01
Internet Video Coding (IVC) has been developed in MPEG by combining well-known existing technology elements and new coding tools with royalty-free declarations. In June 2015, IVC project was approved as ISO/IEC 14496-33 (MPEG- 4 Internet Video Coding). It is believed that this standard can be highly beneficial for video services in the Internet domain. This paper evaluates the objective and subjective performances of IVC by comparing it against Web Video Coding (WVC), Video Coding for Browsers (VCB) and AVC High Profile. Experimental results show that IVC's compression performance is approximately equal to that of the AVC High Profile for typical operational settings, both for streaming and low-delay applications, and is better than WVC and VCB.
No-reference quality assessment based on visual perception
NASA Astrophysics Data System (ADS)
Li, Junshan; Yang, Yawei; Hu, Shuangyan; Zhang, Jiao
2014-11-01
The visual quality assessment of images/videos is an ongoing hot research topic, which has become more and more important for numerous image and video processing applications with the rapid development of digital imaging and communication technologies. The goal of image quality assessment (IQA) algorithms is to automatically assess the quality of images/videos in agreement with human quality judgments. Up to now, two kinds of models have been used for IQA, namely full-reference (FR) and no-reference (NR) models. For FR models, IQA algorithms interpret image quality as fidelity or similarity with a perfect image in some perceptual space. However, the reference image is not available in many practical applications, and a NR IQA approach is desired. Considering natural vision as optimized by the millions of years of evolutionary pressure, many methods attempt to achieve consistency in quality prediction by modeling salient physiological and psychological features of the human visual system (HVS). To reach this goal, researchers try to simulate HVS with image sparsity coding and supervised machine learning, which are two main features of HVS. A typical HVS captures the scenes by sparsity coding, and uses experienced knowledge to apperceive objects. In this paper, we propose a novel IQA approach based on visual perception. Firstly, a standard model of HVS is studied and analyzed, and the sparse representation of image is accomplished with the model; and then, the mapping correlation between sparse codes and subjective quality scores is trained with the regression technique of least squaresupport vector machine (LS-SVM), which gains the regressor that can predict the image quality; the visual metric of image is predicted with the trained regressor at last. We validate the performance of proposed approach on Laboratory for Image and Video Engineering (LIVE) database, the specific contents of the type of distortions present in the database are: 227 images of JPEG2000, 233 images of JPEG, 174 images of White Noise, 174 images of Gaussian Blur, 174 images of Fast Fading. The database includes subjective differential mean opinion score (DMOS) for each image. The experimental results show that the proposed approach not only can assess many kinds of distorted images quality, but also exhibits a superior accuracy and monotonicity.
Evaluating progressive-rendering algorithms in appearance design tasks.
Jiawei Ou; Karlik, Ondrej; Křivánek, Jaroslav; Pellacini, Fabio
2013-01-01
Progressive rendering is becoming a popular alternative to precomputational approaches to appearance design. However, progressive algorithms create images exhibiting visual artifacts at early stages. A user study investigated these artifacts' effects on user performance in appearance design tasks. Novice and expert subjects performed lighting and material editing tasks with four algorithms: random path tracing, quasirandom path tracing, progressive photon mapping, and virtual-point-light rendering. Both the novices and experts strongly preferred path tracing to progressive photon mapping and virtual-point-light rendering. None of the participants preferred random path tracing to quasirandom path tracing or vice versa; the same situation held between progressive photon mapping and virtual-point-light rendering. The user workflow didn’t differ significantly with the four algorithms. The Web Extras include a video showing how four progressive-rendering algorithms converged (at http://youtu.be/ck-Gevl1e9s), the source code used, and other supplementary materials.
Layered Wyner-Ziv video coding.
Xu, Qian; Xiong, Zixiang
2006-12-01
Following recent theoretical works on successive Wyner-Ziv coding (WZC), we propose a practical layered Wyner-Ziv video coder using the DCT, nested scalar quantization, and irregular LDPC code based Slepian-Wolf coding (or lossless source coding with side information at the decoder). Our main novelty is to use the base layer of a standard scalable video coder (e.g., MPEG-4/H.26L FGS or H.263+) as the decoder side information and perform layered WZC for quality enhancement. Similar to FGS coding, there is no performance difference between layered and monolithic WZC when the enhancement bitstream is generated in our proposed coder. Using an H.26L coded version as the base layer, experiments indicate that WZC gives slightly worse performance than FGS coding when the channel (for both the base and enhancement layers) is noiseless. However, when the channel is noisy, extensive simulations of video transmission over wireless networks conforming to the CDMA2000 1X standard show that H.26L base layer coding plus Wyner-Ziv enhancement layer coding are more robust against channel errors than H.26L FGS coding. These results demonstrate that layered Wyner-Ziv video coding is a promising new technique for video streaming over wireless networks.
Comparison of three coding strategies for a low cost structure light scanner
NASA Astrophysics Data System (ADS)
Xiong, Hanwei; Xu, Jun; Xu, Chenxi; Pan, Ming
2014-12-01
Coded structure light is widely used for 3D scanning, and different coding strategies are adopted to suit for different goals. In this paper, three coding strategies are compared, and one of them is selected to implement a low cost structure light scanner under the cost of €100. To reach this goal, the projector and the video camera must be the cheapest, which will lead to some problems related to light coding. For a cheapest projector, complex intensity pattern can't be generated; even if it can be generated, it can't be captured by a cheapest camera. Based on Gray code, three different strategies are implemented and compared, called phase-shift, line-shift, and bit-shift, respectively. The bit-shift Gray code is the contribution of this paper, in which a simple, stable light pattern is used to generate dense(mean points distance<0.4mm) and accurate(mean error<0.1mm) results. The whole algorithm details and some example are presented in the papers.
Android platform based smartphones for a logistical remote association repair framework.
Lien, Shao-Fan; Wang, Chun-Chieh; Su, Juhng-Perng; Chen, Hong-Ming; Wu, Chein-Hsing
2014-06-25
The maintenance of large-scale systems is an important issue for logistics support planning. In this paper, we developed a Logistical Remote Association Repair Framework (LRARF) to aid repairmen in keeping the system available. LRARF includes four subsystems: smart mobile phones, a Database Management System (DBMS), a Maintenance Support Center (MSC) and wireless networks. The repairman uses smart mobile phones to capture QR-codes and the images of faulty circuit boards. The captured QR-codes and images are transmitted to the DBMS so the invalid modules can be recognized via the proposed algorithm. In this paper, the Linear Projective Transform (LPT) is employed for fast QR-code calibration. Moreover, the ANFIS-based data mining system is used for module identification and searching automatically for the maintenance manual corresponding to the invalid modules. The inputs of the ANFIS-based data mining system are the QR-codes and image features; the output is the module ID. DBMS also transmits the maintenance manual back to the maintenance staff. If modules are not recognizable, the repairmen and center engineers can obtain the relevant information about the invalid modules through live video. The experimental results validate the applicability of the Android-based platform in the recognition of invalid modules. In addition, the live video can also be recorded synchronously on the MSC for later use.
Improved dense trajectories for action recognition based on random projection and Fisher vectors
NASA Astrophysics Data System (ADS)
Ai, Shihui; Lu, Tongwei; Xiong, Yudian
2018-03-01
As an important application of intelligent monitoring system, the action recognition in video has become a very important research area of computer vision. In order to improve the accuracy rate of the action recognition in video with improved dense trajectories, one advanced vector method is introduced. Improved dense trajectories combine Fisher Vector with Random Projection. The method realizes the reduction of the characteristic trajectory though projecting the high-dimensional trajectory descriptor into the low-dimensional subspace based on defining and analyzing Gaussian mixture model by Random Projection. And a GMM-FV hybrid model is introduced to encode the trajectory feature vector and reduce dimension. The computational complexity is reduced by Random Projection which can drop Fisher coding vector. Finally, a Linear SVM is used to classifier to predict labels. We tested the algorithm in UCF101 dataset and KTH dataset. Compared with existed some others algorithm, the result showed that the method not only reduce the computational complexity but also improved the accuracy of action recognition.
Real-time software-based end-to-end wireless visual communications simulation platform
NASA Astrophysics Data System (ADS)
Chen, Ting-Chung; Chang, Li-Fung; Wong, Andria H.; Sun, Ming-Ting; Hsing, T. Russell
1995-04-01
Wireless channel impairments pose many challenges to real-time visual communications. In this paper, we describe a real-time software based wireless visual communications simulation platform which can be used for performance evaluation in real-time. This simulation platform consists of two personal computers serving as hosts. Major components of each PC host include a real-time programmable video code, a wireless channel simulator, and a network interface for data transport between the two hosts. The three major components are interfaced in real-time to show the interaction of various wireless channels and video coding algorithms. The programmable features in the above components allow users to do performance evaluation of user-controlled wireless channel effects without physically carrying out these experiments which are limited in scope, time-consuming, and costly. Using this simulation platform as a testbed, we have experimented with several wireless channel effects including Rayleigh fading, antenna diversity, channel filtering, symbol timing, modulation, and packet loss.
Video error concealment using block matching and frequency selective extrapolation algorithms
NASA Astrophysics Data System (ADS)
P. K., Rajani; Khaparde, Arti
2017-06-01
Error Concealment (EC) is a technique at the decoder side to hide the transmission errors. It is done by analyzing the spatial or temporal information from available video frames. It is very important to recover distorted video because they are used for various applications such as video-telephone, video-conference, TV, DVD, internet video streaming, video games etc .Retransmission-based and resilient-based methods, are also used for error removal. But these methods add delay and redundant data. So error concealment is the best option for error hiding. In this paper, the error concealment methods such as Block Matching error concealment algorithm is compared with Frequency Selective Extrapolation algorithm. Both the works are based on concealment of manually error video frames as input. The parameter used for objective quality measurement was PSNR (Peak Signal to Noise Ratio) and SSIM(Structural Similarity Index). The original video frames along with error video frames are compared with both the Error concealment algorithms. According to simulation results, Frequency Selective Extrapolation is showing better quality measures such as 48% improved PSNR and 94% increased SSIM than Block Matching Algorithm.
The implementation of thermal image visualization by HDL based on pseudo-color
NASA Astrophysics Data System (ADS)
Zhu, Yong; Zhang, JiangLing
2004-11-01
The pseudo-color method which maps the sampled data to intuitive perception colors is a kind of powerful visualization way. And the all-around system of pseudo-color visualization, which includes the primary principle, model and HDL (Hardware Description Language) implementation for the thermal images, is expatiated on in the paper. The thermal images whose signal is modulated as video reflect the temperature distribution of measured object, so they have the speciality of mass and real-time. The solution to the intractable problem is as follows: First, the reasonable system, i.e. the combining of global pseudo-color visualization and local special area accurate measure, muse be adopted. Then, the HDL pseudo-color algorithms in SoC (System on Chip) carry out the system to ensure the real-time. Finally, the key HDL algorithms for direct gray levels connection coding, proportional gray levels map coding and enhanced gray levels map coding are presented, and its simulation results are showed. The pseudo-color visualization of thermal images implemented by HDL in the paper has effective application in the aspect of electric power equipment test and medical health diagnosis.
Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video
Lee, Gil-beom; Lee, Myeong-jin; Lee, Woo-Kyung; Park, Joo-heon; Kim, Tae-Hwan
2017-01-01
Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object’s vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos. PMID:28327515
2010-04-29
magnitude greater than today’s high-definition video coding standards. Moreover, the micromirror devices of maskless lithography are smaller than those...be found in the literature [33]. In this architecture, the optical source flashes on a writer system, which consists of a micromirror array and a...the writer system. Due to the physical dimension constraints of the micromirror array and writer system, an entire wafer can be written in a few
Video Monitoring a Simulation-Based Quality Improvement Program in Bihar, India.
Dyer, Jessica; Spindler, Hilary; Christmas, Amelia; Shah, Malay Bharat; Morgan, Melissa; Cohen, Susanna R; Sterne, Jason; Mahapatra, Tanmay; Walker, Dilys
2018-04-01
Simulation-based training has become an accepted clinical training andragogy in high-resource settings with its use increasing in low-resource settings. Video recordings of simulated scenarios are commonly used by facilitators. Beyond using the videos during debrief sessions, researchers can also analyze the simulation videos to quantify technical and nontechnical skills during simulated scenarios over time. Little is known about the feasibility and use of large-scale systems to video record and analyze simulation and debriefing data for monitoring and evaluation in low-resource settings. This manuscript describes the process of designing and implementing a large-scale video monitoring system. Mentees and Mentors were consented and all simulations and debriefs conducted at 320 Primary Health Centers (PHCs) were video recorded. The system design, number of video recordings, and inter-rater reliability of the coded videos were assessed. The final dataset included a total of 11,278 videos. Overall, a total of 2,124 simulation videos were coded and 183 (12%) were blindly double-coded. For the double-coded sample, the average inter-rater reliability (IRR) scores were 80% for nontechnical skills, and 94% for clinical technical skills. Among 4,450 long debrief videos received, 216 were selected for coding and all were double-coded. Data quality of simulation videos was found to be very good in terms of recorded instances of "unable to see" and "unable to hear" in Phases 1 and 2. This study demonstrates that video monitoring systems can be effectively implemented at scale in resource limited settings. Further, video monitoring systems can play several vital roles within program implementation, including monitoring and evaluation, provision of actionable feedback to program implementers, and assurance of program fidelity.
Design and implementation of H.264 based embedded video coding technology
NASA Astrophysics Data System (ADS)
Mao, Jian; Liu, Jinming; Zhang, Jiemin
2016-03-01
In this paper, an embedded system for remote online video monitoring was designed and developed to capture and record the real-time circumstances in elevator. For the purpose of improving the efficiency of video acquisition and processing, the system selected Samsung S5PV210 chip as the core processor which Integrated graphics processing unit. And the video was encoded with H.264 format for storage and transmission efficiently. Based on S5PV210 chip, the hardware video coding technology was researched, which was more efficient than software coding. After running test, it had been proved that the hardware video coding technology could obviously reduce the cost of system and obtain the more smooth video display. It can be widely applied for the security supervision [1].
Imran, Noreen; Seet, Boon-Chong; Fong, A C M
2015-01-01
Distributed video coding (DVC) is a relatively new video coding architecture originated from two fundamental theorems namely, Slepian-Wolf and Wyner-Ziv. Recent research developments have made DVC attractive for applications in the emerging domain of wireless video sensor networks (WVSNs). This paper reviews the state-of-the-art DVC architectures with a focus on understanding their opportunities and gaps in addressing the operational requirements and application needs of WVSNs.
New fast DCT algorithms based on Loeffler's factorization
NASA Astrophysics Data System (ADS)
Hong, Yoon Mi; Kim, Il-Koo; Lee, Tammy; Cheon, Min-Su; Alshina, Elena; Han, Woo-Jin; Park, Jeong-Hoon
2012-10-01
This paper proposes a new 32-point fast discrete cosine transform (DCT) algorithm based on the Loeffler's 16-point transform. Fast integer realizations of 16-point and 32-point transforms are also provided based on the proposed transform. For the recent development of High Efficiency Video Coding (HEVC), simplified quanti-zation and de-quantization process are proposed. Three different forms of implementation with the essentially same performance, namely matrix multiplication, partial butterfly, and full factorization can be chosen accord-ing to the given platform. In terms of the number of multiplications required for the realization, our proposed full-factorization is 3~4 times faster than a partial butterfly, and about 10 times faster than direct matrix multiplication.
Tracking cells in Life Cell Imaging videos using topological alignments.
Mosig, Axel; Jäger, Stefan; Wang, Chaofeng; Nath, Sumit; Ersoy, Ilker; Palaniappan, Kannap-pan; Chen, Su-Shing
2009-07-16
With the increasing availability of live cell imaging technology, tracking cells and other moving objects in live cell videos has become a major challenge for bioimage informatics. An inherent problem for most cell tracking algorithms is over- or under-segmentation of cells - many algorithms tend to recognize one cell as several cells or vice versa. We propose to approach this problem through so-called topological alignments, which we apply to address the problem of linking segmentations of two consecutive frames in the video sequence. Starting from the output of a conventional segmentation procedure, we align pairs of consecutive frames through assigning sets of segments in one frame to sets of segments in the next frame. We achieve this through finding maximum weighted solutions to a generalized "bipartite matching" between two hierarchies of segments, where we derive weights from relative overlap scores of convex hulls of sets of segments. For solving the matching task, we rely on an integer linear program. Practical experiments demonstrate that the matching task can be solved efficiently in practice, and that our method is both effective and useful for tracking cells in data sets derived from a so-called Large Scale Digital Cell Analysis System (LSDCAS). The source code of the implementation is available for download from http://www.picb.ac.cn/patterns/Software/topaln.
Background-Modeling-Based Adaptive Prediction for Surveillance Video Coding.
Zhang, Xianguo; Huang, Tiejun; Tian, Yonghong; Gao, Wen
2014-02-01
The exponential growth of surveillance videos presents an unprecedented challenge for high-efficiency surveillance video coding technology. Compared with the existing coding standards that were basically developed for generic videos, surveillance video coding should be designed to make the best use of the special characteristics of surveillance videos (e.g., relative static background). To do so, this paper first conducts two analyses on how to improve the background and foreground prediction efficiencies in surveillance video coding. Following the analysis results, we propose a background-modeling-based adaptive prediction (BMAP) method. In this method, all blocks to be encoded are firstly classified into three categories. Then, according to the category of each block, two novel inter predictions are selectively utilized, namely, the background reference prediction (BRP) that uses the background modeled from the original input frames as the long-term reference and the background difference prediction (BDP) that predicts the current data in the background difference domain. For background blocks, the BRP can effectively improve the prediction efficiency using the higher quality background as the reference; whereas for foreground-background-hybrid blocks, the BDP can provide a better reference after subtracting its background pixels. Experimental results show that the BMAP can achieve at least twice the compression ratio on surveillance videos as AVC (MPEG-4 Advanced Video Coding) high profile, yet with a slightly additional encoding complexity. Moreover, for the foreground coding performance, which is crucial to the subjective quality of moving objects in surveillance videos, BMAP also obtains remarkable gains over several state-of-the-art methods.
NASA Astrophysics Data System (ADS)
Sullivan, Gary J.; Topiwala, Pankaj N.; Luthra, Ajay
2004-11-01
H.264/MPEG-4 AVC is the latest international video coding standard. It was jointly developed by the Video Coding Experts Group (VCEG) of the ITU-T and the Moving Picture Experts Group (MPEG) of ISO/IEC. It uses state-of-the-art coding tools and provides enhanced coding efficiency for a wide range of applications, including video telephony, video conferencing, TV, storage (DVD and/or hard disk based, especially high-definition DVD), streaming video, digital video authoring, digital cinema, and many others. The work on a new set of extensions to this standard has recently been completed. These extensions, known as the Fidelity Range Extensions (FRExt), provide a number of enhanced capabilities relative to the base specification as approved in the Spring of 2003. In this paper, an overview of this standard is provided, including the highlights of the capabilities of the new FRExt features. Some comparisons with the existing MPEG-2 and MPEG-4 Part 2 standards are also provided.
Blind prediction of natural video quality.
Saad, Michele A; Bovik, Alan C; Charrier, Christophe
2014-03-01
We propose a blind (no reference or NR) video quality evaluation model that is nondistortion specific. The approach relies on a spatio-temporal model of video scenes in the discrete cosine transform domain, and on a model that characterizes the type of motion occurring in the scenes, to predict video quality. We use the models to define video statistics and perceptual features that are the basis of a video quality assessment (VQA) algorithm that does not require the presence of a pristine video to compare against in order to predict a perceptual quality score. The contributions of this paper are threefold. 1) We propose a spatio-temporal natural scene statistics (NSS) model for videos. 2) We propose a motion model that quantifies motion coherency in video scenes. 3) We show that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and we utilize them to design a blind VQA algorithm that correlates highly with human judgments of quality. The proposed algorithm, called video BLIINDS, is tested on the LIVE VQA database and on the EPFL-PoliMi video database and shown to perform close to the level of top performing reduced and full reference VQA algorithms.
NASA Astrophysics Data System (ADS)
Bulan, Orhan; Bernal, Edgar A.; Loce, Robert P.; Wu, Wencheng
2013-03-01
Video cameras are widely deployed along city streets, interstate highways, traffic lights, stop signs and toll booths by entities that perform traffic monitoring and law enforcement. The videos captured by these cameras are typically compressed and stored in large databases. Performing a rapid search for a specific vehicle within a large database of compressed videos is often required and can be a time-critical life or death situation. In this paper, we propose video compression and decompression algorithms that enable fast and efficient vehicle or, more generally, event searches in large video databases. The proposed algorithm selects reference frames (i.e., I-frames) based on a vehicle having been detected at a specified position within the scene being monitored while compressing a video sequence. A search for a specific vehicle in the compressed video stream is performed across the reference frames only, which does not require decompression of the full video sequence as in traditional search algorithms. Our experimental results on videos captured in a local road show that the proposed algorithm significantly reduces the search space (thus reducing time and computational resources) in vehicle search tasks within compressed video streams, particularly those captured in light traffic volume conditions.
Android Platform Based Smartphones for a Logistical Remote Association Repair Framework
Lien, Shao-Fan; Wang, Chun-Chieh; Su, Juhng-Perng; Chen, Hong-Ming; Wu, Chein-Hsing
2014-01-01
The maintenance of large-scale systems is an important issue for logistics support planning. In this paper, we developed a Logistical Remote Association Repair Framework (LRARF) to aid repairmen in keeping the system available. LRARF includes four subsystems: smart mobile phones, a Database Management System (DBMS), a Maintenance Support Center (MSC) and wireless networks. The repairman uses smart mobile phones to capture QR-codes and the images of faulty circuit boards. The captured QR-codes and images are transmitted to the DBMS so the invalid modules can be recognized via the proposed algorithm. In this paper, the Linear Projective Transform (LPT) is employed for fast QR-code calibration. Moreover, the ANFIS-based data mining system is used for module identification and searching automatically for the maintenance manual corresponding to the invalid modules. The inputs of the ANFIS-based data mining system are the QR-codes and image features; the output is the module ID. DBMS also transmits the maintenance manual back to the maintenance staff. If modules are not recognizable, the repairmen and center engineers can obtain the relevant information about the invalid modules through live video. The experimental results validate the applicability of the Android-based platform in the recognition of invalid modules. In addition, the live video can also be recorded synchronously on the MSC for later use. PMID:24967603
Influence of audio triggered emotional attention on video perception
NASA Astrophysics Data System (ADS)
Torres, Freddy; Kalva, Hari
2014-02-01
Perceptual video coding methods attempt to improve compression efficiency by discarding visual information not perceived by end users. Most of the current approaches for perceptual video coding only use visual features ignoring the auditory component. Many psychophysical studies have demonstrated that auditory stimuli affects our visual perception. In this paper we present our study of audio triggered emotional attention and it's applicability to perceptual video coding. Experiments with movie clips show that the reaction time to detect video compression artifacts was longer when video was presented with the audio information. The results reported are statistically significant with p=0.024.
System Synchronizes Recordings from Separated Video Cameras
NASA Technical Reports Server (NTRS)
Nail, William; Nail, William L.; Nail, Jasper M.; Le, Doung T.
2009-01-01
A system of electronic hardware and software for synchronizing recordings from multiple, physically separated video cameras is being developed, primarily for use in multiple-look-angle video production. The system, the time code used in the system, and the underlying method of synchronization upon which the design of the system is based are denoted generally by the term "Geo-TimeCode(TradeMark)." The system is embodied mostly in compact, lightweight, portable units (see figure) denoted video time-code units (VTUs) - one VTU for each video camera. The system is scalable in that any number of camera recordings can be synchronized. The estimated retail price per unit would be about $350 (in 2006 dollars). The need for this or another synchronization system external to video cameras arises because most video cameras do not include internal means for maintaining synchronization with other video cameras. Unlike prior video-camera-synchronization systems, this system does not depend on continuous cable or radio links between cameras (however, it does depend on occasional cable links lasting a few seconds). Also, whereas the time codes used in prior video-camera-synchronization systems typically repeat after 24 hours, the time code used in this system does not repeat for slightly more than 136 years; hence, this system is much better suited for long-term deployment of multiple cameras.
Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder.
Song, Jingkuan; Zhang, Hanwang; Li, Xiangpeng; Gao, Lianli; Wang, Meng; Hong, Richang
2018-07-01
Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization, which have not adequately explored the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework dubbed self-supervised video hashing (SSVH), which is able to capture the temporal nature of videos in an end-to-end learning to hash fashion. We specifically address two central problems: 1) how to design an encoder-decoder architecture to generate binary codes for videos and 2) how to equip the binary codes with the ability of accurate video retrieval. We design a hierarchical binary auto-encoder to model the temporal dependencies in videos with multiple granularities, and embed the videos into binary codes with less computations than the stacked architecture. Then, we encourage the binary codes to simultaneously reconstruct the visual content and neighborhood structure of the videos. Experiments on two real-world data sets show that our SSVH method can significantly outperform the state-of-the-art methods and achieve the current best performance on the task of unsupervised video retrieval.
Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder
NASA Astrophysics Data System (ADS)
Song, Jingkuan; Zhang, Hanwang; Li, Xiangpeng; Gao, Lianli; Wang, Meng; Hong, Richang
2018-07-01
Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization, which have not adequately explored the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework dubbed Self-Supervised Video Hashing (SSVH), that is able to capture the temporal nature of videos in an end-to-end learning-to-hash fashion. We specifically address two central problems: 1) how to design an encoder-decoder architecture to generate binary codes for videos; and 2) how to equip the binary codes with the ability of accurate video retrieval. We design a hierarchical binary autoencoder to model the temporal dependencies in videos with multiple granularities, and embed the videos into binary codes with less computations than the stacked architecture. Then, we encourage the binary codes to simultaneously reconstruct the visual content and neighborhood structure of the videos. Experiments on two real-world datasets (FCVID and YFCC) show that our SSVH method can significantly outperform the state-of-the-art methods and achieve the currently best performance on the task of unsupervised video retrieval.
2014-05-01
fusion, space and astrophysical plasmas, but still the general picture can be presented quite well with the fluid approach [6, 7]. The microscopic...purpose computing CPU for algorithms where processing of large blocks of data is done in parallel. The reason for that is the GPU’s highly effective...parallel structure. Most of the image and video processing computations involve heavy matrix and vector op- erations over large amounts of data and
A method of operation scheduling based on video transcoding for cluster equipment
NASA Astrophysics Data System (ADS)
Zhou, Haojie; Yan, Chun
2018-04-01
Because of the cluster technology in real-time video transcoding device, the application of facing the massive growth in the number of video assignments and resolution and bit rate of diversity, task scheduling algorithm, and analyze the current mainstream of cluster for real-time video transcoding equipment characteristics of the cluster, combination with the characteristics of the cluster equipment task delay scheduling algorithm is proposed. This algorithm enables the cluster to get better performance in the generation of the job queue and the lower part of the job queue when receiving the operation instruction. In the end, a small real-time video transcode cluster is constructed to analyze the calculation ability, running time, resource occupation and other aspects of various algorithms in operation scheduling. The experimental results show that compared with traditional clustering task scheduling algorithm, task delay scheduling algorithm has more flexible and efficient characteristics.
Uranus: a rapid prototyping tool for FPGA embedded computer vision
NASA Astrophysics Data System (ADS)
Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.
2007-01-01
The starting point for all successful system development is the simulation. Performing high level simulation of a system can help to identify, insolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms with support to migrate them to an FPGA environment for algorithm acceleration and embedded processes purposes. The tool includes an integrated library of previous coded operators in software and provides the necessary support to read and display image sequences as well as video files. The user can use the previous compiled soft-operators in a high level process chain, and code his own operators. Additional to the prototyping tool, Uranus offers FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC based system. The Uranus environment is intended for rapid prototyping of machine vision and the migration to FPGA accelerator platform, and it is distributed for academic purposes.
Special-effect edit detection using VideoTrails: a comparison with existing techniques
NASA Astrophysics Data System (ADS)
Kobla, Vikrant; DeMenthon, Daniel; Doermann, David S.
1998-12-01
Video segmentation plays an integral role in many multimedia applications, such as digital libraries, content management systems, and various other video browsing, indexing, and retrieval systems. Many algorithms for segmentation of video have appeared within the past few years. Most of these algorithms perform well on cuts, but yield poor performance on gradual transitions or special effects edits. A complete video segmentation system must also achieve good performance on special effect edit detection. In this paper, we discuss the performance of our Video Trails-based algorithms, with other existing special effect edit-detection algorithms within the literature. Results from experiments testing for the ability to detect edits from TV programs, ranging from commercials to news magazine programs, including diverse special effect edits, which we have introduced.
An efficient system for reliably transmitting image and video data over low bit rate noisy channels
NASA Technical Reports Server (NTRS)
Costello, Daniel J., Jr.; Huang, Y. F.; Stevenson, Robert L.
1994-01-01
This research project is intended to develop an efficient system for reliably transmitting image and video data over low bit rate noisy channels. The basic ideas behind the proposed approach are the following: employ statistical-based image modeling to facilitate pre- and post-processing and error detection, use spare redundancy that the source compression did not remove to add robustness, and implement coded modulation to improve bandwidth efficiency and noise rejection. Over the last six months, progress has been made on various aspects of the project. Through our studies of the integrated system, a list-based iterative Trellis decoder has been developed. The decoder accepts feedback from a post-processor which can detect channel errors in the reconstructed image. The error detection is based on the Huber Markov random field image model for the compressed image. The compression scheme used here is that of JPEG (Joint Photographic Experts Group). Experiments were performed and the results are quite encouraging. The principal ideas here are extendable to other compression techniques. In addition, research was also performed on unequal error protection channel coding, subband vector quantization as a means of source coding, and post processing for reducing coding artifacts. Our studies on unequal error protection (UEP) coding for image transmission focused on examining the properties of the UEP capabilities of convolutional codes. The investigation of subband vector quantization employed a wavelet transform with special emphasis on exploiting interband redundancy. The outcome of this investigation included the development of three algorithms for subband vector quantization. The reduction of transform coding artifacts was studied with the aid of a non-Gaussian Markov random field model. This results in improved image decompression. These studies are summarized and the technical papers included in the appendices.
Expressing Youth Voice through Video Games and Coding
ERIC Educational Resources Information Center
Martin, Crystle
2017-01-01
A growing body of research focuses on the impact of video games and coding on learning. The research often elevates learning the technical skills associated with video games and coding or the importance of problem solving and computational thinking, which are, of course, necessary and relevant. However, the literature less often explores how young…
Application discussion of source coding standard in voyage data recorder
NASA Astrophysics Data System (ADS)
Zong, Yonggang; Zhao, Xiandong
2018-04-01
This paper analyzes the disadvantages of the audio and video compression coding technology used by Voyage Data Recorder, and combines the improvement of performance of audio and video acquisition equipment. The thinking of improving the audio and video compression coding technology of the voyage data recorder is proposed, and the feasibility of adopting the new compression coding technology is analyzed from economy and technology two aspects.
Non-US data compression and coding research. FASAC Technical Assessment Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gray, R.M.; Cohn, M.; Craver, L.W.
1993-11-01
This assessment of recent data compression and coding research outside the United States examines fundamental and applied work in the basic areas of signal decomposition, quantization, lossless compression, and error control, as well as application development efforts in image/video compression and speech/audio compression. Seven computer scientists and engineers who are active in development of these technologies in US academia, government, and industry carried out the assessment. Strong industrial and academic research groups in Western Europe, Israel, and the Pacific Rim are active in the worldwide search for compression algorithms that provide good tradeoffs among fidelity, bit rate, and computational complexity,more » though the theoretical roots and virtually all of the classical compression algorithms were developed in the United States. Certain areas, such as segmentation coding, model-based coding, and trellis-coded modulation, have developed earlier or in more depth outside the United States, though the United States has maintained its early lead in most areas of theory and algorithm development. Researchers abroad are active in other currently popular areas, such as quantizer design techniques based on neural networks and signal decompositions based on fractals and wavelets, but, in most cases, either similar research is or has been going on in the United States, or the work has not led to useful improvements in compression performance. Because there is a high degree of international cooperation and interaction in this field, good ideas spread rapidly across borders (both ways) through international conferences, journals, and technical exchanges. Though there have been no fundamental data compression breakthroughs in the past five years--outside or inside the United State--there have been an enormous number of significant improvements in both places in the tradeoffs among fidelity, bit rate, and computational complexity.« less
Discriminative object tracking via sparse representation and online dictionary learning.
Xie, Yuan; Zhang, Wensheng; Li, Cuihua; Lin, Shuyang; Qu, Yanyun; Zhang, Yinghua
2014-04-01
We propose a robust tracking algorithm based on local sparse coding with discriminative dictionary learning and new keypoint matching schema. This algorithm consists of two parts: the local sparse coding with online updated discriminative dictionary for tracking (SOD part), and the keypoint matching refinement for enhancing the tracking performance (KP part). In the SOD part, the local image patches of the target object and background are represented by their sparse codes using an over-complete discriminative dictionary. Such discriminative dictionary, which encodes the information of both the foreground and the background, may provide more discriminative power. Furthermore, in order to adapt the dictionary to the variation of the foreground and background during the tracking, an online learning method is employed to update the dictionary. The KP part utilizes refined keypoint matching schema to improve the performance of the SOD. With the help of sparse representation and online updated discriminative dictionary, the KP part are more robust than the traditional method to reject the incorrect matches and eliminate the outliers. The proposed method is embedded into a Bayesian inference framework for visual tracking. Experimental results on several challenging video sequences demonstrate the effectiveness and robustness of our approach.
Chroma intra prediction based on inter-channel correlation for HEVC.
Zhang, Xingyu; Gisquet, Christophe; François, Edouard; Zou, Feng; Au, Oscar C
2014-01-01
In this paper, we investigate a new inter-channel coding mode called LM mode proposed for the next generation video coding standard called high efficiency video coding. This mode exploits inter-channel correlation using reconstructed luma to predict chroma linearly with parameters derived from neighboring reconstructed luma and chroma pixels at both encoder and decoder to avoid overhead signaling. In this paper, we analyze the LM mode and prove that the LM parameters for predicting original chroma and reconstructed chroma are statistically the same. We also analyze the error sensitivity of the LM parameters. We identify some LM mode problematic situations and propose three novel LM-like modes called LMA, LML, and LMO to address the situations. To limit the increase in complexity due to the LM-like modes, we propose some fast algorithms with the help of some new cost functions. We further identify some potentially-problematic conditions in the parameter estimation (including regression dilution problem) and introduce a novel model correction technique to detect and correct those conditions. Simulation results suggest that considerable BD-rate reduction can be achieved by the proposed LM-like modes and model correction technique. In addition, the performance gain of the two techniques appears to be essentially additive when combined.
Impulsive noise removal from color video with morphological filtering
NASA Astrophysics Data System (ADS)
Ruchay, Alexey; Kober, Vitaly
2017-09-01
This paper deals with impulse noise removal from color video. The proposed noise removal algorithm employs a switching filtering for denoising of color video; that is, detection of corrupted pixels by means of a novel morphological filtering followed by removal of the detected pixels on the base of estimation of uncorrupted pixels in the previous scenes. With the help of computer simulation we show that the proposed algorithm is able to well remove impulse noise in color video. The performance of the proposed algorithm is compared in terms of image restoration metrics with that of common successful algorithms.
Chroma sampling and modulation techniques in high dynamic range video coding
NASA Astrophysics Data System (ADS)
Dai, Wei; Krishnan, Madhu; Topiwala, Pankaj
2015-09-01
High Dynamic Range and Wide Color Gamut (HDR/WCG) Video Coding is an area of intense research interest in the engineering community, for potential near-term deployment in the marketplace. HDR greatly enhances the dynamic range of video content (up to 10,000 nits), as well as broadens the chroma representation (BT.2020). The resulting content offers new challenges in its coding and transmission. The Moving Picture Experts Group (MPEG) of the International Standards Organization (ISO) is currently exploring coding efficiency and/or the functionality enhancements of the recently developed HEVC video standard for HDR and WCG content. FastVDO has developed an advanced approach to coding HDR video, based on splitting the HDR signal into a smoothed luminance (SL) signal, and an associated base signal (B). Both signals are then chroma downsampled to YFbFr 4:2:0 signals, using advanced resampling filters, and coded using the Main10 High Efficiency Video Coding (HEVC) standard, which has been developed jointly by ISO/IEC MPEG and ITU-T WP3/16 (VCEG). Our proposal offers both efficient coding, and backwards compatibility with the existing HEVC Main10 Profile. That is, an existing Main10 decoder can produce a viewable standard dynamic range video, suitable for existing screens. Subjective tests show visible improvement over the anchors. Objective tests show a sizable gain of over 25% in PSNR (RGB domain) on average, for a key set of test clips selected by the ISO/MPEG committee.
Prioritized packet video transmission over time-varying wireless channel using proactive FEC
NASA Astrophysics Data System (ADS)
Kumwilaisak, Wuttipong; Kim, JongWon; Kuo, C.-C. Jay
2000-12-01
Quality of video transmitted over time-varying wireless channels relies heavily on the coordinated effort to cope with both channel and source variations dynamically. Given the priority of each source packet and the estimated channel condition, an adaptive protection scheme based on joint source-channel criteria is investigated via proactive forward error correction (FEC). With proactive FEC in Reed Solomon (RS)/Rate-compatible punctured convolutional (RCPC) codes, we study a practical algorithm to match the relative priority of source packets and instantaneous channel conditions. The channel condition is estimated to capture the long-term fading effect in terms of the averaged SNR over a preset window. Proactive protection is performed for each packet based on the joint source-channel criteria with special attention to the accuracy, time-scale match, and feedback delay of channel status estimation. The overall gain of the proposed protection mechanism is demonstrated in terms of the end-to-end wireless video performance.
Learning Collaborative Sparse Representation for Grayscale-Thermal Tracking.
Li, Chenglong; Cheng, Hui; Hu, Shiyi; Liu, Xiaobai; Tang, Jin; Lin, Liang
2016-09-27
Integrating multiple different yet complementary feature representations has been proved to be an effective way for boosting tracking performance. This paper investigates how to perform robust object tracking in challenging scenarios by adaptively incorporating information from grayscale and thermal videos, and proposes a novel collaborative algorithm for online tracking. In particular, an adaptive fusion scheme is proposed based on collaborative sparse representation in Bayesian filtering framework. We jointly optimize sparse codes and the reliable weights of different modalities in an online way. In addition, this work contributes a comprehensive video benchmark, which includes 50 grayscale-thermal sequences and their ground truth annotations for tracking purpose. The videos are with high diversity and the annotations were finished by one single person to guarantee consistency. Extensive experiments against other stateof- the-art trackers with both grayscale and grayscale-thermal inputs demonstrate the effectiveness of the proposed tracking approach. Through analyzing quantitative results, we also provide basic insights and potential future research directions in grayscale-thermal tracking.
Yao, Guangle; Lei, Tao; Zhong, Jiandan; Jiang, Ping; Jia, Wenwu
2017-01-01
Background subtraction (BS) is one of the most commonly encountered tasks in video analysis and tracking systems. It distinguishes the foreground (moving objects) from the video sequences captured by static imaging sensors. Background subtraction in remote scene infrared (IR) video is important and common to lots of fields. This paper provides a Remote Scene IR Dataset captured by our designed medium-wave infrared (MWIR) sensor. Each video sequence in this dataset is identified with specific BS challenges and the pixel-wise ground truth of foreground (FG) for each frame is also provided. A series of experiments were conducted to evaluate BS algorithms on this proposed dataset. The overall performance of BS algorithms and the processor/memory requirements were compared. Proper evaluation metrics or criteria were employed to evaluate the capability of each BS algorithm to handle different kinds of BS challenges represented in this dataset. The results and conclusions in this paper provide valid references to develop new BS algorithm for remote scene IR video sequence, and some of them are not only limited to remote scene or IR video sequence but also generic for background subtraction. The Remote Scene IR dataset and the foreground masks detected by each evaluated BS algorithm are available online: https://github.com/JerryYaoGl/BSEvaluationRemoteSceneIR. PMID:28837112
Automated fall detection on privacy-enhanced video.
Edgcomb, Alex; Vahid, Frank
2012-01-01
A privacy-enhanced video obscures the appearance of a person in the video. We consider four privacy enhancements: blurring of the person, silhouetting of the person, covering the person with a graphical box, and covering the person with a graphical oval. We demonstrate that an automated video-based fall detection algorithm can be as accurate on privacy-enhanced video as on raw video. The algorithm operated on video from a stationary in-home camera, using a foreground-background segmentation algorithm to extract a minimum bounding rectangle (MBR) around the motion in the video, and using time series shapelet analysis on the height and width of the rectangle to detect falls. We report accuracy applying fall detection on 23 scenarios depicted as raw video and privacy-enhanced videos involving a sole actor portraying normal activities and various falls. We found that fall detection on privacy-enhanced video, except for the common approach of blurring of the person, was competitive with raw video, and in particular that the graphical oval privacy enhancement yielded the same accuracy as raw video, namely 0.91 sensitivity and 0.92 specificity.
Apply network coding for H.264/SVC multicasting
NASA Astrophysics Data System (ADS)
Wang, Hui; Kuo, C.-C. Jay
2008-08-01
In a packet erasure network environment, video streaming benefits from error control in two ways to achieve graceful degradation. The first approach is application-level (or the link-level) forward error-correction (FEC) to provide erasure protection. The second error control approach is error concealment at the decoder end to compensate lost packets. A large amount of research work has been done in the above two areas. More recently, network coding (NC) techniques have been proposed for efficient data multicast over networks. It was shown in our previous work that multicast video streaming benefits from NC for its throughput improvement. An algebraic model is given to analyze the performance in this work. By exploiting the linear combination of video packets along nodes in a network and the SVC video format, the system achieves path diversity automatically and enables efficient video delivery to heterogeneous receivers in packet erasure channels. The application of network coding can protect video packets against the erasure network environment. However, the rank defficiency problem of random linear network coding makes the error concealment inefficiently. It is shown by computer simulation that the proposed NC video multicast scheme enables heterogenous receiving according to their capacity constraints. But it needs special designing to improve the video transmission performance when applying network coding.
NASA Astrophysics Data System (ADS)
Boumehrez, Farouk; Brai, Radhia; Doghmane, Noureddine; Mansouri, Khaled
2018-01-01
Recently, video streaming has attracted much attention and interest due to its capability to process and transmit large data. We propose a quality of experience (QoE) model relying on high efficiency video coding (HEVC) encoder adaptation scheme, in turn based on the multiple description coding (MDC) for video streaming. The main contributions of the paper are (1) a performance evaluation of the new and emerging video coding standard HEVC/H.265, which is based on the variation of quantization parameter (QP) values depending on different video contents to deduce their influence on the sequence to be transmitted, (2) QoE support multimedia applications in wireless networks are investigated, so we inspect the packet loss impact on the QoE of transmitted video sequences, (3) HEVC encoder parameter adaptation scheme based on MDC is modeled with the encoder parameter and objective QoE model. A comparative study revealed that the proposed MDC approach is effective for improving the transmission with a peak signal-to-noise ratio (PSNR) gain of about 2 to 3 dB. Results show that a good choice of QP value can compensate for transmission channel effects and improve received video quality, although HEVC/H.265 is also sensitive to packet loss. The obtained results show the efficiency of our proposed method in terms of PSNR and mean-opinion-score.
Video quality assessment using a statistical model of human visual speed perception.
Wang, Zhou; Li, Qiang
2007-12-01
Motion is one of the most important types of information contained in natural video, but direct use of motion information in the design of video quality assessment algorithms has not been deeply investigated. Here we propose to incorporate a recent model of human visual speed perception [Nat. Neurosci. 9, 578 (2006)] and model visual perception in an information communication framework. This allows us to estimate both the motion information content and the perceptual uncertainty in video signals. Improved video quality assessment algorithms are obtained by incorporating the model as spatiotemporal weighting factors, where the weight increases with the information content and decreases with the perceptual uncertainty. Consistent improvement over existing video quality assessment algorithms is observed in our validation with the video quality experts group Phase I test data set.
NASA Astrophysics Data System (ADS)
Pei, Yong; Modestino, James W.
2004-12-01
Digital video delivered over wired-to-wireless networks is expected to suffer quality degradation from both packet loss and bit errors in the payload. In this paper, the quality degradation due to packet loss and bit errors in the payload are quantitatively evaluated and their effects are assessed. We propose the use of a concatenated forward error correction (FEC) coding scheme employing Reed-Solomon (RS) codes and rate-compatible punctured convolutional (RCPC) codes to protect the video data from packet loss and bit errors, respectively. Furthermore, the performance of a joint source-channel coding (JSCC) approach employing this concatenated FEC coding scheme for video transmission is studied. Finally, we describe an improved end-to-end architecture using an edge proxy in a mobile support station to implement differential error protection for the corresponding channel impairments expected on the two networks. Results indicate that with an appropriate JSCC approach and the use of an edge proxy, FEC-based error-control techniques together with passive error-recovery techniques can significantly improve the effective video throughput and lead to acceptable video delivery quality over time-varying heterogeneous wired-to-wireless IP networks.
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis
Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.; ...
2017-10-16
This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.
This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less
NASA Astrophysics Data System (ADS)
Russkova, Tatiana V.
2017-11-01
One tool to improve the performance of Monte Carlo methods for numerical simulation of light transport in the Earth's atmosphere is the parallel technology. A new algorithm oriented to parallel execution on the CUDA-enabled NVIDIA graphics processor is discussed. The efficiency of parallelization is analyzed on the basis of calculating the upward and downward fluxes of solar radiation in both a vertically homogeneous and inhomogeneous models of the atmosphere. The results of testing the new code under various atmospheric conditions including continuous singlelayered and multilayered clouds, and selective molecular absorption are presented. The results of testing the code using video cards with different compute capability are analyzed. It is shown that the changeover of computing from conventional PCs to the architecture of graphics processors gives more than a hundredfold increase in performance and fully reveals the capabilities of the technology used.
An improved real time superresolution FPGA system
NASA Astrophysics Data System (ADS)
Lakshmi Narasimha, Pramod; Mudigoudar, Basavaraj; Yue, Zhanfeng; Topiwala, Pankaj
2009-05-01
In numerous computer vision applications, enhancing the quality and resolution of captured video can be critical. Acquired video is often grainy and low quality due to motion, transmission bottlenecks, etc. Postprocessing can enhance it. Superresolution greatly decreases camera jitter to deliver a smooth, stabilized, high quality video. In this paper, we extend previous work on a real-time superresolution application implemented in ASIC/FPGA hardware. A gradient based technique is used to register the frames at the sub-pixel level. Once we get the high resolution grid, we use an improved regularization technique in which the image is iteratively modified by applying back-projection to get a sharp and undistorted image. The algorithm was first tested in software and migrated to hardware, to achieve 320x240 -> 1280x960, about 30 fps, a stunning superresolution by 16X in total pixels. Various input parameters, such as size of input image, enlarging factor and the number of nearest neighbors, can be tuned conveniently by the user. We use a maximum word size of 32 bits to implement the algorithm in Matlab Simulink as well as in FPGA hardware, which gives us a fine balance between the number of bits and performance. The proposed system is robust and highly efficient. We have shown the performance improvement of the hardware superresolution over the software version (C code).
Digital cinema system using JPEG2000 movie of 8-million pixel resolution
NASA Astrophysics Data System (ADS)
Fujii, Tatsuya; Nomura, Mitsuru; Shirai, Daisuke; Yamaguchi, Takahiro; Fujii, Tetsuro; Ono, Sadayasu
2003-05-01
We have developed a prototype digital cinema system that can store, transmit and display extra high quality movies of 8-million pixel resolution, using JPEG2000 coding algorithm. The image quality is 4 times better than HDTV in resolution, and enables us to replace conventional films with digital cinema archives. Using wide-area optical gigabit IP networks, cinema contents are distributed and played back as a video-on-demand (VoD) system. The system consists of three main devices, a video server, a real-time JPEG2000 decoder, and a large-venue LCD projector. All digital movie data are compressed by JPEG2000 and stored in advance. The coded streams of 300~500 Mbps can be continuously transmitted from the PC server using TCP/IP. The decoder can perform the real-time decompression at 24/48 frames per second, using 120 parallel JPEG2000 processing elements. The received streams are expanded into 4.5Gbps raw video signals. The prototype LCD projector uses 3 pieces of 3840×2048 pixel reflective LCD panels (D-ILA) to show RGB 30-bit color movies fed by the decoder. The brightness exceeds 3000 ANSI lumens for a 300-inch screen. The refresh rate is chosen to 96Hz to thoroughly eliminate flickers, while preserving compatibility to cinema movies of 24 frames per second.
Real-time transmission of digital video using variable-length coding
NASA Technical Reports Server (NTRS)
Bizon, Thomas P.; Shalkhauser, Mary JO; Whyte, Wayne A., Jr.
1993-01-01
Huffman coding is a variable-length lossless compression technique where data with a high probability of occurrence is represented with short codewords, while 'not-so-likely' data is assigned longer codewords. Compression is achieved when the high-probability levels occur so frequently that their benefit outweighs any penalty paid when a less likely input occurs. One instance where Huffman coding is extremely effective occurs when data is highly predictable and differential coding can be applied (as with a digital video signal). For that reason, it is desirable to apply this compression technique to digital video transmission; however, special care must be taken in order to implement a communication protocol utilizing Huffman coding. This paper addresses several of the issues relating to the real-time transmission of Huffman-coded digital video over a constant-rate serial channel. Topics discussed include data rate conversion (from variable to a fixed rate), efficient data buffering, channel coding, recovery from communication errors, decoder synchronization, and decoder architectures. A description of the hardware developed to execute Huffman coding and serial transmission is also included. Although this paper focuses on matters relating to Huffman-coded digital video, the techniques discussed can easily be generalized for a variety of applications which require transmission of variable-length data.
NASA Astrophysics Data System (ADS)
Siddeq, M. M.; Rodrigues, M. A.
2015-09-01
Image compression techniques are widely used on 2D image 2D video 3D images and 3D video. There are many types of compression techniques and among the most popular are JPEG and JPEG2000. In this research, we introduce a new compression method based on applying a two level discrete cosine transform (DCT) and a two level discrete wavelet transform (DWT) in connection with novel compression steps for high-resolution images. The proposed image compression algorithm consists of four steps. (1) Transform an image by a two level DWT followed by a DCT to produce two matrices: DC- and AC-Matrix, or low and high frequency matrix, respectively, (2) apply a second level DCT on the DC-Matrix to generate two arrays, namely nonzero-array and zero-array, (3) apply the Minimize-Matrix-Size algorithm to the AC-Matrix and to the other high-frequencies generated by the second level DWT, (4) apply arithmetic coding to the output of previous steps. A novel decompression algorithm, Fast-Match-Search algorithm (FMS), is used to reconstruct all high-frequency matrices. The FMS-algorithm computes all compressed data probabilities by using a table of data, and then using a binary search algorithm for finding decompressed data inside the table. Thereafter, all decoded DC-values with the decoded AC-coefficients are combined in one matrix followed by inverse two levels DCT with two levels DWT. The technique is tested by compression and reconstruction of 3D surface patches. Additionally, this technique is compared with JPEG and JPEG2000 algorithm through 2D and 3D root-mean-square-error following reconstruction. The results demonstrate that the proposed compression method has better visual properties than JPEG and JPEG2000 and is able to more accurately reconstruct surface patches in 3D.
Kapeller, Christoph; Kamada, Kyousuke; Ogawa, Hiroshi; Prueckl, Robert; Scharinger, Josef; Guger, Christoph
2014-01-01
A brain-computer-interface (BCI) allows the user to control a device or software with brain activity. Many BCIs rely on visual stimuli with constant stimulation cycles that elicit steady-state visual evoked potentials (SSVEP) in the electroencephalogram (EEG). This EEG response can be generated with a LED or a computer screen flashing at a constant frequency, and similar EEG activity can be elicited with pseudo-random stimulation sequences on a screen (code-based BCI). Using electrocorticography (ECoG) instead of EEG promises higher spatial and temporal resolution and leads to more dominant evoked potentials due to visual stimulation. This work is focused on BCIs based on visual evoked potentials (VEP) and its capability as a continuous control interface for augmentation of video applications. One 35 year old female subject with implanted subdural grids participated in the study. The task was to select one out of four visual targets, while each was flickering with a code sequence. After a calibration run including 200 code sequences, a linear classifier was used during an evaluation run to identify the selected visual target based on the generated code-based VEPs over 20 trials. Multiple ECoG buffer lengths were tested and the subject reached a mean online classification accuracy of 99.21% for a window length of 3.15 s. Finally, the subject performed an unsupervised free run in combination with visual feedback of the current selection. Additionally, an algorithm was implemented that allowed to suppress false positive selections and this allowed the subject to start and stop the BCI at any time. The code-based BCI system attained very high online accuracy, which makes this approach very promising for control applications where a continuous control signal is needed. PMID:25147509
3D Visualization of Machine Learning Algorithms with Astronomical Data
NASA Astrophysics Data System (ADS)
Kent, Brian R.
2016-01-01
We present innovative machine learning (ML) methods using unsupervised clustering with minimum spanning trees (MSTs) to study 3D astronomical catalogs. Utilizing Python code to build trees based on galaxy catalogs, we can render the results with the visualization suite Blender to produce interactive 360 degree panoramic videos. The catalogs and their ML results can be explored in a 3D space using mobile devices, tablets or desktop browsers. We compare the statistics of the MST results to a number of machine learning methods relating to optimization and efficiency.
The emerging High Efficiency Video Coding standard (HEVC)
NASA Astrophysics Data System (ADS)
Raja, Gulistan; Khan, Awais
2013-12-01
High definition video (HDV) is becoming popular day by day. This paper describes the performance analysis of latest upcoming video standard known as High Efficiency Video Coding (HEVC). HEVC is designed to fulfil all the requirements for future high definition videos. In this paper, three configurations (intra only, low delay and random access) of HEVC are analyzed using various 480p, 720p and 1080p high definition test video sequences. Simulation results show the superior objective and subjective quality of HEVC.
Spatial resampling of IDR frames for low bitrate video coding with HEVC
NASA Astrophysics Data System (ADS)
Hosking, Brett; Agrafiotis, Dimitris; Bull, David; Easton, Nick
2015-03-01
As the demand for higher quality and higher resolution video increases, many applications fail to meet this demand due to low bandwidth restrictions. One factor contributing to this problem is the high bitrate requirement of the intra-coded Instantaneous Decoding Refresh (IDR) frames featuring in all video coding standards. Frequent coding of IDR frames is essential for error resilience in order to prevent the occurrence of error propagation. However, as each one consumes a huge portion of the available bitrate, the quality of future coded frames is hindered by high levels of compression. This work presents a new technique, known as Spatial Resampling of IDR Frames (SRIF), and shows how it can increase the rate distortion performance by providing a higher and more consistent level of video quality at low bitrates.
Resource allocation for error resilient video coding over AWGN using optimization approach.
An, Cheolhong; Nguyen, Truong Q
2008-12-01
The number of slices for error resilient video coding is jointly optimized with 802.11a-like media access control and the physical layers with automatic repeat request and rate compatible punctured convolutional code over additive white gaussian noise channel as well as channel times allocation for time division multiple access. For error resilient video coding, the relation between the number of slices and coding efficiency is analyzed and formulated as a mathematical model. It is applied for the joint optimization problem, and the problem is solved by a convex optimization method such as the primal-dual decomposition method. We compare the performance of a video communication system which uses the optimal number of slices with one that codes a picture as one slice. From numerical examples, end-to-end distortion of utility functions can be significantly reduced with the optimal slices of a picture especially at low signal-to-noise ratio.
H.264 Layered Coded Video over Wireless Networks: Channel Coding and Modulation Constraints
NASA Astrophysics Data System (ADS)
Ghandi, M. M.; Barmada, B.; Jones, E. V.; Ghanbari, M.
2006-12-01
This paper considers the prioritised transmission of H.264 layered coded video over wireless channels. For appropriate protection of video data, methods such as prioritised forward error correction coding (FEC) or hierarchical quadrature amplitude modulation (HQAM) can be employed, but each imposes system constraints. FEC provides good protection but at the price of a high overhead and complexity. HQAM is less complex and does not introduce any overhead, but permits only fixed data ratios between the priority layers. Such constraints are analysed and practical solutions are proposed for layered transmission of data-partitioned and SNR-scalable coded video where combinations of HQAM and FEC are used to exploit the advantages of both coding methods. Simulation results show that the flexibility of SNR scalability and absence of picture drift imply that SNR scalability as modelled is superior to data partitioning in such applications.
Compression of stereoscopic video using MPEG-2
NASA Astrophysics Data System (ADS)
Puri, A.; Kollarits, Richard V.; Haskell, Barry G.
1995-10-01
Many current as well as emerging applications in areas of entertainment, remote operations, manufacturing industry and medicine can benefit from the depth perception offered by stereoscopic video systems which employ two views of a scene imaged under the constraints imposed by human visual system. Among the many challenges to be overcome for practical realization and widespread use of 3D/stereoscopic systems are good 3D displays and efficient techniques for digital compression of enormous amounts of data while maintaining compatibility with normal video decoding and display systems. After a brief introduction to the basics of 3D/stereo including issues of depth perception, stereoscopic 3D displays and terminology in stereoscopic imaging and display, we present an overview of tools in the MPEG-2 video standard that are relevant to our discussion on compression of stereoscopic video, which is the main topic of this paper. Next, we outilne the various approaches for compression of stereoscopic video and then focus on compatible stereoscopic video coding using MPEG-2 Temporal scalability concepts. Compatible coding employing two different types of prediction structures become potentially possible, disparity compensated prediction and combined disparity and motion compensated predictions. To further improve coding performance and display quality, preprocessing for reducing mismatch between the two views forming stereoscopic video is considered. Results of simulations performed on stereoscopic video of normal TV resolution are then reported comparing the performance of two prediction structures with the simulcast solution. It is found that combined disparity and motion compensated prediction offers the best performance. Results indicate that compression of both views of stereoscopic video of normal TV resolution appears feasible in a total of 6 to 8 Mbit/s. We then discuss regarding multi-viewpoint video, a generalization of stereoscopic video. Finally, we describe ongoing efforts within MPEG-2 to define a profile for stereoscopic video coding, as well as, the promise of MPEG-4 in addressing coding of multi-viewpoint video.
Compression of stereoscopic video using MPEG-2
NASA Astrophysics Data System (ADS)
Puri, Atul; Kollarits, Richard V.; Haskell, Barry G.
1995-12-01
Many current as well as emerging applications in areas of entertainment, remote operations, manufacturing industry and medicine can benefit from the depth perception offered by stereoscopic video systems which employ two views of a scene imaged under the constraints imposed by human visual system. Among the many challenges to be overcome for practical realization and widespread use of 3D/stereoscopic systems are good 3D displays and efficient techniques for digital compression of enormous amounts of data while maintaining compatibility with normal video decoding and display systems. After a brief introduction to the basics of 3D/stereo including issues of depth perception, stereoscopic 3D displays and terminology in stereoscopic imaging and display, we present an overview of tools in the MPEG-2 video standard that are relevant to our discussion on compression of stereoscopic video, which is the main topic of this paper. Next, we outline the various approaches for compression of stereoscopic video and then focus on compatible stereoscopic video coding using MPEG-2 Temporal scalability concepts. Compatible coding employing two different types of prediction structures become potentially possible, disparity compensated prediction and combined disparity and motion compensated predictions. To further improve coding performance and display quality, preprocessing for reducing mismatch between the two views forming stereoscopic video is considered. Results of simulations performed on stereoscopic video of normal TV resolution are then reported comparing the performance of two prediction structures with the simulcast solution. It is found that combined disparity and motion compensated prediction offers the best performance. Results indicate that compression of both views of stereoscopic video of normal TV resolution appears feasible in a total of 6 to 8 Mbit/s. We then discuss regarding multi-viewpoint video, a generalization of stereoscopic video. Finally, we describe ongoing efforts within MPEG-2 to define a profile for stereoscopic video coding, as well as, the promise of MPEG-4 in addressing coding of multi-viewpoint video.
Video coding for 3D-HEVC based on saliency information
NASA Astrophysics Data System (ADS)
Yu, Fang; An, Ping; Yang, Chao; You, Zhixiang; Shen, Liquan
2016-11-01
As an extension of High Efficiency Video Coding ( HEVC), 3D-HEVC has been widely researched under the impetus of the new generation coding standard in recent years. Compared with H.264/AVC, its compression efficiency is doubled while keeping the same video quality. However, its higher encoding complexity and longer encoding time are not negligible. To reduce the computational complexity and guarantee the subjective quality of virtual views, this paper presents a novel video coding method for 3D-HEVC based on the saliency informat ion which is an important part of Human Visual System (HVS). First of all, the relationship between the current coding unit and its adjacent units is used to adjust the maximum depth of each largest coding unit (LCU) and determine the SKIP mode reasonably. Then, according to the saliency informat ion of each frame image, the texture and its corresponding depth map will be divided into three regions, that is, salient area, middle area and non-salient area. Afterwards, d ifferent quantization parameters will be assigned to different regions to conduct low complexity coding. Finally, the compressed video will generate new view point videos through the renderer tool. As shown in our experiments, the proposed method saves more bit rate than other approaches and achieves up to highest 38% encoding time reduction without subjective quality loss in compression or rendering.
NASA Astrophysics Data System (ADS)
Al-Hayani, Nazar; Al-Jawad, Naseer; Jassim, Sabah A.
2014-05-01
Video compression and encryption became very essential in a secured real time video transmission. Applying both techniques simultaneously is one of the challenges where the size and the quality are important in multimedia transmission. In this paper we proposed a new technique for video compression and encryption. Both encryption and compression are based on edges extracted from the high frequency sub-bands of wavelet decomposition. The compression algorithm based on hybrid of: discrete wavelet transforms, discrete cosine transform, vector quantization, wavelet based edge detection, and phase sensing. The compression encoding algorithm treats the video reference and non-reference frames in two different ways. The encryption algorithm utilized A5 cipher combined with chaotic logistic map to encrypt the significant parameters and wavelet coefficients. Both algorithms can be applied simultaneously after applying the discrete wavelet transform on each individual frame. Experimental results show that the proposed algorithms have the following features: high compression, acceptable quality, and resistance to the statistical and bruteforce attack with low computational processing.
Layer-based buffer aware rate adaptation design for SHVC video streaming
NASA Astrophysics Data System (ADS)
Gudumasu, Srinivas; Hamza, Ahmed; Asbun, Eduardo; He, Yong; Ye, Yan
2016-09-01
This paper proposes a layer based buffer aware rate adaptation design which is able to avoid abrupt video quality fluctuation, reduce re-buffering latency and improve bandwidth utilization when compared to a conventional simulcast based adaptive streaming system. The proposed adaptation design schedules DASH segment requests based on the estimated bandwidth, dependencies among video layers and layer buffer fullness. Scalable HEVC video coding is the latest state-of-art video coding technique that can alleviate various issues caused by simulcast based adaptive video streaming. With scalable coded video streams, the video is encoded once into a number of layers representing different qualities and/or resolutions: a base layer (BL) and one or more enhancement layers (EL), each incrementally enhancing the quality of the lower layers. Such layer based coding structure allows fine granularity rate adaptation for the video streaming applications. Two video streaming use cases are presented in this paper. The first use case is to stream HD SHVC video over a wireless network where available bandwidth varies, and the performance comparison between proposed layer-based streaming approach and conventional simulcast streaming approach is provided. The second use case is to stream 4K/UHD SHVC video over a hybrid access network that consists of a 5G millimeter wave high-speed wireless link and a conventional wired or WiFi network. The simulation results verify that the proposed layer based rate adaptation approach is able to utilize the bandwidth more efficiently. As a result, a more consistent viewing experience with higher quality video content and minimal video quality fluctuations can be presented to the user.
Patient-Physician Communication About Code Status Preferences: A Randomized Controlled Trial
Rhondali, Wadih; Perez-Cruz, Pedro; Hui, David; Chisholm, Gary B.; Dalal, Shalini; Baile, Walter; Chittenden, Eva; Bruera, Eduardo
2013-01-01
Purpose Code status discussions are important in cancer care. The best modality for such discussions has not been established. Our objective was to determine the impact of a physician ending a code status discussion with a question (autonomy approach) versus a recommendation (beneficence approach) on patients' do-not-resuscitate (DNR) preference. Methods Patients in a supportive care clinic watched two videos showing a physician-patient discussion regarding code status. Both videos were identical except for the ending: one ended with the physician asking for the patient's code status preference and the other with the physician recommending DNR. Patients were randomly assigned to watch the videos in different sequences. The main outcome was the proportion of patients choosing DNR for the video patient. Results 78 patients completed the study. 74% chose DNR after the question video, 73% after the recommendation video. Median physician compassion score was very high and not different for both videos. 30/30 patients who had chosen DNR for themselves and 30/48 patients who had not chosen DNR for themselves chose DNR for the video patient (100% v/s 62%). Age (OR=1.1/year) and white ethnicity (OR=9.43) predicted DNR choice for the video patient. Conclusion Ending DNR discussions with a question or a recommendation did not impact DNR choice or perception of physician compassion. Therefore, both approaches are clinically appropriate. All patients who chose DNR for themselves and most patients who did not choose DNR for themselves chose DNR for the video patient. Age and race predicted DNR choice. PMID:23564395
Learning-Based Just-Noticeable-Quantization- Distortion Modeling for Perceptual Video Coding.
Ki, Sehwan; Bae, Sung-Ho; Kim, Munchurl; Ko, Hyunsuk
2018-07-01
Conventional predictive video coding-based approaches are reaching the limit of their potential coding efficiency improvements, because of severely increasing computation complexity. As an alternative approach, perceptual video coding (PVC) has attempted to achieve high coding efficiency by eliminating perceptual redundancy, using just-noticeable-distortion (JND) directed PVC. The previous JNDs were modeled by adding white Gaussian noise or specific signal patterns into the original images, which were not appropriate in finding JND thresholds due to distortion with energy reduction. In this paper, we present a novel discrete cosine transform-based energy-reduced JND model, called ERJND, that is more suitable for JND-based PVC schemes. Then, the proposed ERJND model is extended to two learning-based just-noticeable-quantization-distortion (JNQD) models as preprocessing that can be applied for perceptual video coding. The two JNQD models can automatically adjust JND levels based on given quantization step sizes. One of the two JNQD models, called LR-JNQD, is based on linear regression and determines the model parameter for JNQD based on extracted handcraft features. The other JNQD model is based on a convolution neural network (CNN), called CNN-JNQD. To our best knowledge, our paper is the first approach to automatically adjust JND levels according to quantization step sizes for preprocessing the input to video encoders. In experiments, both the LR-JNQD and CNN-JNQD models were applied to high efficiency video coding (HEVC) and yielded maximum (average) bitrate reductions of 38.51% (10.38%) and 67.88% (24.91%), respectively, with little subjective video quality degradation, compared with the input without preprocessing applied.
Merino, Aimee M; Greiner, Ryan; Hartwig, Kristopher
2017-09-01
Patient preferences regarding cardiopulmonary resuscitation (CPR) are important, especially during hospitalization when a patient's health is changing. Yet many patients are not adequately informed or involved in the decision-making process. We examined the effect of an informational video about CPR on hospitalized patients' code status choices. This was a prospective, randomized trial conducted at the Minneapolis Veterans Affairs Health Care System in Minnesota. We enrolled 119 patients, hospitalized on the general medicine service, and at least 65 years old. The majority were men (97%) with a mean age of 75. A video described code status choices: full code (CPR and intubation if required), do not resuscitate (DNR), and do not resuscitate/do not intubate (DNR/DNI). Participants were randomized to watch the video (n = 59) or usual care (n = 60). The primary outcome was participants' code status preferences. Secondary outcomes included a questionnaire designed to evaluate participants' trust in their healthcare team and knowledge and perceptions about CPR. Participants who viewed the video were less likely to choose full code (37%) compared to participants in the usual care group (71%) and more likely to choose DNR/DNI (56% in the video group vs. 17% in the control group) ( < 0.00001). We did not see a difference in trust in their healthcare team or knowledge and perceptions about CPR as assessed by our questionnaire. Hospitalized patients who watched a video about CPR and code status choices were less likely to choose full code and more likely to choose DNR/DNI. © 2017 Society of Hospital Medicine
Toward enhancing the distributed video coder under a multiview video codec framework
NASA Astrophysics Data System (ADS)
Lee, Shih-Chieh; Chen, Jiann-Jone; Tsai, Yao-Hong; Chen, Chin-Hua
2016-11-01
The advance of video coding technology enables multiview video (MVV) or three-dimensional television (3-D TV) display for users with or without glasses. For mobile devices or wireless applications, a distributed video coder (DVC) can be utilized to shift the encoder complexity to decoder under the MVV coding framework, denoted as multiview distributed video coding (MDVC). We proposed to exploit both inter- and intraview video correlations to enhance side information (SI) and improve the MDVC performance: (1) based on the multiview motion estimation (MVME) framework, a categorized block matching prediction with fidelity weights (COMPETE) was proposed to yield a high quality SI frame for better DVC reconstructed images. (2) The block transform coefficient properties, i.e., DCs and ACs, were exploited to design the priority rate control for the turbo code, such that the DVC decoding can be carried out with fewest parity bits. In comparison, the proposed COMPETE method demonstrated lower time complexity, while presenting better reconstructed video quality. Simulations show that the proposed COMPETE can reduce the time complexity of MVME to 1.29 to 2.56 times smaller, as compared to previous hybrid MVME methods, while the image peak signal to noise ratios (PSNRs) of a decoded video can be improved 0.2 to 3.5 dB, as compared to H.264/AVC intracoding.
The role of optical flow in automated quality assessment of full-motion video
NASA Astrophysics Data System (ADS)
Harguess, Josh; Shafer, Scott; Marez, Diego
2017-09-01
In real-world video data, such as full-motion-video (FMV) taken from unmanned vehicles, surveillance systems, and other sources, various corruptions to the raw data is inevitable. This can be due to the image acquisition process, noise, distortion, and compression artifacts, among other sources of error. However, we desire methods to analyze the quality of the video to determine whether the underlying content of the corrupted video can be analyzed by humans or machines and to what extent. Previous approaches have shown that motion estimation, or optical flow, can be an important cue in automating this video quality assessment. However, there are many different optical flow algorithms in the literature, each with their own advantages and disadvantages. We examine the effect of the choice of optical flow algorithm (including baseline and state-of-the-art), on motionbased automated video quality assessment algorithms.
Film grain noise modeling in advanced video coding
NASA Astrophysics Data System (ADS)
Oh, Byung Tae; Kuo, C.-C. Jay; Sun, Shijun; Lei, Shawmin
2007-01-01
A new technique for film grain noise extraction, modeling and synthesis is proposed and applied to the coding of high definition video in this work. The film grain noise is viewed as a part of artistic presentation by people in the movie industry. On one hand, since the film grain noise can boost the natural appearance of pictures in high definition video, it should be preserved in high-fidelity video processing systems. On the other hand, video coding with film grain noise is expensive. It is desirable to extract film grain noise from the input video as a pre-processing step at the encoder and re-synthesize the film grain noise and add it back to the decoded video as a post-processing step at the decoder. Under this framework, the coding gain of the denoised video is higher while the quality of the final reconstructed video can still be well preserved. Following this idea, we present a method to remove film grain noise from image/video without distorting its original content. Besides, we describe a parametric model containing a small set of parameters to represent the extracted film grain noise. The proposed model generates the film grain noise that is close to the real one in terms of power spectral density and cross-channel spectral correlation. Experimental results are shown to demonstrate the efficiency of the proposed scheme.
Bi, Sheng; Zeng, Xiao; Tang, Xin; Qin, Shujia; Lai, King Wai Chiu
2016-01-01
Compressive sensing (CS) theory has opened up new paths for the development of signal processing applications. Based on this theory, a novel single pixel camera architecture has been introduced to overcome the current limitations and challenges of traditional focal plane arrays. However, video quality based on this method is limited by existing acquisition and recovery methods, and the method also suffers from being time-consuming. In this paper, a multi-frame motion estimation algorithm is proposed in CS video to enhance the video quality. The proposed algorithm uses multiple frames to implement motion estimation. Experimental results show that using multi-frame motion estimation can improve the quality of recovered videos. To further reduce the motion estimation time, a block match algorithm is used to process motion estimation. Experiments demonstrate that using the block match algorithm can reduce motion estimation time by 30%. PMID:26950127
Multi-frame knowledge based text enhancement for mobile phone captured videos
NASA Astrophysics Data System (ADS)
Ozarslan, Suleyman; Eren, P. Erhan
2014-02-01
In this study, we explore automated text recognition and enhancement using mobile phone captured videos of store receipts. We propose a method which includes Optical Character Resolution (OCR) enhanced by our proposed Row Based Multiple Frame Integration (RB-MFI), and Knowledge Based Correction (KBC) algorithms. In this method, first, the trained OCR engine is used for recognition; then, the RB-MFI algorithm is applied to the output of the OCR. The RB-MFI algorithm determines and combines the most accurate rows of the text outputs extracted by using OCR from multiple frames of the video. After RB-MFI, KBC algorithm is applied to these rows to correct erroneous characters. Results of the experiments show that the proposed video-based approach which includes the RB-MFI and the KBC algorithm increases the word character recognition rate to 95%, and the character recognition rate to 98%.
Error-free holographic frames encryption with CA pixel-permutation encoding algorithm
NASA Astrophysics Data System (ADS)
Li, Xiaowei; Xiao, Dan; Wang, Qiong-Hua
2018-01-01
The security of video data is necessary in network security transmission hence cryptography is technique to make video data secure and unreadable to unauthorized users. In this paper, we propose a holographic frames encryption technique based on the cellular automata (CA) pixel-permutation encoding algorithm. The concise pixel-permutation algorithm is used to address the drawbacks of the traditional CA encoding methods. The effectiveness of the proposed video encoding method is demonstrated by simulation examples.
Guo, Yang-Yang; He, Dong-Jian; Liu, Cong
2018-06-25
Insect behaviour is an important research topic in plant protection. To study insect behaviour accurately, it is necessary to observe and record their flight trajectory quantitatively and precisely in three dimensions (3D). The goal of this research was to analyse frames extracted from videos using Kernelized Correlation Filters (KCF) and Background Subtraction (BS) (KCF-BS) to plot the 3D trajectory of cabbage butterfly (P. rapae). Considering the experimental environment with a wind tunnel, a quadrature binocular vision insect video capture system was designed and applied in this study. The KCF-BS algorithm was used to track the butterfly in video frames and obtain coordinates of the target centroid in two videos. Finally the 3D trajectory was calculated according to the matching relationship in the corresponding frames of two angles in the video. To verify the validity of the KCF-BS algorithm, Compressive Tracking (CT) and Spatio-Temporal Context Learning (STC) algorithms were performed. The results revealed that the KCF-BS tracking algorithm performed more favourably than CT and STC in terms of accuracy and robustness.
NASA Astrophysics Data System (ADS)
Grois, Dan; Marpe, Detlev; Nguyen, Tung; Hadar, Ofer
2014-09-01
The popularity of low-delay video applications dramatically increased over the last years due to a rising demand for realtime video content (such as video conferencing or video surveillance), and also due to the increasing availability of relatively inexpensive heterogeneous devices (such as smartphones and tablets). To this end, this work presents a comparative assessment of the two latest video coding standards: H.265/MPEG-HEVC (High-Efficiency Video Coding), H.264/MPEG-AVC (Advanced Video Coding), and also of the VP9 proprietary video coding scheme. For evaluating H.264/MPEG-AVC, an open-source x264 encoder was selected, which has a multi-pass encoding mode, similarly to VP9. According to experimental results, which were obtained by using similar low-delay configurations for all three examined representative encoders, it was observed that H.265/MPEG-HEVC provides significant average bit-rate savings of 32.5%, and 40.8%, relative to VP9 and x264 for the 1-pass encoding, and average bit-rate savings of 32.6%, and 42.2% for the 2-pass encoding, respectively. On the other hand, compared to the x264 encoder, typical low-delay encoding times of the VP9 encoder, are about 2,000 times higher for the 1-pass encoding, and are about 400 times higher for the 2-pass encoding.
CVD2014-A Database for Evaluating No-Reference Video Quality Assessment Algorithms.
Nuutinen, Mikko; Virtanen, Toni; Vaahteranoksa, Mikko; Vuori, Tero; Oittinen, Pirkko; Hakkinen, Jukka
2016-07-01
In this paper, we present a new video database: CVD2014-Camera Video Database. In contrast to previous video databases, this database uses real cameras rather than introducing distortions via post-processing, which results in a complex distortion space in regard to the video acquisition process. CVD2014 contains a total of 234 videos that are recorded using 78 different cameras. Moreover, this database contains the observer-specific quality evaluation scores rather than only providing mean opinion scores. We have also collected open-ended quality descriptions that are provided by the observers. These descriptions were used to define the quality dimensions for the videos in CVD2014. The dimensions included sharpness, graininess, color balance, darkness, and jerkiness. At the end of this paper, a performance study of image and video quality algorithms for predicting the subjective video quality is reported. For this performance study, we proposed a new performance measure that accounts for observer variance. The performance study revealed that there is room for improvement regarding the video quality assessment algorithms. The CVD2014 video database has been made publicly available for the research community. All video sequences and corresponding subjective ratings can be obtained from the CVD2014 project page (http://www.helsinki.fi/psychology/groups/visualcognition/).
Study and simulation of low rate video coding schemes
NASA Technical Reports Server (NTRS)
Sayood, Khalid; Chen, Yun-Chung; Kipp, G.
1992-01-01
The semiannual report is included. Topics covered include communication, information science, data compression, remote sensing, color mapped images, robust coding scheme for packet video, recursively indexed differential pulse code modulation, image compression technique for use on token ring networks, and joint source/channel coder design.
Shaking video stabilization with content completion
NASA Astrophysics Data System (ADS)
Peng, Yi; Ye, Qixiang; Liu, Yanmei; Jiao, Jianbin
2009-01-01
A new stabilization algorithm to counterbalance the shaking motion in a video based on classical Kandade-Lucas- Tomasi (KLT) method is presented in this paper. Feature points are evaluated with law of large numbers and clustering algorithm to reduce the side effect of moving foreground. Analysis on the change of motion direction is also carried out to detect the existence of shaking. For video clips with detected shaking, an affine transformation is performed to warp the current frame to the reference one. In addition, the missing content of a frame during the stabilization is completed with optical flow analysis and mosaicking operation. Experiments on video clips demonstrate the effectiveness of the proposed algorithm.
Digital television system design study
NASA Technical Reports Server (NTRS)
Huth, G. K.
1976-01-01
The use of digital techniques for transmission of pictorial data is discussed for multi-frame images (television). Video signals are processed in a manner which includes quantization and coding such that they are separable from the noise introduced into the channel. The performance of digital television systems is determined by the nature of the processing techniques (i.e., whether the video signal itself or, instead, something related to the video signal is quantized and coded) and to the quantization and coding schemes employed.
NASA Astrophysics Data System (ADS)
Khosla, Deepak; Huber, David J.; Bhattacharyya, Rajan
2017-05-01
In this paper, we describe an algorithm and system for optimizing search and detection performance for "items of interest" (IOI) in large-sized images and videos that employ the Rapid Serial Visual Presentation (RSVP) based EEG paradigm and surprise algorithms that incorporate motion processing to determine whether static or video RSVP is used. The system works by first computing a motion surprise map on image sub-regions (chips) of incoming sensor video data and then uses those surprise maps to label the chips as either "static" or "moving". This information tells the system whether to use a static or video RSVP presentation and decoding algorithm in order to optimize EEG based detection of IOI in each chip. Using this method, we are able to demonstrate classification of a series of image regions from video with an azimuth value of 1, indicating perfect classification, over a range of display frequencies and video speeds.
NASA Astrophysics Data System (ADS)
Cicala, L.; Angelino, C. V.; Ruatta, G.; Baccaglini, E.; Raimondo, N.
2015-08-01
Unmanned Aerial Vehicles (UAVs) are often employed to collect high resolution images in order to perform image mosaicking and/or 3D reconstruction. Images are usually stored on board and then processed with on-ground desktop software. In such a way the computational load, and hence the power consumption, is moved on ground, leaving on board only the task of storing data. Such an approach is important in the case of small multi-rotorcraft UAVs because of their low endurance due to the short battery life. Images can be stored on board with either still image or video data compression. Still image system are preferred when low frame rates are involved, because video coding systems are based on motion estimation and compensation algorithms which fail when the motion vectors are significantly long and when the overlapping between subsequent frames is very small. In this scenario, UAVs attitude and position metadata from the Inertial Navigation System (INS) can be employed to estimate global motion parameters without video analysis. A low complexity image analysis can be still performed in order to refine the motion field estimated using only the metadata. In this work, we propose to use this refinement step in order to improve the position and attitude estimation produced by the navigation system in order to maximize the encoder performance. Experiments are performed on both simulated and real world video sequences.
NASA Astrophysics Data System (ADS)
The present conference on global telecommunications discusses topics in the fields of Integrated Services Digital Network (ISDN) technology field trial planning and results to date, motion video coding, ISDN networking, future network communications security, flexible and intelligent voice/data networks, Asian and Pacific lightwave and radio systems, subscriber radio systems, the performance of distributed systems, signal processing theory, satellite communications modulation and coding, and terminals for the handicapped. Also discussed are knowledge-based technologies for communications systems, future satellite transmissions, high quality image services, novel digital signal processors, broadband network access interface, traffic engineering for ISDN design and planning, telecommunications software, coherent optical communications, multimedia terminal systems, advanced speed coding, portable and mobile radio communications, multi-Gbit/second lightwave transmission systems, enhanced capability digital terminals, communications network reliability, advanced antimultipath fading techniques, undersea lightwave transmission, image coding, modulation and synchronization, adaptive signal processing, integrated optical devices, VLSI technologies for ISDN, field performance of packet switching, CSMA protocols, optical transport system architectures for broadband ISDN, mobile satellite communications, indoor wireless communication, echo cancellation in communications, and distributed network algorithms.
NASA Astrophysics Data System (ADS)
Pei, Yong; Modestino, James W.
2007-12-01
We describe a multilayered video transport scheme for wireless channels capable of adapting to channel conditions in order to maximize end-to-end quality of service (QoS). This scheme combines a scalable H.263+ video source coder with unequal error protection (UEP) across layers. The UEP is achieved by employing different channel codes together with a multiresolution modulation approach to transport the different priority layers. Adaptivity to channel conditions is provided through a joint source-channel coding (JSCC) approach which attempts to jointly optimize the source and channel coding rates together with the modulation parameters to obtain the maximum achievable end-to-end QoS for the prevailing channel conditions. In this work, we model the wireless links as slow-fading Rician channel where the channel conditions can be described in terms of the channel signal-to-noise ratio (SNR) and the ratio of specular-to-diffuse energy[InlineEquation not available: see fulltext.]. The multiresolution modulation/coding scheme consists of binary rate-compatible punctured convolutional (RCPC) codes used together with nonuniform phase-shift keyed (PSK) signaling constellations. Results indicate that this adaptive JSCC scheme employing scalable video encoding together with a multiresolution modulation/coding approach leads to significant improvements in delivered video quality for specified channel conditions. In particular, the approach results in considerably improved graceful degradation properties for decreasing channel SNR.
Detection of dominant flow and abnormal events in surveillance video
NASA Astrophysics Data System (ADS)
Kwak, Sooyeong; Byun, Hyeran
2011-02-01
We propose an algorithm for abnormal event detection in surveillance video. The proposed algorithm is based on a semi-unsupervised learning method, a kind of feature-based approach so that it does not detect the moving object individually. The proposed algorithm identifies dominant flow without individual object tracking using a latent Dirichlet allocation model in crowded environments. It can also automatically detect and localize an abnormally moving object in real-life video. The performance tests are taken with several real-life databases, and their results show that the proposed algorithm can efficiently detect abnormally moving objects in real time. The proposed algorithm can be applied to any situation in which abnormal directions or abnormal speeds are detected regardless of direction.
Real-time data compression of broadcast video signals
NASA Technical Reports Server (NTRS)
Shalkauser, Mary Jo W. (Inventor); Whyte, Wayne A., Jr. (Inventor); Barnes, Scott P. (Inventor)
1991-01-01
A non-adaptive predictor, a nonuniform quantizer, and a multi-level Huffman coder are incorporated into a differential pulse code modulation system for coding and decoding broadcast video signals in real time.
NASA Astrophysics Data System (ADS)
Jubran, Mohammad K.; Bansal, Manu; Kondi, Lisimachos P.
2006-01-01
In this paper, we consider the problem of optimal bit allocation for wireless video transmission over fading channels. We use a newly developed hybrid scalable/multiple-description codec that combines the functionality of both scalable and multiple-description codecs. It produces a base layer and multiple-description enhancement layers. Any of the enhancement layers can be decoded (in a non-hierarchical manner) with the base layer to improve the reconstructed video quality. Two different channel coding schemes (Rate-Compatible Punctured Convolutional (RCPC)/Cyclic Redundancy Check (CRC) coding and, product code Reed Solomon (RS)+RCPC/CRC coding) are used for unequal error protection of the layered bitstream. Optimal allocation of the bitrate between source and channel coding is performed for discrete sets of source coding rates and channel coding rates. Experimental results are presented for a wide range of channel conditions. Also, comparisons with classical scalable coding show the effectiveness of using hybrid scalable/multiple-description coding for wireless transmission.
NASA Astrophysics Data System (ADS)
He, Qiang; Schultz, Richard R.; Chu, Chee-Hung Henry
2008-04-01
The concept surrounding super-resolution image reconstruction is to recover a highly-resolved image from a series of low-resolution images via between-frame subpixel image registration. In this paper, we propose a novel and efficient super-resolution algorithm, and then apply it to the reconstruction of real video data captured by a small Unmanned Aircraft System (UAS). Small UAS aircraft generally have a wingspan of less than four meters, so that these vehicles and their payloads can be buffeted by even light winds, resulting in potentially unstable video. This algorithm is based on a coarse-to-fine strategy, in which a coarsely super-resolved image sequence is first built from the original video data by image registration and bi-cubic interpolation between a fixed reference frame and every additional frame. It is well known that the median filter is robust to outliers. If we calculate pixel-wise medians in the coarsely super-resolved image sequence, we can restore a refined super-resolved image. The primary advantage is that this is a noniterative algorithm, unlike traditional approaches based on highly-computational iterative algorithms. Experimental results show that our coarse-to-fine super-resolution algorithm is not only robust, but also very efficient. In comparison with five well-known super-resolution algorithms, namely the robust super-resolution algorithm, bi-cubic interpolation, projection onto convex sets (POCS), the Papoulis-Gerchberg algorithm, and the iterated back projection algorithm, our proposed algorithm gives both strong efficiency and robustness, as well as good visual performance. This is particularly useful for the application of super-resolution to UAS surveillance video, where real-time processing is highly desired.
Colonoscopy video quality assessment using hidden Markov random fields
NASA Astrophysics Data System (ADS)
Park, Sun Young; Sargent, Dusty; Spofford, Inbar; Vosburgh, Kirby
2011-03-01
With colonoscopy becoming a common procedure for individuals aged 50 or more who are at risk of developing colorectal cancer (CRC), colon video data is being accumulated at an ever increasing rate. However, the clinically valuable information contained in these videos is not being maximally exploited to improve patient care and accelerate the development of new screening methods. One of the well-known difficulties in colonoscopy video analysis is the abundance of frames with no diagnostic information. Approximately 40% - 50% of the frames in a colonoscopy video are contaminated by noise, acquisition errors, glare, blur, and uneven illumination. Therefore, filtering out low quality frames containing no diagnostic information can significantly improve the efficiency of colonoscopy video analysis. To address this challenge, we present a quality assessment algorithm to detect and remove low quality, uninformative frames. The goal of our algorithm is to discard low quality frames while retaining all diagnostically relevant information. Our algorithm is based on a hidden Markov model (HMM) in combination with two measures of data quality to filter out uninformative frames. Furthermore, we present a two-level framework based on an embedded hidden Markov model (EHHM) to incorporate the proposed quality assessment algorithm into a complete, automated diagnostic image analysis system for colonoscopy video.
Influence of video compression on the measurement error of the television system
NASA Astrophysics Data System (ADS)
Sotnik, A. V.; Yarishev, S. N.; Korotaev, V. V.
2015-05-01
Video data require a very large memory capacity. Optimal ratio quality / volume video encoding method is one of the most actual problem due to the urgent need to transfer large amounts of video over various networks. The technology of digital TV signal compression reduces the amount of data used for video stream representation. Video compression allows effective reduce the stream required for transmission and storage. It is important to take into account the uncertainties caused by compression of the video signal in the case of television measuring systems using. There are a lot digital compression methods. The aim of proposed work is research of video compression influence on the measurement error in television systems. Measurement error of the object parameter is the main characteristic of television measuring systems. Accuracy characterizes the difference between the measured value abd the actual parameter value. Errors caused by the optical system can be selected as a source of error in the television systems measurements. Method of the received video signal processing is also a source of error. Presence of error leads to large distortions in case of compression with constant data stream rate. Presence of errors increases the amount of data required to transmit or record an image frame in case of constant quality. The purpose of the intra-coding is reducing of the spatial redundancy within a frame (or field) of television image. This redundancy caused by the strong correlation between the elements of the image. It is possible to convert an array of image samples into a matrix of coefficients that are not correlated with each other, if one can find corresponding orthogonal transformation. It is possible to apply entropy coding to these uncorrelated coefficients and achieve a reduction in the digital stream. One can select such transformation that most of the matrix coefficients will be almost zero for typical images . Excluding these zero coefficients also possible reducing of the digital stream. Discrete cosine transformation is most widely used among possible orthogonal transformation. Errors of television measuring systems and data compression protocols analyzed In this paper. The main characteristics of measuring systems and detected sources of their error detected. The most effective methods of video compression are determined. The influence of video compression error on television measuring systems was researched. Obtained results will increase the accuracy of the measuring systems. In television image quality measuring system reduces distortion identical distortion in analog systems and specific distortions resulting from the process of coding / decoding digital video signal and errors in the transmission channel. By the distortions associated with encoding / decoding signal include quantization noise, reducing resolution, mosaic effect, "mosquito" effect edging on sharp drops brightness, blur colors, false patterns, the effect of "dirty window" and other defects. The size of video compression algorithms used in television measuring systems based on the image encoding with intra- and inter prediction individual fragments. The process of encoding / decoding image is non-linear in space and in time, because the quality of the playback of a movie at the reception depends on the pre- and post-history of a random, from the preceding and succeeding tracks, which can lead to distortion of the inadequacy of the sub-picture and a corresponding measuring signal.
Video fingerprinting for copy identification: from research to industry applications
NASA Astrophysics Data System (ADS)
Lu, Jian
2009-02-01
Research that began a decade ago in video copy detection has developed into a technology known as "video fingerprinting". Today, video fingerprinting is an essential and enabling tool adopted by the industry for video content identification and management in online video distribution. This paper provides a comprehensive review of video fingerprinting technology and its applications in identifying, tracking, and managing copyrighted content on the Internet. The review includes a survey on video fingerprinting algorithms and some fundamental design considerations, such as robustness, discriminability, and compactness. It also discusses fingerprint matching algorithms, including complexity analysis, and approximation and optimization for fast fingerprint matching. On the application side, it provides an overview of a number of industry-driven applications that rely on video fingerprinting. Examples are given based on real-world systems and workflows to demonstrate applications in detecting and managing copyrighted content, and in monitoring and tracking video distribution on the Internet.
High efficiency video coding for ultrasound video communication in m-health systems.
Panayides, A; Antoniou, Z; Pattichis, M S; Pattichis, C S; Constantinides, A G
2012-01-01
Emerging high efficiency video compression methods and wider availability of wireless network infrastructure will significantly advance existing m-health applications. For medical video communications, the emerging video compression and network standards support low-delay and high-resolution video transmission, at the clinically acquired resolution and frame rates. Such advances are expected to further promote the adoption of m-health systems for remote diagnosis and emergency incidents in daily clinical practice. This paper compares the performance of the emerging high efficiency video coding (HEVC) standard to the current state-of-the-art H.264/AVC standard. The experimental evaluation, based on five atherosclerotic plaque ultrasound videos encoded at QCIF, CIF, and 4CIF resolutions demonstrates that 50% reductions in bitrate requirements is possible for equivalent clinical quality.
Picture data compression coder using subband/transform coding with a Lempel-Ziv-based coder
NASA Technical Reports Server (NTRS)
Glover, Daniel R. (Inventor)
1995-01-01
Digital data coders/decoders are used extensively in video transmission. A digitally encoded video signal is separated into subbands. Separating the video into subbands allows transmission at low data rates. Once the data is separated into these subbands it can be coded and then decoded by statistical coders such as the Lempel-Ziv based coder.
Experimental design and analysis of JND test on coded image/video
NASA Astrophysics Data System (ADS)
Lin, Joe Yuchieh; Jin, Lina; Hu, Sudeng; Katsavounidis, Ioannis; Li, Zhi; Aaron, Anne; Kuo, C.-C. Jay
2015-09-01
The visual Just-Noticeable-Difference (JND) metric is characterized by the detectable minimum amount of two visual stimuli. Conducting the subjective JND test is a labor-intensive task. In this work, we present a novel interactive method in performing the visual JND test on compressed image/video. JND has been used to enhance perceptual visual quality in the context of image/video compression. Given a set of coding parameters, a JND test is designed to determine the distinguishable quality level against a reference image/video, which is called the anchor. The JND metric can be used to save coding bitrates by exploiting the special characteristics of the human visual system. The proposed JND test is conducted using a binary-forced choice, which is often adopted to discriminate the difference in perception in a psychophysical experiment. The assessors are asked to compare coded image/video pairs and determine whether they are of the same quality or not. A bisection procedure is designed to find the JND locations so as to reduce the required number of comparisons over a wide range of bitrates. We will demonstrate the efficiency of the proposed JND test, report experimental results on the image and video JND tests.
Gas leak detection in infrared video with background modeling
NASA Astrophysics Data System (ADS)
Zeng, Xiaoxia; Huang, Likun
2018-03-01
Background modeling plays an important role in the task of gas detection based on infrared video. VIBE algorithm is a widely used background modeling algorithm in recent years. However, the processing speed of the VIBE algorithm sometimes cannot meet the requirements of some real time detection applications. Therefore, based on the traditional VIBE algorithm, we propose a fast prospect model and optimize the results by combining the connected domain algorithm and the nine-spaces algorithm in the following processing steps. Experiments show the effectiveness of the proposed method.
First- and third-party ground truth for key frame extraction from consumer video clips
NASA Astrophysics Data System (ADS)
Costello, Kathleen; Luo, Jiebo
2007-02-01
Extracting key frames (KF) from video is of great interest in many applications, such as video summary, video organization, video compression, and prints from video. KF extraction is not a new problem. However, current literature has been focused mainly on sports or news video. In the consumer video space, the biggest challenges for key frame selection from consumer videos are the unconstrained content and lack of any preimposed structure. In this study, we conduct ground truth collection of key frames from video clips taken by digital cameras (as opposed to camcorders) using both first- and third-party judges. The goals of this study are: (1) to create a reference database of video clips reasonably representative of the consumer video space; (2) to identify associated key frames by which automated algorithms can be compared and judged for effectiveness; and (3) to uncover the criteria used by both first- and thirdparty human judges so these criteria can influence algorithm design. The findings from these ground truths will be discussed.
NASA Astrophysics Data System (ADS)
Palma, V.; Carli, M.; Neri, A.
2011-02-01
In this paper a Multi-view Distributed Video Coding scheme for mobile applications is presented. Specifically a new fusion technique between temporal and spatial side information in Zernike Moments domain is proposed. Distributed video coding introduces a flexible architecture that enables the design of very low complex video encoders compared to its traditional counterparts. The main goal of our work is to generate at the decoder the side information that optimally blends temporal and interview data. Multi-view distributed coding performance strongly depends on the side information quality built at the decoder. At this aim for improving its quality a spatial view compensation/prediction in Zernike moments domain is applied. Spatial and temporal motion activity have been fused together to obtain the overall side-information. The proposed method has been evaluated by rate-distortion performances for different inter-view and temporal estimation quality conditions.
Scalable Video Transmission Over Multi-Rate Multiple Access Channels
2007-06-01
Rate - compatible punctured convolutional codes (RCPC codes ) and their ap- plications,” IEEE...source encoded using the MPEG-4 video codec. The source encoded bitstream is then channel encoded with Rate Compatible Punctured Convolutional (RCPC...Clark, and J. M. Geist, “ Punctured convolutional codes or rate (n-1)/n and simplified maximum likelihood decoding,” IEEE Transactions on
Automatic attention-based prioritization of unconstrained video for compression
NASA Astrophysics Data System (ADS)
Itti, Laurent
2004-06-01
We apply a biologically-motivated algorithm that selects visually-salient regions of interest in video streams to multiply-foveated video compression. Regions of high encoding priority are selected based on nonlinear integration of low-level visual cues, mimicking processing in primate occipital and posterior parietal cortex. A dynamic foveation filter then blurs (foveates) every frame, increasingly with distance from high-priority regions. Two variants of the model (one with continuously-variable blur proportional to saliency at every pixel, and the other with blur proportional to distance from three independent foveation centers) are validated against eye fixations from 4-6 human observers on 50 video clips (synthetic stimuli, video games, outdoors day and night home video, television newscast, sports, talk-shows, etc). Significant overlap is found between human and algorithmic foveations on every clip with one variant, and on 48 out of 50 clips with the other. Substantial compressed file size reductions by a factor 0.5 on average are obtained for foveated compared to unfoveated clips. These results suggest a general-purpose usefulness of the algorithm in improving compression ratios of unconstrained video.
Digital video technologies and their network requirements
DOE Office of Scientific and Technical Information (OSTI.GOV)
R. P. Tsang; H. Y. Chen; J. M. Brandt
1999-11-01
Coded digital video signals are considered to be one of the most difficult data types to transport due to their real-time requirements and high bit rate variability. In this study, the authors discuss the coding mechanisms incorporated by the major compression standards bodies, i.e., JPEG and MPEG, as well as more advanced coding mechanisms such as wavelet and fractal techniques. The relationship between the applications which use these coding schemes and their network requirements are the major focus of this study. Specifically, the authors relate network latency, channel transmission reliability, random access speed, buffering and network bandwidth with the variousmore » coding techniques as a function of the applications which use them. Such applications include High-Definition Television, Video Conferencing, Computer-Supported Collaborative Work (CSCW), and Medical Imaging.« less
Temporal flicker reduction and denoising in video using sparse directional transforms
NASA Astrophysics Data System (ADS)
Kanumuri, Sandeep; Guleryuz, Onur G.; Civanlar, M. Reha; Fujibayashi, Akira; Boon, Choong S.
2008-08-01
The bulk of the video content available today over the Internet and over mobile networks suffers from many imperfections caused during acquisition and transmission. In the case of user-generated content, which is typically produced with inexpensive equipment, these imperfections manifest in various ways through noise, temporal flicker and blurring, just to name a few. Imperfections caused by compression noise and temporal flicker are present in both studio-produced and user-generated video content transmitted at low bit-rates. In this paper, we introduce an algorithm designed to reduce temporal flicker and noise in video sequences. The algorithm takes advantage of the sparse nature of video signals in an appropriate transform domain that is chosen adaptively based on local signal statistics. When the signal corresponds to a sparse representation in this transform domain, flicker and noise, which are spread over the entire domain, can be reduced easily by enforcing sparsity. Our results show that the proposed algorithm reduces flicker and noise significantly and enables better presentation of compressed videos.
A reaction-diffusion-based coding rate control mechanism for camera sensor networks.
Yamamoto, Hiroshi; Hyodo, Katsuya; Wakamiya, Naoki; Murata, Masayuki
2010-01-01
A wireless camera sensor network is useful for surveillance and monitoring for its visibility and easy deployment. However, it suffers from the limited capacity of wireless communication and a network is easily overflown with a considerable amount of video traffic. In this paper, we propose an autonomous video coding rate control mechanism where each camera sensor node can autonomously determine its coding rate in accordance with the location and velocity of target objects. For this purpose, we adopted a biological model, i.e., reaction-diffusion model, inspired by the similarity of biological spatial patterns and the spatial distribution of video coding rate. Through simulation and practical experiments, we verify the effectiveness of our proposal.
Online Hierarchical Sparse Representation of Multifeature for Robust Object Tracking
Qu, Shiru
2016-01-01
Object tracking based on sparse representation has given promising tracking results in recent years. However, the trackers under the framework of sparse representation always overemphasize the sparse representation and ignore the correlation of visual information. In addition, the sparse coding methods only encode the local region independently and ignore the spatial neighborhood information of the image. In this paper, we propose a robust tracking algorithm. Firstly, multiple complementary features are used to describe the object appearance; the appearance model of the tracked target is modeled by instantaneous and stable appearance features simultaneously. A two-stage sparse-coded method which takes the spatial neighborhood information of the image patch and the computation burden into consideration is used to compute the reconstructed object appearance. Then, the reliability of each tracker is measured by the tracking likelihood function of transient and reconstructed appearance models. Finally, the most reliable tracker is obtained by a well established particle filter framework; the training set and the template library are incrementally updated based on the current tracking results. Experiment results on different challenging video sequences show that the proposed algorithm performs well with superior tracking accuracy and robustness. PMID:27630710
Designing a scalable video-on-demand server with data sharing
NASA Astrophysics Data System (ADS)
Lim, Hyeran; Du, David H.
2000-12-01
As current disk space and transfer speed increase, the bandwidth between a server and its disks has become critical for video-on-demand (VOD) services. Our VOD server consists of several hosts sharing data on disks through a ring-based network. Data sharing provided by the spatial-reuse ring network between servers and disks not only increases the utilization towards full bandwidth but also improves the availability of videos. Striping and replication methods are introduced in order to improve the efficiency of our VOD server system as well as the availability of videos. We consider tow kinds of resources of a VOD server system. Given a representative access profile, our intention is to propose an algorithm to find an initial condition, place videos on disks in the system successfully. If any copy of a video cannot be placed due to lack of resources, more servers/disks are added. When all videos are place on the disks by our algorithm, the final configuration is determined with indicator of how tolerable it is against the fluctuation in demand of videos. Considering it is a NP-hard problem, our algorithm generates the final configuration with O(M log M) at best, where M is the number of movies.
Designing a scalable video-on-demand server with data sharing
NASA Astrophysics Data System (ADS)
Lim, Hyeran; Du, David H. C.
2001-01-01
As current disk space and transfer speed increase, the bandwidth between a server and its disks has become critical for video-on-demand (VOD) services. Our VOD server consists of several hosts sharing data on disks through a ring-based network. Data sharing provided by the spatial-reuse ring network between servers and disks not only increases the utilization towards full bandwidth but also improves the availability of videos. Striping and replication methods are introduced in order to improve the efficiency of our VOD server system as well as the availability of videos. We consider tow kinds of resources of a VOD server system. Given a representative access profile, our intention is to propose an algorithm to find an initial condition, place videos on disks in the system successfully. If any copy of a video cannot be placed due to lack of resources, more servers/disks are added. When all videos are place on the disks by our algorithm, the final configuration is determined with indicator of how tolerable it is against the fluctuation in demand of videos. Considering it is a NP-hard problem, our algorithm generates the final configuration with O(M log M) at best, where M is the number of movies.
Algorithm-Based Motion Magnification for Video Processing in Urological Laparoscopy.
Adams, Fabian; Schoelly, Reto; Schlager, Daniel; Schoenthaler, Martin; Schoeb, Dominik S; Wilhelm, Konrad; Hein, Simon; Wetterauer, Ulrich; Miernik, Arkadiusz
2017-06-01
Minimally invasive surgery is in constant further development and has replaced many conventional operative procedures. If vascular structure movement could be detected during these procedures, it could reduce the risk of vascular injury and conversion to open surgery. The recently proposed motion-amplifying algorithm, Eulerian Video Magnification (EVM), has been shown to substantially enhance minimal object changes in digitally recorded video that is barely perceptible to the human eye. We adapted and examined this technology for use in urological laparoscopy. Video sequences of routine urological laparoscopic interventions were recorded and further processed using spatial decomposition and filtering algorithms. The freely available EVM algorithm was investigated for its usability in real-time processing. In addition, a new image processing technology, the CRS iimotion Motion Magnification (CRSMM) algorithm, was specifically adjusted for endoscopic requirements, applied, and validated by our working group. Using EVM, no significant motion enhancement could be detected without severe impairment of the image resolution, motion, and color presentation. The CRSMM algorithm significantly improved image quality in terms of motion enhancement. In particular, the pulsation of vascular structures could be displayed more accurately than in EVM. Motion magnification image processing technology has the potential for clinical importance as a video optimizing modality in endoscopic and laparoscopic surgery. Barely detectable (micro)movements can be visualized using this noninvasive marker-free method. Despite these optimistic results, the technology requires considerable further technical development and clinical tests.
Two novel motion-based algorithms for surveillance video analysis on embedded platforms
NASA Astrophysics Data System (ADS)
Vijverberg, Julien A.; Loomans, Marijn J. H.; Koeleman, Cornelis J.; de With, Peter H. N.
2010-05-01
This paper proposes two novel motion-vector based techniques for target detection and target tracking in surveillance videos. The algorithms are designed to operate on a resource-constrained device, such as a surveillance camera, and to reuse the motion vectors generated by the video encoder. The first novel algorithm for target detection uses motion vectors to construct a consistent motion mask, which is combined with a simple background segmentation technique to obtain a segmentation mask. The second proposed algorithm aims at multi-target tracking and uses motion vectors to assign blocks to targets employing five features. The weights of these features are adapted based on the interaction between targets. These algorithms are combined in one complete analysis application. The performance of this application for target detection has been evaluated for the i-LIDS sterile zone dataset and achieves an F1-score of 0.40-0.69. The performance of the analysis algorithm for multi-target tracking has been evaluated using the CAVIAR dataset and achieves an MOTP of around 9.7 and MOTA of 0.17-0.25. On a selection of targets in videos from other datasets, the achieved MOTP and MOTA are 8.8-10.5 and 0.32-0.49 respectively. The execution time on a PC-based platform is 36 ms. This includes the 20 ms for generating motion vectors, which are also required by the video encoder.
Realization and optimization of AES algorithm on the TMS320DM6446 based on DaVinci technology
NASA Astrophysics Data System (ADS)
Jia, Wen-bin; Xiao, Fu-hai
2013-03-01
The application of AES algorithm in the digital cinema system avoids video data to be illegal theft or malicious tampering, and solves its security problems. At the same time, in order to meet the requirements of the real-time, scene and transparent encryption of high-speed data streams of audio and video in the information security field, through the in-depth analysis of AES algorithm principle, based on the hardware platform of TMS320DM6446, with the software framework structure of DaVinci, this paper proposes the specific realization methods of AES algorithm in digital video system and its optimization solutions. The test results show digital movies encrypted by AES128 can not play normally, which ensures the security of digital movies. Through the comparison of the performance of AES128 algorithm before optimization and after, the correctness and validity of improved algorithm is verified.
Research on compression performance of ultrahigh-definition videos
NASA Astrophysics Data System (ADS)
Li, Xiangqun; He, Xiaohai; Qing, Linbo; Tao, Qingchuan; Wu, Di
2017-11-01
With the popularization of high-definition (HD) images and videos (1920×1080 pixels and above), there are even 4K (3840×2160) television signals and 8 K (8192×4320) ultrahigh-definition videos. The demand for HD images and videos is increasing continuously, along with the increasing data volume. The storage and transmission cannot be properly solved only by virtue of the expansion capacity of hard disks and the update and improvement of transmission devices. Based on the full use of the coding standard high-efficiency video coding (HEVC), super-resolution reconstruction technology, and the correlation between the intra- and the interprediction, we first put forward a "division-compensation"-based strategy to further improve the compression performance of a single image and frame I. Then, by making use of the above thought and HEVC encoder and decoder, a video compression coding frame is designed. HEVC is used inside the frame. Last, with the super-resolution reconstruction technology, the reconstructed video quality is further improved. The experiment shows that by the proposed compression method for a single image (frame I) and video sequence here, the performance is superior to that of HEVC in a low bit rate environment.
NASA Astrophysics Data System (ADS)
Jo, Hyunho; Sim, Donggyu
2014-06-01
We present a bitstream decoding processor for entropy decoding of variable length coding-based multiformat videos. Since most of the computational complexity of entropy decoders comes from bitstream accesses and table look-up process, the developed bitstream processing unit (BsPU) has several designated instructions to access bitstreams and to minimize branch operations in the table look-up process. In addition, the instruction for bitstream access has the capability to remove emulation prevention bytes (EPBs) of H.264/AVC without initial delay, repeated memory accesses, and additional buffer. Experimental results show that the proposed method for EPB removal achieves a speed-up of 1.23 times compared to the conventional EPB removal method. In addition, the BsPU achieves speed-ups of 5.6 and 3.5 times in entropy decoding of H.264/AVC and MPEG-4 Visual bitstreams, respectively, compared to an existing processor without designated instructions and a new table mapping algorithm. The BsPU is implemented on a Xilinx Virtex5 LX330 field-programmable gate array. The MPEG-4 Visual (ASP, Level 5) and H.264/AVC (Main Profile, Level 4) are processed using the developed BsPU with a core clock speed of under 250 MHz in real time.
Performance comparison of AV1, HEVC, and JVET video codecs on 360 (spherical) video
NASA Astrophysics Data System (ADS)
Topiwala, Pankaj; Dai, Wei; Krishnan, Madhu; Abbas, Adeel; Doshi, Sandeep; Newman, David
2017-09-01
This paper compares the coding efficiency performance on 360 videos, of three software codecs: (a) AV1 video codec from the Alliance for Open Media (AOM); (b) the HEVC Reference Software HM; and (c) the JVET JEM Reference SW. Note that 360 video is especially challenging content, in that one codes full res globally, but typically looks locally (in a viewport), which magnifies errors. These are tested in two different projection formats ERP and RSP, to check consistency. Performance is tabulated for 1-pass encoding on two fronts: (1) objective performance based on end-to-end (E2E) metrics such as SPSNR-NN, and WS-PSNR, currently developed in the JVET committee; and (2) informal subjective assessment of static viewports. Constant quality encoding is performed with all the three codecs for an unbiased comparison of the core coding tools. Our general conclusion is that under constant quality coding, AV1 underperforms HEVC, which underperforms JVET. We also test with rate control, where AV1 currently underperforms the open source X265 HEVC codec. Objective and visual evidence is provided.
Spatial correlation-based side information refinement for distributed video coding
NASA Astrophysics Data System (ADS)
Taieb, Mohamed Haj; Chouinard, Jean-Yves; Wang, Demin
2013-12-01
Distributed video coding (DVC) architecture designs, based on distributed source coding principles, have benefitted from significant progresses lately, notably in terms of achievable rate-distortion performances. However, a significant performance gap still remains when compared to prediction-based video coding schemes such as H.264/AVC. This is mainly due to the non-ideal exploitation of the video sequence temporal correlation properties during the generation of side information (SI). In fact, the decoder side motion estimation provides only an approximation of the true motion. In this paper, a progressive DVC architecture is proposed, which exploits the spatial correlation of the video frames to improve the motion-compensated temporal interpolation (MCTI). Specifically, Wyner-Ziv (WZ) frames are divided into several spatially correlated groups that are then sent progressively to the receiver. SI refinement (SIR) is performed as long as these groups are being decoded, thus providing more accurate SI for the next groups. It is shown that the proposed progressive SIR method leads to significant improvements over the Discover DVC codec as well as other SIR schemes recently introduced in the literature.
Video quality pooling adaptive to perceptual distortion severity.
Park, Jincheol; Seshadrinathan, Kalpana; Lee, Sanghoon; Bovik, Alan Conrad
2013-02-01
It is generally recognized that severe video distortions that are transient in space and/or time have a large effect on overall perceived video quality. In order to understand this phenomena, we study the distribution of spatio-temporally local quality scores obtained from several video quality assessment (VQA) algorithms on videos suffering from compression and lossy transmission over communication channels. We propose a content adaptive spatial and temporal pooling strategy based on the observed distribution. Our method adaptively emphasizes "worst" scores along both the spatial and temporal dimensions of a video sequence and also considers the perceptual effect of large-area cohesive motion flow such as egomotion. We demonstrate the efficacy of the method by testing it using three different VQA algorithms on the LIVE Video Quality database and the EPFL-PoliMI video quality database.
Ensemble of Chaotic and Naive Approaches for Performance Enhancement in Video Encryption.
Chandrasekaran, Jeyamala; Thiruvengadam, S J
2015-01-01
Owing to the growth of high performance network technologies, multimedia applications over the Internet are increasing exponentially. Applications like video conferencing, video-on-demand, and pay-per-view depend upon encryption algorithms for providing confidentiality. Video communication is characterized by distinct features such as large volume, high redundancy between adjacent frames, video codec compliance, syntax compliance, and application specific requirements. Naive approaches for video encryption encrypt the entire video stream with conventional text based cryptographic algorithms. Although naive approaches are the most secure for video encryption, the computational cost associated with them is very high. This research work aims at enhancing the speed of naive approaches through chaos based S-box design. Chaotic equations are popularly known for randomness, extreme sensitivity to initial conditions, and ergodicity. The proposed methodology employs two-dimensional discrete Henon map for (i) generation of dynamic and key-dependent S-box that could be integrated with symmetric algorithms like Blowfish and Data Encryption Standard (DES) and (ii) generation of one-time keys for simple substitution ciphers. The proposed design is tested for randomness, nonlinearity, avalanche effect, bit independence criterion, and key sensitivity. Experimental results confirm that chaos based S-box design and key generation significantly reduce the computational cost of video encryption with no compromise in security.
Ensemble of Chaotic and Naive Approaches for Performance Enhancement in Video Encryption
Chandrasekaran, Jeyamala; Thiruvengadam, S. J.
2015-01-01
Owing to the growth of high performance network technologies, multimedia applications over the Internet are increasing exponentially. Applications like video conferencing, video-on-demand, and pay-per-view depend upon encryption algorithms for providing confidentiality. Video communication is characterized by distinct features such as large volume, high redundancy between adjacent frames, video codec compliance, syntax compliance, and application specific requirements. Naive approaches for video encryption encrypt the entire video stream with conventional text based cryptographic algorithms. Although naive approaches are the most secure for video encryption, the computational cost associated with them is very high. This research work aims at enhancing the speed of naive approaches through chaos based S-box design. Chaotic equations are popularly known for randomness, extreme sensitivity to initial conditions, and ergodicity. The proposed methodology employs two-dimensional discrete Henon map for (i) generation of dynamic and key-dependent S-box that could be integrated with symmetric algorithms like Blowfish and Data Encryption Standard (DES) and (ii) generation of one-time keys for simple substitution ciphers. The proposed design is tested for randomness, nonlinearity, avalanche effect, bit independence criterion, and key sensitivity. Experimental results confirm that chaos based S-box design and key generation significantly reduce the computational cost of video encryption with no compromise in security. PMID:26550603
Coding visual features extracted from video sequences.
Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano
2014-05-01
Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.
NASA Astrophysics Data System (ADS)
Barnett, Barry S.; Bovik, Alan C.
1995-04-01
This paper presents a real time full motion video conferencing system based on the Visual Pattern Image Sequence Coding (VPISC) software codec. The prototype system hardware is comprised of two personal computers, two camcorders, two frame grabbers, and an ethernet connection. The prototype system software has a simple structure. It runs under the Disk Operating System, and includes a user interface, a video I/O interface, an event driven network interface, and a free running or frame synchronous video codec that also acts as the controller for the video and network interfaces. Two video coders have been tested in this system. Simple implementations of Visual Pattern Image Coding and VPISC have both proven to support full motion video conferencing with good visual quality. Future work will concentrate on expanding this prototype to support the motion compensated version of VPISC, as well as encompassing point-to-point modem I/O and multiple network protocols. The application will be ported to multiple hardware platforms and operating systems. The motivation for developing this prototype system is to demonstrate the practicality of software based real time video codecs. Furthermore, software video codecs are not only cheaper, but are more flexible system solutions because they enable different computer platforms to exchange encoded video information without requiring on-board protocol compatible video codex hardware. Software based solutions enable true low cost video conferencing that fits the `open systems' model of interoperability that is so important for building portable hardware and software applications.
Frohlich, Dennis Owen; Zmyslinski-Seelig, Anne
2012-01-01
The purpose of this study was to explore the types of social support messages YouTube users posted on medical videos. Specifically, the study compared messages posted on inflammatory bowel disease-related videos and ostomy-related videos. Additionally, the study analyzed the differences in social support messages posted on lay-created videos and professionally-created videos. Conducting a content analysis, the researchers unitized the comments on each video; the total number of thought units amounted to 5,960. Researchers coded each thought unit through the use of a coding scheme modified from a previous study. YouTube users posted informational support messages most frequently (65.1%), followed by emotional support messages (18.3%), and finally, instrumental support messages (8.2%).
Hierarchical event selection for video storyboards with a case study on snooker video visualization.
Parry, Matthew L; Legg, Philip A; Chung, David H S; Griffiths, Iwan W; Chen, Min
2011-12-01
Video storyboard, which is a form of video visualization, summarizes the major events in a video using illustrative visualization. There are three main technical challenges in creating a video storyboard, (a) event classification, (b) event selection and (c) event illustration. Among these challenges, (a) is highly application-dependent and requires a significant amount of application specific semantics to be encoded in a system or manually specified by users. This paper focuses on challenges (b) and (c). In particular, we present a framework for hierarchical event representation, and an importance-based selection algorithm for supporting the creation of a video storyboard from a video. We consider the storyboard to be an event summarization for the whole video, whilst each individual illustration on the board is also an event summarization but for a smaller time window. We utilized a 3D visualization template for depicting and annotating events in illustrations. To demonstrate the concepts and algorithms developed, we use Snooker video visualization as a case study, because it has a concrete and agreeable set of semantic definitions for events and can make use of existing techniques of event detection and 3D reconstruction in a reliable manner. Nevertheless, most of our concepts and algorithms developed for challenges (b) and (c) can be applied to other application areas. © 2010 IEEE
A motion compensation technique using sliced blocks and its application to hybrid video coding
NASA Astrophysics Data System (ADS)
Kondo, Satoshi; Sasai, Hisao
2005-07-01
This paper proposes a new motion compensation method using "sliced blocks" in DCT-based hybrid video coding. In H.264 ? MPEG-4 Advance Video Coding, a brand-new international video coding standard, motion compensation can be performed by splitting macroblocks into multiple square or rectangular regions. In the proposed method, on the other hand, macroblocks or sub-macroblocks are divided into two regions (sliced blocks) by an arbitrary line segment. The result is that the shapes of the segmented regions are not limited to squares or rectangles, allowing the shapes of the segmented regions to better match the boundaries between moving objects. Thus, the proposed method can improve the performance of the motion compensation. In addition, adaptive prediction of the shape according to the region shape of the surrounding macroblocks can reduce overheads to describe shape information in the bitstream. The proposed method also has the advantage that conventional coding techniques such as mode decision using rate-distortion optimization can be utilized, since coding processes such as frequency transform and quantization are performed on a macroblock basis, similar to the conventional coding methods. The proposed method is implemented in an H.264-based P-picture codec and an improvement in bit rate of 5% is confirmed in comparison with H.264.
Robust digital image inpainting algorithm in the wireless environment
NASA Astrophysics Data System (ADS)
Karapetyan, G.; Sarukhanyan, H. G.; Agaian, S. S.
2014-05-01
Image or video inpainting is the process/art of retrieving missing portions of an image without introducing undesirable artifacts that are undetectable by an ordinary observer. An image/video can be damaged due to a variety of factors, such as deterioration due to scratches, laser dazzling effects, wear and tear, dust spots, loss of data when transmitted through a channel, etc. Applications of inpainting include image restoration (removing laser dazzling effects, dust spots, date, text, time, etc.), image synthesis (texture synthesis), completing panoramas, image coding, wireless transmission (recovery of the missing blocks), digital culture protection, image de-noising, fingerprint recognition, and film special effects and production. Most inpainting methods can be classified in two key groups: global and local methods. Global methods are used for generating large image regions from samples while local methods are used for filling in small image gaps. Each method has its own advantages and limitations. For example, the global inpainting methods perform well on textured image retrieval, whereas the classical local methods perform poorly. In addition, some of the techniques are computationally intensive; exceeding the capabilities of most currently used mobile devices. In general, the inpainting algorithms are not suitable for the wireless environment. This paper presents a new and efficient scheme that combines the advantages of both local and global methods into a single algorithm. Particularly, it introduces a blind inpainting model to solve the above problems by adaptively selecting support area for the inpainting scheme. The proposed method is applied to various challenging image restoration tasks, including recovering old photos, recovering missing data on real and synthetic images, and recovering the specular reflections in endoscopic images. A number of computer simulations demonstrate the effectiveness of our scheme and also illustrate the main properties and implementation steps of the presented algorithm. Furthermore, the simulation results show that the presented method is among the state-of-the-art and compares favorably against many available methods in the wireless environment. Robustness in the wireless environment with respect to the shape of the manually selected "marked" region is also illustrated. Currently, we are working on the expansion of this work to video and 3-D data.
Extensions under development for the HEVC standard
NASA Astrophysics Data System (ADS)
Sullivan, Gary J.
2013-09-01
This paper discusses standardization activities for extending the capabilities of the High Efficiency Video Coding (HEVC) standard - the first edition of which was completed in early 2013. These near-term extensions are focused on three areas: range extensions (such as enhanced chroma formats, monochrome video, and increased bit depth), bitstream scalability extensions for spatial and fidelity scalability, and 3D video extensions (including stereoscopic/multi-view coding, and probably also depth map coding and combinations thereof). Standardization extensions on each of these topics will be completed by mid-2014, and further work beyond that timeframe is also discussed.
Robust camera calibration for sport videos using court models
NASA Astrophysics Data System (ADS)
Farin, Dirk; Krabbe, Susanne; de With, Peter H. N.; Effelsberg, Wolfgang
2003-12-01
We propose an automatic camera calibration algorithm for court sports. The obtained camera calibration parameters are required for applications that need to convert positions in the video frame to real-world coordinates or vice versa. Our algorithm uses a model of the arrangement of court lines for calibration. Since the court model can be specified by the user, the algorithm can be applied to a variety of different sports. The algorithm starts with a model initialization step which locates the court in the image without any user assistance or a-priori knowledge about the most probable position. Image pixels are classified as court line pixels if they pass several tests including color and local texture constraints. A Hough transform is applied to extract line elements, forming a set of court line candidates. The subsequent combinatorial search establishes correspondences between lines in the input image and lines from the court model. For the succeeding input frames, an abbreviated calibration algorithm is used, which predicts the camera parameters for the new image and optimizes the parameters using a gradient-descent algorithm. We have conducted experiments on a variety of sport videos (tennis, volleyball, and goal area sequences of soccer games). Video scenes with considerable difficulties were selected to test the robustness of the algorithm. Results show that the algorithm is very robust to occlusions, partial court views, bad lighting conditions, or shadows.
Evaluation schemes for video and image anomaly detection algorithms
NASA Astrophysics Data System (ADS)
Parameswaran, Shibin; Harguess, Josh; Barngrover, Christopher; Shafer, Scott; Reese, Michael
2016-05-01
Video anomaly detection is a critical research area in computer vision. It is a natural first step before applying object recognition algorithms. There are many algorithms that detect anomalies (outliers) in videos and images that have been introduced in recent years. However, these algorithms behave and perform differently based on differences in domains and tasks to which they are subjected. In order to better understand the strengths and weaknesses of outlier algorithms and their applicability in a particular domain/task of interest, it is important to measure and quantify their performance using appropriate evaluation metrics. There are many evaluation metrics that have been used in the literature such as precision curves, precision-recall curves, and receiver operating characteristic (ROC) curves. In order to construct these different metrics, it is also important to choose an appropriate evaluation scheme that decides when a proposed detection is considered a true or a false detection. Choosing the right evaluation metric and the right scheme is very critical since the choice can introduce positive or negative bias in the measuring criterion and may favor (or work against) a particular algorithm or task. In this paper, we review evaluation metrics and popular evaluation schemes that are used to measure the performance of anomaly detection algorithms on videos and imagery with one or more anomalies. We analyze the biases introduced by these by measuring the performance of an existing anomaly detection algorithm.
Advanced Architectures for Astrophysical Supercomputing
NASA Astrophysics Data System (ADS)
Barsdell, B. R.; Barnes, D. G.; Fluke, C. J.
2010-12-01
Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding-up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation - performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.
Sparse Coding and Counting for Robust Visual Tracking
Liu, Risheng; Wang, Jing; Shang, Xiaoke; Wang, Yiyang; Su, Zhixun; Cai, Yu
2016-01-01
In this paper, we propose a novel sparse coding and counting method under Bayesian framework for visual tracking. In contrast to existing methods, the proposed method employs the combination of L0 and L1 norm to regularize the linear coefficients of incrementally updated linear basis. The sparsity constraint enables the tracker to effectively handle difficult challenges, such as occlusion or image corruption. To achieve real-time processing, we propose a fast and efficient numerical algorithm for solving the proposed model. Although it is an NP-hard problem, the proposed accelerated proximal gradient (APG) approach is guaranteed to converge to a solution quickly. Besides, we provide a closed solution of combining L0 and L1 regularized representation to obtain better sparsity. Experimental results on challenging video sequences demonstrate that the proposed method achieves state-of-the-art results both in accuracy and speed. PMID:27992474
Tracking and recognition face in videos with incremental local sparse representation model
NASA Astrophysics Data System (ADS)
Wang, Chao; Wang, Yunhong; Zhang, Zhaoxiang
2013-10-01
This paper addresses the problem of tracking and recognizing faces via incremental local sparse representation. First a robust face tracking algorithm is proposed via employing local sparse appearance and covariance pooling method. In the following face recognition stage, with the employment of a novel template update strategy, which combines incremental subspace learning, our recognition algorithm adapts the template to appearance changes and reduces the influence of occlusion and illumination variation. This leads to a robust video-based face tracking and recognition with desirable performance. In the experiments, we test the quality of face recognition in real-world noisy videos on YouTube database, which includes 47 celebrities. Our proposed method produces a high face recognition rate at 95% of all videos. The proposed face tracking and recognition algorithms are also tested on a set of noisy videos under heavy occlusion and illumination variation. The tracking results on challenging benchmark videos demonstrate that the proposed tracking algorithm performs favorably against several state-of-the-art methods. In the case of the challenging dataset in which faces undergo occlusion and illumination variation, and tracking and recognition experiments under significant pose variation on the University of California, San Diego (Honda/UCSD) database, our proposed method also consistently demonstrates a high recognition rate.
NASA Astrophysics Data System (ADS)
Karczewicz, Marta; Chen, Peisong; Joshi, Rajan; Wang, Xianglin; Chien, Wei-Jung; Panchal, Rahul; Coban, Muhammed; Chong, In Suk; Reznik, Yuriy A.
2011-01-01
This paper describes video coding technology proposal submitted by Qualcomm Inc. in response to a joint call for proposal (CfP) issued by ITU-T SG16 Q.6 (VCEG) and ISO/IEC JTC1/SC29/WG11 (MPEG) in January 2010. Proposed video codec follows a hybrid coding approach based on temporal prediction, followed by transform, quantization, and entropy coding of the residual. Some of its key features are extended block sizes (up to 64x64), recursive integer transforms, single pass switched interpolation filters with offsets (single pass SIFO), mode dependent directional transform (MDDT) for intra-coding, luma and chroma high precision filtering, geometry motion partitioning, adaptive motion vector resolution. It also incorporates internal bit-depth increase (IBDI), and modified quadtree based adaptive loop filtering (QALF). Simulation results are presented for a variety of bit rates, resolutions and coding configurations to demonstrate the high compression efficiency achieved by the proposed video codec at moderate level of encoding and decoding complexity. For random access hierarchical B configuration (HierB), the proposed video codec achieves an average BD-rate reduction of 30.88c/o compared to the H.264/AVC alpha anchor. For low delay hierarchical P (HierP) configuration, the proposed video codec achieves an average BD-rate reduction of 32.96c/o and 48.57c/o, compared to the H.264/AVC beta and gamma anchors, respectively.
Indexing, Browsing, and Searching of Digital Video.
ERIC Educational Resources Information Center
Smeaton, Alan F.
2004-01-01
Presents a literature review that covers the following topics related to indexing, browsing, and searching of digital video: video coding and standards; conventional approaches to accessing digital video; automatically structuring and indexing digital video; searching, browsing, and summarization; measurement and evaluation of the effectiveness of…
Shot boundary detection and label propagation for spatio-temporal video segmentation
NASA Astrophysics Data System (ADS)
Piramanayagam, Sankaranaryanan; Saber, Eli; Cahill, Nathan D.; Messinger, David
2015-02-01
This paper proposes a two stage algorithm for streaming video segmentation. In the first stage, shot boundaries are detected within a window of frames by comparing dissimilarity between 2-D segmentations of each frame. In the second stage, the 2-D segments are propagated across the window of frames in both spatial and temporal direction. The window is moved across the video to find all shot transitions and obtain spatio-temporal segments simultaneously. As opposed to techniques that operate on entire video, the proposed approach consumes significantly less memory and enables segmentation of lengthy videos. We tested our segmentation based shot detection method on the TRECVID 2007 video dataset and compared it with block-based technique. Cut detection results on the TRECVID 2007 dataset indicate that our algorithm has comparable results to the best of the block-based methods. The streaming video segmentation routine also achieves promising results on a challenging video segmentation benchmark database.
NASA Astrophysics Data System (ADS)
Hild, Jutta; Krüger, Wolfgang; Brüstle, Stefan; Trantelle, Patrick; Unmüßig, Gabriel; Voit, Michael; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen
2017-05-01
Real-time motion video analysis is a challenging and exhausting task for the human observer, particularly in safety and security critical domains. Hence, customized video analysis systems providing functions for the analysis of subtasks like motion detection or target tracking are welcome. While such automated algorithms relieve the human operators from performing basic subtasks, they impose additional interaction duties on them. Prior work shows that, e.g., for interaction with target tracking algorithms, a gaze-enhanced user interface is beneficial. In this contribution, we present an investigation on interaction with an independent motion detection (IDM) algorithm. Besides identifying an appropriate interaction technique for the user interface - again, we compare gaze-based and traditional mouse-based interaction - we focus on the benefit an IDM algorithm might provide for an UAS video analyst. In a pilot study, we exposed ten subjects to the task of moving target detection in UAS video data twice, once performing with automatic support, once performing without it. We compare the two conditions considering performance in terms of effectiveness (correct target selections). Additionally, we report perceived workload (measured using the NASA-TLX questionnaire) and user satisfaction (measured using the ISO 9241-411 questionnaire). The results show that a combination of gaze input and automated IDM algorithm provides valuable support for the human observer, increasing the number of correct target selections up to 62% and reducing workload at the same time.
Collaborative real-time motion video analysis by human observer and image exploitation algorithms
NASA Astrophysics Data System (ADS)
Hild, Jutta; Krüger, Wolfgang; Brüstle, Stefan; Trantelle, Patrick; Unmüßig, Gabriel; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen
2015-05-01
Motion video analysis is a challenging task, especially in real-time applications. In most safety and security critical applications, a human observer is an obligatory part of the overall analysis system. Over the last years, substantial progress has been made in the development of automated image exploitation algorithms. Hence, we investigate how the benefits of automated video analysis can be integrated suitably into the current video exploitation systems. In this paper, a system design is introduced which strives to combine both the qualities of the human observer's perception and the automated algorithms, thus aiming to improve the overall performance of a real-time video analysis system. The system design builds on prior work where we showed the benefits for the human observer by means of a user interface which utilizes the human visual focus of attention revealed by the eye gaze direction for interaction with the image exploitation system; eye tracker-based interaction allows much faster, more convenient, and equally precise moving target acquisition in video images than traditional computer mouse selection. The system design also builds on prior work we did on automated target detection, segmentation, and tracking algorithms. Beside the system design, a first pilot study is presented, where we investigated how the participants (all non-experts in video analysis) performed in initializing an object tracking subsystem by selecting a target for tracking. Preliminary results show that the gaze + key press technique is an effective, efficient, and easy to use interaction technique when performing selection operations on moving targets in videos in order to initialize an object tracking function.
Open Source Subtitle Editor Software Study for Section 508 Close Caption Applications
NASA Technical Reports Server (NTRS)
Murphy, F. Brandon
2013-01-01
This paper will focus on a specific item within the NASA Electronic Information Accessibility Policy - Multimedia Presentation shall have synchronized caption; thus making information accessible to a person with hearing impairment. This synchronized caption will assist a person with hearing or cognitive disability to access the same information as everyone else. This paper focuses on the research and implementation for CC (subtitle option) support to video multimedia. The goal of this research is identify the best available open-source (free) software to achieve synchronized captions requirement and achieve savings, while meeting the security requirement for Government information integrity and assurance. CC and subtitling are processes that display text within a video to provide additional or interpretive information for those whom may need it or those whom chose it. Closed captions typically show the transcription of the audio portion of a program (video) as it occurs (either verbatim or in its edited form), sometimes including non-speech elements (such as sound effects). The transcript can be provided by a third party source or can be extracted word for word from the video. This feature can be made available for videos in two forms: either Soft-Coded or Hard-Coded. Soft-Coded is the more optional version of CC, where you can chose to turn them on if you want, or you can turn them off. Most of the time, when using the Soft-Coded option, the transcript is also provided to the view along-side the video. This option is subject to compromise, whereas the transcript is merely a text file that can be changed by anyone who has access to it. With this option the integrity of the CC is at the mercy of the user. Hard-Coded CC is a more permanent form of CC. A Hard-Coded CC transcript is embedded within a video, without the option of removal.
Trellises and Trellis-Based Decoding Algorithms for Linear Block Codes
NASA Technical Reports Server (NTRS)
Lin, Shu
1998-01-01
A code trellis is a graphical representation of a code, block or convolutional, in which every path represents a codeword (or a code sequence for a convolutional code). This representation makes it possible to implement Maximum Likelihood Decoding (MLD) of a code with reduced decoding complexity. The most well known trellis-based MLD algorithm is the Viterbi algorithm. The trellis representation was first introduced and used for convolutional codes [23]. This representation, together with the Viterbi decoding algorithm, has resulted in a wide range of applications of convolutional codes for error control in digital communications over the last two decades. There are two major reasons for this inactive period of research in this area. First, most coding theorists at that time believed that block codes did not have simple trellis structure like convolutional codes and maximum likelihood decoding of linear block codes using the Viterbi algorithm was practically impossible, except for very short block codes. Second, since almost all of the linear block codes are constructed algebraically or based on finite geometries, it was the belief of many coding theorists that algebraic decoding was the only way to decode these codes. These two reasons seriously hindered the development of efficient soft-decision decoding methods for linear block codes and their applications to error control in digital communications. This led to a general belief that block codes are inferior to convolutional codes and hence, that they were not useful. Chapter 2 gives a brief review of linear block codes. The goal is to provide the essential background material for the development of trellis structure and trellis-based decoding algorithms for linear block codes in the later chapters. Chapters 3 through 6 present the fundamental concepts, finite-state machine model, state space formulation, basic structural properties, state labeling, construction procedures, complexity, minimality, and sectionalization of trellises. Chapter 7 discusses trellis decomposition and subtrellises for low-weight codewords. Chapter 8 first presents well known methods for constructing long powerful codes from short component codes or component codes of smaller dimensions, and then provides methods for constructing their trellises which include Shannon and Cartesian product techniques. Chapter 9 deals with convolutional codes, puncturing, zero-tail termination and tail-biting.Chapters 10 through 13 present various trellis-based decoding algorithms, old and new. Chapter 10 first discusses the application of the well known Viterbi decoding algorithm to linear block codes, optimum sectionalization of a code trellis to minimize computation complexity, and design issues for IC (integrated circuit) implementation of a Viterbi decoder. Then it presents a new decoding algorithm for convolutional codes, named Differential Trellis Decoding (DTD) algorithm. Chapter 12 presents a suboptimum reliability-based iterative decoding algorithm with a low-weight trellis search for the most likely codeword. This decoding algorithm provides a good trade-off between error performance and decoding complexity. All the decoding algorithms presented in Chapters 10 through 12 are devised to minimize word error probability. Chapter 13 presents decoding algorithms that minimize bit error probability and provide the corresponding soft (reliability) information at the output of the decoder. Decoding algorithms presented are the MAP (maximum a posteriori probability) decoding algorithm and the Soft-Output Viterbi Algorithm (SOVA) algorithm. Finally, the minimization of bit error probability in trellis-based MLD is discussed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duberstein, Corey A.; Matzner, Shari; Cullinan, Valerie I.
Surveying wildlife at risk from offshore wind energy development is difficult and expensive. Infrared video can be used to record birds and bats that pass through the camera view, but it is also time consuming and expensive to review video and determine what was recorded. We proposed to conduct algorithm and software development to identify and to differentiate thermally detected targets of interest that would allow automated processing of thermal image data to enumerate birds, bats, and insects. During FY2012 we developed computer code within MATLAB to identify objects recorded in video and extract attribute information that describes the objectsmore » recorded. We tested the efficiency of track identification using observer-based counts of tracks within segments of sample video. We examined object attributes, modeled the effects of random variability on attributes, and produced data smoothing techniques to limit random variation within attribute data. We also began drafting and testing methodology to identify objects recorded on video. We also recorded approximately 10 hours of infrared video of various marine birds, passerine birds, and bats near the Pacific Northwest National Laboratory (PNNL) Marine Sciences Laboratory (MSL) at Sequim, Washington. A total of 6 hours of bird video was captured overlooking Sequim Bay over a series of weeks. An additional 2 hours of video of birds was also captured during two weeks overlooking Dungeness Bay within the Strait of Juan de Fuca. Bats and passerine birds (swallows) were also recorded at dusk on the MSL campus during nine evenings. An observer noted the identity of objects viewed through the camera concurrently with recording. These video files will provide the information necessary to produce and test software developed during FY2013. The annotation will also form the basis for creation of a method to reliably identify recorded objects.« less
Dual-Layer Video Encryption using RSA Algorithm
NASA Astrophysics Data System (ADS)
Chadha, Aman; Mallik, Sushmit; Chadha, Ankit; Johar, Ravdeep; Mani Roja, M.
2015-04-01
This paper proposes a video encryption algorithm using RSA and Pseudo Noise (PN) sequence, aimed at applications requiring sensitive video information transfers. The system is primarily designed to work with files encoded using the Audio Video Interleaved (AVI) codec, although it can be easily ported for use with Moving Picture Experts Group (MPEG) encoded files. The audio and video components of the source separately undergo two layers of encryption to ensure a reasonable level of security. Encryption of the video component involves applying the RSA algorithm followed by the PN-based encryption. Similarly, the audio component is first encrypted using PN and further subjected to encryption using the Discrete Cosine Transform. Combining these techniques, an efficient system, invulnerable to security breaches and attacks with favorable values of parameters such as encryption/decryption speed, encryption/decryption ratio and visual degradation; has been put forth. For applications requiring encryption of sensitive data wherein stringent security requirements are of prime concern, the system is found to yield negligible similarities in visual perception between the original and the encrypted video sequence. For applications wherein visual similarity is not of major concern, we limit the encryption task to a single level of encryption which is accomplished by using RSA, thereby quickening the encryption process. Although some similarity between the original and encrypted video is observed in this case, it is not enough to comprehend the happenings in the video.
Video Transmission for Third Generation Wireless Communication Systems
Gharavi, H.; Alamouti, S. M.
2001-01-01
This paper presents a twin-class unequal protected video transmission system over wireless channels. Video partitioning based on a separation of the Variable Length Coded (VLC) Discrete Cosine Transform (DCT) coefficients within each block is considered for constant bitrate transmission (CBR). In the splitting process the fraction of bits assigned to each of the two partitions is adjusted according to the requirements of the unequal error protection scheme employed. Subsequently, partitioning is applied to the ITU-T H.263 coding standard. As a transport vehicle, we have considered one of the leading third generation cellular radio standards known as WCDMA. A dual-priority transmission system is then invoked on the WCDMA system where the video data, after being broken into two streams, is unequally protected. We use a very simple error correction coding scheme for illustration and then propose more sophisticated forms of unequal protection of the digitized video signals. We show that this strategy results in a significantly higher quality of the reconstructed video data when it is transmitted over time-varying multipath fading channels. PMID:27500033
Grubbs, Kathleen M; Fortney, John C; Dean, Tisha; Williams, James S; Godleski, Linda
2015-07-01
This study compares the mental health diagnoses of encounters delivered face to face and via interactive video in the Veterans Healthcare Administration (VHA). We compiled 1 year of national-level VHA administrative data for Fiscal Year 2012 (FY12). Mental health encounters were those with both a VHA Mental Health Stop Code and a Mental Health Diagnosis (n=11,906,114). Interactive video encounters were identified as those with a Mental Health Stop Code, paired with a VHA Telehealth Secondary Stop Code. Primary diagnoses were grouped into posttraumatic stress disorder (PTSD), depression, anxiety, bipolar disorder, psychosis, drug use, alcohol use, and other. In FY12, 1.5% of all mental health encounters were delivered via interactive video. Compared with face-to-face encounters, a larger percentage of interactive video encounters was for PTSD, depression, and anxiety, whereas a smaller percentage was for alcohol use, drug use, or psychosis. Providers and patients may feel more comfortable treating depression and anxiety disorders than substance use or psychosis via interactive video.
Video Shot Boundary Detection Using QR-Decomposition and Gaussian Transition Detection
NASA Astrophysics Data System (ADS)
Amiri, Ali; Fathy, Mahmood
2010-12-01
This article explores the problem of video shot boundary detection and examines a novel shot boundary detection algorithm by using QR-decomposition and modeling of gradual transitions by Gaussian functions. Specifically, the authors attend to the challenges of detecting gradual shots and extracting appropriate spatiotemporal features that affect the ability of algorithms to efficiently detect shot boundaries. The algorithm utilizes the properties of QR-decomposition and extracts a block-wise probability function that illustrates the probability of video frames to be in shot transitions. The probability function has abrupt changes in hard cut transitions, and semi-Gaussian behavior in gradual transitions. The algorithm detects these transitions by analyzing the probability function. Finally, we will report the results of the experiments using large-scale test sets provided by the TRECVID 2006, which has assessments for hard cut and gradual shot boundary detection. These results confirm the high performance of the proposed algorithm.
Video segmentation for post-production
NASA Astrophysics Data System (ADS)
Wills, Ciaran
2001-12-01
Specialist post-production is an industry that has much to gain from the application of content-based video analysis techniques. However the types of material handled in specialist post-production, such as television commercials, pop music videos and special effects are quite different in nature from the typical broadcast material which many video analysis techniques are designed to work with; shots are short and highly dynamic, and the transitions are often novel or ambiguous. We address the problem of scene change detection and develop a new algorithm which tackles some of the common aspects of post-production material that cause difficulties for past algorithms, such as illumination changes and jump cuts. Operating in the compressed domain on Motion JPEG compressed video, our algorithm detects cuts and fades by analyzing each JPEG macroblock in the context of its temporal and spatial neighbors. Analyzing the DCT coefficients directly we can extract the mean color of a block and an approximate detail level. We can also perform an approximated cross-correlation between two blocks. The algorithm is part of a set of tools being developed to work with an automated asset management system designed specifically for use in post-production facilities.
Scene-aware joint global and local homographic video coding
NASA Astrophysics Data System (ADS)
Peng, Xiulian; Xu, Jizheng; Sullivan, Gary J.
2016-09-01
Perspective motion is commonly represented in video content that is captured and compressed for various applications including cloud gaming, vehicle and aerial monitoring, etc. Existing approaches based on an eight-parameter homography motion model cannot deal with this efficiently, either due to low prediction accuracy or excessive bit rate overhead. In this paper, we consider the camera motion model and scene structure in such video content and propose a joint global and local homography motion coding approach for video with perspective motion. The camera motion is estimated by a computer vision approach, and camera intrinsic and extrinsic parameters are globally coded at the frame level. The scene is modeled as piece-wise planes, and three plane parameters are coded at the block level. Fast gradient-based approaches are employed to search for the plane parameters for each block region. In this way, improved prediction accuracy and low bit costs are achieved. Experimental results based on the HEVC test model show that up to 9.1% bit rate savings can be achieved (with equal PSNR quality) on test video content with perspective motion. Test sequences for the example applications showed a bit rate savings ranging from 3.7 to 9.1%.
SarcOptiM for ImageJ: high-frequency online sarcomere length computing on stimulated cardiomyocytes.
Pasqualin, Côme; Gannier, François; Yu, Angèle; Malécot, Claire O; Bredeloux, Pierre; Maupoil, Véronique
2016-08-01
Accurate measurement of cardiomyocyte contraction is a critical issue for scientists working on cardiac physiology and physiopathology of diseases implying contraction impairment. Cardiomyocytes contraction can be quantified by measuring sarcomere length, but few tools are available for this, and none is freely distributed. We developed a plug-in (SarcOptiM) for the ImageJ/Fiji image analysis platform developed by the National Institutes of Health. SarcOptiM computes sarcomere length via fast Fourier transform analysis of video frames captured or displayed in ImageJ and thus is not tied to a dedicated video camera. It can work in real time or offline, the latter overcoming rotating motion or displacement-related artifacts. SarcOptiM includes a simulator and video generator of cardiomyocyte contraction. Acquisition parameters, such as pixel size and camera frame rate, were tested with both experimental recordings of rat ventricular cardiomyocytes and synthetic videos. It is freely distributed, and its source code is available. It works under Windows, Mac, or Linux operating systems. The camera speed is the limiting factor, since the algorithm can compute online sarcomere shortening at frame rates >10 kHz. In conclusion, SarcOptiM is a free and validated user-friendly tool for studying cardiomyocyte contraction in all species, including human. Copyright © 2016 the American Physiological Society.
Low-complexity video encoding method for wireless image transmission in capsule endoscope.
Takizawa, Kenichi; Hamaguchi, Kiyoshi
2010-01-01
This paper presents a low-complexity video encoding method applicable for wireless image transmission in capsule endoscopes. This encoding method is based on Wyner-Ziv theory, in which side information available at a transmitter is treated as side information at its receiver. Therefore complex processes in video encoding, such as estimation of the motion vector, are moved to the receiver side, which has a larger-capacity battery. As a result, the encoding process is only to decimate coded original data through channel coding. We provide a performance evaluation for a low-density parity check (LDPC) coding method in the AWGN channel.
Improved Techniques for Video Compression and Communication
ERIC Educational Resources Information Center
Chen, Haoming
2016-01-01
Video compression and communication has been an important field over the past decades and critical for many applications, e.g., video on demand, video-conferencing, and remote education. In many applications, providing low-delay and error-resilient video transmission and increasing the coding efficiency are two major challenges. Low-delay and…
Barbier, Paolo; Alimento, Marina; Berna, Giovanni; Celeste, Fabrizio; Gentile, Francesco; Mantero, Antonio; Montericcio, Vincenzo; Muratori, Manuela
2007-05-01
Large files produced by standard compression algorithms slow down spread of digital and tele-echocardiography. We validated echocardiographic video high-grade compression with the new Motion Pictures Expert Groups (MPEG)-4 algorithms with a multicenter study. Seven expert cardiologists blindly scored (5-point scale) 165 uncompressed and compressed 2-dimensional and color Doppler video clips, based on combined diagnostic content and image quality (uncompressed files as references). One digital video and 3 MPEG-4 algorithms (WM9, MV2, and DivX) were used, the latter at 3 compression levels (0%, 35%, and 60%). Compressed file sizes decreased from 12 to 83 MB to 0.03 to 2.3 MB (1:1051-1:26 reduction ratios). Mean SD of differences was 0.81 for intraobserver variability (uncompressed and digital video files). Compared with uncompressed files, only the DivX mean score at 35% (P = .04) and 60% (P = .001) compression was significantly reduced. At subcategory analysis, these differences were still significant for gray-scale and fundamental imaging but not for color or second harmonic tissue imaging. Original image quality, session sequence, compression grade, and bitrate were all independent determinants of mean score. Our study supports use of MPEG-4 algorithms to greatly reduce echocardiographic file sizes, thus facilitating archiving and transmission. Quality evaluation studies should account for the many independent variables that affect image quality grading.
Display device-adapted video quality-of-experience assessment
NASA Astrophysics Data System (ADS)
Rehman, Abdul; Zeng, Kai; Wang, Zhou
2015-03-01
Today's viewers consume video content from a variety of connected devices, including smart phones, tablets, notebooks, TVs, and PCs. This imposes significant challenges for managing video traffic efficiently to ensure an acceptable quality-of-experience (QoE) for the end users as the perceptual quality of video content strongly depends on the properties of the display device and the viewing conditions. State-of-the-art full-reference objective video quality assessment algorithms do not take into account the combined impact of display device properties, viewing conditions, and video resolution while performing video quality assessment. We performed a subjective study in order to understand the impact of aforementioned factors on perceptual video QoE. We also propose a full reference video QoE measure, named SSIMplus, that provides real-time prediction of the perceptual quality of a video based on human visual system behaviors, video content characteristics (such as spatial and temporal complexity, and video resolution), display device properties (such as screen size, resolution, and brightness), and viewing conditions (such as viewing distance and angle). Experimental results have shown that the proposed algorithm outperforms state-of-the-art video quality measures in terms of accuracy and speed.
Trellises and Trellis-Based Decoding Algorithms for Linear Block Codes. Part 3
NASA Technical Reports Server (NTRS)
Lin, Shu
1998-01-01
Decoding algorithms based on the trellis representation of a code (block or convolutional) drastically reduce decoding complexity. The best known and most commonly used trellis-based decoding algorithm is the Viterbi algorithm. It is a maximum likelihood decoding algorithm. Convolutional codes with the Viterbi decoding have been widely used for error control in digital communications over the last two decades. This chapter is concerned with the application of the Viterbi decoding algorithm to linear block codes. First, the Viterbi algorithm is presented. Then, optimum sectionalization of a trellis to minimize the computational complexity of a Viterbi decoder is discussed and an algorithm is presented. Some design issues for IC (integrated circuit) implementation of a Viterbi decoder are considered and discussed. Finally, a new decoding algorithm based on the principle of compare-select-add is presented. This new algorithm can be applied to both block and convolutional codes and is more efficient than the conventional Viterbi algorithm based on the add-compare-select principle. This algorithm is particularly efficient for rate 1/n antipodal convolutional codes and their high-rate punctured codes. It reduces computational complexity by one-third compared with the Viterbi algorithm.
Virtual viewpoint synthesis in multi-view video system
NASA Astrophysics Data System (ADS)
Li, Fang; Yang, Shiqiang
2005-07-01
In this paper, we present a virtual viewpoint video synthesis algorithm to satisfy the following three aims: low computing consuming; real time interpolation and acceptable video quality. In contrast with previous technologies, this method obtain incompletely 3D structure using neighbor video sources instead of getting total 3D information with all video sources, so that the computation is reduced greatly. So we demonstrate our interactive multi-view video synthesis algorithm in a personal computer. Furthermore, adopting the method of choosing feature points to build the correspondence between the frames captured by neighbor cameras, we need not require camera calibration. Finally, our method can be used when the angle between neighbor cameras is 25-30 degrees that it is much larger than common computer vision experiments. In this way, our method can be applied into many applications such as sports live, video conference, etc.
On scalable lossless video coding based on sub-pixel accurate MCTF
NASA Astrophysics Data System (ADS)
Yea, Sehoon; Pearlman, William A.
2006-01-01
We propose two approaches to scalable lossless coding of motion video. They achieve SNR-scalable bitstream up to lossless reconstruction based upon the subpixel-accurate MCTF-based wavelet video coding. The first approach is based upon a two-stage encoding strategy where a lossy reconstruction layer is augmented by a following residual layer in order to obtain (nearly) lossless reconstruction. The key advantages of our approach include an 'on-the-fly' determination of bit budget distribution between the lossy and the residual layers, freedom to use almost any progressive lossy video coding scheme as the first layer and an added feature of near-lossless compression. The second approach capitalizes on the fact that we can maintain the invertibility of MCTF with an arbitrary sub-pixel accuracy even in the presence of an extra truncation step for lossless reconstruction thanks to the lifting implementation. Experimental results show that the proposed schemes achieve compression ratios not obtainable by intra-frame coders such as Motion JPEG-2000 thanks to their inter-frame coding nature. Also they are shown to outperform the state-of-the-art non-scalable inter-frame coder H.264 (JM) lossless mode, with the added benefit of bitstream embeddedness.
Empirical evaluation of H.265/HEVC-based dynamic adaptive video streaming over HTTP (HEVC-DASH)
NASA Astrophysics Data System (ADS)
Irondi, Iheanyi; Wang, Qi; Grecos, Christos
2014-05-01
Real-time HTTP streaming has gained global popularity for delivering video content over Internet. In particular, the recent MPEG-DASH (Dynamic Adaptive Streaming over HTTP) standard enables on-demand, live, and adaptive Internet streaming in response to network bandwidth fluctuations. Meanwhile, emerging is the new-generation video coding standard, H.265/HEVC (High Efficiency Video Coding) promises to reduce the bandwidth requirement by 50% at the same video quality when compared with the current H.264/AVC standard. However, little existing work has addressed the integration of the DASH and HEVC standards, let alone empirical performance evaluation of such systems. This paper presents an experimental HEVC-DASH system, which is a pull-based adaptive streaming solution that delivers HEVC-coded video content through conventional HTTP servers where the client switches to its desired quality, resolution or bitrate based on the available network bandwidth. Previous studies in DASH have focused on H.264/AVC, whereas we present an empirical evaluation of the HEVC-DASH system by implementing a real-world test bed, which consists of an Apache HTTP Server with GPAC, an MP4Client (GPAC) with open HEVC-based DASH client and a NETEM box in the middle emulating different network conditions. We investigate and analyze the performance of HEVC-DASH by exploring the impact of various network conditions such as packet loss, bandwidth and delay on video quality. Furthermore, we compare the Intra and Random Access profiles of HEVC coding with the Intra profile of H.264/AVC when the correspondingly encoded video is streamed with DASH. Finally, we explore the correlation among the quality metrics and network conditions, and empirically establish under which conditions the different codecs can provide satisfactory performance.
Verification testing of the compression performance of the HEVC screen content coding extensions
NASA Astrophysics Data System (ADS)
Sullivan, Gary J.; Baroncini, Vittorio A.; Yu, Haoping; Joshi, Rajan L.; Liu, Shan; Xiu, Xiaoyu; Xu, Jizheng
2017-09-01
This paper reports on verification testing of the coding performance of the screen content coding (SCC) extensions of the High Efficiency Video Coding (HEVC) standard (Rec. ITU-T H.265 | ISO/IEC 23008-2 MPEG-H Part 2). The coding performance of HEVC screen content model (SCM) reference software is compared with that of the HEVC test model (HM) without the SCC extensions, as well as with the Advanced Video Coding (AVC) joint model (JM) reference software, for both lossy and mathematically lossless compression using All-Intra (AI), Random Access (RA), and Lowdelay B (LB) encoding structures and using similar encoding techniques. Video test sequences in 1920×1080 RGB 4:4:4, YCbCr 4:4:4, and YCbCr 4:2:0 colour sampling formats with 8 bits per sample are tested in two categories: "text and graphics with motion" (TGM) and "mixed" content. For lossless coding, the encodings are evaluated in terms of relative bit-rate savings. For lossy compression, subjective testing was conducted at 4 quality levels for each coding case, and the test results are presented through mean opinion score (MOS) curves. The relative coding performance is also evaluated in terms of Bjøntegaard-delta (BD) bit-rate savings for equal PSNR quality. The perceptual tests and objective metric measurements show a very substantial benefit in coding efficiency for the SCC extensions, and provided consistent results with a high degree of confidence. For TGM video, the estimated bit-rate savings ranged from 60-90% relative to the JM and 40-80% relative to the HM, depending on the AI/RA/LB configuration category and colour sampling format.
The video watermarking container: efficient real-time transaction watermarking
NASA Astrophysics Data System (ADS)
Wolf, Patrick; Hauer, Enrico; Steinebach, Martin
2008-02-01
When transaction watermarking is used to secure sales in online shops by embedding transaction specific watermarks, the major challenge is embedding efficiency: Maximum speed by minimal workload. This is true for all types of media. Video transaction watermarking presents a double challenge. Video files not only are larger than for example music files of the same playback time. In addition, video watermarking algorithms have a higher complexity than algorithms for other types of media. Therefore online shops that want to protect their videos by transaction watermarking are faced with the problem that their servers need to work harder and longer for every sold medium in comparison to audio sales. In the past, many algorithms responded to this challenge by reducing their complexity. But this usually results in a loss of either robustness or transparency. This paper presents a different approach. The container technology separates watermark embedding into two stages: A preparation stage and the finalization stage. In the preparation stage, the video is divided into embedding segments. For each segment one copy marked with "0" and anther one marked with "1" is created. This stage is computationally expensive but only needs to be done once. In the finalization stage, the watermarked video is assembled from the embedding segments according to the watermark message. This stage is very fast and involves no complex computations. It thus allows efficient creation of individually watermarked video files.
A Completely Blind Video Integrity Oracle.
Mittal, Anish; Saad, Michele A; Bovik, Alan C
2016-01-01
Considerable progress has been made toward developing still picture perceptual quality analyzers that do not require any reference picture and that are not trained on human opinion scores of distorted images. However, there do not yet exist any such completely blind video quality assessment (VQA) models. Here, we attempt to bridge this gap by developing a new VQA model called the video intrinsic integrity and distortion evaluation oracle (VIIDEO). The new model does not require the use of any additional information other than the video being quality evaluated. VIIDEO embodies models of intrinsic statistical regularities that are observed in natural vidoes, which are used to quantify disturbances introduced due to distortions. An algorithm derived from the VIIDEO model is thereby able to predict the quality of distorted videos without any external knowledge about the pristine source, anticipated distortions, or human judgments of video quality. Even with such a paucity of information, we are able to show that the VIIDEO algorithm performs much better than the legacy full reference quality measure MSE on the LIVE VQA database and delivers performance comparable with a leading human judgment trained blind VQA model. We believe that the VIIDEO algorithm is a significant step toward making real-time monitoring of completely blind video quality possible.
An efficient approach for video information retrieval
NASA Astrophysics Data System (ADS)
Dong, Daoguo; Xue, Xiangyang
2005-01-01
Today, more and more video information can be accessed through internet, satellite, etc.. Retrieving specific video information from large-scale video database has become an important and challenging research topic in the area of multimedia information retrieval. In this paper, we introduce a new and efficient index structure OVA-File, which is a variant of VA-File. In OVA-File, the approximations close to each other in data space are stored in close positions of the approximation file. The benefit is that only a part of approximations close to the query vector need to be visited to get the query result. Both shot query algorithm and video clip algorithm are proposed to support video information retrieval efficiently. The experimental results showed that the queries based on OVA-File were much faster than that based on VA-File with small loss of result quality.
NASA Astrophysics Data System (ADS)
Eaton, Adam; Vincely, Vinoin; Lloyd, Paige; Hugenberg, Kurt; Vishwanath, Karthik
2017-03-01
Video Photoplethysmography (VPPG) is a numerical technique to process standard RGB video data of exposed human skin and extracting the heart-rate (HR) from the skin areas. Being a non-contact technique, VPPG has the potential to provide estimates of subject's heart-rate, respiratory rate, and even the heart rate variability of human subjects with potential applications ranging from infant monitors, remote healthcare and psychological experiments, particularly given the non-contact and sensor-free nature of the technique. Though several previous studies have reported successful correlations in HR obtained using VPPG algorithms to HR measured using the gold-standard electrocardiograph, others have reported that these correlations are dependent on controlling for duration of the video-data analyzed, subject motion, and ambient lighting. Here, we investigate the ability of two commonly used VPPG-algorithms in extraction of human heart-rates under three different laboratory conditions. We compare the VPPG HR values extracted across these three sets of experiments to the gold-standard values acquired by using an electrocardiogram or a commercially available pulseoximeter. The two VPPG-algorithms were applied with and without KLT-facial feature tracking and detection algorithms from the Computer Vision MATLAB® toolbox. Results indicate that VPPG based numerical approaches have the ability to provide robust estimates of subject HR values and are relatively insensitive to the devices used to record the video data. However, they are highly sensitive to conditions of video acquisition including subject motion, the location, size and averaging techniques applied to regions-of-interest as well as to the number of video frames used for data processing.
Telesign: a videophone system for sign language distant communication
NASA Astrophysics Data System (ADS)
Mozelle, Gerard; Preteux, Francoise J.; Viallet, Jean-Emmanuel
1998-09-01
This paper presents a low bit rate videophone system for deaf people communicating by means of sign language. Classic video conferencing systems have focused on head and shoulders sequences which are not well-suited for sign language video transmission since hearing impaired people also use their hands and arms to communicate. To address the above-mentioned functionality, we have developed a two-step content-based video coding system based on: (1) A segmentation step. Four or five video objects (VO) are extracted using a cooperative approach between color-based and morphological segmentation. (2) VO coding are achieved by using a standardized MPEG-4 video toolbox. Results of encoded sign language video sequences, presented for three target bit rates (32 kbits/s, 48 kbits/s and 64 kbits/s), demonstrate the efficiency of the approach presented in this paper.
Layered compression for high-precision depth data.
Miao, Dan; Fu, Jingjing; Lu, Yan; Li, Shipeng; Chen, Chang Wen
2015-12-01
With the development of depth data acquisition technologies, access to high-precision depth with more than 8-b depths has become much easier and determining how to efficiently represent and compress high-precision depth is essential for practical depth storage and transmission systems. In this paper, we propose a layered high-precision depth compression framework based on an 8-b image/video encoder to achieve efficient compression with low complexity. Within this framework, considering the characteristics of the high-precision depth, a depth map is partitioned into two layers: 1) the most significant bits (MSBs) layer and 2) the least significant bits (LSBs) layer. The MSBs layer provides rough depth value distribution, while the LSBs layer records the details of the depth value variation. For the MSBs layer, an error-controllable pixel domain encoding scheme is proposed to exploit the data correlation of the general depth information with sharp edges and to guarantee the data format of LSBs layer is 8 b after taking the quantization error from MSBs layer. For the LSBs layer, standard 8-b image/video codec is leveraged to perform the compression. The experimental results demonstrate that the proposed coding scheme can achieve real-time depth compression with satisfactory reconstruction quality. Moreover, the compressed depth data generated from this scheme can achieve better performance in view synthesis and gesture recognition applications compared with the conventional coding schemes because of the error control algorithm.
Intra prediction using face continuity in 360-degree video coding
NASA Astrophysics Data System (ADS)
Hanhart, Philippe; He, Yuwen; Ye, Yan
2017-09-01
This paper presents a new reference sample derivation method for intra prediction in 360-degree video coding. Unlike the conventional reference sample derivation method for 2D video coding, which uses the samples located directly above and on the left of the current block, the proposed method considers the spherical nature of 360-degree video when deriving reference samples located outside the current face to which the block belongs, and derives reference samples that are geometric neighbors on the sphere. The proposed reference sample derivation method was implemented in the Joint Exploration Model 3.0 (JEM-3.0) for the cubemap projection format. Simulation results for the all intra configuration show that, when compared with the conventional reference sample derivation method, the proposed method gives, on average, luma BD-rate reduction of 0.3% in terms of the weighted spherical PSNR (WS-PSNR) and spherical PSNR (SPSNR) metrics.
NASA Astrophysics Data System (ADS)
Nightingale, James; Wang, Qi; Grecos, Christos; Goma, Sergio
2014-02-01
High Efficiency Video Coding (HEVC), the latest video compression standard (also known as H.265), can deliver video streams of comparable quality to the current H.264 Advanced Video Coding (H.264/AVC) standard with a 50% reduction in bandwidth. Research into SHVC, the scalable extension to the HEVC standard, is still in its infancy. One important area for investigation is whether, given the greater compression ratio of HEVC (and SHVC), the loss of packets containing video content will have a greater impact on the quality of delivered video than is the case with H.264/AVC or its scalable extension H.264/SVC. In this work we empirically evaluate the layer-based, in-network adaptation of video streams encoded using SHVC in situations where dynamically changing bandwidths and datagram loss ratios require the real-time adaptation of video streams. Through the use of extensive experimentation, we establish a comprehensive set of benchmarks for SHVC-based highdefinition video streaming in loss prone network environments such as those commonly found in mobile networks. Among other results, we highlight that packet losses of only 1% can lead to a substantial reduction in PSNR of over 3dB and error propagation in over 130 pictures following the one in which the loss occurred. This work would be one of the earliest studies in this cutting-edge area that reports benchmark evaluation results for the effects of datagram loss on SHVC picture quality and offers empirical and analytical insights into SHVC adaptation to lossy, mobile networking conditions.
Spherical rotation orientation indication for HEVC and JEM coding of 360 degree video
NASA Astrophysics Data System (ADS)
Boyce, Jill; Xu, Qian
2017-09-01
Omnidirectional (or "360 degree") video, representing a panoramic view of a spherical 360° ×180° scene, can be encoded using conventional video compression standards, once it has been projection mapped to a 2D rectangular format. Equirectangular projection format is currently used for mapping 360 degree video to a rectangular representation for coding using HEVC/JEM. However, video in the top and bottom regions of the image, corresponding to the "north pole" and "south pole" of the spherical representation, is significantly warped. We propose to perform spherical rotation of the input video prior to HEVC/JEM encoding in order to improve the coding efficiency, and to signal parameters in a supplemental enhancement information (SEI) message that describe the inverse rotation process recommended to be applied following HEVC/JEM decoding, prior to display. Experiment results show that up to 17.8% bitrate gain (using the WS-PSNR end-to-end metric) can be achieved for the Chairlift sequence using HM16.15 and 11.9% gain using JEM6.0, and an average gain of 2.9% for HM16.15 and 2.2% for JEM6.0.
Soft-output decoding algorithms in iterative decoding of turbo codes
NASA Technical Reports Server (NTRS)
Benedetto, S.; Montorsi, G.; Divsalar, D.; Pollara, F.
1996-01-01
In this article, we present two versions of a simplified maximum a posteriori decoding algorithm. The algorithms work in a sliding window form, like the Viterbi algorithm, and can thus be used to decode continuously transmitted sequences obtained by parallel concatenated codes, without requiring code trellis termination. A heuristic explanation is also given of how to embed the maximum a posteriori algorithms into the iterative decoding of parallel concatenated codes (turbo codes). The performances of the two algorithms are compared on the basis of a powerful rate 1/3 parallel concatenated code. Basic circuits to implement the simplified a posteriori decoding algorithm using lookup tables, and two further approximations (linear and threshold), with a very small penalty, to eliminate the need for lookup tables are proposed.
Robust Transmission of H.264/AVC Streams Using Adaptive Group Slicing and Unequal Error Protection
NASA Astrophysics Data System (ADS)
Thomos, Nikolaos; Argyropoulos, Savvas; Boulgouris, Nikolaos V.; Strintzis, Michael G.
2006-12-01
We present a novel scheme for the transmission of H.264/AVC video streams over lossy packet networks. The proposed scheme exploits the error-resilient features of H.264/AVC codec and employs Reed-Solomon codes to protect effectively the streams. A novel technique for adaptive classification of macroblocks into three slice groups is also proposed. The optimal classification of macroblocks and the optimal channel rate allocation are achieved by iterating two interdependent steps. Dynamic programming techniques are used for the channel rate allocation process in order to reduce complexity. Simulations clearly demonstrate the superiority of the proposed method over other recent algorithms for transmission of H.264/AVC streams.
ERIC Educational Resources Information Center
Garneli, Varvara; Chorianopoulos, Konstantinos
2018-01-01
Various aspects of computational thinking (CT) could be supported by educational contexts such as simulations and video-games construction. In this field study, potential differences in student motivation and learning were empirically examined through students' code. For this purpose, we performed a teaching intervention that took place over five…
Optimized atom position and coefficient coding for matching pursuit-based image compression.
Shoa, Alireza; Shirani, Shahram
2009-12-01
In this paper, we propose a new encoding algorithm for matching pursuit image coding. We show that coding performance is improved when correlations between atom positions and atom coefficients are both used in encoding. We find the optimum tradeoff between efficient atom position coding and efficient atom coefficient coding and optimize the encoder parameters. Our proposed algorithm outperforms the existing coding algorithms designed for matching pursuit image coding. Additionally, we show that our algorithm results in better rate distortion performance than JPEG 2000 at low bit rates.
NASA Astrophysics Data System (ADS)
Flores, Eileen; Yelamos, Oriol; Cordova, Miguel; Kose, Kivanc; Phillips, William; Rossi, Anthony; Nehal, Kishwer; Rajadhyaksha, Milind
2017-02-01
Reflectance confocal microscopy (RCM) imaging shows promise for guiding surgical treatment of skin cancers. Recent technological advancements such as the introduction of the handheld version of the reflectance confocal microscope, video acquisition and video-mosaicing have improved RCM as an emerging tool to evaluate cancer margins during routine surgical skin procedures such as Mohs micrographic surgery (MMS). Detection of residual non-melanoma skin cancer (NMSC) tumor during MMS is feasible, as demonstrated by the introduction of real-time perioperative imaging on patients in the surgical setting. Our study is currently testing the feasibility of a new mosaicing algorithm for perioperative RCM imaging of NMSC cancer margins on patients during MMS. We report progress toward imaging and image analysis on forty-five patients, who presented for MMS at the MSKCC Dermatology service. The first 10 patients were used as a training set to establish an RCM imaging algorithm, which was implemented on the remaining test set of 35 patients. RCM imaging, using 35% AlCl3 for nuclear contrast, was performed pre- and intra-operatively with the Vivascope 3000 (Caliber ID). Imaging was performed in quadrants in the wound, to simulate the Mohs surgeon's examination of pathology. Videos were taken at the epidermal and deep dermal margins. Our Mohs surgeons assessed all videos and video-mosaics for quality and correlation to histology. Overall, our RCM video-mosaicing algorithm is feasible. RCM videos and video-mosaics of the epidermal and dermal margins were found to be of clinically acceptable quality. Assessment of cancer margins was affected by type of NMSC, size and location. Among the test set of 35 patients, 83% showed acceptable imaging quality, resolution and contrast. Visualization of nuclear and cellular morphology of residual BCC/SCC tumor and normal skin features could be detected in the peripheral and deep dermal margins. We observed correlation between the RCM videos/video-mosaics and the corresponding histology in 32 lesions. Peri-operative RCM imaging shows promise for improved and faster detection of cancer margins and guiding MMS in the surgical setting.
Collision count in rugby union: A comparison of micro-technology and video analysis methods.
Reardon, Cillian; Tobin, Daniel P; Tierney, Peter; Delahunt, Eamonn
2017-10-01
The aim of our study was to determine if there is a role for manipulation of g force thresholds acquired via micro-technology for accurately detecting collisions in rugby union. In total, 36 players were recruited from an elite Guinness Pro12 rugby union team. Player movement profiles and collisions were acquired via individual global positioning system (GPS) micro-technology units. Players were assigned to a sub-category of positions in order to determine positional collision demands. The coding of collisions by micro-technology at g force thresholds between 2 and 5.5 g (0.5 g increments) was compared with collision coding by an expert video analyst using Bland-Altman assessments. The most appropriate g force threshold (smallest mean difference compared with video analyst coding) was lower for all forwards positions (2.5 g) than for all backs positions (3.5 g). The Bland-Altman 95% limits of agreement indicated that there may be a substantial over- or underestimation of collisions coded via GPS micro-technology when using expert video analyst coding as the reference comparator. The manipulation of the g force thresholds applied to data acquired by GPS micro-technology units based on incremental thresholds of 0.5 g does not provide a reliable tool for the accurate coding of collisions in rugby union. Future research should aim to investigate smaller g force threshold increments and determine the events that cause coding of false positives.
Direct endoscopic video registration for sinus surgery
NASA Astrophysics Data System (ADS)
Mirota, Daniel; Taylor, Russell H.; Ishii, Masaru; Hager, Gregory D.
2009-02-01
Advances in computer vision have made possible robust 3D reconstruction of monocular endoscopic video. These reconstructions accurately represent the visible anatomy and, once registered to pre-operative CT data, enable a navigation system to track directly through video eliminating the need for an external tracking system. Video registration provides the means for a direct interface between an endoscope and a navigation system and allows a shorter chain of rigid-body transformations to be used to solve the patient/navigation-system registration. To solve this registration step we propose a new 3D-3D registration algorithm based on Trimmed Iterative Closest Point (TrICP)1 and the z-buffer algorithm.2 The algorithm takes as input a 3D point cloud of relative scale with the origin at the camera center, an isosurface from the CT, and an initial guess of the scale and location. Our algorithm utilizes only the visible polygons of the isosurface from the current camera location during each iteration to minimize the search area of the target region and robustly reject outliers of the reconstruction. We present example registrations in the sinus passage applicable to both sinus surgery and transnasal surgery. To evaluate our algorithm's performance we compare it to registration via Optotrak and present closest distance point to surface error. We show our algorithm has a mean closest distance error of .2268mm.
Reflectance Prediction Modelling for Residual-Based Hyperspectral Image Coding
Xiao, Rui; Gao, Junbin; Bossomaier, Terry
2016-01-01
A Hyperspectral (HS) image provides observational powers beyond human vision capability but represents more than 100 times the data compared to a traditional image. To transmit and store the huge volume of an HS image, we argue that a fundamental shift is required from the existing “original pixel intensity”-based coding approaches using traditional image coders (e.g., JPEG2000) to the “residual”-based approaches using a video coder for better compression performance. A modified video coder is required to exploit spatial-spectral redundancy using pixel-level reflectance modelling due to the different characteristics of HS images in their spectral and shape domain of panchromatic imagery compared to traditional videos. In this paper a novel coding framework using Reflectance Prediction Modelling (RPM) in the latest video coding standard High Efficiency Video Coding (HEVC) for HS images is proposed. An HS image presents a wealth of data where every pixel is considered a vector for different spectral bands. By quantitative comparison and analysis of pixel vector distribution along spectral bands, we conclude that modelling can predict the distribution and correlation of the pixel vectors for different bands. To exploit distribution of the known pixel vector, we estimate a predicted current spectral band from the previous bands using Gaussian mixture-based modelling. The predicted band is used as the additional reference band together with the immediate previous band when we apply the HEVC. Every spectral band of an HS image is treated like it is an individual frame of a video. In this paper, we compare the proposed method with mainstream encoders. The experimental results are fully justified by three types of HS dataset with different wavelength ranges. The proposed method outperforms the existing mainstream HS encoders in terms of rate-distortion performance of HS image compression. PMID:27695102
A Robust Model-Based Coding Technique for Ultrasound Video
NASA Technical Reports Server (NTRS)
Docef, Alen; Smith, Mark J. T.
1995-01-01
This paper introduces a new approach to coding ultrasound video, the intended application being very low bit rate coding for transmission over low cost phone lines. The method exploits both the characteristic noise and the quasi-periodic nature of the signal. Data compression ratios between 250:1 and 1000:1 are shown to be possible, which is sufficient for transmission over ISDN and conventional phone lines. Preliminary results show this approach to be promising for remote ultrasound examinations.
NASA Astrophysics Data System (ADS)
Kurceren, Ragip; Modestino, James W.
1998-12-01
The use of forward error-control (FEC) coding, possibly in conjunction with ARQ techniques, has emerged as a promising approach for video transport over ATM networks for cell-loss recovery and/or bit error correction, such as might be required for wireless links. Although FEC provides cell-loss recovery capabilities it also introduces transmission overhead which can possibly cause additional cell losses. A methodology is described to maximize the number of video sources multiplexed at a given quality of service (QoS), measured in terms of decoded cell loss probability, using interlaced FEC codes. The transport channel is modelled as a block interference channel (BIC) and the multiplexer as single server, deterministic service, finite buffer supporting N users. Based upon an information-theoretic characterization of the BIC and large deviation bounds on the buffer overflow probability, the described methodology provides theoretically achievable upper limits on the number of sources multiplexed. Performance of specific coding techniques using interlaced nonbinary Reed-Solomon (RS) codes and binary rate-compatible punctured convolutional (RCPC) codes is illustrated.
Facilitation and Teacher Behaviors: An Analysis of Literacy Teachers' Video-Case Discussions
ERIC Educational Resources Information Center
Arya, Poonam; Christ, Tanya; Chiu, Ming Ming
2014-01-01
This study explored how peer and professor facilitations are related to teachers' behaviors during video-case discussions. Fourteen inservice teachers produced 1,787 turns of conversation during 12 video-case discussions that were video-recorded, transcribed, coded, and analyzed with statistical discourse analysis. Professor facilitations (sharing…
NASA Astrophysics Data System (ADS)
Aghamaleki, Javad Abbasi; Behrad, Alireza
2018-01-01
Double compression detection is a crucial stage in digital image and video forensics. However, the detection of double compressed videos is challenging when the video forger uses the same quantization matrix and synchronized group of pictures (GOP) structure during the recompression history to conceal tampering effects. A passive approach is proposed for detecting double compressed MPEG videos with the same quantization matrix and synchronized GOP structure. To devise the proposed algorithm, the effects of recompression on P frames are mathematically studied. Then, based on the obtained guidelines, a feature vector is proposed to detect double compressed frames on the GOP level. Subsequently, sparse representations of the feature vectors are used for dimensionality reduction and enrich the traces of recompression. Finally, a support vector machine classifier is employed to detect and localize double compression in temporal domain. The experimental results show that the proposed algorithm achieves the accuracy of more than 95%. In addition, the comparisons of the results of the proposed method with those of other methods reveal the efficiency of the proposed algorithm.
Dynamic full-scalability conversion in scalable video coding
NASA Astrophysics Data System (ADS)
Lee, Dong Su; Bae, Tae Meon; Thang, Truong Cong; Ro, Yong Man
2007-02-01
For outstanding coding efficiency with scalability functions, SVC (Scalable Video Coding) is being standardized. SVC can support spatial, temporal and SNR scalability and these scalabilities are useful to provide a smooth video streaming service even in a time varying network such as a mobile environment. But current SVC is insufficient to support dynamic video conversion with scalability, thereby the adaptation of bitrate to meet a fluctuating network condition is limited. In this paper, we propose dynamic full-scalability conversion methods for QoS adaptive video streaming in SVC. To accomplish full scalability dynamic conversion, we develop corresponding bitstream extraction, encoding and decoding schemes. At the encoder, we insert the IDR NAL periodically to solve the problems of spatial scalability conversion. At the extractor, we analyze the SVC bitstream to get the information which enable dynamic extraction. Real time extraction is achieved by using this information. Finally, we develop the decoder so that it can manage the changing scalability. Experimental results showed that dynamic full-scalability conversion was verified and it was necessary for time varying network condition.
Mission planning optimization of video satellite for ground multi-object staring imaging
NASA Astrophysics Data System (ADS)
Cui, Kaikai; Xiang, Junhua; Zhang, Yulin
2018-03-01
This study investigates the emergency scheduling problem of ground multi-object staring imaging for a single video satellite. In the proposed mission scenario, the ground objects require a specified duration of staring imaging by the video satellite. The planning horizon is not long, i.e., it is usually shorter than one orbit period. A binary decision variable and the imaging order are used as the design variables, and the total observation revenue combined with the influence of the total attitude maneuvering time is regarded as the optimization objective. Based on the constraints of the observation time windows, satellite attitude adjustment time, and satellite maneuverability, a constraint satisfaction mission planning model is established for ground object staring imaging by a single video satellite. Further, a modified ant colony optimization algorithm with tabu lists (Tabu-ACO) is designed to solve this problem. The proposed algorithm can fully exploit the intelligence and local search ability of ACO. Based on full consideration of the mission characteristics, the design of the tabu lists can reduce the search range of ACO and improve the algorithm efficiency significantly. The simulation results show that the proposed algorithm outperforms the conventional algorithm in terms of optimization performance, and it can obtain satisfactory scheduling results for the mission planning problem.
Quantitative fluorescence angiography for neurosurgical interventions.
Weichelt, Claudia; Duscha, Philipp; Steinmeier, Ralf; Meyer, Tobias; Kuß, Julia; Cimalla, Peter; Kirsch, Matthias; Sobottka, Stephan B; Koch, Edmund; Schackert, Gabriele; Morgenstern, Ute
2013-06-01
Present methods for quantitative measurement of cerebral perfusion during neurosurgical operations require additional technology for measurement, data acquisition, and processing. This study used conventional fluorescence video angiography--as an established method to visualize blood flow in brain vessels--enhanced by a quantifying perfusion software tool. For these purposes, the fluorescence dye indocyanine green is given intravenously, and after activation by a near-infrared light source the fluorescence signal is recorded. Video data are analyzed by software algorithms to allow quantification of the blood flow. Additionally, perfusion is measured intraoperatively by a reference system. Furthermore, comparing reference measurements using a flow phantom were performed to verify the quantitative blood flow results of the software and to validate the software algorithm. Analysis of intraoperative video data provides characteristic biological parameters. These parameters were implemented in the special flow phantom for experimental validation of the developed software algorithms. Furthermore, various factors that influence the determination of perfusion parameters were analyzed by means of mathematical simulation. Comparing patient measurement, phantom experiment, and computer simulation under certain conditions (variable frame rate, vessel diameter, etc.), the results of the software algorithms are within the range of parameter accuracy of the reference methods. Therefore, the software algorithm for calculating cortical perfusion parameters from video data presents a helpful intraoperative tool without complex additional measurement technology.
NASA Astrophysics Data System (ADS)
Bezan, Scott; Shirani, Shahram
2006-12-01
To reliably transmit video over error-prone channels, the data should be both source and channel coded. When multiple channels are available for transmission, the problem extends to that of partitioning the data across these channels. The condition of transmission channels, however, varies with time. Therefore, the error protection added to the data at one instant of time may not be optimal at the next. In this paper, we propose a method for adaptively adding error correction code in a rate-distortion (RD) optimized manner using rate-compatible punctured convolutional codes to an MJPEG2000 constant rate-coded frame of video. We perform an analysis on the rate-distortion tradeoff of each of the coding units (tiles and packets) in each frame and adapt the error correction code assigned to the unit taking into account the bandwidth and error characteristics of the channels. This method is applied to both single and multiple time-varying channel environments. We compare our method with a basic protection method in which data is either not transmitted, transmitted with no protection, or transmitted with a fixed amount of protection. Simulation results show promising performance for our proposed method.
Video streaming with SHVC to HEVC transcoding
NASA Astrophysics Data System (ADS)
Gudumasu, Srinivas; He, Yuwen; Ye, Yan; Xiu, Xiaoyu
2015-09-01
This paper proposes an efficient Scalable High efficiency Video Coding (SHVC) to High Efficiency Video Coding (HEVC) transcoder, which can reduce the transcoding complexity significantly, and provide a desired trade-off between the transcoding complexity and the transcoded video quality. To reduce the transcoding complexity, some of coding information, such as coding unit (CU) depth, prediction mode, merge mode, motion vector information, intra direction information and transform unit (TU) depth information, in the SHVC bitstream are mapped and transcoded to single layer HEVC bitstream. One major difficulty in transcoding arises when trying to reuse the motion information from SHVC bitstream since motion vectors referring to inter-layer reference (ILR) pictures cannot be reused directly in transcoding. Reusing motion information obtained from ILR pictures for those prediction units (PUs) will reduce the complexity of the SHVC transcoder greatly but a significant reduction in the quality of the picture is observed. Pictures corresponding to the intra refresh pictures in the base layer (BL) will be coded as P pictures in enhancement layer (EL) in the SHVC bitstream; and directly reusing the intra information from the BL for transcoding will not get a good coding efficiency. To solve these problems, various transcoding technologies are proposed. The proposed technologies offer different trade-offs between transcoding speed and transcoding quality. They are implemented on the basis of reference software SHM-6.0 and HM-14.0 for the two layer spatial scalability configuration. Simulations show that the proposed SHVC software transcoder reduces the transcoding complexity by up to 98-99% using low complexity transcoding mode when compared with cascaded re-encoding method. The transcoder performance at various bitrates with different transcoding modes are compared in terms of transcoding speed and transcoded video quality.
HEVC for high dynamic range services
NASA Astrophysics Data System (ADS)
Kim, Seung-Hwan; Zhao, Jie; Misra, Kiran; Segall, Andrew
2015-09-01
Displays capable of showing a greater range of luminance values can render content containing high dynamic range information in a way such that the viewers have a more immersive experience. This paper introduces the design aspects of a high dynamic range (HDR) system, and examines the performance of the HDR processing chain in terms of compression efficiency. Specifically it examines the relation between recently introduced Society of Motion Picture and Television Engineers (SMPTE) ST 2084 transfer function and the High Efficiency Video Coding (HEVC) standard. SMPTE ST 2084 is designed to cover the full range of an HDR signal from 0 to 10,000 nits, however in many situations the valid signal range of actual video might be smaller than SMPTE ST 2084 supported range. The above restricted signal range results in restricted range of code values for input video data and adversely impacts compression efficiency. In this paper, we propose a code value remapping method that extends the restricted range code values into the full range code values so that the existing standards such as HEVC may better compress the video content. The paper also identifies related non-normative encoder-only changes that are required for remapping method for a fair comparison with anchor. Results are presented comparing the efficiency of the current approach versus the proposed remapping method for HM-16.2.
NASA Astrophysics Data System (ADS)
Chan, Chia-Hsin; Tu, Chun-Chuan; Tsai, Wen-Jiin
2017-01-01
High efficiency video coding (HEVC) not only improves the coding efficiency drastically compared to the well-known H.264/AVC but also introduces coding tools for parallel processing, one of which is tiles. Tile partitioning is allowed to be arbitrary in HEVC, but how to decide tile boundaries remains an open issue. An adaptive tile boundary (ATB) method is proposed to select a better tile partitioning to improve load balancing (ATB-LoadB) and coding efficiency (ATB-Gain) with a unified scheme. Experimental results show that, compared to ordinary uniform-space partitioning, the proposed ATB can save up to 17.65% of encoding times in parallel encoding scenarios and can reduce up to 0.8% of total bit rates for coding efficiency.
NASA Astrophysics Data System (ADS)
Zhao, Haiwu; Wang, Guozhong; Hou, Gang
2005-07-01
AVS is a new digital audio-video coding standard established by China. AVS will be used in digital TV broadcasting and next general optical disk. AVS adopted many digital audio-video coding techniques developed by Chinese company and universities in recent years, it has very low complexity compared to H.264, and AVS will charge very low royalty fee through one-step license including all AVS tools. So AVS is a good and competitive candidate for Chinese DTV and next generation optical disk. In addition, Chinese government has published a plan for satellite TV signal directly to home(DTH) and a telecommunication satellite named as SINO 2 will be launched in 2006. AVS will be also one of the best hopeful candidates of audio-video coding standard on satellite signal transmission.
Multicore-based 3D-DWT video encoder
NASA Astrophysics Data System (ADS)
Galiano, Vicente; López-Granado, Otoniel; Malumbres, Manuel P.; Migallón, Hector
2013-12-01
Three-dimensional wavelet transform (3D-DWT) encoders are good candidates for applications like professional video editing, video surveillance, multi-spectral satellite imaging, etc. where a frame must be reconstructed as quickly as possible. In this paper, we present a new 3D-DWT video encoder based on a fast run-length coding engine. Furthermore, we present several multicore optimizations to speed-up the 3D-DWT computation. An exhaustive evaluation of the proposed encoder (3D-GOP-RL) has been performed, and we have compared the evaluation results with other video encoders in terms of rate/distortion (R/D), coding/decoding delay, and memory consumption. Results show that the proposed encoder obtains good R/D results for high-resolution video sequences with nearly in-place computation using only the memory needed to store a group of pictures. After applying the multicore optimization strategies over the 3D DWT, the proposed encoder is able to compress a full high-definition video sequence in real-time.
Keyhole imaging method for dynamic objects behind the occlusion area
NASA Astrophysics Data System (ADS)
Hao, Conghui; Chen, Xi; Dong, Liquan; Zhao, Yuejin; Liu, Ming; Kong, Lingqin; Hui, Mei; Liu, Xiaohua; Wu, Hong
2018-01-01
A method of keyhole imaging based on camera array is realized to obtain the video image behind a keyhole in shielded space at a relatively long distance. We get the multi-angle video images by using a 2×2 CCD camera array to take the images behind the keyhole in four directions. The multi-angle video images are saved in the form of frame sequences. This paper presents a method of video frame alignment. In order to remove the non-target area outside the aperture, we use the canny operator and morphological method to realize the edge detection of images and fill the images. The image stitching of four images is accomplished on the basis of the image stitching algorithm of two images. In the image stitching algorithm of two images, the SIFT method is adopted to accomplish the initial matching of images, and then the RANSAC algorithm is applied to eliminate the wrong matching points and to obtain a homography matrix. A method of optimizing transformation matrix is proposed in this paper. Finally, the video image with larger field of view behind the keyhole can be synthesized with image frame sequence in which every single frame is stitched. The results show that the screen of the video is clear and natural, the brightness transition is smooth. There is no obvious artificial stitching marks in the video, and it can be applied in different engineering environment .
Coding and decoding for code division multiple user communication systems
NASA Technical Reports Server (NTRS)
Healy, T. J.
1985-01-01
A new algorithm is introduced which decodes code division multiple user communication signals. The algorithm makes use of the distinctive form or pattern of each signal to separate it from the composite signal created by the multiple users. Although the algorithm is presented in terms of frequency-hopped signals, the actual transmitter modulator can use any of the existing digital modulation techniques. The algorithm is applicable to error-free codes or to codes where controlled interference is permitted. It can be used when block synchronization is assumed, and in some cases when it is not. The paper also discusses briefly some of the codes which can be used in connection with the algorithm, and relates the algorithm to past studies which use other approaches to the same problem.
Subjective evaluation of H.265/HEVC based dynamic adaptive video streaming over HTTP (HEVC-DASH)
NASA Astrophysics Data System (ADS)
Irondi, Iheanyi; Wang, Qi; Grecos, Christos
2015-02-01
The Dynamic Adaptive Streaming over HTTP (DASH) standard is becoming increasingly popular for real-time adaptive HTTP streaming of internet video in response to unstable network conditions. Integration of DASH streaming techniques with the new H.265/HEVC video coding standard is a promising area of research. The performance of HEVC-DASH systems has been previously evaluated by a few researchers using objective metrics, however subjective evaluation would provide a better measure of the user's Quality of Experience (QoE) and overall performance of the system. This paper presents a subjective evaluation of an HEVC-DASH system implemented in a hardware testbed. Previous studies in this area have focused on using the current H.264/AVC (Advanced Video Coding) or H.264/SVC (Scalable Video Coding) codecs and moreover, there has been no established standard test procedure for the subjective evaluation of DASH adaptive streaming. In this paper, we define a test plan for HEVC-DASH with a carefully justified data set employing longer video sequences that would be sufficient to demonstrate the bitrate switching operations in response to various network condition patterns. We evaluate the end user's real-time QoE online by investigating the perceived impact of delay, different packet loss rates, fluctuating bandwidth, and the perceived quality of using different DASH video stream segment sizes on a video streaming session using different video sequences. The Mean Opinion Score (MOS) results give an insight into the performance of the system and expectation of the users. The results from this study show the impact of different network impairments and different video segments on users' QoE and further analysis and study may help in optimizing system performance.
Maximum-likelihood soft-decision decoding of block codes using the A* algorithm
NASA Technical Reports Server (NTRS)
Ekroot, L.; Dolinar, S.
1994-01-01
The A* algorithm finds the path in a finite depth binary tree that optimizes a function. Here, it is applied to maximum-likelihood soft-decision decoding of block codes where the function optimized over the codewords is the likelihood function of the received sequence given each codeword. The algorithm considers codewords one bit at a time, making use of the most reliable received symbols first and pursuing only the partially expanded codewords that might be maximally likely. A version of the A* algorithm for maximum-likelihood decoding of block codes has been implemented for block codes up to 64 bits in length. The efficiency of this algorithm makes simulations of codes up to length 64 feasible. This article details the implementation currently in use, compares the decoding complexity with that of exhaustive search and Viterbi decoding algorithms, and presents performance curves obtained with this implementation of the A* algorithm for several codes.
Hierarchical video summarization
NASA Astrophysics Data System (ADS)
Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.
1998-12-01
We address the problem of key-frame summarization of vide in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
A Method for Automated Detection of Usability Problems from Client User Interface Events
Saadawi, Gilan M.; Legowski, Elizabeth; Medvedeva, Olga; Chavan, Girish; Crowley, Rebecca S.
2005-01-01
Think-aloud usability analysis provides extremely useful data but is very time-consuming and expensive to perform because of the extensive manual video analysis that is required. We describe a simple method for automated detection of usability problems from client user interface events for a developing medical intelligent tutoring system. The method incorporates (1) an agent-based method for communication that funnels all interface events and system responses to a centralized database, (2) a simple schema for representing interface events and higher order subgoals, and (3) an algorithm that reproduces the criteria used for manual coding of usability problems. A correction factor was empirically determining to account for the slower task performance of users when thinking aloud. We tested the validity of the method by simultaneously identifying usability problems using TAU and manually computing them from stored interface event data using the proposed algorithm. All usability problems that did not rely on verbal utterances were detectable with the proposed method. PMID:16779121
Perceptual video quality assessment in H.264 video coding standard using objective modeling.
Karthikeyan, Ramasamy; Sainarayanan, Gopalakrishnan; Deepa, Subramaniam Nachimuthu
2014-01-01
Since usage of digital video is wide spread nowadays, quality considerations have become essential, and industry demand for video quality measurement is rising. This proposal provides a method of perceptual quality assessment in H.264 standard encoder using objective modeling. For this purpose, quality impairments are calculated and a model is developed to compute the perceptual video quality metric based on no reference method. Because of the shuttle difference between the original video and the encoded video the quality of the encoded picture gets degraded, this quality difference is introduced by the encoding process like Intra and Inter prediction. The proposed model takes into account of the artifacts introduced by these spatial and temporal activities in the hybrid block based coding methods and an objective modeling of these artifacts into subjective quality estimation is proposed. The proposed model calculates the objective quality metric using subjective impairments; blockiness, blur and jerkiness compared to the existing bitrate only calculation defined in the ITU G 1070 model. The accuracy of the proposed perceptual video quality metrics is compared against popular full reference objective methods as defined by VQEG.
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kannan, Ramakrishnan; Ballard, Grey; Park, Haesun
Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for the given input matrix A, such that A≈WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets. The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms thatmore » iteratively solves alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans from few hundreds of millions to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.« less
Files synchronization from a large number of insertions and deletions
NASA Astrophysics Data System (ADS)
Ellappan, Vijayan; Kumari, Savera
2017-11-01
Synchronization between different versions of files is becoming a major issue that most of the applications are facing. To make the applications more efficient a economical algorithm is developed from the previously used algorithm of “File Loading Algorithm”. I am extending this algorithm in three ways: First, dealing with non-binary files, Second backup is generated for uploaded files and lastly each files are synchronized with insertions and deletions. User can reconstruct file from the former file with minimizing the error and also provides interactive communication by eliminating the frequency without any disturbance. The drawback of previous system is overcome by using synchronization, in which multiple copies of each file/record is created and stored in backup database and is efficiently restored in case of any unwanted deletion or loss of data. That is, to introduce a protocol that user B may use to reconstruct file X from file Y with suitably low probability of error. Synchronization algorithms find numerous areas of use, including data storage, file sharing, source code control systems, and cloud applications. For example, cloud storage services such as Drop box synchronize between local copies and cloud backups each time users make changes to local versions. Similarly, synchronization tools are necessary in mobile devices. Specialized synchronization algorithms are used for video and sound editing. Synchronization tools are also capable of performing data duplication.
MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization
Kannan, Ramakrishnan; Ballard, Grey; Park, Haesun
2017-10-30
Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors W and H, for the given input matrix A, such that A≈WH. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets. The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms thatmore » iteratively solves alternating non-negative least squares (NLS) subproblems for W and H. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans from few hundreds of millions to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.« less
Interacting with target tracking algorithms in a gaze-enhanced motion video analysis system
NASA Astrophysics Data System (ADS)
Hild, Jutta; Krüger, Wolfgang; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen
2016-05-01
Motion video analysis is a challenging task, particularly if real-time analysis is required. It is therefore an important issue how to provide suitable assistance for the human operator. Given that the use of customized video analysis systems is more and more established, one supporting measure is to provide system functions which perform subtasks of the analysis. Recent progress in the development of automated image exploitation algorithms allow, e.g., real-time moving target tracking. Another supporting measure is to provide a user interface which strives to reduce the perceptual, cognitive and motor load of the human operator for example by incorporating the operator's visual focus of attention. A gaze-enhanced user interface is able to help here. This work extends prior work on automated target recognition, segmentation, and tracking algorithms as well as about the benefits of a gaze-enhanced user interface for interaction with moving targets. We also propose a prototypical system design aiming to combine both the qualities of the human observer's perception and the automated algorithms in order to improve the overall performance of a real-time video analysis system. In this contribution, we address two novel issues analyzing gaze-based interaction with target tracking algorithms. The first issue extends the gaze-based triggering of a target tracking process, e.g., investigating how to best relaunch in the case of track loss. The second issue addresses the initialization of tracking algorithms without motion segmentation where the operator has to provide the system with the object's image region in order to start the tracking algorithm.
Efficient Helicopter Aerodynamic and Aeroacoustic Predictions on Parallel Computers
NASA Technical Reports Server (NTRS)
Wissink, Andrew M.; Lyrintzis, Anastasios S.; Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak
1996-01-01
This paper presents parallel implementations of two codes used in a combined CFD/Kirchhoff methodology to predict the aerodynamics and aeroacoustics properties of helicopters. The rotorcraft Navier-Stokes code, TURNS, computes the aerodynamic flowfield near the helicopter blades and the Kirchhoff acoustics code computes the noise in the far field, using the TURNS solution as input. The overall parallel strategy adds MPI message passing calls to the existing serial codes to allow for communication between processors. As a result, the total code modifications required for parallel execution are relatively small. The biggest bottleneck in running the TURNS code in parallel comes from the LU-SGS algorithm that solves the implicit system of equations. We use a new hybrid domain decomposition implementation of LU-SGS to obtain good parallel performance on the SP-2. TURNS demonstrates excellent parallel speedups for quasi-steady and unsteady three-dimensional calculations of a helicopter blade in forward flight. The execution rate attained by the code on 114 processors is six times faster than the same cases run on one processor of the Cray C-90. The parallel Kirchhoff code also shows excellent parallel speedups and fast execution rates. As a performance demonstration, unsteady acoustic pressures are computed at 1886 far-field observer locations for a sample acoustics problem. The calculation requires over two hundred hours of CPU time on one C-90 processor but takes only a few hours on 80 processors of the SP2. The resultant far-field acoustic field is analyzed with state of-the-art audio and video rendering of the propagating acoustic signals.
Portrayal of Alcohol Brands Popular Among Underage Youth on YouTube: A Content Analysis.
Primack, Brian A; Colditz, Jason B; Rosen, Eva B; Giles, Leila M; Jackson, Kristina M; Kraemer, Kevin L
2017-09-01
We characterized leading YouTube videos featuring alcohol brand references and examined video characteristics associated with each brand and video category. We systematically captured the 137 most relevant and popular videos on YouTube portraying alcohol brands that are popular among underage youth. We used an iterative process to codebook development. We coded variables within domains of video type, character sociodemographics, production quality, and negative and positive associations with alcohol use. All variables were double coded, and Cohen's kappa was greater than .80 for all variables except age, which was eliminated. There were 96,860,936 combined views for all videos. The most common video type was "traditional advertisements," which comprised 40% of videos. Of the videos, 20% were "guides" and 10% focused on chugging a bottle of distilled spirits. While 95% of videos featured males, 40% featured females. Alcohol intoxication was present in 19% of videos. Aggression, addiction, and injuries were uncommonly identified (2%, 3%, and 4%, respectively), but 47% of videos contained humor. Traditional advertisements represented the majority of videos related to Bud Light (83%) but only 18% of Grey Goose and 8% of Hennessy videos. Intoxication was most present in chugging demonstrations (77%), whereas addiction was only portrayed in music videos (22%). Videos containing humor ranged from 11% for music-related videos to 77% for traditional advertisements. YouTube videos depicting the alcohol brands favored by underage youth are heavily viewed, and the majority are traditional or narrative advertisements. Understanding characteristics associated with different brands and video categories may aid in intervention development.
Correlation approach to identify coding regions in DNA sequences
NASA Technical Reports Server (NTRS)
Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1994-01-01
Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
Determining the bias and variance of a deterministic finger-tracking algorithm.
Morash, Valerie S; van der Velden, Bas H M
2016-06-01
Finger tracking has the potential to expand haptic research and applications, as eye tracking has done in vision research. In research applications, it is desirable to know the bias and variance associated with a finger-tracking method. However, assessing the bias and variance of a deterministic method is not straightforward. Multiple measurements of the same finger position data will not produce different results, implying zero variance. Here, we present a method of assessing deterministic finger-tracking variance and bias through comparison to a non-deterministic measure. A proof-of-concept is presented using a video-based finger-tracking algorithm developed for the specific purpose of tracking participant fingers during a psychological research study. The algorithm uses ridge detection on videos of the participant's hand, and estimates the location of the right index fingertip. The algorithm was evaluated using data from four participants, who explored tactile maps using only their right index finger and all right-hand fingers. The algorithm identified the index fingertip in 99.78 % of one-finger video frames and 97.55 % of five-finger video frames. Although the algorithm produced slightly biased and more dispersed estimates relative to a human coder, these differences (x=0.08 cm, y=0.04 cm) and standard deviations (σ x =0.16 cm, σ y =0.21 cm) were small compared to the size of a fingertip (1.5-2.0 cm). Some example finger-tracking results are provided where corrections are made using the bias and variance estimates.
Enhance Video Film using Retnix method
NASA Astrophysics Data System (ADS)
Awad, Rasha; Al-Zuky, Ali A.; Al-Saleh, Anwar H.; Mohamad, Haidar J.
2018-05-01
An enhancement technique used to improve the studied video quality. Algorithms like mean and standard deviation are used as a criterion within this paper, and it applied for each video clip that divided into 80 images. The studied filming environment has different light intensity (315, 566, and 644Lux). This different environment gives similar reality to the outdoor filming. The outputs of the suggested algorithm are compared with the results before applying it. This method is applied into two ways: first, it is applied for the full video clip to get the enhanced film; second, it is applied for every individual image to get the enhanced image then compiler them to get the enhanced film. This paper shows that the enhancement technique gives good quality video film depending on a statistical method, and it is recommended to use it in different application.
Open-source telemedicine platform for wireless medical video communication.
Panayides, A; Eleftheriou, I; Pantziaris, M
2013-01-01
An m-health system for real-time wireless communication of medical video based on open-source software is presented. The objective is to deliver a low-cost telemedicine platform which will allow for reliable remote diagnosis m-health applications such as emergency incidents, mass population screening, and medical education purposes. The performance of the proposed system is demonstrated using five atherosclerotic plaque ultrasound videos. The videos are encoded at the clinically acquired resolution, in addition to lower, QCIF, and CIF resolutions, at different bitrates, and four different encoding structures. Commercially available wireless local area network (WLAN) and 3.5G high-speed packet access (HSPA) wireless channels are used to validate the developed platform. Objective video quality assessment is based on PSNR ratings, following calibration using the variable frame delay (VFD) algorithm that removes temporal mismatch between original and received videos. Clinical evaluation is based on atherosclerotic plaque ultrasound video assessment protocol. Experimental results show that adequate diagnostic quality wireless medical video communications are realized using the designed telemedicine platform. HSPA cellular networks provide for ultrasound video transmission at the acquired resolution, while VFD algorithm utilization bridges objective and subjective ratings.
Open-Source Telemedicine Platform for Wireless Medical Video Communication
Panayides, A.; Eleftheriou, I.; Pantziaris, M.
2013-01-01
An m-health system for real-time wireless communication of medical video based on open-source software is presented. The objective is to deliver a low-cost telemedicine platform which will allow for reliable remote diagnosis m-health applications such as emergency incidents, mass population screening, and medical education purposes. The performance of the proposed system is demonstrated using five atherosclerotic plaque ultrasound videos. The videos are encoded at the clinically acquired resolution, in addition to lower, QCIF, and CIF resolutions, at different bitrates, and four different encoding structures. Commercially available wireless local area network (WLAN) and 3.5G high-speed packet access (HSPA) wireless channels are used to validate the developed platform. Objective video quality assessment is based on PSNR ratings, following calibration using the variable frame delay (VFD) algorithm that removes temporal mismatch between original and received videos. Clinical evaluation is based on atherosclerotic plaque ultrasound video assessment protocol. Experimental results show that adequate diagnostic quality wireless medical video communications are realized using the designed telemedicine platform. HSPA cellular networks provide for ultrasound video transmission at the acquired resolution, while VFD algorithm utilization bridges objective and subjective ratings. PMID:23573082
MATIN: a random network coding based framework for high quality peer-to-peer live video streaming.
Barekatain, Behrang; Khezrimotlagh, Dariush; Aizaini Maarof, Mohd; Ghaeini, Hamid Reza; Salleh, Shaharuddin; Quintana, Alfonso Ariza; Akbari, Behzad; Cabrera, Alicia Triviño
2013-01-01
In recent years, Random Network Coding (RNC) has emerged as a promising solution for efficient Peer-to-Peer (P2P) video multicasting over the Internet. This probably refers to this fact that RNC noticeably increases the error resiliency and throughput of the network. However, high transmission overhead arising from sending large coefficients vector as header has been the most important challenge of the RNC. Moreover, due to employing the Gauss-Jordan elimination method, considerable computational complexity can be imposed on peers in decoding the encoded blocks and checking linear dependency among the coefficients vectors. In order to address these challenges, this study introduces MATIN which is a random network coding based framework for efficient P2P video streaming. The MATIN includes a novel coefficients matrix generation method so that there is no linear dependency in the generated coefficients matrix. Using the proposed framework, each peer encapsulates one instead of n coefficients entries into the generated encoded packet which results in very low transmission overhead. It is also possible to obtain the inverted coefficients matrix using a bit number of simple arithmetic operations. In this regard, peers sustain very low computational complexities. As a result, the MATIN permits random network coding to be more efficient in P2P video streaming systems. The results obtained from simulation using OMNET++ show that it substantially outperforms the RNC which uses the Gauss-Jordan elimination method by providing better video quality on peers in terms of the four important performance metrics including video distortion, dependency distortion, End-to-End delay and Initial Startup delay.
High-throughput sample adaptive offset hardware architecture for high-efficiency video coding
NASA Astrophysics Data System (ADS)
Zhou, Wei; Yan, Chang; Zhang, Jingzhi; Zhou, Xin
2018-03-01
A high-throughput hardware architecture for a sample adaptive offset (SAO) filter in the high-efficiency video coding video coding standard is presented. First, an implementation-friendly and simplified bitrate estimation method of rate-distortion cost calculation is proposed to reduce the computational complexity in the mode decision of SAO. Then, a high-throughput VLSI architecture for SAO is presented based on the proposed bitrate estimation method. Furthermore, multiparallel VLSI architecture for in-loop filters, which integrates both deblocking filter and SAO filter, is proposed. Six parallel strategies are applied in the proposed in-loop filters architecture to improve the system throughput and filtering speed. Experimental results show that the proposed in-loop filters architecture can achieve up to 48% higher throughput in comparison with prior work. The proposed architecture can reach a high-operating clock frequency of 297 MHz with TSMC 65-nm library and meet the real-time requirement of the in-loop filters for 8 K × 4 K video format at 132 fps.
Detection and Tracking of Moving Objects with Real-Time Onboard Vision System
NASA Astrophysics Data System (ADS)
Erokhin, D. Y.; Feldman, A. B.; Korepanov, S. E.
2017-05-01
Detection of moving objects in video sequence received from moving video sensor is a one of the most important problem in computer vision. The main purpose of this work is developing set of algorithms, which can detect and track moving objects in real time computer vision system. This set includes three main parts: the algorithm for estimation and compensation of geometric transformations of images, an algorithm for detection of moving objects, an algorithm to tracking of the detected objects and prediction their position. The results can be claimed to create onboard vision systems of aircraft, including those relating to small and unmanned aircraft.
Change Detection Algorithms for Surveillance in Visual IoT: A Comparative Study
NASA Astrophysics Data System (ADS)
Akram, Beenish Ayesha; Zafar, Amna; Akbar, Ali Hammad; Wajid, Bilal; Chaudhry, Shafique Ahmad
2018-01-01
The VIoT (Visual Internet of Things) connects virtual information world with real world objects using sensors and pervasive computing. For video surveillance in VIoT, ChD (Change Detection) is a critical component. ChD algorithms identify regions of change in multiple images of the same scene recorded at different time intervals for video surveillance. This paper presents performance comparison of histogram thresholding and classification ChD algorithms using quantitative measures for video surveillance in VIoT based on salient features of datasets. The thresholding algorithms Otsu, Kapur, Rosin and classification methods k-means, EM (Expectation Maximization) were simulated in MATLAB using diverse datasets. For performance evaluation, the quantitative measures used include OSR (Overall Success Rate), YC (Yule's Coefficient) and JC (Jaccard's Coefficient), execution time and memory consumption. Experimental results showed that Kapur's algorithm performed better for both indoor and outdoor environments with illumination changes, shadowing and medium to fast moving objects. However, it reflected degraded performance for small object size with minor changes. Otsu algorithm showed better results for indoor environments with slow to medium changes and nomadic object mobility. k-means showed good results in indoor environment with small object size producing slow change, no shadowing and scarce illumination changes.
Flexible and evolutionary optical access networks
NASA Astrophysics Data System (ADS)
Hsueh, Yu-Li
Passive optical networks (PONs) are promising solutions that will open the first-mile bottleneck. Current PONs employ time division multiplexing (TDM) to share bandwidth among users, leading to low cost but limited capacity. In the future, wavelength division multiplexing (WDM) technologies will be deployed to achieve high performance. This dissertation describes several advanced technologies to enhance PON systems. A spectral shaping line coding scheme is developed to allow a simple and cost-effective overlay of high data-rate services in existing PONs, leaving field-deployed fibers and existing services untouched. Spectral shapes of coded signals can be manipulated to adapt to different systems. For a specific tolerable interference level, the optimal line code can be found which maximizes the data throughput. Experiments are conducted to demonstrate and compare several optimized line codes. A novel PON employing dynamic wavelength allocation to provide bandwidth sharing across multiple physical PONs is designed and experimentally demonstrated. Tunable lasers, arrayed waveguide gratings, and coarse/fine filtering combine to create a flexible optical access solution. The network's excellent scalability can bridge the gap between conventional TDM PONs and WDM PONs. Scheduling algorithms with quality of service support are also investigated. Simulation results show that the proposed architecture exhibits significant performance gain over conventional PON systems. Streaming video transmission is demonstrated on the prototype experimental testbed. The powerful architecture is a promising candidate for next-generation optical access networks. A new CDR circuit for receiving the bursty traffic in PONs is designed and analyzed. It detects data transition edges upon arrival of the data burst and quickly selects the best clock phase by a control logic circuit. Then, an analog delay-locked loop (DLL) keeps track of data transitions and removes phase errors throughout the burst. The combination of the fast phase detection mechanism and a feedback loop based on DLL allows both fast response and manageable jitter performance in the burst-mode application. A new efficient numerical algorithm is developed to analyze holey optical fibers. The algorithm has been verified against experimental data, and is exploited to design holey optical fibers optimized for the discrete Raman amplification.
Rehm, K; Seeley, G W; Dallas, W J; Ovitt, T W; Seeger, J F
1990-01-01
One of the goals of our research in the field of digital radiography has been to develop contrast-enhancement algorithms for eventual use in the display of chest images on video devices with the aim of preserving the diagnostic information presently available with film, some of which would normally be lost because of the smaller dynamic range of video monitors. The ASAHE algorithm discussed in this article has been tested by investigating observer performance in a difficult detection task involving phantoms and simulated lung nodules, using film as the output medium. The results of the experiment showed that the algorithm is successful in providing contrast-enhanced, natural-looking chest images while maintaining diagnostic information. The algorithm did not effect an increase in nodule detectability, but this was not unexpected because film is a medium capable of displaying a wide range of gray levels. It is sufficient at this stage to show that there is no degradation in observer performance. Future tests will evaluate the performance of the ASAHE algorithm in preparing chest images for video display.
ERIC Educational Resources Information Center
Arya, Poonam; Christ, Tanya; Chiu, Ming
2015-01-01
This study examined how characteristics of Collaborative Peer Video Analysis (CPVA) events are related to teachers' pedagogical outcomes. Data included 39 transcribed literacy video events, in which 14 in-service teachers engaged in discussions of their video clips. Emergent coding and Statistical Discourse Analysis were used to analyze the data.…
Gallium arsenide processing elements for motion estimation full-search algorithm
NASA Astrophysics Data System (ADS)
Lopez, Jose F.; Cortes, P.; Lopez, S.; Sarmiento, Roberto
2001-11-01
The Block-Matching motion estimation algorithm (BMA) is the most popular method for motion-compensated coding of image sequence. Among the several possible searching methods to compute this algorithm, the full-search BMA (FBMA) has obtained great interest from the scientific community due to its regularity, optimal solution and low control overhead which simplifies its VLSI realization. On the other hand, its main drawback is the demand of an enormous amount of computation. There are different ways of overcoming this factor, being the use of advanced technologies, such as Gallium Arsenide (GaAs), the one adopted in this article together with different techniques to reduce area overhead. By exploiting GaAs properties, improvements can be obtained in the implementation of feasible systems for real time video compression architectures. Different primitives used in the implementation of processing elements (PE) for a FBMA scheme are presented. As a result, Pes running at 270 MHz have been developed in order to study its functionality and performance. From these results, an implementation for MPEG applications is proposed, leading to an architecture running at 145 MHz with a power dissipation of 3.48 W and an area of 11.5 mm2.
ViBe: a universal background subtraction algorithm for video sequences.
Barnich, Olivier; Van Droogenbroeck, Marc
2011-06-01
This paper presents a technique for motion detection that incorporates several innovative mechanisms. For example, our proposed technique stores, for each pixel, a set of values taken in the past at the same location or in the neighborhood. It then compares this set to the current pixel value in order to determine whether that pixel belongs to the background, and adapts the model by choosing randomly which values to substitute from the background model. This approach differs from those based upon the classical belief that the oldest values should be replaced first. Finally, when the pixel is found to be part of the background, its value is propagated into the background model of a neighboring pixel. We describe our method in full details (including pseudo-code and the parameter values used) and compare it to other background subtraction techniques. Efficiency figures show that our method outperforms recent and proven state-of-the-art methods in terms of both computation speed and detection rate. We also analyze the performance of a downscaled version of our algorithm to the absolute minimum of one comparison and one byte of memory per pixel. It appears that even such a simplified version of our algorithm performs better than mainstream techniques.
Optimal power allocation and joint source-channel coding for wireless DS-CDMA visual sensor networks
NASA Astrophysics Data System (ADS)
Pandremmenou, Katerina; Kondi, Lisimachos P.; Parsopoulos, Konstantinos E.
2011-01-01
In this paper, we propose a scheme for the optimal allocation of power, source coding rate, and channel coding rate for each of the nodes of a wireless Direct Sequence Code Division Multiple Access (DS-CDMA) visual sensor network. The optimization is quality-driven, i.e. the received quality of the video that is transmitted by the nodes is optimized. The scheme takes into account the fact that the sensor nodes may be imaging scenes with varying levels of motion. Nodes that image low-motion scenes will require a lower source coding rate, so they will be able to allocate a greater portion of the total available bit rate to channel coding. Stronger channel coding will mean that such nodes will be able to transmit at lower power. This will both increase battery life and reduce interference to other nodes. Two optimization criteria are considered. One that minimizes the average video distortion of the nodes and one that minimizes the maximum distortion among the nodes. The transmission powers are allowed to take continuous values, whereas the source and channel coding rates can assume only discrete values. Thus, the resulting optimization problem lies in the field of mixed-integer optimization tasks and is solved using Particle Swarm Optimization. Our experimental results show the importance of considering the characteristics of the video sequences when determining the transmission power, source coding rate and channel coding rate for the nodes of the visual sensor network.
NASA Astrophysics Data System (ADS)
Jackson, Christopher Robert
"Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm selects sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient sequential processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors. This thesis describes two hardware implementations of the LRF algorithm to achieve real-time image processing. The first was created with a VIRTEX-7 field programmable gate array (FPGA). The other developed using the graphics processing unit (GPU) of a NVIDIA GeForce GTX 690 video card. The novelty in the FPGA approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link video output. We also describe a custom hardware simulation environment we have built to test the FPGA LRF implementation. The advantage of the GPU approach is significantly improved development time, integration of image stabilization into the system, and comparable atmospheric turbulence mitigation.
An algorithm for power line detection and warning based on a millimeter-wave radar video.
Ma, Qirong; Goshi, Darren S; Shih, Yi-Chi; Sun, Ming-Ting
2011-12-01
Power-line-strike accident is a major safety threat for low-flying aircrafts such as helicopters, thus an automatic warning system to power lines is highly desirable. In this paper we propose an algorithm for detecting power lines from radar videos from an active millimeter-wave sensor. Hough Transform is employed to detect candidate lines. The major challenge is that the radar videos are very noisy due to ground return. The noise points could fall on the same line which results in signal peaks after Hough Transform similar to the actual cable lines. To differentiate the cable lines from the noise lines, we train a Support Vector Machine to perform the classification. We exploit the Bragg pattern, which is due to the diffraction of electromagnetic wave on the periodic surface of power lines. We propose a set of features to represent the Bragg pattern for the classifier. We also propose a slice-processing algorithm which supports parallel processing, and improves the detection of cables in a cluttered background. Lastly, an adaptive algorithm is proposed to integrate the detection results from individual frames into a reliable video detection decision, in which temporal correlation of the cable pattern across frames is used to make the detection more robust. Extensive experiments with real-world data validated the effectiveness of our cable detection algorithm. © 2011 IEEE
VLSI design of lossless frame recompression using multi-orientation prediction
NASA Astrophysics Data System (ADS)
Lee, Yu-Hsuan; You, Yi-Lun; Chen, Yi-Guo
2016-01-01
Pursuing an experience of high-end visual quality drives human to demand a higher display resolution and a higher frame rate. Hence, a lot of powerful coding tools are aggregated together in emerging video coding standards to improve coding efficiency. This also makes video coding standards suffer from two design challenges: heavy computation and tremendous memory bandwidth. The first issue can be properly solved by a careful hardware architecture design with advanced semiconductor processes. Nevertheless, the second one becomes a critical design bottleneck for a modern video coding system. In this article, a lossless frame recompression using multi-orientation prediction technique is proposed to overcome this bottleneck. This work is realised into a silicon chip with the technology of TSMC 0.18 µm CMOS process. Its encoding capability can reach full-HD (1920 × 1080)@48 fps. The chip power consumption is 17.31 mW@100 MHz. Core area and chip area are 0.83 × 0.83 mm2 and 1.20 × 1.20 mm2, respectively. Experiment results demonstrate that this work exhibits an outstanding performance on lossless compression ratio with a competitive hardware performance.
Coding tools investigation for next generation video coding based on HEVC
NASA Astrophysics Data System (ADS)
Chen, Jianle; Chen, Ying; Karczewicz, Marta; Li, Xiang; Liu, Hongbin; Zhang, Li; Zhao, Xin
2015-09-01
The new state-of-the-art video coding standard, H.265/HEVC, has been finalized in 2013 and it achieves roughly 50% bit rate saving compared to its predecessor, H.264/MPEG-4 AVC. This paper provides the evidence that there is still potential for further coding efficiency improvements. A brief overview of HEVC is firstly given in the paper. Then, our improvements on each main module of HEVC are presented. For instance, the recursive quadtree block structure is extended to support larger coding unit and transform unit. The motion information prediction scheme is improved by advanced temporal motion vector prediction, which inherits the motion information of each small block within a large block from a temporal reference picture. Cross component prediction with linear prediction model improves intra prediction and overlapped block motion compensation improves the efficiency of inter prediction. Furthermore, coding of both intra and inter prediction residual is improved by adaptive multiple transform technique. Finally, in addition to deblocking filter and SAO, adaptive loop filter is applied to further enhance the reconstructed picture quality. This paper describes above-mentioned techniques in detail and evaluates their coding performance benefits based on the common test condition during HEVC development. The simulation results show that significant performance improvement over HEVC standard can be achieved, especially for the high resolution video materials.
Jung, Jaehoon; Yoon, Inhye; Paik, Joonki
2016-01-01
This paper presents an object occlusion detection algorithm using object depth information that is estimated by automatic camera calibration. The object occlusion problem is a major factor to degrade the performance of object tracking and recognition. To detect an object occlusion, the proposed algorithm consists of three steps: (i) automatic camera calibration using both moving objects and a background structure; (ii) object depth estimation; and (iii) detection of occluded regions. The proposed algorithm estimates the depth of the object without extra sensors but with a generic red, green and blue (RGB) camera. As a result, the proposed algorithm can be applied to improve the performance of object tracking and object recognition algorithms for video surveillance systems. PMID:27347978
Cooperative optimization and their application in LDPC codes
NASA Astrophysics Data System (ADS)
Chen, Ke; Rong, Jian; Zhong, Xiaochun
2008-10-01
Cooperative optimization is a new way for finding global optima of complicated functions of many variables. The proposed algorithm is a class of message passing algorithms and has solid theory foundations. It can achieve good coding gains over the sum-product algorithm for LDPC codes. For (6561, 4096) LDPC codes, the proposed algorithm can achieve 2.0 dB gains over the sum-product algorithm at BER of 4×10-7. The decoding complexity of the proposed algorithm is lower than the sum-product algorithm can do; furthermore, the former can achieve much lower error floor than the latter can do after the Eb / No is higher than 1.8 dB.
Joint Attributes and Event Analysis for Multimedia Event Detection.
Ma, Zhigang; Chang, Xiaojun; Xu, Zhongwen; Sebe, Nicu; Hauptmann, Alexander G
2017-06-15
Semantic attributes have been increasingly used the past few years for multimedia event detection (MED) with promising results. The motivation is that multimedia events generally consist of lower level components such as objects, scenes, and actions. By characterizing multimedia event videos with semantic attributes, one could exploit more informative cues for improved detection results. Much existing work obtains semantic attributes from images, which may be suboptimal for video analysis since these image-inferred attributes do not carry dynamic information that is essential for videos. To address this issue, we propose to learn semantic attributes from external videos using their semantic labels. We name them video attributes in this paper. In contrast with multimedia event videos, these external videos depict lower level contents such as objects, scenes, and actions. To harness video attributes, we propose an algorithm established on a correlation vector that correlates them to a target event. Consequently, we could incorporate video attributes latently as extra information into the event detector learnt from multimedia event videos in a joint framework. To validate our method, we perform experiments on the real-world large-scale TRECVID MED 2013 and 2014 data sets and compare our method with several state-of-the-art algorithms. The experiments show that our method is advantageous for MED.
Mat Kiah, M L; Al-Bakri, S H; Zaidan, A A; Zaidan, B B; Hussain, Muzammil
2014-10-01
One of the applications of modern technology in telemedicine is video conferencing. An alternative to traveling to attend a conference or meeting, video conferencing is becoming increasingly popular among hospitals. By using this technology, doctors can help patients who are unable to physically visit hospitals. Video conferencing particularly benefits patients from rural areas, where good doctors are not always available. Telemedicine has proven to be a blessing to patients who have no access to the best treatment. A telemedicine system consists of customized hardware and software at two locations, namely, at the patient's and the doctor's end. In such cases, the video streams of the conferencing parties may contain highly sensitive information. Thus, real-time data security is one of the most important requirements when designing video conferencing systems. This study proposes a secure framework for video conferencing systems and a complete management solution for secure video conferencing groups. Java Media Framework Application Programming Interface classes are used to design and test the proposed secure framework. Real-time Transport Protocol over User Datagram Protocol is used to transmit the encrypted audio and video streams, and RSA and AES algorithms are used to provide the required security services. Results show that the encryption algorithm insignificantly increases the video conferencing computation time.
"ON ALGEBRAIC DECODING OF Q-ARY REED-MULLER AND PRODUCT REED-SOLOMON CODES"
DOE Office of Scientific and Technical Information (OSTI.GOV)
SANTHI, NANDAKISHORE
We consider a list decoding algorithm recently proposed by Pellikaan-Wu for q-ary Reed-Muller codes RM{sub q}({ell}, m, n) of length n {le} q{sup m} when {ell} {le} q. A simple and easily accessible correctness proof is given which shows that this algorithm achieves a relative error-correction radius of {tau} {le} (1-{radical}{ell}q{sup m-1}/n). This is an improvement over the proof using one-point Algebraic-Geometric decoding method given in. The described algorithm can be adapted to decode product Reed-Solomon codes. We then propose a new low complexity recursive aJgebraic decoding algorithm for product Reed-Solomon codes and Reed-Muller codes. This algorithm achieves a relativemore » error correction radius of {tau} {le} {Pi}{sub i=1}{sup m} (1 - {radical}k{sub i}/q). This algorithm is then proved to outperform the Pellikaan-Wu algorithm in both complexity and error correction radius over a wide range of code rates.« less
DCT based interpolation filter for motion compensation in HEVC
NASA Astrophysics Data System (ADS)
Alshin, Alexander; Alshina, Elena; Park, Jeong Hoon; Han, Woo-Jin
2012-10-01
High Efficiency Video Coding (HEVC) draft standard has a challenging goal to improve coding efficiency twice compare to H.264/AVC. Many aspects of the traditional hybrid coding framework were improved during new standard development. Motion compensated prediction, in particular the interpolation filter, is one area that was improved significantly over H.264/AVC. This paper presents the details of the interpolation filter design of the draft HEVC standard. The coding efficiency improvements over H.264/AVC interpolation filter is studied and experimental results are presented, which show a 4.0% average bitrate reduction for Luma component and 11.3% average bitrate reduction for Chroma component. The coding efficiency gains are significant for some video sequences and can reach up 21.7%.
Dynamic code block size for JPEG 2000
NASA Astrophysics Data System (ADS)
Tsai, Ping-Sing; LeCornec, Yann
2008-02-01
Since the standardization of the JPEG 2000, it has found its way into many different applications such as DICOM (digital imaging and communication in medicine), satellite photography, military surveillance, digital cinema initiative, professional video cameras, and so on. The unified framework of the JPEG 2000 architecture makes practical high quality real-time compression possible even in video mode, i.e. motion JPEG 2000. In this paper, we present a study of the compression impact using dynamic code block size instead of fixed code block size as specified in the JPEG 2000 standard. The simulation results show that there is no significant impact on compression if dynamic code block sizes are used. In this study, we also unveil the advantages of using dynamic code block sizes.
A fuzzy measure approach to motion frame analysis for scene detection. M.S. Thesis - Houston Univ.
NASA Technical Reports Server (NTRS)
Leigh, Albert B.; Pal, Sankar K.
1992-01-01
This paper addresses a solution to the problem of scene estimation of motion video data in the fuzzy set theoretic framework. Using fuzzy image feature extractors, a new algorithm is developed to compute the change of information in each of two successive frames to classify scenes. This classification process of raw input visual data can be used to establish structure for correlation. The algorithm attempts to fulfill the need for nonlinear, frame-accurate access to video data for applications such as video editing and visual document archival/retrieval systems in multimedia environments.
Digital CODEC for real-time processing of broadcast quality video signals at 1.8 bits/pixel
NASA Technical Reports Server (NTRS)
Shalkhauser, Mary JO; Whyte, Wayne A., Jr.
1989-01-01
Advances in very large-scale integration and recent work in the field of bandwidth efficient digital modulation techniques have combined to make digital video processing technically feasible and potentially cost competitive for broadcast quality television transmission. A hardware implementation was developed for a DPCM-based digital television bandwidth compression algorithm which processes standard NTSC composite color television signals and produces broadcast quality video in real time at an average of 1.8 bits/pixel. The data compression algorithm and the hardware implementation of the CODEC are described, and performance results are provided.
A video coding scheme based on joint spatiotemporal and adaptive prediction.
Jiang, Wenfei; Latecki, Longin Jan; Liu, Wenyu; Liang, Hui; Gorman, Ken
2009-05-01
We propose a video coding scheme that departs from traditional Motion Estimation/DCT frameworks and instead uses Karhunen-Loeve Transform (KLT)/Joint Spatiotemporal Prediction framework. In particular, a novel approach that performs joint spatial and temporal prediction simultaneously is introduced. It bypasses the complex H.26x interframe techniques and it is less computationally intensive. Because of the advantage of the effective joint prediction and the image-dependent color space transformation (KLT), the proposed approach is demonstrated experimentally to consistently lead to improved video quality, and in many cases to better compression rates and improved computational speed.
Priority-based methods for reducing the impact of packet loss on HEVC encoded video streams
NASA Astrophysics Data System (ADS)
Nightingale, James; Wang, Qi; Grecos, Christos
2013-02-01
The rapid growth in the use of video streaming over IP networks has outstripped the rate at which new network infrastructure has been deployed. These bandwidth-hungry applications now comprise a significant part of all Internet traffic and present major challenges for network service providers. The situation is more acute in mobile networks where the available bandwidth is often limited. Work towards the standardisation of High Efficiency Video Coding (HEVC), the next generation video coding scheme, is currently on track for completion in 2013. HEVC offers the prospect of a 50% improvement in compression over the current H.264 Advanced Video Coding standard (H.264/AVC) for the same quality. However, there has been very little published research on HEVC streaming or the challenges of delivering HEVC streams in resource-constrained network environments. In this paper we consider the problem of adapting an HEVC encoded video stream to meet the bandwidth limitation in a mobile networks environment. Video sequences were encoded using the Test Model under Consideration (TMuC HM6) for HEVC. Network abstraction layers (NAL) units were packetized, on a one NAL unit per RTP packet basis, and transmitted over a realistic hybrid wired/wireless testbed configured with dynamically changing network path conditions and multiple independent network paths from the streamer to the client. Two different schemes for the prioritisation of RTP packets, based on the NAL units they contain, have been implemented and empirically compared using a range of video sequences, encoder configurations, bandwidths and network topologies. In the first prioritisation method the importance of an RTP packet was determined by the type of picture and the temporal switching point information carried in the NAL unit header. Packets containing parameter set NAL units and video coding layer (VCL) NAL units of the instantaneous decoder refresh (IDR) and the clean random access (CRA) pictures were given the highest priority followed by NAL units containing pictures used as reference pictures from which others can be predicted. The second method assigned a priority to each NAL unit based on the rate-distortion cost of the VCL coding units contained in the NAL unit. The sum of the rate-distortion costs of each coding unit contained in a NAL unit was used as the priority weighting. The preliminary results of extensive experiments have shown that all three schemes offered an improvement in PSNR, when comparing original and decoded received streams, over uncontrolled packet loss. Using the first method consistently delivered a significant average improvement of 0.97dB over the uncontrolled scenario while the second method provided a measurable, but less consistent, improvement across the range of testing conditions and encoder configurations.
DOT National Transportation Integrated Search
2012-10-01
In this report we present a transportation video coding and wireless transmission system specically tailored to automated : vehicle tracking applications. By taking into account the video characteristics and the lossy nature of the wireless channe...
Robust Video Stabilization Using Particle Keypoint Update and l1-Optimized Camera Path
Jeon, Semi; Yoon, Inhye; Jang, Jinbeum; Yang, Seungji; Kim, Jisung; Paik, Joonki
2017-01-01
Acquisition of stabilized video is an important issue for various type of digital cameras. This paper presents an adaptive camera path estimation method using robust feature detection to remove shaky artifacts in a video. The proposed algorithm consists of three steps: (i) robust feature detection using particle keypoints between adjacent frames; (ii) camera path estimation and smoothing; and (iii) rendering to reconstruct a stabilized video. As a result, the proposed algorithm can estimate the optimal homography by redefining important feature points in the flat region using particle keypoints. In addition, stabilized frames with less holes can be generated from the optimal, adaptive camera path that minimizes a temporal total variation (TV). The proposed video stabilization method is suitable for enhancing the visual quality for various portable cameras and can be applied to robot vision, driving assistant systems, and visual surveillance systems. PMID:28208622
Cardiac ultrasonography over 4G wireless networks using a tele-operated robot
Panayides, Andreas S.; Jossif, Antonis P.; Christoforou, Eftychios G.; Vieyres, Pierre; Novales, Cyril; Voskarides, Sotos; Pattichis, Constantinos S.
2016-01-01
This Letter proposes an end-to-end mobile tele-echography platform using a portable robot for remote cardiac ultrasonography. Performance evaluation investigates the capacity of long-term evolution (LTE) wireless networks to facilitate responsive robot tele-manipulation and real-time ultrasound video streaming that qualifies for clinical practice. Within this context, a thorough video coding standards comparison for cardiac ultrasound applications is performed, using a data set of ten ultrasound videos. Both objective and subjective (clinical) video quality assessment demonstrate that H.264/AVC and high efficiency video coding standards can achieve diagnostically-lossless video quality at bitrates well within the LTE supported data rates. Most importantly, reduced latencies experienced throughout the live tele-echography sessions allow the medical expert to remotely operate the robot in a responsive manner, using the wirelessly communicated cardiac ultrasound video to reach a diagnosis. Based on preliminary results documented in this Letter, the proposed robotised tele-echography platform can provide for reliable, remote diagnosis, achieving comparable quality of experience levels with in-hospital ultrasound examinations. PMID:27733929
Method and system for efficient video compression with low-complexity encoder
NASA Technical Reports Server (NTRS)
Chen, Jun (Inventor); He, Dake (Inventor); Sheinin, Vadim (Inventor); Jagmohan, Ashish (Inventor); Lu, Ligang (Inventor)
2012-01-01
Disclosed are a method and system for video compression, wherein the video encoder has low computational complexity and high compression efficiency. The disclosed system comprises a video encoder and a video decoder, wherein the method for encoding includes the steps of converting a source frame into a space-frequency representation; estimating conditional statistics of at least one vector of space-frequency coefficients; estimating encoding rates based on the said conditional statistics; and applying Slepian-Wolf codes with the said computed encoding rates. The preferred method for decoding includes the steps of; generating a side-information vector of frequency coefficients based on previously decoded source data, encoder statistics, and previous reconstructions of the source frequency vector; and performing Slepian-Wolf decoding of at least one source frequency vector based on the generated side-information, the Slepian-Wolf code bits and the encoder statistics.
A review of predictive coding algorithms.
Spratling, M W
2017-03-01
Predictive coding is a leading theory of how the brain performs probabilistic inference. However, there are a number of distinct algorithms which are described by the term "predictive coding". This article provides a concise review of these different predictive coding algorithms, highlighting their similarities and differences. Five algorithms are covered: linear predictive coding which has a long and influential history in the signal processing literature; the first neuroscience-related application of predictive coding to explaining the function of the retina; and three versions of predictive coding that have been proposed to model cortical function. While all these algorithms aim to fit a generative model to sensory data, they differ in the type of generative model they employ, in the process used to optimise the fit between the model and sensory data, and in the way that they are related to neurobiology. Copyright © 2016 Elsevier Inc. All rights reserved.
Tsapatsoulis, Nicolas; Loizou, Christos; Pattichis, Constantinos
2007-01-01
Efficient medical video transmission over 3G wireless is of great importance for fast diagnosis and on site medical staff training purposes. In this paper we present a region of interest based ultrasound video compression study which shows that significant reduction of the required, for transmission, bit rate can be achieved without altering the design of existing video codecs. Simple preprocessing of the original videos to define visually and clinically important areas is the only requirement.
Billing code algorithms to identify cases of peripheral artery disease from administrative data
Fan, Jin; Arruda-Olson, Adelaide M; Leibson, Cynthia L; Smith, Carin; Liu, Guanghui; Bailey, Kent R; Kullo, Iftikhar J
2013-01-01
Objective To construct and validate billing code algorithms for identifying patients with peripheral arterial disease (PAD). Methods We extracted all encounters and line item details including PAD-related billing codes at Mayo Clinic Rochester, Minnesota, between July 1, 1997 and June 30, 2008; 22 712 patients evaluated in the vascular laboratory were divided into training and validation sets. Multiple logistic regression analysis was used to create an integer code score from the training dataset, and this was tested in the validation set. We applied a model-based code algorithm to patients evaluated in the vascular laboratory and compared this with a simpler algorithm (presence of at least one of the ICD-9 PAD codes 440.20–440.29). We also applied both algorithms to a community-based sample (n=4420), followed by a manual review. Results The logistic regression model performed well in both training and validation datasets (c statistic=0.91). In patients evaluated in the vascular laboratory, the model-based code algorithm provided better negative predictive value. The simpler algorithm was reasonably accurate for identification of PAD status, with lesser sensitivity and greater specificity. In the community-based sample, the sensitivity (38.7% vs 68.0%) of the simpler algorithm was much lower, whereas the specificity (92.0% vs 87.6%) was higher than the model-based algorithm. Conclusions A model-based billing code algorithm had reasonable accuracy in identifying PAD cases from the community, and in patients referred to the non-invasive vascular laboratory. The simpler algorithm had reasonable accuracy for identification of PAD in patients referred to the vascular laboratory but was significantly less sensitive in a community-based sample. PMID:24166724
Semantic-based surveillance video retrieval.
Hu, Weiming; Xie, Dan; Fu, Zhouyu; Zeng, Wenrong; Maybank, Steve
2007-04-01
Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using the spatial and temporal information, to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries including queries by keywords, multiple object queries, and queries by sketch. For multiple object queries, succession and simultaneity restrictions, together with depth and breadth first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested in a crowded traffic scene.
Gabbett, Tim J
2013-08-01
The physical demands of rugby league, rugby union, and American football are significantly increased through the large number of collisions players are required to perform during match play. Because of the labor-intensive nature of coding collisions from video recordings, manufacturers of wearable microsensor (e.g., global positioning system [GPS]) units have refined the technology to automatically detect collisions, with several sport scientists attempting to use these microsensors to quantify the physical demands of collision sports. However, a question remains over the validity of these microtechnology units to quantify the contact demands of collision sports. Indeed, recent evidence has shown significant differences in the number of "impacts" recorded by microtechnology units (GPSports) and the actual number of collisions coded from video. However, a separate study investigated the validity of a different microtechnology unit (minimaxX; Catapult Sports) that included GPS and triaxial accelerometers, and also a gyroscope and magnetometer, to quantify collisions. Collisions detected by the minimaxX unit were compared with video-based coding of the actual events. No significant differences were detected in the number of mild, moderate, and heavy collisions detected via the minimaxX units and those coded from video recordings of the actual event. Furthermore, a strong correlation (r = 0.96, p < 0.01) was observed between collisions recorded via the minimaxX units and those coded from video recordings of the event. These findings demonstrate that only one commercially available and wearable microtechnology unit (minimaxX) can be considered capable of offering a valid method of quantifying the contact loads that typically occur in collision sports. Until such validation research is completed, sport scientists should be circumspect of the ability of other units to perform similar functions.
NASA Technical Reports Server (NTRS)
2004-01-01
Ever wonder whether a still shot from a home video could serve as a "picture perfect" photograph worthy of being framed and proudly displayed on the mantle? Wonder no more. A critical imaging code used to enhance video footage taken from spaceborne imaging instruments is now available within a portable photography tool capable of producing an optimized, high-resolution image from multiple video frames.
An adaptive enhancement algorithm for infrared video based on modified k-means clustering
NASA Astrophysics Data System (ADS)
Zhang, Linze; Wang, Jingqi; Wu, Wen
2016-09-01
In this paper, we have proposed a video enhancement algorithm to improve the output video of the infrared camera. Sometimes the video obtained by infrared camera is very dark since there is no clear target. In this case, infrared video should be divided into frame images by frame extraction, in order to carry out the image enhancement. For the first frame image, which can be divided into k sub images by using K-means clustering according to the gray interval it occupies before k sub images' histogram equalization according to the amount of information per sub image, we used a method to solve a problem that final cluster centers close to each other in some cases; and for the other frame images, their initial cluster centers can be determined by the final clustering centers of the previous ones, and the histogram equalization of each sub image will be carried out after image segmentation based on K-means clustering. The histogram equalization can make the gray value of the image to the whole gray level, and the gray level of each sub image is determined by the ratio of pixels to a frame image. Experimental results show that this algorithm can improve the contrast of infrared video where night target is not obvious which lead to a dim scene, and reduce the negative effect given by the overexposed pixels adaptively in a certain range.
Greenberg, Jacob K; Ladner, Travis R; Olsen, Margaret A; Shannon, Chevis N; Liu, Jingxia; Yarbrough, Chester K; Piccirillo, Jay F; Wellons, John C; Smyth, Matthew D; Park, Tae Sung; Limbrick, David D
2015-08-01
The use of administrative billing data may enable large-scale assessments of treatment outcomes for Chiari Malformation type I (CM-1). However, to utilize such data sets, validated International Classification of Diseases, Ninth Revision (ICD-9-CM) code algorithms for identifying CM-1 surgery are needed. To validate 2 ICD-9-CM code algorithms identifying patients undergoing CM-1 decompression surgery. We retrospectively analyzed the validity of 2 ICD-9-CM code algorithms for identifying adult CM-1 decompression surgery performed at 2 academic medical centers between 2001 and 2013. Algorithm 1 included any discharge diagnosis code of 348.4 (CM-1), as well as a procedure code of 01.24 (cranial decompression) or 03.09 (spinal decompression, or laminectomy). Algorithm 2 restricted this group to patients with a primary diagnosis of 348.4. The positive predictive value (PPV) and sensitivity of each algorithm were calculated. Among 340 first-time admissions identified by Algorithm 1, the overall PPV for CM-1 decompression was 65%. Among the 214 admissions identified by Algorithm 2, the overall PPV was 99.5%. The PPV for Algorithm 1 was lower in the Vanderbilt (59%) cohort, males (40%), and patients treated between 2009 and 2013 (57%), whereas the PPV of Algorithm 2 remained high (≥99%) across subgroups. The sensitivity of Algorithms 1 (86%) and 2 (83%) were above 75% in all subgroups. ICD-9-CM code Algorithm 2 has excellent PPV and good sensitivity to identify adult CM-1 decompression surgery. These results lay the foundation for studying CM-1 treatment outcomes by using large administrative databases.
A Novel Artificial Intelligence System for Endotracheal Intubation.
Carlson, Jestin N; Das, Samarjit; De la Torre, Fernando; Frisch, Adam; Guyette, Francis X; Hodgins, Jessica K; Yealy, Donald M
2016-01-01
Adequate visualization of the glottic opening is a key factor to successful endotracheal intubation (ETI); however, few objective tools exist to help guide providers' ETI attempts toward the glottic opening in real-time. Machine learning/artificial intelligence has helped to automate the detection of other visual structures but its utility with ETI is unknown. We sought to test the accuracy of various computer algorithms in identifying the glottic opening, creating a tool that could aid successful intubation. We collected a convenience sample of providers who each performed ETI 10 times on a mannequin using a video laryngoscope (C-MAC, Karl Storz Corp, Tuttlingen, Germany). We recorded each attempt and reviewed one-second time intervals for the presence or absence of the glottic opening. Four different machine learning/artificial intelligence algorithms analyzed each attempt and time point: k-nearest neighbor (KNN), support vector machine (SVM), decision trees, and neural networks (NN). We used half of the videos to train the algorithms and the second half to test the accuracy, sensitivity, and specificity of each algorithm. We enrolled seven providers, three Emergency Medicine attendings, and four paramedic students. From the 70 total recorded laryngoscopic video attempts, we created 2,465 time intervals. The algorithms had the following sensitivity and specificity for detecting the glottic opening: KNN (70%, 90%), SVM (70%, 90%), decision trees (68%, 80%), and NN (72%, 78%). Initial efforts at computer algorithms using artificial intelligence are able to identify the glottic opening with over 80% accuracy. With further refinements, video laryngoscopy has the potential to provide real-time, direction feedback to the provider to help guide successful ETI.
The Impact of Video Review on Supervisory Conferencing
ERIC Educational Resources Information Center
Baecher, Laura; McCormack, Bede
2015-01-01
This study investigated how video-based observation may alter the nature of post-observation talk between supervisors and teacher candidates. Audio-recorded post-observation conversations were coded using a conversation analysis framework and interpreted through the lens of interactional sociology. Findings suggest that video-based observations…
Novel modes and adaptive block scanning order for intra prediction in AV1
NASA Astrophysics Data System (ADS)
Hadar, Ofer; Shleifer, Ariel; Mukherjee, Debargha; Joshi, Urvang; Mazar, Itai; Yuzvinsky, Michael; Tavor, Nitzan; Itzhak, Nati; Birman, Raz
2017-09-01
The demand for streaming video content is on the rise and growing exponentially. Networks bandwidth is very costly and therefore there is a constant effort to improve video compression rates and enable the sending of reduced data volumes while retaining quality of experience (QoE). One basic feature that utilizes the spatial correlation of pixels for video compression is Intra-Prediction, which determines the codec's compression efficiency. Intra prediction enables significant reduction of the Intra-Frame (I frame) size and, therefore, contributes to efficient exploitation of bandwidth. In this presentation, we propose new Intra-Prediction algorithms that improve the AV1 prediction model and provide better compression ratios. Two (2) types of methods are considered: )1( New scanning order method that maximizes spatial correlation in order to reduce prediction error; and )2( New Intra-Prediction modes implementation in AVI. Modern video coding standards, including AVI codec, utilize fixed scan orders in processing blocks during intra coding. The fixed scan orders typically result in residual blocks with high prediction error mainly in blocks with edges. This means that the fixed scan orders cannot fully exploit the content-adaptive spatial correlations between adjacent blocks, thus the bitrate after compression tends to be large. To reduce the bitrate induced by inaccurate intra prediction, the proposed approach adaptively chooses the scanning order of blocks according to criteria of firstly predicting blocks with maximum number of surrounding, already Inter-Predicted blocks. Using the modified scanning order method and the new modes has reduced the MSE by up to five (5) times when compared to conventional TM mode / Raster scan and up to two (2) times when compared to conventional CALIC mode / Raster scan, depending on the image characteristics (which determines the percentage of blocks predicted with Inter-Prediction, which in turn impacts the efficiency of the new scanning method). For the same cases, the PSNR was shown to improve by up to 7.4dB and up to 4 dB, respectively. The new modes have yielded 5% improvement in BD-Rate over traditionally used modes, when run on K-Frame, which is expected to yield 1% of overall improvement.
Hyperspectral processing in graphical processing units
NASA Astrophysics Data System (ADS)
Winter, Michael E.; Winter, Edwin M.
2011-06-01
With the advent of the commercial 3D video card in the mid 1990s, we have seen an order of magnitude performance increase with each generation of new video cards. While these cards were designed primarily for visualization and video games, it became apparent after a short while that they could be used for scientific purposes. These Graphical Processing Units (GPUs) are rapidly being incorporated into data processing tasks usually reserved for general purpose computers. It has been found that many image processing problems scale well to modern GPU systems. We have implemented four popular hyperspectral processing algorithms (N-FINDR, linear unmixing, Principal Components, and the RX anomaly detection algorithm). These algorithms show an across the board speedup of at least a factor of 10, with some special cases showing extreme speedups of a hundred times or more.
Content-based video retrieval by example video clip
NASA Astrophysics Data System (ADS)
Dimitrova, Nevenka; Abdel-Mottaleb, Mohamed
1997-01-01
This paper presents a novel approach for video retrieval from a large archive of MPEG or Motion JPEG compressed video clips. We introduce a retrieval algorithm that takes a video clip as a query and searches the database for clips with similar contents. Video clips are characterized by a sequence of representative frame signatures, which are constructed from DC coefficients and motion information (`DC+M' signatures). The similarity between two video clips is determined by using their respective signatures. This method facilitates retrieval of clips for the purpose of video editing, broadcast news retrieval, or copyright violation detection.
Combined Wavelet Video Coding and Error Control for Internet Streaming and Multicast
NASA Astrophysics Data System (ADS)
Chu, Tianli; Xiong, Zixiang
2003-12-01
This paper proposes an integrated approach to Internet video streaming and multicast (e.g., receiver-driven layered multicast (RLM) by McCanne) based on combined wavelet video coding and error control. We design a packetized wavelet video (PWV) coder to facilitate its integration with error control. The PWV coder produces packetized layered bitstreams that are independent among layers while being embedded within each layer. Thus, a lost packet only renders the following packets in the same layer useless. Based on the PWV coder, we search for a multilayered error-control strategy that optimally trades off source and channel coding for each layer under a given transmission rate to mitigate the effects of packet loss. While both the PWV coder and the error-control strategy are new—the former incorporates embedded wavelet video coding and packetization and the latter extends the single-layered approach for RLM by Chou et al.—the main distinction of this paper lies in the seamless integration of the two parts. Theoretical analysis shows a gain of up to 1 dB on a channel with 20% packet loss using our combined approach over separate designs of the source coder and the error-control mechanism. This is also substantiated by our simulations with a gain of up to 0.6 dB. In addition, our simulations show a gain of up to 2.2 dB over previous results reported by Chou et al.
Automated video-based detection of nocturnal convulsive seizures in a residential care setting.
Geertsema, Evelien E; Thijs, Roland D; Gutter, Therese; Vledder, Ben; Arends, Johan B; Leijten, Frans S; Visser, Gerhard H; Kalitzin, Stiliyan N
2018-06-01
People with epilepsy need assistance and are at risk of sudden death when having convulsive seizures (CS). Automated real-time seizure detection systems can help alert caregivers, but wearable sensors are not always tolerated. We determined algorithm settings and investigated detection performance of a video algorithm to detect CS in a residential care setting. The algorithm calculates power in the 2-6 Hz range relative to 0.5-12.5 Hz range in group velocity signals derived from video-sequence optical flow. A detection threshold was found using a training set consisting of video-electroencephalogaphy (EEG) recordings of 72 CS. A test set consisting of 24 full nights of 12 new subjects in residential care and additional recordings of 50 CS selected randomly was used to estimate performance. All data were analyzed retrospectively. The start and end of CS (generalized clonic and tonic-clonic seizures) and other seizures considered desirable to detect (long generalized tonic, hyperkinetic, and other major seizures) were annotated. The detection threshold was set to the value that obtained 97% sensitivity in the training set. Sensitivity, latency, and false detection rate (FDR) per night were calculated in the test set. A seizure was detected when the algorithm output exceeded the threshold continuously for 2 seconds. With the detection threshold determined in the training set, all CS were detected in the test set (100% sensitivity). Latency was ≤10 seconds in 78% of detections. Three/five hyperkinetic and 6/9 other major seizures were detected. Median FDR was 0.78 per night and no false detections occurred in 9/24 nights. Our algorithm could improve safety unobtrusively by automated real-time detection of CS in video registrations, with an acceptable latency and FDR. The algorithm can also detect some other motor seizures requiring assistance. © 2018 The Authors. Epilepsia published by Wiley Periodicals, Inc. on behalf of International League Against Epilepsy.
Compression performance comparison in low delay real-time video for mobile applications
NASA Astrophysics Data System (ADS)
Bivolarski, Lazar
2012-10-01
This article compares the performance of several current video coding standards in the conditions of low-delay real-time in a resource constrained environment. The comparison is performed using the same content and the metrics and mix of objective and perceptual quality metrics. The metrics results in different coding schemes are analyzed from a point of view of user perception and quality of service. Multiple standards are compared MPEG-2, MPEG4 and MPEG-AVC and well and H.263. The metrics used in the comparison include SSIM, VQM and DVQ. Subjective evaluation and quality of service are discussed from a point of view of perceptual metrics and their incorporation in the coding scheme development process. The performance and the correlation of results are presented as a predictor of the performance of video compression schemes.
Channel coding for underwater acoustic single-carrier CDMA communication system
NASA Astrophysics Data System (ADS)
Liu, Lanjun; Zhang, Yonglei; Zhang, Pengcheng; Zhou, Lin; Niu, Jiong
2017-01-01
CDMA is an effective multiple access protocol for underwater acoustic networks, and channel coding can effectively reduce the bit error rate (BER) of the underwater acoustic communication system. For the requirements of underwater acoustic mobile networks based on CDMA, an underwater acoustic single-carrier CDMA communication system (UWA/SCCDMA) based on the direct-sequence spread spectrum is proposed, and its channel coding scheme is studied based on convolution, RA, Turbo and LDPC coding respectively. The implementation steps of the Viterbi algorithm of convolutional coding, BP and minimum sum algorithms of RA coding, Log-MAP and SOVA algorithms of Turbo coding, and sum-product algorithm of LDPC coding are given. An UWA/SCCDMA simulation system based on Matlab is designed. Simulation results show that the UWA/SCCDMA based on RA, Turbo and LDPC coding have good performance such that the communication BER is all less than 10-6 in the underwater acoustic channel with low signal to noise ratio (SNR) from -12 dB to -10dB, which is about 2 orders of magnitude lower than that of the convolutional coding. The system based on Turbo coding with Log-MAP algorithm has the best performance.
MATIN: A Random Network Coding Based Framework for High Quality Peer-to-Peer Live Video Streaming
Barekatain, Behrang; Khezrimotlagh, Dariush; Aizaini Maarof, Mohd; Ghaeini, Hamid Reza; Salleh, Shaharuddin; Quintana, Alfonso Ariza; Akbari, Behzad; Cabrera, Alicia Triviño
2013-01-01
In recent years, Random Network Coding (RNC) has emerged as a promising solution for efficient Peer-to-Peer (P2P) video multicasting over the Internet. This probably refers to this fact that RNC noticeably increases the error resiliency and throughput of the network. However, high transmission overhead arising from sending large coefficients vector as header has been the most important challenge of the RNC. Moreover, due to employing the Gauss-Jordan elimination method, considerable computational complexity can be imposed on peers in decoding the encoded blocks and checking linear dependency among the coefficients vectors. In order to address these challenges, this study introduces MATIN which is a random network coding based framework for efficient P2P video streaming. The MATIN includes a novel coefficients matrix generation method so that there is no linear dependency in the generated coefficients matrix. Using the proposed framework, each peer encapsulates one instead of n coefficients entries into the generated encoded packet which results in very low transmission overhead. It is also possible to obtain the inverted coefficients matrix using a bit number of simple arithmetic operations. In this regard, peers sustain very low computational complexities. As a result, the MATIN permits random network coding to be more efficient in P2P video streaming systems. The results obtained from simulation using OMNET++ show that it substantially outperforms the RNC which uses the Gauss-Jordan elimination method by providing better video quality on peers in terms of the four important performance metrics including video distortion, dependency distortion, End-to-End delay and Initial Startup delay. PMID:23940530
Dobson-Belaire, Wendy; Goodfield, Jason; Borrelli, Richard; Liu, Fei Fei; Khan, Zeba M
2018-01-01
Using diagnosis code-based algorithms is the primary method of identifying patient cohorts for retrospective studies; nevertheless, many databases lack reliable diagnosis code information. To develop precise algorithms based on medication claims/prescriber visits (MCs/PVs) to identify psoriasis (PsO) patients and psoriatic patients with arthritic conditions (PsO-AC), a proxy for psoriatic arthritis, in Canadian databases lacking diagnosis codes. Algorithms were developed using medications with narrow indication profiles in combination with prescriber specialty to define PsO and PsO-AC. For a 3-year study period from July 1, 2009, algorithms were validated using the PharMetrics Plus database, which contains both adjudicated medication claims and diagnosis codes. Positive predictive value (PPV), negative predictive value (NPV), sensitivity, and specificity of the developed algorithms were assessed using diagnosis code as the reference standard. Chosen algorithms were then applied to Canadian drug databases to profile the algorithm-identified PsO and PsO-AC cohorts. In the selected database, 183,328 patients were identified for validation. The highest PPVs for PsO (85%) and PsO-AC (65%) occurred when a predictive algorithm of two or more MCs/PVs was compared with the reference standard of one or more diagnosis codes. NPV and specificity were high (99%-100%), whereas sensitivity was low (≤30%). Reducing the number of MCs/PVs or increasing diagnosis claims decreased the algorithms' PPVs. We have developed an MC/PV-based algorithm to identify PsO patients with a high degree of accuracy, but accuracy for PsO-AC requires further investigation. Such methods allow researchers to conduct retrospective studies in databases in which diagnosis codes are absent. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Hao; Chen, Minghua; Parekh, Abhay; Ramchandran, Kannan
2011-09-01
We design a distributed multi-channel P2P Video-on-Demand (VoD) system using "plug-and-play" helpers. Helpers are heterogenous "micro-servers" with limited storage, bandwidth and number of users they can serve simultaneously. Our proposed system has the following salient features: (1) it jointly optimizes over helper-user connection topology, video storage distribution and transmission bandwidth allocation; (2) it minimizes server load, and is adaptable to varying supply and demand patterns across multiple video channels irrespective of video popularity; and (3) it is fully distributed and requires little or no maintenance overhead. The combinatorial nature of the problem and the system demand for distributed algorithms makes the problem uniquely challenging. By utilizing Lagrangian decomposition and Markov chain approximation based arguments, we address this challenge by designing two distributed algorithms running in tandem: a primal-dual storage and bandwidth allocation algorithm and a "soft-worst-neighbor-choking" topology-building algorithm. Our scheme provably converges to a near-optimal solution, and is easy to implement in practice. Packet-level simulation results show that the proposed scheme achieves minimum sever load under highly heterogeneous combinations of supply and demand patterns, and is robust to system dynamics of user/helper churn, user/helper asynchrony, and random delays in the network.
Development of MPEG standards for 3D and free viewpoint video
NASA Astrophysics Data System (ADS)
Smolic, Aljoscha; Kimata, Hideaki; Vetro, Anthony
2005-11-01
An overview of 3D and free viewpoint video is given in this paper with special focus on related standardization activities in MPEG. Free viewpoint video allows the user to freely navigate within real world visual scenes, as known from virtual worlds in computer graphics. Suitable 3D scene representation formats are classified and the processing chain is explained. Examples are shown for image-based and model-based free viewpoint video systems, highlighting standards conform realization using MPEG-4. Then the principles of 3D video are introduced providing the user with a 3D depth impression of the observed scene. Example systems are described again focusing on their realization based on MPEG-4. Finally multi-view video coding is described as a key component for 3D and free viewpoint video systems. MPEG is currently working on a new standard for multi-view video coding. The conclusion is that the necessary technology including standard media formats for 3D and free viewpoint is available or will be available in the near future, and that there is a clear demand from industry and user side for such applications. 3DTV at home and free viewpoint video on DVD will be available soon, and will create huge new markets.
Application of Bayesian a Priori Distributions for Vehicles' Video Tracking Systems
NASA Astrophysics Data System (ADS)
Mazurek, Przemysław; Okarma, Krzysztof
Intelligent Transportation Systems (ITS) helps to improve the quality and quantity of many car traffic parameters. The use of the ITS is possible when the adequate measuring infrastructure is available. Video systems allow for its implementation with relatively low cost due to the possibility of simultaneous video recording of a few lanes of the road at a considerable distance from the camera. The process of tracking can be realized through different algorithms, the most attractive algorithms are Bayesian, because they use the a priori information derived from previous observations or known limitations. Use of this information is crucial for improving the quality of tracking especially for difficult observability conditions, which occur in the video systems under the influence of: smog, fog, rain, snow and poor lighting conditions.
Distributed Coding/Decoding Complexity in Video Sensor Networks
Cordeiro, Paulo J.; Assunção, Pedro
2012-01-01
Video Sensor Networks (VSNs) are recent communication infrastructures used to capture and transmit dense visual information from an application context. In such large scale environments which include video coding, transmission and display/storage, there are several open problems to overcome in practical implementations. This paper addresses the most relevant challenges posed by VSNs, namely stringent bandwidth usage and processing time/power constraints. In particular, the paper proposes a novel VSN architecture where large sets of visual sensors with embedded processors are used for compression and transmission of coded streams to gateways, which in turn transrate the incoming streams and adapt them to the variable complexity requirements of both the sensor encoders and end-user decoder terminals. Such gateways provide real-time transcoding functionalities for bandwidth adaptation and coding/decoding complexity distribution by transferring the most complex video encoding/decoding tasks to the transcoding gateway at the expense of a limited increase in bit rate. Then, a method to reduce the decoding complexity, suitable for system-on-chip implementation, is proposed to operate at the transcoding gateway whenever decoders with constrained resources are targeted. The results show that the proposed method achieves good performance and its inclusion into the VSN infrastructure provides an additional level of complexity control functionality. PMID:22736972
Distributed coding/decoding complexity in video sensor networks.
Cordeiro, Paulo J; Assunção, Pedro
2012-01-01
Video Sensor Networks (VSNs) are recent communication infrastructures used to capture and transmit dense visual information from an application context. In such large scale environments which include video coding, transmission and display/storage, there are several open problems to overcome in practical implementations. This paper addresses the most relevant challenges posed by VSNs, namely stringent bandwidth usage and processing time/power constraints. In particular, the paper proposes a novel VSN architecture where large sets of visual sensors with embedded processors are used for compression and transmission of coded streams to gateways, which in turn transrate the incoming streams and adapt them to the variable complexity requirements of both the sensor encoders and end-user decoder terminals. Such gateways provide real-time transcoding functionalities for bandwidth adaptation and coding/decoding complexity distribution by transferring the most complex video encoding/decoding tasks to the transcoding gateway at the expense of a limited increase in bit rate. Then, a method to reduce the decoding complexity, suitable for system-on-chip implementation, is proposed to operate at the transcoding gateway whenever decoders with constrained resources are targeted. The results show that the proposed method achieves good performance and its inclusion into the VSN infrastructure provides an additional level of complexity control functionality.
A robust low-rate coding scheme for packet video
NASA Technical Reports Server (NTRS)
Chen, Y. C.; Sayood, Khalid; Nelson, D. J.; Arikan, E. (Editor)
1991-01-01
Due to the rapidly evolving field of image processing and networking, video information promises to be an important part of telecommunication systems. Although up to now video transmission has been transported mainly over circuit-switched networks, it is likely that packet-switched networks will dominate the communication world in the near future. Asynchronous transfer mode (ATM) techniques in broadband-ISDN can provide a flexible, independent and high performance environment for video communication. For this paper, the network simulator was used only as a channel in this simulation. Mixture blocking coding with progressive transmission (MBCPT) has been investigated for use over packet networks and has been found to provide high compression rate with good visual performance, robustness to packet loss, tractable integration with network mechanics and simplicity in parallel implementation.
Image Coding Based on Address Vector Quantization.
NASA Astrophysics Data System (ADS)
Feng, Yushu
Image coding is finding increased application in teleconferencing, archiving, and remote sensing. This thesis investigates the potential of Vector Quantization (VQ), a relatively new source coding technique, for compression of monochromatic and color images. Extensions of the Vector Quantization technique to the Address Vector Quantization method have been investigated. In Vector Quantization, the image data to be encoded are first processed to yield a set of vectors. A codeword from the codebook which best matches the input image vector is then selected. Compression is achieved by replacing the image vector with the index of the code-word which produced the best match, the index is sent to the channel. Reconstruction of the image is done by using a table lookup technique, where the label is simply used as an address for a table containing the representative vectors. A code-book of representative vectors (codewords) is generated using an iterative clustering algorithm such as K-means, or the generalized Lloyd algorithm. A review of different Vector Quantization techniques are given in chapter 1. Chapter 2 gives an overview of codebook design methods including the Kohonen neural network to design codebook. During the encoding process, the correlation of the address is considered and Address Vector Quantization is developed for color image and monochrome image coding. Address VQ which includes static and dynamic processes is introduced in chapter 3. In order to overcome the problems in Hierarchical VQ, Multi-layer Address Vector Quantization is proposed in chapter 4. This approach gives the same performance as that of the normal VQ scheme but the bit rate is about 1/2 to 1/3 as that of the normal VQ method. In chapter 5, a Dynamic Finite State VQ based on a probability transition matrix to select the best subcodebook to encode the image is developed. In chapter 6, a new adaptive vector quantization scheme, suitable for color video coding, called "A Self -Organizing Adaptive VQ Technique" is presented. In addition to chapters 2 through 6 which report on new work, this dissertation includes one chapter (chapter 1) and part of chapter 2 which review previous work on VQ and image coding, respectively. Finally, a short discussion of directions for further research is presented in conclusion.
Proof of Concept of Automated Collision Detection Technology in Rugby Sevens.
Clarke, Anthea C; Anson, Judith M; Pyne, David B
2017-04-01
Clarke, AC, Anson, JM, and Pyne, DB. Proof of concept of automated collision detection technology in rugby sevens. J Strength Cond Res 31(4): 1116-1120, 2017-Developments in microsensor technology allow for automated detection of collisions in various codes of football, removing the need for time-consuming postprocessing of video footage. However, little research is available on the ability of microsensor technology to be used across various sports or genders. Game video footage was matched with microsensor-detected collisions (GPSports) in one men's (n = 12 players) and one women's (n = 12) rugby sevens match. True-positive, false-positive, and false-negative events between video and microsensor-detected collisions were used to calculate recall (ability to detect a collision) and precision (accurately identify a collision). The precision was similar between the men's and women's rugby sevens game (∼0.72; scale 0.00-1.00); however, the recall in the women's game (0.45) was less than that for the men's game (0.69). This resulted in 45% of collisions for men and 62% of collisions for women being incorrectly labeled. Currently, the automated collision detection system in GPSports microtechnology units has only modest utility in rugby sevens, and it seems that a rugby sevens-specific algorithm is needed. Differences in measures between the men's and women's game may be a result of physical size, and strength, and physicality, as well as technical and tactical factors.
Vision-based measurement for rotational speed by improving Lucas-Kanade template tracking algorithm.
Guo, Jie; Zhu, Chang'an; Lu, Siliang; Zhang, Dashan; Zhang, Chunyu
2016-09-01
Rotational angle and speed are important parameters for condition monitoring and fault diagnosis of rotating machineries, and their measurement is useful in precision machining and early warning of faults. In this study, a novel vision-based measurement algorithm is proposed to complete this task. A high-speed camera is first used to capture the video of the rotational object. To extract the rotational angle, the template-based Lucas-Kanade algorithm is introduced to complete motion tracking by aligning the template image in the video sequence. Given the special case of nonplanar surface of the cylinder object, a nonlinear transformation is designed for modeling the rotation tracking. In spite of the unconventional and complex form, the transformation can realize angle extraction concisely with only one parameter. A simulation is then conducted to verify the tracking effect, and a practical tracking strategy is further proposed to track consecutively the video sequence. Based on the proposed algorithm, instantaneous rotational speed (IRS) can be measured accurately and efficiently. Finally, the effectiveness of the proposed algorithm is verified on a brushless direct current motor test rig through the comparison with results obtained by the microphone. Experimental results demonstrate that the proposed algorithm can extract accurately rotational angles and can measure IRS with the advantage of noncontact and effectiveness.
Compact full-motion video hyperspectral cameras: development, image processing, and applications
NASA Astrophysics Data System (ADS)
Kanaev, A. V.
2015-10-01
Emergence of spectral pixel-level color filters has enabled development of hyper-spectral Full Motion Video (FMV) sensors operating in visible (EO) and infrared (IR) wavelengths. The new class of hyper-spectral cameras opens broad possibilities of its utilization for military and industry purposes. Indeed, such cameras are able to classify materials as well as detect and track spectral signatures continuously in real time while simultaneously providing an operator the benefit of enhanced-discrimination-color video. Supporting these extensive capabilities requires significant computational processing of the collected spectral data. In general, two processing streams are envisioned for mosaic array cameras. The first is spectral computation that provides essential spectral content analysis e.g. detection or classification. The second is presentation of the video to an operator that can offer the best display of the content depending on the performed task e.g. providing spatial resolution enhancement or color coding of the spectral analysis. These processing streams can be executed in parallel or they can utilize each other's results. The spectral analysis algorithms have been developed extensively, however demosaicking of more than three equally-sampled spectral bands has been explored scarcely. We present unique approach to demosaicking based on multi-band super-resolution and show the trade-off between spatial resolution and spectral content. Using imagery collected with developed 9-band SWIR camera we demonstrate several of its concepts of operation including detection and tracking. We also compare the demosaicking results to the results of multi-frame super-resolution as well as to the combined multi-frame and multiband processing.
An Interactive Assessment Framework for Visual Engagement: Statistical Analysis of a TEDx Video
ERIC Educational Resources Information Center
Farhan, Muhammad; Aslam, Muhammad
2017-01-01
This study aims to assess the visual engagement of the video lectures. This analysis can be useful for the presenter and student to find out the overall visual attention of the videos. For this purpose, a new algorithm and data collection module are developed. Videos can be transformed into a dataset with the help of data collection module. The…
Automatic and user-centric approaches to video summary evaluation
NASA Astrophysics Data System (ADS)
Taskiran, Cuneyt M.; Bentley, Frank
2007-01-01
Automatic video summarization has become an active research topic in content-based video processing. However, not much emphasis has been placed on developing rigorous summary evaluation methods and developing summarization systems based on a clear understanding of user needs, obtained through user centered design. In this paper we address these two topics and propose an automatic video summary evaluation algorithm adapted from teh text summarization domain.
NASA Astrophysics Data System (ADS)
Zingoni, Andrea; Diani, Marco; Corsini, Giovanni
2016-10-01
We developed an algorithm for automatically detecting small and poorly contrasted (dim) moving objects in real-time, within video sequences acquired through a steady infrared camera. The algorithm is suitable for different situations since it is independent of the background characteristics and of changes in illumination. Unlike other solutions, small objects of any size (up to single-pixel), either hotter or colder than the background, can be successfully detected. The algorithm is based on accurately estimating the background at the pixel level and then rejecting it. A novel approach permits background estimation to be robust to changes in the scene illumination and to noise, and not to be biased by the transit of moving objects. Care was taken in avoiding computationally costly procedures, in order to ensure the real-time performance even using low-cost hardware. The algorithm was tested on a dataset of 12 video sequences acquired in different conditions, providing promising results in terms of detection rate and false alarm rate, independently of background and objects characteristics. In addition, the detection map was produced frame by frame in real-time, using cheap commercial hardware. The algorithm is particularly suitable for applications in the fields of video-surveillance and computer vision. Its reliability and speed permit it to be used also in critical situations, like in search and rescue, defence and disaster monitoring.
NASA Technical Reports Server (NTRS)
Murray, N. D.
1985-01-01
Current technology projections indicate a lack of availability of special purpose computing for Space Station applications. Potential functions for video image special purpose processing are being investigated, such as smoothing, enhancement, restoration and filtering, data compression, feature extraction, object detection and identification, pixel interpolation/extrapolation, spectral estimation and factorization, and vision synthesis. Also, architectural approaches are being identified and a conceptual design generated. Computationally simple algorithms will be research and their image/vision effectiveness determined. Suitable algorithms will be implimented into an overall architectural approach that will provide image/vision processing at video rates that are flexible, selectable, and programmable. Information is given in the form of charts, diagrams and outlines.
Unmanned Vehicle Guidance Using Video Camera/Vehicle Model
NASA Technical Reports Server (NTRS)
Sutherland, T.
1999-01-01
A video guidance sensor (VGS) system has flown on both STS-87 and STS-95 to validate a single camera/target concept for vehicle navigation. The main part of the image algorithm was the subtraction of two consecutive images using software. For a nominal size image of 256 x 256 pixels this subtraction can take a large portion of the time between successive frames in standard rate video leaving very little time for other computations. The purpose of this project was to integrate the software subtraction into hardware to speed up the subtraction process and allow for more complex algorithms to be performed, both in hardware and software.
ERIC Educational Resources Information Center
Oggins, Jean; Sammis, Jeffrey
2012-01-01
In this study, 438 players of the online video game, World of Warcraft, completed a survey about video game addiction and answered an open-ended question about behaviors they considered characteristic of video game addiction. Responses were coded and correlated with players' self-reports of being addicted to games and scores on a modified video…
Layered video transmission over multirate DS-CDMA wireless systems
NASA Astrophysics Data System (ADS)
Kondi, Lisimachos P.; Srinivasan, Deepika; Pados, Dimitris A.; Batalama, Stella N.
2003-05-01
n this paper, we consider the transmission of video over wireless direct-sequence code-division multiple access (DS-CDMA) channels. A layered (scalable) video source codec is used and each layer is transmitted over a different CDMA channel. Spreading codes with different lengths are allowed for each CDMA channel (multirate CDMA). Thus, a different number of chips per bit can be used for the transmission of each scalable layer. For a given fixed energy value per chip and chip rate, the selection of a spreading code length affects the transmitted energy per bit and bit rate for each scalable layer. An MPEG-4 source encoder is used to provide a two-layer SNR scalable bitstream. Each of the two layers is channel-coded using Rate-Compatible Punctured Convolutional (RCPC) codes. Then, the data are interleaved, spread, carrier-modulated and transmitted over the wireless channel. A multipath Rayleigh fading channel is assumed. At the other end, we assume the presence of an antenna array receiver. After carrier demodulation, multiple-access-interference suppressing despreading is performed using space-time auxiliary vector (AV) filtering. The choice of the AV receiver is dictated by realistic channel fading rates that limit the data record available for receiver adaptation and redesign. Indeed, AV filter short-data-record estimators have been shown to exhibit superior bit-error-rate performance in comparison with LMS, RLS, SMI, or 'multistage nested Wiener' adaptive filter implementations. Our experimental results demonstrate the effectiveness of multirate DS-CDMA systems for wireless video transmission.
Tracking Algorithm of Multiple Pedestrians Based on Particle Filters in Video Sequences
Liu, Yun; Wang, Chuanxu; Zhang, Shujun; Cui, Xuehong
2016-01-01
Pedestrian tracking is a critical problem in the field of computer vision. Particle filters have been proven to be very useful in pedestrian tracking for nonlinear and non-Gaussian estimation problems. However, pedestrian tracking in complex environment is still facing many problems due to changes of pedestrian postures and scale, moving background, mutual occlusion, and presence of pedestrian. To surmount these difficulties, this paper presents tracking algorithm of multiple pedestrians based on particle filters in video sequences. The algorithm acquires confidence value of the object and the background through extracting a priori knowledge thus to achieve multipedestrian detection; it adopts color and texture features into particle filter to get better observation results and then automatically adjusts weight value of each feature according to current tracking environment. During the process of tracking, the algorithm processes severe occlusion condition to prevent drift and loss phenomena caused by object occlusion and associates detection results with particle state to propose discriminated method for object disappearance and emergence thus to achieve robust tracking of multiple pedestrians. Experimental verification and analysis in video sequences demonstrate that proposed algorithm improves the tracking performance and has better tracking results. PMID:27847514
Algorithm for Automatic Behavior Quantification of Laboratory Mice Using High-Frame-Rate Videos
NASA Astrophysics Data System (ADS)
Nie, Yuman; Takaki, Takeshi; Ishii, Idaku; Matsuda, Hiroshi
In this paper, we propose an algorithm for automatic behavior quantification in laboratory mice to quantify several model behaviors. The algorithm can detect repetitive motions of the fore- or hind-limbs at several or dozens of hertz, which are too rapid for the naked eye, from high-frame-rate video images. Multiple repetitive motions can always be identified from periodic frame-differential image features in four segmented regions — the head, left side, right side, and tail. Even when a mouse changes its posture and orientation relative to the camera, these features can still be extracted from the shift- and orientation-invariant shape of the mouse silhouette by using the polar coordinate system and adjusting the angle coordinate according to the head and tail positions. The effectiveness of the algorithm is evaluated by analyzing long-term 240-fps videos of four laboratory mice for six typical model behaviors: moving, rearing, immobility, head grooming, left-side scratching, and right-side scratching. The time durations for the model behaviors determined by the algorithm have detection/correction ratios greater than 80% for all the model behaviors. This shows good quantification results for actual animal testing.
Falling-incident detection and throughput enhancement in a multi-camera video-surveillance system.
Shieh, Wann-Yun; Huang, Ju-Chin
2012-09-01
For most elderly, unpredictable falling incidents may occur at the corner of stairs or a long corridor due to body frailty. If we delay to rescue a falling elder who is likely fainting, more serious consequent injury may occur. Traditional secure or video surveillance systems need caregivers to monitor a centralized screen continuously, or need an elder to wear sensors to detect falling incidents, which explicitly waste much human power or cause inconvenience for elders. In this paper, we propose an automatic falling-detection algorithm and implement this algorithm in a multi-camera video surveillance system. The algorithm uses each camera to fetch the images from the regions required to be monitored. It then uses a falling-pattern recognition algorithm to determine if a falling incident has occurred. If yes, system will send short messages to someone needs to be noticed. The algorithm has been implemented in a DSP-based hardware acceleration board for functionality proof. Simulation results show that the accuracy of falling detection can achieve at least 90% and the throughput of a four-camera surveillance system can be improved by about 2.1 times. Copyright © 2011 IPEM. Published by Elsevier Ltd. All rights reserved.
Global motion compensated visual attention-based video watermarking
NASA Astrophysics Data System (ADS)
Oakes, Matthew; Bhowmik, Deepayan; Abhayaratne, Charith
2016-11-01
Imperceptibility and robustness are two key but complementary requirements of any watermarking algorithm. Low-strength watermarking yields high imperceptibility but exhibits poor robustness. High-strength watermarking schemes achieve good robustness but often suffer from embedding distortions resulting in poor visual quality in host media. This paper proposes a unique video watermarking algorithm that offers a fine balance between imperceptibility and robustness using motion compensated wavelet-based visual attention model (VAM). The proposed VAM includes spatial cues for visual saliency as well as temporal cues. The spatial modeling uses the spatial wavelet coefficients while the temporal modeling accounts for both local and global motion to arrive at the spatiotemporal VAM for video. The model is then used to develop a video watermarking algorithm, where a two-level watermarking weighting parameter map is generated from the VAM saliency maps using the saliency model and data are embedded into the host image according to the visual attentiveness of each region. By avoiding higher strength watermarking in the visually attentive region, the resulting watermarked video achieves high perceived visual quality while preserving high robustness. The proposed VAM outperforms the state-of-the-art video visual attention methods in joint saliency detection and low computational complexity performance. For the same embedding distortion, the proposed visual attention-based watermarking achieves up to 39% (nonblind) and 22% (blind) improvement in robustness against H.264/AVC compression, compared to existing watermarking methodology that does not use the VAM. The proposed visual attention-based video watermarking results in visual quality similar to that of low-strength watermarking and a robustness similar to those of high-strength watermarking.
Jin, Meihua; Jung, Ji-Young; Lee, Jung-Ryun
2016-10-12
With the arrival of the era of Internet of Things (IoT), Wi-Fi Direct is becoming an emerging wireless technology that allows one to communicate through a direct connection between the mobile devices anytime, anywhere. In Wi-Fi Direct-based IoT networks, all devices are categorized by group of owner (GO) and client. Since portability is emphasized in Wi-Fi Direct devices, it is essential to control the energy consumption of a device very efficiently. In order to avoid unnecessary power consumed by GO, Wi-Fi Direct standard defines two power-saving methods: Opportunistic and Notice of Absence (NoA) power-saving methods. In this paper, we suggest an algorithm to enhance the energy efficiency of Wi-Fi Direct power-saving, considering the characteristics of multimedia video traffic. Proposed algorithm utilizes the statistical distribution for the size of video frames and adjusts the lengths of awake intervals in a beacon interval dynamically. In addition, considering the inter-dependency among video frames, the proposed algorithm ensures that a video frame having high priority is transmitted with higher probability than other frames having low priority. Simulation results show that the proposed method outperforms the traditional NoA method in terms of average delay and energy efficiency.
Jin, Meihua; Jung, Ji-Young; Lee, Jung-Ryun
2016-01-01
With the arrival of the era of Internet of Things (IoT), Wi-Fi Direct is becoming an emerging wireless technology that allows one to communicate through a direct connection between the mobile devices anytime, anywhere. In Wi-Fi Direct-based IoT networks, all devices are categorized by group of owner (GO) and client. Since portability is emphasized in Wi-Fi Direct devices, it is essential to control the energy consumption of a device very efficiently. In order to avoid unnecessary power consumed by GO, Wi-Fi Direct standard defines two power-saving methods: Opportunistic and Notice of Absence (NoA) power-saving methods. In this paper, we suggest an algorithm to enhance the energy efficiency of Wi-Fi Direct power-saving, considering the characteristics of multimedia video traffic. Proposed algorithm utilizes the statistical distribution for the size of video frames and adjusts the lengths of awake intervals in a beacon interval dynamically. In addition, considering the inter-dependency among video frames, the proposed algorithm ensures that a video frame having high priority is transmitted with higher probability than other frames having low priority. Simulation results show that the proposed method outperforms the traditional NoA method in terms of average delay and energy efficiency. PMID:27754315
Progressive Dictionary Learning with Hierarchical Predictive Structure for Scalable Video Coding.
Dai, Wenrui; Shen, Yangmei; Xiong, Hongkai; Jiang, Xiaoqian; Zou, Junni; Taubman, David
2017-04-12
Dictionary learning has emerged as a promising alternative to the conventional hybrid coding framework. However, the rigid structure of sequential training and prediction degrades its performance in scalable video coding. This paper proposes a progressive dictionary learning framework with hierarchical predictive structure for scalable video coding, especially in low bitrate region. For pyramidal layers, sparse representation based on spatio-temporal dictionary is adopted to improve the coding efficiency of enhancement layers (ELs) with a guarantee of reconstruction performance. The overcomplete dictionary is trained to adaptively capture local structures along motion trajectories as well as exploit the correlations between neighboring layers of resolutions. Furthermore, progressive dictionary learning is developed to enable the scalability in temporal domain and restrict the error propagation in a close-loop predictor. Under the hierarchical predictive structure, online learning is leveraged to guarantee the training and prediction performance with an improved convergence rate. To accommodate with the stateof- the-art scalable extension of H.264/AVC and latest HEVC, standardized codec cores are utilized to encode the base and enhancement layers. Experimental results show that the proposed method outperforms the latest SHVC and HEVC simulcast over extensive test sequences with various resolutions.
Using QR Codes to Differentiate Learning for Gifted and Talented Students
ERIC Educational Resources Information Center
Siegle, Del
2015-01-01
QR codes are two-dimensional square patterns that are capable of coding information that ranges from web addresses to links to YouTube video. The codes save time typing and eliminate errors in entering addresses incorrectly. These codes make learning with technology easier for students and motivationally engage them in news ways.
Dynamic quality of service differentiation using fixed code weight in optical CDMA networks
NASA Astrophysics Data System (ADS)
Kakaee, Majid H.; Essa, Shawnim I.; Abd, Thanaa H.; Seyedzadeh, Saleh
2015-11-01
The emergence of network-driven applications, such as internet, video conferencing, and online gaming, brings in the need for a network the environments with capability of providing diverse Quality of Services (QoS). In this paper, a new code family of novel spreading sequences, called a Multi-Service (MS) code, has been constructed to support multiple services in Optical- Code Division Multiple Access (CDMA) system. The proposed method uses fixed weight for all services, however reducing the interfering codewords for the users requiring higher QoS. The performance of the proposed code is demonstrated using mathematical analysis. It shown that the total number of served users with satisfactory BER of 10-9 using NB=2 is 82, while they are only 36 and 10 when NB=3 and 4 respectively. The developed MS code is compared with variable-weight codes such as Variable Weight-Khazani Syed (VW-KS) and Multi-Weight-Random Diagonal (MW-RD). Different numbers of basic users (NB) are used to support triple-play services (audio, data and video) with different QoS requirements. Furthermore, reference to the BER of 10-12, 10-9, and 10-3 for video, data and audio, respectively, the system can support up to 45 total users. Hence, results show that the technique can clearly provide a relative QoS differentiation with lower value of basic users can support larger number of subscribers as well as better performance in terms of acceptable BER of 10-9 at fixed code weight.
Exclusively visual analysis of classroom group interactions
NASA Astrophysics Data System (ADS)
Tucker, Laura; Scherr, Rachel E.; Zickler, Todd; Mazur, Eric
2016-12-01
Large-scale audiovisual data that measure group learning are time consuming to collect and analyze. As an initial step towards scaling qualitative classroom observation, we qualitatively coded classroom video using an established coding scheme with and without its audio cues. We find that interrater reliability is as high when using visual data only—without audio—as when using both visual and audio data to code. Also, interrater reliability is high when comparing use of visual and audio data to visual-only data. We see a small bias to code interactions as group discussion when visual and audio data are used compared with video-only data. This work establishes that meaningful educational observation can be made through visual information alone. Further, it suggests that after initial work to create a coding scheme and validate it in each environment, computer-automated visual coding could drastically increase the breadth of qualitative studies and allow for meaningful educational analysis on a far greater scale.
An Implementation Of Elias Delta Code And ElGamal Algorithm In Image Compression And Security
NASA Astrophysics Data System (ADS)
Rachmawati, Dian; Andri Budiman, Mohammad; Saffiera, Cut Amalia
2018-01-01
In data transmission such as transferring an image, confidentiality, integrity, and efficiency of data storage aspects are highly needed. To maintain the confidentiality and integrity of data, one of the techniques used is ElGamal. The strength of this algorithm is found on the difficulty of calculating discrete logs in a large prime modulus. ElGamal belongs to the class of Asymmetric Key Algorithm and resulted in enlargement of the file size, therefore data compression is required. Elias Delta Code is one of the compression algorithms that use delta code table. The image was first compressed using Elias Delta Code Algorithm, then the result of the compression was encrypted by using ElGamal algorithm. Prime test was implemented using Agrawal Biswas Algorithm. The result showed that ElGamal method could maintain the confidentiality and integrity of data with MSE and PSNR values 0 and infinity. The Elias Delta Code method generated compression ratio and space-saving each with average values of 62.49%, and 37.51%.
Adams, Bradley J; Aschheim, Kenneth W
2016-01-01
Comparison of antemortem and postmortem dental records is a leading method of victim identification, especially for incidents involving a large number of decedents. This process may be expedited with computer software that provides a ranked list of best possible matches. This study provides a comparison of the most commonly used conventional coding and sorting algorithms used in the United States (WinID3) with a simplified coding format that utilizes an optimized sorting algorithm. The simplified system consists of seven basic codes and utilizes an optimized algorithm based largely on the percentage of matches. To perform this research, a large reference database of approximately 50,000 antemortem and postmortem records was created. For most disaster scenarios, the proposed simplified codes, paired with the optimized algorithm, performed better than WinID3 which uses more complex codes. The detailed coding system does show better performance with extremely large numbers of records and/or significant body fragmentation. © 2015 American Academy of Forensic Sciences.
CT brush and CancerZap!: two video games for computed tomography dose minimization.
Alvare, Graham; Gordon, Richard
2015-05-12
X-ray dose from computed tomography (CT) scanners has become a significant public health concern. All CT scanners spray x-ray photons across a patient, including those using compressive sensing algorithms. New technologies make it possible to aim x-ray beams where they are most needed to form a diagnostic or screening image. We have designed a computer game, CT Brush, that takes advantage of this new flexibility. It uses a standard MART algorithm (Multiplicative Algebraic Reconstruction Technique), but with a user defined dynamically selected subset of the rays. The image appears as the player moves the CT brush over an initially blank scene, with dose accumulating with every "mouse down" move. The goal is to find the "tumor" with as few moves (least dose) as possible. We have successfully implemented CT Brush in Java and made it available publicly, requesting crowdsourced feedback on improving the open source code. With this experience, we also outline a "shoot 'em up game" CancerZap! for photon limited CT. We anticipate that human computing games like these, analyzed by methods similar to those used to understand eye tracking, will lead to new object dependent CT algorithms that will require significantly less dose than object independent nonlinear and compressive sensing algorithms that depend on sprayed photons. Preliminary results suggest substantial dose reduction is achievable.
Single-Scale Retinex Using Digital Signal Processors
NASA Technical Reports Server (NTRS)
Hines, Glenn; Rahman, Zia-Ur; Jobson, Daniel; Woodell, Glenn
2005-01-01
The Retinex is an image enhancement algorithm that improves the brightness, contrast and sharpness of an image. It performs a non-linear spatial/spectral transform that provides simultaneous dynamic range compression and color constancy. It has been used for a wide variety of applications ranging from aviation safety to general purpose photography. Many potential applications require the use of Retinex processing at video frame rates. This is difficult to achieve with general purpose processors because the algorithm contains a large number of complex computations and data transfers. In addition, many of these applications also constrain the potential architectures to embedded processors to save power, weight and cost. Thus we have focused on digital signal processors (DSPs) and field programmable gate arrays (FPGAs) as potential solutions for real-time Retinex processing. In previous efforts we attained a 21 (full) frame per second (fps) processing rate for the single-scale monochromatic Retinex with a TMS320C6711 DSP operating at 150 MHz. This was achieved after several significant code improvements and optimizations. Since then we have migrated our design to the slightly more powerful TMS320C6713 DSP and the fixed point TMS320DM642 DSP. In this paper we briefly discuss the Retinex algorithm, the performance of the algorithm executing on the TMS320C6713 and the TMS320DM642, and compare the results with the TMS320C6711.
NASA Technical Reports Server (NTRS)
Woo, Simon S.; Cheng, Michael K.
2011-01-01
The original Luby Transform (LT) coding scheme is extended to account for data transmissions where some information symbols in a message block are more important than others. Prioritized LT codes provide unequal error protection (UEP) of data on an erasure channel by modifying the original LT encoder. The prioritized algorithm improves high-priority data protection without penalizing low-priority data recovery. Moreover, low-latency decoding is also obtained for high-priority data due to fast encoding. Prioritized LT codes only require a slight change in the original encoding algorithm, and no changes at all at the decoder. Hence, with a small complexity increase in the LT encoder, an improved UEP and low-decoding latency performance for high-priority data can be achieved. LT encoding partitions a data stream into fixed-sized message blocks each with a constant number of information symbols. To generate a code symbol from the information symbols in a message, the Robust-Soliton probability distribution is first applied in order to determine the number of information symbols to be used to compute the code symbol. Then, the specific information symbols are chosen uniform randomly from the message block. Finally, the selected information symbols are XORed to form the code symbol. The Prioritized LT code construction includes an additional restriction that code symbols formed by a relatively small number of XORed information symbols select some of these information symbols from the pool of high-priority data. Once high-priority data are fully covered, encoding continues with the conventional LT approach where code symbols are generated by selecting information symbols from the entire message block including all different priorities. Therefore, if code symbols derived from high-priority data experience an unusual high number of erasures, Prioritized LT codes can still reliably recover both high- and low-priority data. This hybrid approach decides not only "how to encode" but also "what to encode" to achieve UEP. Another advantage of the priority encoding process is that the majority of high-priority data can be decoded sooner since only a small number of code symbols are required to reconstruct high-priority data. This approach increases the likelihood that high-priority data is decoded first over low-priority data. The Prioritized LT code scheme achieves an improvement in high-priority data decoding performance as well as overall information recovery without penalizing the decoding of low-priority data, assuming high-priority data is no more than half of a message block. The cost is in the additional complexity required in the encoder. If extra computation resource is available at the transmitter, image, voice, and video transmission quality in terrestrial and space communications can benefit from accurate use of redundancy in protecting data with varying priorities.
Digital Watermarking: From Concepts to Real-Time Video Applications
1999-01-01
includes still- image , video, audio, and geometry data among others-the fundamental con- cept of steganography can be transferred from the field of...size of the message, which should be as small as possible. Some commercially available algorithms for image watermarking forego the secure-watermarking... image compres- sion.’ The image’s luminance component is divided into 8 x 8 pixel blocks. The algorithm selects a sequence of blocks and applies the
Optical Flow Analysis and Kalman Filter Tracking in Video Surveillance Algorithms
2007-06-01
Grover Brown and Patrick Y.C. Hwang , Introduction to Random Signals and Applied Kalman Filtering, Third edition, John Wiley & Sons, New York, 1997...noise. Brown and Hwang [6] achieve this improvement by linearly blending the prior estimate, 1kx ∧ − , with the noisy measurement, kz , in the equation...AND KALMAN FILTER TRACKING IN VIDEO SURVEILLANCE ALGORITHMS by David A. Semko June 2007 Thesis Advisor: Monique P. Fargues Second
A Method for Counting Moving People in Video Surveillance Videos
NASA Astrophysics Data System (ADS)
Conte, Donatello; Foggia, Pasquale; Percannella, Gennaro; Tufano, Francesco; Vento, Mario
2010-12-01
People counting is an important problem in video surveillance applications. This problem has been faced either by trying to detect people in the scene and then counting them or by establishing a mapping between some scene feature and the number of people (avoiding the complex detection problem). This paper presents a novel method, following this second approach, that is based on the use of SURF features and of an [InlineEquation not available: see fulltext.]-SVR regressor provide an estimate of this count. The algorithm takes specifically into account problems due to partial occlusions and to perspective. In the experimental evaluation, the proposed method has been compared with the algorithm by Albiol et al., winner of the PETS 2009 contest on people counting, using the same PETS 2009 database. The provided results confirm that the proposed method yields an improved accuracy, while retaining the robustness of Albiol's algorithm.
The comparison and analysis of extracting video key frame
NASA Astrophysics Data System (ADS)
Ouyang, S. Z.; Zhong, L.; Luo, R. Q.
2018-05-01
Video key frame extraction is an important part of the large data processing. Based on the previous work in key frame extraction, we summarized four important key frame extraction algorithms, and these methods are largely developed by comparing the differences between each of two frames. If the difference exceeds a threshold value, take the corresponding frame as two different keyframes. After the research, the key frame extraction based on the amount of mutual trust is proposed, the introduction of information entropy, by selecting the appropriate threshold values into the initial class, and finally take a similar mean mutual information as a candidate key frame. On this paper, several algorithms is used to extract the key frame of tunnel traffic videos. Then, with the analysis to the experimental results and comparisons between the pros and cons of these algorithms, the basis of practical applications is well provided.
Compression of computer generated phase-shifting hologram sequence using AVC and HEVC
NASA Astrophysics Data System (ADS)
Xing, Yafei; Pesquet-Popescu, Béatrice; Dufaux, Frederic
2013-09-01
With the capability of achieving twice the compression ratio of Advanced Video Coding (AVC) with similar reconstruction quality, High Efficiency Video Coding (HEVC) is expected to become the newleading technique of video coding. In order to reduce the storage and transmission burden of digital holograms, in this paper we propose to use HEVC for compressing the phase-shifting digital hologram sequences (PSDHS). By simulating phase-shifting digital holography (PSDH) interferometry, interference patterns between illuminated three dimensional( 3D) virtual objects and the stepwise phase changed reference wave are generated as digital holograms. The hologram sequences are obtained by the movement of the virtual objects and compressed by AVC and HEVC. The experimental results show that AVC and HEVC are efficient to compress PSDHS, with HEVC giving better performance. Good compression rate and reconstruction quality can be obtained with bitrate above 15000kbps.
An improved algorithm for evaluating trellis phase codes
NASA Technical Reports Server (NTRS)
Mulligan, M. G.; Wilson, S. G.
1982-01-01
A method is described for evaluating the minimum distance parameters of trellis phase codes, including CPFSK, partial response FM, and more importantly, coded CPM (continuous phase modulation) schemes. The algorithm provides dramatically faster execution times and lesser memory requirements than previous algorithms. Results of sample calculations and timing comparisons are included.
An improved algorithm for evaluating trellis phase codes
NASA Technical Reports Server (NTRS)
Mulligan, M. G.; Wilson, S. G.
1984-01-01
A method is described for evaluating the minimum distance parameters of trellis phase codes, including CPFSK, partial response FM, and more importantly, coded CPM (continuous phase modulation) schemes. The algorithm provides dramatically faster execution times and lesser memory requirements than previous algorithms. Results of sample calculations and timing comparisons are included.
Computer algorithm for coding gain
NASA Technical Reports Server (NTRS)
Dodd, E. E.
1974-01-01
Development of a computer algorithm for coding gain for use in an automated communications link design system. Using an empirical formula which defines coding gain as used in space communications engineering, an algorithm is constructed on the basis of available performance data for nonsystematic convolutional encoding with soft-decision (eight-level) Viterbi decoding.
Meeting highlights applications of Nek5000 simulation code | Argonne
Photos Videos Fact Sheets, Brochures and Reports Summer Science Writing Internship Careers Education Photos Videos Fact Sheets, Brochures and Reports Summer Science Writing Internship Meeting highlights
Topic Transition in Educational Videos Using Visually Salient Words
ERIC Educational Resources Information Center
Gandhi, Ankit; Biswas, Arijit; Deshmukh, Om
2015-01-01
In this paper, we propose a visual saliency algorithm for automatically finding the topic transition points in an educational video. First, we propose a method for assigning a saliency score to each word extracted from an educational video. We design several mid-level features that are indicative of visual saliency. The optimal feature combination…
Content-Based Indexing and Teaching Focus Mining for Lecture Videos
ERIC Educational Resources Information Center
Lin, Yu-Tzu; Yen, Bai-Jang; Chang, Chia-Hu; Lee, Greg C.; Lin, Yu-Chih
2010-01-01
Purpose: The purpose of this paper is to propose an indexing and teaching focus mining system for lecture videos recorded in an unconstrained environment. Design/methodology/approach: By applying the proposed algorithms in this paper, the slide structure can be reconstructed by extracting slide images from the video. Instead of applying…
Cranwell, Jo; Britton, John; Bains, Manpreet
2017-02-01
The purpose of the present study is to describe the portrayal of alcohol content in popular YouTube music videos. We used inductive thematic analysis to explore the lyrics and visual imagery in 49 UK Top 40 songs and music videos previously found to contain alcohol content and watched by many British adolescents aged between 11 and 18 years and to examine if branded content contravened alcohol industry advertising codes of practice. The analysis generated three themes. First, alcohol content was associated with sexualised imagery or lyrics and the objectification of women. Second, alcohol was associated with image, lifestyle and sociability. Finally, some videos showed alcohol overtly encouraging excessive drinking and drunkenness, including those containing branding, with no negative consequences to the drinker. Our results suggest that YouTube music videos promote positive associations with alcohol use. Further, several alcohol companies adopt marketing strategies in the video medium that are entirely inconsistent with their own or others agreed advertising codes of practice. We conclude that, as a harm reduction measure, policies should change to prevent adolescent exposure to the positive promotion of alcohol and alcohol branding in music videos.
Stuart, Samuel; Hickey, Aodhán; Galna, Brook; Lord, Sue; Rochester, Lynn; Godfrey, Alan
2017-01-01
Detection of saccades (fast eye-movements) within raw mobile electrooculography (EOG) data involves complex algorithms which typically process data acquired during seated static tasks only. Processing of data during dynamic tasks such as walking is relatively rare and complex, particularly in older adults or people with Parkinson's disease (PD). Development of algorithms that can be easily implemented to detect saccades is required. This study aimed to develop an algorithm for the detection and measurement of saccades in EOG data during static (sitting) and dynamic (walking) tasks, in older adults and PD. Eye-tracking via mobile EOG and infra-red (IR) eye-tracker (with video) was performed with a group of older adults (n = 10) and PD participants (n = 10) (⩾50 years). Horizontal saccades made between targets set 5°, 10° and 15° apart were first measured while seated. Horizontal saccades were then measured while a participant walked and executed a 40° turn left and right. The EOG algorithm was evaluated by comparing the number of correct saccade detections and agreement (ICC 2,1 ) between output from visual inspection of eye-tracker videos and IR eye-tracker. The EOG algorithm detected 75-92% of saccades compared to video inspection and IR output during static testing, with fair to excellent agreement (ICC 2,1 0.49-0.93). However, during walking EOG saccade detection reduced to 42-88% compared to video inspection or IR output, with poor to excellent (ICC 2,1 0.13-0.88) agreement between methodologies. The algorithm was robust during seated testing but less so during walking, which was likely due to increased measurement and analysis error with a dynamic task. Future studies may consider a combination of EOG and IR for comprehensive measurement.
Development of a Video-Based Evaluation Tool in Rett Syndrome
ERIC Educational Resources Information Center
Fyfe, S.; Downs, J.; McIlroy, O.; Burford, B.; Lister, J.; Reilly, S.; Laurvick, C. L.; Philippe, C.; Msall, M.; Kaufmann, W. E.; Ellaway, C.; Leonard, H.
2007-01-01
This paper describes the development of a video-based evaluation tool for use in Rett syndrome (RTT). Components include a parent-report checklist, and video filming and coding protocols that contain items on eating, drinking, communication, hand function and movements, personal care and mobility. Ninety-seven of the 169 families who initially…
Context adaptive binary arithmetic coding-based data hiding in partially encrypted H.264/AVC videos
NASA Astrophysics Data System (ADS)
Xu, Dawen; Wang, Rangding
2015-05-01
A scheme of data hiding directly in a partially encrypted version of H.264/AVC videos is proposed which includes three parts, i.e., selective encryption, data embedding and data extraction. Selective encryption is performed on context adaptive binary arithmetic coding (CABAC) bin-strings via stream ciphers. By careful selection of CABAC entropy coder syntax elements for selective encryption, the encrypted bitstream is format-compliant and has exactly the same bit rate. Then a data-hider embeds the additional data into partially encrypted H.264/AVC videos using a CABAC bin-string substitution technique without accessing the plaintext of the video content. Since bin-string substitution is carried out on those residual coefficients with approximately the same magnitude, the quality of the decrypted video is satisfactory. Video file size is strictly preserved even after data embedding. In order to adapt to different application scenarios, data extraction can be done either in the encrypted domain or in the decrypted domain. Experimental results have demonstrated the feasibility and efficiency of the proposed scheme.
Lin, Frank Yeong-Sung; Hsiao, Chiu-Han; Yen, Hong-Hsu; Hsieh, Yu-Jen
2013-01-01
One of the important applications in Wireless Sensor Networks (WSNs) is video surveillance that includes the tasks of video data processing and transmission. Processing and transmission of image and video data in WSNs has attracted a lot of attention in recent years. This is known as Wireless Visual Sensor Networks (WVSNs). WVSNs are distributed intelligent systems for collecting image or video data with unique performance, complexity, and quality of service challenges. WVSNs consist of a large number of battery-powered and resource constrained camera nodes. End-to-end delay is a very important Quality of Service (QoS) metric for video surveillance application in WVSNs. How to meet the stringent delay QoS in resource constrained WVSNs is a challenging issue that requires novel distributed and collaborative routing strategies. This paper proposes a Near-Optimal Distributed QoS Constrained (NODQC) routing algorithm to achieve an end-to-end route with lower delay and higher throughput. A Lagrangian Relaxation (LR)-based routing metric that considers the “system perspective” and “user perspective” is proposed to determine the near-optimal routing paths that satisfy end-to-end delay constraints with high system throughput. The empirical results show that the NODQC routing algorithm outperforms others in terms of higher system throughput with lower average end-to-end delay and delay jitter. In this paper, for the first time, the algorithm shows how to meet the delay QoS and at the same time how to achieve higher system throughput in stringently resource constrained WVSNs.
Guided filtering for solar image/video processing
NASA Astrophysics Data System (ADS)
Xu, Long; Yan, Yihua; Cheng, Jun
2017-06-01
A new image enhancement algorithm employing guided filtering is proposed in this work for the enhancement of solar images and videos so that users can easily figure out important fine structures embedded in the recorded images/movies for solar observation. The proposed algorithm can efficiently remove image noises, including Gaussian and impulse noises. Meanwhile, it can further highlight fibrous structures on/beyond the solar disk. These fibrous structures can clearly demonstrate the progress of solar flare, prominence coronal mass emission, magnetic field, and so on. The experimental results prove that the proposed algorithm gives significant enhancement of visual quality of solar images beyond original input and several classical image enhancement algorithms, thus facilitating easier determination of interesting solar burst activities from recorded images/movies.
A hybrid frame concealment algorithm for H.264/AVC.
Yan, Bo; Gharavi, Hamid
2010-01-01
In packet-based video transmissions, packets loss due to channel errors may result in the loss of the whole video frame. Recently, many error concealment algorithms have been proposed in order to combat channel errors; however, most of the existing algorithms can only deal with the loss of macroblocks and are not able to conceal the whole missing frame. In order to resolve this problem, in this paper, we have proposed a new hybrid motion vector extrapolation (HMVE) algorithm to recover the whole missing frame, and it is able to provide more accurate estimation for the motion vectors of the missing frame than other conventional methods. Simulation results show that it is highly effective and significantly outperforms other existing frame recovery methods.
Two-dimensional thermal video analysis of offshore bird and bat flight
Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.
2015-09-11
Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals.more » We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.« less
Two-dimensional thermal video analysis of offshore bird and bat flight
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.
Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals.more » We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.« less
Efficient video-equipped fire detection approach for automatic fire alarm systems
NASA Astrophysics Data System (ADS)
Kang, Myeongsu; Tung, Truong Xuan; Kim, Jong-Myon
2013-01-01
This paper proposes an efficient four-stage approach that automatically detects fire using video capabilities. In the first stage, an approximate median method is used to detect video frame regions involving motion. In the second stage, a fuzzy c-means-based clustering algorithm is employed to extract candidate regions of fire from all of the movement-containing regions. In the third stage, a gray level co-occurrence matrix is used to extract texture parameters by tracking red-colored objects in the candidate regions. These texture features are, subsequently, used as inputs of a back-propagation neural network to distinguish between fire and nonfire. Experimental results indicate that the proposed four-stage approach outperforms other fire detection algorithms in terms of consistently increasing the accuracy of fire detection in both indoor and outdoor test videos.
On the use of video projectors for three-dimensional scanning
NASA Astrophysics Data System (ADS)
Juarez-Salazar, Rigoberto; Diaz-Ramirez, Victor H.; Robledo-Sanchez, Carlos; Diaz-Gonzalez, Gerardo
2017-08-01
Structured light projection is one of the most useful methods for accurate three-dimensional scanning. Video projectors are typically used as the illumination source. However, because video projectors are not designed for structured light systems, some considerations such as gamma calibration must be taken into account. In this work, we present a simple method for gamma calibration of video projectors. First, the experimental fringe patterns are normalized. Then, the samples of the fringe patterns are sorted in ascending order. The sample sorting leads to a simple three-parameter sine curve that is fitted using the Gauss-Newton algorithm. The novelty of this method is that the sorting process removes the effect of the unknown phase. Thus, the resulting gamma calibration algorithm is significantly simplified. The feasibility of the proposed method is illustrated in a three-dimensional scanning experiment.
Tactile Cueing for Target Acquisition and Identification
2005-09-01
method of coding tactile information, and the method of presenting elevation information were studied. Results: Subjects were divided into video game experienced...VGP) subjects and non- video game (NVGP) experienced subjects. VGPs showed a significantly lower’ target acquisition time with the 12...that video game players performed better with the highest level of tactile resolution, while non- video game players performed better with simpler pattern and a lower resolution display.
ERIC Educational Resources Information Center
King, Keith; Laake, Rebecca A.; Bernard, Amy
2006-01-01
This study examined the sexual messages depicted in music videos aired on MTV, MTV2, BET, and GAC from August 2, 2004 to August 15, 2004. One-hour segments of music videos were taped daily for two weeks. Depictions of sexual attire and sexual behavior were analyzed via a four-page coding sheet (interrater-reliability = 0.93). Results indicated…
NASA Astrophysics Data System (ADS)
Tereshin, Alexander A.; Usilin, Sergey A.; Arlazarov, Vladimir V.
2018-04-01
This paper aims to study the problem of multi-class object detection in video stream with Viola-Jones cascades. An adaptive algorithm for selecting Viola-Jones cascade based on greedy choice strategy in solution of the N-armed bandit problem is proposed. The efficiency of the algorithm on the problem of detection and recognition of the bank card logos in the video stream is shown. The proposed algorithm can be effectively used in documents localization and identification, recognition of road scene elements, localization and tracking of the lengthy objects , and for solving other problems of rigid object detection in a heterogeneous data flows. The computational efficiency of the algorithm makes it possible to use it both on personal computers and on mobile devices based on processors with low power consumption.
Analysis of view synthesis prediction architectures in modern coding standards
NASA Astrophysics Data System (ADS)
Tian, Dong; Zou, Feng; Lee, Chris; Vetro, Anthony; Sun, Huifang
2013-09-01
Depth-based 3D formats are currently being developed as extensions to both AVC and HEVC standards. The availability of depth information facilitates the generation of intermediate views for advanced 3D applications and displays, and also enables more efficient coding of the multiview input data through view synthesis prediction techniques. This paper outlines several approaches that have been explored to realize view synthesis prediction in modern video coding standards such as AVC and HEVC. The benefits and drawbacks of various architectures are analyzed in terms of performance, complexity, and other design considerations. It is hence concluded that block-based VSP prediction for multiview video signals provides attractive coding gains with comparable complexity as traditional motion/disparity compensation.
Unified transform architecture for AVC, AVS, VC-1 and HEVC high-performance codecs
NASA Astrophysics Data System (ADS)
Dias, Tiago; Roma, Nuno; Sousa, Leonel
2014-12-01
A unified architecture for fast and efficient computation of the set of two-dimensional (2-D) transforms adopted by the most recent state-of-the-art digital video standards is presented in this paper. Contrasting to other designs with similar functionality, the presented architecture is supported on a scalable, modular and completely configurable processing structure. This flexible structure not only allows to easily reconfigure the architecture to support different transform kernels, but it also permits its resizing to efficiently support transforms of different orders (e.g. order-4, order-8, order-16 and order-32). Consequently, not only is it highly suitable to realize high-performance multi-standard transform cores, but it also offers highly efficient implementations of specialized processing structures addressing only a reduced subset of transforms that are used by a specific video standard. The experimental results that were obtained by prototyping several configurations of this processing structure in a Xilinx Virtex-7 FPGA show the superior performance and hardware efficiency levels provided by the proposed unified architecture for the implementation of transform cores for the Advanced Video Coding (AVC), Audio Video coding Standard (AVS), VC-1 and High Efficiency Video Coding (HEVC) standards. In addition, such results also demonstrate the ability of this processing structure to realize multi-standard transform cores supporting all the standards mentioned above and that are capable of processing the 8k Ultra High Definition Television (UHDTV) video format (7,680 × 4,320 at 30 fps) in real time.
Bellaïche, Yohanns; Bosveld, Floris; Graner, François; Mikula, Karol; Remesíková, Mariana; Smísek, Michal
2011-01-01
In this paper, we present a novel algorithm for tracking cells in time lapse confocal microscopy movie of a Drosophila epithelial tissue during pupal morphogenesis. We consider a 2D + time video as a 3D static image, where frames are stacked atop each other, and using a spatio-temporal segmentation algorithm we obtain information about spatio-temporal 3D tubes representing evolutions of cells. The main idea for tracking is the usage of two distance functions--first one from the cells in the initial frame and second one from segmented boundaries. We track the cells backwards in time. The first distance function attracts the subsequently constructed cell trajectories to the cells in the initial frame and the second one forces them to be close to centerlines of the segmented tubular structures. This makes our tracking algorithm robust against noise and missing spatio-temporal boundaries. This approach can be generalized to a 3D + time video analysis, where spatio-temporal tubes are 4D objects.
Cross-Layer Design for Robust and Scalable Video Transmission in Dynamic Wireless Environment
2011-02-01
code rate convolutional codes or prioritized Rate - Compatible Punctured ...34New rate - compatible punctured convolutional codes for Viterbi decoding," IEEE Trans. Communications, Volume 42, Issue 12, pp. 3073-3079, Dec...Quality of service RCPC Rate - compatible and punctured convolutional codes SNR Signal to noise
A Video Transmission System for Severely Degraded Channels
2006-07-01
rate compatible punctured convolutional codes (RCPC) . By separating the SPIHT bitstream...June 2000. 149 [170] J. Hagenauer, Rate - compatible punctured convolutional codes (RCPC codes ) and their applications, IEEE Transactions on...Farvardin [160] used rate compatible convolutional codes . They noticed that for some transmission rates , one of their EEP schemes, which may
NASA Technical Reports Server (NTRS)
Lin, Shu; Fossorier, Marc
1998-01-01
For long linear block codes, maximum likelihood decoding based on full code trellises would be very hard to implement if not impossible. In this case, we may wish to trade error performance for the reduction in decoding complexity. Sub-optimum soft-decision decoding of a linear block code based on a low-weight sub-trellis can be devised to provide an effective trade-off between error performance and decoding complexity. This chapter presents such a suboptimal decoding algorithm for linear block codes. This decoding algorithm is iterative in nature and based on an optimality test. It has the following important features: (1) a simple method to generate a sequence of candidate code-words, one at a time, for test; (2) a sufficient condition for testing a candidate code-word for optimality; and (3) a low-weight sub-trellis search for finding the most likely (ML) code-word.
Improved segmentation of occluded and adjoining vehicles in traffic surveillance videos
NASA Astrophysics Data System (ADS)
Juneja, Medha; Grover, Priyanka
2013-12-01
Occlusion in image processing refers to concealment of any part of the object or the whole object from view of an observer. Real time videos captured by static cameras on roads often encounter overlapping and hence, occlusion of vehicles. Occlusion in traffic surveillance videos usually occurs when an object which is being tracked is hidden by another object. This makes it difficult for the object detection algorithms to distinguish all the vehicles efficiently. Also morphological operations tend to join the close proximity vehicles resulting in formation of a single bounding box around more than one vehicle. Such problems lead to errors in further video processing, like counting of vehicles in a video. The proposed system brings forward efficient moving object detection and tracking approach to reduce such errors. The paper uses successive frame subtraction technique for detection of moving objects. Further, this paper implements the watershed algorithm to segment the overlapped and adjoining vehicles. The segmentation results have been improved by the use of noise and morphological operations.
Robust Pedestrian Tracking and Recognition from FLIR Video: A Unified Approach via Sparse Coding
Li, Xin; Guo, Rui; Chen, Chao
2014-01-01
Sparse coding is an emerging method that has been successfully applied to both robust object tracking and recognition in the vision literature. In this paper, we propose to explore a sparse coding-based approach toward joint object tracking-and-recognition and explore its potential in the analysis of forward-looking infrared (FLIR) video to support nighttime machine vision systems. A key technical contribution of this work is to unify existing sparse coding-based approaches toward tracking and recognition under the same framework, so that they can benefit from each other in a closed-loop. On the one hand, tracking the same object through temporal frames allows us to achieve improved recognition performance through dynamical updating of template/dictionary and combining multiple recognition results; on the other hand, the recognition of individual objects facilitates the tracking of multiple objects (i.e., walking pedestrians), especially in the presence of occlusion within a crowded environment. We report experimental results on both the CASIAPedestrian Database and our own collected FLIR video database to demonstrate the effectiveness of the proposed joint tracking-and-recognition approach. PMID:24961216
A Secure and Robust Object-Based Video Authentication System
NASA Astrophysics Data System (ADS)
He, Dajun; Sun, Qibin; Tian, Qi
2004-12-01
An object-based video authentication system, which combines watermarking, error correction coding (ECC), and digital signature techniques, is presented for protecting the authenticity between video objects and their associated backgrounds. In this system, a set of angular radial transformation (ART) coefficients is selected as the feature to represent the video object and the background, respectively. ECC and cryptographic hashing are applied to those selected coefficients to generate the robust authentication watermark. This content-based, semifragile watermark is then embedded into the objects frame by frame before MPEG4 coding. In watermark embedding and extraction, groups of discrete Fourier transform (DFT) coefficients are randomly selected, and their energy relationships are employed to hide and extract the watermark. The experimental results demonstrate that our system is robust to MPEG4 compression, object segmentation errors, and some common object-based video processing such as object translation, rotation, and scaling while securely preventing malicious object modifications. The proposed solution can be further incorporated into public key infrastructure (PKI).
An efficient decoding for low density parity check codes
NASA Astrophysics Data System (ADS)
Zhao, Ling; Zhang, Xiaolin; Zhu, Manjie
2009-12-01
Low density parity check (LDPC) codes are a class of forward-error-correction codes. They are among the best-known codes capable of achieving low bit error rates (BER) approaching Shannon's capacity limit. Recently, LDPC codes have been adopted by the European Digital Video Broadcasting (DVB-S2) standard, and have also been proposed for the emerging IEEE 802.16 fixed and mobile broadband wireless-access standard. The consultative committee for space data system (CCSDS) has also recommended using LDPC codes in the deep space communications and near-earth communications. It is obvious that LDPC codes will be widely used in wired and wireless communication, magnetic recording, optical networking, DVB, and other fields in the near future. Efficient hardware implementation of LDPC codes is of great interest since LDPC codes are being considered for a wide range of applications. This paper presents an efficient partially parallel decoder architecture suited for quasi-cyclic (QC) LDPC codes using Belief propagation algorithm for decoding. Algorithmic transformation and architectural level optimization are incorporated to reduce the critical path. First, analyze the check matrix of LDPC code, to find out the relationship between the row weight and the column weight. And then, the sharing level of the check node updating units (CNU) and the variable node updating units (VNU) are determined according to the relationship. After that, rearrange the CNU and the VNU, and divide them into several smaller parts, with the help of some assistant logic circuit, these smaller parts can be grouped into CNU during the check node update processing and grouped into VNU during the variable node update processing. These smaller parts are called node update kernel units (NKU) and the assistant logic circuit are called node update auxiliary unit (NAU). With NAUs' help, the two steps of iteration operation are completed by NKUs, which brings in great hardware resource reduction. Meanwhile, efficient techniques have been developed to reduce the computation delay of the node processing units and to minimize hardware overhead for parallel processing. This method may be applied not only to regular LDPC codes, but also to the irregular ones. Based on the proposed architectures, a (7493, 6096) irregular QC-LDPC code decoder is described using verilog hardware design language and implemented on Altera field programmable gate array (FPGA) StratixII EP2S130. The implementation results show that over 20% of logic core size can be saved than conventional partially parallel decoder architectures without any performance degradation. If the decoding clock is 100MHz, the proposed decoder can achieve a maximum (source data) decoding throughput of 133 Mb/s at 18 iterations.
Reconstructing Interlaced High-Dynamic-Range Video Using Joint Learning.
Inchang Choi; Seung-Hwan Baek; Kim, Min H
2017-11-01
For extending the dynamic range of video, it is a common practice to capture multiple frames sequentially with different exposures and combine them to extend the dynamic range of each video frame. However, this approach results in typical ghosting artifacts due to fast and complex motion in nature. As an alternative, video imaging with interlaced exposures has been introduced to extend the dynamic range. However, the interlaced approach has been hindered by jaggy artifacts and sensor noise, leading to concerns over image quality. In this paper, we propose a data-driven approach for jointly solving two specific problems of deinterlacing and denoising that arise in interlaced video imaging with different exposures. First, we solve the deinterlacing problem using joint dictionary learning via sparse coding. Since partial information of detail in differently exposed rows is often available via interlacing, we make use of the information to reconstruct details of the extended dynamic range from the interlaced video input. Second, we jointly solve the denoising problem by tailoring sparse coding to better handle additive noise in low-/high-exposure rows, and also adopt multiscale homography flow to temporal sequences for denoising. We anticipate that the proposed method will allow for concurrent capture of higher dynamic range video frames without suffering from ghosting artifacts. We demonstrate the advantages of our interlaced video imaging compared with the state-of-the-art high-dynamic-range video methods.
Wireless Visual Sensor Network Resource Allocation using Cross-Layer Optimization
2009-01-01
Rate Compatible Punctured Convolutional (RCPC) codes for channel...vol. 44, pp. 2943–2959, November 1998. [22] J. Hagenauer, “ Rate - compatible punctured convolutional codes (RCPC codes ) and their applications,” IEEE... coding rate for H.264/AVC video compression is determined. At the data link layer, the Rate - Compatible Puctured Convolutional (RCPC) channel coding
Real-time minimal-bit-error probability decoding of convolutional codes
NASA Technical Reports Server (NTRS)
Lee, L.-N.
1974-01-01
A recursive procedure is derived for decoding of rate R = 1/n binary convolutional codes which minimizes the probability of the individual decoding decisions for each information bit, subject to the constraint that the decoding delay be limited to Delta branches. This new decoding algorithm is similar to, but somewhat more complex than, the Viterbi decoding algorithm. A real-time, i.e., fixed decoding delay, version of the Viterbi algorithm is also developed and used for comparison to the new algorithm on simulated channels. It is shown that the new algorithm offers advantages over Viterbi decoding in soft-decision applications, such as in the inner coding system for concatenated coding.
Real-time minimal bit error probability decoding of convolutional codes
NASA Technical Reports Server (NTRS)
Lee, L. N.
1973-01-01
A recursive procedure is derived for decoding of rate R=1/n binary convolutional codes which minimizes the probability of the individual decoding decisions for each information bit subject to the constraint that the decoding delay be limited to Delta branches. This new decoding algorithm is similar to, but somewhat more complex than, the Viterbi decoding algorithm. A real-time, i.e. fixed decoding delay, version of the Viterbi algorithm is also developed and used for comparison to the new algorithm on simulated channels. It is shown that the new algorithm offers advantages over Viterbi decoding in soft-decision applications such as in the inner coding system for concatenated coding.
Advances/applications of MAGIC and SOS
NASA Astrophysics Data System (ADS)
Warren, Gary; Ludeking, Larry; Nguyen, Khanh; Smithe, David; Goplen, Bruce
1993-12-01
MAGIC and SOS have been applied to investigate a variety of accelerator-related devices. Examples include high brightness electron guns, beam-RF interactions in klystrons, cold-test modes in an RFQ and in RF sources, and a high-quality, flexible, electron gun with operating modes appropriate for gyrotrons, peniotrons, and other RF sources. Algorithmic improvements for PIC have been developed and added to MAGIC and SOS to facilitate these modeling efforts. Two new field algorithms allow improved control of computational numerical noise and selective control of harmonic modes in RF cavities. An axial filter in SOS accelerates simulations in cylindrical coordinates. The recent addition of an export/import feature now allows long devices to be modeled in sections. Interfaces have been added to receive electromagnetic field information from the Poisson group of codes and from EGUN and to send beam information to PARMELA for subsequent tracing of bunches through beam optics. Post-processors compute and display beam properties including geometric, normalized, and slice emittances, and phase-space parameters, and video. VMS, UNIX, and DOS versions are supported, with migration underway toward windows environments.
Real-time blood flow visualization using the graphics processing unit
NASA Astrophysics Data System (ADS)
Yang, Owen; Cuccia, David; Choi, Bernard
2011-01-01
Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was integrated with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were performed also at ~10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, with fluid flow through an in vitro phantom, and a demonstration of real-time LSI during laser surgery of a port wine stain birthmark.
Real-time blood flow visualization using the graphics processing unit
Yang, Owen; Cuccia, David; Choi, Bernard
2011-01-01
Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was integrated with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were performed also at ∼10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, with fluid flow through an in vitro phantom, and a demonstration of real-time LSI during laser surgery of a port wine stain birthmark. PMID:21280915
Multidimensional incremental parsing for universal source coding.
Bae, Soo Hyun; Juang, Biing-Hwang
2008-10-01
A multidimensional incremental parsing algorithm (MDIP) for multidimensional discrete sources, as a generalization of the Lempel-Ziv coding algorithm, is investigated. It consists of three essential component schemes, maximum decimation matching, hierarchical structure of multidimensional source coding, and dictionary augmentation. As a counterpart of the longest match search in the Lempel-Ziv algorithm, two classes of maximum decimation matching are studied. Also, an underlying behavior of the dictionary augmentation scheme for estimating the source statistics is examined. For an m-dimensional source, m augmentative patches are appended into the dictionary at each coding epoch, thus requiring the transmission of a substantial amount of information to the decoder. The property of the hierarchical structure of the source coding algorithm resolves this issue by successively incorporating lower dimensional coding procedures in the scheme. In regard to universal lossy source coders, we propose two distortion functions, the local average distortion and the local minimax distortion with a set of threshold levels for each source symbol. For performance evaluation, we implemented three image compression algorithms based upon the MDIP; one is lossless and the others are lossy. The lossless image compression algorithm does not perform better than the Lempel-Ziv-Welch coding, but experimentally shows efficiency in capturing the source structure. The two lossy image compression algorithms are implemented using the two distortion functions, respectively. The algorithm based on the local average distortion is efficient at minimizing the signal distortion, but the images by the one with the local minimax distortion have a good perceptual fidelity among other compression algorithms. Our insights inspire future research on feature extraction of multidimensional discrete sources.
Adaptive image coding based on cubic-spline interpolation
NASA Astrophysics Data System (ADS)
Jiang, Jian-Xing; Hong, Shao-Hua; Lin, Tsung-Ching; Wang, Lin; Truong, Trieu-Kien
2014-09-01
It has been investigated that at low bit rates, downsampling prior to coding and upsampling after decoding can achieve better compression performance than standard coding algorithms, e.g., JPEG and H. 264/AVC. However, at high bit rates, the sampling-based schemes generate more distortion. Additionally, the maximum bit rate for the sampling-based scheme to outperform the standard algorithm is image-dependent. In this paper, a practical adaptive image coding algorithm based on the cubic-spline interpolation (CSI) is proposed. This proposed algorithm adaptively selects the image coding method from CSI-based modified JPEG and standard JPEG under a given target bit rate utilizing the so called ρ-domain analysis. The experimental results indicate that compared with the standard JPEG, the proposed algorithm can show better performance at low bit rates and maintain the same performance at high bit rates.
Optimal space communications techniques. [discussion of video signals and delta modulation
NASA Technical Reports Server (NTRS)
Schilling, D. L.
1974-01-01
The encoding of video signals using the Song Adaptive Delta Modulator (Song ADM) is discussed. The video signals are characterized as a sequence of pulses having arbitrary height and width. Although the ADM is suited to tracking signals having fast rise times, it was found that the DM algorithm (which permits an exponential rise for estimating an input step) results in a large overshoot and an underdamped response to the step. An overshoot suppression algorithm which significantly reduces the ringing while not affecting the rise time is presented along with formuli for the rise time and the settling time. Channel errors and their effect on the DM encoded bit stream were investigated.
Evaluation of Moving Object Detection Based on Various Input Noise Using Fixed Camera
NASA Astrophysics Data System (ADS)
Kiaee, N.; Hashemizadeh, E.; Zarrinpanjeh, N.
2017-09-01
Detecting and tracking objects in video has been as a research area of interest in the field of image processing and computer vision. This paper evaluates the performance of a novel method for object detection algorithm in video sequences. This process helps us to know the advantage of this method which is being used. The proposed framework compares the correct and wrong detection percentage of this algorithm. This method was evaluated with the collected data in the field of urban transport which include car and pedestrian in fixed camera situation. The results show that the accuracy of the algorithm will decreases because of image resolution reduction.
Robust efficient estimation of heart rate pulse from video.
Xu, Shuchang; Sun, Lingyun; Rohde, Gustavo Kunde
2014-04-01
We describe a simple but robust algorithm for estimating the heart rate pulse from video sequences containing human skin in real time. Based on a model of light interaction with human skin, we define the change of blood concentration due to arterial pulsation as a pixel quotient in log space, and successfully use the derived signal for computing the pulse heart rate. Various experiments with different cameras, different illumination condition, and different skin locations were conducted to demonstrate the effectiveness and robustness of the proposed algorithm. Examples computed with normal illumination show the algorithm is comparable with pulse oximeter devices both in accuracy and sensitivity.
Robust efficient estimation of heart rate pulse from video
Xu, Shuchang; Sun, Lingyun; Rohde, Gustavo Kunde
2014-01-01
We describe a simple but robust algorithm for estimating the heart rate pulse from video sequences containing human skin in real time. Based on a model of light interaction with human skin, we define the change of blood concentration due to arterial pulsation as a pixel quotient in log space, and successfully use the derived signal for computing the pulse heart rate. Various experiments with different cameras, different illumination condition, and different skin locations were conducted to demonstrate the effectiveness and robustness of the proposed algorithm. Examples computed with normal illumination show the algorithm is comparable with pulse oximeter devices both in accuracy and sensitivity. PMID:24761294
NASA Astrophysics Data System (ADS)
Walton, James S.; Hodgson, Peter; Hallamasek, Karen; Palmer, Jake
2003-07-01
4DVideo is creating a general purpose capability for capturing and analyzing kinematic data from video sequences in near real-time. The core element of this capability is a software package designed for the PC platform. The software ("4DCapture") is designed to capture and manipulate customized AVI files that can contain a variety of synchronized data streams -- including audio, video, centroid locations -- and signals acquired from more traditional sources (such as accelerometers and strain gauges.) The code includes simultaneous capture or playback of multiple video streams, and linear editing of the images (together with the ancilliary data embedded in the files). Corresponding landmarks seen from two or more views are matched automatically, and photogrammetric algorithms permit multiple landmarks to be tracked in two- and three-dimensions -- with or without lens calibrations. Trajectory data can be processed within the main application or they can be exported to a spreadsheet where they can be processed or passed along to a more sophisticated, stand-alone, data analysis application. Previous attempts to develop such applications for high-speed imaging have been limited in their scope, or by the complexity of the application itself. 4DVideo has devised a friendly ("FlowStack") user interface that assists the end-user to capture and treat image sequences in a natural progression. 4DCapture employs the AVI 2.0 standard and DirectX technology which effectively eliminates the file size limitations found in older applications. In early tests, 4DVideo has streamed three RS-170 video sources to disk for more than an hour without loss of data. At this time, the software can acquire video sequences in three ways: (1) directly, from up to three hard-wired cameras supplying RS-170 (monochrome) signals; (2) directly, from a single camera or video recorder supplying an NTSC (color) signal; and (3) by importing existing video streams in the AVI 1.0 or AVI 2.0 formats. The latter is particularly useful for high-speed applications where the raw images are often captured and stored by the camera before being downloaded. Provision has been made to synchronize data acquired from any combination of these video sources using audio and visual "tags." Additional "front-ends," designed for digital cameras, are anticipated.
Measuring Engagement as Students Learn Dynamic Systems and Control with a Video Game
ERIC Educational Resources Information Center
Coller, B. D.; Shernoff, David J.; Strati, Anna
2011-01-01
The paper presents results of a multi-year quasi-experimental study of student engagement during which a video game was introduced into an undergraduate dynamic systems and control course. The video game, "EduTorcs", provided challenges in which students devised control algorithms that drive virtual cars and ride virtual bikes through a…
Adams, Derk; Schreuder, Astrid B; Salottolo, Kristin; Settell, April; Goss, J Richard
2011-07-01
There are significant changes in the abbreviated injury scale (AIS) 2005 system, which make it impractical to compare patients coded in AIS version 98 with patients coded in AIS version 2005. Harborview Medical Center created a computer algorithm "Harborview AIS Mapping Program (HAMP)" to automatically convert AIS 2005 to AIS 98 injury codes. The mapping was validated using 6 months of double-coded patient injury records from a Level I Trauma Center. HAMP was used to determine how closely individual AIS and injury severity scores (ISS) were converted from AIS 2005 to AIS 98 versions. The kappa statistic was used to measure the agreement between manually determined codes and HAMP-derived codes. Seven hundred forty-nine patient records were used for validation. For the conversion of AIS codes, the measure of agreement between HAMP and manually determined codes was [kappa] = 0.84 (95% confidence interval, 0.82-0.86). The algorithm errors were smaller in magnitude than the manually determined coding errors. For the conversion of ISS, the agreement between HAMP versus manually determined ISS was [kappa] = 0.81 (95% confidence interval, 0.78-0.84). The HAMP algorithm successfully converted injuries coded in AIS 2005 to AIS 98. This algorithm will be useful when comparing trauma patient clinical data across populations coded in different versions, especially for longitudinal studies.
Algorithm for Video Summarization of Bronchoscopy Procedures
2011-01-01
Background The duration of bronchoscopy examinations varies considerably depending on the diagnostic and therapeutic procedures used. It can last more than 20 minutes if a complex diagnostic work-up is included. With wide access to videobronchoscopy, the whole procedure can be recorded as a video sequence. Common practice relies on an active attitude of the bronchoscopist who initiates the recording process and usually chooses to archive only selected views and sequences. However, it may be important to record the full bronchoscopy procedure as documentation when liability issues are at stake. Furthermore, an automatic recording of the whole procedure enables the bronchoscopist to focus solely on the performed procedures. Video recordings registered during bronchoscopies include a considerable number of frames of poor quality due to blurry or unfocused images. It seems that such frames are unavoidable due to the relatively tight endobronchial space, rapid movements of the respiratory tract due to breathing or coughing, and secretions which occur commonly in the bronchi, especially in patients suffering from pulmonary disorders. Methods The use of recorded bronchoscopy video sequences for diagnostic, reference and educational purposes could be considerably extended with efficient, flexible summarization algorithms. Thus, the authors developed a prototype system to create shortcuts (called summaries or abstracts) of bronchoscopy video recordings. Such a system, based on models described in previously published papers, employs image analysis methods to exclude frames or sequences of limited diagnostic or education value. Results The algorithm for the selection or exclusion of specific frames or shots from video sequences recorded during bronchoscopy procedures is based on several criteria, including automatic detection of "non-informative", frames showing the branching of the airways and frames including pathological lesions. Conclusions The paper focuses on the challenge of generating summaries of bronchoscopy video recordings. PMID:22185344
Real-time unmanned aircraft systems surveillance video mosaicking using GPU
NASA Astrophysics Data System (ADS)
Camargo, Aldo; Anderson, Kyle; Wang, Yi; Schultz, Richard R.; Fevig, Ronald A.
2010-04-01
Digital video mosaicking from Unmanned Aircraft Systems (UAS) is being used for many military and civilian applications, including surveillance, target recognition, border protection, forest fire monitoring, traffic control on highways, monitoring of transmission lines, among others. Additionally, NASA is using digital video mosaicking to explore the moon and planets such as Mars. In order to compute a "good" mosaic from video captured by a UAS, the algorithm must deal with motion blur, frame-to-frame jitter associated with an imperfectly stabilized platform, perspective changes as the camera tilts in flight, as well as a number of other factors. The most suitable algorithms use SIFT (Scale-Invariant Feature Transform) to detect the features consistent between video frames. Utilizing these features, the next step is to estimate the homography between two consecutives video frames, perform warping to properly register the image data, and finally blend the video frames resulting in a seamless video mosaick. All this processing takes a great deal of resources of resources from the CPU, so it is almost impossible to compute a real time video mosaic on a single processor. Modern graphics processing units (GPUs) offer computational performance that far exceeds current CPU technology, allowing for real-time operation. This paper presents the development of a GPU-accelerated digital video mosaicking implementation and compares it with CPU performance. Our tests are based on two sets of real video captured by a small UAS aircraft; one video comes from Infrared (IR) and Electro-Optical (EO) cameras. Our results show that we can obtain a speed-up of more than 50 times using GPU technology, so real-time operation at a video capture of 30 frames per second is feasible.
Video copy protection and detection framework (VPD) for e-learning systems
NASA Astrophysics Data System (ADS)
ZandI, Babak; Doustarmoghaddam, Danial; Pour, Mahsa R.
2013-03-01
This Article reviews and compares the copyright issues related to the digital video files, which can be categorized as contended based and Digital watermarking copy Detection. Then we describe how to protect a digital video by using a special Video data hiding method and algorithm. We also discuss how to detect the copy right of the file, Based on expounding Direction of the technology of the video copy detection, and Combining with the own research results, brings forward a new video protection and copy detection approach in terms of plagiarism and e-learning systems using the video data hiding technology. Finally we introduce a framework for Video protection and detection in e-learning systems (VPD Framework).
Robust video copy detection approach based on local tangent space alignment
NASA Astrophysics Data System (ADS)
Nie, Xiushan; Qiao, Qianping
2012-04-01
We propose a robust content-based video copy detection approach based on local tangent space alignment (LTSA), which is an efficient dimensionality reduction algorithm. The idea is motivated by the fact that the content of video becomes richer and the dimension of content becomes higher. It does not give natural tools for video analysis and understanding because of the high dimensionality. The proposed approach reduces the dimensionality of video content using LTSA, and then generates video fingerprints in low dimensional space for video copy detection. Furthermore, a dynamic sliding window is applied to fingerprint matching. Experimental results show that the video copy detection approach has good robustness and discrimination.
Mutual information-based analysis of JPEG2000 contexts.
Liu, Zhen; Karam, Lina J
2005-04-01
Context-based arithmetic coding has been widely adopted in image and video compression and is a key component of the new JPEG2000 image compression standard. In this paper, the contexts used in JPEG2000 are analyzed using the mutual information, which is closely related to the compression performance. We first show that, when combining the contexts, the mutual information between the contexts and the encoded data will decrease unless the conditional probability distributions of the combined contexts are the same. Given I, the initial number of contexts, and F, the final desired number of contexts, there are S(I, F) possible context classification schemes where S(I, F) is called the Stirling number of the second kind. The optimal classification scheme is the one that gives the maximum mutual information. Instead of using an exhaustive search, the optimal classification scheme can be obtained through a modified generalized Lloyd algorithm with the relative entropy as the distortion metric. For binary arithmetic coding, the search complexity can be reduced by using dynamic programming. Our experimental results show that the JPEG2000 contexts capture the correlations among the wavelet coefficients very well. At the same time, the number of contexts used as part of the standard can be reduced without loss in the coding performance.
A comparison of cigarette- and hookah-related videos on YouTube.
Carroll, Mary V; Shensa, Ariel; Primack, Brian A
2013-09-01
YouTube is now the second most visited site on the internet. The authors aimed to compare characteristics of and messages conveyed by cigarette- and hookah-related videos on YouTube. Systematic search procedures yielded 66 cigarette-related and 61 hookah-related videos. After three trained qualitative researchers used an iterative approach to develop and refine definitions for the coding of variables, two of them independently coded each video for content including positive and negative associations with smoking and major content type. Median view counts were 606,884 for cigarettes-related videos and 102,307 for hookah-related videos (p<0.001). However, the number of comments per 1000 views was significantly lower for cigarette-related videos than for hookah-related videos (1.6 vs 2.5, p=0.003). There was no significant difference in the number of 'like' designations per 100 reactions (91 vs 87, p=0.39). Cigarette-related videos were less likely than hookah-related videos to portray tobacco use in a positive light (24% vs 92%, p<0.001). In addition, cigarette-related videos were more likely to be of high production quality (42% vs 5%, p<0.001), to mention short-term consequences (50% vs 18%, p<0.001) and long-term consequences (44% vs 2%, p<0.001) of tobacco use, to contain explicit antismoking messages (39% vs 0%, p<0.001) and to provide specific information on how to quit tobacco use (21% vs 0%, p<0.001). Although internet user-generated videos related to cigarette smoking often acknowledge harmful consequences and provide explicit antismoking messages, hookah-related videos do not. It may be valuable for public health programmes to correct common misconceptions regarding hookah use.
Research of real-time video processing system based on 6678 multi-core DSP
NASA Astrophysics Data System (ADS)
Li, Xiangzhen; Xie, Xiaodan; Yin, Xiaoqiang
2017-10-01
In the information age, the rapid development in the direction of intelligent video processing, complex algorithm proposed the powerful challenge on the performance of the processor. In this article, through the FPGA + TMS320C6678 frame structure, the image to fog, merge into an organic whole, to stabilize the image enhancement, its good real-time, superior performance, break through the traditional function of video processing system is simple, the product defects such as single, solved the video application in security monitoring, video, etc. Can give full play to the video monitoring effectiveness, improve enterprise economic benefits.
Video Denoising via Dynamic Video Layering
NASA Astrophysics Data System (ADS)
Guo, Han; Vaswani, Namrata
2018-07-01
Video denoising refers to the problem of removing "noise" from a video sequence. Here the term "noise" is used in a broad sense to refer to any corruption or outlier or interference that is not the quantity of interest. In this work, we develop a novel approach to video denoising that is based on the idea that many noisy or corrupted videos can be split into three parts - the "low-rank layer", the "sparse layer", and a small residual (which is small and bounded). We show, using extensive experiments, that our denoising approach outperforms the state-of-the-art denoising algorithms.
Code inspection instructional validation
NASA Technical Reports Server (NTRS)
Orr, Kay; Stancil, Shirley
1992-01-01
The Shuttle Data Systems Branch (SDSB) of the Flight Data Systems Division (FDSD) at Johnson Space Center contracted with Southwest Research Institute (SwRI) to validate the effectiveness of an interactive video course on the code inspection process. The purpose of this project was to determine if this course could be effective for teaching NASA analysts the process of code inspection. In addition, NASA was interested in the effectiveness of this unique type of instruction (Digital Video Interactive), for providing training on software processes. This study found the Carnegie Mellon course, 'A Cure for the Common Code', effective for teaching the process of code inspection. In addition, analysts prefer learning with this method of instruction, or this method in combination with other methods. As is, the course is definitely better than no course at all; however, findings indicate changes are needed. Following are conclusions of this study. (1) The course is instructionally effective. (2) The simulation has a positive effect on student's confidence in his ability to apply new knowledge. (3) Analysts like the course and prefer this method of training, or this method in combination with current methods of training in code inspection, over the way training is currently being conducted. (4) Analysts responded favorably to information presented through scenarios incorporating full motion video. (5) Some course content needs to be changed. (6) Some content needs to be added to the course. SwRI believes this study indicates interactive video instruction combined with simulation is effective for teaching software processes. Based on the conclusions of this study, SwRI has outlined seven options for NASA to consider. SwRI recommends the option which involves creation of new source code and data files, but uses much of the existing content and design from the current course. Although this option involves a significant software development effort, SwRI believes this option will produce the most effective results.
Parallel design patterns for a low-power, software-defined compressed video encoder
NASA Astrophysics Data System (ADS)
Bruns, Michael W.; Hunt, Martin A.; Prasad, Durga; Gunupudi, Nageswara R.; Sonachalam, Sekar
2011-06-01
Video compression algorithms such as H.264 offer much potential for parallel processing that is not always exploited by the technology of a particular implementation. Consumer mobile encoding devices often achieve real-time performance and low power consumption through parallel processing in Application Specific Integrated Circuit (ASIC) technology, but many other applications require a software-defined encoder. High quality compression features needed for some applications such as 10-bit sample depth or 4:2:2 chroma format often go beyond the capability of a typical consumer electronics device. An application may also need to efficiently combine compression with other functions such as noise reduction, image stabilization, real time clocks, GPS data, mission/ESD/user data or software-defined radio in a low power, field upgradable implementation. Low power, software-defined encoders may be implemented using a massively parallel memory-network processor array with 100 or more cores and distributed memory. The large number of processor elements allow the silicon device to operate more efficiently than conventional DSP or CPU technology. A dataflow programming methodology may be used to express all of the encoding processes including motion compensation, transform and quantization, and entropy coding. This is a declarative programming model in which the parallelism of the compression algorithm is expressed as a hierarchical graph of tasks with message communication. Data parallel and task parallel design patterns are supported without the need for explicit global synchronization control. An example is described of an H.264 encoder developed for a commercially available, massively parallel memorynetwork processor device.
Please Like Me: Facebook and Public Health Communication.
Kite, James; Foley, Bridget C; Grunseit, Anne C; Freeman, Becky
2016-01-01
Facebook, the most widely used social media platform, has been adopted by public health organisations for health promotion and behaviour change campaigns and activities. However, limited information is available on the most effective and efficient use of Facebook for this purpose. This study sought to identify the features of Facebook posts that are associated with higher user engagement on Australian public health organisations' Facebook pages. We selected 20 eligible pages through a systematic search and coded 360-days of posts for each page. Posts were coded by: post type (e.g., photo, text only etc.), communication technique employed (e.g. testimonial, informative etc.) and use of marketing elements (e.g., branding, use of mascots). A series of negative binomial regressions were used to assess associations between post characteristics and user engagement as measured by the number of likes, shares and comments. Our results showed that video posts attracted the greatest amount of user engagement, although an analysis of a subset of the data suggested this may be a reflection of the Facebook algorithm, which governs what is and is not shown in user newsfeeds and appear to preference videos over other post types. Posts that featured a positive emotional appeal or provided factual information attracted higher levels of user engagement, while conventional marketing elements, such as sponsorships and the use of persons of authority, generally discouraged user engagement, with the exception of posts that included a celebrity or sportsperson. Our results give insight into post content that maximises user engagement and begins to fill the knowledge gap on effective use of Facebook by public health organisations.
Please Like Me: Facebook and Public Health Communication
Kite, James; Foley, Bridget C.; Grunseit, Anne C.; Freeman, Becky
2016-01-01
Facebook, the most widely used social media platform, has been adopted by public health organisations for health promotion and behaviour change campaigns and activities. However, limited information is available on the most effective and efficient use of Facebook for this purpose. This study sought to identify the features of Facebook posts that are associated with higher user engagement on Australian public health organisations’ Facebook pages. We selected 20 eligible pages through a systematic search and coded 360-days of posts for each page. Posts were coded by: post type (e.g., photo, text only etc.), communication technique employed (e.g. testimonial, informative etc.) and use of marketing elements (e.g., branding, use of mascots). A series of negative binomial regressions were used to assess associations between post characteristics and user engagement as measured by the number of likes, shares and comments. Our results showed that video posts attracted the greatest amount of user engagement, although an analysis of a subset of the data suggested this may be a reflection of the Facebook algorithm, which governs what is and is not shown in user newsfeeds and appear to preference videos over other post types. Posts that featured a positive emotional appeal or provided factual information attracted higher levels of user engagement, while conventional marketing elements, such as sponsorships and the use of persons of authority, generally discouraged user engagement, with the exception of posts that included a celebrity or sportsperson. Our results give insight into post content that maximises user engagement and begins to fill the knowledge gap on effective use of Facebook by public health organisations. PMID:27632172
Integrating Algorithm Visualization Video into a First-Year Algorithm and Data Structure Course
ERIC Educational Resources Information Center
Crescenzi, Pilu; Malizia, Alessio; Verri, M. Cecilia; Diaz, Paloma; Aedo, Ignacio
2012-01-01
In this paper we describe the results that we have obtained while integrating algorithm visualization (AV) movies (strongly tightened with the other teaching material), within a first-year undergraduate course on algorithms and data structures. Our experimental results seem to support the hypothesis that making these movies available significantly…
Motion adaptive Kalman filter for super-resolution
NASA Astrophysics Data System (ADS)
Richter, Martin; Nasse, Fabian; Schröder, Hartmut
2011-01-01
Superresolution is a sophisticated strategy to enhance image quality of both low and high resolution video, performing tasks like artifact reduction, scaling and sharpness enhancement in one algorithm, all of them reconstructing high frequency components (above Nyquist frequency) in some way. Especially recursive superresolution algorithms can fulfill high quality aspects because they control the video output using a feed-back loop and adapt the result in the next iteration. In addition to excellent output quality, temporal recursive methods are very hardware efficient and therefore even attractive for real-time video processing. A very promising approach is the utilization of Kalman filters as proposed by Farsiu et al. Reliable motion estimation is crucial for the performance of superresolution. Therefore, robust global motion models are mainly used, but this also limits the application of superresolution algorithm. Thus, handling sequences with complex object motion is essential for a wider field of application. Hence, this paper proposes improvements by extending the Kalman filter approach using motion adaptive variance estimation and segmentation techniques. Experiments confirm the potential of our proposal for ideal and real video sequences with complex motion and further compare its performance to state-of-the-art methods like trainable filters.
Reliable motion detection of small targets in video with low signal-to-clutter ratios
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nichols, S.A.; Naylor, R.B.
1995-07-01
Studies show that vigilance decreases rapidly after several minutes when human operators are required to search live video for infrequent intrusion detections. Therefore, there is a need for systems which can automatically detect targets in live video and reserve the operator`s attention for assessment only. Thus far, automated systems have not simultaneously provided adequate detection sensitivity, false alarm suppression, and ease of setup when used in external, unconstrained environments. This unsatisfactory performance can be exacerbated by poor video imagery with low contrast, high noise, dynamic clutter, image misregistration, and/or the presence of small, slow, or erratically moving targets. This papermore » describes a highly adaptive video motion detection and tracking algorithm which has been developed as part of Sandia`s Advanced Exterior Sensor (AES) program. The AES is a wide-area detection and assessment system for use in unconstrained exterior security applications. The AES detection and tracking algorithm provides good performance under stressing data and environmental conditions. Features of the algorithm include: reliable detection with negligible false alarm rate of variable velocity targets having low signal-to-clutter ratios; reliable tracking of targets that exhibit motion that is non-inertial, i.e., varies in direction and velocity; automatic adaptation to both infrared and visible imagery with variable quality; and suppression of false alarms caused by sensor flaws and/or cutouts.« less
Resolution enhancement of low-quality videos using a high-resolution frame
NASA Astrophysics Data System (ADS)
Pham, Tuan Q.; van Vliet, Lucas J.; Schutte, Klamer
2006-01-01
This paper proposes an example-based Super-Resolution (SR) algorithm of compressed videos in the Discrete Cosine Transform (DCT) domain. Input to the system is a Low-Resolution (LR) compressed video together with a High-Resolution (HR) still image of similar content. Using a training set of corresponding LR-HR pairs of image patches from the HR still image, high-frequency details are transferred from the HR source to the LR video. The DCT-domain algorithm is much faster than example-based SR in spatial domain 6 because of a reduction in search dimensionality, which is a direct result of the compact and uncorrelated DCT representation. Fast searching techniques like tree-structure vector quantization 16 and coherence search1 are also key to the improved efficiency. Preliminary results on MJPEG sequence show promising result of the DCT-domain SR synthesis approach.
Free-viewpoint video of human actors using multiple handheld Kinects.
Ye, Genzhi; Liu, Yebin; Deng, Yue; Hasler, Nils; Ji, Xiangyang; Dai, Qionghai; Theobalt, Christian
2013-10-01
We present an algorithm for creating free-viewpoint video of interacting humans using three handheld Kinect cameras. Our method reconstructs deforming surface geometry and temporal varying texture of humans through estimation of human poses and camera poses for every time step of the RGBZ video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem, which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Finally, texture recovery is achieved through jointly optimization on spatio-temporal RGB data using matrix completion. As opposed to previous methods, our algorithm succeeds on free-viewpoint video of human actors under general uncontrolled indoor scenes with potentially dynamic background, and it succeeds even if the cameras are moving.
The Coverage Problem in Video-Based Wireless Sensor Networks: A Survey
Costa, Daniel G.; Guedes, Luiz Affonso
2010-01-01
Wireless sensor networks typically consist of a great number of tiny low-cost electronic devices with limited sensing and computing capabilities which cooperatively communicate to collect some kind of information from an area of interest. When wireless nodes of such networks are equipped with a low-power camera, visual data can be retrieved, facilitating a new set of novel applications. The nature of video-based wireless sensor networks demands new algorithms and solutions, since traditional wireless sensor networks approaches are not feasible or even efficient for that specialized communication scenario. The coverage problem is a crucial issue of wireless sensor networks, requiring specific solutions when video-based sensors are employed. In this paper, it is surveyed the state of the art of this particular issue, regarding strategies, algorithms and general computational solutions. Open research areas are also discussed, envisaging promising investigation considering coverage in video-based wireless sensor networks. PMID:22163651
A new DWT/MC/DPCM video compression framework based on EBCOT
NASA Astrophysics Data System (ADS)
Mei, L. M.; Wu, H. R.; Tan, D. M.
2005-07-01
A novel Discrete Wavelet Transform (DWT)/Motion Compensation (MC)/Differential Pulse Code Modulation (DPCM) video compression framework is proposed in this paper. Although the Discrete Cosine Transform (DCT)/MC/DPCM is the mainstream framework for video coders in industry and international standards, the idea of DWT/MC/DPCM has existed for more than one decade in the literature and the investigation is still undergoing. The contribution of this work is twofold. Firstly, the Embedded Block Coding with Optimal Truncation (EBCOT) is used here as the compression engine for both intra- and inter-frame coding, which provides good compression ratio and embedded rate-distortion (R-D) optimization mechanism. This is an extension of the EBCOT application from still images to videos. Secondly, this framework offers a good interface for the Perceptual Distortion Measure (PDM) based on the Human Visual System (HVS) where the Mean Squared Error (MSE) can be easily replaced with the PDM in the R-D optimization. Some of the preliminary results are reported here. They are also compared with benchmarks such as MPEG-2 and MPEG-4 version 2. The results demonstrate that under specified condition the proposed coder outperforms the benchmarks in terms of rate vs. distortion.
Performance of the JPEG Estimated Spectrum Adaptive Postfilter (JPEG-ESAP) for Low Bit Rates
NASA Technical Reports Server (NTRS)
Linares, Irving (Inventor)
2016-01-01
Frequency-based, pixel-adaptive filtering using the JPEG-ESAP algorithm for low bit rate JPEG formatted color images may allow for more compressed images while maintaining equivalent quality at a smaller file size or bitrate. For RGB, an image is decomposed into three color bands--red, green, and blue. The JPEG-ESAP algorithm is then applied to each band (e.g., once for red, once for green, and once for blue) and the output of each application of the algorithm is rebuilt as a single color image. The ESAP algorithm may be repeatedly applied to MPEG-2 video frames to reduce their bit rate by a factor of 2 or 3, while maintaining equivalent video quality, both perceptually, and objectively, as recorded in the computed PSNR values.
Motion-seeded object-based attention for dynamic visual imagery
NASA Astrophysics Data System (ADS)
Huber, David J.; Khosla, Deepak; Kim, Kyungnam
2017-05-01
This paper† describes a novel system that finds and segments "objects of interest" from dynamic imagery (video) that (1) processes each frame using an advanced motion algorithm that pulls out regions that exhibit anomalous motion, and (2) extracts the boundary of each object of interest using a biologically-inspired segmentation algorithm based on feature contours. The system uses a series of modular, parallel algorithms, which allows many complicated operations to be carried out by the system in a very short time, and can be used as a front-end to a larger system that includes object recognition and scene understanding modules. Using this method, we show 90% accuracy with fewer than 0.1 false positives per frame of video, which represents a significant improvement over detection using a baseline attention algorithm.
Tail Biting Trellis Representation of Codes: Decoding and Construction
NASA Technical Reports Server (NTRS)
Shao. Rose Y.; Lin, Shu; Fossorier, Marc
1999-01-01
This paper presents two new iterative algorithms for decoding linear codes based on their tail biting trellises, one is unidirectional and the other is bidirectional. Both algorithms are computationally efficient and achieves virtually optimum error performance with a small number of decoding iterations. They outperform all the previous suboptimal decoding algorithms. The bidirectional algorithm also reduces decoding delay. Also presented in the paper is a method for constructing tail biting trellises for linear block codes.
Reading your own lips: common-coding theory and visual speech perception.
Tye-Murray, Nancy; Spehar, Brent P; Myerson, Joel; Hale, Sandra; Sommers, Mitchell S
2013-02-01
Common-coding theory posits that (1) perceiving an action activates the same representations of motor plans that are activated by actually performing that action, and (2) because of individual differences in the ways that actions are performed, observing recordings of one's own previous behavior activates motor plans to an even greater degree than does observing someone else's behavior. We hypothesized that if observing oneself activates motor plans to a greater degree than does observing others, and if these activated plans contribute to perception, then people should be able to lipread silent video clips of their own previous utterances more accurately than they can lipread video clips of other talkers. As predicted, two groups of participants were able to lipread video clips of themselves, recorded more than two weeks earlier, significantly more accurately than video clips of others. These results suggest that visual input activates speech motor activity that links to word representations in the mental lexicon.
Image compression using quad-tree coding with morphological dilation
NASA Astrophysics Data System (ADS)
Wu, Jiaji; Jiang, Weiwei; Jiao, Licheng; Wang, Lei
2007-11-01
In this paper, we propose a new algorithm which integrates morphological dilation operation to quad-tree coding, the purpose of doing this is to compensate each other's drawback by using quad-tree coding and morphological dilation operation respectively. New algorithm can not only quickly find the seed significant coefficient of dilation but also break the limit of block boundary of quad-tree coding. We also make a full use of both within-subband and cross-subband correlation to avoid the expensive cost of representing insignificant coefficients. Experimental results show that our algorithm outperforms SPECK and SPIHT. Without using any arithmetic coding, our algorithm can achieve good performance with low computational cost and it's more suitable to mobile devices or scenarios with a strict real-time requirement.
Iterative Code-Aided ML Phase Estimation and Phase Ambiguity Resolution
NASA Astrophysics Data System (ADS)
Wymeersch, Henk; Moeneclaey, Marc
2005-12-01
As many coded systems operate at very low signal-to-noise ratios, synchronization becomes a very difficult task. In many cases, conventional algorithms will either require long training sequences or result in large BER degradations. By exploiting code properties, these problems can be avoided. In this contribution, we present several iterative maximum-likelihood (ML) algorithms for joint carrier phase estimation and ambiguity resolution. These algorithms operate on coded signals by accepting soft information from the MAP decoder. Issues of convergence and initialization are addressed in detail. Simulation results are presented for turbo codes, and are compared to performance results of conventional algorithms. Performance comparisons are carried out in terms of BER performance and mean square estimation error (MSEE). We show that the proposed algorithm reduces the MSEE and, more importantly, the BER degradation. Additionally, phase ambiguity resolution can be performed without resorting to a pilot sequence, thus improving the spectral efficiency.
An Adaptive Inpainting Algorithm Based on DCT Induced Wavelet Regularization
2013-01-01
research in image processing. Applications of image inpainting include old films restoration, video inpainting [4], de -interlacing of video sequences...show 5 (a) (b) (c) (d) (e) (f) Fig. 1. Performance of various inpainting algorithms for a cartoon image with text. (a) the original test image; (b...the test image with text; inpainted images by (c) SF (PSNR=37.38 dB); (d) SF-LDCT (PSNR=37.37 dB); (e) MCA (PSNR=37.04 dB); and (f) the proposed
Automated Visual Event Detection, Tracking, and Data Management System for Cabled- Observatory Video
NASA Astrophysics Data System (ADS)
Edgington, D. R.; Cline, D. E.; Schlining, B.; Raymond, E.
2008-12-01
Ocean observatories and underwater video surveys have the potential to unlock important discoveries with new and existing camera systems. Yet the burden of video management and analysis often requires reducing the amount of video recorded through time-lapse video or similar methods. It's unknown how many digitized video data sets exist in the oceanographic community, but we suspect that many remain under analyzed due to lack of good tools or human resources to analyze the video. To help address this problem, the Automated Visual Event Detection (AVED) software and The Video Annotation and Reference System (VARS) have been under development at MBARI. For detecting interesting events in the video, the AVED software has been developed over the last 5 years. AVED is based on a neuromorphic-selective attention algorithm, modeled on the human vision system. Frames are decomposed into specific feature maps that are combined into a unique saliency map. This saliency map is then scanned to determine the most salient locations. The candidate salient locations are then segmented from the scene using algorithms suitable for the low, non-uniform light and marine snow typical of deep underwater video. For managing the AVED descriptions of the video, the VARS system provides an interface and database for describing, viewing, and cataloging the video. VARS was developed by the MBARI for annotating deep-sea video data and is currently being used to describe over 3000 dives by our remotely operated vehicles (ROV), making it well suited to this deepwater observatory application with only a few modifications. To meet the compute and data intensive job of video processing, a distributed heterogeneous network of computers is managed using the Condor workload management system. This system manages data storage, video transcoding, and AVED processing. Looking to the future, we see high-speed networks and Grid technology as an important element in addressing the problem of processing and accessing large video data sets.
Audiovisual quality evaluation of low-bitrate video
NASA Astrophysics Data System (ADS)
Winkler, Stefan; Faller, Christof
2005-03-01
Audiovisual quality assessment is a relatively unexplored topic. We designed subjective experiments for audio, video, and audiovisual quality using content and encoding parameters representative of video for mobile applications. Our focus were the MPEG-4 AVC (a.k.a. H.264) and AAC coding standards. Our goals in this study are two-fold: we want to understand the interactions between audio and video in terms of perceived audiovisual quality, and we use the subjective data to evaluate the prediction performance of our non-reference video and audio quality metrics.
Dogra, Debi P; Majumdar, Arun K; Sural, Shamik; Mukherjee, Jayanta; Mukherjee, Suchandra; Singh, Arun
2012-01-01
Hammersmith Infant Neurological Examination (HINE) is a set of tests used for grading neurological development of infants on a scale of 0 to 3. These tests help in assessing neurophysiological development of babies, especially preterm infants who are born before (the fetus reaches) the gestational age of 36 weeks. Such tests are often conducted in the follow-up clinics of hospitals for grading infants with suspected disabilities. Assessment based on HINE depends on the expertise of the physicians involved in conducting the examinations. It has been noted that some of these tests, especially pulled-to-sit and lateral tilting, are difficult to assess solely based on visual observation. For example, during the pulled-to-sit examination, the examiner needs to observe the relative movement of the head with respect to torso while pulling the infant by holding wrists. The examiner may find it difficult to follow the head movement from the coronal view. Video object tracking based automatic or semi-automatic analysis can be helpful in this case. In this paper, we present a video based method to automate the analysis of pulled-to-sit examination. In this context, a dynamic programming and node pruning based efficient video object tracking algorithm has been proposed. Pulled-to-sit event detection is handled by the proposed tracking algorithm that uses a 2-D geometric model of the scene. The algorithm has been tested with normal as well as marker based videos of the examination recorded at the neuro-development clinic of the SSKM Hospital, Kolkata, India. It is found that the proposed algorithm is capable of estimating the pulled-to-sit score with sensitivity (80%-92%) and specificity (89%-96%).
Implementation of smart phone video plethysmography and dependence on lighting parameters.
Fletcher, Richard Ribón; Chamberlain, Daniel; Paggi, Nicholas; Deng, Xinyue
2015-08-01
The remote measurement of heart rate (HR) and heart rate variability (HRV) via a digital camera (video plethysmography) has emerged as an area of great interest for biomedical and health applications. While a few implementations of video plethysmography have been demonstrated on smart phones under controlled lighting conditions, it has been challenging to create a general scalable solution due to the large variability in smart phone hardware performance, software architecture, and the variable response to lighting parameters. In this context, we present a selfcontained smart phone implementation of video plethysmography for Android OS, which employs both stochastic and deterministic algorithms, and we use this to study the effect of lighting parameters (illuminance, color spectrum) on the accuracy of the remote HR measurement. Using two different phone models, we present the median HR error for five different video plethysmography algorithms under three different types of lighting (natural sunlight, compact fluorescent, and halogen incandescent) and variations in brightness. For most algorithms, we found the optimum light brightness to be in the range 1000-4000 lux and the optimum lighting types to be compact fluorescent and natural light. Moderate errors were found for most algorithms with some devices under conditions of low-brightness (<;500 lux) and highbrightness (>4000 lux). Our analysis also identified camera frame rate jitter as a major source of variability and error across different phone models, but this can be largely corrected through non-linear resampling. Based on testing with six human subjects, our real-time Android implementation successfully predicted the measured HR with a median error of -0.31 bpm, and an inter-quartile range of 2.1bpm.
Missile signal processing common computer architecture for rapid technology upgrade
NASA Astrophysics Data System (ADS)
Rabinkin, Daniel V.; Rutledge, Edward; Monticciolo, Paul
2004-10-01
Interceptor missiles process IR images to locate an intended target and guide the interceptor towards it. Signal processing requirements have increased as the sensor bandwidth increases and interceptors operate against more sophisticated targets. A typical interceptor signal processing chain is comprised of two parts. Front-end video processing operates on all pixels of the image and performs such operations as non-uniformity correction (NUC), image stabilization, frame integration and detection. Back-end target processing, which tracks and classifies targets detected in the image, performs such algorithms as Kalman tracking, spectral feature extraction and target discrimination. In the past, video processing was implemented using ASIC components or FPGAs because computation requirements exceeded the throughput of general-purpose processors. Target processing was performed using hybrid architectures that included ASICs, DSPs and general-purpose processors. The resulting systems tended to be function-specific, and required custom software development. They were developed using non-integrated toolsets and test equipment was developed along with the processor platform. The lifespan of a system utilizing the signal processing platform often spans decades, while the specialized nature of processor hardware and software makes it difficult and costly to upgrade. As a result, the signal processing systems often run on outdated technology, algorithms are difficult to update, and system effectiveness is impaired by the inability to rapidly respond to new threats. A new design approach is made possible three developments; Moore's Law - driven improvement in computational throughput; a newly introduced vector computing capability in general purpose processors; and a modern set of open interface software standards. Today's multiprocessor commercial-off-the-shelf (COTS) platforms have sufficient throughput to support interceptor signal processing requirements. This application may be programmed under existing real-time operating systems using parallel processing software libraries, resulting in highly portable code that can be rapidly migrated to new platforms as processor technology evolves. Use of standardized development tools and 3rd party software upgrades are enabled as well as rapid upgrade of processing components as improved algorithms are developed. The resulting weapon system will have a superior processing capability over a custom approach at the time of deployment as a result of a shorter development cycles and use of newer technology. The signal processing computer may be upgraded over the lifecycle of the weapon system, and can migrate between weapon system variants enabled by modification simplicity. This paper presents a reference design using the new approach that utilizes an Altivec PowerPC parallel COTS platform. It uses a VxWorks-based real-time operating system (RTOS), and application code developed using an efficient parallel vector library (PVL). A quantification of computing requirements and demonstration of interceptor algorithm operating on this real-time platform are provided.
SEMG signal compression based on two-dimensional techniques.
de Melo, Wheidima Carneiro; de Lima Filho, Eddie Batista; da Silva Júnior, Waldir Sabino
2016-04-18
Recently, two-dimensional techniques have been successfully employed for compressing surface electromyographic (SEMG) records as images, through the use of image and video encoders. Such schemes usually provide specific compressors, which are tuned for SEMG data, or employ preprocessing techniques, before the two-dimensional encoding procedure, in order to provide a suitable data organization, whose correlations can be better exploited by off-the-shelf encoders. Besides preprocessing input matrices, one may also depart from those approaches and employ an adaptive framework, which is able to directly tackle SEMG signals reassembled as images. This paper proposes a new two-dimensional approach for SEMG signal compression, which is based on a recurrent pattern matching algorithm called multidimensional multiscale parser (MMP). The mentioned encoder was modified, in order to efficiently work with SEMG signals and exploit their inherent redundancies. Moreover, a new preprocessing technique, named as segmentation by similarity (SbS), which has the potential to enhance the exploitation of intra- and intersegment correlations, is introduced, the percentage difference sorting (PDS) algorithm is employed, with different image compressors, and results with the high efficiency video coding (HEVC), H.264/AVC, and JPEG2000 encoders are presented. Experiments were carried out with real isometric and dynamic records, acquired in laboratory. Dynamic signals compressed with H.264/AVC and HEVC, when combined with preprocessing techniques, resulted in good percent root-mean-square difference [Formula: see text] compression factor figures, for low and high compression factors, respectively. Besides, regarding isometric signals, the modified two-dimensional MMP algorithm outperformed state-of-the-art schemes, for low compression factors, the combination between SbS and HEVC proved to be competitive, for high compression factors, and JPEG2000, combined with PDS, provided good performance allied to low computational complexity, all in terms of percent root-mean-square difference [Formula: see text] compression factor. The proposed schemes are effective and, specifically, the modified MMP algorithm can be considered as an interesting alternative for isometric signals, regarding traditional SEMG encoders. Besides, the approach based on off-the-shelf image encoders has the potential of fast implementation and dissemination, given that many embedded systems may already have such encoders available, in the underlying hardware/software architecture.
NASA Astrophysics Data System (ADS)
Liu, Yu; Lin, Xiaocheng; Fan, Nianfei; Zhang, Lin
2016-01-01
Wireless video multicast has become one of the key technologies in wireless applications. But the main challenge of conventional wireless video multicast, i.e., the cliff effect, remains unsolved. To overcome the cliff effect, a hybrid digital-analog (HDA) video transmission framework based on SoftCast, which transmits the digital bitstream with the quantization residuals, is proposed. With an effective power allocation algorithm and appropriate parameter settings, the residual gains can be maximized; meanwhile, the digital bitstream can assure transmission of a basic video to the multicast receiver group. In the multiple-input multiple-output (MIMO) system, since nonuniform noise interference on different antennas can be regarded as the cliff effect problem, ParCast, which is a variation of SoftCast, is also applied to video transmission to solve it. The HDA scheme with corresponding power allocation algorithms is also applied to improve video performance. Simulations show that the proposed HDA scheme can overcome the cliff effect completely with the transmission of residuals. What is more, it outperforms the compared WSVC scheme by more than 2 dB when transmitting under the same bandwidth, and it can further improve performance by nearly 8 dB in MIMO when compared with the ParCast scheme.
Wireless visual sensor network resource allocation using cross-layer optimization
NASA Astrophysics Data System (ADS)
Bentley, Elizabeth S.; Matyjas, John D.; Medley, Michael J.; Kondi, Lisimachos P.
2009-01-01
In this paper, we propose an approach to manage network resources for a Direct Sequence Code Division Multiple Access (DS-CDMA) visual sensor network where nodes monitor scenes with varying levels of motion. It uses cross-layer optimization across the physical layer, the link layer and the application layer. Our technique simultaneously assigns a source coding rate, a channel coding rate, and a power level to all nodes in the network based on one of two criteria that maximize the quality of video of the entire network as a whole, subject to a constraint on the total chip rate. One criterion results in the minimal average end-to-end distortion amongst all nodes, while the other criterion minimizes the maximum distortion of the network. Our approach allows one to determine the capacity of the visual sensor network based on the number of nodes and the quality of video that must be transmitted. For bandwidth-limited applications, one can also determine the minimum bandwidth needed to accommodate a number of nodes with a specific target chip rate. Video captured by a sensor node camera is encoded and decoded using the H.264 video codec by a centralized control unit at the network layer. To reduce the computational complexity of the solution, Universal Rate-Distortion Characteristics (URDCs) are obtained experimentally to relate bit error probabilities to the distortion of corrupted video. Bit error rates are found first by using Viterbi's upper bounds on the bit error probability and second, by simulating nodes transmitting data spread by Total Square Correlation (TSC) codes over a Rayleigh-faded DS-CDMA channel and receiving that data using Auxiliary Vector (AV) filtering.
Dynamic frame resizing with convolutional neural network for efficient video compression
NASA Astrophysics Data System (ADS)
Kim, Jaehwan; Park, Youngo; Choi, Kwang Pyo; Lee, JongSeok; Jeon, Sunyoung; Park, JeongHoon
2017-09-01
In the past, video codecs such as vc-1 and H.263 used a technique to encode reduced-resolution video and restore original resolution from the decoder for improvement of coding efficiency. The techniques of vc-1 and H.263 Annex Q are called dynamic frame resizing and reduced-resolution update mode, respectively. However, these techniques have not been widely used due to limited performance improvements that operate well only under specific conditions. In this paper, video frame resizing (reduced/restore) technique based on machine learning is proposed for improvement of coding efficiency. The proposed method features video of low resolution made by convolutional neural network (CNN) in encoder and reconstruction of original resolution using CNN in decoder. The proposed method shows improved subjective performance over all the high resolution videos which are dominantly consumed recently. In order to assess subjective quality of the proposed method, Video Multi-method Assessment Fusion (VMAF) which showed high reliability among many subjective measurement tools was used as subjective metric. Moreover, to assess general performance, diverse bitrates are tested. Experimental results showed that BD-rate based on VMAF was improved by about 51% compare to conventional HEVC. Especially, VMAF values were significantly improved in low bitrate. Also, when the method is subjectively tested, it had better subjective visual quality in similar bit rate.
Ohira, Yoshiyuki; Uehara, Takanori; Noda, Kazutaka; Suzuki, Shingo; Shikino, Kiyoshi; Kajiwara, Hideki; Kondo, Takeshi; Hirota, Yusuke; Ikusaka, Masatomi
2017-01-01
Objectives We examined whether problem-based learning tutorials using patient-simulated videos showing daily life are more practical for clinical learning, compared with traditional paper-based problem-based learning, for the consideration rate of psychosocial issues and the recall rate for experienced learning. Methods Twenty-two groups with 120 fifth-year students were each assigned paper-based problem-based learning and video-based problem-based learning using patient-simulated videos. We compared target achievement rates in questionnaires using the Wilcoxon signed-rank test and discussion contents diversity using the Mann-Whitney U test. A follow-up survey used a chi-square test to measure students’ recall of cases in three categories: video, paper, and non-experienced. Results Video-based problem-based learning displayed significantly higher achievement rates for imagining authentic patients (p=0.001), incorporating a comprehensive approach including psychosocial aspects (p<0.001), and satisfaction with sessions (p=0.001). No significant differences existed in the discussion contents diversity regarding the International Classification of Primary Care Second Edition codes and chapter types or in the rate of psychological codes. In a follow-up survey comparing video and paper groups to non-experienced groups, the rates were higher for video (χ2=24.319, p<0.001) and paper (χ2=11.134, p=0.001). Although the video rate tended to be higher than the paper rate, no significant difference was found between the two. Conclusions Patient-simulated videos showing daily life facilitate imagining true patients and support a comprehensive approach that fosters better memory. The clinical patient-simulated video method is more practical and clinical problem-based tutorials can be implemented if we create patient-simulated videos for each symptom as teaching materials. PMID:28245193
Wilson, Michael E; Krupa, Artur; Hinds, Richard F; Litell, John M; Swetz, Keith M; Akhoundi, Abbasali; Kashyap, Rahul; Gajic, Ognjen; Kashani, Kianoush
2015-03-01
To determine if a video depicting cardiopulmonary resuscitation and resuscitation preference options would improve knowledge and decision making among patients and surrogates in the ICU. Randomized, unblinded trial. Single medical ICU. Patients and surrogate decision makers in the ICU. The usual care group received a standard pamphlet about cardiopulmonary resuscitation and cardiopulmonary resuscitation preference options plus routine code status discussions with clinicians. The video group received usual care plus an 8-minute video that depicted cardiopulmonary resuscitation, showed a simulated hospital code, and explained resuscitation preference options. One hundred three patients and surrogates were randomized to usual care. One hundred five patients and surrogates were randomized to video plus usual care. Median total knowledge scores (0-15 points possible for correct answers) in the video group were 13 compared with 10 in the usual care group, p value of less than 0.0001. Video group participants had higher rates of understanding the purpose of cardiopulmonary resuscitation and resuscitation options and terminology and could correctly name components of cardiopulmonary resuscitation. No statistically significant differences in documented resuscitation preferences following the interventions were found between the two groups, although the trial was underpowered to detect such differences. A majority of participants felt that the video was helpful in cardiopulmonary resuscitation decision making (98%) and would recommend the video to others (99%). A video depicting cardiopulmonary resuscitation and explaining resuscitation preference options was associated with improved knowledge of in-hospital cardiopulmonary resuscitation options and cardiopulmonary resuscitation terminology among patients and surrogate decision makers in the ICU, compared with receiving a pamphlet on cardiopulmonary resuscitation. Patients and surrogates found the video helpful in decision making and would recommend the video to others.
Ikegami, Akiko; Ohira, Yoshiyuki; Uehara, Takanori; Noda, Kazutaka; Suzuki, Shingo; Shikino, Kiyoshi; Kajiwara, Hideki; Kondo, Takeshi; Hirota, Yusuke; Ikusaka, Masatomi
2017-02-27
We examined whether problem-based learning tutorials using patient-simulated videos showing daily life are more practical for clinical learning, compared with traditional paper-based problem-based learning, for the consideration rate of psychosocial issues and the recall rate for experienced learning. Twenty-two groups with 120 fifth-year students were each assigned paper-based problem-based learning and video-based problem-based learning using patient-simulated videos. We compared target achievement rates in questionnaires using the Wilcoxon signed-rank test and discussion contents diversity using the Mann-Whitney U test. A follow-up survey used a chi-square test to measure students' recall of cases in three categories: video, paper, and non-experienced. Video-based problem-based learning displayed significantly higher achievement rates for imagining authentic patients (p=0.001), incorporating a comprehensive approach including psychosocial aspects (p<0.001), and satisfaction with sessions (p=0.001). No significant differences existed in the discussion contents diversity regarding the International Classification of Primary Care Second Edition codes and chapter types or in the rate of psychological codes. In a follow-up survey comparing video and paper groups to non-experienced groups, the rates were higher for video (χ 2 =24.319, p<0.001) and paper (χ 2 =11.134, p=0.001). Although the video rate tended to be higher than the paper rate, no significant difference was found between the two. Patient-simulated videos showing daily life facilitate imagining true patients and support a comprehensive approach that fosters better memory. The clinical patient-simulated video method is more practical and clinical problem-based tutorials can be implemented if we create patient-simulated videos for each symptom as teaching materials.
Design of batch audio/video conversion platform based on JavaEE
NASA Astrophysics Data System (ADS)
Cui, Yansong; Jiang, Lianpin
2018-03-01
With the rapid development of digital publishing industry, the direction of audio / video publishing shows the diversity of coding standards for audio and video files, massive data and other significant features. Faced with massive and diverse data, how to quickly and efficiently convert to a unified code format has brought great difficulties to the digital publishing organization. In view of this demand and present situation in this paper, basing on the development architecture of Sptring+SpringMVC+Mybatis, and combined with the open source FFMPEG format conversion tool, a distributed online audio and video format conversion platform with a B/S structure is proposed. Based on the Java language, the key technologies and strategies designed in the design of platform architecture are analyzed emphatically in this paper, designing and developing a efficient audio and video format conversion system, which is composed of “Front display system”, "core scheduling server " and " conversion server ". The test results show that, compared with the ordinary audio and video conversion scheme, the use of batch audio and video format conversion platform can effectively improve the conversion efficiency of audio and video files, and reduce the complexity of the work. Practice has proved that the key technology discussed in this paper can be applied in the field of large batch file processing, and has certain practical application value.
Takeda, Noriaki; Uno, Atsuhiko; Inohara, Hidenori; Shimada, Shoichi
2016-01-01
Background The mouse is the most commonly used animal model in biomedical research because of recent advances in molecular genetic techniques. Studies related to eye movement in mice are common in fields such as ophthalmology relating to vision, neuro-otology relating to the vestibulo-ocular reflex (VOR), neurology relating to the cerebellum’s role in movement, and psychology relating to attention. Recording eye movements in mice, however, is technically difficult. Methods We developed a new algorithm for analyzing the three-dimensional (3D) rotation vector of eye movement in mice using high-speed video-oculography (VOG). The algorithm made it possible to analyze the gain and phase of VOR using the eye’s angular velocity around the axis of eye rotation. Results When mice were rotated at 0.5 Hz and 2.5 Hz around the earth’s vertical axis with their heads in a 30° nose-down position, the vertical components of their left eye movements were in phase with the horizontal components. The VOR gain was 0.42 at 0.5 Hz and 0.74 at 2.5 Hz, and the phase lead of the eye movement against the turntable was 16.1° at 0.5 Hz and 4.88° at 2.5 Hz. Conclusions To the best of our knowledge, this is the first report of this algorithm being used to calculate a 3D rotation vector of eye movement in mice using high-speed VOG. We developed a technique for analyzing the 3D rotation vector of eye movements in mice with a high-speed infrared CCD camera. We concluded that the technique is suitable for analyzing eye movements in mice. We also include a C++ source code that can calculate the 3D rotation vectors of the eye position from two-dimensional coordinates of the pupil and the iris freckle in the image to this article. PMID:27023859
NASA Astrophysics Data System (ADS)
Rhzanov, Y.; Beaulieu, S.; Soule, S. A.; Shank, T.; Fornari, D.; Mayer, L. A.
2005-12-01
Many advances in understanding geologic, tectonic, biologic, and sedimentologic processes in the deep ocean are facilitated by direct observation of the seafloor. However, making such observations is both difficult and expensive. Optical systems (e.g., video, still camera, or direct observation) will always be constrained by the severe attenuation of light in the deep ocean, limiting the field of view to distances that are typically less than 10 meters. Acoustic systems can 'see' much larger areas, but at the cost of spatial resolution. Ultimately, scientists want to study and observe deep-sea processes in the same way we do land-based phenomena so that the spatial distribution and juxtaposition of processes and features can be resolved. We have begun development of algorithms that will, in near real-time, generate mosaics from video collected by deep-submergence vehicles. Mosaics consist of >>10 video frames and can cover 100's of square-meters. This work builds on a publicly available still and video mosaicking software package developed by Rzhanov and Mayer. Here we present the results of initial tests of data collection methodologies (e.g., transects across the seafloor and panoramas across features of interest), algorithm application, and GIS integration conducted during a recent cruise to the Eastern Galapagos Spreading Center (0 deg N, 86 deg W). We have developed a GIS database for the region that will act as a means to access and display mosaics within a geospatially-referenced framework. We have constructed numerous mosaics using both video and still imagery and assessed the quality of the mosaics (including registration errors) under different lighting conditions and with different navigation procedures. We have begun to develop algorithms for efficient and timely mosaicking of collected video as well as integration with navigation data for georeferencing the mosaics. Initial results indicate that operators must be properly versed in the control of the video systems as well as maintaining vehicle attitude and altitude in order to achieve the best results possible.
Annotations of Mexican bullfighting videos for semantic index
NASA Astrophysics Data System (ADS)
Montoya Obeso, Abraham; Oropesa Morales, Lester Arturo; Fernando Vázquez, Luis; Cocolán Almeda, Sara Ivonne; Stoian, Andrei; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Montiel Perez, Jesús Yalja; de la O Torres, Saul; Ramírez Acosta, Alejandro Alvaro
2015-09-01
The video annotation is important for web indexing and browsing systems. Indeed, in order to evaluate the performance of video query and mining techniques, databases with concept annotations are required. Therefore, it is necessary generate a database with a semantic indexing that represents the digital content of the Mexican bullfighting atmosphere. This paper proposes a scheme to make complex annotations in a video in the frame of multimedia search engine project. Each video is partitioned using our segmentation algorithm that creates shots of different length and different number of frames. In order to make complex annotations about the video, we use ELAN software. The annotations are done in two steps: First, we take note about the whole content in each shot. Second, we describe the actions as parameters of the camera like direction, position and deepness. As a consequence, we obtain a more complete descriptor of every action. In both cases we use the concepts of the TRECVid 2014 dataset. We also propose new concepts. This methodology allows to generate a database with the necessary information to create descriptors and algorithms capable to detect actions to automatically index and classify new bullfighting multimedia content.
Automated frame selection process for high-resolution microendoscopy
NASA Astrophysics Data System (ADS)
Ishijima, Ayumu; Schwarz, Richard A.; Shin, Dongsuk; Mondrik, Sharon; Vigneswaran, Nadarajah; Gillenwater, Ann M.; Anandasabapathy, Sharmila; Richards-Kortum, Rebecca
2015-04-01
We developed an automated frame selection algorithm for high-resolution microendoscopy video sequences. The algorithm rapidly selects a representative frame with minimal motion artifact from a short video sequence, enabling fully automated image analysis at the point-of-care. The algorithm was evaluated by quantitative comparison of diagnostically relevant image features and diagnostic classification results obtained using automated frame selection versus manual frame selection. A data set consisting of video sequences collected in vivo from 100 oral sites and 167 esophageal sites was used in the analysis. The area under the receiver operating characteristic curve was 0.78 (automated selection) versus 0.82 (manual selection) for oral sites, and 0.93 (automated selection) versus 0.92 (manual selection) for esophageal sites. The implementation of fully automated high-resolution microendoscopy at the point-of-care has the potential to reduce the number of biopsies needed for accurate diagnosis of precancer and cancer in low-resource settings where there may be limited infrastructure and personnel for standard histologic analysis.
Automated High-Speed Video Detection of Small-Scale Explosives Testing
NASA Astrophysics Data System (ADS)
Ford, Robert; Guymon, Clint
2013-06-01
Small-scale explosives sensitivity test data is used to evaluate hazards of processing, handling, transportation, and storage of energetic materials. Accurate test data is critical to implementation of engineering and administrative controls for personnel safety and asset protection. Operator mischaracterization of reactions during testing contributes to either excessive or inadequate safety protocols. Use of equipment and associated algorithms to aid the operator in reaction determination can significantly reduce operator error. Safety Management Services, Inc. has developed an algorithm to evaluate high-speed video images of sparks from an ESD (Electrostatic Discharge) machine to automatically determine whether or not a reaction has taken place. The algorithm with the high-speed camera is termed GoDetect (patent pending). An operator assisted version for friction and impact testing has also been developed where software is used to quickly process and store video of sensitivity testing. We have used this method for sensitivity testing with multiple pieces of equipment. We present the fundamentals of GoDetect and compare it to other methods used for reaction detection.
Space-time light field rendering.
Wang, Huamin; Sun, Mingxuan; Yang, Ruigang
2007-01-01
In this paper, we propose a novel framework called space-time light field rendering, which allows continuous exploration of a dynamic scene in both space and time. Compared to existing light field capture/rendering systems, it offers the capability of using unsynchronized video inputs and the added freedom of controlling the visualization in the temporal domain, such as smooth slow motion and temporal integration. In order to synthesize novel views from any viewpoint at any time instant, we develop a two-stage rendering algorithm. We first interpolate in the temporal domain to generate globally synchronized images using a robust spatial-temporal image registration algorithm followed by edge-preserving image morphing. We then interpolate these software-synchronized images in the spatial domain to synthesize the final view. In addition, we introduce a very accurate and robust algorithm to estimate subframe temporal offsets among input video sequences. Experimental results from unsynchronized videos with or without time stamps show that our approach is capable of maintaining photorealistic quality from a variety of real scenes.
NASA Astrophysics Data System (ADS)
Sood, Suresh; Pattinson, Hugh
Traditionally, face-to-face negotiations in the real world have not been looked at as a complex systems interaction of actors resulting in a dynamic and potentially emergent system. If indeed negotiations are an outcome of a dynamic interaction of simpler behavior just as with a complex system, we should be able to see the patterns contributing to the complexities of a negotiation under study. This paper and the supporting research sets out to show B2B (business-to-business) negotiations as complex systems of interacting actors exhibiting dynamic and emergent behavior. This paper discusses the exploratory research based on negotiation simulations in which a large number of business students participate as buyers and sellers. The student interactions are captured on video and a purpose built research method attempts to look for patterns of interactions between actors using visualization techniques traditionally reserved to observe the algorithmic complexity of complex systems. Students are videoed negotiating with partners. Each video is tagged according to a recognized classification and coding scheme for negotiations. The classification relates to the phases through which any particular negotiation might pass, such as laughter, aggression, compromise, and so forth — through some 30 possible categories. Were negotiations more or less successful if they progressed through the categories in different ways? Furthermore, does the data depict emergent pathway segments considered to be more or less successful? This focus on emergence within the data provides further strong support for face-to-face (F2F) negotiations to be construed as complex systems.
User-oriented summary extraction for soccer video based on multimodal analysis
NASA Astrophysics Data System (ADS)
Liu, Huayong; Jiang, Shanshan; He, Tingting
2011-11-01
An advanced user-oriented summary extraction method for soccer video is proposed in this work. Firstly, an algorithm of user-oriented summary extraction for soccer video is introduced. A novel approach that integrates multimodal analysis, such as extraction and analysis of the stadium features, moving object features, audio features and text features is introduced. By these features the semantic of the soccer video and the highlight mode are obtained. Then we can find the highlight position and put them together by highlight degrees to obtain the video summary. The experimental results for sports video of world cup soccer games indicate that multimodal analysis is effective for soccer video browsing and retrieval.
A novel key-frame extraction approach for both video summary and video index.
Lei, Shaoshuai; Xie, Gang; Yan, Gaowei
2014-01-01
Existing key-frame extraction methods are basically video summary oriented; yet the index task of key-frames is ignored. This paper presents a novel key-frame extraction approach which can be available for both video summary and video index. First a dynamic distance separability algorithm is advanced to divide a shot into subshots based on semantic structure, and then appropriate key-frames are extracted in each subshot by SVD decomposition. Finally, three evaluation indicators are proposed to evaluate the performance of the new approach. Experimental results show that the proposed approach achieves good semantic structure for semantics-based video index and meanwhile produces video summary consistent with human perception.
Protection of HEVC Video Delivery in Vehicular Networks with RaptorQ Codes
Martínez-Rach, Miguel; López, Otoniel; Malumbres, Manuel Pérez
2014-01-01
With future vehicles equipped with processing capability, storage, and communications, vehicular networks will become a reality. A vast number of applications will arise that will make use of this connectivity. Some of them will be based on video streaming. In this paper we focus on HEVC video coding standard streaming in vehicular networks and how it deals with packet losses with the aid of RaptorQ, a Forward Error Correction scheme. As vehicular networks are packet loss prone networks, protection mechanisms are necessary if we want to guarantee a minimum level of quality of experience to the final user. We have run simulations to evaluate which configurations fit better in this type of scenarios. PMID:25136675
Ladner, Travis R; Greenberg, Jacob K; Guerrero, Nicole; Olsen, Margaret A; Shannon, Chevis N; Yarbrough, Chester K; Piccirillo, Jay F; Anderson, Richard C E; Feldstein, Neil A; Wellons, John C; Smyth, Matthew D; Park, Tae Sung; Limbrick, David D
2016-05-01
OBJECTIVE Administrative billing data may facilitate large-scale assessments of treatment outcomes for pediatric Chiari malformation Type I (CM-I). Validated International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) code algorithms for identifying CM-I surgery are critical prerequisites for such studies but are currently only available for adults. The objective of this study was to validate two ICD-9-CM code algorithms using hospital billing data to identify pediatric patients undergoing CM-I decompression surgery. METHODS The authors retrospectively analyzed the validity of two ICD-9-CM code algorithms for identifying pediatric CM-I decompression surgery performed at 3 academic medical centers between 2001 and 2013. Algorithm 1 included any discharge diagnosis code of 348.4 (CM-I), as well as a procedure code of 01.24 (cranial decompression) or 03.09 (spinal decompression or laminectomy). Algorithm 2 restricted this group to the subset of patients with a primary discharge diagnosis of 348.4. The positive predictive value (PPV) and sensitivity of each algorithm were calculated. RESULTS Among 625 first-time admissions identified by Algorithm 1, the overall PPV for CM-I decompression was 92%. Among the 581 admissions identified by Algorithm 2, the PPV was 97%. The PPV for Algorithm 1 was lower in one center (84%) compared with the other centers (93%-94%), whereas the PPV of Algorithm 2 remained high (96%-98%) across all subgroups. The sensitivity of Algorithms 1 (91%) and 2 (89%) was very good and remained so across subgroups (82%-97%). CONCLUSIONS An ICD-9-CM algorithm requiring a primary diagnosis of CM-I has excellent PPV and very good sensitivity for identifying CM-I decompression surgery in pediatric patients. These results establish a basis for utilizing administrative billing data to assess pediatric CM-I treatment outcomes.
Subjective quality evaluation of low-bit-rate video
NASA Astrophysics Data System (ADS)
Masry, Mark; Hemami, Sheila S.; Osberger, Wilfried M.; Rohaly, Ann M.
2001-06-01
A subjective quality evaluation was performed to qualify vie4wre responses to visual defects that appear in low bit rate video at full and reduced frame rates. The stimuli were eight sequences compressed by three motion compensated encoders - Sorenson Video, H.263+ and a Wavelet based coder - operating at five bit/frame rate combinations. The stimulus sequences exhibited obvious coding artifacts whose nature differed across the three coders. The subjective evaluation was performed using the Single Stimulus Continuos Quality Evaluation method of UTI-R Rec. BT.500-8. Viewers watched concatenated coded test sequences and continuously registered the perceived quality using a slider device. Data form 19 viewers was colleted. An analysis of their responses to the presence of various artifacts across the range of possible coding conditions and content is presented. The effects of blockiness and blurriness on perceived quality are examined. The effects of changes in frame rate on perceived quality are found to be related to the nature of the motion in the sequence.
Novel inter and intra prediction tools under consideration for the emerging AV1 video codec
NASA Astrophysics Data System (ADS)
Joshi, Urvang; Mukherjee, Debargha; Han, Jingning; Chen, Yue; Parker, Sarah; Su, Hui; Chiang, Angie; Xu, Yaowu; Liu, Zoe; Wang, Yunqing; Bankoski, Jim; Wang, Chen; Keyder, Emil
2017-09-01
Google started the WebM Project in 2010 to develop open source, royalty- free video codecs designed specifically for media on the Web. The second generation codec released by the WebM project, VP9, is currently served by YouTube, and enjoys billions of views per day. Realizing the need for even greater compression efficiency to cope with the growing demand for video on the web, the WebM team embarked on an ambitious project to develop a next edition codec AV1, in a consortium of major tech companies called the Alliance for Open Media, that achieves at least a generational improvement in coding efficiency over VP9. In this paper, we focus primarily on new tools in AV1 that improve the prediction of pixel blocks before transforms, quantization and entropy coding are invoked. Specifically, we describe tools and coding modes that improve intra, inter and combined inter-intra prediction. Results are presented on standard test sets.
Detection of Abnormal Events via Optical Flow Feature Analysis
Wang, Tian; Snoussi, Hichem
2015-01-01
In this paper, a novel algorithm is proposed to detect abnormal events in video streams. The algorithm is based on the histogram of the optical flow orientation descriptor and the classification method. The details of the histogram of the optical flow orientation descriptor are illustrated for describing movement information of the global video frame or foreground frame. By combining one-class support vector machine and kernel principal component analysis methods, the abnormal events in the current frame can be detected after a learning period characterizing normal behaviors. The difference abnormal detection results are analyzed and explained. The proposed detection method is tested on benchmark datasets, then the experimental results show the effectiveness of the algorithm. PMID:25811227