video processing algorithms: Topics by Science.gov

Sample records for video processing algorithms

Digital codec for real-time processing of broadcast quality video signals at 1.8 bits/pixel

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary JO; Whyte, Wayne A., Jr.

1989-01-01

The authors present the hardware implementation of a digital television bandwidth compression algorithm which processes standard NTSC (National Television Systems Committee) composite color television signals and produces broadcast-quality video in real time at an average of 1.8 b/pixel. The sampling rate used with this algorithm results in 768 samples over the active portion of each video line by 512 active video lines per video frame. The algorithm is based on differential pulse code modulation (DPCM), but additionally utilizes a nonadaptive predictor, nonuniform quantizer, and multilevel Huffman coder to reduce the data rate substantially below that achievable with straight DPCM. The nonadaptive predictor and multilevel Huffman coder combine to set this technique apart from prior-art DPCM encoding algorithms. The authors describe the data compression algorithm and the hardware implementation of the codec and provide performance results.
Algorithm-Based Motion Magnification for Video Processing in Urological Laparoscopy.

PubMed

Adams, Fabian; Schoelly, Reto; Schlager, Daniel; Schoenthaler, Martin; Schoeb, Dominik S; Wilhelm, Konrad; Hein, Simon; Wetterauer, Ulrich; Miernik, Arkadiusz

2017-06-01

Minimally invasive surgery is in constant further development and has replaced many conventional operative procedures. If vascular structure movement could be detected during these procedures, it could reduce the risk of vascular injury and conversion to open surgery. The recently proposed motion-amplifying algorithm, Eulerian Video Magnification (EVM), has been shown to substantially enhance minimal object changes in digitally recorded video that is barely perceptible to the human eye. We adapted and examined this technology for use in urological laparoscopy. Video sequences of routine urological laparoscopic interventions were recorded and further processed using spatial decomposition and filtering algorithms. The freely available EVM algorithm was investigated for its usability in real-time processing. In addition, a new image processing technology, the CRS iimotion Motion Magnification (CRSMM) algorithm, was specifically adjusted for endoscopic requirements, applied, and validated by our working group. Using EVM, no significant motion enhancement could be detected without severe impairment of the image resolution, motion, and color presentation. The CRSMM algorithm significantly improved image quality in terms of motion enhancement. In particular, the pulsation of vascular structures could be displayed more accurately than in EVM. Motion magnification image processing technology has the potential for clinical importance as a video optimizing modality in endoscopic and laparoscopic surgery. Barely detectable (micro)movements can be visualized using this noninvasive marker-free method. Despite these optimistic results, the technology requires considerable further technical development and clinical tests.
Compressive Video Recovery Using Block Match Multi-Frame Motion Estimation Based on Single Pixel Cameras

PubMed Central

Bi, Sheng; Zeng, Xiao; Tang, Xin; Qin, Shujia; Lai, King Wai Chiu

2016-01-01

Compressive sensing (CS) theory has opened up new paths for the development of signal processing applications. Based on this theory, a novel single pixel camera architecture has been introduced to overcome the current limitations and challenges of traditional focal plane arrays. However, video quality based on this method is limited by existing acquisition and recovery methods, and the method also suffers from being time-consuming. In this paper, a multi-frame motion estimation algorithm is proposed in CS video to enhance the video quality. The proposed algorithm uses multiple frames to implement motion estimation. Experimental results show that using multi-frame motion estimation can improve the quality of recovered videos. To further reduce the motion estimation time, a block match algorithm is used to process motion estimation. Experiments demonstrate that using the block match algorithm can reduce motion estimation time by 30%. PMID:26950127
Gas leak detection in infrared video with background modeling

NASA Astrophysics Data System (ADS)

Zeng, Xiaoxia; Huang, Likun

2018-03-01

Background modeling plays an important role in the task of gas detection based on infrared video. VIBE algorithm is a widely used background modeling algorithm in recent years. However, the processing speed of the VIBE algorithm sometimes cannot meet the requirements of some real time detection applications. Therefore, based on the traditional VIBE algorithm, we propose a fast prospect model and optimize the results by combining the connected domain algorithm and the nine-spaces algorithm in the following processing steps. Experiments show the effectiveness of the proposed method.
Shadow Detection Based on Regions of Light Sources for Object Extraction in Nighttime Video

PubMed Central

Lee, Gil-beom; Lee, Myeong-jin; Lee, Woo-Kyung; Park, Joo-heon; Kim, Tae-Hwan

2017-01-01

Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object’s vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos. PMID:28327515
Real-time demonstration hardware for enhanced DPCM video compression algorithm

NASA Technical Reports Server (NTRS)

Bizon, Thomas P.; Whyte, Wayne A., Jr.; Marcopoli, Vincent R.

1992-01-01

The lack of available wideband digital links as well as the complexity of implementation of bandwidth efficient digital video CODECs (encoder/decoder) has worked to keep the cost of digital television transmission too high to compete with analog methods. Terrestrial and satellite video service providers, however, are now recognizing the potential gains that digital video compression offers and are proposing to incorporate compression systems to increase the number of available program channels. NASA is similarly recognizing the benefits of and trend toward digital video compression techniques for transmission of high quality video from space and therefore, has developed a digital television bandwidth compression algorithm to process standard National Television Systems Committee (NTSC) composite color television signals. The algorithm is based on differential pulse code modulation (DPCM), but additionally utilizes a non-adaptive predictor, non-uniform quantizer and multilevel Huffman coder to reduce the data rate substantially below that achievable with straight DPCM. The non-adaptive predictor and multilevel Huffman coder combine to set this technique apart from other DPCM encoding algorithms. All processing is done on a intra-field basis to prevent motion degradation and minimize hardware complexity. Computer simulations have shown the algorithm will produce broadcast quality reconstructed video at an average transmission rate of 1.8 bits/pixel. Hardware implementation of the DPCM circuit, non-adaptive predictor and non-uniform quantizer has been completed, providing realtime demonstration of the image quality at full video rates. Video sampling/reconstruction circuits have also been constructed to accomplish the analog video processing necessary for the real-time demonstration. Performance results for the completed hardware compare favorably with simulation results. Hardware implementation of the multilevel Huffman encoder/decoder is currently under development along with implementation of a buffer control algorithm to accommodate the variable data rate output of the multilevel Huffman encoder. A video CODEC of this type could be used to compress NTSC color television signals where high quality reconstruction is desirable (e.g., Space Station video transmission, transmission direct-to-the-home via direct broadcast satellite systems or cable television distribution to system headends and direct-to-the-home).
Hyperspectral processing in graphical processing units

NASA Astrophysics Data System (ADS)

Winter, Michael E.; Winter, Edwin M.

2011-06-01

With the advent of the commercial 3D video card in the mid 1990s, we have seen an order of magnitude performance increase with each generation of new video cards. While these cards were designed primarily for visualization and video games, it became apparent after a short while that they could be used for scientific purposes. These Graphical Processing Units (GPUs) are rapidly being incorporated into data processing tasks usually reserved for general purpose computers. It has been found that many image processing problems scale well to modern GPU systems. We have implemented four popular hyperspectral processing algorithms (N-FINDR, linear unmixing, Principal Components, and the RX anomaly detection algorithm). These algorithms show an across the board speedup of at least a factor of 10, with some special cases showing extreme speedups of a hundred times or more.
CVD2014-A Database for Evaluating No-Reference Video Quality Assessment Algorithms.

PubMed

Nuutinen, Mikko; Virtanen, Toni; Vaahteranoksa, Mikko; Vuori, Tero; Oittinen, Pirkko; Hakkinen, Jukka

2016-07-01

In this paper, we present a new video database: CVD2014-Camera Video Database. In contrast to previous video databases, this database uses real cameras rather than introducing distortions via post-processing, which results in a complex distortion space in regard to the video acquisition process. CVD2014 contains a total of 234 videos that are recorded using 78 different cameras. Moreover, this database contains the observer-specific quality evaluation scores rather than only providing mean opinion scores. We have also collected open-ended quality descriptions that are provided by the observers. These descriptions were used to define the quality dimensions for the videos in CVD2014. The dimensions included sharpness, graininess, color balance, darkness, and jerkiness. At the end of this paper, a performance study of image and video quality algorithms for predicting the subjective video quality is reported. For this performance study, we proposed a new performance measure that accounts for observer variance. The performance study revealed that there is room for improvement regarding the video quality assessment algorithms. The CVD2014 video database has been made publicly available for the research community. All video sequences and corresponding subjective ratings can be obtained from the CVD2014 project page (http://www.helsinki.fi/psychology/groups/visualcognition/).
Variable disparity-motion estimation based fast three-view video coding

NASA Astrophysics Data System (ADS)

Bae, Kyung-Hoon; Kim, Seung-Cheol; Hwang, Yong Seok; Kim, Eun-Soo

2009-02-01

In this paper, variable disparity-motion estimation (VDME) based 3-view video coding is proposed. In the encoding, key-frame coding (KFC) based motion estimation and variable disparity estimation (VDE) for effectively fast three-view video encoding are processed. These proposed algorithms enhance the performance of 3-D video encoding/decoding system in terms of accuracy of disparity estimation and computational overhead. From some experiments, stereo sequences of 'Pot Plant' and 'IVO', it is shown that the proposed algorithm's PSNRs is 37.66 and 40.55 dB, and the processing time is 0.139 and 0.124 sec/frame, respectively.
A complexity-scalable software-based MPEG-2 video encoder.

PubMed

Chen, Guo-bin; Lu, Xin-ning; Wang, Xing-guo; Liu, Ji-lin

2004-05-01

With the development of general-purpose processors (GPP) and video signal processing algorithms, it is possible to implement a software-based real-time video encoder on GPP, and its low cost and easy upgrade attract developers' interests to transfer video encoding from specialized hardware to more flexible software. In this paper, the encoding structure is set up first to support complexity scalability; then a lot of high performance algorithms are used on the key time-consuming modules in coding process; finally, at programming level, processor characteristics are considered to improve data access efficiency and processing parallelism. Other programming methods such as lookup table are adopted to reduce the computational complexity. Simulation results showed that these ideas could not only improve the global performance of video coding, but also provide great flexibility in complexity regulation.
Digital CODEC for real-time processing of broadcast quality video signals at 1.8 bits/pixel

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary JO; Whyte, Wayne A., Jr.

1989-01-01

Advances in very large-scale integration and recent work in the field of bandwidth efficient digital modulation techniques have combined to make digital video processing technically feasible and potentially cost competitive for broadcast quality television transmission. A hardware implementation was developed for a DPCM-based digital television bandwidth compression algorithm which processes standard NTSC composite color television signals and produces broadcast quality video in real time at an average of 1.8 bits/pixel. The data compression algorithm and the hardware implementation of the CODEC are described, and performance results are provided.
Digital CODEC for real-time processing of broadcast quality video signals at 1.8 bits/pixel

NASA Technical Reports Server (NTRS)

Shalkhauser, Mary JO; Whyte, Wayne A.

1991-01-01

Advances in very large scale integration and recent work in the field of bandwidth efficient digital modulation techniques have combined to make digital video processing technically feasible an potentially cost competitive for broadcast quality television transmission. A hardware implementation was developed for DPCM (differential pulse code midulation)-based digital television bandwidth compression algorithm which processes standard NTSC composite color television signals and produces broadcast quality video in real time at an average of 1.8 bits/pixel. The data compression algorithm and the hardware implementation of the codec are described, and performance results are provided.
Real time mitigation of atmospheric turbulence in long distance imaging using the lucky region fusion algorithm with FPGA and GPU hardware acceleration

NASA Astrophysics Data System (ADS)

Jackson, Christopher Robert

"Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm selects sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient sequential processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors. This thesis describes two hardware implementations of the LRF algorithm to achieve real-time image processing. The first was created with a VIRTEX-7 field programmable gate array (FPGA). The other developed using the graphics processing unit (GPU) of a NVIDIA GeForce GTX 690 video card. The novelty in the FPGA approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link video output. We also describe a custom hardware simulation environment we have built to test the FPGA LRF implementation. The advantage of the GPU approach is significantly improved development time, integration of image stabilization into the system, and comparable atmospheric turbulence mitigation.
The role of optical flow in automated quality assessment of full-motion video

NASA Astrophysics Data System (ADS)

Harguess, Josh; Shafer, Scott; Marez, Diego

2017-09-01

In real-world video data, such as full-motion-video (FMV) taken from unmanned vehicles, surveillance systems, and other sources, various corruptions to the raw data is inevitable. This can be due to the image acquisition process, noise, distortion, and compression artifacts, among other sources of error. However, we desire methods to analyze the quality of the video to determine whether the underlying content of the corrupted video can be analyzed by humans or machines and to what extent. Previous approaches have shown that motion estimation, or optical flow, can be an important cue in automating this video quality assessment. However, there are many different optical flow algorithms in the literature, each with their own advantages and disadvantages. We examine the effect of the choice of optical flow algorithm (including baseline and state-of-the-art), on motionbased automated video quality assessment algorithms.
Simultaneous compression and encryption for secure real-time secure transmission of sensitive video transmission

NASA Astrophysics Data System (ADS)

Al-Hayani, Nazar; Al-Jawad, Naseer; Jassim, Sabah A.

2014-05-01

Video compression and encryption became very essential in a secured real time video transmission. Applying both techniques simultaneously is one of the challenges where the size and the quality are important in multimedia transmission. In this paper we proposed a new technique for video compression and encryption. Both encryption and compression are based on edges extracted from the high frequency sub-bands of wavelet decomposition. The compression algorithm based on hybrid of: discrete wavelet transforms, discrete cosine transform, vector quantization, wavelet based edge detection, and phase sensing. The compression encoding algorithm treats the video reference and non-reference frames in two different ways. The encryption algorithm utilized A5 cipher combined with chaotic logistic map to encrypt the significant parameters and wavelet coefficients. Both algorithms can be applied simultaneously after applying the discrete wavelet transform on each individual frame. Experimental results show that the proposed algorithms have the following features: high compression, acceptable quality, and resistance to the statistical and bruteforce attack with low computational processing.
Video image processing

NASA Technical Reports Server (NTRS)

Murray, N. D.

1985-01-01

Current technology projections indicate a lack of availability of special purpose computing for Space Station applications. Potential functions for video image special purpose processing are being investigated, such as smoothing, enhancement, restoration and filtering, data compression, feature extraction, object detection and identification, pixel interpolation/extrapolation, spectral estimation and factorization, and vision synthesis. Also, architectural approaches are being identified and a conceptual design generated. Computationally simple algorithms will be research and their image/vision effectiveness determined. Suitable algorithms will be implimented into an overall architectural approach that will provide image/vision processing at video rates that are flexible, selectable, and programmable. Information is given in the form of charts, diagrams and outlines.
Research of real-time video processing system based on 6678 multi-core DSP

NASA Astrophysics Data System (ADS)

Li, Xiangzhen; Xie, Xiaodan; Yin, Xiaoqiang

2017-10-01

In the information age, the rapid development in the direction of intelligent video processing, complex algorithm proposed the powerful challenge on the performance of the processor. In this article, through the FPGA + TMS320C6678 frame structure, the image to fog, merge into an organic whole, to stabilize the image enhancement, its good real-time, superior performance, break through the traditional function of video processing system is simple, the product defects such as single, solved the video application in security monitoring, video, etc. Can give full play to the video monitoring effectiveness, improve enterprise economic benefits.
Two-dimensional thermal video analysis of offshore bird and bat flight

DOE PAGES

Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.

2015-09-11

Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals.more » We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.« less
Two-dimensional thermal video analysis of offshore bird and bat flight

DOE Office of Scientific and Technical Information (OSTI.GOV)

Matzner, Shari; Cullinan, Valerie I.; Duberstein, Corey A.

Thermal infrared video can provide essential information about bird and bat presence and activity for risk assessment studies, but the analysis of recorded video can be time-consuming and may not extract all of the available information. Automated processing makes continuous monitoring over extended periods of time feasible, and maximizes the information provided by video. This is especially important for collecting data in remote locations that are difficult for human observers to access, such as proposed offshore wind turbine sites. We present guidelines for selecting an appropriate thermal camera based on environmental conditions and the physical characteristics of the target animals.more » We developed new video image processing algorithms that automate the extraction of bird and bat flight tracks from thermal video, and that characterize the extracted tracks to support animal identification and behavior inference. The algorithms use a video peak store process followed by background masking and perceptual grouping to extract flight tracks. The extracted tracks are automatically quantified in terms that could then be used to infer animal type and possibly behavior. The developed automated processing generates results that are reproducible and verifiable, and reduces the total amount of video data that must be retained and reviewed by human experts. Finally, we suggest models for interpreting thermal imaging information.« less
Optimized static and video EEG rapid serial visual presentation (RSVP) paradigm based on motion surprise computation

NASA Astrophysics Data System (ADS)

Khosla, Deepak; Huber, David J.; Bhattacharyya, Rajan

2017-05-01

In this paper, we describe an algorithm and system for optimizing search and detection performance for "items of interest" (IOI) in large-sized images and videos that employ the Rapid Serial Visual Presentation (RSVP) based EEG paradigm and surprise algorithms that incorporate motion processing to determine whether static or video RSVP is used. The system works by first computing a motion surprise map on image sub-regions (chips) of incoming sensor video data and then uses those surprise maps to label the chips as either "static" or "moving". This information tells the system whether to use a static or video RSVP presentation and decoding algorithm in order to optimize EEG based detection of IOI in each chip. Using this method, we are able to demonstrate classification of a series of image regions from video with an azimuth value of 1, indicating perfect classification, over a range of display frequencies and video speeds.

Near real-time, on-the-move software PED using VPEF

NASA Astrophysics Data System (ADS)

Green, Kevin; Geyer, Chris; Burnette, Chris; Agarwal, Sanjeev; Swett, Bruce; Phan, Chung; Deterline, Diane

2015-05-01

The scope of the Micro-Cloud for Operational, Vehicle-Based EO-IR Reconnaissance System (MOVERS) development effort, managed by the Night Vision and Electronic Sensors Directorate (NVESD), is to develop, integrate, and demonstrate new sensor technologies and algorithms that improve improvised device/mine detection using efficient and effective exploitation and fusion of sensor data and target cues from existing and future Route Clearance Package (RCP) sensor systems. Unfortunately, the majority of forward looking Full Motion Video (FMV) and computer vision processing, exploitation, and dissemination (PED) algorithms are often developed using proprietary, incompatible software. This makes the insertion of new algorithms difficult due to the lack of standardized processing chains. In order to overcome these limitations, EOIR developed the Government off-the-shelf (GOTS) Video Processing and Exploitation Framework (VPEF) to be able to provide standardized interfaces (e.g., input/output video formats, sensor metadata, and detected objects) for exploitation software and to rapidly integrate and test computer vision algorithms. EOIR developed a vehicle-based computing framework within the MOVERS and integrated it with VPEF. VPEF was further enhanced for automated processing, detection, and publishing of detections in near real-time, thus improving the efficiency and effectiveness of RCP sensor systems.
The reliability and accuracy of estimating heart-rates from RGB video recorded on a consumer grade camera

NASA Astrophysics Data System (ADS)

Eaton, Adam; Vincely, Vinoin; Lloyd, Paige; Hugenberg, Kurt; Vishwanath, Karthik

2017-03-01

Video Photoplethysmography (VPPG) is a numerical technique to process standard RGB video data of exposed human skin and extracting the heart-rate (HR) from the skin areas. Being a non-contact technique, VPPG has the potential to provide estimates of subject's heart-rate, respiratory rate, and even the heart rate variability of human subjects with potential applications ranging from infant monitors, remote healthcare and psychological experiments, particularly given the non-contact and sensor-free nature of the technique. Though several previous studies have reported successful correlations in HR obtained using VPPG algorithms to HR measured using the gold-standard electrocardiograph, others have reported that these correlations are dependent on controlling for duration of the video-data analyzed, subject motion, and ambient lighting. Here, we investigate the ability of two commonly used VPPG-algorithms in extraction of human heart-rates under three different laboratory conditions. We compare the VPPG HR values extracted across these three sets of experiments to the gold-standard values acquired by using an electrocardiogram or a commercially available pulseoximeter. The two VPPG-algorithms were applied with and without KLT-facial feature tracking and detection algorithms from the Computer Vision MATLAB® toolbox. Results indicate that VPPG based numerical approaches have the ability to provide robust estimates of subject HR values and are relatively insensitive to the devices used to record the video data. However, they are highly sensitive to conditions of video acquisition including subject motion, the location, size and averaging techniques applied to regions-of-interest as well as to the number of video frames used for data processing.
Evaluation of Moving Object Detection Based on Various Input Noise Using Fixed Camera

NASA Astrophysics Data System (ADS)

Kiaee, N.; Hashemizadeh, E.; Zarrinpanjeh, N.

2017-09-01

Detecting and tracking objects in video has been as a research area of interest in the field of image processing and computer vision. This paper evaluates the performance of a novel method for object detection algorithm in video sequences. This process helps us to know the advantage of this method which is being used. The proposed framework compares the correct and wrong detection percentage of this algorithm. This method was evaluated with the collected data in the field of urban transport which include car and pedestrian in fixed camera situation. The results show that the accuracy of the algorithm will decreases because of image resolution reduction.
Automatic attention-based prioritization of unconstrained video for compression

NASA Astrophysics Data System (ADS)

Itti, Laurent

2004-06-01

We apply a biologically-motivated algorithm that selects visually-salient regions of interest in video streams to multiply-foveated video compression. Regions of high encoding priority are selected based on nonlinear integration of low-level visual cues, mimicking processing in primate occipital and posterior parietal cortex. A dynamic foveation filter then blurs (foveates) every frame, increasingly with distance from high-priority regions. Two variants of the model (one with continuously-variable blur proportional to saliency at every pixel, and the other with blur proportional to distance from three independent foveation centers) are validated against eye fixations from 4-6 human observers on 50 video clips (synthetic stimuli, video games, outdoors day and night home video, television newscast, sports, talk-shows, etc). Significant overlap is found between human and algorithmic foveations on every clip with one variant, and on 48 out of 50 clips with the other. Substantial compressed file size reductions by a factor 0.5 on average are obtained for foveated compared to unfoveated clips. These results suggest a general-purpose usefulness of the algorithm in improving compression ratios of unconstrained video.
Processor core for real time background identification of HD video based on OpenCV Gaussian mixture model algorithm

NASA Astrophysics Data System (ADS)

Genovese, Mariangela; Napoli, Ettore

2013-05-01

The identification of moving objects is a fundamental step in computer vision processing chains. The development of low cost and lightweight smart cameras steadily increases the request of efficient and high performance circuits able to process high definition video in real time. The paper proposes two processor cores aimed to perform the real time background identification on High Definition (HD, 1920 1080 pixel) video streams. The implemented algorithm is the OpenCV version of the Gaussian Mixture Model (GMM), an high performance probabilistic algorithm for the segmentation of the background that is however computationally intensive and impossible to implement on general purpose CPU with the constraint of real time processing. In the proposed paper, the equations of the OpenCV GMM algorithm are optimized in such a way that a lightweight and low power implementation of the algorithm is obtained. The reported performances are also the result of the use of state of the art truncated binary multipliers and ROM compression techniques for the implementation of the non-linear functions. The first circuit has commercial FPGA devices as a target and provides speed and logic resource occupation that overcome previously proposed implementations. The second circuit is oriented to an ASIC (UMC-90nm) standard cell implementation. Both implementations are able to process more than 60 frames per second in 1080p format, a frame rate compatible with HD television.
Incremental principal component pursuit for video background modeling

DOEpatents

Rodriquez-Valderrama, Paul A.; Wohlberg, Brendt

2017-03-14

An incremental Principal Component Pursuit (PCP) algorithm for video background modeling that is able to process one frame at a time while adapting to changes in background, with a computational complexity that allows for real-time processing, having a low memory footprint and is robust to translational and rotational jitter.
An algorithm for power line detection and warning based on a millimeter-wave radar video.

PubMed

Ma, Qirong; Goshi, Darren S; Shih, Yi-Chi; Sun, Ming-Ting

2011-12-01

Power-line-strike accident is a major safety threat for low-flying aircrafts such as helicopters, thus an automatic warning system to power lines is highly desirable. In this paper we propose an algorithm for detecting power lines from radar videos from an active millimeter-wave sensor. Hough Transform is employed to detect candidate lines. The major challenge is that the radar videos are very noisy due to ground return. The noise points could fall on the same line which results in signal peaks after Hough Transform similar to the actual cable lines. To differentiate the cable lines from the noise lines, we train a Support Vector Machine to perform the classification. We exploit the Bragg pattern, which is due to the diffraction of electromagnetic wave on the periodic surface of power lines. We propose a set of features to represent the Bragg pattern for the classifier. We also propose a slice-processing algorithm which supports parallel processing, and improves the detection of cables in a cluttered background. Lastly, an adaptive algorithm is proposed to integrate the detection results from individual frames into a reliable video detection decision, in which temporal correlation of the cable pattern across frames is used to make the detection more robust. Extensive experiments with real-world data validated the effectiveness of our cable detection algorithm. © 2011 IEEE
Improved segmentation of occluded and adjoining vehicles in traffic surveillance videos

NASA Astrophysics Data System (ADS)

Juneja, Medha; Grover, Priyanka

2013-12-01

Occlusion in image processing refers to concealment of any part of the object or the whole object from view of an observer. Real time videos captured by static cameras on roads often encounter overlapping and hence, occlusion of vehicles. Occlusion in traffic surveillance videos usually occurs when an object which is being tracked is hidden by another object. This makes it difficult for the object detection algorithms to distinguish all the vehicles efficiently. Also morphological operations tend to join the close proximity vehicles resulting in formation of a single bounding box around more than one vehicle. Such problems lead to errors in further video processing, like counting of vehicles in a video. The proposed system brings forward efficient moving object detection and tracking approach to reduce such errors. The paper uses successive frame subtraction technique for detection of moving objects. Further, this paper implements the watershed algorithm to segment the overlapped and adjoining vehicles. The segmentation results have been improved by the use of noise and morphological operations.
[Development of a video image system for wireless capsule endoscopes based on DSP].

PubMed

Yang, Li; Peng, Chenglin; Wu, Huafeng; Zhao, Dechun; Zhang, Jinhua

2008-02-01

A video image recorder to record video picture for wireless capsule endoscopes was designed. TMS320C6211 DSP of Texas Instruments Inc. is the core processor of this system. Images are periodically acquired from Composite Video Broadcast Signal (CVBS) source and scaled by video decoder (SAA7114H). Video data is transported from high speed buffer First-in First-out (FIFO) to Digital Signal Processor (DSP) under the control of Complex Programmable Logic Device (CPLD). This paper adopts JPEG algorithm for image coding, and the compressed data in DSP was stored to Compact Flash (CF) card. TMS320C6211 DSP is mainly used for image compression and data transporting. Fast Discrete Cosine Transform (DCT) algorithm and fast coefficient quantization algorithm are used to accelerate operation speed of DSP and decrease the executing code. At the same time, proper address is assigned for each memory, which has different speed;the memory structure is also optimized. In addition, this system uses plenty of Extended Direct Memory Access (EDMA) to transport and process image data, which results in stable and high performance.
Pixel-By Estimation of Scene Motion in Video

NASA Astrophysics Data System (ADS)

Tashlinskii, A. G.; Smirnov, P. V.; Tsaryov, M. G.

2017-05-01

The paper considers the effectiveness of motion estimation in video using pixel-by-pixel recurrent algorithms. The algorithms use stochastic gradient decent to find inter-frame shifts of all pixels of a frame. These vectors form shift vectors' field. As estimated parameters of the vectors the paper studies their projections and polar parameters. It considers two methods for estimating shift vectors' field. The first method uses stochastic gradient descent algorithm to sequentially process all nodes of the image row-by-row. It processes each row bidirectionally i.e. from the left to the right and from the right to the left. Subsequent joint processing of the results allows compensating inertia of the recursive estimation. The second method uses correlation between rows to increase processing efficiency. It processes rows one after the other with the change in direction after each row and uses obtained values to form resulting estimate. The paper studies two criteria of its formation: gradient estimation minimum and correlation coefficient maximum. The paper gives examples of experimental results of pixel-by-pixel estimation for a video with a moving object and estimation of a moving object trajectory using shift vectors' field.
The design of red-blue 3D video fusion system based on DM642

NASA Astrophysics Data System (ADS)

Fu, Rongguo; Luo, Hao; Lv, Jin; Feng, Shu; Wei, Yifang; Zhang, Hao

2016-10-01

Aiming at the uncertainty of traditional 3D video capturing including camera focal lengths, distance and angle parameters between two cameras, a red-blue 3D video fusion system based on DM642 hardware processing platform is designed with the parallel optical axis. In view of the brightness reduction of traditional 3D video, the brightness enhancement algorithm based on human visual characteristics is proposed and the luminance component processing method based on YCbCr color space is also proposed. The BIOS real-time operating system is used to improve the real-time performance. The video processing circuit with the core of DM642 enhances the brightness of the images, then converts the video signals of YCbCr to RGB and extracts the R component from one camera, so does the other video and G, B component are extracted synchronously, outputs 3D fusion images finally. The real-time adjustments such as translation and scaling of the two color components are realized through the serial communication between the VC software and BIOS. The system with the method of adding red-blue components reduces the lost of the chrominance components and makes the picture color saturation reduce to more than 95% of the original. Enhancement algorithm after optimization to reduce the amount of data fusion in the processing of video is used to reduce the fusion time and watching effect is improved. Experimental results show that the system can capture images in near distance, output red-blue 3D video and presents the nice experiences to the audience wearing red-blue glasses.
Multi-pass encoding of hyperspectral imagery with spectral quality control

NASA Astrophysics Data System (ADS)

Wasson, Steven; Walker, William

2015-05-01

Multi-pass encoding is a technique employed in the field of video compression that maximizes the quality of an encoded video sequence within the constraints of a specified bit rate. This paper presents research where multi-pass encoding is extended to the field of hyperspectral image compression. Unlike video, which is primarily intended to be viewed by a human observer, hyperspectral imagery is processed by computational algorithms that generally attempt to classify the pixel spectra within the imagery. As such, these algorithms are more sensitive to distortion in the spectral dimension of the image than they are to perceptual distortion in the spatial dimension. The compression algorithm developed for this research, which uses the Karhunen-Loeve transform for spectral decorrelation followed by a modified H.264/Advanced Video Coding (AVC) encoder, maintains a user-specified spectral quality level while maximizing the compression ratio throughout the encoding process. The compression performance may be considered near-lossless in certain scenarios. For qualitative purposes, this paper presents the performance of the compression algorithm for several Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and Hyperion datasets using spectral angle as the spectral quality assessment function. Specifically, the compression performance is illustrated in the form of rate-distortion curves that plot spectral angle versus bits per pixel per band (bpppb).
Tracking Algorithm of Multiple Pedestrians Based on Particle Filters in Video Sequences

PubMed Central

Liu, Yun; Wang, Chuanxu; Zhang, Shujun; Cui, Xuehong

2016-01-01

Pedestrian tracking is a critical problem in the field of computer vision. Particle filters have been proven to be very useful in pedestrian tracking for nonlinear and non-Gaussian estimation problems. However, pedestrian tracking in complex environment is still facing many problems due to changes of pedestrian postures and scale, moving background, mutual occlusion, and presence of pedestrian. To surmount these difficulties, this paper presents tracking algorithm of multiple pedestrians based on particle filters in video sequences. The algorithm acquires confidence value of the object and the background through extracting a priori knowledge thus to achieve multipedestrian detection; it adopts color and texture features into particle filter to get better observation results and then automatically adjusts weight value of each feature according to current tracking environment. During the process of tracking, the algorithm processes severe occlusion condition to prevent drift and loss phenomena caused by object occlusion and associates detection results with particle state to propose discriminated method for object disappearance and emergence thus to achieve robust tracking of multiple pedestrians. Experimental verification and analysis in video sequences demonstrate that proposed algorithm improves the tracking performance and has better tracking results. PMID:27847514
Automatic and user-centric approaches to video summary evaluation

NASA Astrophysics Data System (ADS)

Taskiran, Cuneyt M.; Bentley, Frank

2007-01-01

Automatic video summarization has become an active research topic in content-based video processing. However, not much emphasis has been placed on developing rigorous summary evaluation methods and developing summarization systems based on a clear understanding of user needs, obtained through user centered design. In this paper we address these two topics and propose an automatic video summary evaluation algorithm adapted from teh text summarization domain.
Efficient super-resolution image reconstruction applied to surveillance video captured by small unmanned aircraft systems

NASA Astrophysics Data System (ADS)

He, Qiang; Schultz, Richard R.; Chu, Chee-Hung Henry

2008-04-01

The concept surrounding super-resolution image reconstruction is to recover a highly-resolved image from a series of low-resolution images via between-frame subpixel image registration. In this paper, we propose a novel and efficient super-resolution algorithm, and then apply it to the reconstruction of real video data captured by a small Unmanned Aircraft System (UAS). Small UAS aircraft generally have a wingspan of less than four meters, so that these vehicles and their payloads can be buffeted by even light winds, resulting in potentially unstable video. This algorithm is based on a coarse-to-fine strategy, in which a coarsely super-resolved image sequence is first built from the original video data by image registration and bi-cubic interpolation between a fixed reference frame and every additional frame. It is well known that the median filter is robust to outliers. If we calculate pixel-wise medians in the coarsely super-resolved image sequence, we can restore a refined super-resolved image. The primary advantage is that this is a noniterative algorithm, unlike traditional approaches based on highly-computational iterative algorithms. Experimental results show that our coarse-to-fine super-resolution algorithm is not only robust, but also very efficient. In comparison with five well-known super-resolution algorithms, namely the robust super-resolution algorithm, bi-cubic interpolation, projection onto convex sets (POCS), the Papoulis-Gerchberg algorithm, and the iterated back projection algorithm, our proposed algorithm gives both strong efficiency and robustness, as well as good visual performance. This is particularly useful for the application of super-resolution to UAS surveillance video, where real-time processing is highly desired.
A fuzzy measure approach to motion frame analysis for scene detection. M.S. Thesis - Houston Univ.

NASA Technical Reports Server (NTRS)

Leigh, Albert B.; Pal, Sankar K.

1992-01-01

This paper addresses a solution to the problem of scene estimation of motion video data in the fuzzy set theoretic framework. Using fuzzy image feature extractors, a new algorithm is developed to compute the change of information in each of two successive frames to classify scenes. This classification process of raw input visual data can be used to establish structure for correlation. The algorithm attempts to fulfill the need for nonlinear, frame-accurate access to video data for applications such as video editing and visual document archival/retrieval systems in multimedia environments.
Design and Implementation of a Video-Zoom Driven Digital Audio-Zoom System for Portable Digital Imaging Devices

NASA Astrophysics Data System (ADS)

Park, Nam In; Kim, Seon Man; Kim, Hong Kook; Kim, Ji Woon; Kim, Myeong Bo; Yun, Su Won

In this paper, we propose a video-zoom driven audio-zoom algorithm in order to provide audio zooming effects in accordance with the degree of video-zoom. The proposed algorithm is designed based on a super-directive beamformer operating with a 4-channel microphone system, in conjunction with a soft masking process that considers the phase differences between microphones. Thus, the audio-zoom processed signal is obtained by multiplying an audio gain derived from a video-zoom level by the masked signal. After all, a real-time audio-zoom system is implemented on an ARM-CORETEX-A8 having a clock speed of 600 MHz after different levels of optimization are performed such as algorithmic level, C-code, and memory optimizations. To evaluate the complexity of the proposed real-time audio-zoom system, test data whose length is 21.3 seconds long is sampled at 48 kHz. As a result, it is shown from the experiments that the processing time for the proposed audio-zoom system occupies 14.6% or less of the ARM clock cycles. It is also shown from the experimental results performed in a semi-anechoic chamber that the signal with the front direction can be amplified by approximately 10 dB compared to the other directions.
Automated High-Speed Video Detection of Small-Scale Explosives Testing

NASA Astrophysics Data System (ADS)

Ford, Robert; Guymon, Clint

2013-06-01

Small-scale explosives sensitivity test data is used to evaluate hazards of processing, handling, transportation, and storage of energetic materials. Accurate test data is critical to implementation of engineering and administrative controls for personnel safety and asset protection. Operator mischaracterization of reactions during testing contributes to either excessive or inadequate safety protocols. Use of equipment and associated algorithms to aid the operator in reaction determination can significantly reduce operator error. Safety Management Services, Inc. has developed an algorithm to evaluate high-speed video images of sparks from an ESD (Electrostatic Discharge) machine to automatically determine whether or not a reaction has taken place. The algorithm with the high-speed camera is termed GoDetect (patent pending). An operator assisted version for friction and impact testing has also been developed where software is used to quickly process and store video of sensitivity testing. We have used this method for sensitivity testing with multiple pieces of equipment. We present the fundamentals of GoDetect and compare it to other methods used for reaction detection.
Dual-Layer Video Encryption using RSA Algorithm

NASA Astrophysics Data System (ADS)

Chadha, Aman; Mallik, Sushmit; Chadha, Ankit; Johar, Ravdeep; Mani Roja, M.

2015-04-01

This paper proposes a video encryption algorithm using RSA and Pseudo Noise (PN) sequence, aimed at applications requiring sensitive video information transfers. The system is primarily designed to work with files encoded using the Audio Video Interleaved (AVI) codec, although it can be easily ported for use with Moving Picture Experts Group (MPEG) encoded files. The audio and video components of the source separately undergo two layers of encryption to ensure a reasonable level of security. Encryption of the video component involves applying the RSA algorithm followed by the PN-based encryption. Similarly, the audio component is first encrypted using PN and further subjected to encryption using the Discrete Cosine Transform. Combining these techniques, an efficient system, invulnerable to security breaches and attacks with favorable values of parameters such as encryption/decryption speed, encryption/decryption ratio and visual degradation; has been put forth. For applications requiring encryption of sensitive data wherein stringent security requirements are of prime concern, the system is found to yield negligible similarities in visual perception between the original and the encrypted video sequence. For applications wherein visual similarity is not of major concern, we limit the encryption task to a single level of encryption which is accomplished by using RSA, thereby quickening the encryption process. Although some similarity between the original and encrypted video is observed in this case, it is not enough to comprehend the happenings in the video.
Method and system for enabling real-time speckle processing using hardware platforms

NASA Technical Reports Server (NTRS)

Ortiz, Fernando E. (Inventor); Kelmelis, Eric (Inventor); Durbano, James P. (Inventor); Curt, Peterson F. (Inventor)

2012-01-01

An accelerator for the speckle atmospheric compensation algorithm may enable real-time speckle processing of video feeds that may enable the speckle algorithm to be applied in numerous real-time applications. The accelerator may be implemented in various forms, including hardware, software, and/or machine-readable media.

Automated Visual Event Detection, Tracking, and Data Management System for Cabled- Observatory Video

NASA Astrophysics Data System (ADS)

Edgington, D. R.; Cline, D. E.; Schlining, B.; Raymond, E.

2008-12-01

Ocean observatories and underwater video surveys have the potential to unlock important discoveries with new and existing camera systems. Yet the burden of video management and analysis often requires reducing the amount of video recorded through time-lapse video or similar methods. It's unknown how many digitized video data sets exist in the oceanographic community, but we suspect that many remain under analyzed due to lack of good tools or human resources to analyze the video. To help address this problem, the Automated Visual Event Detection (AVED) software and The Video Annotation and Reference System (VARS) have been under development at MBARI. For detecting interesting events in the video, the AVED software has been developed over the last 5 years. AVED is based on a neuromorphic-selective attention algorithm, modeled on the human vision system. Frames are decomposed into specific feature maps that are combined into a unique saliency map. This saliency map is then scanned to determine the most salient locations. The candidate salient locations are then segmented from the scene using algorithms suitable for the low, non-uniform light and marine snow typical of deep underwater video. For managing the AVED descriptions of the video, the VARS system provides an interface and database for describing, viewing, and cataloging the video. VARS was developed by the MBARI for annotating deep-sea video data and is currently being used to describe over 3000 dives by our remotely operated vehicles (ROV), making it well suited to this deepwater observatory application with only a few modifications. To meet the compute and data intensive job of video processing, a distributed heterogeneous network of computers is managed using the Condor workload management system. This system manages data storage, video transcoding, and AVED processing. Looking to the future, we see high-speed networks and Grid technology as an important element in addressing the problem of processing and accessing large video data sets.
An unsupervised video foreground co-localization and segmentation process by incorporating motion cues and frame features

NASA Astrophysics Data System (ADS)

Zhang, Chao; Zhang, Qian; Zheng, Chi; Qiu, Guoping

2018-04-01

Video foreground segmentation is one of the key problems in video processing. In this paper, we proposed a novel and fully unsupervised approach for foreground object co-localization and segmentation of unconstrained videos. We firstly compute both the actual edges and motion boundaries of the video frames, and then align them by their HOG feature maps. Then, by filling the occlusions generated by the aligned edges, we obtained more precise masks about the foreground object. Such motion-based masks could be derived as the motion-based likelihood. Moreover, the color-base likelihood is adopted for the segmentation process. Experimental Results show that our approach outperforms most of the State-of-the-art algorithms.
Simulation and Real-Time Verification of Video Algorithms on the TI C6400 Using Simulink

DTIC Science & Technology

2004-08-20

SPONSOR/MONITOR’S ACRONYM(S) 11. SPONSOR/MONITOR’S REPORT NUMBER(S) 12 . DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release...plot estimates over time (scrolling data) Adjust detection threshold (click mouse on graph) Monitor video capture Input video frames Captured frames 12 ...Video App: Surveillance Recording 1 2 7 3 4 9 5 6 11 SL for video Explanation of GUI 12 Target Options8 Build Process 10 13 14 15 16 M-code snippet
Automated Generation of Geo-Referenced Mosaics From Video Data Collected by Deep-Submergence Vehicles: Preliminary Results

NASA Astrophysics Data System (ADS)

Rhzanov, Y.; Beaulieu, S.; Soule, S. A.; Shank, T.; Fornari, D.; Mayer, L. A.

2005-12-01

Many advances in understanding geologic, tectonic, biologic, and sedimentologic processes in the deep ocean are facilitated by direct observation of the seafloor. However, making such observations is both difficult and expensive. Optical systems (e.g., video, still camera, or direct observation) will always be constrained by the severe attenuation of light in the deep ocean, limiting the field of view to distances that are typically less than 10 meters. Acoustic systems can 'see' much larger areas, but at the cost of spatial resolution. Ultimately, scientists want to study and observe deep-sea processes in the same way we do land-based phenomena so that the spatial distribution and juxtaposition of processes and features can be resolved. We have begun development of algorithms that will, in near real-time, generate mosaics from video collected by deep-submergence vehicles. Mosaics consist of >>10 video frames and can cover 100's of square-meters. This work builds on a publicly available still and video mosaicking software package developed by Rzhanov and Mayer. Here we present the results of initial tests of data collection methodologies (e.g., transects across the seafloor and panoramas across features of interest), algorithm application, and GIS integration conducted during a recent cruise to the Eastern Galapagos Spreading Center (0 deg N, 86 deg W). We have developed a GIS database for the region that will act as a means to access and display mosaics within a geospatially-referenced framework. We have constructed numerous mosaics using both video and still imagery and assessed the quality of the mosaics (including registration errors) under different lighting conditions and with different navigation procedures. We have begun to develop algorithms for efficient and timely mosaicking of collected video as well as integration with navigation data for georeferencing the mosaics. Initial results indicate that operators must be properly versed in the control of the video systems as well as maintaining vehicle attitude and altitude in order to achieve the best results possible.
Compression Algorithm Analysis of In-Situ (S)TEM Video: Towards Automatic Event Detection and Characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Teuton, Jeremy R.; Griswold, Richard L.; Mehdi, Beata L.

Precise analysis of both (S)TEM images and video are time and labor intensive processes. As an example, determining when crystal growth and shrinkage occurs during the dynamic process of Li dendrite deposition and stripping involves manually scanning through each frame in the video to extract a specific set of frames/images. For large numbers of images, this process can be very time consuming, so a fast and accurate automated method is desirable. Given this need, we developed software that uses analysis of video compression statistics for detecting and characterizing events in large data sets. This software works by converting the datamore » into a series of images which it compresses into an MPEG-2 video using the open source “avconv” utility [1]. The software does not use the video itself, but rather analyzes the video statistics from the first pass of the video encoding that avconv records in the log file. This file contains statistics for each frame of the video including the frame quality, intra-texture and predicted texture bits, forward and backward motion vector resolution, among others. In all, avconv records 15 statistics for each frame. By combining different statistics, we have been able to detect events in various types of data. We have developed an interactive tool for exploring the data and the statistics that aids the analyst in selecting useful statistics for each analysis. Going forward, an algorithm for detecting and possibly describing events automatically can be written based on statistic(s) for each data type.« less
Heterogeneous CPU-GPU moving targets detection for UAV video

NASA Astrophysics Data System (ADS)

Li, Maowen; Tang, Linbo; Han, Yuqi; Yu, Chunlei; Zhang, Chao; Fu, Huiquan

2017-07-01

Moving targets detection is gaining popularity in civilian and military applications. On some monitoring platform of motion detection, some low-resolution stationary cameras are replaced by moving HD camera based on UAVs. The pixels of moving targets in the HD Video taken by UAV are always in a minority, and the background of the frame is usually moving because of the motion of UAVs. The high computational cost of the algorithm prevents running it at higher resolutions the pixels of frame. Hence, to solve the problem of moving targets detection based UAVs video, we propose a heterogeneous CPU-GPU moving target detection algorithm for UAV video. More specifically, we use background registration to eliminate the impact of the moving background and frame difference to detect small moving targets. In order to achieve the effect of real-time processing, we design the solution of heterogeneous CPU-GPU framework for our method. The experimental results show that our method can detect the main moving targets from the HD video taken by UAV, and the average process time is 52.16ms per frame which is fast enough to solve the problem.
Unmanned Vehicle Guidance Using Video Camera/Vehicle Model

NASA Technical Reports Server (NTRS)

Sutherland, T.

1999-01-01

A video guidance sensor (VGS) system has flown on both STS-87 and STS-95 to validate a single camera/target concept for vehicle navigation. The main part of the image algorithm was the subtraction of two consecutive images using software. For a nominal size image of 256 x 256 pixels this subtraction can take a large portion of the time between successive frames in standard rate video leaving very little time for other computations. The purpose of this project was to integrate the software subtraction into hardware to speed up the subtraction process and allow for more complex algorithms to be performed, both in hardware and software.
A Near-Optimal Distributed QoS Constrained Routing Algorithm for Multichannel Wireless Sensor Networks

PubMed Central

Lin, Frank Yeong-Sung; Hsiao, Chiu-Han; Yen, Hong-Hsu; Hsieh, Yu-Jen

2013-01-01

One of the important applications in Wireless Sensor Networks (WSNs) is video surveillance that includes the tasks of video data processing and transmission. Processing and transmission of image and video data in WSNs has attracted a lot of attention in recent years. This is known as Wireless Visual Sensor Networks (WVSNs). WVSNs are distributed intelligent systems for collecting image or video data with unique performance, complexity, and quality of service challenges. WVSNs consist of a large number of battery-powered and resource constrained camera nodes. End-to-end delay is a very important Quality of Service (QoS) metric for video surveillance application in WVSNs. How to meet the stringent delay QoS in resource constrained WVSNs is a challenging issue that requires novel distributed and collaborative routing strategies. This paper proposes a Near-Optimal Distributed QoS Constrained (NODQC) routing algorithm to achieve an end-to-end route with lower delay and higher throughput. A Lagrangian Relaxation (LR)-based routing metric that considers the “system perspective” and “user perspective” is proposed to determine the near-optimal routing paths that satisfy end-to-end delay constraints with high system throughput. The empirical results show that the NODQC routing algorithm outperforms others in terms of higher system throughput with lower average end-to-end delay and delay jitter. In this paper, for the first time, the algorithm shows how to meet the delay QoS and at the same time how to achieve higher system throughput in stringently resource constrained WVSNs.
Fast and efficient search for MPEG-4 video using adjacent pixel intensity difference quantization histogram feature

NASA Astrophysics Data System (ADS)

Lee, Feifei; Kotani, Koji; Chen, Qiu; Ohmi, Tadahiro

2010-02-01

In this paper, a fast search algorithm for MPEG-4 video clips from video database is proposed. An adjacent pixel intensity difference quantization (APIDQ) histogram is utilized as the feature vector of VOP (video object plane), which had been reliably applied to human face recognition previously. Instead of fully decompressed video sequence, partially decoded data, namely DC sequence of the video object are extracted from the video sequence. Combined with active search, a temporal pruning algorithm, fast and robust video search can be realized. The proposed search algorithm has been evaluated by total 15 hours of video contained of TV programs such as drama, talk, news, etc. to search for given 200 MPEG-4 video clips which each length is 15 seconds. Experimental results show the proposed algorithm can detect the similar video clip in merely 80ms, and Equal Error Rate (ERR) of 2 % in drama and news categories are achieved, which are more accurately and robust than conventional fast video search algorithm.
Flight State Information Inference with Application to Helicopter Cockpit Video Data Analysis Using Data Mining Techniques

NASA Astrophysics Data System (ADS)

Shin, Sanghyun

The National Transportation Safety Board (NTSB) has recently emphasized the importance of analyzing flight data as one of the most effective methods to improve eciency and safety of helicopter operations. By analyzing flight data with Flight Data Monitoring (FDM) programs, the safety and performance of helicopter operations can be evaluated and improved. In spite of the NTSB's effort, the safety of helicopter operations has not improved at the same rate as the safety of worldwide airlines, and the accident rate of helicopters continues to be much higher than that of fixed-wing aircraft. One of the main reasons is that the participation rates of the rotorcraft industry in the FDM programs are low due to the high costs of the Flight Data Recorder (FDR), the need of a special readout device to decode the FDR, anxiety of punitive action, etc. Since a video camera is easily installed, accessible, and inexpensively maintained, cockpit video data could complement the FDR in the presence of the FDR or possibly replace the role of the FDR in the absence of the FDR. Cockpit video data is composed of image and audio data: image data contains outside views through cockpit windows and activities on the flight instrument panels, whereas audio data contains sounds of the alarms within the cockpit. The goal of this research is to develop, test, and demonstrate a cockpit video data analysis algorithm based on data mining and signal processing techniques that can help better understand situations in the cockpit and the state of a helicopter by efficiently and accurately inferring the useful flight information from cockpit video data. Image processing algorithms based on data mining techniques are proposed to estimate a helicopter's attitude such as the bank and pitch angles, identify indicators from a flight instrument panel, and read the gauges and the numbers in the analogue gauge indicators and digital displays from cockpit image data. In addition, an audio processing algorithm based on signal processing and abrupt change detection techniques is proposed to identify types of warning alarms and to detect the occurrence times of individual alarms from cockpit audio data. Those proposed algorithms are then successfully applied to simulated and real helicopter cockpit video data to demonstrate and validate their performance.
iTrack: instrumented mobile electrooculography (EOG) eye-tracking in older adults and Parkinson's disease.

PubMed

Stuart, Samuel; Hickey, Aodhán; Galna, Brook; Lord, Sue; Rochester, Lynn; Godfrey, Alan

2017-01-01

Detection of saccades (fast eye-movements) within raw mobile electrooculography (EOG) data involves complex algorithms which typically process data acquired during seated static tasks only. Processing of data during dynamic tasks such as walking is relatively rare and complex, particularly in older adults or people with Parkinson's disease (PD). Development of algorithms that can be easily implemented to detect saccades is required. This study aimed to develop an algorithm for the detection and measurement of saccades in EOG data during static (sitting) and dynamic (walking) tasks, in older adults and PD. Eye-tracking via mobile EOG and infra-red (IR) eye-tracker (with video) was performed with a group of older adults (n = 10) and PD participants (n = 10) (⩾50 years). Horizontal saccades made between targets set 5°, 10° and 15° apart were first measured while seated. Horizontal saccades were then measured while a participant walked and executed a 40° turn left and right. The EOG algorithm was evaluated by comparing the number of correct saccade detections and agreement (ICC 2,1 ) between output from visual inspection of eye-tracker videos and IR eye-tracker. The EOG algorithm detected 75-92% of saccades compared to video inspection and IR output during static testing, with fair to excellent agreement (ICC 2,1 0.49-0.93). However, during walking EOG saccade detection reduced to 42-88% compared to video inspection or IR output, with poor to excellent (ICC 2,1 0.13-0.88) agreement between methodologies. The algorithm was robust during seated testing but less so during walking, which was likely due to increased measurement and analysis error with a dynamic task. Future studies may consider a combination of EOG and IR for comprehensive measurement.
An Adaptive Inpainting Algorithm Based on DCT Induced Wavelet Regularization

DTIC Science & Technology

2013-01-01

research in image processing. Applications of image inpainting include old films restoration, video inpainting [4], de -interlacing of video sequences...show 5 (a) (b) (c) (d) (e) (f) Fig. 1. Performance of various inpainting algorithms for a cartoon image with text. (a) the original test image; (b...the test image with text; inpainted images by (c) SF (PSNR=37.38 dB); (d) SF-LDCT (PSNR=37.37 dB); (e) MCA (PSNR=37.04 dB); and (f) the proposed
Application of Bayesian a Priori Distributions for Vehicles' Video Tracking Systems

NASA Astrophysics Data System (ADS)

Mazurek, Przemysław; Okarma, Krzysztof

Intelligent Transportation Systems (ITS) helps to improve the quality and quantity of many car traffic parameters. The use of the ITS is possible when the adequate measuring infrastructure is available. Video systems allow for its implementation with relatively low cost due to the possibility of simultaneous video recording of a few lanes of the road at a considerable distance from the camera. The process of tracking can be realized through different algorithms, the most attractive algorithms are Bayesian, because they use the a priori information derived from previous observations or known limitations. Use of this information is crucial for improving the quality of tracking especially for difficult observability conditions, which occur in the video systems under the influence of: smog, fog, rain, snow and poor lighting conditions.
An efficient interpolation filter VLSI architecture for HEVC standard

NASA Astrophysics Data System (ADS)

Zhou, Wei; Zhou, Xin; Lian, Xiaocong; Liu, Zhenyu; Liu, Xiaoxiang

2015-12-01

The next-generation video coding standard of High-Efficiency Video Coding (HEVC) is especially efficient for coding high-resolution video such as 8K-ultra-high-definition (UHD) video. Fractional motion estimation in HEVC presents a significant challenge in clock latency and area cost as it consumes more than 40 % of the total encoding time and thus results in high computational complexity. With aims at supporting 8K-UHD video applications, an efficient interpolation filter VLSI architecture for HEVC is proposed in this paper. Firstly, a new interpolation filter algorithm based on the 8-pixel interpolation unit is proposed in this paper. It can save 19.7 % processing time on average with acceptable coding quality degradation. Based on the proposed algorithm, an efficient interpolation filter VLSI architecture, composed of a reused data path of interpolation, an efficient memory organization, and a reconfigurable pipeline interpolation filter engine, is presented to reduce the implement hardware area and achieve high throughput. The final VLSI implementation only requires 37.2k gates in a standard 90-nm CMOS technology at an operating frequency of 240 MHz. The proposed architecture can be reused for either half-pixel interpolation or quarter-pixel interpolation, which can reduce the area cost for about 131,040 bits RAM. The processing latency of our proposed VLSI architecture can support the real-time processing of 4:2:0 format 7680 × 4320@78fps video sequences.
Quantitative fluorescence angiography for neurosurgical interventions.

PubMed

Weichelt, Claudia; Duscha, Philipp; Steinmeier, Ralf; Meyer, Tobias; Kuß, Julia; Cimalla, Peter; Kirsch, Matthias; Sobottka, Stephan B; Koch, Edmund; Schackert, Gabriele; Morgenstern, Ute

2013-06-01

Present methods for quantitative measurement of cerebral perfusion during neurosurgical operations require additional technology for measurement, data acquisition, and processing. This study used conventional fluorescence video angiography--as an established method to visualize blood flow in brain vessels--enhanced by a quantifying perfusion software tool. For these purposes, the fluorescence dye indocyanine green is given intravenously, and after activation by a near-infrared light source the fluorescence signal is recorded. Video data are analyzed by software algorithms to allow quantification of the blood flow. Additionally, perfusion is measured intraoperatively by a reference system. Furthermore, comparing reference measurements using a flow phantom were performed to verify the quantitative blood flow results of the software and to validate the software algorithm. Analysis of intraoperative video data provides characteristic biological parameters. These parameters were implemented in the special flow phantom for experimental validation of the developed software algorithms. Furthermore, various factors that influence the determination of perfusion parameters were analyzed by means of mathematical simulation. Comparing patient measurement, phantom experiment, and computer simulation under certain conditions (variable frame rate, vessel diameter, etc.), the results of the software algorithms are within the range of parameter accuracy of the reference methods. Therefore, the software algorithm for calculating cortical perfusion parameters from video data presents a helpful intraoperative tool without complex additional measurement technology.
Real-time high-level video understanding using data warehouse

NASA Astrophysics Data System (ADS)

Lienard, Bruno; Desurmont, Xavier; Barrie, Bertrand; Delaigle, Jean-Francois

2006-02-01

High-level Video content analysis such as video-surveillance is often limited by computational aspects of automatic image understanding, i.e. it requires huge computing resources for reasoning processes like categorization and huge amount of data to represent knowledge of objects, scenarios and other models. This article explains how to design and develop a "near real-time adaptive image datamart", used, as a decisional support system for vision algorithms, and then as a mass storage system. Using RDF specification as storing format of vision algorithms meta-data, we can optimise the data warehouse concepts for video analysis, add some processes able to adapt the current model and pre-process data to speed-up queries. In this way, when new data is sent from a sensor to the data warehouse for long term storage, using remote procedure call embedded in object-oriented interfaces to simplified queries, they are processed and in memory data-model is updated. After some processing, possible interpretations of this data can be returned back to the sensor. To demonstrate this new approach, we will present typical scenarios applied to this architecture such as people tracking and events detection in a multi-camera network. Finally we will show how this system becomes a high-semantic data container for external data-mining.
Real-time unmanned aircraft systems surveillance video mosaicking using GPU

NASA Astrophysics Data System (ADS)

Camargo, Aldo; Anderson, Kyle; Wang, Yi; Schultz, Richard R.; Fevig, Ronald A.

2010-04-01

Digital video mosaicking from Unmanned Aircraft Systems (UAS) is being used for many military and civilian applications, including surveillance, target recognition, border protection, forest fire monitoring, traffic control on highways, monitoring of transmission lines, among others. Additionally, NASA is using digital video mosaicking to explore the moon and planets such as Mars. In order to compute a "good" mosaic from video captured by a UAS, the algorithm must deal with motion blur, frame-to-frame jitter associated with an imperfectly stabilized platform, perspective changes as the camera tilts in flight, as well as a number of other factors. The most suitable algorithms use SIFT (Scale-Invariant Feature Transform) to detect the features consistent between video frames. Utilizing these features, the next step is to estimate the homography between two consecutives video frames, perform warping to properly register the image data, and finally blend the video frames resulting in a seamless video mosaick. All this processing takes a great deal of resources of resources from the CPU, so it is almost impossible to compute a real time video mosaic on a single processor. Modern graphics processing units (GPUs) offer computational performance that far exceeds current CPU technology, allowing for real-time operation. This paper presents the development of a GPU-accelerated digital video mosaicking implementation and compares it with CPU performance. Our tests are based on two sets of real video captured by a small UAS aircraft; one video comes from Infrared (IR) and Electro-Optical (EO) cameras. Our results show that we can obtain a speed-up of more than 50 times using GPU technology, so real-time operation at a video capture of 30 frames per second is feasible.
Guided filtering for solar image/video processing

NASA Astrophysics Data System (ADS)

Xu, Long; Yan, Yihua; Cheng, Jun

2017-06-01

A new image enhancement algorithm employing guided filtering is proposed in this work for the enhancement of solar images and videos so that users can easily figure out important fine structures embedded in the recorded images/movies for solar observation. The proposed algorithm can efficiently remove image noises, including Gaussian and impulse noises. Meanwhile, it can further highlight fibrous structures on/beyond the solar disk. These fibrous structures can clearly demonstrate the progress of solar flare, prominence coronal mass emission, magnetic field, and so on. The experimental results prove that the proposed algorithm gives significant enhancement of visual quality of solar images beyond original input and several classical image enhancement algorithms, thus facilitating easier determination of interesting solar burst activities from recorded images/movies.
Parallel processing approach to transform-based image coding

NASA Astrophysics Data System (ADS)

Normile, James O.; Wright, Dan; Chu, Ken; Yeh, Chia L.

1991-06-01

This paper describes a flexible parallel processing architecture designed for use in real time video processing. The system consists of floating point DSP processors connected to each other via fast serial links, each processor has access to a globally shared memory. A multiple bus architecture in combination with a dual ported memory allows communication with a host control processor. The system has been applied to prototyping of video compression and decompression algorithms. The decomposition of transform based algorithms for decompression into a form suitable for parallel processing is described. A technique for automatic load balancing among the processors is developed and discussed, results ar presented with image statistics and data rates. Finally techniques for accelerating the system throughput are analyzed and results from the application of one such modification described.
Avionics-compatible video facial cognizer for detection of pilot incapacitation.

PubMed

Steffin, Morris

2006-01-01

High-acceleration loss of consciousness is a serious problem for military pilots. In this laboratory, a video cognizer has been developed that in real time detects facial changes closely coupled to the onset of loss of consciousness. Efficient algorithms are compatible with video digital signal processing hardware and are thus configurable on an autonomous single board that generates alarm triggers to activate autopilot, and is avionics-compatible.

On the use of video projectors for three-dimensional scanning

NASA Astrophysics Data System (ADS)

Juarez-Salazar, Rigoberto; Diaz-Ramirez, Victor H.; Robledo-Sanchez, Carlos; Diaz-Gonzalez, Gerardo

2017-08-01

Structured light projection is one of the most useful methods for accurate three-dimensional scanning. Video projectors are typically used as the illumination source. However, because video projectors are not designed for structured light systems, some considerations such as gamma calibration must be taken into account. In this work, we present a simple method for gamma calibration of video projectors. First, the experimental fringe patterns are normalized. Then, the samples of the fringe patterns are sorted in ascending order. The sample sorting leads to a simple three-parameter sine curve that is fitted using the Gauss-Newton algorithm. The novelty of this method is that the sorting process removes the effect of the unknown phase. Thus, the resulting gamma calibration algorithm is significantly simplified. The feasibility of the proposed method is illustrated in a three-dimensional scanning experiment.
The comparison and analysis of extracting video key frame

NASA Astrophysics Data System (ADS)

Ouyang, S. Z.; Zhong, L.; Luo, R. Q.

2018-05-01

Video key frame extraction is an important part of the large data processing. Based on the previous work in key frame extraction, we summarized four important key frame extraction algorithms, and these methods are largely developed by comparing the differences between each of two frames. If the difference exceeds a threshold value, take the corresponding frame as two different keyframes. After the research, the key frame extraction based on the amount of mutual trust is proposed, the introduction of information entropy, by selecting the appropriate threshold values into the initial class, and finally take a similar mean mutual information as a candidate key frame. On this paper, several algorithms is used to extract the key frame of tunnel traffic videos. Then, with the analysis to the experimental results and comparisons between the pros and cons of these algorithms, the basis of practical applications is well provided.
In-camera video-stream processing for bandwidth reduction in web inspection

NASA Astrophysics Data System (ADS)

Jullien, Graham A.; Li, QiuPing; Hajimowlana, S. Hossain; Morvay, J.; Conflitti, D.; Roberts, James W.; Doody, Brian C.

1996-02-01

Automated machine vision systems are now widely used for industrial inspection tasks where video-stream data information is taken in by the camera and then sent out to the inspection system for future processing. In this paper we describe a prototype system for on-line programming of arbitrary real-time video data stream bandwidth reduction algorithms; the output of the camera only contains information that has to be further processed by a host computer. The processing system is built into a DALSA CCD camera and uses a microcontroller interface to download bit-stream data to a XILINXTM FPGA. The FPGA is directly connected to the video data-stream and outputs data to a low bandwidth output bus. The camera communicates to a host computer via an RS-232 link to the microcontroller. Static memory is used to both generate a FIFO interface for buffering defect burst data, and for off-line examination of defect detection data. In addition to providing arbitrary FPGA architectures, the internal program of the microcontroller can also be changed via the host computer and a ROM monitor. This paper describes a prototype system board, mounted inside a DALSA camera, and discusses some of the algorithms currently being implemented for web inspection applications.
Video error concealment using block matching and frequency selective extrapolation algorithms

NASA Astrophysics Data System (ADS)

P. K., Rajani; Khaparde, Arti

2017-06-01

Error Concealment (EC) is a technique at the decoder side to hide the transmission errors. It is done by analyzing the spatial or temporal information from available video frames. It is very important to recover distorted video because they are used for various applications such as video-telephone, video-conference, TV, DVD, internet video streaming, video games etc .Retransmission-based and resilient-based methods, are also used for error removal. But these methods add delay and redundant data. So error concealment is the best option for error hiding. In this paper, the error concealment methods such as Block Matching error concealment algorithm is compared with Frequency Selective Extrapolation algorithm. Both the works are based on concealment of manually error video frames as input. The parameter used for objective quality measurement was PSNR (Peak Signal to Noise Ratio) and SSIM(Structural Similarity Index). The original video frames along with error video frames are compared with both the Error concealment algorithms. According to simulation results, Frequency Selective Extrapolation is showing better quality measures such as 48% improved PSNR and 94% increased SSIM than Block Matching Algorithm.
Inter-view prediction of intra mode decision for high-efficiency video coding-based multiview video coding

NASA Astrophysics Data System (ADS)

da Silva, Thaísa Leal; Agostini, Luciano Volcan; da Silva Cruz, Luis A.

2014-05-01

Intra prediction is a very important tool in current video coding standards. High-efficiency video coding (HEVC) intra prediction presents relevant gains in encoding efficiency when compared to previous standards, but with a very important increase in the computational complexity since 33 directional angular modes must be evaluated. Motivated by this high complexity, this article presents a complexity reduction algorithm developed to reduce the HEVC intra mode decision complexity targeting multiview videos. The proposed algorithm presents an efficient fast intra prediction compliant with singleview and multiview video encoding. This fast solution defines a reduced subset of intra directions according to the video texture and it exploits the relationship between prediction units (PUs) of neighbor depth levels of the coding tree. This fast intra coding procedure is used to develop an inter-view prediction method, which exploits the relationship between the intra mode directions of adjacent views to further accelerate the intra prediction process in multiview video encoding applications. When compared to HEVC simulcast, our method achieves a complexity reduction of up to 47.77%, at the cost of an average BD-PSNR loss of 0.08 dB.
A unified framework of unsupervised subjective optimized bit allocation for multiple video object coding

NASA Astrophysics Data System (ADS)

Chen, Zhenzhong; Han, Junwei; Ngan, King Ngi

2005-10-01

MPEG-4 treats a scene as a composition of several objects or so-called video object planes (VOPs) that are separately encoded and decoded. Such a flexible video coding framework makes it possible to code different video object with different distortion scale. It is necessary to analyze the priority of the video objects according to its semantic importance, intrinsic properties and psycho-visual characteristics such that the bit budget can be distributed properly to video objects to improve the perceptual quality of the compressed video. This paper aims to provide an automatic video object priority definition method based on object-level visual attention model and further propose an optimization framework for video object bit allocation. One significant contribution of this work is that the human visual system characteristics are incorporated into the video coding optimization process. Another advantage is that the priority of the video object can be obtained automatically instead of fixing weighting factors before encoding or relying on the user interactivity. To evaluate the performance of the proposed approach, we compare it with traditional verification model bit allocation and the optimal multiple video object bit allocation algorithms. Comparing with traditional bit allocation algorithms, the objective quality of the object with higher priority is significantly improved under this framework. These results demonstrate the usefulness of this unsupervised subjective quality lifting framework.
Motion-seeded object-based attention for dynamic visual imagery

NASA Astrophysics Data System (ADS)

Huber, David J.; Khosla, Deepak; Kim, Kyungnam

2017-05-01

This paper† describes a novel system that finds and segments "objects of interest" from dynamic imagery (video) that (1) processes each frame using an advanced motion algorithm that pulls out regions that exhibit anomalous motion, and (2) extracts the boundary of each object of interest using a biologically-inspired segmentation algorithm based on feature contours. The system uses a series of modular, parallel algorithms, which allows many complicated operations to be carried out by the system in a very short time, and can be used as a front-end to a larger system that includes object recognition and scene understanding modules. Using this method, we show 90% accuracy with fewer than 0.1 false positives per frame of video, which represents a significant improvement over detection using a baseline attention algorithm.
Localizing wushu players on a platform based on a video recording

NASA Astrophysics Data System (ADS)

Peczek, Piotr M.; Zabołotny, Wojciech M.

2017-08-01

This article describes the development of a method to localize an athlete during sports performance on a platform, based on a static video recording. Considered sport for this method is wushu - martial art. However, any other discipline can be applied. There are specified requirements, and 2 algorithms of image processing are described. The next part presents an experiment that was held based on recordings from the Pan American Wushu Championship. Based on those recordings the steps of the algorithm are shown. Results are evaluated manually. The last part of the article concludes if the algorithm is applicable and what improvements have to be implemented to use it during sports competitions as well as for offline analysis.
Interacting with target tracking algorithms in a gaze-enhanced motion video analysis system

NASA Astrophysics Data System (ADS)

Hild, Jutta; Krüger, Wolfgang; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen

2016-05-01

Motion video analysis is a challenging task, particularly if real-time analysis is required. It is therefore an important issue how to provide suitable assistance for the human operator. Given that the use of customized video analysis systems is more and more established, one supporting measure is to provide system functions which perform subtasks of the analysis. Recent progress in the development of automated image exploitation algorithms allow, e.g., real-time moving target tracking. Another supporting measure is to provide a user interface which strives to reduce the perceptual, cognitive and motor load of the human operator for example by incorporating the operator's visual focus of attention. A gaze-enhanced user interface is able to help here. This work extends prior work on automated target recognition, segmentation, and tracking algorithms as well as about the benefits of a gaze-enhanced user interface for interaction with moving targets. We also propose a prototypical system design aiming to combine both the qualities of the human observer's perception and the automated algorithms in order to improve the overall performance of a real-time video analysis system. In this contribution, we address two novel issues analyzing gaze-based interaction with target tracking algorithms. The first issue extends the gaze-based triggering of a target tracking process, e.g., investigating how to best relaunch in the case of track loss. The second issue addresses the initialization of tracking algorithms without motion segmentation where the operator has to provide the system with the object's image region in order to start the tracking algorithm.
Adaptive Gaussian mixture models for pre-screening in GPR data

NASA Astrophysics Data System (ADS)

Torrione, Peter; Morton, Kenneth, Jr.; Besaw, Lance E.

2011-06-01

Due to the large amount of data generated by vehicle-mounted ground penetrating radar (GPR) antennae arrays, advanced feature extraction and classification can only be performed on a small subset of data during real-time operation. As a result, most GPR based landmine detection systems implement "pre-screening" algorithms to processes all of the data generated by the antennae array and identify locations with anomalous signatures for more advanced processing. These pre-screening algorithms must be computationally efficient and obtain high probability of detection, but can permit a false alarm rate which might be higher than the total system requirements. Many approaches to prescreening have previously been proposed, including linear prediction coefficients, the LMS algorithm, and CFAR-based approaches. Similar pre-screening techniques have also been developed in the field of video processing to identify anomalous behavior or anomalous objects. One such algorithm, an online k-means approximation to an adaptive Gaussian mixture model (GMM), is particularly well-suited to application for pre-screening in GPR data due to its computational efficiency, non-linear nature, and relevance of the logic underlying the algorithm to GPR processing. In this work we explore the application of an adaptive GMM-based approach for anomaly detection from the video processing literature to pre-screening in GPR data. Results with the ARA Nemesis landmine detection system demonstrate significant pre-screening performance improvements compared to alternative approaches, and indicate that the proposed algorithm is a complimentary technique to existing methods.
Learning patterns of life from intelligence analyst chat

NASA Astrophysics Data System (ADS)

Schneider, Michael K.; Alford, Mark; Babko-Malaya, Olga; Blasch, Erik; Chen, Lingji; Crespi, Valentino; HandUber, Jason; Haney, Phil; Nagy, Jim; Richman, Mike; Von Pless, Gregory; Zhu, Howie; Rhodes, Bradley J.

2016-05-01

Our Multi-INT Data Association Tool (MIDAT) learns patterns of life (POL) of a geographical area from video analyst observations called out in textual reporting. Typical approaches to learning POLs from video make use of computer vision algorithms to extract locations in space and time of various activities. Such approaches are subject to the detection and tracking performance of the video processing algorithms. Numerous examples of human analysts monitoring live video streams annotating or "calling out" relevant entities and activities exist, such as security analysis, crime-scene forensics, news reports, and sports commentary. This user description typically corresponds with textual capture, such as chat. Although the purpose of these text products is primarily to describe events as they happen, organizations typically archive the reports for extended periods. This archive provides a basis to build POLs. Such POLs are useful for diagnosis to assess activities in an area based on historical context, and for consumers of products, who gain an understanding of historical patterns. MIDAT combines natural language processing, multi-hypothesis tracking, and Multi-INT Activity Pattern Learning and Exploitation (MAPLE) technologies in an end-to-end lab prototype that processes textual products produced by video analysts, infers POLs, and highlights anomalies relative to those POLs with links to "tracks" of related activities performed by the same entity. MIDAT technologies perform well, achieving, for example, a 90% F1-value on extracting activities from the textual reports.
Video bandwidth compression system

NASA Astrophysics Data System (ADS)

Ludington, D.

1980-08-01

The objective of this program was the development of a Video Bandwidth Compression brassboard model for use by the Air Force Avionics Laboratory, Wright-Patterson Air Force Base, in evaluation of bandwidth compression techniques for use in tactical weapons and to aid in the selection of particular operational modes to be implemented in an advanced flyable model. The bandwidth compression system is partitioned into two major divisions: the encoder, which processes the input video with a compression algorithm and transmits the most significant information; and the decoder where the compressed data is reconstructed into a video image for display.
Medical Ultrasound Video Coding with H.265/HEVC Based on ROI Extraction

PubMed Central

Wu, Yueying; Liu, Pengyu; Gao, Yuan; Jia, Kebin

2016-01-01

High-efficiency video compression technology is of primary importance to the storage and transmission of digital medical video in modern medical communication systems. To further improve the compression performance of medical ultrasound video, two innovative technologies based on diagnostic region-of-interest (ROI) extraction using the high efficiency video coding (H.265/HEVC) standard are presented in this paper. First, an effective ROI extraction algorithm based on image textural features is proposed to strengthen the applicability of ROI detection results in the H.265/HEVC quad-tree coding structure. Second, a hierarchical coding method based on transform coefficient adjustment and a quantization parameter (QP) selection process is designed to implement the otherness encoding for ROIs and non-ROIs. Experimental results demonstrate that the proposed optimization strategy significantly improves the coding performance by achieving a BD-BR reduction of 13.52% and a BD-PSNR gain of 1.16 dB on average compared to H.265/HEVC (HM15.0). The proposed medical video coding algorithm is expected to satisfy low bit-rate compression requirements for modern medical communication systems. PMID:27814367
Medical Ultrasound Video Coding with H.265/HEVC Based on ROI Extraction.

PubMed

Wu, Yueying; Liu, Pengyu; Gao, Yuan; Jia, Kebin

2016-01-01

High-efficiency video compression technology is of primary importance to the storage and transmission of digital medical video in modern medical communication systems. To further improve the compression performance of medical ultrasound video, two innovative technologies based on diagnostic region-of-interest (ROI) extraction using the high efficiency video coding (H.265/HEVC) standard are presented in this paper. First, an effective ROI extraction algorithm based on image textural features is proposed to strengthen the applicability of ROI detection results in the H.265/HEVC quad-tree coding structure. Second, a hierarchical coding method based on transform coefficient adjustment and a quantization parameter (QP) selection process is designed to implement the otherness encoding for ROIs and non-ROIs. Experimental results demonstrate that the proposed optimization strategy significantly improves the coding performance by achieving a BD-BR reduction of 13.52% and a BD-PSNR gain of 1.16 dB on average compared to H.265/HEVC (HM15.0). The proposed medical video coding algorithm is expected to satisfy low bit-rate compression requirements for modern medical communication systems.
Fast image interpolation for motion estimation using graphics hardware

NASA Astrophysics Data System (ADS)

Kelly, Francis; Kokaram, Anil

2004-05-01

Motion estimation and compensation is the key to high quality video coding. Block matching motion estimation is used in most video codecs, including MPEG-2, MPEG-4, H.263 and H.26L. Motion estimation is also a key component in the digital restoration of archived video and for post-production and special effects in the movie industry. Sub-pixel accurate motion vectors can improve the quality of the vector field and lead to more efficient video coding. However sub-pixel accuracy requires interpolation of the image data. Image interpolation is a key requirement of many image processing algorithms. Often interpolation can be a bottleneck in these applications, especially in motion estimation due to the large number pixels involved. In this paper we propose using commodity computer graphics hardware for fast image interpolation. We use the full search block matching algorithm to illustrate the problems and limitations of using graphics hardware in this way.
Scalable gastroscopic video summarization via similar-inhibition dictionary selection.

PubMed

Wang, Shuai; Cong, Yang; Cao, Jun; Yang, Yunsheng; Tang, Yandong; Zhao, Huaici; Yu, Haibin

2016-01-01

This paper aims at developing an automated gastroscopic video summarization algorithm to assist clinicians to more effectively go through the abnormal contents of the video. To select the most representative frames from the original video sequence, we formulate the problem of gastroscopic video summarization as a dictionary selection issue. Different from the traditional dictionary selection methods, which take into account only the number and reconstruction ability of selected key frames, our model introduces the similar-inhibition constraint to reinforce the diversity of selected key frames. We calculate the attention cost by merging both gaze and content change into a prior cue to help select the frames with more high-level semantic information. Moreover, we adopt an image quality evaluation process to eliminate the interference of the poor quality images and a segmentation process to reduce the computational complexity. For experiments, we build a new gastroscopic video dataset captured from 30 volunteers with more than 400k images and compare our method with the state-of-the-arts using the content consistency, index consistency and content-index consistency with the ground truth. Compared with all competitors, our method obtains the best results in 23 of 30 videos evaluated based on content consistency, 24 of 30 videos evaluated based on index consistency and all videos evaluated based on content-index consistency. For gastroscopic video summarization, we propose an automated annotation method via similar-inhibition dictionary selection. Our model can achieve better performance compared with other state-of-the-art models and supplies more suitable key frames for diagnosis. The developed algorithm can be automatically adapted to various real applications, such as the training of young clinicians, computer-aided diagnosis or medical report generation. Copyright © 2015 Elsevier B.V. All rights reserved.
Novel true-motion estimation algorithm and its application to motion-compensated temporal frame interpolation.

PubMed

Dikbas, Salih; Altunbasak, Yucel

2013-08-01

In this paper, a new low-complexity true-motion estimation (TME) algorithm is proposed for video processing applications, such as motion-compensated temporal frame interpolation (MCTFI) or motion-compensated frame rate up-conversion (MCFRUC). Regular motion estimation, which is often used in video coding, aims to find the motion vectors (MVs) to reduce the temporal redundancy, whereas TME aims to track the projected object motion as closely as possible. TME is obtained by imposing implicit and/or explicit smoothness constraints on the block-matching algorithm. To produce better quality-interpolated frames, the dense motion field at interpolation time is obtained for both forward and backward MVs; then, bidirectional motion compensation using forward and backward MVs is applied by mixing both elegantly. Finally, the performance of the proposed algorithm for MCTFI is demonstrated against recently proposed methods and smoothness constraint optical flow employed by a professional video production suite. Experimental results show that the quality of the interpolated frames using the proposed method is better when compared with the MCFRUC techniques.
Description of texts of auxiliary programs for processing video information. Part 2: SUODH program of automated separation of quasihomogeneous formations

NASA Technical Reports Server (NTRS)

Borisenko, V. I.; Chesalin, L. S.

1980-01-01

The algorithm, block diagram, complete text, and instructions are given for the use of a computer program to separate formations whose spectral characteristics are constant on the average. The initial material for operating the computer program presented is video information in a standard color-superposition format.
Low-complexity camera digital signal imaging for video document projection system

NASA Astrophysics Data System (ADS)

Hsia, Shih-Chang; Tsai, Po-Shien

2011-04-01

We present high-performance and low-complexity algorithms for real-time camera imaging applications. The main functions of the proposed camera digital signal processing (DSP) involve color interpolation, white balance, adaptive binary processing, auto gain control, and edge and color enhancement for video projection systems. A series of simulations demonstrate that the proposed method can achieve good image quality while keeping computation cost and memory requirements low. On the basis of the proposed algorithms, the cost-effective hardware core is developed using Verilog HDL. The prototype chip has been verified with one low-cost programmable device. The real-time camera system can achieve 1270 × 792 resolution with the combination of extra components and can demonstrate each DSP function.
Automatic video segmentation and indexing

NASA Astrophysics Data System (ADS)

Chahir, Youssef; Chen, Liming

1999-08-01

Indexing is an important aspect of video database management. Video indexing involves the analysis of video sequences, which is a computationally intensive process. However, effective management of digital video requires robust indexing techniques. The main purpose of our proposed video segmentation is twofold. Firstly, we develop an algorithm that identifies camera shot boundary. The approach is based on the use of combination of color histograms and block-based technique. Next, each temporal segment is represented by a color reference frame which specifies the shot similarities and which is used in the constitution of scenes. Experimental results using a variety of videos selected in the corpus of the French Audiovisual National Institute are presented to demonstrate the effectiveness of performing shot detection, the content characterization of shots and the scene constitution.

Visual Perception Based Rate Control Algorithm for HEVC

NASA Astrophysics Data System (ADS)

Feng, Zeqi; Liu, PengYu; Jia, Kebin

2018-01-01

For HEVC, rate control is an indispensably important video coding technology to alleviate the contradiction between video quality and the limited encoding resources during video communication. However, the rate control benchmark algorithm of HEVC ignores subjective visual perception. For key focus regions, bit allocation of LCU is not ideal and subjective quality is unsatisfied. In this paper, a visual perception based rate control algorithm for HEVC is proposed. First bit allocation weight of LCU level is optimized based on the visual perception of luminance and motion to ameliorate video subjective quality. Then λ and QP are adjusted in combination with the bit allocation weight to improve rate distortion performance. Experimental results show that the proposed algorithm reduces average 0.5% BD-BR and maximum 1.09% BD-BR at no cost in bitrate accuracy compared with HEVC (HM15.0). The proposed algorithm devotes to improving video subjective quality under various video applications.
Automated frame selection process for high-resolution microendoscopy

NASA Astrophysics Data System (ADS)

Ishijima, Ayumu; Schwarz, Richard A.; Shin, Dongsuk; Mondrik, Sharon; Vigneswaran, Nadarajah; Gillenwater, Ann M.; Anandasabapathy, Sharmila; Richards-Kortum, Rebecca

2015-04-01

We developed an automated frame selection algorithm for high-resolution microendoscopy video sequences. The algorithm rapidly selects a representative frame with minimal motion artifact from a short video sequence, enabling fully automated image analysis at the point-of-care. The algorithm was evaluated by quantitative comparison of diagnostically relevant image features and diagnostic classification results obtained using automated frame selection versus manual frame selection. A data set consisting of video sequences collected in vivo from 100 oral sites and 167 esophageal sites was used in the analysis. The area under the receiver operating characteristic curve was 0.78 (automated selection) versus 0.82 (manual selection) for oral sites, and 0.93 (automated selection) versus 0.92 (manual selection) for esophageal sites. The implementation of fully automated high-resolution microendoscopy at the point-of-care has the potential to reduce the number of biopsies needed for accurate diagnosis of precancer and cancer in low-resource settings where there may be limited infrastructure and personnel for standard histologic analysis.
Blind prediction of natural video quality.

PubMed

Saad, Michele A; Bovik, Alan C; Charrier, Christophe

2014-03-01

We propose a blind (no reference or NR) video quality evaluation model that is nondistortion specific. The approach relies on a spatio-temporal model of video scenes in the discrete cosine transform domain, and on a model that characterizes the type of motion occurring in the scenes, to predict video quality. We use the models to define video statistics and perceptual features that are the basis of a video quality assessment (VQA) algorithm that does not require the presence of a pristine video to compare against in order to predict a perceptual quality score. The contributions of this paper are threefold. 1) We propose a spatio-temporal natural scene statistics (NSS) model for videos. 2) We propose a motion model that quantifies motion coherency in video scenes. 3) We show that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and we utilize them to design a blind VQA algorithm that correlates highly with human judgments of quality. The proposed algorithm, called video BLIINDS, is tested on the LIVE VQA database and on the EPFL-PoliMi video database and shown to perform close to the level of top performing reduced and full reference VQA algorithms.
Vehicle-triggered video compression/decompression for fast and efficient searching in large video databases

NASA Astrophysics Data System (ADS)

Bulan, Orhan; Bernal, Edgar A.; Loce, Robert P.; Wu, Wencheng

2013-03-01

Video cameras are widely deployed along city streets, interstate highways, traffic lights, stop signs and toll booths by entities that perform traffic monitoring and law enforcement. The videos captured by these cameras are typically compressed and stored in large databases. Performing a rapid search for a specific vehicle within a large database of compressed videos is often required and can be a time-critical life or death situation. In this paper, we propose video compression and decompression algorithms that enable fast and efficient vehicle or, more generally, event searches in large video databases. The proposed algorithm selects reference frames (i.e., I-frames) based on a vehicle having been detected at a specified position within the scene being monitored while compressing a video sequence. A search for a specific vehicle in the compressed video stream is performed across the reference frames only, which does not require decompression of the full video sequence as in traditional search algorithms. Our experimental results on videos captured in a local road show that the proposed algorithm significantly reduces the search space (thus reducing time and computational resources) in vehicle search tasks within compressed video streams, particularly those captured in light traffic volume conditions.
A method of operation scheduling based on video transcoding for cluster equipment

NASA Astrophysics Data System (ADS)

Zhou, Haojie; Yan, Chun

2018-04-01

Because of the cluster technology in real-time video transcoding device, the application of facing the massive growth in the number of video assignments and resolution and bit rate of diversity, task scheduling algorithm, and analyze the current mainstream of cluster for real-time video transcoding equipment characteristics of the cluster, combination with the characteristics of the cluster equipment task delay scheduling algorithm is proposed. This algorithm enables the cluster to get better performance in the generation of the job queue and the lower part of the job queue when receiving the operation instruction. In the end, a small real-time video transcode cluster is constructed to analyze the calculation ability, running time, resource occupation and other aspects of various algorithms in operation scheduling. The experimental results show that compared with traditional clustering task scheduling algorithm, task delay scheduling algorithm has more flexible and efficient characteristics.
Implementation of MPEG-2 encoder to multiprocessor system using multiple MVPs (TMS320C80)

NASA Astrophysics Data System (ADS)

Kim, HyungSun; Boo, Kenny; Chung, SeokWoo; Choi, Geon Y.; Lee, YongJin; Jeon, JaeHo; Park, Hyun Wook

1997-05-01

This paper presents the efficient algorithm mapping for the real-time MPEG-2 encoding on the KAIST image computing system (KICS), which has a parallel architecture using five multimedia video processors (MVPs). The MVP is a general purpose digital signal processor (DSP) of Texas Instrument. It combines one floating-point processor and four fixed- point DSPs on a single chip. The KICS uses the MVP as a primary processing element (PE). Two PEs form a cluster, and there are two processing clusters in the KICS. Real-time MPEG-2 encoder is implemented through the spatial and the functional partitioning strategies. Encoding process of spatially partitioned half of the video input frame is assigned to ne processing cluster. Two PEs perform the functionally partitioned MPEG-2 encoding tasks in the pipelined operation mode. One PE of a cluster carries out the transform coding part and the other performs the predictive coding part of the MPEG-2 encoding algorithm. One MVP among five MVPs is used for system control and interface with host computer. This paper introduces an implementation of the MPEG-2 algorithm with a parallel processing architecture.
Motion adaptive Kalman filter for super-resolution

NASA Astrophysics Data System (ADS)

Richter, Martin; Nasse, Fabian; Schröder, Hartmut

2011-01-01

Superresolution is a sophisticated strategy to enhance image quality of both low and high resolution video, performing tasks like artifact reduction, scaling and sharpness enhancement in one algorithm, all of them reconstructing high frequency components (above Nyquist frequency) in some way. Especially recursive superresolution algorithms can fulfill high quality aspects because they control the video output using a feed-back loop and adapt the result in the next iteration. In addition to excellent output quality, temporal recursive methods are very hardware efficient and therefore even attractive for real-time video processing. A very promising approach is the utilization of Kalman filters as proposed by Farsiu et al. Reliable motion estimation is crucial for the performance of superresolution. Therefore, robust global motion models are mainly used, but this also limits the application of superresolution algorithm. Thus, handling sequences with complex object motion is essential for a wider field of application. Hence, this paper proposes improvements by extending the Kalman filter approach using motion adaptive variance estimation and segmentation techniques. Experiments confirm the potential of our proposal for ideal and real video sequences with complex motion and further compare its performance to state-of-the-art methods like trainable filters.
Special-effect edit detection using VideoTrails: a comparison with existing techniques

NASA Astrophysics Data System (ADS)

Kobla, Vikrant; DeMenthon, Daniel; Doermann, David S.

1998-12-01

Video segmentation plays an integral role in many multimedia applications, such as digital libraries, content management systems, and various other video browsing, indexing, and retrieval systems. Many algorithms for segmentation of video have appeared within the past few years. Most of these algorithms perform well on cuts, but yield poor performance on gradual transitions or special effects edits. A complete video segmentation system must also achieve good performance on special effect edit detection. In this paper, we discuss the performance of our Video Trails-based algorithms, with other existing special effect edit-detection algorithms within the literature. Results from experiments testing for the ability to detect edits from TV programs, ranging from commercials to news magazine programs, including diverse special effect edits, which we have introduced.
An incremental DPMM-based method for trajectory clustering, modeling, and retrieval.

PubMed

Hu, Weiming; Li, Xi; Tian, Guodong; Maybank, Stephen; Zhang, Zhongfei

2013-05-01

Trajectory analysis is the basis for many applications, such as indexing of motion events in videos, activity recognition, and surveillance. In this paper, the Dirichlet process mixture model (DPMM) is applied to trajectory clustering, modeling, and retrieval. We propose an incremental version of a DPMM-based clustering algorithm and apply it to cluster trajectories. An appropriate number of trajectory clusters is determined automatically. When trajectories belonging to new clusters arrive, the new clusters can be identified online and added to the model without any retraining using the previous data. A time-sensitive Dirichlet process mixture model (tDPMM) is applied to each trajectory cluster for learning the trajectory pattern which represents the time-series characteristics of the trajectories in the cluster. Then, a parameterized index is constructed for each cluster. A novel likelihood estimation algorithm for the tDPMM is proposed, and a trajectory-based video retrieval model is developed. The tDPMM-based probabilistic matching method and the DPMM-based model growing method are combined to make the retrieval model scalable and adaptable. Experimental comparisons with state-of-the-art algorithms demonstrate the effectiveness of our algorithm.
Classification of video sequences into chosen generalized use classes of target size and lighting level.

PubMed

Leszczuk, Mikołaj; Dudek, Łukasz; Witkowski, Marcin

The VQiPS (Video Quality in Public Safety) Working Group, supported by the U.S. Department of Homeland Security, has been developing a user guide for public safety video applications. According to VQiPS, five parameters have particular importance influencing the ability to achieve a recognition task. They are: usage time-frame, discrimination level, target size, lighting level, and level of motion. These parameters form what are referred to as Generalized Use Classes (GUCs). The aim of our research was to develop algorithms that would automatically assist classification of input sequences into one of the GUCs. Target size and lighting level parameters were approached. The experiment described reveals the experts' ambiguity and hesitation during the manual target size determination process. However, the automatic methods developed for target size classification make it possible to determine GUC parameters with 70 % compliance to the end-users' opinion. Lighting levels of the entire sequence can be classified with an efficiency reaching 93 %. To make the algorithms available for use, a test application has been developed. It is able to process video files and display classification results, the user interface being very simple and requiring only minimal user interaction.
Audiovisual focus of attention and its application to Ultra High Definition video compression

NASA Astrophysics Data System (ADS)

Rerabek, Martin; Nemoto, Hiromi; Lee, Jong-Seok; Ebrahimi, Touradj

2014-02-01

Using Focus of Attention (FoA) as a perceptual process in image and video compression belongs to well-known approaches to increase coding efficiency. It has been shown that foveated coding, when compression quality varies across the image according to region of interest, is more efficient than the alternative coding, when all region are compressed in a similar way. However, widespread use of such foveated compression has been prevented due to two main conflicting causes, namely, the complexity and the efficiency of algorithms for FoA detection. One way around these is to use as much information as possible from the scene. Since most video sequences have an associated audio, and moreover, in many cases there is a correlation between the audio and the visual content, audiovisual FoA can improve efficiency of the detection algorithm while remaining of low complexity. This paper discusses a simple yet efficient audiovisual FoA algorithm based on correlation of dynamics between audio and video signal components. Results of audiovisual FoA detection algorithm are subsequently taken into account for foveated coding and compression. This approach is implemented into H.265/HEVC encoder producing a bitstream which is fully compliant to any H.265/HEVC decoder. The influence of audiovisual FoA in the perceived quality of high and ultra-high definition audiovisual sequences is explored and the amount of gain in compression efficiency is analyzed.
Intra Frame Coding In Advanced Video Coding Standard (H.264) to Obtain Consistent PSNR and Reduce Bit Rate for Diagonal Down Left Mode Using Gaussian Pulse

NASA Astrophysics Data System (ADS)

Manjanaik, N.; Parameshachari, B. D.; Hanumanthappa, S. N.; Banu, Reshma

2017-08-01

Intra prediction process of H.264 video coding standard used to code first frame i.e. Intra frame of video to obtain good coding efficiency compare to previous video coding standard series. More benefit of intra frame coding is to reduce spatial pixel redundancy with in current frame, reduces computational complexity and provides better rate distortion performance. To code Intra frame it use existing process Rate Distortion Optimization (RDO) method. This method increases computational complexity, increases in bit rate and reduces picture quality so it is difficult to implement in real time applications, so the many researcher has been developed fast mode decision algorithm for coding of intra frame. The previous work carried on Intra frame coding in H.264 standard using fast decision mode intra prediction algorithm based on different techniques was achieved increased in bit rate, degradation of picture quality(PSNR) for different quantization parameters. Many previous approaches of fast mode decision algorithms on intra frame coding achieved only reduction of computational complexity or it save encoding time and limitation was increase in bit rate with loss of quality of picture. In order to avoid increase in bit rate and loss of picture quality a better approach was developed. In this paper developed a better approach i.e. Gaussian pulse for Intra frame coding using diagonal down left intra prediction mode to achieve higher coding efficiency in terms of PSNR and bitrate. In proposed method Gaussian pulse is multiplied with each 4x4 frequency domain coefficients of 4x4 sub macro block of macro block of current frame before quantization process. Multiplication of Gaussian pulse for each 4x4 integer transformed coefficients at macro block levels scales the information of the coefficients in a reversible manner. The resulting signal would turn abstract. Frequency samples are abstract in a known and controllable manner without intermixing of coefficients, it avoids picture getting bad hit for higher values of quantization parameters. The proposed work was implemented using MATLAB and JM 18.6 reference software. The proposed work measure the performance parameters PSNR, bit rate and compression of intra frame of yuv video sequences in QCIF resolution under different values of quantization parameter with Gaussian value for diagonal down left intra prediction mode. The simulation results of proposed algorithm are tabulated and compared with previous algorithm i.e. Tian et al method. The proposed algorithm achieved reduced in bit rate averagely 30.98% and maintain consistent picture quality for QCIF sequences compared to previous algorithm i.e. Tian et al method.
Evaluation of H.264 and H.265 full motion video encoding for small UAS platforms

NASA Astrophysics Data System (ADS)

McGuinness, Christopher D.; Walker, David; Taylor, Clark; Hill, Kerry; Hoffman, Marc

2016-05-01

Of all the steps in the image acquisition and formation pipeline, compression is the only process that degrades image quality. A selected compression algorithm succeeds or fails to provide sufficient quality at the requested compression rate depending on how well the algorithm is suited to the input data. Applying an algorithm designed for one type of data to a different type often results in poor compression performance. This is mostly the case when comparing the performance of H.264, designed for standard definition data, to HEVC (High Efficiency Video Coding), which the Joint Collaborative Team on Video Coding (JCT-VC) designed for high-definition data. This study focuses on evaluating how HEVC compares to H.264 when compressing data from small UAS platforms. To compare the standards directly, we assess two open-source traditional software solutions: x264 and x265. These software-only comparisons allow us to establish a baseline of how much improvement can generally be expected of HEVC over H.264. Then, specific solutions leveraging different types of hardware are selected to understand the limitations of commercial-off-the-shelf (COTS) options. Algorithmically, regardless of the implementation, HEVC is found to provide similar quality video as H.264 at 40% lower data rates for video resolutions greater than 1280x720, roughly 1 Megapixel (MPx). For resolutions less than 1MPx, H.264 is an adequate solution though a small (roughly 20%) compression boost is earned by employing HEVC. New low cost, size, weight, and power (CSWAP) HEVC implementations are being developed and will be ideal for small UAS systems.
Impulsive noise removal from color video with morphological filtering

NASA Astrophysics Data System (ADS)

Ruchay, Alexey; Kober, Vitaly

2017-09-01

This paper deals with impulse noise removal from color video. The proposed noise removal algorithm employs a switching filtering for denoising of color video; that is, detection of corrupted pixels by means of a novel morphological filtering followed by removal of the detected pixels on the base of estimation of uncorrupted pixels in the previous scenes. With the help of computer simulation we show that the proposed algorithm is able to well remove impulse noise in color video. The performance of the proposed algorithm is compared in terms of image restoration metrics with that of common successful algorithms.
Spatiotemporal video deinterlacing using control grid interpolation

NASA Astrophysics Data System (ADS)

Venkatesan, Ragav; Zwart, Christine M.; Frakes, David H.; Li, Baoxin

2015-03-01

With the advent of progressive format display and broadcast technologies, video deinterlacing has become an important video-processing technique. Numerous approaches exist in the literature to accomplish deinterlacing. While most earlier methods were simple linear filtering-based approaches, the emergence of faster computing technologies and even dedicated video-processing hardware in display units has allowed higher quality but also more computationally intense deinterlacing algorithms to become practical. Most modern approaches analyze motion and content in video to select different deinterlacing methods for various spatiotemporal regions. We introduce a family of deinterlacers that employs spectral residue to choose between and weight control grid interpolation based spatial and temporal deinterlacing methods. The proposed approaches perform better than the prior state-of-the-art based on peak signal-to-noise ratio, other visual quality metrics, and simple perception-based subjective evaluations conducted by human viewers. We further study the advantages of using soft and hard decision thresholds on the visual performance.
Robust and efficient fiducial tracking for augmented reality in HD-laparoscopic video streams

NASA Astrophysics Data System (ADS)

Mueller, M.; Groch, A.; Baumhauer, M.; Maier-Hein, L.; Teber, D.; Rassweiler, J.; Meinzer, H.-P.; Wegner, In.

2012-02-01

Augmented Reality (AR) is a convenient way of porting information from medical images into the surgical field of view and can deliver valuable assistance to the surgeon, especially in laparoscopic procedures. In addition, high definition (HD) laparoscopic video devices are a great improvement over the previously used low resolution equipment. However, in AR applications that rely on real-time detection of fiducials from video streams, the demand for efficient image processing has increased due to the introduction of HD devices. We present an algorithm based on the well-known Conditional Density Propagation (CONDENSATION) algorithm which can satisfy these new demands. By incorporating a prediction around an already existing and robust segmentation algorithm, we can speed up the whole procedure while leaving the robustness of the fiducial segmentation untouched. For evaluation purposes we tested the algorithm on recordings from real interventions, allowing for a meaningful interpretation of the results. Our results show that we can accelerate the segmentation by a factor of 3.5 on average. Moreover, the prediction information can be used to compensate for fiducials that are temporarily occluded or out of scope, providing greater stability.
Smoke regions extraction based on two steps segmentation and motion detection in early fire

NASA Astrophysics Data System (ADS)

Jian, Wenlin; Wu, Kaizhi; Yu, Zirong; Chen, Lijuan

2018-03-01

Aiming at the early problems of video-based smoke detection in fire video, this paper proposes a method to extract smoke suspected regions by combining two steps segmentation and motion characteristics. Early smoldering smoke can be seen as gray or gray-white regions. In the first stage, regions of interests (ROIs) with smoke are obtained by using two step segmentation methods. Then, suspected smoke regions are detected by combining the two step segmentation and motion detection. Finally, morphological processing is used for smoke regions extracting. The Otsu algorithm is used as segmentation method and the ViBe algorithm is used to detect the motion of smoke. The proposed method was tested on 6 test videos with smoke. The experimental results show the effectiveness of our proposed method over visual observation.
A contourlet transform based algorithm for real-time video encoding

NASA Astrophysics Data System (ADS)

Katsigiannis, Stamos; Papaioannou, Georgios; Maroulis, Dimitris

2012-06-01

In recent years, real-time video communication over the internet has been widely utilized for applications like video conferencing. Streaming live video over heterogeneous IP networks, including wireless networks, requires video coding algorithms that can support various levels of quality in order to adapt to the network end-to-end bandwidth and transmitter/receiver resources. In this work, a scalable video coding and compression algorithm based on the Contourlet Transform is proposed. The algorithm allows for multiple levels of detail, without re-encoding the video frames, by just dropping the encoded information referring to higher resolution than needed. Compression is achieved by means of lossy and lossless methods, as well as variable bit rate encoding schemes. Furthermore, due to the transformation utilized, it does not suffer from blocking artifacts that occur with many widely adopted compression algorithms. Another highly advantageous characteristic of the algorithm is the suppression of noise induced by low-quality sensors usually encountered in web-cameras, due to the manipulation of the transform coefficients at the compression stage. The proposed algorithm is designed to introduce minimal coding delay, thus achieving real-time performance. Performance is enhanced by utilizing the vast computational capabilities of modern GPUs, providing satisfactory encoding and decoding times at relatively low cost. These characteristics make this method suitable for applications like video-conferencing that demand real-time performance, along with the highest visual quality possible for each user. Through the presented performance and quality evaluation of the algorithm, experimental results show that the proposed algorithm achieves better or comparable visual quality relative to other compression and encoding methods tested, while maintaining a satisfactory compression ratio. Especially at low bitrates, it provides more human-eye friendly images compared to algorithms utilizing block-based coding, like the MPEG family, as it introduces fuzziness and blurring instead of artificial block artifacts.
Comparative Evaluation of Background Subtraction Algorithms in Remote Scene Videos Captured by MWIR Sensors

PubMed Central

Yao, Guangle; Lei, Tao; Zhong, Jiandan; Jiang, Ping; Jia, Wenwu

2017-01-01

Background subtraction (BS) is one of the most commonly encountered tasks in video analysis and tracking systems. It distinguishes the foreground (moving objects) from the video sequences captured by static imaging sensors. Background subtraction in remote scene infrared (IR) video is important and common to lots of fields. This paper provides a Remote Scene IR Dataset captured by our designed medium-wave infrared (MWIR) sensor. Each video sequence in this dataset is identified with specific BS challenges and the pixel-wise ground truth of foreground (FG) for each frame is also provided. A series of experiments were conducted to evaluate BS algorithms on this proposed dataset. The overall performance of BS algorithms and the processor/memory requirements were compared. Proper evaluation metrics or criteria were employed to evaluate the capability of each BS algorithm to handle different kinds of BS challenges represented in this dataset. The results and conclusions in this paper provide valid references to develop new BS algorithm for remote scene IR video sequence, and some of them are not only limited to remote scene or IR video sequence but also generic for background subtraction. The Remote Scene IR dataset and the foreground masks detected by each evaluated BS algorithm are available online: https://github.com/JerryYaoGl/BSEvaluationRemoteSceneIR. PMID:28837112
Automated fall detection on privacy-enhanced video.

PubMed

Edgcomb, Alex; Vahid, Frank

2012-01-01

A privacy-enhanced video obscures the appearance of a person in the video. We consider four privacy enhancements: blurring of the person, silhouetting of the person, covering the person with a graphical box, and covering the person with a graphical oval. We demonstrate that an automated video-based fall detection algorithm can be as accurate on privacy-enhanced video as on raw video. The algorithm operated on video from a stationary in-home camera, using a foreground-background segmentation algorithm to extract a minimum bounding rectangle (MBR) around the motion in the video, and using time series shapelet analysis on the height and width of the rectangle to detect falls. We report accuracy applying fall detection on 23 scenarios depicted as raw video and privacy-enhanced videos involving a sole actor portraying normal activities and various falls. We found that fall detection on privacy-enhanced video, except for the common approach of blurring of the person, was competitive with raw video, and in particular that the graphical oval privacy enhancement yielded the same accuracy as raw video, namely 0.91 sensitivity and 0.92 specificity.

Video quality assessment using a statistical model of human visual speed perception.

PubMed

Wang, Zhou; Li, Qiang

2007-12-01

Motion is one of the most important types of information contained in natural video, but direct use of motion information in the design of video quality assessment algorithms has not been deeply investigated. Here we propose to incorporate a recent model of human visual speed perception [Nat. Neurosci. 9, 578 (2006)] and model visual perception in an information communication framework. This allows us to estimate both the motion information content and the perceptual uncertainty in video signals. Improved video quality assessment algorithms are obtained by incorporating the model as spatiotemporal weighting factors, where the weight increases with the information content and decreases with the perceptual uncertainty. Consistent improvement over existing video quality assessment algorithms is observed in our validation with the video quality experts group Phase I test data set.
Algorithm for Video Summarization of Bronchoscopy Procedures

PubMed Central

2011-01-01

Background The duration of bronchoscopy examinations varies considerably depending on the diagnostic and therapeutic procedures used. It can last more than 20 minutes if a complex diagnostic work-up is included. With wide access to videobronchoscopy, the whole procedure can be recorded as a video sequence. Common practice relies on an active attitude of the bronchoscopist who initiates the recording process and usually chooses to archive only selected views and sequences. However, it may be important to record the full bronchoscopy procedure as documentation when liability issues are at stake. Furthermore, an automatic recording of the whole procedure enables the bronchoscopist to focus solely on the performed procedures. Video recordings registered during bronchoscopies include a considerable number of frames of poor quality due to blurry or unfocused images. It seems that such frames are unavoidable due to the relatively tight endobronchial space, rapid movements of the respiratory tract due to breathing or coughing, and secretions which occur commonly in the bronchi, especially in patients suffering from pulmonary disorders. Methods The use of recorded bronchoscopy video sequences for diagnostic, reference and educational purposes could be considerably extended with efficient, flexible summarization algorithms. Thus, the authors developed a prototype system to create shortcuts (called summaries or abstracts) of bronchoscopy video recordings. Such a system, based on models described in previously published papers, employs image analysis methods to exclude frames or sequences of limited diagnostic or education value. Results The algorithm for the selection or exclusion of specific frames or shots from video sequences recorded during bronchoscopy procedures is based on several criteria, including automatic detection of "non-informative", frames showing the branching of the airways and frames including pathological lesions. Conclusions The paper focuses on the challenge of generating summaries of bronchoscopy video recordings. PMID:22185344
Behavior analysis of video object in complicated background

NASA Astrophysics Data System (ADS)

Zhao, Wenting; Wang, Shigang; Liang, Chao; Wu, Wei; Lu, Yang

2016-10-01

This paper aims to achieve robust behavior recognition of video object in complicated background. Features of the video object are described and modeled according to the depth information of three-dimensional video. Multi-dimensional eigen vector are constructed and used to process high-dimensional data. Stable object tracing in complex scenes can be achieved with multi-feature based behavior analysis, so as to obtain the motion trail. Subsequently, effective behavior recognition of video object is obtained according to the decision criteria. What's more, the real-time of algorithms and accuracy of analysis are both improved greatly. The theory and method on the behavior analysis of video object in reality scenes put forward by this project have broad application prospect and important practical significance in the security, terrorism, military and many other fields.
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis

DOE PAGES

Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.; ...

2017-10-16

This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less
Anomaly Detection in Moving-Camera Video Sequences Using Principal Subspace Analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Thomaz, Lucas A.; Jardim, Eric; da Silva, Allan F.

This study presents a family of algorithms based on sparse decompositions that detect anomalies in video sequences obtained from slow moving cameras. These algorithms start by computing the union of subspaces that best represents all the frames from a reference (anomaly free) video as a low-rank projection plus a sparse residue. Then, they perform a low-rank representation of a target (possibly anomalous) video by taking advantage of both the union of subspaces and the sparse residue computed from the reference video. Such algorithms provide good detection results while at the same time obviating the need for previous video synchronization. However,more » this is obtained at the cost of a large computational complexity, which hinders their applicability. Another contribution of this paper approaches this problem by using intrinsic properties of the obtained data representation in order to restrict the search space to the most relevant subspaces, providing computational complexity gains of up to two orders of magnitude. The developed algorithms are shown to cope well with videos acquired in challenging scenarios, as verified by the analysis of 59 videos from the VDAO database that comprises videos with abandoned objects in a cluttered industrial scenario.« less
An efficient CU partition algorithm for HEVC based on improved Sobel operator

NASA Astrophysics Data System (ADS)

Sun, Xuebin; Chen, Xiaodong; Xu, Yong; Sun, Gang; Yang, Yunsheng

2018-04-01

As the latest video coding standard, High Efficiency Video Coding (HEVC) achieves over 50% bit rate reduction with similar video quality compared with previous standards H.264/AVC. However, the higher compression efficiency is attained at the cost of significantly increasing computational load. In order to reduce the complexity, this paper proposes a fast coding unit (CU) partition technique to speed up the process. To detect the edge features of each CU, a more accurate improved Sobel filtering is developed and performed By analyzing the textural features of CU, an early CU splitting termination is proposed to decide whether a CU should be decomposed into four lower-dimensions CUs or not. Compared with the reference software HM16.7, experimental results indicate the proposed algorithm can lessen the encoding time up to 44.09% on average, with a negligible bit rate increase of 0.24%, and quality losses lower 0.03 dB, respectively. In addition, the proposed algorithm gets a better trade-off between complexity and rate-distortion among the other proposed works.
Incremental Structured Dictionary Learning for Video Sensor-Based Object Tracking

PubMed Central

Xue, Ming; Yang, Hua; Zheng, Shibao; Zhou, Yi; Yu, Zhenghua

2014-01-01

To tackle robust object tracking for video sensor-based applications, an online discriminative algorithm based on incremental discriminative structured dictionary learning (IDSDL-VT) is presented. In our framework, a discriminative dictionary combining both positive, negative and trivial patches is designed to sparsely represent the overlapped target patches. Then, a local update (LU) strategy is proposed for sparse coefficient learning. To formulate the training and classification process, a multiple linear classifier group based on a K-combined voting (KCV) function is proposed. As the dictionary evolves, the models are also trained to timely adapt the target appearance variation. Qualitative and quantitative evaluations on challenging image sequences compared with state-of-the-art algorithms demonstrate that the proposed tracking algorithm achieves a more favorable performance. We also illustrate its relay application in visual sensor networks. PMID:24549252
A Motion Detection Algorithm Using Local Phase Information

PubMed Central

Lazar, Aurel A.; Ukani, Nikul H.; Zhou, Yiyin

2016-01-01

Previous research demonstrated that global phase alone can be used to faithfully represent visual scenes. Here we provide a reconstruction algorithm by using only local phase information. We also demonstrate that local phase alone can be effectively used to detect local motion. The local phase-based motion detector is akin to models employed to detect motion in biological vision, for example, the Reichardt detector. The local phase-based motion detection algorithm introduced here consists of two building blocks. The first building block measures/evaluates the temporal change of the local phase. The temporal derivative of the local phase is shown to exhibit the structure of a second order Volterra kernel with two normalized inputs. We provide an efficient, FFT-based algorithm for implementing the change of the local phase. The second processing building block implements the detector; it compares the maximum of the Radon transform of the local phase derivative with a chosen threshold. We demonstrate examples of applying the local phase-based motion detection algorithm on several video sequences. We also show how the locally detected motion can be used for segmenting moving objects in video scenes and compare our local phase-based algorithm to segmentation achieved with a widely used optic flow algorithm. PMID:26880882
Robust real-time horizon detection in full-motion video

NASA Astrophysics Data System (ADS)

Young, Grace B.; Bagnall, Bryan; Lane, Corey; Parameswaran, Shibin

2014-06-01

The ability to detect the horizon on a real-time basis in full-motion video is an important capability to aid and facilitate real-time processing of full-motion videos for the purposes such as object detection, recognition and other video/image segmentation applications. In this paper, we propose a method for real-time horizon detection that is designed to be used as a front-end processing unit for a real-time marine object detection system that carries out object detection and tracking on full-motion videos captured by ship/harbor-mounted cameras, Unmanned Aerial Vehicles (UAVs) or any other method of surveillance for Maritime Domain Awareness (MDA). Unlike existing horizon detection work, we cannot assume a priori the angle or nature (for e.g. straight line) of the horizon, due to the nature of the application domain and the data. Therefore, the proposed real-time algorithm is designed to identify the horizon at any angle and irrespective of objects appearing close to and/or occluding the horizon line (for e.g. trees, vehicles at a distance) by accounting for its non-linear nature. We use a simple two-stage hierarchical methodology, leveraging color-based features, to quickly isolate the region of the image containing the horizon and then perform a more ne-grained horizon detection operation. In this paper, we present our real-time horizon detection results using our algorithm on real-world full-motion video data from a variety of surveillance sensors like UAVs and ship mounted cameras con rming the real-time applicability of this method and its ability to detect horizon with no a priori assumptions.
Image processing and computer controls for video profile diagnostic system in the ground test accelerator (GTA)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wright, R.M.; Zander, M.E.; Brown, S.K.

1992-09-01

This paper describes the application of video image processing to beam profile measurements on the Ground Test Accelerator (GTA). A diagnostic was needed to measure beam profiles in the intermediate matching section (IMS) between the radio-frequency quadrupole (RFQ) and the drift tube linac (DTL). Beam profiles are measured by injecting puffs of gas into the beam. The light emitted from the beam-gas interaction is captured and processed by a video image processing system, generating the beam profile data. A general purpose, modular and flexible video image processing system, imagetool, was used for the GTA image profile measurement. The development ofmore » both software and hardware for imagetool and its integration with the GTA control system (GTACS) will be discussed. The software includes specialized algorithms for analyzing data and calibrating the system. The underlying design philosophy of imagetool was tested by the experience of building and using the system, pointing the way for future improvements. The current status of the system will be illustrated by samples of experimental data.« less
Image processing and computer controls for video profile diagnostic system in the ground test accelerator (GTA)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wright, R.M.; Zander, M.E.; Brown, S.K.

1992-01-01

This paper describes the application of video image processing to beam profile measurements on the Ground Test Accelerator (GTA). A diagnostic was needed to measure beam profiles in the intermediate matching section (IMS) between the radio-frequency quadrupole (RFQ) and the drift tube linac (DTL). Beam profiles are measured by injecting puffs of gas into the beam. The light emitted from the beam-gas interaction is captured and processed by a video image processing system, generating the beam profile data. A general purpose, modular and flexible video image processing system, imagetool, was used for the GTA image profile measurement. The development ofmore » both software and hardware for imagetool and its integration with the GTA control system (GTACS) will be discussed. The software includes specialized algorithms for analyzing data and calibrating the system. The underlying design philosophy of imagetool was tested by the experience of building and using the system, pointing the way for future improvements. The current status of the system will be illustrated by samples of experimental data.« less
Complexity control algorithm based on adaptive mode selection for interframe coding in high efficiency video coding

NASA Astrophysics Data System (ADS)

Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong

2017-07-01

The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the lowdelayP condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BDPSNR is observed for 18 sequences when the target complexity is around 40%.
A novel frame-level constant-distortion bit allocation for smooth H.264/AVC video quality

NASA Astrophysics Data System (ADS)

Liu, Li; Zhuang, Xinhua

2009-01-01

It is known that quality fluctuation has a major negative effect on visual perception. In previous work, we introduced a constant-distortion bit allocation method [1] for H.263+ encoder. However, the method in [1] can not be adapted to the newest H.264/AVC encoder directly as the well-known chicken-egg dilemma resulted from the rate-distortion optimization (RDO) decision process. To solve this problem, we propose a new two stage constant-distortion bit allocation (CDBA) algorithm with enhanced rate control for H.264/AVC encoder. In stage-1, the algorithm performs RD optimization process with a constant quantization QP. Based on prediction residual signals from stage-1 and target distortion for smooth video quality purpose, the frame-level bit target is allocated by using a close-form approximations of ratedistortion relationship similar to [1], and a fast stage-2 encoding process is performed with enhanced basic unit rate control. Experimental results show that, compared with original rate control algorithm provided by H.264/AVC reference software JM12.1, the proposed constant-distortion frame-level bit allocation scheme reduces quality fluctuation and delivers much smoother PSNR on all testing sequences.
Automated vehicle counting using image processing and machine learning

NASA Astrophysics Data System (ADS)

Meany, Sean; Eskew, Edward; Martinez-Castro, Rosana; Jang, Shinae

2017-04-01

Vehicle counting is used by the government to improve roadways and the flow of traffic, and by private businesses for purposes such as determining the value of locating a new store in an area. A vehicle count can be performed manually or automatically. Manual counting requires an individual to be on-site and tally the traffic electronically or by hand. However, this can lead to miscounts due to factors such as human error A common form of automatic counting involves pneumatic tubes, but pneumatic tubes disrupt traffic during installation and removal, and can be damaged by passing vehicles. Vehicle counting can also be performed via the use of a camera at the count site recording video of the traffic, with counting being performed manually post-recording or using automatic algorithms. This paper presents a low-cost procedure to perform automatic vehicle counting using remote video cameras with an automatic counting algorithm. The procedure would utilize a Raspberry Pi micro-computer to detect when a car is in a lane, and generate an accurate count of vehicle movements. The method utilized in this paper would use background subtraction to process the images and a machine learning algorithm to provide the count. This method avoids fatigue issues that are encountered in manual video counting and prevents the disruption of roadways that occurs when installing pneumatic tubes
Energy Efficient Image/Video Data Transmission on Commercial Multi-Core Processors

PubMed Central

Lee, Sungju; Kim, Heegon; Chung, Yongwha; Park, Daihee

2012-01-01

In transmitting image/video data over Video Sensor Networks (VSNs), energy consumption must be minimized while maintaining high image/video quality. Although image/video compression is well known for its efficiency and usefulness in VSNs, the excessive costs associated with encoding computation and complexity still hinder its adoption for practical use. However, it is anticipated that high-performance handheld multi-core devices will be used as VSN processing nodes in the near future. In this paper, we propose a way to improve the energy efficiency of image and video compression with multi-core processors while maintaining the image/video quality. We improve the compression efficiency at the algorithmic level or derive the optimal parameters for the combination of a machine and compression based on the tradeoff between the energy consumption and the image/video quality. Based on experimental results, we confirm that the proposed approach can improve the energy efficiency of the straightforward approach by a factor of 2∼5 without compromising image/video quality. PMID:23202181
A robust human face detection algorithm

NASA Astrophysics Data System (ADS)

Raviteja, Thaluru; Karanam, Srikrishna; Yeduguru, Dinesh Reddy V.

2012-01-01

Human face detection plays a vital role in many applications like video surveillance, managing a face image database, human computer interface among others. This paper proposes a robust algorithm for face detection in still color images that works well even in a crowded environment. The algorithm uses conjunction of skin color histogram, morphological processing and geometrical analysis for detecting human faces. To reinforce the accuracy of face detection, we further identify mouth and eye regions to establish the presence/absence of face in a particular region of interest.
A deblocking algorithm based on color psychology for display quality enhancement

NASA Astrophysics Data System (ADS)

Yeh, Chia-Hung; Tseng, Wen-Yu; Huang, Kai-Lin

2012-12-01

This article proposes a post-processing deblocking filter to reduce blocking effects. The proposed algorithm detects blocking effects by fusing the results of Sobel edge detector and wavelet-based edge detector. The filtering stage provides four filter modes to eliminate blocking effects at different color regions according to human color vision and color psychology analysis. Experimental results show that the proposed algorithm has better subjective and objective qualities for H.264/AVC reconstructed videos when compared to several existing methods.
Multiple objects tracking with HOGs matching in circular windows

NASA Astrophysics Data System (ADS)

Miramontes-Jaramillo, Daniel; Kober, Vitaly; Díaz-Ramírez, Víctor H.

2014-09-01

In recent years tracking applications with development of new technologies like smart TVs, Kinect, Google Glass and Oculus Rift become very important. When tracking uses a matching algorithm, a good prediction algorithm is required to reduce the search area for each object to be tracked as well as processing time. In this work, we analyze the performance of different tracking algorithms based on prediction and matching for a real-time tracking multiple objects. The used matching algorithm utilizes histograms of oriented gradients. It carries out matching in circular windows, and possesses rotation invariance and tolerance to viewpoint and scale changes. The proposed algorithm is implemented in a personal computer with GPU, and its performance is analyzed in terms of processing time in real scenarios. Such implementation takes advantage of current technologies and helps to process video sequences in real-time for tracking several objects at the same time.
Computationally Efficient Clustering of Audio-Visual Meeting Data

NASA Astrophysics Data System (ADS)

Hung, Hayley; Friedland, Gerald; Yeo, Chuohao

This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors, comprising a limited number of cameras and microphones. We first demonstrate computationally efficient algorithms that can identify who spoke and when, a problem in speech processing known as speaker diarization. We also extract visual activity features efficiently from MPEG4 video by taking advantage of the processing that was already done for video compression. Then, we present a method of associating the audio-visual data together so that the content of each participant can be managed individually. The methods presented in this article can be used as a principal component that enables many higher-level semantic analysis tasks needed in search, retrieval, and navigation.
DSP Implementation of the Retinex Image Enhancement Algorithm

NASA Technical Reports Server (NTRS)

Hines, Glenn; Rahman, Zia-Ur; Jobson, Daniel; Woodell, Glenn

2004-01-01

The Retinex is a general-purpose image enhancement algorithm that is used to produce good visual representations of scenes. It performs a non-linear spatial/spectral transform that synthesizes strong local contrast enhancement and color constancy. A real-time, video frame rate implementation of the Retinex is required to meet the needs of various potential users. Retinex processing contains a relatively large number of complex computations, thus to achieve real-time performance using current technologies requires specialized hardware and software. In this paper we discuss the design and development of a digital signal processor (DSP) implementation of the Retinex. The target processor is a Texas Instruments TMS320C6711 floating point DSP. NTSC video is captured using a dedicated frame-grabber card, Retinex processed, and displayed on a standard monitor. We discuss the optimizations used to achieve real-time performance of the Retinex and also describe our future plans on using alternative architectures.

Computationally efficient video restoration for Nyquist sampled imaging sensors combining an affine-motion-based temporal Kalman filter and adaptive Wiener filter.

PubMed

Rucci, Michael; Hardie, Russell C; Barnard, Kenneth J

2014-05-01

In this paper, we present a computationally efficient video restoration algorithm to address both blur and noise for a Nyquist sampled imaging system. The proposed method utilizes a temporal Kalman filter followed by a correlation-model based spatial adaptive Wiener filter (AWF). The Kalman filter employs an affine background motion model and novel process-noise variance estimate. We also propose and demonstrate a new multidelay temporal Kalman filter designed to more robustly treat local motion. The AWF is a spatial operation that performs deconvolution and adapts to the spatially varying residual noise left in the Kalman filter stage. In image areas where the temporal Kalman filter is able to provide significant noise reduction, the AWF can be aggressive in its deconvolution. In other areas, where less noise reduction is achieved with the Kalman filter, the AWF balances the deconvolution with spatial noise reduction. In this way, the Kalman filter and AWF work together effectively, but without the computational burden of full joint spatiotemporal processing. We also propose a novel hybrid system that combines a temporal Kalman filter and BM3D processing. To illustrate the efficacy of the proposed methods, we test the algorithms on both simulated imagery and video collected with a visible camera.
Vehicle counting system using real-time video processing

NASA Astrophysics Data System (ADS)

Crisóstomo-Romero, Pedro M.

2006-02-01

Transit studies are important for planning a road network with optimal vehicular flow. A vehicular count is essential. This article presents a vehicle counting system based on video processing. An advantage of such system is the greater detail than is possible to obtain, like shape, size and speed of vehicles. The system uses a video camera placed above the street to image transit in real-time. The video camera must be placed at least 6 meters above the street level to achieve proper acquisition quality. Fast image processing algorithms and small image dimensions are used to allow real-time processing. Digital filters, mathematical morphology, segmentation and other techniques allow identifying and counting all vehicles in the image sequences. The system was implemented under Linux in a 1.8 GHz Pentium 4 computer. A successful count was obtained with frame rates of 15 frames per second for images of size 240x180 pixels and 24 frames per second for images of size 180x120 pixels, thus being able to count vehicles whose speeds do not exceed 150 km/h.
Multi-frame knowledge based text enhancement for mobile phone captured videos

NASA Astrophysics Data System (ADS)

Ozarslan, Suleyman; Eren, P. Erhan

2014-02-01

In this study, we explore automated text recognition and enhancement using mobile phone captured videos of store receipts. We propose a method which includes Optical Character Resolution (OCR) enhanced by our proposed Row Based Multiple Frame Integration (RB-MFI), and Knowledge Based Correction (KBC) algorithms. In this method, first, the trained OCR engine is used for recognition; then, the RB-MFI algorithm is applied to the output of the OCR. The RB-MFI algorithm determines and combines the most accurate rows of the text outputs extracted by using OCR from multiple frames of the video. After RB-MFI, KBC algorithm is applied to these rows to correct erroneous characters. Results of the experiments show that the proposed video-based approach which includes the RB-MFI and the KBC algorithm increases the word character recognition rate to 95%, and the character recognition rate to 98%.
Automatically monitoring driftwood in large rivers: preliminary results

NASA Astrophysics Data System (ADS)

Piegay, H.; Lemaire, P.; MacVicar, B.; Mouquet-Noppe, C.; Tougne, L.

2014-12-01

Driftwood in rivers impact sediment transport, riverine habitat and human infrastructures. Quantifying it, in particular large woods on fairly large rivers where it can move easily, would allow us to improve our knowledge on fluvial transport processes. There are several means of studying this phenomenon, amongst which RFID sensors tracking, photo and video monitoring. In this abstract, we are interested in the latter, being easier and cheaper to deploy. However, video monitoring of driftwood generates a huge amount of images and manually labeling it is tedious. It is essential to automate such a monitoring process, which is a difficult task in the field of computer vision, and more specifically automatic video analysis. Detecting foreground into dynamic background remains an open problem to date. We installed a video camera at the riverside of a gauging station on the Ain River, a 3500 km² Piedmont River in France. Several floods were manually annotated by a human operator. We developed software that automatically extracts and characterizes wood blocks within a video stream. This algorithm is based upon a statistical model and combines static, dynamic and spatial data. Segmented wood objects are further described with the help of a skeleton-based approach that helps us to automatically determine its shape, diameter and length. The first detailed comparisons between manual annotations and automatically extracted data show that we can fairly well detect large wood until a given size (approximately 120 cm in length or 15 cm in diameter) whereas smaller ones are difficult to detect and tend to be missed by either the human operator, either the algorithm. Detection is fairly accurate in high flow conditions where the water channel is usually brown because of suspended sediment transport. In low flow context, our algorithm still needs improvement to reduce the number of false positive so as to better distinguish shadow or turbulence structures from wood pieces.
Error-free holographic frames encryption with CA pixel-permutation encoding algorithm

NASA Astrophysics Data System (ADS)

Li, Xiaowei; Xiao, Dan; Wang, Qiong-Hua

2018-01-01

The security of video data is necessary in network security transmission hence cryptography is technique to make video data secure and unreadable to unauthorized users. In this paper, we propose a holographic frames encryption technique based on the cellular automata (CA) pixel-permutation encoding algorithm. The concise pixel-permutation algorithm is used to address the drawbacks of the traditional CA encoding methods. The effectiveness of the proposed video encoding method is demonstrated by simulation examples.
A hardware architecture for real-time shadow removal in high-contrast video

NASA Astrophysics Data System (ADS)

Verdugo, Pablo; Pezoa, Jorge E.; Figueroa, Miguel

2017-09-01

Broadcasting an outdoor sports event at daytime is a challenging task due to the high contrast that exists between areas in the shadow and light conditions within the same scene. Commercial cameras typically do not handle the high dynamic range of such scenes in a proper manner, resulting in broadcast streams with very little shadow detail. We propose a hardware architecture for real-time shadow removal in high-resolution video, which reduces the shadow effect and simultaneously improves shadow details. The algorithm operates only on the shadow portions of each video frame, thus improving the results and producing more realistic images than algorithms that operate on the entire frame, such as simplified Retinex and histogram shifting. The architecture receives an input in the RGB color space, transforms it into the YIQ space, and uses color information from both spaces to produce a mask of the shadow areas present in the image. The mask is then filtered using a connected components algorithm to eliminate false positives and negatives. The hardware uses pixel information at the edges of the mask to estimate the illumination ratio between light and shadow in the image, which is then used to correct the shadow area. Our prototype implementation simultaneously processes up to 7 video streams of 1920×1080 pixels at 60 frames per second on a Xilinx Kintex-7 XC7K325T FPGA.
Real-time heart rate measurement for multi-people using compressive tracking

NASA Astrophysics Data System (ADS)

Liu, Lingling; Zhao, Yuejin; Liu, Ming; Kong, Lingqin; Dong, Liquan; Ma, Feilong; Pang, Zongguang; Cai, Zhi; Zhang, Yachu; Hua, Peng; Yuan, Ruifeng

2017-09-01

The rise of aging population has created a demand for inexpensive, unobtrusive, automated health care solutions. Image PhotoPlethysmoGraphy(IPPG) aids in the development of these solutions by allowing for the extraction of physiological signals from video data. However, the main deficiencies of the recent IPPG methods are non-automated, non-real-time and susceptible to motion artifacts(MA). In this paper, a real-time heart rate(HR) detection method for multiple subjects simultaneously was proposed and realized using the open computer vision(openCV) library, which consists of getting multiple subjects' facial video automatically through a Webcam, detecting the region of interest (ROI) in the video, reducing the false detection rate by our improved Adaboost algorithm, reducing the MA by our improved compress tracking(CT) algorithm, wavelet noise-suppression algorithm for denoising and multi-threads for higher detection speed. For comparison, HR was measured simultaneously using a medical pulse oximetry device for every subject during all sessions. Experimental results on a data set of 30 subjects show that the max average absolute error of heart rate estimation is less than 8 beats per minute (BPM), and the processing speed of every frame has almost reached real-time: the experiments with video recordings of ten subjects under the condition of the pixel resolution of 600× 800 pixels show that the average HR detection time of 10 subjects was about 17 frames per second (fps).
Techniques for video compression

NASA Technical Reports Server (NTRS)

Wu, Chwan-Hwa

1995-01-01

In this report, we present our study on multiprocessor implementation of a MPEG2 encoding algorithm. First, we compare two approaches to implementing video standards, VLSI technology and multiprocessor processing, in terms of design complexity, applications, and cost. Then we evaluate the functional modules of MPEG2 encoding process in terms of their computation time. Two crucial modules are identified based on this evaluation. Then we present our experimental study on the multiprocessor implementation of the two crucial modules. Data partitioning is used for job assignment. Experimental results show that high speedup ratio and good scalability can be achieved by using this kind of job assignment strategy.
Target tracking and 3D trajectory acquisition of cabbage butterfly (P. rapae) based on the KCF-BS algorithm.

PubMed

Guo, Yang-Yang; He, Dong-Jian; Liu, Cong

2018-06-25

Insect behaviour is an important research topic in plant protection. To study insect behaviour accurately, it is necessary to observe and record their flight trajectory quantitatively and precisely in three dimensions (3D). The goal of this research was to analyse frames extracted from videos using Kernelized Correlation Filters (KCF) and Background Subtraction (BS) (KCF-BS) to plot the 3D trajectory of cabbage butterfly (P. rapae). Considering the experimental environment with a wind tunnel, a quadrature binocular vision insect video capture system was designed and applied in this study. The KCF-BS algorithm was used to track the butterfly in video frames and obtain coordinates of the target centroid in two videos. Finally the 3D trajectory was calculated according to the matching relationship in the corresponding frames of two angles in the video. To verify the validity of the KCF-BS algorithm, Compressive Tracking (CT) and Spatio-Temporal Context Learning (STC) algorithms were performed. The results revealed that the KCF-BS tracking algorithm performed more favourably than CT and STC in terms of accuracy and robustness.
Fast and robust wavelet-based dynamic range compression and contrast enhancement model with color restoration

NASA Astrophysics Data System (ADS)

Unaldi, Numan; Asari, Vijayan K.; Rahman, Zia-ur

2009-05-01

Recently we proposed a wavelet-based dynamic range compression algorithm to improve the visual quality of digital images captured from high dynamic range scenes with non-uniform lighting conditions. The fast image enhancement algorithm that provides dynamic range compression, while preserving the local contrast and tonal rendition, is also a good candidate for real time video processing applications. Although the colors of the enhanced images produced by the proposed algorithm are consistent with the colors of the original image, the proposed algorithm fails to produce color constant results for some "pathological" scenes that have very strong spectral characteristics in a single band. The linear color restoration process is the main reason for this drawback. Hence, a different approach is required for the final color restoration process. In this paper the latest version of the proposed algorithm, which deals with this issue is presented. The results obtained by applying the algorithm to numerous natural images show strong robustness and high image quality.
A novel vehicle tracking algorithm based on mean shift and active contour model in complex environment

NASA Astrophysics Data System (ADS)

Cai, Lei; Wang, Lin; Li, Bo; Zhang, Libao; Lv, Wen

2017-06-01

Vehicle tracking technology is currently one of the most active research topics in machine vision. It is an important part of intelligent transportation system. However, in theory and technology, it still faces many challenges including real-time and robustness. In video surveillance, the targets need to be detected in real-time and to be calculated accurate position for judging the motives. The contents of video sequence images and the target motion are complex, so the objects can't be expressed by a unified mathematical model. Object-tracking is defined as locating the interest moving target in each frame of a piece of video. The current tracking technology can achieve reliable results in simple environment over the target with easy identified characteristics. However, in more complex environment, it is easy to lose the target because of the mismatch between the target appearance and its dynamic model. Moreover, the target usually has a complex shape, but the tradition target tracking algorithm usually represents the tracking results by simple geometric such as rectangle or circle, so it cannot provide accurate information for the subsequent upper application. This paper combines a traditional object-tracking technology, Mean-Shift algorithm, with a kind of image segmentation algorithm, Active-Contour model, to get the outlines of objects while the tracking process and automatically handle topology changes. Meanwhile, the outline information is used to aid tracking algorithm to improve it.
Real-Time Visualization of Tissue Ischemia

NASA Technical Reports Server (NTRS)

Bearman, Gregory H. (Inventor); Chrien, Thomas D. (Inventor); Eastwood, Michael L. (Inventor)

2000-01-01

A real-time display of tissue ischemia which comprises three CCD video cameras, each with a narrow bandwidth filter at the correct wavelength is discussed. The cameras simultaneously view an area of tissue suspected of having ischemic areas through beamsplitters. The output from each camera is adjusted to give the correct signal intensity for combining with, the others into an image for display. If necessary a digital signal processor (DSP) can implement algorithms for image enhancement prior to display. Current DSP engines are fast enough to give real-time display. Measurement at three, wavelengths, combined into a real-time Red-Green-Blue (RGB) video display with a digital signal processing (DSP) board to implement image algorithms, provides direct visualization of ischemic areas.
Video denoising, deblocking, and enhancement through separable 4-D nonlocal spatiotemporal transforms.

PubMed

Maggioni, Matteo; Boracchi, Giacomo; Foi, Alessandro; Egiazarian, Karen

2012-09-01

We propose a powerful video filtering algorithm that exploits temporal and spatial redundancy characterizing natural video sequences. The algorithm implements the paradigm of nonlocal grouping and collaborative filtering, where a higher dimensional transform-domain representation of the observations is leveraged to enforce sparsity, and thus regularize the data: 3-D spatiotemporal volumes are constructed by tracking blocks along trajectories defined by the motion vectors. Mutually similar volumes are then grouped together by stacking them along an additional fourth dimension, thus producing a 4-D structure, termed group, where different types of data correlation exist along the different dimensions: local correlation along the two dimensions of the blocks, temporal correlation along the motion trajectories, and nonlocal spatial correlation (i.e., self-similarity) along the fourth dimension of the group. Collaborative filtering is then realized by transforming each group through a decorrelating 4-D separable transform and then by shrinkage and inverse transformation. In this way, the collaborative filtering provides estimates for each volume stacked in the group, which are then returned and adaptively aggregated to their original positions in the video. The proposed filtering procedure addresses several video processing applications, such as denoising, deblocking, and enhancement of both grayscale and color data. Experimental results prove the effectiveness of our method in terms of both subjective and objective visual quality, and show that it outperforms the state of the art in video denoising.
Shaking video stabilization with content completion

NASA Astrophysics Data System (ADS)

Peng, Yi; Ye, Qixiang; Liu, Yanmei; Jiao, Jianbin

2009-01-01

A new stabilization algorithm to counterbalance the shaking motion in a video based on classical Kandade-Lucas- Tomasi (KLT) method is presented in this paper. Feature points are evaluated with law of large numbers and clustering algorithm to reduce the side effect of moving foreground. Analysis on the change of motion direction is also carried out to detect the existence of shaking. For video clips with detected shaking, an affine transformation is performed to warp the current frame to the reference one. In addition, the missing content of a frame during the stabilization is completed with optical flow analysis and mosaicking operation. Experiments on video clips demonstrate the effectiveness of the proposed algorithm.
Digital video steganalysis exploiting collusion sensitivity

NASA Astrophysics Data System (ADS)

Budhia, Udit; Kundur, Deepa

2004-09-01

In this paper we present an effective steganalyis technique for digital video sequences based on the collusion attack. Steganalysis is the process of detecting with a high probability and low complexity the presence of covert data in multimedia. Existing algorithms for steganalysis target detecting covert information in still images. When applied directly to video sequences these approaches are suboptimal. In this paper, we present a method that overcomes this limitation by using redundant information present in the temporal domain to detect covert messages in the form of Gaussian watermarks. Our gains are achieved by exploiting the collusion attack that has recently been studied in the field of digital video watermarking, and more sophisticated pattern recognition tools. Applications of our scheme include cybersecurity and cyberforensics.
A Model-Based Approach for the Measurement of Eye Movements Using Image Processing

NASA Technical Reports Server (NTRS)

Sung, Kwangjae; Reschke, Millard F.

1997-01-01

This paper describes a video eye-tracking algorithm which searches for the best fit of the pupil modeled as a circular disk. The algorithm is robust to common image artifacts such as the droopy eyelids and light reflections while maintaining the measurement resolution available by the centroid algorithm. The presented algorithm is used to derive the pupil size and center coordinates, and can be combined with iris-tracking techniques to measure ocular torsion. A comparison search method of pupil candidates using pixel coordinate reference lookup tables optimizes the processing requirements for a least square fit of the circular disk model. This paper includes quantitative analyses and simulation results for the resolution and the robustness of the algorithm. The algorithm presented in this paper provides a platform for a noninvasive, multidimensional eye measurement system which can be used for clinical and research applications requiring the precise recording of eye movements in three-dimensional space.
Inferring consistent functional interaction patterns from natural stimulus FMRI data

PubMed Central

Sun, Jiehuan; Hu, Xintao; Huang, Xiu; Liu, Yang; Li, Kaiming; Li, Xiang; Han, Junwei; Guo, Lei

2014-01-01

There has been increasing interest in how the human brain responds to natural stimulus such as video watching in the neuroimaging field. Along this direction, this paper presents our effort in inferring consistent and reproducible functional interaction patterns under natural stimulus of video watching among known functional brain regions identified by task-based fMRI. Then, we applied and compared four statistical approaches, including Bayesian network modeling with searching algorithms: greedy equivalence search (GES), Peter and Clark (PC) analysis, independent multiple greedy equivalence search (IMaGES), and the commonly used Granger causality analysis (GCA), to infer consistent and reproducible functional interaction patterns among these brain regions. It is interesting that a number of reliable and consistent functional interaction patterns were identified by the GES, PC and IMaGES algorithms in different participating subjects when they watched multiple video shots of the same semantic category. These interaction patterns are meaningful given current neuroscience knowledge and are reasonably reproducible across different brains and video shots. In particular, these consistent functional interaction patterns are supported by structural connections derived from diffusion tensor imaging (DTI) data, suggesting the structural underpinnings of consistent functional interactions. Our work demonstrates that specific consistent patterns of functional interactions among relevant brain regions might reflect the brain's fundamental mechanisms of online processing and comprehension of video messages. PMID:22440644
Dynamic video encryption algorithm for H.264/AVC based on a spatiotemporal chaos system.

PubMed

Xu, Hui; Tong, Xiao-Jun; Zhang, Miao; Wang, Zhu; Li, Ling-Hao

2016-06-01

Video encryption schemes mostly employ the selective encryption method to encrypt parts of important and sensitive video information, aiming to ensure the real-time performance and encryption efficiency. The classic block cipher is not applicable to video encryption due to the high computational overhead. In this paper, we propose the encryption selection control module to encrypt video syntax elements dynamically which is controlled by the chaotic pseudorandom sequence. A novel spatiotemporal chaos system and binarization method is used to generate a key stream for encrypting the chosen syntax elements. The proposed scheme enhances the resistance against attacks through the dynamic encryption process and high-security stream cipher. Experimental results show that the proposed method exhibits high security and high efficiency with little effect on the compression ratio and time cost.
Non-contact cardiac pulse rate estimation based on web-camera

NASA Astrophysics Data System (ADS)

Wang, Yingzhi; Han, Tailin

2015-12-01

In this paper, we introduce a new methodology of non-contact cardiac pulse rate estimation based on the imaging Photoplethysmography (iPPG) and blind source separation. This novel's approach can be applied to color video recordings of the human face and is based on automatic face tracking along with blind source separation of the color channels into RGB three-channel component. First of all, we should do some pre-processings of the data which can be got from color video such as normalization and sphering. We can use spectrum analysis to estimate the cardiac pulse rate by Independent Component Analysis (ICA) and JADE algorithm. With Bland-Altman and correlation analysis, we compared the cardiac pulse rate extracted from videos recorded by a basic webcam to a Commercial pulse oximetry sensors and achieved high accuracy and correlation. Root mean square error for the estimated results is 2.06bpm, which indicates that the algorithm can realize the non-contact measurements of cardiac pulse rate.
Detection of dominant flow and abnormal events in surveillance video

NASA Astrophysics Data System (ADS)

Kwak, Sooyeong; Byun, Hyeran

2011-02-01

We propose an algorithm for abnormal event detection in surveillance video. The proposed algorithm is based on a semi-unsupervised learning method, a kind of feature-based approach so that it does not detect the moving object individually. The proposed algorithm identifies dominant flow without individual object tracking using a latent Dirichlet allocation model in crowded environments. It can also automatically detect and localize an abnormally moving object in real-life video. The performance tests are taken with several real-life databases, and their results show that the proposed algorithm can efficiently detect abnormally moving objects in real time. The proposed algorithm can be applied to any situation in which abnormal directions or abnormal speeds are detected regardless of direction.

Colonoscopy video quality assessment using hidden Markov random fields

NASA Astrophysics Data System (ADS)

Park, Sun Young; Sargent, Dusty; Spofford, Inbar; Vosburgh, Kirby

2011-03-01

With colonoscopy becoming a common procedure for individuals aged 50 or more who are at risk of developing colorectal cancer (CRC), colon video data is being accumulated at an ever increasing rate. However, the clinically valuable information contained in these videos is not being maximally exploited to improve patient care and accelerate the development of new screening methods. One of the well-known difficulties in colonoscopy video analysis is the abundance of frames with no diagnostic information. Approximately 40% - 50% of the frames in a colonoscopy video are contaminated by noise, acquisition errors, glare, blur, and uneven illumination. Therefore, filtering out low quality frames containing no diagnostic information can significantly improve the efficiency of colonoscopy video analysis. To address this challenge, we present a quality assessment algorithm to detect and remove low quality, uninformative frames. The goal of our algorithm is to discard low quality frames while retaining all diagnostically relevant information. Our algorithm is based on a hidden Markov model (HMM) in combination with two measures of data quality to filter out uninformative frames. Furthermore, we present a two-level framework based on an embedded hidden Markov model (EHHM) to incorporate the proposed quality assessment algorithm into a complete, automated diagnostic image analysis system for colonoscopy video.
Very low cost real time histogram-based contrast enhancer utilizing fixed-point DSP processing

NASA Astrophysics Data System (ADS)

McCaffrey, Nathaniel J.; Pantuso, Francis P.

1998-03-01

A real time contrast enhancement system utilizing histogram- based algorithms has been developed to operate on standard composite video signals. This low-cost DSP based system is designed with fixed-point algorithms and an off-chip look up table (LUT) to reduce the cost considerably over other contemporary approaches. This paper describes several real- time contrast enhancing systems advanced at the Sarnoff Corporation for high-speed visible and infrared cameras. The fixed-point enhancer was derived from these high performance cameras. The enhancer digitizes analog video and spatially subsamples the stream to qualify the scene's luminance. Simultaneously, the video is streamed through a LUT that has been programmed with the previous calculation. Reducing division operations by subsampling reduces calculation- cycles and also allows the processor to be used with cameras of nominal resolutions. All values are written to the LUT during blanking so no frames are lost. The enhancer measures 13 cm X 6.4 cm X 3.2 cm, operates off 9 VAC and consumes 12 W. This processor is small and inexpensive enough to be mounted with field deployed security cameras and can be used for surveillance, video forensics and real- time medical imaging.
A hardware-oriented concurrent TZ search algorithm for High-Efficiency Video Coding

NASA Astrophysics Data System (ADS)

Doan, Nghia; Kim, Tae Sung; Rhee, Chae Eun; Lee, Hyuk-Jae

2017-12-01

High-Efficiency Video Coding (HEVC) is the latest video coding standard, in which the compression performance is double that of its predecessor, the H.264/AVC standard, while the video quality remains unchanged. In HEVC, the test zone (TZ) search algorithm is widely used for integer motion estimation because it effectively searches the good-quality motion vector with a relatively small amount of computation. However, the complex computation structure of the TZ search algorithm makes it difficult to implement it in the hardware. This paper proposes a new integer motion estimation algorithm which is designed for hardware execution by modifying the conventional TZ search to allow parallel motion estimations of all prediction unit (PU) partitions. The algorithm consists of the three phases of zonal, raster, and refinement searches. At the beginning of each phase, the algorithm obtains the search points required by the original TZ search for all PU partitions in a coding unit (CU). Then, all redundant search points are removed prior to the estimation of the motion costs, and the best search points are then selected for all PUs. Compared to the conventional TZ search algorithm, experimental results show that the proposed algorithm significantly decreases the Bjøntegaard Delta bitrate (BD-BR) by 0.84%, and it also reduces the computational complexity by 54.54%.
Video fingerprinting for copy identification: from research to industry applications

NASA Astrophysics Data System (ADS)

Lu, Jian

2009-02-01

Research that began a decade ago in video copy detection has developed into a technology known as "video fingerprinting". Today, video fingerprinting is an essential and enabling tool adopted by the industry for video content identification and management in online video distribution. This paper provides a comprehensive review of video fingerprinting technology and its applications in identifying, tracking, and managing copyrighted content on the Internet. The review includes a survey on video fingerprinting algorithms and some fundamental design considerations, such as robustness, discriminability, and compactness. It also discusses fingerprint matching algorithms, including complexity analysis, and approximation and optimization for fast fingerprint matching. On the application side, it provides an overview of a number of industry-driven applications that rely on video fingerprinting. Examples are given based on real-world systems and workflows to demonstrate applications in detecting and managing copyrighted content, and in monitoring and tracking video distribution on the Internet.
Wavelet-based audio embedding and audio/video compression

NASA Astrophysics Data System (ADS)

Mendenhall, Michael J.; Claypoole, Roger L., Jr.

2001-12-01

Watermarking, traditionally used for copyright protection, is used in a new and exciting way. An efficient wavelet-based watermarking technique embeds audio information into a video signal. Several effective compression techniques are applied to compress the resulting audio/video signal in an embedded fashion. This wavelet-based compression algorithm incorporates bit-plane coding, index coding, and Huffman coding. To demonstrate the potential of this audio embedding and audio/video compression algorithm, we embed an audio signal into a video signal and then compress. Results show that overall compression rates of 15:1 can be achieved. The video signal is reconstructed with a median PSNR of nearly 33 dB. Finally, the audio signal is extracted from the compressed audio/video signal without error.
Particle Filtering with Region-based Matching for Tracking of Partially Occluded and Scaled Targets*

PubMed Central

Nakhmani, Arie; Tannenbaum, Allen

2012-01-01

Visual tracking of arbitrary targets in clutter is important for a wide range of military and civilian applications. We propose a general framework for the tracking of scaled and partially occluded targets, which do not necessarily have prominent features. The algorithm proposed in the present paper utilizes a modified normalized cross-correlation as the likelihood for a particle filter. The algorithm divides the template, selected by the user in the first video frame, into numerous patches. The matching process of these patches by particle filtering allows one to handle the target’s occlusions and scaling. Experimental results with fixed rectangular templates show that the method is reliable for videos with nonstationary, noisy, and cluttered background, and provides accurate trajectories in cases of target translation, scaling, and occlusion. PMID:22506088
Direct migration motion estimation and mode decision to decoder for a low-complexity decoder Wyner-Ziv video coding

NASA Astrophysics Data System (ADS)

Lei, Ted Chih-Wei; Tseng, Fan-Shuo

2017-07-01

This paper addresses the problem of high-computational complexity decoding in traditional Wyner-Ziv video coding (WZVC). The key focus is the migration of two traditionally high-computationally complex encoder algorithms, namely motion estimation and mode decision. In order to reduce the computational burden in this process, the proposed architecture adopts the partial boundary matching algorithm and four flexible types of block mode decision at the decoder. This approach does away with the need for motion estimation and mode decision at the encoder. The experimental results show that the proposed padding block-based WZVC not only decreases decoder complexity to approximately one hundredth that of the state-of-the-art DISCOVER decoding but also outperforms DISCOVER codec by up to 3 to 4 dB.
Space Shuttle Main Engine Propellant Path Leak Detection Using Sequential Image Processing

NASA Technical Reports Server (NTRS)

Smith, L. Montgomery; Malone, Jo Anne; Crawford, Roger A.

1995-01-01

Initial research in this study using theoretical radiation transport models established that the occurrence of a leak is accompanies by a sudden but sustained change in intensity in a given region of an image. In this phase, temporal processing of video images on a frame-by-frame basis was used to detect leaks within a given field of view. The leak detection algorithm developed in this study consists of a digital highpass filter cascaded with a moving average filter. The absolute value of the resulting discrete sequence is then taken and compared to a threshold value to produce the binary leak/no leak decision at each point in the image. Alternatively, averaging over the full frame of the output image produces a single time-varying mean value estimate that is indicative of the intensity and extent of a leak. Laboratory experiments were conducted in which artificially created leaks on a simulated SSME background were produced and recorded from a visible wavelength video camera. This data was processed frame-by-frame over the time interval of interest using an image processor implementation of the leak detection algorithm. In addition, a 20 second video sequence of an actual SSME failure was analyzed using this technique. The resulting output image sequences and plots of the full frame mean value versus time verify the effectiveness of the system.
First- and third-party ground truth for key frame extraction from consumer video clips

NASA Astrophysics Data System (ADS)

Costello, Kathleen; Luo, Jiebo

2007-02-01

Extracting key frames (KF) from video is of great interest in many applications, such as video summary, video organization, video compression, and prints from video. KF extraction is not a new problem. However, current literature has been focused mainly on sports or news video. In the consumer video space, the biggest challenges for key frame selection from consumer videos are the unconstrained content and lack of any preimposed structure. In this study, we conduct ground truth collection of key frames from video clips taken by digital cameras (as opposed to camcorders) using both first- and third-party judges. The goals of this study are: (1) to create a reference database of video clips reasonably representative of the consumer video space; (2) to identify associated key frames by which automated algorithms can be compared and judged for effectiveness; and (3) to uncover the criteria used by both first- and thirdparty human judges so these criteria can influence algorithm design. The findings from these ground truths will be discussed.
Robust efficient video fingerprinting

NASA Astrophysics Data System (ADS)

Puri, Manika; Lubin, Jeffrey

2009-02-01

We have developed a video fingerprinting system with robustness and efficiency as the primary and secondary design criteria. In extensive testing, the system has shown robustness to cropping, letter-boxing, sub-titling, blur, drastic compression, frame rate changes, size changes and color changes, as well as to the geometric distortions often associated with camcorder capture in cinema settings. Efficiency is afforded by a novel two-stage detection process in which a fast matching process first computes a number of likely candidates, which are then passed to a second slower process that computes the overall best match with minimal false alarm probability. One key component of the algorithm is a maximally stable volume computation - a three-dimensional generalization of maximally stable extremal regions - that provides a content-centric coordinate system for subsequent hash function computation, independent of any affine transformation or extensive cropping. Other key features include an efficient bin-based polling strategy for initial candidate selection, and a final SIFT feature-based computation for final verification. We describe the algorithm and its performance, and then discuss additional modifications that can provide further improvement to efficiency and accuracy.
Subjective evaluation of next-generation video compression algorithms: a case study

NASA Astrophysics Data System (ADS)

De Simone, Francesca; Goldmann, Lutz; Lee, Jong-Seok; Ebrahimi, Touradj; Baroncini, Vittorio

2010-08-01

This paper describes the details and the results of the subjective quality evaluation performed at EPFL, as a contribution to the effort of the Joint Collaborative Team on Video Coding (JCT-VC) for the definition of the next-generation video coding standard. The performance of 27 coding technologies have been evaluated with respect to two H.264/MPEG-4 AVC anchors, considering high definition (HD) test material. The test campaign involved a total of 494 naive observers and took place over a period of four weeks. While similar tests have been conducted as part of the standardization process of previous video coding technologies, the test campaign described in this paper is by far the most extensive in the history of video coding standardization. The obtained subjective quality scores show high consistency and support an accurate comparison of the performance of the different coding solutions.
Highly efficient simulation environment for HDTV video decoder in VLSI design

NASA Astrophysics Data System (ADS)

Mao, Xun; Wang, Wei; Gong, Huimin; He, Yan L.; Lou, Jian; Yu, Lu; Yao, Qingdong; Pirsch, Peter

2002-01-01

With the increase of the complex of VLSI such as the SoC (System on Chip) of MPEG-2 Video decoder with HDTV scalability especially, simulation and verification of the full design, even as high as the behavior level in HDL, often proves to be very slow, costly and it is difficult to perform full verification until late in the design process. Therefore, they become bottleneck of the procedure of HDTV video decoder design, and influence it's time-to-market mostly. In this paper, the architecture of Hardware/Software Interface of HDTV video decoder is studied, and a Hardware-Software Mixed Simulation (HSMS) platform is proposed to check and correct error in the early design stage, based on the algorithm of MPEG-2 video decoding. The application of HSMS to target system could be achieved by employing several introduced approaches. Those approaches speed up the simulation and verification task without decreasing performance.
The Simple Video Coder: A free tool for efficiently coding social video data.

PubMed

Barto, Daniel; Bird, Clark W; Hamilton, Derek A; Fink, Brandi C

2017-08-01

Videotaping of experimental sessions is a common practice across many disciplines of psychology, ranging from clinical therapy, to developmental science, to animal research. Audio-visual data are a rich source of information that can be easily recorded; however, analysis of the recordings presents a major obstacle to project completion. Coding behavior is time-consuming and often requires ad-hoc training of a student coder. In addition, existing software is either prohibitively expensive or cumbersome, which leaves researchers with inadequate tools to quickly process video data. We offer the Simple Video Coder-free, open-source software for behavior coding that is flexible in accommodating different experimental designs, is intuitive for students to use, and produces outcome measures of event timing, frequency, and duration. Finally, the software also offers extraction tools to splice video into coded segments suitable for training future human coders or for use as input for pattern classification algorithms.
Segment scheduling method for reducing 360° video streaming latency

NASA Astrophysics Data System (ADS)

Gudumasu, Srinivas; Asbun, Eduardo; He, Yong; Ye, Yan

2017-09-01

360° video is an emerging new format in the media industry enabled by the growing availability of virtual reality devices. It provides the viewer a new sense of presence and immersion. Compared to conventional rectilinear video (2D or 3D), 360° video poses a new and difficult set of engineering challenges on video processing and delivery. Enabling comfortable and immersive user experience requires very high video quality and very low latency, while the large video file size poses a challenge to delivering 360° video in a quality manner at scale. Conventionally, 360° video represented in equirectangular or other projection formats can be encoded as a single standards-compliant bitstream using existing video codecs such as H.264/AVC or H.265/HEVC. Such method usually needs very high bandwidth to provide an immersive user experience. While at the client side, much of such high bandwidth and the computational power used to decode the video are wasted because the user only watches a small portion (i.e., viewport) of the entire picture. Viewport dependent 360°video processing and delivery approaches spend more bandwidth on the viewport than on non-viewports and are therefore able to reduce the overall transmission bandwidth. This paper proposes a dual buffer segment scheduling algorithm for viewport adaptive streaming methods to reduce latency when switching between high quality viewports in 360° video streaming. The approach decouples the scheduling of viewport segments and non-viewport segments to ensure the viewport segment requested matches the latest user head orientation. A base layer buffer stores all lower quality segments, and a viewport buffer stores high quality viewport segments corresponding to the most recent viewer's head orientation. The scheduling scheme determines viewport requesting time based on the buffer status and the head orientation. This paper also discusses how to deploy the proposed scheduling design for various viewport adaptive video streaming methods. The proposed dual buffer segment scheduling method is implemented in an end-to-end tile based 360° viewports adaptive video streaming platform, where the entire 360° video is divided into a number of tiles, and each tile is independently encoded into multiple quality level representations. The client requests different quality level representations of each tile based on the viewer's head orientation and the available bandwidth, and then composes all tiles together for rendering. The simulation results verify that the proposed dual buffer segment scheduling algorithm reduces the viewport switch latency, and utilizes available bandwidth more efficiently. As a result, a more consistent immersive 360° video viewing experience can be presented to the user.
Temporal flicker reduction and denoising in video using sparse directional transforms

NASA Astrophysics Data System (ADS)

Kanumuri, Sandeep; Guleryuz, Onur G.; Civanlar, M. Reha; Fujibayashi, Akira; Boon, Choong S.

2008-08-01

The bulk of the video content available today over the Internet and over mobile networks suffers from many imperfections caused during acquisition and transmission. In the case of user-generated content, which is typically produced with inexpensive equipment, these imperfections manifest in various ways through noise, temporal flicker and blurring, just to name a few. Imperfections caused by compression noise and temporal flicker are present in both studio-produced and user-generated video content transmitted at low bit-rates. In this paper, we introduce an algorithm designed to reduce temporal flicker and noise in video sequences. The algorithm takes advantage of the sparse nature of video signals in an appropriate transform domain that is chosen adaptively based on local signal statistics. When the signal corresponds to a sparse representation in this transform domain, flicker and noise, which are spread over the entire domain, can be reduced easily by enforcing sparsity. Our results show that the proposed algorithm reduces flicker and noise significantly and enables better presentation of compressed videos.
Designing a scalable video-on-demand server with data sharing

NASA Astrophysics Data System (ADS)

Lim, Hyeran; Du, David H.

2000-12-01

As current disk space and transfer speed increase, the bandwidth between a server and its disks has become critical for video-on-demand (VOD) services. Our VOD server consists of several hosts sharing data on disks through a ring-based network. Data sharing provided by the spatial-reuse ring network between servers and disks not only increases the utilization towards full bandwidth but also improves the availability of videos. Striping and replication methods are introduced in order to improve the efficiency of our VOD server system as well as the availability of videos. We consider tow kinds of resources of a VOD server system. Given a representative access profile, our intention is to propose an algorithm to find an initial condition, place videos on disks in the system successfully. If any copy of a video cannot be placed due to lack of resources, more servers/disks are added. When all videos are place on the disks by our algorithm, the final configuration is determined with indicator of how tolerable it is against the fluctuation in demand of videos. Considering it is a NP-hard problem, our algorithm generates the final configuration with O(M log M) at best, where M is the number of movies.
Designing a scalable video-on-demand server with data sharing

NASA Astrophysics Data System (ADS)

Lim, Hyeran; Du, David H. C.

2001-01-01

As current disk space and transfer speed increase, the bandwidth between a server and its disks has become critical for video-on-demand (VOD) services. Our VOD server consists of several hosts sharing data on disks through a ring-based network. Data sharing provided by the spatial-reuse ring network between servers and disks not only increases the utilization towards full bandwidth but also improves the availability of videos. Striping and replication methods are introduced in order to improve the efficiency of our VOD server system as well as the availability of videos. We consider tow kinds of resources of a VOD server system. Given a representative access profile, our intention is to propose an algorithm to find an initial condition, place videos on disks in the system successfully. If any copy of a video cannot be placed due to lack of resources, more servers/disks are added. When all videos are place on the disks by our algorithm, the final configuration is determined with indicator of how tolerable it is against the fluctuation in demand of videos. Considering it is a NP-hard problem, our algorithm generates the final configuration with O(M log M) at best, where M is the number of movies.
Two novel motion-based algorithms for surveillance video analysis on embedded platforms

NASA Astrophysics Data System (ADS)

Vijverberg, Julien A.; Loomans, Marijn J. H.; Koeleman, Cornelis J.; de With, Peter H. N.

2010-05-01

This paper proposes two novel motion-vector based techniques for target detection and target tracking in surveillance videos. The algorithms are designed to operate on a resource-constrained device, such as a surveillance camera, and to reuse the motion vectors generated by the video encoder. The first novel algorithm for target detection uses motion vectors to construct a consistent motion mask, which is combined with a simple background segmentation technique to obtain a segmentation mask. The second proposed algorithm aims at multi-target tracking and uses motion vectors to assign blocks to targets employing five features. The weights of these features are adapted based on the interaction between targets. These algorithms are combined in one complete analysis application. The performance of this application for target detection has been evaluated for the i-LIDS sterile zone dataset and achieves an F1-score of 0.40-0.69. The performance of the analysis algorithm for multi-target tracking has been evaluated using the CAVIAR dataset and achieves an MOTP of around 9.7 and MOTA of 0.17-0.25. On a selection of targets in videos from other datasets, the achieved MOTP and MOTA are 8.8-10.5 and 0.32-0.49 respectively. The execution time on a PC-based platform is 36 ms. This includes the 20 ms for generating motion vectors, which are also required by the video encoder.
Realization and optimization of AES algorithm on the TMS320DM6446 based on DaVinci technology

NASA Astrophysics Data System (ADS)

Jia, Wen-bin; Xiao, Fu-hai

2013-03-01

The application of AES algorithm in the digital cinema system avoids video data to be illegal theft or malicious tampering, and solves its security problems. At the same time, in order to meet the requirements of the real-time, scene and transparent encryption of high-speed data streams of audio and video in the information security field, through the in-depth analysis of AES algorithm principle, based on the hardware platform of TMS320DM6446, with the software framework structure of DaVinci, this paper proposes the specific realization methods of AES algorithm in digital video system and its optimization solutions. The test results show digital movies encrypted by AES128 can not play normally, which ensures the security of digital movies. Through the comparison of the performance of AES128 algorithm before optimization and after, the correctness and validity of improved algorithm is verified.
A novel rotational matrix and translation vector algorithm: geometric accuracy for augmented reality in oral and maxillofacial surgeries.

PubMed

Murugesan, Yahini Prabha; Alsadoon, Abeer; Manoranjan, Paul; Prasad, P W C

2018-06-01

Augmented reality-based surgeries have not been successfully implemented in oral and maxillofacial areas due to limitations in geometric accuracy and image registration. This paper aims to improve the accuracy and depth perception of the augmented video. The proposed system consists of a rotational matrix and translation vector algorithm to reduce the geometric error and improve the depth perception by including 2 stereo cameras and a translucent mirror in the operating room. The results on the mandible/maxilla area show that the new algorithm improves the video accuracy by 0.30-0.40 mm (in terms of overlay error) and the processing rate to 10-13 frames/s compared to 7-10 frames/s in existing systems. The depth perception increased by 90-100 mm. The proposed system concentrates on reducing the geometric error. Thus, this study provides an acceptable range of accuracy with a shorter operating time, which provides surgeons with a smooth surgical flow. Copyright © 2018 John Wiley & Sons, Ltd.

Query by example video based on fuzzy c-means initialized by fixed clustering center

NASA Astrophysics Data System (ADS)

Hou, Sujuan; Zhou, Shangbo; Siddique, Muhammad Abubakar

2012-04-01

Currently, the high complexity of video contents has posed the following major challenges for fast retrieval: (1) efficient similarity measurements, and (2) efficient indexing on the compact representations. A video-retrieval strategy based on fuzzy c-means (FCM) is presented for querying by example. Initially, the query video is segmented and represented by a set of shots, each shot can be represented by a key frame, and then we used video processing techniques to find visual cues to represent the key frame. Next, because the FCM algorithm is sensitive to the initializations, here we initialized the cluster center by the shots of query video so that users could achieve appropriate convergence. After an FCM cluster was initialized by the query video, each shot of query video was considered a benchmark point in the aforesaid cluster, and each shot in the database possessed a class label. The similarity between the shots in the database with the same class label and benchmark point can be transformed into the distance between them. Finally, the similarity between the query video and the video in database was transformed into the number of similar shots. Our experimental results demonstrated the performance of this proposed approach.
Specialized Computer Systems for Environment Visualization

NASA Astrophysics Data System (ADS)

Al-Oraiqat, Anas M.; Bashkov, Evgeniy A.; Zori, Sergii A.

2018-06-01

The need for real time image generation of landscapes arises in various fields as part of tasks solved by virtual and augmented reality systems, as well as geographic information systems. Such systems provide opportunities for collecting, storing, analyzing and graphically visualizing geographic data. Algorithmic and hardware software tools for increasing the realism and efficiency of the environment visualization in 3D visualization systems are proposed. This paper discusses a modified path tracing algorithm with a two-level hierarchy of bounding volumes and finding intersections with Axis-Aligned Bounding Box. The proposed algorithm eliminates the branching and hence makes the algorithm more suitable to be implemented on the multi-threaded CPU and GPU. A modified ROAM algorithm is used to solve the qualitative visualization of reliefs' problems and landscapes. The algorithm is implemented on parallel systems—cluster and Compute Unified Device Architecture-networks. Results show that the implementation on MPI clusters is more efficient than Graphics Processing Unit/Graphics Processing Clusters and allows real-time synthesis. The organization and algorithms of the parallel GPU system for the 3D pseudo stereo image/video synthesis are proposed. With realizing possibility analysis on a parallel GPU-architecture of each stage, 3D pseudo stereo synthesis is performed. An experimental prototype of a specialized hardware-software system 3D pseudo stereo imaging and video was developed on the CPU/GPU. The experimental results show that the proposed adaptation of 3D pseudo stereo imaging to the architecture of GPU-systems is efficient. Also it accelerates the computational procedures of 3D pseudo-stereo synthesis for the anaglyph and anamorphic formats of the 3D stereo frame without performing optimization procedures. The acceleration is on average 11 and 54 times for test GPUs.
A robust coding scheme for packet video

NASA Technical Reports Server (NTRS)

Chen, Y. C.; Sayood, Khalid; Nelson, D. J.

1991-01-01

We present a layered packet video coding algorithm based on a progressive transmission scheme. The algorithm provides good compression and can handle significant packet loss with graceful degradation in the reconstruction sequence. Simulation results for various conditions are presented.
A robust coding scheme for packet video

NASA Technical Reports Server (NTRS)

Chen, Yun-Chung; Sayood, Khalid; Nelson, Don J.

1992-01-01

A layered packet video coding algorithm based on a progressive transmission scheme is presented. The algorithm provides good compression and can handle significant packet loss with graceful degradation in the reconstruction sequence. Simulation results for various conditions are presented.
Effect of Watermarking on Diagnostic Preservation of Atherosclerotic Ultrasound Video in Stroke Telemedicine.

PubMed

Dey, Nilanjan; Bose, Soumyo; Das, Achintya; Chaudhuri, Sheli Sinha; Saba, Luca; Shafique, Shoaib; Nicolaides, Andrew; Suri, Jasjit S

2016-04-01

Embedding of diagnostic and health care information requires secure encryption and watermarking. This research paper presents a comprehensive study for the behavior of some well established watermarking algorithms in frequency domain for the preservation of stroke-based diagnostic parameters. Two different sets of watermarking algorithms namely: two correlation-based (binary logo hiding) and two singular value decomposition (SVD)-based (gray logo hiding) watermarking algorithms are used for embedding ownership logo. The diagnostic parameters in atherosclerotic plaque ultrasound video are namely: (a) bulb identification and recognition which consists of identifying the bulb edge points in far and near carotid walls; (b) carotid bulb diameter; and (c) carotid lumen thickness all along the carotid artery. The tested data set consists of carotid atherosclerotic movies taken under IRB protocol from University of Indiana Hospital, USA-AtheroPoint™ (Roseville, CA, USA) joint pilot study. ROC (receiver operating characteristic) analysis was performed on the bulb detection process that showed an accuracy and sensitivity of 100 % each, respectively. The diagnostic preservation (DPsystem) for SVD-based approach was above 99 % with PSNR (Peak signal-to-noise ratio) above 41, ensuring the retention of diagnostic parameter devalorization as an effect of watermarking. Thus, the fully automated proposed system proved to be an efficient method for watermarking the atherosclerotic ultrasound video for stroke application.
Video quality pooling adaptive to perceptual distortion severity.

PubMed

Park, Jincheol; Seshadrinathan, Kalpana; Lee, Sanghoon; Bovik, Alan Conrad

2013-02-01

It is generally recognized that severe video distortions that are transient in space and/or time have a large effect on overall perceived video quality. In order to understand this phenomena, we study the distribution of spatio-temporally local quality scores obtained from several video quality assessment (VQA) algorithms on videos suffering from compression and lossy transmission over communication channels. We propose a content adaptive spatial and temporal pooling strategy based on the observed distribution. Our method adaptively emphasizes "worst" scores along both the spatial and temporal dimensions of a video sequence and also considers the perceptual effect of large-area cohesive motion flow such as egomotion. We demonstrate the efficacy of the method by testing it using three different VQA algorithms on the LIVE Video Quality database and the EPFL-PoliMI video quality database.
Ensemble of Chaotic and Naive Approaches for Performance Enhancement in Video Encryption.

PubMed

Chandrasekaran, Jeyamala; Thiruvengadam, S J

2015-01-01

Owing to the growth of high performance network technologies, multimedia applications over the Internet are increasing exponentially. Applications like video conferencing, video-on-demand, and pay-per-view depend upon encryption algorithms for providing confidentiality. Video communication is characterized by distinct features such as large volume, high redundancy between adjacent frames, video codec compliance, syntax compliance, and application specific requirements. Naive approaches for video encryption encrypt the entire video stream with conventional text based cryptographic algorithms. Although naive approaches are the most secure for video encryption, the computational cost associated with them is very high. This research work aims at enhancing the speed of naive approaches through chaos based S-box design. Chaotic equations are popularly known for randomness, extreme sensitivity to initial conditions, and ergodicity. The proposed methodology employs two-dimensional discrete Henon map for (i) generation of dynamic and key-dependent S-box that could be integrated with symmetric algorithms like Blowfish and Data Encryption Standard (DES) and (ii) generation of one-time keys for simple substitution ciphers. The proposed design is tested for randomness, nonlinearity, avalanche effect, bit independence criterion, and key sensitivity. Experimental results confirm that chaos based S-box design and key generation significantly reduce the computational cost of video encryption with no compromise in security.
Ensemble of Chaotic and Naive Approaches for Performance Enhancement in Video Encryption

PubMed Central

Chandrasekaran, Jeyamala; Thiruvengadam, S. J.

2015-01-01

Owing to the growth of high performance network technologies, multimedia applications over the Internet are increasing exponentially. Applications like video conferencing, video-on-demand, and pay-per-view depend upon encryption algorithms for providing confidentiality. Video communication is characterized by distinct features such as large volume, high redundancy between adjacent frames, video codec compliance, syntax compliance, and application specific requirements. Naive approaches for video encryption encrypt the entire video stream with conventional text based cryptographic algorithms. Although naive approaches are the most secure for video encryption, the computational cost associated with them is very high. This research work aims at enhancing the speed of naive approaches through chaos based S-box design. Chaotic equations are popularly known for randomness, extreme sensitivity to initial conditions, and ergodicity. The proposed methodology employs two-dimensional discrete Henon map for (i) generation of dynamic and key-dependent S-box that could be integrated with symmetric algorithms like Blowfish and Data Encryption Standard (DES) and (ii) generation of one-time keys for simple substitution ciphers. The proposed design is tested for randomness, nonlinearity, avalanche effect, bit independence criterion, and key sensitivity. Experimental results confirm that chaos based S-box design and key generation significantly reduce the computational cost of video encryption with no compromise in security. PMID:26550603
Learning multimodal dictionaries.

PubMed

Monaci, Gianluca; Jost, Philippe; Vandergheynst, Pierre; Mailhé, Boris; Lesage, Sylvain; Gribonval, Rémi

2007-09-01

Real-world phenomena involve complex interactions between multiple signal modalities. As a consequence, humans are used to integrate at each instant perceptions from all their senses in order to enrich their understanding of the surrounding world. This paradigm can be also extremely useful in many signal processing and computer vision problems involving mutually related signals. The simultaneous processing of multimodal data can, in fact, reveal information that is otherwise hidden when considering the signals independently. However, in natural multimodal signals, the statistical dependencies between modalities are in general not obvious. Learning fundamental multimodal patterns could offer deep insight into the structure of such signals. In this paper, we present a novel model of multimodal signals based on their sparse decomposition over a dictionary of multimodal structures. An algorithm for iteratively learning multimodal generating functions that can be shifted at all positions in the signal is proposed, as well. The learning is defined in such a way that it can be accomplished by iteratively solving a generalized eigenvector problem, which makes the algorithm fast, flexible, and free of user-defined parameters. The proposed algorithm is applied to audiovisual sequences and it is able to discover underlying structures in the data. The detection of such audio-video patterns in audiovisual clips allows to effectively localize the sound source on the video in presence of substantial acoustic and visual distractors, outperforming state-of-the-art audiovisual localization algorithms.
Bio-Inspired Neural Model for Learning Dynamic Models

NASA Technical Reports Server (NTRS)

Duong, Tuan; Duong, Vu; Suri, Ronald

2009-01-01

A neural-network mathematical model that, relative to prior such models, places greater emphasis on some of the temporal aspects of real neural physical processes, has been proposed as a basis for massively parallel, distributed algorithms that learn dynamic models of possibly complex external processes by means of learning rules that are local in space and time. The algorithms could be made to perform such functions as recognition and prediction of words in speech and of objects depicted in video images. The approach embodied in this model is said to be "hardware-friendly" in the following sense: The algorithms would be amenable to execution by special-purpose computers implemented as very-large-scale integrated (VLSI) circuits that would operate at relatively high speeds and low power demands.
Hierarchical event selection for video storyboards with a case study on snooker video visualization.

PubMed

Parry, Matthew L; Legg, Philip A; Chung, David H S; Griffiths, Iwan W; Chen, Min

2011-12-01

Video storyboard, which is a form of video visualization, summarizes the major events in a video using illustrative visualization. There are three main technical challenges in creating a video storyboard, (a) event classification, (b) event selection and (c) event illustration. Among these challenges, (a) is highly application-dependent and requires a significant amount of application specific semantics to be encoded in a system or manually specified by users. This paper focuses on challenges (b) and (c). In particular, we present a framework for hierarchical event representation, and an importance-based selection algorithm for supporting the creation of a video storyboard from a video. We consider the storyboard to be an event summarization for the whole video, whilst each individual illustration on the board is also an event summarization but for a smaller time window. We utilized a 3D visualization template for depicting and annotating events in illustrations. To demonstrate the concepts and algorithms developed, we use Snooker video visualization as a case study, because it has a concrete and agreeable set of semantic definitions for events and can make use of existing techniques of event detection and 3D reconstruction in a reliable manner. Nevertheless, most of our concepts and algorithms developed for challenges (b) and (c) can be applied to other application areas. © 2010 IEEE
Robust camera calibration for sport videos using court models

NASA Astrophysics Data System (ADS)

Farin, Dirk; Krabbe, Susanne; de With, Peter H. N.; Effelsberg, Wolfgang

2003-12-01

We propose an automatic camera calibration algorithm for court sports. The obtained camera calibration parameters are required for applications that need to convert positions in the video frame to real-world coordinates or vice versa. Our algorithm uses a model of the arrangement of court lines for calibration. Since the court model can be specified by the user, the algorithm can be applied to a variety of different sports. The algorithm starts with a model initialization step which locates the court in the image without any user assistance or a-priori knowledge about the most probable position. Image pixels are classified as court line pixels if they pass several tests including color and local texture constraints. A Hough transform is applied to extract line elements, forming a set of court line candidates. The subsequent combinatorial search establishes correspondences between lines in the input image and lines from the court model. For the succeeding input frames, an abbreviated calibration algorithm is used, which predicts the camera parameters for the new image and optimizes the parameters using a gradient-descent algorithm. We have conducted experiments on a variety of sport videos (tennis, volleyball, and goal area sequences of soccer games). Video scenes with considerable difficulties were selected to test the robustness of the algorithm. Results show that the algorithm is very robust to occlusions, partial court views, bad lighting conditions, or shadows.
Evaluation schemes for video and image anomaly detection algorithms

NASA Astrophysics Data System (ADS)

Parameswaran, Shibin; Harguess, Josh; Barngrover, Christopher; Shafer, Scott; Reese, Michael

2016-05-01

Video anomaly detection is a critical research area in computer vision. It is a natural first step before applying object recognition algorithms. There are many algorithms that detect anomalies (outliers) in videos and images that have been introduced in recent years. However, these algorithms behave and perform differently based on differences in domains and tasks to which they are subjected. In order to better understand the strengths and weaknesses of outlier algorithms and their applicability in a particular domain/task of interest, it is important to measure and quantify their performance using appropriate evaluation metrics. There are many evaluation metrics that have been used in the literature such as precision curves, precision-recall curves, and receiver operating characteristic (ROC) curves. In order to construct these different metrics, it is also important to choose an appropriate evaluation scheme that decides when a proposed detection is considered a true or a false detection. Choosing the right evaluation metric and the right scheme is very critical since the choice can introduce positive or negative bias in the measuring criterion and may favor (or work against) a particular algorithm or task. In this paper, we review evaluation metrics and popular evaluation schemes that are used to measure the performance of anomaly detection algorithms on videos and imagery with one or more anomalies. We analyze the biases introduced by these by measuring the performance of an existing anomaly detection algorithm.
High-precision tracking of brownian boomerang colloidal particles confined in quasi two dimensions.

PubMed

Chakrabarty, Ayan; Wang, Feng; Fan, Chun-Zhen; Sun, Kai; Wei, Qi-Huo

2013-11-26

In this article, we present a high-precision image-processing algorithm for tracking the translational and rotational Brownian motion of boomerang-shaped colloidal particles confined in quasi-two-dimensional geometry. By measuring mean square displacements of an immobilized particle, we demonstrate that the positional and angular precision of our imaging and image-processing system can achieve 13 nm and 0.004 rad, respectively. By analyzing computer-simulated images, we demonstrate that the positional and angular accuracies of our image-processing algorithm can achieve 32 nm and 0.006 rad. Because of zero correlations between the displacements in neighboring time intervals, trajectories of different videos of the same particle can be merged into a very long time trajectory, allowing for long-time averaging of different physical variables. We apply this image-processing algorithm to measure the diffusion coefficients of boomerang particles of three different apex angles and discuss the angle dependence of these diffusion coefficients.
A Genetic Algorithm and Fuzzy Logic Approach for Video Shot Boundary Detection

PubMed Central

Thounaojam, Dalton Meitei; Khelchandra, Thongam; Singh, Kh. Manglem; Roy, Sudipta

2016-01-01

This paper proposed a shot boundary detection approach using Genetic Algorithm and Fuzzy Logic. In this, the membership functions of the fuzzy system are calculated using Genetic Algorithm by taking preobserved actual values for shot boundaries. The classification of the types of shot transitions is done by the fuzzy system. Experimental results show that the accuracy of the shot boundary detection increases with the increase in iterations or generations of the GA optimization process. The proposed system is compared to latest techniques and yields better result in terms of F1score parameter. PMID:27127500
Tracking and recognition face in videos with incremental local sparse representation model

NASA Astrophysics Data System (ADS)

Wang, Chao; Wang, Yunhong; Zhang, Zhaoxiang

2013-10-01

This paper addresses the problem of tracking and recognizing faces via incremental local sparse representation. First a robust face tracking algorithm is proposed via employing local sparse appearance and covariance pooling method. In the following face recognition stage, with the employment of a novel template update strategy, which combines incremental subspace learning, our recognition algorithm adapts the template to appearance changes and reduces the influence of occlusion and illumination variation. This leads to a robust video-based face tracking and recognition with desirable performance. In the experiments, we test the quality of face recognition in real-world noisy videos on YouTube database, which includes 47 celebrities. Our proposed method produces a high face recognition rate at 95% of all videos. The proposed face tracking and recognition algorithms are also tested on a set of noisy videos under heavy occlusion and illumination variation. The tracking results on challenging benchmark videos demonstrate that the proposed tracking algorithm performs favorably against several state-of-the-art methods. In the case of the challenging dataset in which faces undergo occlusion and illumination variation, and tracking and recognition experiments under significant pose variation on the University of California, San Diego (Honda/UCSD) database, our proposed method also consistently demonstrates a high recognition rate.
A MPEG-4 encoder based on TMS320C6416

NASA Astrophysics Data System (ADS)

Li, Gui-ju; Liu, Wei-ning

2013-08-01

Engineering and products need to achieve real-time video encoding by DSP, but the high computational complexity and huge amount of data requires that system has high data throughput. In this paper, a real-time MPEG-4 video encoder is designed based on TMS320C6416 platform. The kernel is the DSP of TMS320C6416T and FPGA chip f as the organization and management of video data. In order to control the flow of input and output data. Encoded stream is output using the synchronous serial port. The system has the clock frequency of 1GHz and has up to 8000 MIPS speed processing capacity when running at full speed. Due to the low coding efficiency of MPEG-4 video encoder transferred directly to DSP platform, it is needed to improve the program structure, data structures and algorithms combined with TMS320C6416T characteristics. First: Design the image storage architecture by balancing the calculation spending, storage space cost and EDMA read time factors. Open up a more buffer in memory, each buffer cache 16 lines of video data to be encoded, reconstruction image and reference image including search range. By using the variable alignment mode of the DSP, modifying the definition of structure variables and change the look-up table which occupy larger space with a direct calculation array to save memory space. After the program structure optimization, the program code, all variables, buffering buffers and the interpolation image including the search range can be placed in memory. Then, as to the time-consuming process modules and some functions which are called many times, the corresponding modules are written in parallel assembly language of TMS320C6416T which can increase the running speed. Besides, the motion estimation algorithm is improved by using a cross-hexagon search algorithm, The search speed can be increased obviously. Finally, the execution time, signal-to-noise ratio and compression ratio of a real-time image acquisition sequence is given. The experimental results show that the designed encoder in this paper can accomplish real-time encoding of a 768× 576, 25 frames per second grayscale video. The code rate is 1.5M bits per second.
Locally adaptive vector quantization: Data compression with feature preservation

NASA Technical Reports Server (NTRS)

Cheung, K. M.; Sayano, M.

1992-01-01

A study of a locally adaptive vector quantization (LAVQ) algorithm for data compression is presented. This algorithm provides high-speed one-pass compression and is fully adaptable to any data source and does not require a priori knowledge of the source statistics. Therefore, LAVQ is a universal data compression algorithm. The basic algorithm and several modifications to improve performance are discussed. These modifications are nonlinear quantization, coarse quantization of the codebook, and lossless compression of the output. Performance of LAVQ on various images using irreversible (lossy) coding is comparable to that of the Linde-Buzo-Gray algorithm, but LAVQ has a much higher speed; thus this algorithm has potential for real-time video compression. Unlike most other image compression algorithms, LAVQ preserves fine detail in images. LAVQ's performance as a lossless data compression algorithm is comparable to that of Lempel-Ziv-based algorithms, but LAVQ uses far less memory during the coding process.
Video semaphore decoding for free-space optical communication

NASA Astrophysics Data System (ADS)

Last, Matthew; Fisher, Brian; Ezekwe, Chinwuba; Hubert, Sean M.; Patel, Sheetal; Hollar, Seth; Leibowitz, Brian S.; Pister, Kristofer S. J.

2001-04-01

Using teal-time image processing we have demonstrated a low bit-rate free-space optical communication system at a range of more than 20km with an average optical transmission power of less than 2mW. The transmitter is an autonomous one cubic inch microprocessor-controlled sensor node with a laser diode output. The receiver is a standard CCD camera with a 1-inch aperture lens, and both hardware and software implementations of the video semaphore decoding algorithm. With this system sensor data can be reliably transmitted 21 km form San Francisco to Berkeley.
Low-complexity image processing for real-time detection of neonatal clonic seizures.

PubMed

Ntonfo, Guy Mathurin Kouamou; Ferrari, Gianluigi; Raheli, Riccardo; Pisani, Francesco

2012-05-01

In this paper, we consider a novel low-complexity real-time image-processing-based approach to the detection of neonatal clonic seizures. Our approach is based on the extraction, from a video of a newborn, of an average luminance signal representative of the body movements. Since clonic seizures are characterized by periodic movements of parts of the body (e.g., the limbs), by evaluating the periodicity of the extracted average luminance signal it is possible to detect the presence of a clonic seizure. The periodicity is investigated, through a hybrid autocorrelation-Yin estimation technique, on a per-window basis, where a time window is defined as a sequence of consecutive video frames. While processing is first carried out on a single window basis, we extend our approach to interlaced windows. The performance of the proposed detection algorithm is investigated, in terms of sensitivity and specificity, through receiver operating characteristic curves, considering video recordings of newborns affected by neonatal seizures.

Shot boundary detection and label propagation for spatio-temporal video segmentation

NASA Astrophysics Data System (ADS)

Piramanayagam, Sankaranaryanan; Saber, Eli; Cahill, Nathan D.; Messinger, David

2015-02-01

This paper proposes a two stage algorithm for streaming video segmentation. In the first stage, shot boundaries are detected within a window of frames by comparing dissimilarity between 2-D segmentations of each frame. In the second stage, the 2-D segments are propagated across the window of frames in both spatial and temporal direction. The window is moved across the video to find all shot transitions and obtain spatio-temporal segments simultaneously. As opposed to techniques that operate on entire video, the proposed approach consumes significantly less memory and enables segmentation of lengthy videos. We tested our segmentation based shot detection method on the TRECVID 2007 video dataset and compared it with block-based technique. Cut detection results on the TRECVID 2007 dataset indicate that our algorithm has comparable results to the best of the block-based methods. The streaming video segmentation routine also achieves promising results on a challenging video segmentation benchmark database.
Speckle reduction in echocardiography by temporal compounding and anisotropic diffusion filtering

NASA Astrophysics Data System (ADS)

Giraldo-Guzmán, Jader; Porto-Solano, Oscar; Cadena-Bonfanti, Alberto; Contreras-Ortiz, Sonia H.

2015-01-01

Echocardiography is a medical imaging technique based on ultrasound signals that is used to evaluate heart anatomy and physiology. Echocardiographic images are affected by speckle, a type of multiplicative noise that obscures details of the structures, and reduces the overall image quality. This paper shows an approach to enhance echocardiography using two processing techniques: temporal compounding and anisotropic diffusion filtering. We used twenty echocardiographic videos that include one or three cardiac cycles to test the algorithms. Two images from each cycle were aligned in space and averaged to obtain the compound images. These images were then processed using anisotropic diffusion filters to further improve their quality. Resultant images were evaluated using quality metrics and visual assessment by two medical doctors. The average total improvement on signal-to-noise ratio was up to 100.29% for videos with three cycles, and up to 32.57% for videos with one cycle.
An optimal adder-based hardware architecture for the DCT/SA-DCT

NASA Astrophysics Data System (ADS)

Kinane, Andrew; Muresan, Valentin; O'Connor, Noel

2005-07-01

The explosive growth of the mobile multimedia industry has accentuated the need for ecient VLSI implemen- tations of the associated computationally demanding signal processing algorithms. This need becomes greater as end-users demand increasingly enhanced features and more advanced underpinning video analysis. One such feature is object-based video processing as supported by MPEG-4 core profile, which allows content-based in- teractivity. MPEG-4 has many computationally demanding underlying algorithms, an example of which is the Shape Adaptive Discrete Cosine Transform (SA-DCT). The dynamic nature of the SA-DCT processing steps pose significant VLSI implementation challenges and many of the previously proposed approaches use area and power consumptive multipliers. Most also ignore the subtleties of the packing steps and manipulation of the shape information. We propose a new multiplier-less serial datapath based solely on adders and multiplexers to improve area and power. The adder cost is minimised by employing resource re-use methods. The number of (physical) adders used has been derived using a common sub-expression elimination algorithm. Additional energy eciency is factored into the design by employing guarded evaluation and local clock gating. Our design implements the SA-DCT packing with minimal switching using ecient addressing logic with a transpose mem- ory RAM. The entire design has been synthesized using TSMC 0.09µm TCBN90LP technology yielding a gate count of 12028 for the datapath and its control logic.
Restoration of Static JPEG Images and RGB Video Frames by Means of Nonlinear Filtering in Conditions of Gaussian and Non-Gaussian Noise

NASA Astrophysics Data System (ADS)

Sokolov, R. I.; Abdullin, R. R.

2017-11-01

The use of nonlinear Markov process filtering makes it possible to restore both video stream frames and static photos at the stage of preprocessing. The present paper reflects the results of research in comparison of these types image filtering quality by means of special algorithm when Gaussian or non-Gaussian noises acting. Examples of filter operation at different values of signal-to-noise ratio are presented. A comparative analysis has been performed, and the best filtered kind of noise has been defined. It has been shown the quality of developed algorithm is much better than quality of adaptive one for RGB signal filtering at the same a priori information about the signal. Also, an advantage over median filter takes a place when both fluctuation and pulse noise filtering.
Model-based video segmentation for vision-augmented interactive games

NASA Astrophysics Data System (ADS)

Liu, Lurng-Kuo

2000-04-01

This paper presents an architecture and algorithms for model based video object segmentation and its applications to vision augmented interactive game. We are especially interested in real time low cost vision based applications that can be implemented in software in a PC. We use different models for background and a player object. The object segmentation algorithm is performed in two different levels: pixel level and object level. At pixel level, the segmentation algorithm is formulated as a maximizing a posteriori probability (MAP) problem. The statistical likelihood of each pixel is calculated and used in the MAP problem. Object level segmentation is used to improve segmentation quality by utilizing the information about the spatial and temporal extent of the object. The concept of an active region, which is defined based on motion histogram and trajectory prediction, is introduced to indicate the possibility of a video object region for both background and foreground modeling. It also reduces the overall computation complexity. In contrast with other applications, the proposed video object segmentation system is able to create background and foreground models on the fly even without introductory background frames. Furthermore, we apply different rate of self-tuning on the scene model so that the system can adapt to the environment when there is a scene change. We applied the proposed video object segmentation algorithms to several prototype virtual interactive games. In our prototype vision augmented interactive games, a player can immerse himself/herself inside a game and can virtually interact with other animated characters in a real time manner without being constrained by helmets, gloves, special sensing devices, or background environment. The potential applications of the proposed algorithms including human computer gesture interface and object based video coding such as MPEG-4 video coding.
Studying fish near ocean energy devices using underwater video

DOE Office of Scientific and Technical Information (OSTI.GOV)

Matzner, Shari; Hull, Ryan E.; Harker-Klimes, Genevra EL

The effects of energy devices on fish populations are not well-understood, and studying the interactions of fish with tidal and instream turbines is challenging. To address this problem, we have evaluated algorithms to automatically detect fish in underwater video and propose a semi-automated method for ocean and river energy device ecological monitoring. The key contributions of this work are the demonstration of a background subtraction algorithm (ViBE) that detected 87% of human-identified fish events and is suitable for use in a real-time system to reduce data volume, and the demonstration of a statistical model to classify detections as fish ormore » not fish that achieved a correct classification rate of 85% overall and 92% for detections larger than 5 pixels. Specific recommendations for underwater video acquisition to better facilitate automated processing are given. The recommendations will help energy developers put effective monitoring systems in place, and could lead to a standard approach that simplifies the monitoring effort and advances the scientific understanding of the ecological impacts of ocean and river energy devices.« less
Joint Multi-Leaf Segmentation, Alignment, and Tracking for Fluorescence Plant Videos.

PubMed

Yin, Xi; Liu, Xiaoming; Chen, Jin; Kramer, David M

2018-06-01

This paper proposes a novel framework for fluorescence plant video processing. The plant research community is interested in the leaf-level photosynthetic analysis within a plant. A prerequisite for such analysis is to segment all leaves, estimate their structures, and track them over time. We identify this as a joint multi-leaf segmentation, alignment, and tracking problem. First, leaf segmentation and alignment are applied on the last frame of a plant video to find a number of well-aligned leaf candidates. Second, leaf tracking is applied on the remaining frames with leaf candidate transformation from the previous frame. We form two optimization problems with shared terms in their objective functions for leaf alignment and tracking respectively. A quantitative evaluation framework is formulated to evaluate the performance of our algorithm with four metrics. Two models are learned to predict the alignment accuracy and detect tracking failure respectively in order to provide guidance for subsequent plant biology analysis. The limitation of our algorithm is also studied. Experimental results show the effectiveness, efficiency, and robustness of the proposed method.
Pilot study on real-time motion detection in UAS video data by human observer and image exploitation algorithm

NASA Astrophysics Data System (ADS)

Hild, Jutta; Krüger, Wolfgang; Brüstle, Stefan; Trantelle, Patrick; Unmüßig, Gabriel; Voit, Michael; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen

2017-05-01

Real-time motion video analysis is a challenging and exhausting task for the human observer, particularly in safety and security critical domains. Hence, customized video analysis systems providing functions for the analysis of subtasks like motion detection or target tracking are welcome. While such automated algorithms relieve the human operators from performing basic subtasks, they impose additional interaction duties on them. Prior work shows that, e.g., for interaction with target tracking algorithms, a gaze-enhanced user interface is beneficial. In this contribution, we present an investigation on interaction with an independent motion detection (IDM) algorithm. Besides identifying an appropriate interaction technique for the user interface - again, we compare gaze-based and traditional mouse-based interaction - we focus on the benefit an IDM algorithm might provide for an UAS video analyst. In a pilot study, we exposed ten subjects to the task of moving target detection in UAS video data twice, once performing with automatic support, once performing without it. We compare the two conditions considering performance in terms of effectiveness (correct target selections). Additionally, we report perceived workload (measured using the NASA-TLX questionnaire) and user satisfaction (measured using the ISO 9241-411 questionnaire). The results show that a combination of gaze input and automated IDM algorithm provides valuable support for the human observer, increasing the number of correct target selections up to 62% and reducing workload at the same time.
Collaborative real-time motion video analysis by human observer and image exploitation algorithms

NASA Astrophysics Data System (ADS)

Hild, Jutta; Krüger, Wolfgang; Brüstle, Stefan; Trantelle, Patrick; Unmüßig, Gabriel; Heinze, Norbert; Peinsipp-Byma, Elisabeth; Beyerer, Jürgen

2015-05-01

Motion video analysis is a challenging task, especially in real-time applications. In most safety and security critical applications, a human observer is an obligatory part of the overall analysis system. Over the last years, substantial progress has been made in the development of automated image exploitation algorithms. Hence, we investigate how the benefits of automated video analysis can be integrated suitably into the current video exploitation systems. In this paper, a system design is introduced which strives to combine both the qualities of the human observer's perception and the automated algorithms, thus aiming to improve the overall performance of a real-time video analysis system. The system design builds on prior work where we showed the benefits for the human observer by means of a user interface which utilizes the human visual focus of attention revealed by the eye gaze direction for interaction with the image exploitation system; eye tracker-based interaction allows much faster, more convenient, and equally precise moving target acquisition in video images than traditional computer mouse selection. The system design also builds on prior work we did on automated target detection, segmentation, and tracking algorithms. Beside the system design, a first pilot study is presented, where we investigated how the participants (all non-experts in video analysis) performed in initializing an object tracking subsystem by selecting a target for tracking. Preliminary results show that the gaze + key press technique is an effective, efficient, and easy to use interaction technique when performing selection operations on moving targets in videos in order to initialize an object tracking function.
Multimodal Sensor Fusion for Personnel Detection

DTIC Science & Technology

2011-07-01

video ). Efficacy of UGS systems is often limited by high false alarm rates because the onboard data processing algorithms may not be able to correctly...humans) and animals (e.g., donkeys , mules, and horses). The humans walked alone and in groups with and without backpacks; the animals were led by their
Video transmission on ATM networks. Ph.D. Thesis

NASA Technical Reports Server (NTRS)

Chen, Yun-Chung

1993-01-01

The broadband integrated services digital network (B-ISDN) is expected to provide high-speed and flexible multimedia applications. Multimedia includes data, graphics, image, voice, and video. Asynchronous transfer mode (ATM) is the adopted transport techniques for B-ISDN and has the potential for providing a more efficient and integrated environment for multimedia. It is believed that most broadband applications will make heavy use of visual information. The prospect of wide spread use of image and video communication has led to interest in coding algorithms for reducing bandwidth requirements and improving image quality. The major results of a study on the bridging of network transmission performance and video coding are: Using two representative video sequences, several video source models are developed. The fitness of these models are validated through the use of statistical tests and network queuing performance. A dual leaky bucket algorithm is proposed as an effective network policing function. The concept of the dual leaky bucket algorithm can be applied to a prioritized coding approach to achieve transmission efficiency. A mapping of the performance/control parameters at the network level into equivalent parameters at the video coding level is developed. Based on that, a complete set of principles for the design of video codecs for network transmission is proposed.
Multi-Core Programming Design Patterns: Stream Processing Algorithms for Dynamic Scene Perceptions

DTIC Science & Technology

2014-05-01

processor developed by IBM and other companies , incorpo- rates the verb—POWER5— processor as the Power Processor Element (PPE), one of the early general...deliver an power efficient single-precision peak performance of more than 256 GFlops. Substantially more raw power became available later, when nVIDIA ...algorithms, including IBM’s Cell/B.E., GPUs from NVidia and AMD and many-core CPUs from Intel.27 The vast growth of digital video content has been a
Multilevel analysis of sports video sequences

NASA Astrophysics Data System (ADS)

Han, Jungong; Farin, Dirk; de With, Peter H. N.

2006-01-01

We propose a fully automatic and flexible framework for analysis and summarization of tennis broadcast video sequences, using visual features and specific game-context knowledge. Our framework can analyze a tennis video sequence at three levels, which provides a broad range of different analysis results. The proposed framework includes novel pixel-level and object-level tennis video processing algorithms, such as a moving-player detection taking both the color and the court (playing-field) information into account, and a player-position tracking algorithm based on a 3-D camera model. Additionally, we employ scene-level models for detecting events, like service, base-line rally and net-approach, based on a number real-world visual features. The system can summarize three forms of information: (1) all court-view playing frames in a game, (2) the moving trajectory and real-speed of each player, as well as relative position between the player and the court, (3) the semantic event segments in a game. The proposed framework is flexible in choosing the level of analysis that is desired. It is effective because the framework makes use of several visual cues obtained from the real-world domain to model important events like service, thereby increasing the accuracy of the scene-level analysis. The paper presents attractive experimental results highlighting the system efficiency and analysis capabilities.
Precise determination of anthropometric dimensions by means of image processing methods for estimating human body segment parameter values.

PubMed

Baca, A

1996-04-01

A method has been developed for the precise determination of anthropometric dimensions from the video images of four different body configurations. High precision is achieved by incorporating techniques for finding the location of object boundaries with sub-pixel accuracy, the implementation of calibration algorithms, and by taking into account the varying distances of the body segments from the recording camera. The system allows automatic segment boundary identification from the video image, if the boundaries are marked on the subject by black ribbons. In connection with the mathematical finite-mass-element segment model of Hatze, body segment parameters (volumes, masses, the three principal moments of inertia, the three local coordinates of the segmental mass centers etc.) can be computed by using the anthropometric data determined videometrically as input data. Compared to other, recently published video-based systems for the estimation of the inertial properties of body segments, the present algorithms reduce errors originating from optical distortions, inaccurate edge-detection procedures, and user-specified upper and lower segment boundaries or threshold levels for the edge-detection. The video-based estimation of human body segment parameters is especially useful in situations where ease of application and rapid availability of comparatively precise parameter values are of importance.
Geopositioning with a quadcopter: Extracted feature locations and predicted accuracy without a priori sensor attitude information

NASA Astrophysics Data System (ADS)

Dolloff, John; Hottel, Bryant; Edwards, David; Theiss, Henry; Braun, Aaron

2017-05-01

This paper presents an overview of the Full Motion Video-Geopositioning Test Bed (FMV-GTB) developed to investigate algorithm performance and issues related to the registration of motion imagery and subsequent extraction of feature locations along with predicted accuracy. A case study is included corresponding to a video taken from a quadcopter. Registration of the corresponding video frames is performed without the benefit of a priori sensor attitude (pointing) information. In particular, tie points are automatically measured between adjacent frames using standard optical flow matching techniques from computer vision, an a priori estimate of sensor attitude is then computed based on supplied GPS sensor positions contained in the video metadata and a photogrammetric/search-based structure from motion algorithm, and then a Weighted Least Squares adjustment of all a priori metadata across the frames is performed. Extraction of absolute 3D feature locations, including their predicted accuracy based on the principles of rigorous error propagation, is then performed using a subset of the registered frames. Results are compared to known locations (check points) over a test site. Throughout this entire process, no external control information (e.g. surveyed points) is used other than for evaluation of solution errors and corresponding accuracy.
Video Shot Boundary Detection Using QR-Decomposition and Gaussian Transition Detection

NASA Astrophysics Data System (ADS)

Amiri, Ali; Fathy, Mahmood

2010-12-01

This article explores the problem of video shot boundary detection and examines a novel shot boundary detection algorithm by using QR-decomposition and modeling of gradual transitions by Gaussian functions. Specifically, the authors attend to the challenges of detecting gradual shots and extracting appropriate spatiotemporal features that affect the ability of algorithms to efficiently detect shot boundaries. The algorithm utilizes the properties of QR-decomposition and extracts a block-wise probability function that illustrates the probability of video frames to be in shot transitions. The probability function has abrupt changes in hard cut transitions, and semi-Gaussian behavior in gradual transitions. The algorithm detects these transitions by analyzing the probability function. Finally, we will report the results of the experiments using large-scale test sets provided by the TRECVID 2006, which has assessments for hard cut and gradual shot boundary detection. These results confirm the high performance of the proposed algorithm.
Framework for performance evaluation of face, text, and vehicle detection and tracking in video: data, metrics, and protocol.

PubMed

Kasturi, Rangachar; Goldgof, Dmitry; Soundararajan, Padmanabhan; Manohar, Vasant; Garofolo, John; Bowers, Rachel; Boonstra, Matthew; Korzhova, Valentina; Zhang, Jing

2009-02-01

Common benchmark data sets, standardized performance metrics, and baseline algorithms have demonstrated considerable impact on research and development in a variety of application domains. These resources provide both consumers and developers of technology with a common framework to objectively compare the performance of different algorithms and algorithmic improvements. In this paper, we present such a framework for evaluating object detection and tracking in video: specifically for face, text, and vehicle objects. This framework includes the source video data, ground-truth annotations (along with guidelines for annotation), performance metrics, evaluation protocols, and tools including scoring software and baseline algorithms. For each detection and tracking task and supported domain, we developed a 50-clip training set and a 50-clip test set. Each data clip is approximately 2.5 minutes long and has been completely spatially/temporally annotated at the I-frame level. Each task/domain, therefore, has an associated annotated corpus of approximately 450,000 frames. The scope of such annotation is unprecedented and was designed to begin to support the necessary quantities of data for robust machine learning approaches, as well as a statistically significant comparison of the performance of algorithms. The goal of this work was to systematically address the challenges of object detection and tracking through a common evaluation framework that permits a meaningful objective comparison of techniques, provides the research community with sufficient data for the exploration of automatic modeling techniques, encourages the incorporation of objective evaluation into the development process, and contributes useful lasting resources of a scale and magnitude that will prove to be extremely useful to the computer vision research community for years to come.
Joint Transform Correlation for face tracking: elderly fall detection application

NASA Astrophysics Data System (ADS)

Katz, Philippe; Aron, Michael; Alfalou, Ayman

2013-03-01

In this paper, an iterative tracking algorithm based on a non-linear JTC (Joint Transform Correlator) architecture and enhanced by a digital image processing method is proposed and validated. This algorithm is based on the computation of a correlation plane where the reference image is updated at each frame. For that purpose, we use the JTC technique in real time to track a patient (target image) in a room fitted with a video camera. The correlation plane is used to localize the target image in the current video frame (frame i). Then, the reference image to be exploited in the next frame (frame i+1) is updated according to the previous one (frame i). In an effort to validate our algorithm, our work is divided into two parts: (i) a large study based on different sequences with several situations and different JTC parameters is achieved in order to quantify their effects on the tracking performances (decimation, non-linearity coefficient, size of the correlation plane, size of the region of interest...). (ii) the tracking algorithm is integrated into an application of elderly fall detection. The first reference image is a face detected by means of Haar descriptors, and then localized into the new video image thanks to our tracking method. In order to avoid a bad update of the reference frame, a method based on a comparison of image intensity histograms is proposed and integrated in our algorithm. This step ensures a robust tracking of the reference frame. This article focuses on face tracking step optimisation and evalutation. A supplementary step of fall detection, based on vertical acceleration and position, will be added and studied in further work.
Video segmentation for post-production

NASA Astrophysics Data System (ADS)

Wills, Ciaran

2001-12-01

Specialist post-production is an industry that has much to gain from the application of content-based video analysis techniques. However the types of material handled in specialist post-production, such as television commercials, pop music videos and special effects are quite different in nature from the typical broadcast material which many video analysis techniques are designed to work with; shots are short and highly dynamic, and the transitions are often novel or ambiguous. We address the problem of scene change detection and develop a new algorithm which tackles some of the common aspects of post-production material that cause difficulties for past algorithms, such as illumination changes and jump cuts. Operating in the compressed domain on Motion JPEG compressed video, our algorithm detects cuts and fades by analyzing each JPEG macroblock in the context of its temporal and spatial neighbors. Analyzing the DCT coefficients directly we can extract the mean color of a block and an approximate detail level. We can also perform an approximated cross-correlation between two blocks. The algorithm is part of a set of tools being developed to work with an automated asset management system designed specifically for use in post-production facilities.
Real-time fluorescence target/background (T/B) ratio calculation in multimodal endoscopy for detecting GI tract cancer

NASA Astrophysics Data System (ADS)

Jiang, Yang; Gong, Yuanzheng; Wang, Thomas D.; Seibel, Eric J.

2017-02-01

Multimodal endoscopy, with fluorescence-labeled probes binding to overexpressed molecular targets, is a promising technology to visualize early-stage cancer. T/B ratio is the quantitative analysis used to correlate fluorescence regions to cancer. Currently, T/B ratio calculation is post-processing and does not provide real-time feedback to the endoscopist. To achieve real-time computer assisted diagnosis (CAD), we establish image processing protocols for calculating T/B ratio and locating high-risk fluorescence regions for guiding biopsy and therapy in Barrett's esophagus (BE) patients. Methods: Chan-Vese algorithm, an active contour model, is used to segment high-risk regions in fluorescence videos. A semi-implicit gradient descent method was applied to minimize the energy function of this algorithm and evolve the segmentation. The surrounding background was then identified using morphology operation. The average T/B ratio was computed and regions of interest were highlighted based on user-selected thresholding. Evaluation was conducted on 50 fluorescence videos acquired from clinical video recordings using a custom multimodal endoscope. Results: With a processing speed of 2 fps on a laptop computer, we obtained accurate segmentation of high-risk regions examined by experts. For each case, the clinical user could optimize target boundary by changing the penalty on area inside the contour. Conclusion: Automatic and real-time procedure of calculating T/B ratio and identifying high-risk regions of early esophageal cancer was developed. Future work will increase processing speed to <5 fps, refine the clinical interface, and apply to additional GI cancers and fluorescence peptides.

High-grade video compression of echocardiographic studies: a multicenter validation study of selected motion pictures expert groups (MPEG)-4 algorithms.

PubMed

Barbier, Paolo; Alimento, Marina; Berna, Giovanni; Celeste, Fabrizio; Gentile, Francesco; Mantero, Antonio; Montericcio, Vincenzo; Muratori, Manuela

2007-05-01

Large files produced by standard compression algorithms slow down spread of digital and tele-echocardiography. We validated echocardiographic video high-grade compression with the new Motion Pictures Expert Groups (MPEG)-4 algorithms with a multicenter study. Seven expert cardiologists blindly scored (5-point scale) 165 uncompressed and compressed 2-dimensional and color Doppler video clips, based on combined diagnostic content and image quality (uncompressed files as references). One digital video and 3 MPEG-4 algorithms (WM9, MV2, and DivX) were used, the latter at 3 compression levels (0%, 35%, and 60%). Compressed file sizes decreased from 12 to 83 MB to 0.03 to 2.3 MB (1:1051-1:26 reduction ratios). Mean SD of differences was 0.81 for intraobserver variability (uncompressed and digital video files). Compared with uncompressed files, only the DivX mean score at 35% (P = .04) and 60% (P = .001) compression was significantly reduced. At subcategory analysis, these differences were still significant for gray-scale and fundamental imaging but not for color or second harmonic tissue imaging. Original image quality, session sequence, compression grade, and bitrate were all independent determinants of mean score. Our study supports use of MPEG-4 algorithms to greatly reduce echocardiographic file sizes, thus facilitating archiving and transmission. Quality evaluation studies should account for the many independent variables that affect image quality grading.
Eye blink detection for different driver states in conditionally automated driving and manual driving using EOG and a driver camera.

PubMed

Schmidt, Jürgen; Laarousi, Rihab; Stolzmann, Wolfgang; Karrer-Gauß, Katja

2018-06-01

In this article, we examine the performance of different eye blink detection algorithms under various constraints. The goal of the present study was to evaluate the performance of an electrooculogram- and camera-based blink detection process in both manually and conditionally automated driving phases. A further comparison between alert and drowsy drivers was performed in order to evaluate the impact of drowsiness on the performance of blink detection algorithms in both driving modes. Data snippets from 14 monotonous manually driven sessions (mean 2 h 46 min) and 16 monotonous conditionally automated driven sessions (mean 2 h 45 min) were used. In addition to comparing two data-sampling frequencies for the electrooculogram measures (50 vs. 25 Hz) and four different signal-processing algorithms for the camera videos, we compared the blink detection performance of 24 reference groups. The analysis of the videos was based on very detailed definitions of eyelid closure events. The correct detection rates for the alert and manual driving phases (maximum 94%) decreased significantly in the drowsy (minus 2% or more) and conditionally automated (minus 9% or more) phases. Blinking behavior is therefore significantly impacted by drowsiness as well as by automated driving, resulting in less accurate blink detection.
Blinking supervision in a working environment

NASA Astrophysics Data System (ADS)

Morcego, Bernardo; Argilés, Marc; Cabrerizo, Marc; Cardona, Genís; Pérez, Ramon; Pérez-Cabré, Elisabet; Gispets, Joan

2016-02-01

The health of the ocular surface requires blinks of the eye to be frequent in order to provide moisture and to renew the tear film. However, blinking frequency has been shown to decrease in certain conditions such as when subjects are conducting tasks with high cognitive and visual demands. These conditions are becoming more common as people work or spend their leisure time in front of video display terminals. Supervision of blinking frequency in such environments is possible, thanks to the availability of computer-integrated cameras. Therefore, the aim of the present study is to develop an algorithm for the detection of eye blinks and to test it, in a number of videos captured, while subjects are conducting a variety of tasks in front of the computer. The sensitivity of the algorithm for blink detection was found to be of 87.54% (range 30% to 100%), with a mean false-positive rate of 0.19% (range 0% to 1.7%), depending on the illumination conditions during which the image was captured and other computer-user spatial configurations. The current automatic process is based on a partly modified pre-existing eye detection and image processing algorithms and consists of four stages that are aimed at eye detection, eye tracking, iris detection and segmentation, and iris height/width ratio assessment.
Image enhancement software for underwater recovery operations: User's manual

NASA Astrophysics Data System (ADS)

Partridge, William J.; Therrien, Charles W.

1989-06-01

This report describes software for performing image enhancement on live or recorded video images. The software was developed for operational use during underwater recovery operations at the Naval Undersea Warfare Engineering Station. The image processing is performed on an IBM-PC/AT compatible computer equipped with hardware to digitize and display video images. The software provides the capability to provide contrast enhancement and other similar functions in real time through hardware lookup tables, to automatically perform histogram equalization, to capture one or more frames and average them or apply one of several different processing algorithms to a captured frame. The report is in the form of a user manual for the software and includes guided tutorial and reference sections. A Digital Image Processing Primer in the appendix serves to explain the principle concepts that are used in the image processing.
The research and application of visual saliency and adaptive support vector machine in target tracking field.

PubMed

Chen, Yuantao; Xu, Weihong; Kuang, Fangjun; Gao, Shangbing

2013-01-01

The efficient target tracking algorithm researches have become current research focus of intelligent robots. The main problems of target tracking process in mobile robot face environmental uncertainty. They are very difficult to estimate the target states, illumination change, target shape changes, complex backgrounds, and other factors and all affect the occlusion in tracking robustness. To further improve the target tracking's accuracy and reliability, we present a novel target tracking algorithm to use visual saliency and adaptive support vector machine (ASVM). Furthermore, the paper's algorithm has been based on the mixture saliency of image features. These features include color, brightness, and sport feature. The execution process used visual saliency features and those common characteristics have been expressed as the target's saliency. Numerous experiments demonstrate the effectiveness and timeliness of the proposed target tracking algorithm in video sequences where the target objects undergo large changes in pose, scale, and illumination.
Flow measurements in sewers based on image analysis: automatic flow velocity algorithm.

PubMed

Jeanbourquin, D; Sage, D; Nguyen, L; Schaeli, B; Kayal, S; Barry, D A; Rossi, L

2011-01-01

Discharges of combined sewer overflows (CSOs) and stormwater are recognized as an important source of environmental contamination. However, the harsh sewer environment and particular hydraulic conditions during rain events reduce the reliability of traditional flow measurement probes. An in situ system for sewer water flow monitoring based on video images was evaluated. Algorithms to determine water velocities were developed based on image-processing techniques. The image-based water velocity algorithm identifies surface features and measures their positions with respect to real world coordinates. A web-based user interface and a three-tier system architecture enable remote configuration of the cameras and the image-processing algorithms in order to calculate automatically flow velocity on-line. Results of investigations conducted in a CSO are presented. The system was found to measure reliably water velocities, thereby providing the means to understand particular hydraulic behaviors.
Smartphone based automatic organ validation in ultrasound video.

PubMed

Vaish, Pallavi; Bharath, R; Rajalakshmi, P

2017-07-01

Telesonography involves transmission of ultrasound video from remote areas to the doctors for getting diagnosis. Due to the lack of trained sonographers in remote areas, the ultrasound videos scanned by these untrained persons do not contain the proper information that is required by a physician. As compared to standard methods for video transmission, mHealth driven systems need to be developed for transmitting valid medical videos. To overcome this problem, we are proposing an organ validation algorithm to evaluate the ultrasound video based on the content present. This will guide the semi skilled person to acquire the representative data from patient. Advancement in smartphone technology allows us to perform high medical image processing on smartphone. In this paper we have developed an Application (APP) for a smartphone which can automatically detect the valid frames (which consist of clear organ visibility) in an ultrasound video and ignores the invalid frames (which consist of no-organ visibility), and produces a compressed sized video. This is done by extracting the GIST features from the Region of Interest (ROI) of the frame and then classifying the frame using SVM classifier with quadratic kernel. The developed application resulted with the accuracy of 94.93% in classifying valid and invalid images.
ACE: Automatic Centroid Extractor for real time target tracking

NASA Technical Reports Server (NTRS)

Cameron, K.; Whitaker, S.; Canaris, J.

1990-01-01

A high performance video image processor has been implemented which is capable of grouping contiguous pixels from a raster scan image into groups and then calculating centroid information for each object in a frame. The algorithm employed to group pixels is very efficient and is guaranteed to work properly for all convex shapes as well as most concave shapes. Processing speeds are adequate for real time processing of video images having a pixel rate of up to 20 million pixels per second. Pixels may be up to 8 bits wide. The processor is designed to interface directly to a transputer serial link communications channel with no additional hardware. The full custom VLSI processor was implemented in a 1.6 mu m CMOS process and measures 7200 mu m on a side.
Methodology for stereoscopic motion-picture quality assessment

NASA Astrophysics Data System (ADS)

Voronov, Alexander; Vatolin, Dmitriy; Sumin, Denis; Napadovsky, Vyacheslav; Borisov, Alexey

2013-03-01

Creating and processing stereoscopic video imposes additional quality requirements related to view synchronization. In this work we propose a set of algorithms for detecting typical stereoscopic-video problems, which appear owing to imprecise setup of capture equipment or incorrect postprocessing. We developed a methodology for analyzing the quality of S3D motion pictures and for revealing their most problematic scenes. We then processed 10 modern stereo films, including Avatar, Resident Evil: Afterlife and Hugo, and analyzed changes in S3D-film quality over the years. This work presents real examples of common artifacts (color and sharpness mismatch, vertical disparity and excessive horizontal disparity) in the motion pictures we processed, as well as possible solutions for each problem. Our results enable improved quality assessment during the filming and postproduction stages.
Display device-adapted video quality-of-experience assessment

NASA Astrophysics Data System (ADS)

Rehman, Abdul; Zeng, Kai; Wang, Zhou

2015-03-01

Today's viewers consume video content from a variety of connected devices, including smart phones, tablets, notebooks, TVs, and PCs. This imposes significant challenges for managing video traffic efficiently to ensure an acceptable quality-of-experience (QoE) for the end users as the perceptual quality of video content strongly depends on the properties of the display device and the viewing conditions. State-of-the-art full-reference objective video quality assessment algorithms do not take into account the combined impact of display device properties, viewing conditions, and video resolution while performing video quality assessment. We performed a subjective study in order to understand the impact of aforementioned factors on perceptual video QoE. We also propose a full reference video QoE measure, named SSIMplus, that provides real-time prediction of the perceptual quality of a video based on human visual system behaviors, video content characteristics (such as spatial and temporal complexity, and video resolution), display device properties (such as screen size, resolution, and brightness), and viewing conditions (such as viewing distance and angle). Experimental results have shown that the proposed algorithm outperforms state-of-the-art video quality measures in terms of accuracy and speed.
Artificial Neural Network applied to lightning flashes

NASA Astrophysics Data System (ADS)

Gin, R. B.; Guedes, D.; Bianchi, R.

2013-05-01

The development of video cameras enabled cientists to study lightning discharges comportment with more precision. The main goal of this project is to create a system able to detect images of lightning discharges stored in videos and classify them using an Artificial Neural Network (ANN)using C Language and OpenCV libraries. The developed system, can be split in two different modules: detection module and classification module. The detection module uses OpenCV`s computer vision libraries and image processing techniques to detect if there are significant differences between frames in a sequence, indicating that something, still not classified, occurred. Whenever there is a significant difference between two consecutive frames, two main algorithms are used to analyze the frame image: brightness and shape algorithms. These algorithms detect both shape and brightness of the event, removing irrelevant events like birds, as well as detecting the relevant events exact position, allowing the system to track it over time. The classification module uses a neural network to classify the relevant events as horizontal or vertical lightning, save the event`s images and calculates his number of discharges. The Neural Network was implemented using the backpropagation algorithm, and was trained with 42 training images , containing 57 lightning events (one image can have more than one lightning). TheANN was tested with one to five hidden layers, with up to 50 neurons each. The best configuration achieved a success rate of 95%, with one layer containing 20 neurons (33 test images with 42 events were used in this phase). This configuration was implemented in the developed system to analyze 20 video files, containing 63 lightning discharges previously manually detected. Results showed that all the lightning discharges were detected, many irrelevant events were unconsidered, and the event's number of discharges was correctly computed. The neural network used in this project achieved a success rate of 90%. The videos used in this experiment were acquired by seven video cameras installed in São Bernardo do Campo, Brazil, that continuously recorded lightning events during the summer. The cameras were disposed in a 360 loop, recording all data at a time resolution of 33ms. During this period, several convective storms were recorded.
A Novel Real-Time Reference Key Frame Scan Matching Method.

PubMed

Mohamed, Haytham; Moussa, Adel; Elhabiby, Mohamed; El-Sheimy, Naser; Sesay, Abu

2017-05-07

Unmanned aerial vehicles represent an effective technology for indoor search and rescue operations. Typically, most indoor missions' environments would be unknown, unstructured, and/or dynamic. Navigation of UAVs in such environments is addressed by simultaneous localization and mapping approach using either local or global approaches. Both approaches suffer from accumulated errors and high processing time due to the iterative nature of the scan matching method. Moreover, point-to-point scan matching is prone to outlier association processes. This paper proposes a low-cost novel method for 2D real-time scan matching based on a reference key frame (RKF). RKF is a hybrid scan matching technique comprised of feature-to-feature and point-to-point approaches. This algorithm aims at mitigating errors accumulation using the key frame technique, which is inspired from video streaming broadcast process. The algorithm depends on the iterative closest point algorithm during the lack of linear features which is typically exhibited in unstructured environments. The algorithm switches back to the RKF once linear features are detected. To validate and evaluate the algorithm, the mapping performance and time consumption are compared with various algorithms in static and dynamic environments. The performance of the algorithm exhibits promising navigational, mapping results and very short computational time, that indicates the potential use of the new algorithm with real-time systems.
Virtual viewpoint synthesis in multi-view video system

NASA Astrophysics Data System (ADS)

Li, Fang; Yang, Shiqiang

2005-07-01

In this paper, we present a virtual viewpoint video synthesis algorithm to satisfy the following three aims: low computing consuming; real time interpolation and acceptable video quality. In contrast with previous technologies, this method obtain incompletely 3D structure using neighbor video sources instead of getting total 3D information with all video sources, so that the computation is reduced greatly. So we demonstrate our interactive multi-view video synthesis algorithm in a personal computer. Furthermore, adopting the method of choosing feature points to build the correspondence between the frames captured by neighbor cameras, we need not require camera calibration. Finally, our method can be used when the angle between neighbor cameras is 25-30 degrees that it is much larger than common computer vision experiments. In this way, our method can be applied into many applications such as sports live, video conference, etc.
Simulator sickness analysis of 3D video viewing on passive 3D TV

NASA Astrophysics Data System (ADS)

Brunnström, K.; Wang, K.; Andrén, B.

2013-03-01

The MPEG 3DV project is working on the next generation video encoding standard and in this process a call for proposal of encoding algorithms was issued. To evaluate these algorithm a large scale subjective test was performed involving Laboratories all over the world. For the participating Labs it was optional to administer a slightly modified Simulator Sickness Questionnaire (SSQ) from Kennedy et al (1993) before and after the test. Here we report the results from one Lab (Acreo) located in Sweden. The videos were shown on a 46 inch film pattern retarder 3D TV, where the viewers were using polarized passive eye-glasses to view the stereoscopic 3D video content. There were 68 viewers participating in this investigation in ages ranges from 16 to 72, with one third females. The questionnaire was filled in before and after the test, with a viewing time ranging between 30 min to about one and half hour, which is comparable to a feature length movie. The SSQ consists of 16 different symptoms that have been identified as important for indicating simulator sickness. When analyzing the individual symptoms it was found that Fatigue, Eye-strain, Difficulty Focusing and Difficulty Concentrating were significantly worse after than before. SSQ was also analyzed according to the model suggested by Kennedy et al (1993). All in all this investigation shows a statistically significant increase in symptoms after viewing 3D video especially related to visual or Oculomotor system.
VASIR: An Open-Source Research Platform for Advanced Iris Recognition Technologies.

PubMed

Lee, Yooyoung; Micheals, Ross J; Filliben, James J; Phillips, P Jonathon

2013-01-01

The performance of iris recognition systems is frequently affected by input image quality, which in turn is vulnerable to less-than-optimal conditions due to illuminations, environments, and subject characteristics (e.g., distance, movement, face/body visibility, blinking, etc.). VASIR (Video-based Automatic System for Iris Recognition) is a state-of-the-art NIST-developed iris recognition software platform designed to systematically address these vulnerabilities. We developed VASIR as a research tool that will not only provide a reference (to assess the relative performance of alternative algorithms) for the biometrics community, but will also advance (via this new emerging iris recognition paradigm) NIST's measurement mission. VASIR is designed to accommodate both ideal (e.g., classical still images) and less-than-ideal images (e.g., face-visible videos). VASIR has three primary modules: 1) Image Acquisition 2) Video Processing, and 3) Iris Recognition. Each module consists of several sub-components that have been optimized by use of rigorous orthogonal experiment design and analysis techniques. We evaluated VASIR performance using the MBGC (Multiple Biometric Grand Challenge) NIR (Near-Infrared) face-visible video dataset and the ICE (Iris Challenge Evaluation) 2005 still-based dataset. The results showed that even though VASIR was primarily developed and optimized for the less-constrained video case, it still achieved high verification rates for the traditional still-image case. For this reason, VASIR may be used as an effective baseline for the biometrics community to evaluate their algorithm performance, and thus serves as a valuable research platform.
VASIR: An Open-Source Research Platform for Advanced Iris Recognition Technologies

PubMed Central

Lee, Yooyoung; Micheals, Ross J; Filliben, James J; Phillips, P Jonathon

2013-01-01

The performance of iris recognition systems is frequently affected by input image quality, which in turn is vulnerable to less-than-optimal conditions due to illuminations, environments, and subject characteristics (e.g., distance, movement, face/body visibility, blinking, etc.). VASIR (Video-based Automatic System for Iris Recognition) is a state-of-the-art NIST-developed iris recognition software platform designed to systematically address these vulnerabilities. We developed VASIR as a research tool that will not only provide a reference (to assess the relative performance of alternative algorithms) for the biometrics community, but will also advance (via this new emerging iris recognition paradigm) NIST’s measurement mission. VASIR is designed to accommodate both ideal (e.g., classical still images) and less-than-ideal images (e.g., face-visible videos). VASIR has three primary modules: 1) Image Acquisition 2) Video Processing, and 3) Iris Recognition. Each module consists of several sub-components that have been optimized by use of rigorous orthogonal experiment design and analysis techniques. We evaluated VASIR performance using the MBGC (Multiple Biometric Grand Challenge) NIR (Near-Infrared) face-visible video dataset and the ICE (Iris Challenge Evaluation) 2005 still-based dataset. The results showed that even though VASIR was primarily developed and optimized for the less-constrained video case, it still achieved high verification rates for the traditional still-image case. For this reason, VASIR may be used as an effective baseline for the biometrics community to evaluate their algorithm performance, and thus serves as a valuable research platform. PMID:26401431
An approach to integrate the human vision psychology and perception knowledge into image enhancement

NASA Astrophysics Data System (ADS)

Wang, Hui; Huang, Xifeng; Ping, Jiang

2009-07-01

Image enhancement is very important image preprocessing technology especially when the image is captured in the poor imaging condition or dealing with the high bits image. The benefactor of image enhancement either may be a human observer or a computer vision process performing some kind of higher-level image analysis, such as target detection or scene understanding. One of the main objects of the image enhancement is getting a high dynamic range image and a high contrast degree image for human perception or interpretation. So, it is very necessary to integrate either empirical or statistical human vision psychology and perception knowledge into image enhancement. The human vision psychology and perception claims that humans' perception and response to the intensity fluctuation δu of visual signals are weighted by the background stimulus u, instead of being plainly uniform. There are three main laws: Weber's law, Weber- Fechner's law and Stevens's Law that describe this phenomenon in the psychology and psychophysics. This paper will integrate these three laws of the human vision psychology and perception into a very popular image enhancement algorithm named Adaptive Plateau Equalization (APE). The experiments were done on the high bits star image captured in night scene and the infrared-red image both the static image and the video stream. For the jitter problem in the video stream, this algorithm reduces this problem using the difference between the current frame's plateau value and the previous frame's plateau value to correct the current frame's plateau value. Considering the random noise impacts, the pixel value mapping process is not only depending on the current pixel but the pixels in the window surround the current pixel. The window size is usually 3×3. The process results of this improved algorithms is evaluated by the entropy analysis and visual perception analysis. The experiments' result showed the improved APE algorithms improved the quality of the image, the target and the surrounding assistant targets could be identified easily, and the noise was not amplified much. For the low quality image, these improved algorithms augment the information entropy and improve the image and the video stream aesthetic quality, while for the high quality image they will not debase the quality of the image.
High Performance Compression of Science Data

NASA Technical Reports Server (NTRS)

Storer, James A.; Carpentieri, Bruno; Cohn, Martin

1994-01-01

Two papers make up the body of this report. One presents a single-pass adaptive vector quantization algorithm that learns a codebook of variable size and shape entries; the authors present experiments on a set of test images showing that with no training or prior knowledge of the data, for a given fidelity, the compression achieved typically equals or exceeds that of the JPEG standard. The second paper addresses motion compensation, one of the most effective techniques used in interframe data compression. A parallel block-matching algorithm for estimating interframe displacement of blocks with minimum error is presented. The algorithm is designed for a simple parallel architecture to process video in real time.
SWCD: a sliding window and self-regulated learning-based background updating method for change detection in videos

NASA Astrophysics Data System (ADS)

Işık, Şahin; Özkan, Kemal; Günal, Serkan; Gerek, Ömer Nezih

2018-03-01

Change detection with background subtraction process remains to be an unresolved issue and attracts research interest due to challenges encountered on static and dynamic scenes. The key challenge is about how to update dynamically changing backgrounds from frames with an adaptive and self-regulated feedback mechanism. In order to achieve this, we present an effective change detection algorithm for pixelwise changes. A sliding window approach combined with dynamic control of update parameters is introduced for updating background frames, which we called sliding window-based change detection. Comprehensive experiments on related test videos show that the integrated algorithm yields good objective and subjective performance by overcoming illumination variations, camera jitters, and intermittent object motions. It is argued that the obtained method makes a fair alternative in most types of foreground extraction scenarios; unlike case-specific methods, which normally fail for their nonconsidered scenarios.
[Investigation on remote measurement of air pollution by a method of infrared passive scanning imaging].

PubMed

Jiao, Yang; Xu, Liang; Gao, Min-Guang; Feng, Ming-Chun; Jin, Ling; Tong, Jing-Jing; Li, Sheng

2012-07-01

Passive remote sensing by Fourier-transform infrared (FTIR) spectrometry allows detection of air pollution. However, for the localization of a leak and a complete assessment of the situation in the case of the release of a hazardous cloud, information about the position and the distribution of a cloud is essential. Therefore, an imaging passive remote sensing system comprising an interferometer, a data acquisition and processing software, scan system, a video system, and a personal computer has been developed. The remote sensing of SF6 was done. The column densities of all directions in which a target compound has been identified may be retrieved by a nonlinear least squares fitting algorithm and algorithm of radiation transfer, and a false color image is displayed. The results were visualized by a video image, overlaid by false color concentration distribution image. The system has a high selectivity, and allows visualization and quantification of pollutant clouds.

The video watermarking container: efficient real-time transaction watermarking

NASA Astrophysics Data System (ADS)

Wolf, Patrick; Hauer, Enrico; Steinebach, Martin

2008-02-01

When transaction watermarking is used to secure sales in online shops by embedding transaction specific watermarks, the major challenge is embedding efficiency: Maximum speed by minimal workload. This is true for all types of media. Video transaction watermarking presents a double challenge. Video files not only are larger than for example music files of the same playback time. In addition, video watermarking algorithms have a higher complexity than algorithms for other types of media. Therefore online shops that want to protect their videos by transaction watermarking are faced with the problem that their servers need to work harder and longer for every sold medium in comparison to audio sales. In the past, many algorithms responded to this challenge by reducing their complexity. But this usually results in a loss of either robustness or transparency. This paper presents a different approach. The container technology separates watermark embedding into two stages: A preparation stage and the finalization stage. In the preparation stage, the video is divided into embedding segments. For each segment one copy marked with "0" and anther one marked with "1" is created. This stage is computationally expensive but only needs to be done once. In the finalization stage, the watermarked video is assembled from the embedding segments according to the watermark message. This stage is very fast and involves no complex computations. It thus allows efficient creation of individually watermarked video files.
An FPGA-Based People Detection System

NASA Astrophysics Data System (ADS)

Nair, Vinod; Laprise, Pierre-Olivier; Clark, James J.

2005-12-01

This paper presents an FPGA-based system for detecting people from video. The system is designed to use JPEG-compressed frames from a network camera. Unlike previous approaches that use techniques such as background subtraction and motion detection, we use a machine-learning-based approach to train an accurate detector. We address the hardware design challenges involved in implementing such a detector, along with JPEG decompression, on an FPGA. We also present an algorithm that efficiently combines JPEG decompression with the detection process. This algorithm carries out the inverse DCT step of JPEG decompression only partially. Therefore, it is computationally more efficient and simpler to implement, and it takes up less space on the chip than the full inverse DCT algorithm. The system is demonstrated on an automated video surveillance application and the performance of both hardware and software implementations is analyzed. The results show that the system can detect people accurately at a rate of about[InlineEquation not available: see fulltext.] frames per second on a Virtex-II 2V1000 using a MicroBlaze processor running at[InlineEquation not available: see fulltext.], communicating with dedicated hardware over FSL links.
Lip reading using neural networks

NASA Astrophysics Data System (ADS)

Kalbande, Dhananjay; Mishra, Akassh A.; Patil, Sanjivani; Nirgudkar, Sneha; Patel, Prashant

2011-10-01

Computerized lip reading, or speech reading, is concerned with the difficult task of converting a video signal of a speaking person to written text. It has several applications like teaching deaf and dumb to speak and communicate effectively with the other people, its crime fighting potential and invariance to acoustic environment. We convert the video of the subject speaking vowels into images and then images are further selected manually for processing. However, several factors like fast speech, bad pronunciation, and poor illumination, movement of face, moustaches and beards make lip reading difficult. Contour tracking methods and Template matching are used for the extraction of lips from the face. K Nearest Neighbor algorithm is then used to classify the 'speaking' images and the 'silent' images. The sequence of images is then transformed into segments of utterances. Feature vector is calculated on each frame for all the segments and is stored in the database with properly labeled class. Character recognition is performed using modified KNN algorithm which assigns more weight to nearer neighbors. This paper reports the recognition of vowels using KNN algorithms
A Completely Blind Video Integrity Oracle.

PubMed

Mittal, Anish; Saad, Michele A; Bovik, Alan C

2016-01-01

Considerable progress has been made toward developing still picture perceptual quality analyzers that do not require any reference picture and that are not trained on human opinion scores of distorted images. However, there do not yet exist any such completely blind video quality assessment (VQA) models. Here, we attempt to bridge this gap by developing a new VQA model called the video intrinsic integrity and distortion evaluation oracle (VIIDEO). The new model does not require the use of any additional information other than the video being quality evaluated. VIIDEO embodies models of intrinsic statistical regularities that are observed in natural vidoes, which are used to quantify disturbances introduced due to distortions. An algorithm derived from the VIIDEO model is thereby able to predict the quality of distorted videos without any external knowledge about the pristine source, anticipated distortions, or human judgments of video quality. Even with such a paucity of information, we are able to show that the VIIDEO algorithm performs much better than the legacy full reference quality measure MSE on the LIVE VQA database and delivers performance comparable with a leading human judgment trained blind VQA model. We believe that the VIIDEO algorithm is a significant step toward making real-time monitoring of completely blind video quality possible.
The infrared video image pseudocolor processing system

NASA Astrophysics Data System (ADS)

Zhu, Yong; Zhang, JiangLing

2003-11-01

The infrared video image pseudo-color processing system, emphasizing on the algorithm and its implementation for measured object"s 2D temperature distribution using pseudo-color technology, is introduced in the paper. The data of measured object"s thermal image is the objective presentation of its surface temperature distribution, but the color has a close relationship with people"s subjective cognition. The so-called pseudo-color technology cross the bridge between subjectivity and objectivity, and represents the measured object"s temperature distribution in reason and at first hand. The algorithm of pseudo-color is based on the distance of IHS space. Thereby the definition of pseudo-color visual resolution is put forward. Both the software (which realize the map from the sample data to the color space) and the hardware (which carry out the conversion from the color space to palette by HDL) co-operate. Therefore the two levels map which is logic map and physical map respectively is presented. The system has been used abroad in failure diagnose of electric power devices, fire protection for lifesaving and even SARS detection in CHINA lately.
A parallel spatiotemporal saliency and discriminative online learning method for visual target tracking in aerial videos.

PubMed

Aghamohammadi, Amirhossein; Ang, Mei Choo; A Sundararajan, Elankovan; Weng, Ng Kok; Mogharrebi, Marzieh; Banihashem, Seyed Yashar

2018-01-01

Visual tracking in aerial videos is a challenging task in computer vision and remote sensing technologies due to appearance variation difficulties. Appearance variations are caused by camera and target motion, low resolution noisy images, scale changes, and pose variations. Various approaches have been proposed to deal with appearance variation difficulties in aerial videos, and amongst these methods, the spatiotemporal saliency detection approach reported promising results in the context of moving target detection. However, it is not accurate for moving target detection when visual tracking is performed under appearance variations. In this study, a visual tracking method is proposed based on spatiotemporal saliency and discriminative online learning methods to deal with appearance variations difficulties. Temporal saliency is used to represent moving target regions, and it was extracted based on the frame difference with Sauvola local adaptive thresholding algorithms. The spatial saliency is used to represent the target appearance details in candidate moving regions. SLIC superpixel segmentation, color, and moment features can be used to compute feature uniqueness and spatial compactness of saliency measurements to detect spatial saliency. It is a time consuming process, which prompted the development of a parallel algorithm to optimize and distribute the saliency detection processes that are loaded into the multi-processors. Spatiotemporal saliency is then obtained by combining the temporal and spatial saliencies to represent moving targets. Finally, a discriminative online learning algorithm was applied to generate a sample model based on spatiotemporal saliency. This sample model is then incrementally updated to detect the target in appearance variation conditions. Experiments conducted on the VIVID dataset demonstrated that the proposed visual tracking method is effective and is computationally efficient compared to state-of-the-art methods.
A parallel spatiotemporal saliency and discriminative online learning method for visual target tracking in aerial videos

PubMed Central

2018-01-01

Visual tracking in aerial videos is a challenging task in computer vision and remote sensing technologies due to appearance variation difficulties. Appearance variations are caused by camera and target motion, low resolution noisy images, scale changes, and pose variations. Various approaches have been proposed to deal with appearance variation difficulties in aerial videos, and amongst these methods, the spatiotemporal saliency detection approach reported promising results in the context of moving target detection. However, it is not accurate for moving target detection when visual tracking is performed under appearance variations. In this study, a visual tracking method is proposed based on spatiotemporal saliency and discriminative online learning methods to deal with appearance variations difficulties. Temporal saliency is used to represent moving target regions, and it was extracted based on the frame difference with Sauvola local adaptive thresholding algorithms. The spatial saliency is used to represent the target appearance details in candidate moving regions. SLIC superpixel segmentation, color, and moment features can be used to compute feature uniqueness and spatial compactness of saliency measurements to detect spatial saliency. It is a time consuming process, which prompted the development of a parallel algorithm to optimize and distribute the saliency detection processes that are loaded into the multi-processors. Spatiotemporal saliency is then obtained by combining the temporal and spatial saliencies to represent moving targets. Finally, a discriminative online learning algorithm was applied to generate a sample model based on spatiotemporal saliency. This sample model is then incrementally updated to detect the target in appearance variation conditions. Experiments conducted on the VIVID dataset demonstrated that the proposed visual tracking method is effective and is computationally efficient compared to state-of-the-art methods. PMID:29438421
A semi-automatic annotation tool for cooking video

NASA Astrophysics Data System (ADS)

Bianco, Simone; Ciocca, Gianluigi; Napoletano, Paolo; Schettini, Raimondo; Margherita, Roberto; Marini, Gianluca; Gianforme, Giorgio; Pantaleo, Giuseppe

2013-03-01

In order to create a cooking assistant application to guide the users in the preparation of the dishes relevant to their profile diets and food preferences, it is necessary to accurately annotate the video recipes, identifying and tracking the foods of the cook. These videos present particular annotation challenges such as frequent occlusions, food appearance changes, etc. Manually annotate the videos is a time-consuming, tedious and error-prone task. Fully automatic tools that integrate computer vision algorithms to extract and identify the elements of interest are not error free, and false positive and false negative detections need to be corrected in a post-processing stage. We present an interactive, semi-automatic tool for the annotation of cooking videos that integrates computer vision techniques under the supervision of the user. The annotation accuracy is increased with respect to completely automatic tools and the human effort is reduced with respect to completely manual ones. The performance and usability of the proposed tool are evaluated on the basis of the time and effort required to annotate the same video sequences.
An efficient approach for video information retrieval

NASA Astrophysics Data System (ADS)

Dong, Daoguo; Xue, Xiangyang

2005-01-01

Today, more and more video information can be accessed through internet, satellite, etc.. Retrieving specific video information from large-scale video database has become an important and challenging research topic in the area of multimedia information retrieval. In this paper, we introduce a new and efficient index structure OVA-File, which is a variant of VA-File. In OVA-File, the approximations close to each other in data space are stored in close positions of the approximation file. The benefit is that only a part of approximations close to the query vector need to be visited to get the query result. Both shot query algorithm and video clip algorithm are proposed to support video information retrieval efficiently. The experimental results showed that the queries based on OVA-File were much faster than that based on VA-File with small loss of result quality.
3D video coding: an overview of present and upcoming standards

NASA Astrophysics Data System (ADS)

Merkle, Philipp; Müller, Karsten; Wiegand, Thomas

2010-07-01

An overview of existing and upcoming 3D video coding standards is given. Various different 3D video formats are available, each with individual pros and cons. The 3D video formats can be separated into two classes: video-only formats (such as stereo and multiview video) and depth-enhanced formats (such as video plus depth and multiview video plus depth). Since all these formats exist of at least two video sequences and possibly additional depth data, efficient compression is essential for the success of 3D video applications and technologies. For the video-only formats the H.264 family of coding standards already provides efficient and widely established compression algorithms: H.264/AVC simulcast, H.264/AVC stereo SEI message, and H.264/MVC. For the depth-enhanced formats standardized coding algorithms are currently being developed. New and specially adapted coding approaches are necessary, as the depth or disparity information included in these formats has significantly different characteristics than video and is not displayed directly, but used for rendering. Motivated by evolving market needs, MPEG has started an activity to develop a generic 3D video standard within the 3DVC ad-hoc group. Key features of the standard are efficient and flexible compression of depth-enhanced 3D video representations and decoupling of content creation and display requirements.
Modeling of video traffic in packet networks, low rate video compression, and the development of a lossy+lossless image compression algorithm

NASA Technical Reports Server (NTRS)

Sayood, K.; Chen, Y. C.; Wang, X.

1992-01-01

During this reporting period we have worked on three somewhat different problems. These are modeling of video traffic in packet networks, low rate video compression, and the development of a lossy + lossless image compression algorithm, which might have some application in browsing algorithms. The lossy + lossless scheme is an extension of work previously done under this grant. It provides a simple technique for incorporating browsing capability. The low rate coding scheme is also a simple variation on the standard discrete cosine transform (DCT) coding approach. In spite of its simplicity, the approach provides surprisingly high quality reconstructions. The modeling approach is borrowed from the speech recognition literature, and seems to be promising in that it provides a simple way of obtaining an idea about the second order behavior of a particular coding scheme. Details about these are presented.
An improved architecture for video rate image transformations

NASA Technical Reports Server (NTRS)

Fisher, Timothy E.; Juday, Richard D.

1989-01-01

Geometric image transformations are of interest to pattern recognition algorithms for their use in simplifying some aspects of the pattern recognition process. Examples include reducing sensitivity to rotation, scale, and perspective of the object being recognized. The NASA Programmable Remapper can perform a wide variety of geometric transforms at full video rate. An architecture is proposed that extends its abilities and alleviates many of the first version's shortcomings. The need for the improvements are discussed in the context of the initial Programmable Remapper and the benefits and limitations it has delivered. The implementation and capabilities of the proposed architecture are discussed.
Pulsation Detection from Noisy Ultrasound-Echo Moving Images of Newborn Baby Head Using Fourier Transform

NASA Astrophysics Data System (ADS)

Yamada, Masayoshi; Fukuzawa, Masayuki; Kitsunezuka, Yoshiki; Kishida, Jun; Nakamori, Nobuyuki; Kanamori, Hitoshi; Sakurai, Takashi; Kodama, Souichi

1995-05-01

In order to detect pulsation from a series of noisy ultrasound-echo moving images of a newborn baby's head for pediatric diagnosis, a digital image processing system capable of recording at the video rate and processing the recorded series of images was constructed. The time-sequence variations of each pixel value in a series of moving images were analyzed and then an algorithm based on Fourier transform was developed for the pulsation detection, noting that the pulsation associated with blood flow was periodically changed by heartbeat. Pulsation detection for pediatric diagnosis was successfully made from a series of noisy ultrasound-echo moving images of newborn baby's head by using the image processing system and the pulsation detection algorithm developed here.
Embedded security system for multi-modal surveillance in a railway carriage

NASA Astrophysics Data System (ADS)

Zouaoui, Rhalem; Audigier, Romaric; Ambellouis, Sébastien; Capman, François; Benhadda, Hamid; Joudrier, Stéphanie; Sodoyer, David; Lamarque, Thierry

2015-10-01

Public transport security is one of the main priorities of the public authorities when fighting against crime and terrorism. In this context, there is a great demand for autonomous systems able to detect abnormal events such as violent acts aboard passenger cars and intrusions when the train is parked at the depot. To this end, we present an innovative approach which aims at providing efficient automatic event detection by fusing video and audio analytics and reducing the false alarm rate compared to classical stand-alone video detection. The multi-modal system is composed of two microphones and one camera and integrates onboard video and audio analytics and fusion capabilities. On the one hand, for detecting intrusion, the system relies on the fusion of "unusual" audio events detection with intrusion detections from video processing. The audio analysis consists in modeling the normal ambience and detecting deviation from the trained models during testing. This unsupervised approach is based on clustering of automatically extracted segments of acoustic features and statistical Gaussian Mixture Model (GMM) modeling of each cluster. The intrusion detection is based on the three-dimensional (3D) detection and tracking of individuals in the videos. On the other hand, for violent events detection, the system fuses unsupervised and supervised audio algorithms with video event detection. The supervised audio technique detects specific events such as shouts. A GMM is used to catch the formant structure of a shout signal. Video analytics use an original approach for detecting aggressive motion by focusing on erratic motion patterns specific to violent events. As data with violent events is not easily available, a normality model with structured motions from non-violent videos is learned for one-class classification. A fusion algorithm based on Dempster-Shafer's theory analyses the asynchronous detection outputs and computes the degree of belief of each probable event.
Peri-operative imaging of cancer margins with reflectance confocal microscopy during Mohs micrographic surgery: feasibility of a video-mosaicing algorithm

NASA Astrophysics Data System (ADS)

Flores, Eileen; Yelamos, Oriol; Cordova, Miguel; Kose, Kivanc; Phillips, William; Rossi, Anthony; Nehal, Kishwer; Rajadhyaksha, Milind

2017-02-01

Reflectance confocal microscopy (RCM) imaging shows promise for guiding surgical treatment of skin cancers. Recent technological advancements such as the introduction of the handheld version of the reflectance confocal microscope, video acquisition and video-mosaicing have improved RCM as an emerging tool to evaluate cancer margins during routine surgical skin procedures such as Mohs micrographic surgery (MMS). Detection of residual non-melanoma skin cancer (NMSC) tumor during MMS is feasible, as demonstrated by the introduction of real-time perioperative imaging on patients in the surgical setting. Our study is currently testing the feasibility of a new mosaicing algorithm for perioperative RCM imaging of NMSC cancer margins on patients during MMS. We report progress toward imaging and image analysis on forty-five patients, who presented for MMS at the MSKCC Dermatology service. The first 10 patients were used as a training set to establish an RCM imaging algorithm, which was implemented on the remaining test set of 35 patients. RCM imaging, using 35% AlCl3 for nuclear contrast, was performed pre- and intra-operatively with the Vivascope 3000 (Caliber ID). Imaging was performed in quadrants in the wound, to simulate the Mohs surgeon's examination of pathology. Videos were taken at the epidermal and deep dermal margins. Our Mohs surgeons assessed all videos and video-mosaics for quality and correlation to histology. Overall, our RCM video-mosaicing algorithm is feasible. RCM videos and video-mosaics of the epidermal and dermal margins were found to be of clinically acceptable quality. Assessment of cancer margins was affected by type of NMSC, size and location. Among the test set of 35 patients, 83% showed acceptable imaging quality, resolution and contrast. Visualization of nuclear and cellular morphology of residual BCC/SCC tumor and normal skin features could be detected in the peripheral and deep dermal margins. We observed correlation between the RCM videos/video-mosaics and the corresponding histology in 32 lesions. Peri-operative RCM imaging shows promise for improved and faster detection of cancer margins and guiding MMS in the surgical setting.
Foliage penetration by using 4-D point cloud data

NASA Astrophysics Data System (ADS)

Méndez Rodríguez, Javier; Sánchez-Reyes, Pedro J.; Cruz-Rivera, Sol M.

2012-06-01

Real-time awareness and rapid target detection are critical for the success of military missions. New technologies capable of detecting targets concealed in forest areas are needed in order to track and identify possible threats. Currently, LAser Detection And Ranging (LADAR) systems are capable of detecting obscured targets; however, tracking capabilities are severely limited. Now, a new LADAR-derived technology is under development to generate 4-D datasets (3-D video in a point cloud format). As such, there is a new need for algorithms that are able to process data in real time. We propose an algorithm capable of removing vegetation and other objects that may obfuscate concealed targets in a real 3-D environment. The algorithm is based on wavelets and can be used as a pre-processing step in a target recognition algorithm. Applications of the algorithm in a real-time 3-D system could help make pilots aware of high risk hidden targets such as tanks and weapons, among others. We will be using a 4-D simulated point cloud data to demonstrate the capabilities of our algorithm.
Direct endoscopic video registration for sinus surgery

NASA Astrophysics Data System (ADS)

Mirota, Daniel; Taylor, Russell H.; Ishii, Masaru; Hager, Gregory D.

2009-02-01

Advances in computer vision have made possible robust 3D reconstruction of monocular endoscopic video. These reconstructions accurately represent the visible anatomy and, once registered to pre-operative CT data, enable a navigation system to track directly through video eliminating the need for an external tracking system. Video registration provides the means for a direct interface between an endoscope and a navigation system and allows a shorter chain of rigid-body transformations to be used to solve the patient/navigation-system registration. To solve this registration step we propose a new 3D-3D registration algorithm based on Trimmed Iterative Closest Point (TrICP)1 and the z-buffer algorithm.2 The algorithm takes as input a 3D point cloud of relative scale with the origin at the camera center, an isosurface from the CT, and an initial guess of the scale and location. Our algorithm utilizes only the visible polygons of the isosurface from the current camera location during each iteration to minimize the search area of the target region and robustly reject outliers of the reconstruction. We present example registrations in the sinus passage applicable to both sinus surgery and transnasal surgery. To evaluate our algorithm's performance we compare it to registration via Optotrak and present closest distance point to surface error. We show our algorithm has a mean closest distance error of .2268mm.
FPGA implementation of image dehazing algorithm for real time applications

NASA Astrophysics Data System (ADS)

Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.

2017-09-01

Weather degradation such as haze, fog, mist, etc. severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomena, which makes this problem non trivial. Dehazing is an essential preprocessing stage in applications such as long range imaging, border security, intelligent transportation system, etc. However, these applications require low latency of the preprocessing block. In this work, single image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although conventional single image dark channel prior algorithm is computationally expensive, it yields impressive results. Moreover, a two stage image dehazing architecture is introduced, wherein, dark channel and airlight are estimated in the first stage. Whereas, transmission map and intensity restoration are computed in the next stages. The algorithm is implemented using Xilinx Vivado software and validated by using Xilinx zc702 development board, which contains an Artix7 equivalent Field Programmable Gate Array (FPGA) and ARM Cortex A9 dual core processor. Additionally, high definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second for the image resolution of 1920x1080 which is suitable of real time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
Detecting double compressed MPEG videos with the same quantization matrix and synchronized group of pictures structure

NASA Astrophysics Data System (ADS)

Aghamaleki, Javad Abbasi; Behrad, Alireza

2018-01-01

Double compression detection is a crucial stage in digital image and video forensics. However, the detection of double compressed videos is challenging when the video forger uses the same quantization matrix and synchronized group of pictures (GOP) structure during the recompression history to conceal tampering effects. A passive approach is proposed for detecting double compressed MPEG videos with the same quantization matrix and synchronized GOP structure. To devise the proposed algorithm, the effects of recompression on P frames are mathematically studied. Then, based on the obtained guidelines, a feature vector is proposed to detect double compressed frames on the GOP level. Subsequently, sparse representations of the feature vectors are used for dimensionality reduction and enrich the traces of recompression. Finally, a support vector machine classifier is employed to detect and localize double compression in temporal domain. The experimental results show that the proposed algorithm achieves the accuracy of more than 95%. In addition, the comparisons of the results of the proposed method with those of other methods reveal the efficiency of the proposed algorithm.
Processing Ocean Images to Detect Large Drift Nets

NASA Technical Reports Server (NTRS)

Veenstra, Tim

2009-01-01

A computer program processes the digitized outputs of a set of downward-looking video cameras aboard an aircraft flying over the ocean. The purpose served by this software is to facilitate the detection of large drift nets that have been lost, abandoned, or jettisoned. The development of this software and of the associated imaging hardware is part of a larger effort to develop means of detecting and removing large drift nets before they cause further environmental damage to the ocean and to shores on which they sometimes impinge. The software is capable of near-realtime processing of as many as three video feeds at a rate of 30 frames per second. After a user sets the parameters of an adjustable algorithm, the software analyzes each video stream, detects any anomaly, issues a command to point a high-resolution camera toward the location of the anomaly, and, once the camera has been so aimed, issues a command to trigger the camera shutter. The resulting high-resolution image is digitized, and the resulting data are automatically uploaded to the operator s computer for analysis.

[Reconstruction of Vehicle-human Crash Accident and Injury Analysis Based on 3D Laser Scanning, Multi-rigid-body Reconstruction and Optimized Genetic Algorithm].

PubMed

Sun, J; Wang, T; Li, Z D; Shao, Y; Zhang, Z Y; Feng, H; Zou, D H; Chen, Y J

2017-12-01

To reconstruct a vehicle-bicycle-cyclist crash accident and analyse the injuries using 3D laser scanning technology, multi-rigid-body dynamics and optimized genetic algorithm, and to provide biomechanical basis for the forensic identification of death cause. The vehicle was measured by 3D laser scanning technology. The multi-rigid-body models of cyclist, bicycle and vehicle were developed based on the measurements. The value range of optimal variables was set. A multi-objective genetic algorithm and the nondominated sorting genetic algorithm were used to find the optimal solutions, which were compared to the record of the surveillance video around the accident scene. The reconstruction result of laser scanning on vehicle was satisfactory. In the optimal solutions found by optimization method of genetic algorithm, the dynamical behaviours of dummy, bicycle and vehicle corresponded to that recorded by the surveillance video. The injury parameters of dummy were consistent with the situation and position of the real injuries on the cyclist in accident. The motion status before accident, damage process by crash and mechanical analysis on the injury of the victim can be reconstructed using 3D laser scanning technology, multi-rigid-body dynamics and optimized genetic algorithm, which have application value in the identification of injury manner and analysis of death cause in traffic accidents. Copyright© by the Editorial Department of Journal of Forensic Medicine
An automatic fuzzy-based multi-temporal brain digital subtraction angiography image fusion algorithm using curvelet transform and content selection strategy.

PubMed

Momeni, Saba; Pourghassem, Hossein

2014-08-01

Recently image fusion has prominent role in medical image processing and is useful to diagnose and treat many diseases. Digital subtraction angiography is one of the most applicable imaging to diagnose brain vascular diseases and radiosurgery of brain. This paper proposes an automatic fuzzy-based multi-temporal fusion algorithm for 2-D digital subtraction angiography images. In this algorithm, for blood vessel map extraction, the valuable frames of brain angiography video are automatically determined to form the digital subtraction angiography images based on a novel definition of vessel dispersion generated by injected contrast material. Our proposed fusion scheme contains different fusion methods for high and low frequency contents based on the coefficient characteristic of wrapping second generation of curvelet transform and a novel content selection strategy. Our proposed content selection strategy is defined based on sample correlation of the curvelet transform coefficients. In our proposed fuzzy-based fusion scheme, the selection of curvelet coefficients are optimized by applying weighted averaging and maximum selection rules for the high frequency coefficients. For low frequency coefficients, the maximum selection rule based on local energy criterion is applied to better visual perception. Our proposed fusion algorithm is evaluated on a perfect brain angiography image dataset consisting of one hundred 2-D internal carotid rotational angiography videos. The obtained results demonstrate the effectiveness and efficiency of our proposed fusion algorithm in comparison with common and basic fusion algorithms.
Comparison of H.265/HEVC encoders

NASA Astrophysics Data System (ADS)

Trochimiuk, Maciej

2016-09-01

The H.265/HEVC is the state-of-the-art video compression standard, which allows the bitrate reduction up to 50% compared with its predecessor, H.264/AVC, maintaining equal perceptual video quality. The growth in coding efficiency was achieved by increasing the number of available intra- and inter-frame prediction features and improvements in existing ones, such as entropy encoding and filtering. Nevertheless, to achieve real-time performance of the encoder, simplifications in algorithm are inevitable. Some features and coding modes shall be skipped, to reduce time needed to evaluate modes forwarded to rate-distortion optimisation. Thus, the potential acceleration of the encoding process comes at the expense of coding efficiency. In this paper, a trade-off between video quality and encoding speed of various H.265/HEVC encoders is discussed.
HEVC optimizations for medical environments

NASA Astrophysics Data System (ADS)

Fernández, D. G.; Del Barrio, A. A.; Botella, Guillermo; García, Carlos; Meyer-Baese, Uwe; Meyer-Baese, Anke

2016-05-01

HEVC/H.265 is the most interesting and cutting-edge topic in the world of digital video compression, allowing to reduce by half the required bandwidth in comparison with the previous H.264 standard. Telemedicine services and in general any medical video application can benefit from the video encoding advances. However, the HEVC is computationally expensive to implement. In this paper a method for reducing the HEVC complexity in the medical environment is proposed. The sequences that are typically processed in this context contain several homogeneous regions. Leveraging these regions, it is possible to simplify the HEVC flow while maintaining a high-level quality. In comparison with the HM16.2 standard, the encoding time is reduced up to 75%, with a negligible quality loss. Moreover, the algorithm is straightforward to implement in any hardware platform.
Detecting and Analyzing Multiple Moving Objects in Crowded Environments with Coherent Motion Regions

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cheriyadat, Anil M.

Understanding the world around us from large-scale video data requires vision systems that can perform automatic interpretation. While human eyes can unconsciously perceive independent objects in crowded scenes and other challenging operating environments, automated systems have difficulty detecting, counting, and understanding their behavior in similar scenes. Computer scientists at ORNL have a developed a technology termed as "Coherent Motion Region Detection" that invloves identifying multiple indepedent moving objects in crowded scenes by aggregating low-level motion cues extracted from moving objects. Humans and other species exploit such low-level motion cues seamlessely to perform perceptual grouping for visual understanding. The algorithm detectsmore » and tracks feature points on moving objects resulting in partial trajectories that span coherent 3D region in the space-time volume defined by the video. In the case of multi-object motion, many possible coherent motion regions can be constructed around the set of trajectories. The unique approach in the algorithm is to identify all possible coherent motion regions, then extract a subset of motion regions based on an innovative measure to automatically locate moving objects in crowded environments.The software reports snapshot of the object, count, and derived statistics ( count over time) from input video streams. The software can directly process videos streamed over the internet or directly from a hardware device (camera).« less
Mission planning optimization of video satellite for ground multi-object staring imaging

NASA Astrophysics Data System (ADS)

Cui, Kaikai; Xiang, Junhua; Zhang, Yulin

2018-03-01

This study investigates the emergency scheduling problem of ground multi-object staring imaging for a single video satellite. In the proposed mission scenario, the ground objects require a specified duration of staring imaging by the video satellite. The planning horizon is not long, i.e., it is usually shorter than one orbit period. A binary decision variable and the imaging order are used as the design variables, and the total observation revenue combined with the influence of the total attitude maneuvering time is regarded as the optimization objective. Based on the constraints of the observation time windows, satellite attitude adjustment time, and satellite maneuverability, a constraint satisfaction mission planning model is established for ground object staring imaging by a single video satellite. Further, a modified ant colony optimization algorithm with tabu lists (Tabu-ACO) is designed to solve this problem. The proposed algorithm can fully exploit the intelligence and local search ability of ACO. Based on full consideration of the mission characteristics, the design of the tabu lists can reduce the search range of ACO and improve the algorithm efficiency significantly. The simulation results show that the proposed algorithm outperforms the conventional algorithm in terms of optimization performance, and it can obtain satisfactory scheduling results for the mission planning problem.
Joint source-channel coding for motion-compensated DCT-based SNR scalable video.

PubMed

Kondi, Lisimachos P; Ishtiaq, Faisal; Katsaggelos, Aggelos K

2002-01-01

In this paper, we develop an approach toward joint source-channel coding for motion-compensated DCT-based scalable video coding and transmission. A framework for the optimal selection of the source and channel coding rates over all scalable layers is presented such that the overall distortion is minimized. The algorithm utilizes universal rate distortion characteristics which are obtained experimentally and show the sensitivity of the source encoder and decoder to channel errors. The proposed algorithm allocates the available bit rate between scalable layers and, within each layer, between source and channel coding. We present the results of this rate allocation algorithm for video transmission over a wireless channel using the H.263 Version 2 signal-to-noise ratio (SNR) scalable codec for source coding and rate-compatible punctured convolutional (RCPC) codes for channel coding. We discuss the performance of the algorithm with respect to the channel conditions, coding methodologies, layer rates, and number of layers.
The development of an imaging informatics-based multi-institutional platform to support sports performance and injury prevention in track and field

NASA Astrophysics Data System (ADS)

Liu, Joseph; Wang, Ximing; Verma, Sneha; McNitt-Gray, Jill; Liu, Brent

2018-03-01

The main goal of sports science and performance enhancement is to collect video and image data, process them, and quantify the results, giving insight to help athletes improve technique. For long jump in track and field, the processed output of video with force vector overlays and force calculations allow coaches to view specific stages of the hop, step, and jump, and identify how each stage can be improved to increase jump distance. Outputs also provide insight into how athletes can better maneuver to prevent injury. Currently, each data collection site collects and stores data with their own methods. There is no standard for data collection, formats, or storage. Video files and quantified results are stored in different formats, structures, and locations such as Dropbox and hard drives. Using imaging informatics-based principles we can develop a platform for multiple institutions that promotes the standardization of sports performance data. In addition, the system will provide user authentication and privacy as in clinical trials, with specific user access rights. Long jump data collected from different field sites will be standardized into specified formats before database storage. Quantified results from image-processing algorithms are stored similar to CAD algorithm results. The system will streamline the current sports performance data workflow and provide a user interface for athletes and coaches to view results of individual collections and also longitudinally across different collections. This streamlined platform and interface is a tool for coaches and athletes to easily access and review data to improve sports performance and prevent injury.
An Efficient and Robust Moving Shadow Removal Algorithm and Its Applications in ITS

NASA Astrophysics Data System (ADS)

Lin, Chin-Teng; Yang, Chien-Ting; Shou, Yu-Wen; Shen, Tzu-Kuei

2010-12-01

We propose an efficient algorithm for removing shadows of moving vehicles caused by non-uniform distributions of light reflections in the daytime. This paper presents a brand-new and complete structure in feature combination as well as analysis for orientating and labeling moving shadows so as to extract the defined objects in foregrounds more easily in each snapshot of the original files of videos which are acquired in the real traffic situations. Moreover, we make use of Gaussian Mixture Model (GMM) for background removal and detection of moving shadows in our tested images, and define two indices for characterizing non-shadowed regions where one indicates the characteristics of lines and the other index can be characterized by the information in gray scales of images which helps us to build a newly defined set of darkening ratios (modified darkening factors) based on Gaussian models. To prove the effectiveness of our moving shadow algorithm, we carry it out with a practical application of traffic flow detection in ITS (Intelligent Transportation System)—vehicle counting. Our algorithm shows the faster processing speed, 13.84 ms/frame, and can improve the accuracy rate in 4% ~ 10% for our three tested videos in the experimental results of vehicle counting.
A Novel Real-Time Reference Key Frame Scan Matching Method

PubMed Central

Mohamed, Haytham; Moussa, Adel; Elhabiby, Mohamed; El-Sheimy, Naser; Sesay, Abu

2017-01-01

Unmanned aerial vehicles represent an effective technology for indoor search and rescue operations. Typically, most indoor missions’ environments would be unknown, unstructured, and/or dynamic. Navigation of UAVs in such environments is addressed by simultaneous localization and mapping approach using either local or global approaches. Both approaches suffer from accumulated errors and high processing time due to the iterative nature of the scan matching method. Moreover, point-to-point scan matching is prone to outlier association processes. This paper proposes a low-cost novel method for 2D real-time scan matching based on a reference key frame (RKF). RKF is a hybrid scan matching technique comprised of feature-to-feature and point-to-point approaches. This algorithm aims at mitigating errors accumulation using the key frame technique, which is inspired from video streaming broadcast process. The algorithm depends on the iterative closest point algorithm during the lack of linear features which is typically exhibited in unstructured environments. The algorithm switches back to the RKF once linear features are detected. To validate and evaluate the algorithm, the mapping performance and time consumption are compared with various algorithms in static and dynamic environments. The performance of the algorithm exhibits promising navigational, mapping results and very short computational time, that indicates the potential use of the new algorithm with real-time systems. PMID:28481285
Keyhole imaging method for dynamic objects behind the occlusion area

NASA Astrophysics Data System (ADS)

Hao, Conghui; Chen, Xi; Dong, Liquan; Zhao, Yuejin; Liu, Ming; Kong, Lingqin; Hui, Mei; Liu, Xiaohua; Wu, Hong

2018-01-01

A method of keyhole imaging based on camera array is realized to obtain the video image behind a keyhole in shielded space at a relatively long distance. We get the multi-angle video images by using a 2×2 CCD camera array to take the images behind the keyhole in four directions. The multi-angle video images are saved in the form of frame sequences. This paper presents a method of video frame alignment. In order to remove the non-target area outside the aperture, we use the canny operator and morphological method to realize the edge detection of images and fill the images. The image stitching of four images is accomplished on the basis of the image stitching algorithm of two images. In the image stitching algorithm of two images, the SIFT method is adopted to accomplish the initial matching of images, and then the RANSAC algorithm is applied to eliminate the wrong matching points and to obtain a homography matrix. A method of optimizing transformation matrix is proposed in this paper. Finally, the video image with larger field of view behind the keyhole can be synthesized with image frame sequence in which every single frame is stitched. The results show that the screen of the video is clear and natural, the brightness transition is smooth. There is no obvious artificial stitching marks in the video, and it can be applied in different engineering environment .
Energy conservation using face detection

NASA Astrophysics Data System (ADS)

Deotale, Nilesh T.; Kalbande, Dhananjay R.; Mishra, Akassh A.

2011-10-01

Computerized Face Detection, is concerned with the difficult task of converting a video signal of a person to written text. It has several applications like face recognition, simultaneous multiple face processing, biometrics, security, video surveillance, human computer interface, image database management, digital cameras use face detection for autofocus, selecting regions of interest in photo slideshows that use a pan-and-scale and The Present Paper deals with energy conservation using face detection. Automating the process to a computer requires the use of various image processing techniques. There are various methods that can be used for Face Detection such as Contour tracking methods, Template matching, Controlled background, Model based, Motion based and color based. Basically, the video of the subject are converted into images are further selected manually for processing. However, several factors like poor illumination, movement of face, viewpoint-dependent Physical appearance, Acquisition geometry, Imaging conditions, Compression artifacts makes Face detection difficult. This paper reports an algorithm for conservation of energy using face detection for various devices. The present paper suggests Energy Conservation can be done by Detecting the Face and reducing the brightness of complete image and then adjusting the brightness of the particular area of an image where the face is located using histogram equalization.
Hierarchical video summarization

NASA Astrophysics Data System (ADS)

Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.

1998-12-01

We address the problem of key-frame summarization of vide in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
High performance compression of science data

NASA Technical Reports Server (NTRS)

Storer, James A.; Cohn, Martin

1994-01-01

Two papers make up the body of this report. One presents a single-pass adaptive vector quantization algorithm that learns a codebook of variable size and shape entries; the authors present experiments on a set of test images showing that with no training or prior knowledge of the data, for a given fidelity, the compression achieved typically equals or exceeds that of the JPEG standard. The second paper addresses motion compensation, one of the most effective techniques used in the interframe data compression. A parallel block-matching algorithm for estimating interframe displacement of blocks with minimum error is presented. The algorithm is designed for a simple parallel architecture to process video in real time.
Selective encryption for H.264/AVC video coding

NASA Astrophysics Data System (ADS)

Shi, Tuo; King, Brian; Salama, Paul

2006-02-01

Due to the ease with which digital data can be manipulated and due to the ongoing advancements that have brought us closer to pervasive computing, the secure delivery of video and images has become a challenging problem. Despite the advantages and opportunities that digital video provide, illegal copying and distribution as well as plagiarism of digital audio, images, and video is still ongoing. In this paper we describe two techniques for securing H.264 coded video streams. The first technique, SEH264Algorithm1, groups the data into the following blocks of data: (1) a block that contains the sequence parameter set and the picture parameter set, (2) a block containing a compressed intra coded frame, (3) a block containing the slice header of a P slice, all the headers of the macroblock within the same P slice, and all the luma and chroma DC coefficients belonging to the all the macroblocks within the same slice, (4) a block containing all the ac coefficients, and (5) a block containing all the motion vectors. The first three are encrypted whereas the last two are not. The second method, SEH264Algorithm2, relies on the use of multiple slices per coded frame. The algorithm searches the compressed video sequence for start codes (0x000001) and then encrypts the next N bits of data.
Real-time video streaming using H.264 scalable video coding (SVC) in multihomed mobile networks: a testbed approach

NASA Astrophysics Data System (ADS)

Nightingale, James; Wang, Qi; Grecos, Christos

2011-03-01

Users of the next generation wireless paradigm known as multihomed mobile networks expect satisfactory quality of service (QoS) when accessing streamed multimedia content. The recent H.264 Scalable Video Coding (SVC) extension to the Advanced Video Coding standard (AVC), offers the facility to adapt real-time video streams in response to the dynamic conditions of multiple network paths encountered in multihomed wireless mobile networks. Nevertheless, preexisting streaming algorithms were mainly proposed for AVC delivery over multipath wired networks and were evaluated by software simulation. This paper introduces a practical, hardware-based testbed upon which we implement and evaluate real-time H.264 SVC streaming algorithms in a realistic multihomed wireless mobile networks environment. We propose an optimised streaming algorithm with multi-fold technical contributions. Firstly, we extended the AVC packet prioritisation schemes to reflect the three-dimensional granularity of SVC. Secondly, we designed a mechanism for evaluating the effects of different streamer 'read ahead window' sizes on real-time performance. Thirdly, we took account of the previously unconsidered path switching and mobile networks tunnelling overheads encountered in real-world deployments. Finally, we implemented a path condition monitoring and reporting scheme to facilitate the intelligent path switching. The proposed system has been experimentally shown to offer a significant improvement in PSNR of the received stream compared with representative existing algorithms.
A constrained joint source/channel coder design and vector quantization of nonstationary sources

NASA Technical Reports Server (NTRS)

Sayood, Khalid; Chen, Y. C.; Nori, S.; Araj, A.

1993-01-01

The emergence of broadband ISDN as the network for the future brings with it the promise of integration of all proposed services in a flexible environment. In order to achieve this flexibility, asynchronous transfer mode (ATM) has been proposed as the transfer technique. During this period a study was conducted on the bridging of network transmission performance and video coding. The successful transmission of variable bit rate video over ATM networks relies on the interaction between the video coding algorithm and the ATM networks. Two aspects of networks that determine the efficiency of video transmission are the resource allocation algorithm and the congestion control algorithm. These are explained in this report. Vector quantization (VQ) is one of the more popular compression techniques to appear in the last twenty years. Numerous compression techniques, which incorporate VQ, have been proposed. While the LBG VQ provides excellent compression, there are also several drawbacks to the use of the LBG quantizers including search complexity and memory requirements, and a mismatch between the codebook and the inputs. The latter mainly stems from the fact that the VQ is generally designed for a specific rate and a specific class of inputs. In this work, an adaptive technique is proposed for vector quantization of images and video sequences. This technique is an extension of the recursively indexed scalar quantization (RISQ) algorithm.
A Secure and Robust Compressed Domain Video Steganography for Intra- and Inter-Frames Using Embedding-Based Byte Differencing (EBBD) Scheme

PubMed Central

Idbeaa, Tarik; Abdul Samad, Salina; Husain, Hafizah

2016-01-01

This paper presents a novel secure and robust steganographic technique in the compressed video domain namely embedding-based byte differencing (EBBD). Unlike most of the current video steganographic techniques which take into account only the intra frames for data embedding, the proposed EBBD technique aims to hide information in both intra and inter frames. The information is embedded into a compressed video by simultaneously manipulating the quantized AC coefficients (AC-QTCs) of luminance components of the frames during MPEG-2 encoding process. Later, during the decoding process, the embedded information can be detected and extracted completely. Furthermore, the EBBD basically deals with two security concepts: data encryption and data concealing. Hence, during the embedding process, secret data is encrypted using the simplified data encryption standard (S-DES) algorithm to provide better security to the implemented system. The security of the method lies in selecting candidate AC-QTCs within each non-overlapping 8 × 8 sub-block using a pseudo random key. Basic performance of this steganographic technique verified through experiments on various existing MPEG-2 encoded videos over a wide range of embedded payload rates. Overall, the experimental results verify the excellent performance of the proposed EBBD with a better trade-off in terms of imperceptibility and payload, as compared with previous techniques while at the same time ensuring minimal bitrate increase and negligible degradation of PSNR values. PMID:26963093
A Secure and Robust Compressed Domain Video Steganography for Intra- and Inter-Frames Using Embedding-Based Byte Differencing (EBBD) Scheme.

PubMed

Idbeaa, Tarik; Abdul Samad, Salina; Husain, Hafizah

2016-01-01

This paper presents a novel secure and robust steganographic technique in the compressed video domain namely embedding-based byte differencing (EBBD). Unlike most of the current video steganographic techniques which take into account only the intra frames for data embedding, the proposed EBBD technique aims to hide information in both intra and inter frames. The information is embedded into a compressed video by simultaneously manipulating the quantized AC coefficients (AC-QTCs) of luminance components of the frames during MPEG-2 encoding process. Later, during the decoding process, the embedded information can be detected and extracted completely. Furthermore, the EBBD basically deals with two security concepts: data encryption and data concealing. Hence, during the embedding process, secret data is encrypted using the simplified data encryption standard (S-DES) algorithm to provide better security to the implemented system. The security of the method lies in selecting candidate AC-QTCs within each non-overlapping 8 × 8 sub-block using a pseudo random key. Basic performance of this steganographic technique verified through experiments on various existing MPEG-2 encoded videos over a wide range of embedded payload rates. Overall, the experimental results verify the excellent performance of the proposed EBBD with a better trade-off in terms of imperceptibility and payload, as compared with previous techniques while at the same time ensuring minimal bitrate increase and negligible degradation of PSNR values.
Determining the bias and variance of a deterministic finger-tracking algorithm.

PubMed

Morash, Valerie S; van der Velden, Bas H M

2016-06-01

Finger tracking has the potential to expand haptic research and applications, as eye tracking has done in vision research. In research applications, it is desirable to know the bias and variance associated with a finger-tracking method. However, assessing the bias and variance of a deterministic method is not straightforward. Multiple measurements of the same finger position data will not produce different results, implying zero variance. Here, we present a method of assessing deterministic finger-tracking variance and bias through comparison to a non-deterministic measure. A proof-of-concept is presented using a video-based finger-tracking algorithm developed for the specific purpose of tracking participant fingers during a psychological research study. The algorithm uses ridge detection on videos of the participant's hand, and estimates the location of the right index fingertip. The algorithm was evaluated using data from four participants, who explored tactile maps using only their right index finger and all right-hand fingers. The algorithm identified the index fingertip in 99.78 % of one-finger video frames and 97.55 % of five-finger video frames. Although the algorithm produced slightly biased and more dispersed estimates relative to a human coder, these differences (x=0.08 cm, y=0.04 cm) and standard deviations (σ x =0.16 cm, σ y =0.21 cm) were small compared to the size of a fingertip (1.5-2.0 cm). Some example finger-tracking results are provided where corrections are made using the bias and variance estimates.

Enhance Video Film using Retnix method

NASA Astrophysics Data System (ADS)

Awad, Rasha; Al-Zuky, Ali A.; Al-Saleh, Anwar H.; Mohamad, Haidar J.

2018-05-01

An enhancement technique used to improve the studied video quality. Algorithms like mean and standard deviation are used as a criterion within this paper, and it applied for each video clip that divided into 80 images. The studied filming environment has different light intensity (315, 566, and 644Lux). This different environment gives similar reality to the outdoor filming. The outputs of the suggested algorithm are compared with the results before applying it. This method is applied into two ways: first, it is applied for the full video clip to get the enhanced film; second, it is applied for every individual image to get the enhanced image then compiler them to get the enhanced film. This paper shows that the enhancement technique gives good quality video film depending on a statistical method, and it is recommended to use it in different application.
Open-source telemedicine platform for wireless medical video communication.

PubMed

Panayides, A; Eleftheriou, I; Pantziaris, M

2013-01-01

An m-health system for real-time wireless communication of medical video based on open-source software is presented. The objective is to deliver a low-cost telemedicine platform which will allow for reliable remote diagnosis m-health applications such as emergency incidents, mass population screening, and medical education purposes. The performance of the proposed system is demonstrated using five atherosclerotic plaque ultrasound videos. The videos are encoded at the clinically acquired resolution, in addition to lower, QCIF, and CIF resolutions, at different bitrates, and four different encoding structures. Commercially available wireless local area network (WLAN) and 3.5G high-speed packet access (HSPA) wireless channels are used to validate the developed platform. Objective video quality assessment is based on PSNR ratings, following calibration using the variable frame delay (VFD) algorithm that removes temporal mismatch between original and received videos. Clinical evaluation is based on atherosclerotic plaque ultrasound video assessment protocol. Experimental results show that adequate diagnostic quality wireless medical video communications are realized using the designed telemedicine platform. HSPA cellular networks provide for ultrasound video transmission at the acquired resolution, while VFD algorithm utilization bridges objective and subjective ratings.
Open-Source Telemedicine Platform for Wireless Medical Video Communication

PubMed Central

Panayides, A.; Eleftheriou, I.; Pantziaris, M.

2013-01-01

An m-health system for real-time wireless communication of medical video based on open-source software is presented. The objective is to deliver a low-cost telemedicine platform which will allow for reliable remote diagnosis m-health applications such as emergency incidents, mass population screening, and medical education purposes. The performance of the proposed system is demonstrated using five atherosclerotic plaque ultrasound videos. The videos are encoded at the clinically acquired resolution, in addition to lower, QCIF, and CIF resolutions, at different bitrates, and four different encoding structures. Commercially available wireless local area network (WLAN) and 3.5G high-speed packet access (HSPA) wireless channels are used to validate the developed platform. Objective video quality assessment is based on PSNR ratings, following calibration using the variable frame delay (VFD) algorithm that removes temporal mismatch between original and received videos. Clinical evaluation is based on atherosclerotic plaque ultrasound video assessment protocol. Experimental results show that adequate diagnostic quality wireless medical video communications are realized using the designed telemedicine platform. HSPA cellular networks provide for ultrasound video transmission at the acquired resolution, while VFD algorithm utilization bridges objective and subjective ratings. PMID:23573082
Computer vision barrel inspection

NASA Astrophysics Data System (ADS)

Wolfe, William J.; Gunderson, James; Walworth, Matthew E.

1994-02-01

One of the Department of Energy's (DOE) ongoing tasks is the storage and inspection of a large number of waste barrels containing a variety of hazardous substances. Martin Marietta is currently contracted to develop a robotic system -- the Intelligent Mobile Sensor System (IMSS) -- for the automatic monitoring and inspection of these barrels. The IMSS is a mobile robot with multiple sensors: video cameras, illuminators, laser ranging and barcode reader. We assisted Martin Marietta in this task, specifically in the development of image processing algorithms that recognize and classify the barrel labels. Our subsystem uses video images to detect and locate the barcode, so that the barcode reader can be pointed at the barcode.
A Novel Optical/digital Processing System for Pattern Recognition

NASA Technical Reports Server (NTRS)

Boone, Bradley G.; Shukla, Oodaye B.

1993-01-01

This paper describes two processing algorithms that can be implemented optically: the Radon transform and angular correlation. These two algorithms can be combined in one optical processor to extract all the basic geometric and amplitude features from objects embedded in video imagery. We show that the internal amplitude structure of objects is recovered by the Radon transform, which is a well-known result, but, in addition, we show simulation results that calculate angular correlation, a simple but unique algorithm that extracts object boundaries from suitably threshold images from which length, width, area, aspect ratio, and orientation can be derived. In addition to circumventing scale and rotation distortions, these simulations indicate that the features derived from the angular correlation algorithm are relatively insensitive to tracking shifts and image noise. Some optical architecture concepts, including one based on micro-optical lenslet arrays, have been developed to implement these algorithms. Simulation test and evaluation using simple synthetic object data will be described, including results of a study that uses object boundaries (derivable from angular correlation) to classify simple objects using a neural network.
Physical environment virtualization for human activities recognition

NASA Astrophysics Data System (ADS)

Poshtkar, Azin; Elangovan, Vinayak; Shirkhodaie, Amir; Chan, Alex; Hu, Shuowen

2015-05-01

Human activity recognition research relies heavily on extensive datasets to verify and validate performance of activity recognition algorithms. However, obtaining real datasets are expensive and highly time consuming. A physics-based virtual simulation can accelerate the development of context based human activity recognition algorithms and techniques by generating relevant training and testing videos simulating diverse operational scenarios. In this paper, we discuss in detail the requisite capabilities of a virtual environment to aid as a test bed for evaluating and enhancing activity recognition algorithms. To demonstrate the numerous advantages of virtual environment development, a newly developed virtual environment simulation modeling (VESM) environment is presented here to generate calibrated multisource imagery datasets suitable for development and testing of recognition algorithms for context-based human activities. The VESM environment serves as a versatile test bed to generate a vast amount of realistic data for training and testing of sensor processing algorithms. To demonstrate the effectiveness of VESM environment, we present various simulated scenarios and processed results to infer proper semantic annotations from the high fidelity imagery data for human-vehicle activity recognition under different operational contexts.
Detection and Tracking of Moving Objects with Real-Time Onboard Vision System

NASA Astrophysics Data System (ADS)

Erokhin, D. Y.; Feldman, A. B.; Korepanov, S. E.

2017-05-01

Detection of moving objects in video sequence received from moving video sensor is a one of the most important problem in computer vision. The main purpose of this work is developing set of algorithms, which can detect and track moving objects in real time computer vision system. This set includes three main parts: the algorithm for estimation and compensation of geometric transformations of images, an algorithm for detection of moving objects, an algorithm to tracking of the detected objects and prediction their position. The results can be claimed to create onboard vision systems of aircraft, including those relating to small and unmanned aircraft.
Change Detection Algorithms for Surveillance in Visual IoT: A Comparative Study

NASA Astrophysics Data System (ADS)

Akram, Beenish Ayesha; Zafar, Amna; Akbar, Ali Hammad; Wajid, Bilal; Chaudhry, Shafique Ahmad

2018-01-01

The VIoT (Visual Internet of Things) connects virtual information world with real world objects using sensors and pervasive computing. For video surveillance in VIoT, ChD (Change Detection) is a critical component. ChD algorithms identify regions of change in multiple images of the same scene recorded at different time intervals for video surveillance. This paper presents performance comparison of histogram thresholding and classification ChD algorithms using quantitative measures for video surveillance in VIoT based on salient features of datasets. The thresholding algorithms Otsu, Kapur, Rosin and classification methods k-means, EM (Expectation Maximization) were simulated in MATLAB using diverse datasets. For performance evaluation, the quantitative measures used include OSR (Overall Success Rate), YC (Yule's Coefficient) and JC (Jaccard's Coefficient), execution time and memory consumption. Experimental results showed that Kapur's algorithm performed better for both indoor and outdoor environments with illumination changes, shadowing and medium to fast moving objects. However, it reflected degraded performance for small object size with minor changes. Otsu algorithm showed better results for indoor environments with slow to medium changes and nomadic object mobility. k-means showed good results in indoor environment with small object size producing slow change, no shadowing and scarce illumination changes.
Automatic video summarization driven by a spatio-temporal attention model

NASA Astrophysics Data System (ADS)

Barland, R.; Saadane, A.

2008-02-01

According to the literature, automatic video summarization techniques can be classified in two parts, following the output nature: "video skims", which are generated using portions of the original video and "key-frame sets", which correspond to the images, selected from the original video, having a significant semantic content. The difference between these two categories is reduced when we consider automatic procedures. Most of the published approaches are based on the image signal and use either pixel characterization or histogram techniques or image decomposition by blocks. However, few of them integrate properties of the Human Visual System (HVS). In this paper, we propose to extract keyframes for video summarization by studying the variations of salient information between two consecutive frames. For each frame, a saliency map is produced simulating the human visual attention by a bottom-up (signal-dependent) approach. This approach includes three parallel channels for processing three early visual features: intensity, color and temporal contrasts. For each channel, the variations of the salient information between two consecutive frames are computed. These outputs are then combined to produce the global saliency variation which determines the key-frames. Psychophysical experiments have been defined and conducted to analyze the relevance of the proposed key-frame extraction algorithm.
Design and testing of artifact-suppressed adaptive histogram equalization: a contrast-enhancement technique for display of digital chest radiographs.

PubMed

Rehm, K; Seeley, G W; Dallas, W J; Ovitt, T W; Seeger, J F

1990-01-01

One of the goals of our research in the field of digital radiography has been to develop contrast-enhancement algorithms for eventual use in the display of chest images on video devices with the aim of preserving the diagnostic information presently available with film, some of which would normally be lost because of the smaller dynamic range of video monitors. The ASAHE algorithm discussed in this article has been tested by investigating observer performance in a difficult detection task involving phantoms and simulated lung nodules, using film as the output medium. The results of the experiment showed that the algorithm is successful in providing contrast-enhanced, natural-looking chest images while maintaining diagnostic information. The algorithm did not effect an increase in nodule detectability, but this was not unexpected because film is a medium capable of displaying a wide range of gray levels. It is sufficient at this stage to show that there is no degradation in observer performance. Future tests will evaluate the performance of the ASAHE algorithm in preparing chest images for video display.
Analysis of Video-Based Microscopic Particle Trajectories Using Kalman Filtering

PubMed Central

Wu, Pei-Hsun; Agarwal, Ashutosh; Hess, Henry; Khargonekar, Pramod P.; Tseng, Yiider

2010-01-01

Abstract The fidelity of the trajectories obtained from video-based particle tracking determines the success of a variety of biophysical techniques, including in situ single cell particle tracking and in vitro motility assays. However, the image acquisition process is complicated by system noise, which causes positioning error in the trajectories derived from image analysis. Here, we explore the possibility of reducing the positioning error by the application of a Kalman filter, a powerful algorithm to estimate the state of a linear dynamic system from noisy measurements. We show that the optimal Kalman filter parameters can be determined in an appropriate experimental setting, and that the Kalman filter can markedly reduce the positioning error while retaining the intrinsic fluctuations of the dynamic process. We believe the Kalman filter can potentially serve as a powerful tool to infer a trajectory of ultra-high fidelity from noisy images, revealing the details of dynamic cellular processes. PMID:20550894
Enhancement tuning and control for high dynamic range images in multi-scale locally adaptive contrast enhancement algorithms

NASA Astrophysics Data System (ADS)

Cvetkovic, Sascha D.; Schirris, Johan; de With, Peter H. N.

2009-01-01

For real-time imaging in surveillance applications, visibility of details is of primary importance to ensure customer confidence. If we display High Dynamic-Range (HDR) scenes whose contrast spans four or more orders of magnitude on a conventional monitor without additional processing, results are unacceptable. Compression of the dynamic range is therefore a compulsory part of any high-end video processing chain because standard monitors are inherently Low- Dynamic Range (LDR) devices with maximally two orders of display dynamic range. In real-time camera processing, many complex scenes are improved with local contrast enhancements, bringing details to the best possible visibility. In this paper, we show how a multi-scale high-frequency enhancement scheme, in which gain is a non-linear function of the detail energy, can be used for the dynamic range compression of HDR real-time video camera signals. We also show the connection of our enhancement scheme to the processing way of the Human Visual System (HVS). Our algorithm simultaneously controls perceived sharpness, ringing ("halo") artifacts (contrast) and noise, resulting in a good balance between visibility of details and non-disturbance of artifacts. The overall quality enhancement, suitable for both HDR and LDR scenes, is based on a careful selection of the filter types for the multi-band decomposition and a detailed analysis of the signal per frequency band.
Providing QoS through machine-learning-driven adaptive multimedia applications.

PubMed

Ruiz, Pedro M; Botía, Juan A; Gómez-Skarmeta, Antonio

2004-06-01

We investigate the optimization of the quality of service (QoS) offered by real-time multimedia adaptive applications through machine learning algorithms. These applications are able to adapt in real time their internal settings (i.e., video sizes, audio and video codecs, among others) to the unpredictably changing capacity of the network. Traditional adaptive applications just select a set of settings to consume less than the available bandwidth. We propose a novel approach in which the selected set of settings is the one which offers a better user-perceived QoS among all those combinations which satisfy the bandwidth restrictions. We use a genetic algorithm to decide when to trigger the adaptation process depending on the network conditions (i.e., loss-rate, jitter, etc.). Additionally, the selection of the new set of settings is done according to a set of rules which model the user-perceived QoS. These rules are learned using the SLIPPER rule induction algorithm over a set of examples extracted from scores provided by real users. We will demonstrate that the proposed approach guarantees a good user-perceived QoS even when the network conditions are constantly changing.
Object Occlusion Detection Using Automatic Camera Calibration for a Wide-Area Video Surveillance System

PubMed Central

Jung, Jaehoon; Yoon, Inhye; Paik, Joonki

2016-01-01

This paper presents an object occlusion detection algorithm using object depth information that is estimated by automatic camera calibration. The object occlusion problem is a major factor to degrade the performance of object tracking and recognition. To detect an object occlusion, the proposed algorithm consists of three steps: (i) automatic camera calibration using both moving objects and a background structure; (ii) object depth estimation; and (iii) detection of occluded regions. The proposed algorithm estimates the depth of the object without extra sensors but with a generic red, green and blue (RGB) camera. As a result, the proposed algorithm can be applied to improve the performance of object tracking and object recognition algorithms for video surveillance systems. PMID:27347978
Joint Attributes and Event Analysis for Multimedia Event Detection.

PubMed

Ma, Zhigang; Chang, Xiaojun; Xu, Zhongwen; Sebe, Nicu; Hauptmann, Alexander G

2017-06-15

Semantic attributes have been increasingly used the past few years for multimedia event detection (MED) with promising results. The motivation is that multimedia events generally consist of lower level components such as objects, scenes, and actions. By characterizing multimedia event videos with semantic attributes, one could exploit more informative cues for improved detection results. Much existing work obtains semantic attributes from images, which may be suboptimal for video analysis since these image-inferred attributes do not carry dynamic information that is essential for videos. To address this issue, we propose to learn semantic attributes from external videos using their semantic labels. We name them video attributes in this paper. In contrast with multimedia event videos, these external videos depict lower level contents such as objects, scenes, and actions. To harness video attributes, we propose an algorithm established on a correlation vector that correlates them to a target event. Consequently, we could incorporate video attributes latently as extra information into the event detector learnt from multimedia event videos in a joint framework. To validate our method, we perform experiments on the real-world large-scale TRECVID MED 2013 and 2014 data sets and compare our method with several state-of-the-art algorithms. The experiments show that our method is advantageous for MED.
Design and develop a video conferencing framework for real-time telemedicine applications using secure group-based communication architecture.

PubMed

Mat Kiah, M L; Al-Bakri, S H; Zaidan, A A; Zaidan, B B; Hussain, Muzammil

2014-10-01

One of the applications of modern technology in telemedicine is video conferencing. An alternative to traveling to attend a conference or meeting, video conferencing is becoming increasingly popular among hospitals. By using this technology, doctors can help patients who are unable to physically visit hospitals. Video conferencing particularly benefits patients from rural areas, where good doctors are not always available. Telemedicine has proven to be a blessing to patients who have no access to the best treatment. A telemedicine system consists of customized hardware and software at two locations, namely, at the patient's and the doctor's end. In such cases, the video streams of the conferencing parties may contain highly sensitive information. Thus, real-time data security is one of the most important requirements when designing video conferencing systems. This study proposes a secure framework for video conferencing systems and a complete management solution for secure video conferencing groups. Java Media Framework Application Programming Interface classes are used to design and test the proposed secure framework. Real-time Transport Protocol over User Datagram Protocol is used to transmit the encrypted audio and video streams, and RSA and AES algorithms are used to provide the required security services. Results show that the encryption algorithm insignificantly increases the video conferencing computation time.
Implementation issues in source coding

NASA Technical Reports Server (NTRS)

Sayood, Khalid; Chen, Yun-Chung; Hadenfeldt, A. C.

1989-01-01

An edge preserving image coding scheme which can be operated in both a lossy and a lossless manner was developed. The technique is an extension of the lossless encoding algorithm developed for the Mars observer spectral data. It can also be viewed as a modification of the DPCM algorithm. A packet video simulator was also developed from an existing modified packet network simulator. The coding scheme for this system is a modification of the mixture block coding (MBC) scheme described in the last report. Coding algorithms for packet video were also investigated.
Intelligent Network-Centric Sensors Development Program

DTIC Science & Technology

2012-07-31

Image sensor Configuration: ; Cone 360 degree LWIR PFx Sensor: •■. Image sensor . Configuration: Image MWIR Configuration; Cone 360 degree... LWIR PFx Sensor: Video Configuration: Cone 360 degree SW1R, 2. Reasoning Process to Match Sensor Systems to Algorithms The ontological...effects of coherent imaging because of aberrations. Another reason is the specular nature of active imaging. Both contribute to the nonuniformity
A comparison of foveated acquisition and tracking performance relative to uniform resolution approaches

NASA Astrophysics Data System (ADS)

Dubuque, Shaun; Coffman, Thayne; McCarley, Paul; Bovik, A. C.; Thomas, C. William

2009-05-01

Foveated imaging has been explored for compression and tele-presence, but gaps exist in the study of foveated imaging applied to acquisition and tracking systems. Results are presented from two sets of experiments comparing simple foveated and uniform resolution targeting (acquisition and tracking) algorithms. The first experiments measure acquisition performance when locating Gabor wavelet targets in noise, with fovea placement driven by a mutual information measure. The foveated approach is shown to have lower detection delay than a notional uniform resolution approach when using video that consumes equivalent bandwidth. The second experiments compare the accuracy of target position estimates from foveated and uniform resolution tracking algorithms. A technique is developed to select foveation parameters that minimize error in Kalman filter state estimates. Foveated tracking is shown to consistently outperform uniform resolution tracking on an abstract multiple target task when using video that consumes equivalent bandwidth. Performance is also compared to uniform resolution processing without bandwidth limitations. In both experiments, superior performance is achieved at a given bandwidth by foveated processing because limited resources are allocated intelligently to maximize operational performance. These findings indicate the potential for operational performance improvements over uniform resolution systems in both acquisition and tracking tasks.
Blind identification of full-field vibration modes from video measurements with phase-based video motion magnification

NASA Astrophysics Data System (ADS)

Yang, Yongchao; Dorn, Charles; Mancini, Tyler; Talken, Zachary; Kenyon, Garrett; Farrar, Charles; Mascareñas, David

2017-02-01

Experimental or operational modal analysis traditionally requires physically-attached wired or wireless sensors for vibration measurement of structures. This instrumentation can result in mass-loading on lightweight structures, and is costly and time-consuming to install and maintain on large civil structures, especially for long-term applications (e.g., structural health monitoring) that require significant maintenance for cabling (wired sensors) or periodic replacement of the energy supply (wireless sensors). Moreover, these sensors are typically placed at a limited number of discrete locations, providing low spatial sensing resolution that is hardly sufficient for modal-based damage localization, or model correlation and updating for larger-scale structures. Non-contact measurement methods such as scanning laser vibrometers provide high-resolution sensing capacity without the mass-loading effect; however, they make sequential measurements that require considerable acquisition time. As an alternative non-contact method, digital video cameras are relatively low-cost, agile, and provide high spatial resolution, simultaneous, measurements. Combined with vision based algorithms (e.g., image correlation, optical flow), video camera based measurements have been successfully used for vibration measurements and subsequent modal analysis, based on techniques such as the digital image correlation (DIC) and the point-tracking. However, they typically require speckle pattern or high-contrast markers to be placed on the surface of structures, which poses challenges when the measurement area is large or inaccessible. This work explores advanced computer vision and video processing algorithms to develop a novel video measurement and vision-based operational (output-only) modal analysis method that alleviate the need of structural surface preparation associated with existing vision-based methods and can be implemented in a relatively efficient and autonomous manner with little user supervision and calibration. First a multi-scale image processing method is applied on the frames of the video of a vibrating structure to extract the local pixel phases that encode local structural vibration, establishing a full-field spatiotemporal motion matrix. Then a high-spatial dimensional, yet low-modal-dimensional, over-complete model is used to represent the extracted full-field motion matrix using modal superposition, which is physically connected and manipulated by a family of unsupervised learning models and techniques, respectively. Thus, the proposed method is able to blindly extract modal frequencies, damping ratios, and full-field (as many points as the pixel number of the video frame) mode shapes from line of sight video measurements of the structure. The method is validated by laboratory experiments on a bench-scale building structure and a cantilever beam. Its ability for output (video measurements)-only identification and visualization of the weakly-excited mode is demonstrated and several issues with its implementation are discussed.

Efficient parallel implementation of active appearance model fitting algorithm on GPU.

PubMed

Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou

2014-01-01

The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU

PubMed Central

Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou

2014-01-01

The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures. PMID:24723812
Grayscale image segmentation for real-time traffic sign recognition: the hardware point of view

NASA Astrophysics Data System (ADS)

Cao, Tam P.; Deng, Guang; Elton, Darrell

2009-02-01

In this paper, we study several grayscale-based image segmentation methods for real-time road sign recognition applications on an FPGA hardware platform. The performance of different image segmentation algorithms in different lighting conditions are initially compared using PC simulation. Based on these results and analysis, suitable algorithms are implemented and tested on a real-time FPGA speed sign detection system. Experimental results show that the system using segmented images uses significantly less hardware resources on an FPGA while maintaining comparable system's performance. The system is capable of processing 60 live video frames per second.
A system for real-time measurement of the brachial artery diameter in B-mode ultrasound images.

PubMed

Gemignani, Vincenzo; Faita, Francesco; Ghiadoni, Lorenzo; Poggianti, Elisa; Demi, Marcello

2007-03-01

The measurement of the brachial artery diameter is frequently used in clinical studies for evaluating the flow-mediated dilation and, in conjunction with the blood pressure value, for assessing arterial stiffness. This paper presents a system for computing the brachial artery diameter in real-time by analyzing B-mode ultrasound images. The method is based on a robust edge detection algorithm which is used to automatically locate the two walls of the vessel. The measure of the diameter is obtained with subpixel precision and with a temporal resolution of 25 samples/s, so that the small dilations induced by the cardiac cycle can also be retrieved. The algorithm is implemented on a standalone video processing board which acquires the analog video signal from the ultrasound equipment. Results are shown in real-time on a graphical user interface. The system was tested both on synthetic ultrasound images and in clinical studies of flow-mediated dilation. Accuracy, robustness, and intra/inter observer variability of the method were evaluated.
"SmartMonitor"--an intelligent security system for the protection of individuals and small properties with the possibility of home automation.

PubMed

Frejlichowski, Dariusz; Gościewska, Katarzyna; Forczmański, Paweł; Hofman, Radosław

2014-06-05

"SmartMonitor" is an intelligent security system based on image analysis that combines the advantages of alarm, video surveillance and home automation systems. The system is a complete solution that automatically reacts to every learned situation in a pre-specified way and has various applications, e.g., home and surrounding protection against unauthorized intrusion, crime detection or supervision over ill persons. The software is based on well-known and proven methods and algorithms for visual content analysis (VCA) that were appropriately modified and adopted to fit specific needs and create a video processing model which consists of foreground region detection and localization, candidate object extraction, object classification and tracking. In this paper, the "SmartMonitor" system is presented along with its architecture, employed methods and algorithms, and object analysis approach. Some experimental results on system operation are also provided. In the paper, focus is put on one of the aforementioned functionalities of the system, namely supervision over ill persons.
Robust Video Stabilization Using Particle Keypoint Update and l1-Optimized Camera Path

PubMed Central

Jeon, Semi; Yoon, Inhye; Jang, Jinbeum; Yang, Seungji; Kim, Jisung; Paik, Joonki

2017-01-01

Acquisition of stabilized video is an important issue for various type of digital cameras. This paper presents an adaptive camera path estimation method using robust feature detection to remove shaky artifacts in a video. The proposed algorithm consists of three steps: (i) robust feature detection using particle keypoints between adjacent frames; (ii) camera path estimation and smoothing; and (iii) rendering to reconstruct a stabilized video. As a result, the proposed algorithm can estimate the optimal homography by redefining important feature points in the flat region using particle keypoints. In addition, stabilized frames with less holes can be generated from the optimal, adaptive camera path that minimizes a temporal total variation (TV). The proposed video stabilization method is suitable for enhancing the visual quality for various portable cameras and can be applied to robot vision, driving assistant systems, and visual surveillance systems. PMID:28208622
Temporal compressive imaging for video

NASA Astrophysics Data System (ADS)

Zhou, Qun; Zhang, Linxia; Ke, Jun

2018-01-01

In many situations, imagers are required to have higher imaging speed, such as gunpowder blasting analysis and observing high-speed biology phenomena. However, measuring high-speed video is a challenge to camera design, especially, in infrared spectrum. In this paper, we reconstruct a high-frame-rate video from compressive video measurements using temporal compressive imaging (TCI) with a temporal compression ratio T=8. This means that, 8 unique high-speed temporal frames will be obtained from a single compressive frame using a reconstruction algorithm. Equivalently, the video frame rates is increased by 8 times. Two methods, two-step iterative shrinkage/threshold (TwIST) algorithm and the Gaussian mixture model (GMM) method, are used for reconstruction. To reduce reconstruction time and memory usage, each frame of size 256×256 is divided into patches of size 8×8. The influence of different coded mask to reconstruction is discussed. The reconstruction qualities using TwIST and GMM are also compared.
Zero-block mode decision algorithm for H.264/AVC.

PubMed

Lee, Yu-Ming; Lin, Yinyi

2009-03-01

In the previous paper , we proposed a zero-block intermode decision algorithm for H.264 video coding based upon the number of zero-blocks of 4 x 4 DCT coefficients between the current macroblock and the co-located macroblock. The proposed algorithm can achieve significant improvement in computation, but the computation performance is limited for high bit-rate coding. To improve computation efficiency, in this paper, we suggest an enhanced zero-block decision algorithm, which uses an early zero-block detection method to compute the number of zero-blocks instead of direct DCT and quantization (DCT/Q) calculation and incorporates two adequate decision methods into semi-stationary and nonstationary regions of a video sequence. In addition, the zero-block decision algorithm is also applied to the intramode prediction in the P frame. The enhanced zero-block decision algorithm brings out a reduction of average 27% of total encoding time compared to the zero-block decision algorithm.
Using LabView for real-time monitoring and tracking of multiple biological objects

NASA Astrophysics Data System (ADS)

Nikolskyy, Aleksandr I.; Krasilenko, Vladimir G.; Bilynsky, Yosyp Y.; Starovier, Anzhelika

2017-04-01

Today real-time studying and tracking of movement dynamics of various biological objects is important and widely researched. Features of objects, conditions of their visualization and model parameters strongly influence the choice of optimal methods and algorithms for a specific task. Therefore, to automate the processes of adaptation of recognition tracking algorithms, several Labview project trackers are considered in the article. Projects allow changing templates for training and retraining the system quickly. They adapt to the speed of objects and statistical characteristics of noise in images. New functions of comparison of images or their features, descriptors and pre-processing methods will be discussed. The experiments carried out to test the trackers on real video files will be presented and analyzed.
Recognising safety critical events: can automatic video processing improve naturalistic data analyses?

PubMed

Dozza, Marco; González, Nieves Pañeda

2013-11-01

New trends in research on traffic accidents include Naturalistic Driving Studies (NDS). NDS are based on large scale data collection of driver, vehicle, and environment information in real world. NDS data sets have proven to be extremely valuable for the analysis of safety critical events such as crashes and near crashes. However, finding safety critical events in NDS data is often difficult and time consuming. Safety critical events are currently identified using kinematic triggers, for instance searching for deceleration below a certain threshold signifying harsh braking. Due to the low sensitivity and specificity of this filtering procedure, manual review of video data is currently necessary to decide whether the events identified by the triggers are actually safety critical. Such reviewing procedure is based on subjective decisions, is expensive and time consuming, and often tedious for the analysts. Furthermore, since NDS data is exponentially growing over time, this reviewing procedure may not be viable anymore in the very near future. This study tested the hypothesis that automatic processing of driver video information could increase the correct classification of safety critical events from kinematic triggers in naturalistic driving data. Review of about 400 video sequences recorded from the events, collected by 100 Volvo cars in the euroFOT project, suggested that drivers' individual reaction may be the key to recognize safety critical events. In fact, whether an event is safety critical or not often depends on the individual driver. A few algorithms, able to automatically classify driver reaction from video data, have been compared. The results presented in this paper show that the state of the art subjective review procedures to identify safety critical events from NDS can benefit from automated objective video processing. In addition, this paper discusses the major challenges in making such video analysis viable for future NDS and new potential applications for NDS video processing. As new NDS such as SHRP2 are now providing the equivalent of five years of one vehicle data each day, the development of new methods, such as the one proposed in this paper, seems necessary to guarantee that these data can actually be analysed. Copyright © 2013 Elsevier Ltd. All rights reserved.
Feasibility of video codec algorithms for software-only playback

NASA Astrophysics Data System (ADS)

Rodriguez, Arturo A.; Morse, Ken

1994-05-01

Software-only video codecs can provide good playback performance in desktop computers with a 486 or 68040 CPU running at 33 MHz without special hardware assistance. Typically, playback of compressed video can be categorized into three tasks: the actual decoding of the video stream, color conversion, and the transfer of decoded video data from system RAM to video RAM. By current standards, good playback performance is the decoding and display of video streams of 320 by 240 (or larger) compressed frames at 15 (or greater) frames-per- second. Software-only video codecs have evolved by modifying and tailoring existing compression methodologies to suit video playback in desktop computers. In this paper we examine the characteristics used to evaluate software-only video codec algorithms, namely: image fidelity (i.e., image quality), bandwidth (i.e., compression) ease-of-decoding (i.e., playback performance), memory consumption, compression to decompression asymmetry, scalability, and delay. We discuss the tradeoffs among these variables and the compromises that can be made to achieve low numerical complexity for software-only playback. Frame- differencing approaches are described since software-only video codecs typically employ them to enhance playback performance. To complement other papers that appear in this session of the Proceedings, we review methods derived from binary pattern image coding since these methods are amenable for software-only playback. In particular, we introduce a novel approach called pixel distribution image coding.
Semantic-based surveillance video retrieval.

PubMed

Hu, Weiming; Xie, Dan; Fu, Zhouyu; Zeng, Wenrong; Maybank, Steve

2007-04-01

Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using the spatial and temporal information, to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries including queries by keywords, multiple object queries, and queries by sketch. For multiple object queries, succession and simultaneity restrictions, together with depth and breadth first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested in a crowded traffic scene.
Unattended real-time re-establishment of visibility in high dynamic range video and stills

NASA Astrophysics Data System (ADS)

Abidi, B.

2014-05-01

We describe a portable unattended persistent surveillance system that corrects for harsh illumination conditions, where bright sun light creates mixed contrast effects, i.e., heavy shadows and washouts. These effects result in high dynamic range scenes, where illuminance can vary from few luxes to a 6 figure value. When using regular monitors and cameras, such wide span of illuminations can only be visualized if the actual range of values is compressed, leading to the creation of saturated and/or dark noisy areas and a loss of information in these areas. Images containing extreme mixed contrast cannot be fully enhanced from a single exposure, simply because all information is not present in the original data. The active intervention in the acquisition process is required. A software package, capable of integrating multiple types of COTS and custom cameras, ranging from Unmanned Aerial Systems (UAS) data links to digital single-lens reflex cameras (DSLR), is described. Hardware and software are integrated via a novel smart data acquisition algorithm, which communicates to the camera the parameters that would maximize information content in the final processed scene. A fusion mechanism is then applied to the smartly acquired data, resulting in an enhanced scene where information in both dark and bright areas is revealed. Multi-threading and parallel processing are exploited to produce automatic real time full motion corrected video. A novel enhancement algorithm was also devised to process data from legacy and non-controllable cameras. The software accepts and processes pre-recorded sequences and stills, enhances visible, night vision, and Infrared data, and successfully applies to night time and dark scenes. Various user options are available, integrating custom functionalities of the application into intuitive and easy to use graphical interfaces. The ensuing increase in visibility in surveillance video and intelligence imagery will expand the performance and timely decision making of the human analyst, as well as that of unmanned systems performing automatic data exploitation, such as target detection and identification.
An adaptive enhancement algorithm for infrared video based on modified k-means clustering

NASA Astrophysics Data System (ADS)

Zhang, Linze; Wang, Jingqi; Wu, Wen

2016-09-01

In this paper, we have proposed a video enhancement algorithm to improve the output video of the infrared camera. Sometimes the video obtained by infrared camera is very dark since there is no clear target. In this case, infrared video should be divided into frame images by frame extraction, in order to carry out the image enhancement. For the first frame image, which can be divided into k sub images by using K-means clustering according to the gray interval it occupies before k sub images' histogram equalization according to the amount of information per sub image, we used a method to solve a problem that final cluster centers close to each other in some cases; and for the other frame images, their initial cluster centers can be determined by the final clustering centers of the previous ones, and the histogram equalization of each sub image will be carried out after image segmentation based on K-means clustering. The histogram equalization can make the gray value of the image to the whole gray level, and the gray level of each sub image is determined by the ratio of pixels to a frame image. Experimental results show that this algorithm can improve the contrast of infrared video where night target is not obvious which lead to a dim scene, and reduce the negative effect given by the overexposed pixels adaptively in a certain range.
A Novel Artificial Intelligence System for Endotracheal Intubation.

PubMed

Carlson, Jestin N; Das, Samarjit; De la Torre, Fernando; Frisch, Adam; Guyette, Francis X; Hodgins, Jessica K; Yealy, Donald M

2016-01-01

Adequate visualization of the glottic opening is a key factor to successful endotracheal intubation (ETI); however, few objective tools exist to help guide providers' ETI attempts toward the glottic opening in real-time. Machine learning/artificial intelligence has helped to automate the detection of other visual structures but its utility with ETI is unknown. We sought to test the accuracy of various computer algorithms in identifying the glottic opening, creating a tool that could aid successful intubation. We collected a convenience sample of providers who each performed ETI 10 times on a mannequin using a video laryngoscope (C-MAC, Karl Storz Corp, Tuttlingen, Germany). We recorded each attempt and reviewed one-second time intervals for the presence or absence of the glottic opening. Four different machine learning/artificial intelligence algorithms analyzed each attempt and time point: k-nearest neighbor (KNN), support vector machine (SVM), decision trees, and neural networks (NN). We used half of the videos to train the algorithms and the second half to test the accuracy, sensitivity, and specificity of each algorithm. We enrolled seven providers, three Emergency Medicine attendings, and four paramedic students. From the 70 total recorded laryngoscopic video attempts, we created 2,465 time intervals. The algorithms had the following sensitivity and specificity for detecting the glottic opening: KNN (70%, 90%), SVM (70%, 90%), decision trees (68%, 80%), and NN (72%, 78%). Initial efforts at computer algorithms using artificial intelligence are able to identify the glottic opening with over 80% accuracy. With further refinements, video laryngoscopy has the potential to provide real-time, direction feedback to the provider to help guide successful ETI.
Content-based video retrieval by example video clip

NASA Astrophysics Data System (ADS)

Dimitrova, Nevenka; Abdel-Mottaleb, Mohamed

1997-01-01

This paper presents a novel approach for video retrieval from a large archive of MPEG or Motion JPEG compressed video clips. We introduce a retrieval algorithm that takes a video clip as a query and searches the database for clips with similar contents. Video clips are characterized by a sequence of representative frame signatures, which are constructed from DC coefficients and motion information (`DC+M' signatures). The similarity between two video clips is determined by using their respective signatures. This method facilitates retrieval of clips for the purpose of video editing, broadcast news retrieval, or copyright violation detection.
Programmable architecture for pixel level processing tasks in lightweight strapdown IR seekers

NASA Astrophysics Data System (ADS)

Coates, James L.

1993-06-01

Typical processing tasks associated with missile IR seeker applications are described, and a straw man suite of algorithms is presented. A fully programmable multiprocessor architecture is realized on a multimedia video processor (MVP) developed by Texas Instruments. The MVP combines the elements of RISC, floating point, advanced DSPs, graphics processors, display and acquisition control, RAM, and external memory. Front end pixel level tasks typical of missile interceptor applications, operating on 256 x 256 sensor imagery, can be processed at frame rates exceeding 100 Hz in a single MVP chip.
Multi-frame image processing with panning cameras and moving subjects

NASA Astrophysics Data System (ADS)

Paolini, Aaron; Humphrey, John; Curt, Petersen; Kelmelis, Eric

2014-06-01

Imaging scenarios commonly involve erratic, unpredictable camera behavior or subjects that are prone to movement, complicating multi-frame image processing techniques. To address these issues, we developed three techniques that can be applied to multi-frame image processing algorithms in order to mitigate the adverse effects observed when cameras are panning or subjects within the scene are moving. We provide a detailed overview of the techniques and discuss the applicability of each to various movement types. In addition to this, we evaluated algorithm efficacy with demonstrated benefits using field test video, which has been processed using our commercially available surveillance product. Our results show that algorithm efficacy is significantly improved in common scenarios, expanding our software's operational scope. Our methods introduce little computational burden, enabling their use in real-time and low-power solutions, and are appropriate for long observation periods. Our test cases focus on imaging through turbulence, a common use case for multi-frame techniques. We present results of a field study designed to test the efficacy of these techniques under expanded use cases.
Automated video-based detection of nocturnal convulsive seizures in a residential care setting.

PubMed

Geertsema, Evelien E; Thijs, Roland D; Gutter, Therese; Vledder, Ben; Arends, Johan B; Leijten, Frans S; Visser, Gerhard H; Kalitzin, Stiliyan N

2018-06-01

People with epilepsy need assistance and are at risk of sudden death when having convulsive seizures (CS). Automated real-time seizure detection systems can help alert caregivers, but wearable sensors are not always tolerated. We determined algorithm settings and investigated detection performance of a video algorithm to detect CS in a residential care setting. The algorithm calculates power in the 2-6 Hz range relative to 0.5-12.5 Hz range in group velocity signals derived from video-sequence optical flow. A detection threshold was found using a training set consisting of video-electroencephalogaphy (EEG) recordings of 72 CS. A test set consisting of 24 full nights of 12 new subjects in residential care and additional recordings of 50 CS selected randomly was used to estimate performance. All data were analyzed retrospectively. The start and end of CS (generalized clonic and tonic-clonic seizures) and other seizures considered desirable to detect (long generalized tonic, hyperkinetic, and other major seizures) were annotated. The detection threshold was set to the value that obtained 97% sensitivity in the training set. Sensitivity, latency, and false detection rate (FDR) per night were calculated in the test set. A seizure was detected when the algorithm output exceeded the threshold continuously for 2 seconds. With the detection threshold determined in the training set, all CS were detected in the test set (100% sensitivity). Latency was ≤10 seconds in 78% of detections. Three/five hyperkinetic and 6/9 other major seizures were detected. Median FDR was 0.78 per night and no false detections occurred in 9/24 nights. Our algorithm could improve safety unobtrusively by automated real-time detection of CS in video registrations, with an acceptable latency and FDR. The algorithm can also detect some other motor seizures requiring assistance. © 2018 The Authors. Epilepsia published by Wiley Periodicals, Inc. on behalf of International League Against Epilepsy.
Robust media processing on programmable power-constrained systems

NASA Astrophysics Data System (ADS)

McVeigh, Jeff

2005-03-01

To achieve consumer-level quality, media systems must process continuous streams of audio and video data while maintaining exacting tolerances on sampling rate, jitter, synchronization, and latency. While it is relatively straightforward to design fixed-function hardware implementations to satisfy worst-case conditions, there is a growing trend to utilize programmable multi-tasking solutions for media applications. The flexibility of these systems enables support for multiple current and future media formats, which can reduce design costs and time-to-market. This paper provides practical engineering solutions to achieve robust media processing on such systems, with specific attention given to power-constrained platforms. The techniques covered in this article utilize the fundamental concepts of algorithm and software optimization, software/hardware partitioning, stream buffering, hierarchical prioritization, and system resource and power management. A novel enhancement to dynamically adjust processor voltage and frequency based on buffer fullness to reduce system power consumption is examined in detail. The application of these techniques is provided in a case study of a portable video player implementation based on a general-purpose processor running a non real-time operating system that achieves robust playback of synchronized H.264 video and MP3 audio from local storage and streaming over 802.11.

Robust skin color-based moving object detection for video surveillance

NASA Astrophysics Data System (ADS)

Kaliraj, Kalirajan; Manimaran, Sudha

2016-07-01

Robust skin color-based moving object detection for video surveillance is proposed. The objective of the proposed algorithm is to detect and track the target under complex situations. The proposed framework comprises four stages, which include preprocessing, skin color-based feature detection, feature classification, and target localization and tracking. In the preprocessing stage, the input image frame is smoothed using averaging filter and transformed into YCrCb color space. In skin color detection, skin color regions are detected using Otsu's method of global thresholding. In the feature classification, histograms of both skin and nonskin regions are constructed and the features are classified into foregrounds and backgrounds based on Bayesian skin color classifier. The foreground skin regions are localized by a connected component labeling process. Finally, the localized foreground skin regions are confirmed as a target by verifying the region properties, and nontarget regions are rejected using the Euler method. At last, the target is tracked by enclosing the bounding box around the target region in all video frames. The experiment was conducted on various publicly available data sets and the performance was evaluated with baseline methods. It evidently shows that the proposed algorithm works well against slowly varying illumination, target rotations, scaling, fast, and abrupt motion changes.
Binary video codec for data reduction in wireless visual sensor networks

NASA Astrophysics Data System (ADS)

Khursheed, Khursheed; Ahmad, Naeem; Imran, Muhammad; O'Nils, Mattias

2013-02-01

Wireless Visual Sensor Networks (WVSN) is formed by deploying many Visual Sensor Nodes (VSNs) in the field. Typical applications of WVSN include environmental monitoring, health care, industrial process monitoring, stadium/airports monitoring for security reasons and many more. The energy budget in the outdoor applications of WVSN is limited to the batteries and the frequent replacement of batteries is usually not desirable. So the processing as well as the communication energy consumption of the VSN needs to be optimized in such a way that the network remains functional for longer duration. The images captured by VSN contain huge amount of data and require efficient computational resources for processing the images and wide communication bandwidth for the transmission of the results. Image processing algorithms must be designed and developed in such a way that they are computationally less complex and must provide high compression rate. For some applications of WVSN, the captured images can be segmented into bi-level images and hence bi-level image coding methods will efficiently reduce the information amount in these segmented images. But the compression rate of the bi-level image coding methods is limited by the underlined compression algorithm. Hence there is a need for designing other intelligent and efficient algorithms which are computationally less complex and provide better compression rate than that of bi-level image coding methods. Change coding is one such algorithm which is computationally less complex (require only exclusive OR operations) and provide better compression efficiency compared to image coding but it is effective for applications having slight changes between adjacent frames of the video. The detection and coding of the Region of Interest (ROIs) in the change frame efficiently reduce the information amount in the change frame. But, if the number of objects in the change frames is higher than a certain level then the compression efficiency of both the change coding and ROI coding becomes worse than that of image coding. This paper explores the compression efficiency of the Binary Video Codec (BVC) for the data reduction in WVSN. We proposed to implement all the three compression techniques i.e. image coding, change coding and ROI coding at the VSN and then select the smallest bit stream among the results of the three compression techniques. In this way the compression performance of the BVC will never become worse than that of image coding. We concluded that the compression efficiency of BVC is always better than that of change coding and is always better than or equal that of ROI coding and image coding.
Side-information-dependent correlation channel estimation in hash-based distributed video coding.

PubMed

Deligiannis, Nikos; Barbarien, Joeri; Jacobs, Marc; Munteanu, Adrian; Skodras, Athanassios; Schelkens, Peter

2012-04-01

In the context of low-cost video encoding, distributed video coding (DVC) has recently emerged as a potential candidate for uplink-oriented applications. This paper builds on a concept of correlation channel (CC) modeling, which expresses the correlation noise as being statistically dependent on the side information (SI). Compared with classical side-information-independent (SII) noise modeling adopted in current DVC solutions, it is theoretically proven that side-information-dependent (SID) modeling improves the Wyner-Ziv coding performance. Anchored in this finding, this paper proposes a novel algorithm for online estimation of the SID CC parameters based on already decoded information. The proposed algorithm enables bit-plane-by-bit-plane successive refinement of the channel estimation leading to progressively improved accuracy. Additionally, the proposed algorithm is included in a novel DVC architecture that employs a competitive hash-based motion estimation technique to generate high-quality SI at the decoder. Experimental results corroborate our theoretical gains and validate the accuracy of the channel estimation algorithm. The performance assessment of the proposed architecture shows remarkable and consistent coding gains over a germane group of state-of-the-art distributed and standard video codecs, even under strenuous conditions, i.e., large groups of pictures and highly irregular motion content.
A distributed multichannel demand-adaptive P2P VoD system with optimized caching and neighbor-selection

NASA Astrophysics Data System (ADS)

Zhang, Hao; Chen, Minghua; Parekh, Abhay; Ramchandran, Kannan

2011-09-01

We design a distributed multi-channel P2P Video-on-Demand (VoD) system using "plug-and-play" helpers. Helpers are heterogenous "micro-servers" with limited storage, bandwidth and number of users they can serve simultaneously. Our proposed system has the following salient features: (1) it jointly optimizes over helper-user connection topology, video storage distribution and transmission bandwidth allocation; (2) it minimizes server load, and is adaptable to varying supply and demand patterns across multiple video channels irrespective of video popularity; and (3) it is fully distributed and requires little or no maintenance overhead. The combinatorial nature of the problem and the system demand for distributed algorithms makes the problem uniquely challenging. By utilizing Lagrangian decomposition and Markov chain approximation based arguments, we address this challenge by designing two distributed algorithms running in tandem: a primal-dual storage and bandwidth allocation algorithm and a "soft-worst-neighbor-choking" topology-building algorithm. Our scheme provably converges to a near-optimal solution, and is easy to implement in practice. Packet-level simulation results show that the proposed scheme achieves minimum sever load under highly heterogeneous combinations of supply and demand patterns, and is robust to system dynamics of user/helper churn, user/helper asynchrony, and random delays in the network.
Computational Performance of Intel MIC, Sandy Bridge, and GPU Architectures: Implementation of a 1D c++/OpenMP Electrostatic Particle-In-Cell Code

DTIC Science & Technology

2014-05-01

fusion, space and astrophysical plasmas, but still the general picture can be presented quite well with the fluid approach [6, 7]. The microscopic...purpose computing CPU for algorithms where processing of large blocks of data is done in parallel. The reason for that is the GPU’s highly effective...parallel structure. Most of the image and video processing computations involve heavy matrix and vector op- erations over large amounts of data and
Generalizing the Nonlocal-Means to Super-Resolution Reconstruction

DTIC Science & Technology

2008-12-12

Image Process., vol. 5, no. 6, pp. 996–1011, Jun. 1996. [7] A. J. Patti, M. I. Sezan, and M. A. Tekalp, “ Superresolution video reconstruction with...computationally efficient image superresolution algorithm,” IEEE Trans. Image Process., vol. 10, no. 4, pp. 573–583, Apr. 2001. [13] M. Elad and Y...pp. 21–36, May 2003. [18] S. Farsiu, D. Robinson, M. Elad, and P. Milanfar, “Robust shift and add approach to superresolution ,” in Proc. SPIE Conf
A results-based process for evaluation of diverse visual analytics tools

NASA Astrophysics Data System (ADS)

Rubin, Gary; Berger, David H.

2013-05-01

With the pervasiveness of still and full-motion imagery in commercial and military applications, the need to ingest and analyze these media has grown rapidly in recent years. Additionally, video hosting and live camera websites provide a near real-time view of our changing world with unprecedented spatial coverage. To take advantage of these controlled and crowd-sourced opportunities, sophisticated visual analytics (VA) tools are required to accurately and efficiently convert raw imagery into usable information. Whether investing in VA products or evaluating algorithms for potential development, it is important for stakeholders to understand the capabilities and limitations of visual analytics tools. Visual analytics algorithms are being applied to problems related to Intelligence, Surveillance, and Reconnaissance (ISR), facility security, and public safety monitoring, to name a few. The diversity of requirements means that a onesize- fits-all approach to performance assessment will not work. We present a process for evaluating the efficacy of algorithms in real-world conditions, thereby allowing users and developers of video analytics software to understand software capabilities and identify potential shortcomings. The results-based approach described in this paper uses an analysis of end-user requirements and Concept of Operations (CONOPS) to define Measures of Effectiveness (MOEs), test data requirements, and evaluation strategies. We define metrics that individually do not fully characterize a system, but when used together, are a powerful way to reveal both strengths and weaknesses. We provide examples of data products, such as heatmaps, performance maps, detection timelines, and rank-based probability-of-detection curves.
Demosaicking for full motion video 9-band SWIR sensor

NASA Astrophysics Data System (ADS)

Kanaev, Andrey V.; Rawhouser, Marjorie; Kutteruf, Mary R.; Yetzbacher, Michael K.; DePrenger, Michael J.; Novak, Kyle M.; Miller, Corey A.; Miller, Christopher W.

2014-05-01

Short wave infrared (SWIR) spectral imaging systems are vital for Intelligence, Surveillance, and Reconnaissance (ISR) applications because of their abilities to autonomously detect targets and classify materials. Typically the spectral imagers are incapable of providing Full Motion Video (FMV) because of their reliance on line scanning. We enable FMV capability for a SWIR multi-spectral camera by creating a repeating pattern of 3x3 spectral filters on a staring focal plane array (FPA). In this paper we present the imagery from an FMV SWIR camera with nine discrete bands and discuss image processing algorithms necessary for its operation. The main task of image processing in this case is demosaicking of the spectral bands i.e. reconstructing full spectral images with original FPA resolution from spatially subsampled and incomplete spectral data acquired with the choice of filter array pattern. To the best of author's knowledge, the demosaicking algorithms for nine or more equally sampled bands have not been reported before. Moreover all existing algorithms developed for demosaicking visible color filter arrays with less than nine colors assume either certain relationship between the visible colors, which are not valid for SWIR imaging, or presence of one color band with higher sampling rate compared to the rest of the bands, which does not conform to our spectral filter pattern. We will discuss and present results for two novel approaches to demosaicking: interpolation using multi-band edge information and application of multi-frame super-resolution to a single frame resolution enhancement of multi-spectral spatially multiplexed images.
Image processing for improved eye-tracking accuracy

NASA Technical Reports Server (NTRS)

Mulligan, J. B.; Watson, A. B. (Principal Investigator)

1997-01-01

Video cameras provide a simple, noninvasive method for monitoring a subject's eye movements. An important concept is that of the resolution of the system, which is the smallest eye movement that can be reliably detected. While hardware systems are available that estimate direction of gaze in real-time from a video image of the pupil, such systems must limit image processing to attain real-time performance and are limited to a resolution of about 10 arc minutes. Two ways to improve resolution are discussed. The first is to improve the image processing algorithms that are used to derive an estimate. Off-line analysis of the data can improve resolution by at least one order of magnitude for images of the pupil. A second avenue by which to improve resolution is to increase the optical gain of the imaging setup (i.e., the amount of image motion produced by a given eye rotation). Ophthalmoscopic imaging of retinal blood vessels provides increased optical gain and improved immunity to small head movements but requires a highly sensitive camera. The large number of images involved in a typical experiment imposes great demands on the storage, handling, and processing of data. A major bottleneck had been the real-time digitization and storage of large amounts of video imagery, but recent developments in video compression hardware have made this problem tractable at a reasonable cost. Images of both the retina and the pupil can be analyzed successfully using a basic toolbox of image-processing routines (filtering, correlation, thresholding, etc.), which are, for the most part, well suited to implementation on vectorizing supercomputers.
Compact full-motion video hyperspectral cameras: development, image processing, and applications

NASA Astrophysics Data System (ADS)

Kanaev, A. V.

2015-10-01

Emergence of spectral pixel-level color filters has enabled development of hyper-spectral Full Motion Video (FMV) sensors operating in visible (EO) and infrared (IR) wavelengths. The new class of hyper-spectral cameras opens broad possibilities of its utilization for military and industry purposes. Indeed, such cameras are able to classify materials as well as detect and track spectral signatures continuously in real time while simultaneously providing an operator the benefit of enhanced-discrimination-color video. Supporting these extensive capabilities requires significant computational processing of the collected spectral data. In general, two processing streams are envisioned for mosaic array cameras. The first is spectral computation that provides essential spectral content analysis e.g. detection or classification. The second is presentation of the video to an operator that can offer the best display of the content depending on the performed task e.g. providing spatial resolution enhancement or color coding of the spectral analysis. These processing streams can be executed in parallel or they can utilize each other's results. The spectral analysis algorithms have been developed extensively, however demosaicking of more than three equally-sampled spectral bands has been explored scarcely. We present unique approach to demosaicking based on multi-band super-resolution and show the trade-off between spatial resolution and spectral content. Using imagery collected with developed 9-band SWIR camera we demonstrate several of its concepts of operation including detection and tracking. We also compare the demosaicking results to the results of multi-frame super-resolution as well as to the combined multi-frame and multiband processing.
Vision-based measurement for rotational speed by improving Lucas-Kanade template tracking algorithm.

PubMed

Guo, Jie; Zhu, Chang'an; Lu, Siliang; Zhang, Dashan; Zhang, Chunyu

2016-09-01

Rotational angle and speed are important parameters for condition monitoring and fault diagnosis of rotating machineries, and their measurement is useful in precision machining and early warning of faults. In this study, a novel vision-based measurement algorithm is proposed to complete this task. A high-speed camera is first used to capture the video of the rotational object. To extract the rotational angle, the template-based Lucas-Kanade algorithm is introduced to complete motion tracking by aligning the template image in the video sequence. Given the special case of nonplanar surface of the cylinder object, a nonlinear transformation is designed for modeling the rotation tracking. In spite of the unconventional and complex form, the transformation can realize angle extraction concisely with only one parameter. A simulation is then conducted to verify the tracking effect, and a practical tracking strategy is further proposed to track consecutively the video sequence. Based on the proposed algorithm, instantaneous rotational speed (IRS) can be measured accurately and efficiently. Finally, the effectiveness of the proposed algorithm is verified on a brushless direct current motor test rig through the comparison with results obtained by the microphone. Experimental results demonstrate that the proposed algorithm can extract accurately rotational angles and can measure IRS with the advantage of noncontact and effectiveness.
An Interactive Assessment Framework for Visual Engagement: Statistical Analysis of a TEDx Video

ERIC Educational Resources Information Center

Farhan, Muhammad; Aslam, Muhammad

2017-01-01

This study aims to assess the visual engagement of the video lectures. This analysis can be useful for the presenter and student to find out the overall visual attention of the videos. For this purpose, a new algorithm and data collection module are developed. Videos can be transformed into a dataset with the help of data collection module. The…
Real-time detection of small and dim moving objects in IR video sequences using a robust background estimator and a noise-adaptive double thresholding

NASA Astrophysics Data System (ADS)

Zingoni, Andrea; Diani, Marco; Corsini, Giovanni

2016-10-01

We developed an algorithm for automatically detecting small and poorly contrasted (dim) moving objects in real-time, within video sequences acquired through a steady infrared camera. The algorithm is suitable for different situations since it is independent of the background characteristics and of changes in illumination. Unlike other solutions, small objects of any size (up to single-pixel), either hotter or colder than the background, can be successfully detected. The algorithm is based on accurately estimating the background at the pixel level and then rejecting it. A novel approach permits background estimation to be robust to changes in the scene illumination and to noise, and not to be biased by the transit of moving objects. Care was taken in avoiding computationally costly procedures, in order to ensure the real-time performance even using low-cost hardware. The algorithm was tested on a dataset of 12 video sequences acquired in different conditions, providing promising results in terms of detection rate and false alarm rate, independently of background and objects characteristics. In addition, the detection map was produced frame by frame in real-time, using cheap commercial hardware. The algorithm is particularly suitable for applications in the fields of video-surveillance and computer vision. Its reliability and speed permit it to be used also in critical situations, like in search and rescue, defence and disaster monitoring.
Software Accelerates Computing Time for Complex Math

NASA Technical Reports Server (NTRS)

2014-01-01

Ames Research Center awarded Newark, Delaware-based EM Photonics Inc. SBIR funding to utilize graphic processing unit (GPU) technology- traditionally used for computer video games-to develop high-computing software called CULA. The software gives users the ability to run complex algorithms on personal computers with greater speed. As a result of the NASA collaboration, the number of employees at the company has increased 10 percent.
Robust perception algorithms for road and track autonomous following

NASA Astrophysics Data System (ADS)

Marion, Vincent; Lecointe, Olivier; Lewandowski, Cecile; Morillon, Joel G.; Aufrere, Romuald; Marcotegui, Beatrix; Chapuis, Roland; Beucher, Serge

2004-09-01

The French Military Robotic Study Program (introduced in Aerosense 2003), sponsored by the French Defense Procurement Agency and managed by Thales Airborne Systems as the prime contractor, focuses on about 15 robotic themes, which can provide an immediate "operational add-on value." The paper details the "road and track following" theme (named AUT2), which main purpose was to develop a vision based sub-system to automatically detect roadsides of an extended range of roads and tracks suitable to military missions. To achieve the goal, efforts focused on three main areas: (1) Improvement of images quality at algorithms inputs, thanks to the selection of adapted video cameras, and the development of a THALES patented algorithm: it removes in real time most of the disturbing shadows in images taken in natural environments, enhances contrast and lowers reflection effect due to films of water. (2) Selection and improvement of two complementary algorithms (one is segment oriented, the other region based) (3) Development of a fusion process between both algorithms, which feeds in real time a road model with the best available data. Each previous step has been developed so that the global perception process is reliable and safe: as an example, the process continuously evaluates itself and outputs confidence criteria qualifying roadside detection. The paper presents the processes in details, and the results got from passed military acceptance tests, which trigger the next step: autonomous track following (named AUT3).
Text extraction from images in the wild using the Viola-Jones algorithm

NASA Astrophysics Data System (ADS)

Saabna, Raid M.; Zingboim, Eran

2018-04-01

Text Localization and extraction is an important issue in modern applications of computer vision. Applications such as reading and translating texts in the wild or from videos are among the many applications that can benefit results of this field. In this work, we adopt the well-known Viola-Jones algorithm to enable text extraction and localization from images in the wild. The Viola-Jones is an efficient, and a fast image-processing algorithm originally used for face detection. Based on some resemblance between text and face detection tasks in the wild, we have modified the viola-jones to detect regions of interest where text may be localized. In the proposed approach, some modification to the HAAR like features and a semi-automatic process of data set generating and manipulation were presented to train the algorithm. A process of sliding windows with different sizes have been used to scan the image for individual letters and letter clusters existence. A post processing step is used in order to combine the detected letters into words and to remove false positives. The novelty of the presented approach is using the strengths of a modified Viola-Jones algorithm to identify many different objects representing different letters and clusters of similar letters and later combine them into words of varying lengths. Impressive results were obtained on the ICDAR contest data sets.
Video flow active control by means of adaptive shifted foveal geometries

NASA Astrophysics Data System (ADS)

Urdiales, Cristina; Rodriguez, Juan A.; Bandera, Antonio J.; Sandoval, Francisco

2000-10-01

This paper presents a control mechanism for video transmission that relies on transmitting non-uniform resolution images depending on the delay of the communication channel. These images are built in an active way to keep the areas of interest of the image at the highest resolution available. In order to shift the area of high resolution over the image and to achieve a data structure easy to process by using conventional algorithms, a shifted fovea multi resolution geometry of adaptive size is used. Besides, if delays are nevertheless too high, the different areas of resolution of the image can be transmitted at different rates. A functional system has been developed for corridor surveillance with static cameras. Tests with real video images have proven that the method allows an almost constant rate of images per second as long as the channel is not collapsed.
Optimization of image processing algorithms on mobile platforms

NASA Astrophysics Data System (ADS)

Poudel, Pramod; Shirvaikar, Mukul

2011-03-01

This work presents a technique to optimize popular image processing algorithms on mobile platforms such as cell phones, net-books and personal digital assistants (PDAs). The increasing demand for video applications like context-aware computing on mobile embedded systems requires the use of computationally intensive image processing algorithms. The system engineer has a mandate to optimize them so as to meet real-time deadlines. A methodology to take advantage of the asymmetric dual-core processor, which includes an ARM and a DSP core supported by shared memory, is presented with implementation details. The target platform chosen is the popular OMAP 3530 processor for embedded media systems. It has an asymmetric dual-core architecture with an ARM Cortex-A8 and a TMS320C64x Digital Signal Processor (DSP). The development platform was the BeagleBoard with 256 MB of NAND RAM and 256 MB SDRAM memory. The basic image correlation algorithm is chosen for benchmarking as it finds widespread application for various template matching tasks such as face-recognition. The basic algorithm prototypes conform to OpenCV, a popular computer vision library. OpenCV algorithms can be easily ported to the ARM core which runs a popular operating system such as Linux or Windows CE. However, the DSP is architecturally more efficient at handling DFT algorithms. The algorithms are tested on a variety of images and performance results are presented measuring the speedup obtained due to dual-core implementation. A major advantage of this approach is that it allows the ARM processor to perform important real-time tasks, while the DSP addresses performance-hungry algorithms.
Algorithm for Automatic Behavior Quantification of Laboratory Mice Using High-Frame-Rate Videos

NASA Astrophysics Data System (ADS)

Nie, Yuman; Takaki, Takeshi; Ishii, Idaku; Matsuda, Hiroshi

In this paper, we propose an algorithm for automatic behavior quantification in laboratory mice to quantify several model behaviors. The algorithm can detect repetitive motions of the fore- or hind-limbs at several or dozens of hertz, which are too rapid for the naked eye, from high-frame-rate video images. Multiple repetitive motions can always be identified from periodic frame-differential image features in four segmented regions — the head, left side, right side, and tail. Even when a mouse changes its posture and orientation relative to the camera, these features can still be extracted from the shift- and orientation-invariant shape of the mouse silhouette by using the polar coordinate system and adjusting the angle coordinate according to the head and tail positions. The effectiveness of the algorithm is evaluated by analyzing long-term 240-fps videos of four laboratory mice for six typical model behaviors: moving, rearing, immobility, head grooming, left-side scratching, and right-side scratching. The time durations for the model behaviors determined by the algorithm have detection/correction ratios greater than 80% for all the model behaviors. This shows good quantification results for actual animal testing.
Falling-incident detection and throughput enhancement in a multi-camera video-surveillance system.

PubMed

Shieh, Wann-Yun; Huang, Ju-Chin

2012-09-01

For most elderly, unpredictable falling incidents may occur at the corner of stairs or a long corridor due to body frailty. If we delay to rescue a falling elder who is likely fainting, more serious consequent injury may occur. Traditional secure or video surveillance systems need caregivers to monitor a centralized screen continuously, or need an elder to wear sensors to detect falling incidents, which explicitly waste much human power or cause inconvenience for elders. In this paper, we propose an automatic falling-detection algorithm and implement this algorithm in a multi-camera video surveillance system. The algorithm uses each camera to fetch the images from the regions required to be monitored. It then uses a falling-pattern recognition algorithm to determine if a falling incident has occurred. If yes, system will send short messages to someone needs to be noticed. The algorithm has been implemented in a DSP-based hardware acceleration board for functionality proof. Simulation results show that the accuracy of falling detection can achieve at least 90% and the throughput of a four-camera surveillance system can be improved by about 2.1 times. Copyright © 2011 IPEM. Published by Elsevier Ltd. All rights reserved.

Transcoding method from H.264/AVC to high efficiency video coding based on similarity of intraprediction, interprediction, and motion vector

NASA Astrophysics Data System (ADS)

Liu, Mei-Feng; Zhong, Guo-Yun; He, Xiao-Hai; Qing, Lin-Bo

2016-09-01

Currently, most video resources on line are encoded in the H.264/AVC format. More fluent video transmission can be obtained if these resources are encoded in the newest international video coding standard: high efficiency video coding (HEVC). In order to improve the video transmission and storage on line, a transcoding method from H.264/AVC to HEVC is proposed. In this transcoding algorithm, the coding information of intraprediction, interprediction, and motion vector (MV) in H.264/AVC video stream are used to accelerate the coding in HEVC. It is found through experiments that the region of interprediction in HEVC overlaps that in H.264/AVC. Therefore, the intraprediction for the region in HEVC, which is interpredicted in H.264/AVC, can be skipped to reduce coding complexity. Several macroblocks in H.264/AVC are combined into one PU in HEVC when the MV difference between two of the macroblocks in H.264/AVC is lower than a threshold. This method selects only one coding unit depth and one prediction unit (PU) mode to reduce the coding complexity. An MV interpolation method of combined PU in HEVC is proposed according to the areas and distances between the center of one macroblock in H.264/AVC and that of the PU in HEVC. The predicted MV accelerates the motion estimation for HEVC coding. The simulation results show that our proposed algorithm achieves significant coding time reduction with a little loss in bitrates distortion rate, compared to the existing transcoding algorithms and normal HEVC coding.
Global motion compensated visual attention-based video watermarking

NASA Astrophysics Data System (ADS)

Oakes, Matthew; Bhowmik, Deepayan; Abhayaratne, Charith

2016-11-01

Imperceptibility and robustness are two key but complementary requirements of any watermarking algorithm. Low-strength watermarking yields high imperceptibility but exhibits poor robustness. High-strength watermarking schemes achieve good robustness but often suffer from embedding distortions resulting in poor visual quality in host media. This paper proposes a unique video watermarking algorithm that offers a fine balance between imperceptibility and robustness using motion compensated wavelet-based visual attention model (VAM). The proposed VAM includes spatial cues for visual saliency as well as temporal cues. The spatial modeling uses the spatial wavelet coefficients while the temporal modeling accounts for both local and global motion to arrive at the spatiotemporal VAM for video. The model is then used to develop a video watermarking algorithm, where a two-level watermarking weighting parameter map is generated from the VAM saliency maps using the saliency model and data are embedded into the host image according to the visual attentiveness of each region. By avoiding higher strength watermarking in the visually attentive region, the resulting watermarked video achieves high perceived visual quality while preserving high robustness. The proposed VAM outperforms the state-of-the-art video visual attention methods in joint saliency detection and low computational complexity performance. For the same embedding distortion, the proposed visual attention-based watermarking achieves up to 39% (nonblind) and 22% (blind) improvement in robustness against H.264/AVC compression, compared to existing watermarking methodology that does not use the VAM. The proposed visual attention-based video watermarking results in visual quality similar to that of low-strength watermarking and a robustness similar to those of high-strength watermarking.
Dynamic Power-Saving Method for Wi-Fi Direct Based IoT Networks Considering Variable-Bit-Rate Video Traffic.

PubMed

Jin, Meihua; Jung, Ji-Young; Lee, Jung-Ryun

2016-10-12

With the arrival of the era of Internet of Things (IoT), Wi-Fi Direct is becoming an emerging wireless technology that allows one to communicate through a direct connection between the mobile devices anytime, anywhere. In Wi-Fi Direct-based IoT networks, all devices are categorized by group of owner (GO) and client. Since portability is emphasized in Wi-Fi Direct devices, it is essential to control the energy consumption of a device very efficiently. In order to avoid unnecessary power consumed by GO, Wi-Fi Direct standard defines two power-saving methods: Opportunistic and Notice of Absence (NoA) power-saving methods. In this paper, we suggest an algorithm to enhance the energy efficiency of Wi-Fi Direct power-saving, considering the characteristics of multimedia video traffic. Proposed algorithm utilizes the statistical distribution for the size of video frames and adjusts the lengths of awake intervals in a beacon interval dynamically. In addition, considering the inter-dependency among video frames, the proposed algorithm ensures that a video frame having high priority is transmitted with higher probability than other frames having low priority. Simulation results show that the proposed method outperforms the traditional NoA method in terms of average delay and energy efficiency.
Dynamic Power-Saving Method for Wi-Fi Direct Based IoT Networks Considering Variable-Bit-Rate Video Traffic

PubMed Central

Jin, Meihua; Jung, Ji-Young; Lee, Jung-Ryun

2016-01-01

With the arrival of the era of Internet of Things (IoT), Wi-Fi Direct is becoming an emerging wireless technology that allows one to communicate through a direct connection between the mobile devices anytime, anywhere. In Wi-Fi Direct-based IoT networks, all devices are categorized by group of owner (GO) and client. Since portability is emphasized in Wi-Fi Direct devices, it is essential to control the energy consumption of a device very efficiently. In order to avoid unnecessary power consumed by GO, Wi-Fi Direct standard defines two power-saving methods: Opportunistic and Notice of Absence (NoA) power-saving methods. In this paper, we suggest an algorithm to enhance the energy efficiency of Wi-Fi Direct power-saving, considering the characteristics of multimedia video traffic. Proposed algorithm utilizes the statistical distribution for the size of video frames and adjusts the lengths of awake intervals in a beacon interval dynamically. In addition, considering the inter-dependency among video frames, the proposed algorithm ensures that a video frame having high priority is transmitted with higher probability than other frames having low priority. Simulation results show that the proposed method outperforms the traditional NoA method in terms of average delay and energy efficiency. PMID:27754315
An evolutionary computation based algorithm for calculating solar differential rotation by automatic tracking of coronal bright points

NASA Astrophysics Data System (ADS)

Shahamatnia, Ehsan; Dorotovič, Ivan; Fonseca, Jose M.; Ribeiro, Rita A.

2016-03-01

Developing specialized software tools is essential to support studies of solar activity evolution. With new space missions such as Solar Dynamics Observatory (SDO), solar images are being produced in unprecedented volumes. To capitalize on that huge data availability, the scientific community needs a new generation of software tools for automatic and efficient data processing. In this paper a prototype of a modular framework for solar feature detection, characterization, and tracking is presented. To develop an efficient system capable of automatic solar feature tracking and measuring, a hybrid approach combining specialized image processing, evolutionary optimization, and soft computing algorithms is being followed. The specialized hybrid algorithm for tracking solar features allows automatic feature tracking while gathering characterization details about the tracked features. The hybrid algorithm takes advantages of the snake model, a specialized image processing algorithm widely used in applications such as boundary delineation, image segmentation, and object tracking. Further, it exploits the flexibility and efficiency of Particle Swarm Optimization (PSO), a stochastic population based optimization algorithm. PSO has been used successfully in a wide range of applications including combinatorial optimization, control, clustering, robotics, scheduling, and image processing and video analysis applications. The proposed tool, denoted PSO-Snake model, was already successfully tested in other works for tracking sunspots and coronal bright points. In this work, we discuss the application of the PSO-Snake algorithm for calculating the sidereal rotational angular velocity of the solar corona. To validate the results we compare them with published manual results performed by an expert.
Chest compression rate measurement from smartphone video.

PubMed

Engan, Kjersti; Hinna, Thomas; Ryen, Tom; Birkenes, Tonje S; Myklebust, Helge

2016-08-11

Out-of-hospital cardiac arrest is a life threatening situation where the first person performing cardiopulmonary resuscitation (CPR) most often is a bystander without medical training. Some existing smartphone apps can call the emergency number and provide for example global positioning system (GPS) location like Hjelp 113-GPS App by the Norwegian air ambulance. We propose to extend functionality of such apps by using the built in camera in a smartphone to capture video of the CPR performed, primarily to estimate the duration and rate of the chest compression executed, if any. All calculations are done in real time, and both the caller and the dispatcher will receive the compression rate feedback when detected. The proposed algorithm is based on finding a dynamic region of interest in the video frames, and thereafter evaluating the power spectral density by computing the fast fourier transform over sliding windows. The power of the dominating frequencies is compared to the power of the frequency area of interest. The system is tested on different persons, male and female, in different scenarios addressing target compression rates, background disturbances, compression with mouth-to-mouth ventilation, various background illuminations and phone placements. All tests were done on a recording Laerdal manikin, providing true compression rates for comparison. Overall, the algorithm is seen to be promising, and it manages a number of disturbances and light situations. For target rates at 110 cpm, as recommended during CPR, the mean error in compression rate (Standard dev. over tests in parentheses) is 3.6 (0.8) for short hair bystanders, and 8.7 (6.0) including medium and long haired bystanders. The presented method shows that it is feasible to detect the compression rate of chest compressions performed by a bystander by placing the smartphone close to the patient, and using the built-in camera combined with a video processing algorithm performed real-time on the device.
Robust Observation Detection for Single Object Tracking: Deterministic and Probabilistic Patch-Based Approaches

PubMed Central

Zulkifley, Mohd Asyraf; Rawlinson, David; Moran, Bill

2012-01-01

In video analytics, robust observation detection is very important as the content of the videos varies a lot, especially for tracking implementation. Contrary to the image processing field, the problems of blurring, moderate deformation, low illumination surroundings, illumination change and homogenous texture are normally encountered in video analytics. Patch-Based Observation Detection (PBOD) is developed to improve detection robustness to complex scenes by fusing both feature- and template-based recognition methods. While we believe that feature-based detectors are more distinctive, however, for finding the matching between the frames are best achieved by a collection of points as in template-based detectors. Two methods of PBOD—the deterministic and probabilistic approaches—have been tested to find the best mode of detection. Both algorithms start by building comparison vectors at each detected points of interest. The vectors are matched to build candidate patches based on their respective coordination. For the deterministic method, patch matching is done in 2-level test where threshold-based position and size smoothing are applied to the patch with the highest correlation value. For the second approach, patch matching is done probabilistically by modelling the histograms of the patches by Poisson distributions for both RGB and HSV colour models. Then, maximum likelihood is applied for position smoothing while a Bayesian approach is applied for size smoothing. The result showed that probabilistic PBOD outperforms the deterministic approach with average distance error of 10.03% compared with 21.03%. This algorithm is best implemented as a complement to other simpler detection methods due to heavy processing requirement. PMID:23202226
Algorithms for High-Speed Noninvasive Eye-Tracking System

NASA Technical Reports Server (NTRS)

Talukder, Ashit; Morookian, John-Michael; Lambert, James

2010-01-01

Two image-data-processing algorithms are essential to the successful operation of a system of electronic hardware and software that noninvasively tracks the direction of a person s gaze in real time. The system was described in High-Speed Noninvasive Eye-Tracking System (NPO-30700) NASA Tech Briefs, Vol. 31, No. 8 (August 2007), page 51. To recapitulate from the cited article: Like prior commercial noninvasive eyetracking systems, this system is based on (1) illumination of an eye by a low-power infrared light-emitting diode (LED); (2) acquisition of video images of the pupil, iris, and cornea in the reflected infrared light; (3) digitization of the images; and (4) processing the digital image data to determine the direction of gaze from the centroids of the pupil and cornea in the images. Most of the prior commercial noninvasive eyetracking systems rely on standard video cameras, which operate at frame rates of about 30 Hz. Such systems are limited to slow, full-frame operation. The video camera in the present system includes a charge-coupled-device (CCD) image detector plus electronic circuitry capable of implementing an advanced control scheme that effects readout from a small region of interest (ROI), or subwindow, of the full image. Inasmuch as the image features of interest (the cornea and pupil) typically occupy a small part of the camera frame, this ROI capability can be exploited to determine the direction of gaze at a high frame rate by reading out from the ROI that contains the cornea and pupil (but not from the rest of the image) repeatedly. One of the present algorithms exploits the ROI capability. The algorithm takes horizontal row slices and takes advantage of the symmetry of the pupil and cornea circles and of the gray-scale contrasts of the pupil and cornea with respect to other parts of the eye. The algorithm determines which horizontal image slices contain the pupil and cornea, and, on each valid slice, the end coordinates of the pupil and cornea. Information from multiple slices is then combined to robustly locate the centroids of the pupil and cornea images. The other of the two present algorithms is a modified version of an older algorithm for estimating the direction of gaze from the centroids of the pupil and cornea. The modification lies in the use of the coordinates of the centroids, rather than differences between the coordinates of the centroids, in a gaze-mapping equation. The equation locates a gaze point, defined as the intersection of the gaze axis with a surface of interest, which is typically a computer display screen (see figure). The expected advantage of the modification is to make the gaze computation less dependent on some simplifying assumptions that are sometimes not accurate
Digital Watermarking: From Concepts to Real-Time Video Applications

DTIC Science & Technology

1999-01-01

includes still- image , video, audio, and geometry data among others-the fundamental con- cept of steganography can be transferred from the field of...size of the message, which should be as small as possible. Some commercially available algorithms for image watermarking forego the secure-watermarking... image compres- sion.’ The image’s luminance component is divided into 8 x 8 pixel blocks. The algorithm selects a sequence of blocks and applies the
Optical Flow Analysis and Kalman Filter Tracking in Video Surveillance Algorithms

DTIC Science & Technology

2007-06-01

Grover Brown and Patrick Y.C. Hwang , Introduction to Random Signals and Applied Kalman Filtering, Third edition, John Wiley & Sons, New York, 1997...noise. Brown and Hwang [6] achieve this improvement by linearly blending the prior estimate, 1kx ∧ − , with the noisy measurement, kz , in the equation...AND KALMAN FILTER TRACKING IN VIDEO SURVEILLANCE ALGORITHMS by David A. Semko June 2007 Thesis Advisor: Monique P. Fargues Second
A Method for Counting Moving People in Video Surveillance Videos

NASA Astrophysics Data System (ADS)

Conte, Donatello; Foggia, Pasquale; Percannella, Gennaro; Tufano, Francesco; Vento, Mario

2010-12-01

People counting is an important problem in video surveillance applications. This problem has been faced either by trying to detect people in the scene and then counting them or by establishing a mapping between some scene feature and the number of people (avoiding the complex detection problem). This paper presents a novel method, following this second approach, that is based on the use of SURF features and of an [InlineEquation not available: see fulltext.]-SVR regressor provide an estimate of this count. The algorithm takes specifically into account problems due to partial occlusions and to perspective. In the experimental evaluation, the proposed method has been compared with the algorithm by Albiol et al., winner of the PETS 2009 contest on people counting, using the same PETS 2009 database. The provided results confirm that the proposed method yields an improved accuracy, while retaining the robustness of Albiol's algorithm.
Exploration of depth modeling mode one lossless wedgelets storage strategies for 3D-high efficiency video coding

NASA Astrophysics Data System (ADS)

Sanchez, Gustavo; Marcon, César; Agostini, Luciano Volcan

2018-01-01

The 3D-high efficiency video coding has introduced tools to obtain higher efficiency in 3-D video coding, and most of them are related to the depth maps coding. Among these tools, the depth modeling mode-1 (DMM-1) focuses on better encoding edges regions of depth maps. The large memory required for storing all wedgelet patterns is one of the bottlenecks in the DMM-1 hardware design of both encoder and decoder since many patterns must be stored. Three algorithms to reduce the DMM-1 memory requirements and a hardware design targeting the most efficient among these algorithms are presented. Experimental results demonstrate that the proposed solutions surpass related works reducing up to 78.8% of the wedgelet memory, without degrading the encoding efficiency. Synthesis results demonstrate that the proposed algorithm reduces almost 75% of the power dissipation when compared to the standard approach.
Automated method for tracing leading and trailing processes of migrating neurons in confocal image sequences

NASA Astrophysics Data System (ADS)

Kerekes, Ryan A.; Gleason, Shaun S.; Trivedi, Niraj; Solecki, David J.

2010-03-01

Segmentation, tracking, and tracing of neurons in video imagery are important steps in many neuronal migration studies and can be inaccurate and time-consuming when performed manually. In this paper, we present an automated method for tracing the leading and trailing processes of migrating neurons in time-lapse image stacks acquired with a confocal fluorescence microscope. In our approach, we first locate and track the soma of the cell of interest by smoothing each frame and tracking the local maxima through the sequence. We then trace the leading process in each frame by starting at the center of the soma and stepping repeatedly in the most likely direction of the leading process. This direction is found at each step by examining second derivatives of fluorescent intensity along curves of constant radius around the current point. Tracing terminates after a fixed number of steps or when fluorescent intensity drops below a fixed threshold. We evolve the resulting trace to form an improved trace that more closely follows the approximate centerline of the leading process. We apply a similar algorithm to the trailing process of the cell by starting the trace in the opposite direction. We demonstrate our algorithm on two time-lapse confocal video sequences of migrating cerebellar granule neurons (CGNs). We show that the automated traces closely approximate ground truth traces to within 1 or 2 pixels on average. Additionally, we compute line intensity profiles of fluorescence along the automated traces and quantitatively demonstrate their similarity to manually generated profiles in terms of fluorescence peak locations.
Automated Thermal Image Processing for Detection and Classification of Birds and Bats - FY2012 Annual Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Duberstein, Corey A.; Matzner, Shari; Cullinan, Valerie I.

Surveying wildlife at risk from offshore wind energy development is difficult and expensive. Infrared video can be used to record birds and bats that pass through the camera view, but it is also time consuming and expensive to review video and determine what was recorded. We proposed to conduct algorithm and software development to identify and to differentiate thermally detected targets of interest that would allow automated processing of thermal image data to enumerate birds, bats, and insects. During FY2012 we developed computer code within MATLAB to identify objects recorded in video and extract attribute information that describes the objectsmore » recorded. We tested the efficiency of track identification using observer-based counts of tracks within segments of sample video. We examined object attributes, modeled the effects of random variability on attributes, and produced data smoothing techniques to limit random variation within attribute data. We also began drafting and testing methodology to identify objects recorded on video. We also recorded approximately 10 hours of infrared video of various marine birds, passerine birds, and bats near the Pacific Northwest National Laboratory (PNNL) Marine Sciences Laboratory (MSL) at Sequim, Washington. A total of 6 hours of bird video was captured overlooking Sequim Bay over a series of weeks. An additional 2 hours of video of birds was also captured during two weeks overlooking Dungeness Bay within the Strait of Juan de Fuca. Bats and passerine birds (swallows) were also recorded at dusk on the MSL campus during nine evenings. An observer noted the identity of objects viewed through the camera concurrently with recording. These video files will provide the information necessary to produce and test software developed during FY2013. The annotation will also form the basis for creation of a method to reliably identify recorded objects.« less
Topic Transition in Educational Videos Using Visually Salient Words

ERIC Educational Resources Information Center

Gandhi, Ankit; Biswas, Arijit; Deshmukh, Om

2015-01-01

In this paper, we propose a visual saliency algorithm for automatically finding the topic transition points in an educational video. First, we propose a method for assigning a saliency score to each word extracted from an educational video. We design several mid-level features that are indicative of visual saliency. The optimal feature combination…
Content-Based Indexing and Teaching Focus Mining for Lecture Videos

ERIC Educational Resources Information Center

Lin, Yu-Tzu; Yen, Bai-Jang; Chang, Chia-Hu; Lee, Greg C.; Lin, Yu-Chih

2010-01-01

Purpose: The purpose of this paper is to propose an indexing and teaching focus mining system for lecture videos recorded in an unconstrained environment. Design/methodology/approach: By applying the proposed algorithms in this paper, the slide structure can be reconstructed by extracting slide images from the video. Instead of applying…
High-performance software-only H.261 video compression on PC

NASA Astrophysics Data System (ADS)

Kasperovich, Leonid

1996-03-01

This paper describes an implementation of a software H.261 codec for PC, that takes an advantage of the fast computational algorithms for DCT-based video compression, which have been presented by the author at the February's 1995 SPIE/IS&T meeting. The motivation for developing the H.261 prototype system is to demonstrate a feasibility of real time software- only videoconferencing solution to operate across a wide range of network bandwidth, frame rate, and resolution of the input video. As the bandwidths of current network technology will be increased, the higher frame rate and resolution of video to be transmitted is allowed, that requires, in turn, a software codec to be able to compress pictures of CIF (352 X 288) resolution at up to 30 frame/sec. Running on Pentium 133 MHz PC the codec presented is capable to compress video in CIF format at 21 - 23 frame/sec. This result is comparable to the known hardware-based H.261 solutions, but it doesn't require any specific hardware. The methods to achieve high performance, the program optimization technique for Pentium microprocessor along with the performance profile, showing the actual contribution of the different encoding/decoding stages to the overall computational process, are presented.
A hybrid frame concealment algorithm for H.264/AVC.

PubMed

Yan, Bo; Gharavi, Hamid

2010-01-01

In packet-based video transmissions, packets loss due to channel errors may result in the loss of the whole video frame. Recently, many error concealment algorithms have been proposed in order to combat channel errors; however, most of the existing algorithms can only deal with the loss of macroblocks and are not able to conceal the whole missing frame. In order to resolve this problem, in this paper, we have proposed a new hybrid motion vector extrapolation (HMVE) algorithm to recover the whole missing frame, and it is able to provide more accurate estimation for the motion vectors of the missing frame than other conventional methods. Simulation results show that it is highly effective and significantly outperforms other existing frame recovery methods.
Efficient video-equipped fire detection approach for automatic fire alarm systems

NASA Astrophysics Data System (ADS)

Kang, Myeongsu; Tung, Truong Xuan; Kim, Jong-Myon

2013-01-01

This paper proposes an efficient four-stage approach that automatically detects fire using video capabilities. In the first stage, an approximate median method is used to detect video frame regions involving motion. In the second stage, a fuzzy c-means-based clustering algorithm is employed to extract candidate regions of fire from all of the movement-containing regions. In the third stage, a gray level co-occurrence matrix is used to extract texture parameters by tracking red-colored objects in the candidate regions. These texture features are, subsequently, used as inputs of a back-propagation neural network to distinguish between fire and nonfire. Experimental results indicate that the proposed four-stage approach outperforms other fire detection algorithms in terms of consistently increasing the accuracy of fire detection in both indoor and outdoor test videos.
Pattern-based integer sample motion search strategies in the context of HEVC

NASA Astrophysics Data System (ADS)

Maier, Georg; Bross, Benjamin; Grois, Dan; Marpe, Detlev; Schwarz, Heiko; Veltkamp, Remco C.; Wiegand, Thomas

2015-09-01

The H.265/MPEG-H High Efficiency Video Coding (HEVC) standard provides a significant increase in coding efficiency compared to its predecessor, the H.264/MPEG-4 Advanced Video Coding (AVC) standard, which however comes at the cost of a high computational burden for a compliant encoder. Motion estimation (ME), which is a part of the inter-picture prediction process, typically consumes a high amount of computational resources, while significantly increasing the coding efficiency. In spite of the fact that both H.265/MPEG-H HEVC and H.264/MPEG-4 AVC standards allow processing motion information on a fractional sample level, the motion search algorithms based on the integer sample level remain to be an integral part of ME. In this paper, a flexible integer sample ME framework is proposed, thereby allowing to trade off significant reduction of ME computation time versus coding efficiency penalty in terms of bit rate overhead. As a result, through extensive experimentation, an integer sample ME algorithm that provides a good trade-off is derived, incorporating a combination and optimization of known predictive, pattern-based and early termination techniques. The proposed ME framework is implemented on a basis of the HEVC Test Model (HM) reference software, further being compared to the state-of-the-art fast search algorithm, which is a native part of HM. It is observed that for high resolution sequences, the integer sample ME process can be speed-up by factors varying from 3.2 to 7.6, resulting in the bit-rate overhead of 1.5% and 0.6% for Random Access (RA) and Low Delay P (LDP) configurations, respectively. In addition, the similar speed-up is observed for sequences with mainly Computer-Generated Imagery (CGI) content while trading off the bit rate overhead of up to 5.2%.

Performance improvement of multi-class detection using greedy algorithm for Viola-Jones cascade selection

NASA Astrophysics Data System (ADS)

Tereshin, Alexander A.; Usilin, Sergey A.; Arlazarov, Vladimir V.

2018-04-01

This paper aims to study the problem of multi-class object detection in video stream with Viola-Jones cascades. An adaptive algorithm for selecting Viola-Jones cascade based on greedy choice strategy in solution of the N-armed bandit problem is proposed. The efficiency of the algorithm on the problem of detection and recognition of the bank card logos in the video stream is shown. The proposed algorithm can be effectively used in documents localization and identification, recognition of road scene elements, localization and tracking of the lengthy objects , and for solving other problems of rigid object detection in a heterogeneous data flows. The computational efficiency of the algorithm makes it possible to use it both on personal computers and on mobile devices based on processors with low power consumption.
Real-time single image dehazing based on dark channel prior theory and guided filtering

NASA Astrophysics Data System (ADS)

Zhang, Zan

2017-10-01

Images and videos taken outside the foggy day are serious degraded. In order to restore degraded image taken in foggy day and overcome traditional Dark Channel prior algorithms problems of remnant fog in edge, we propose a new dehazing method.We first find the fog area in the dark primary color map to obtain the estimated value of the transmittance using quadratic tree. Then we regard the gray-scale image after guided filtering as atmospheric light map and remove haze based on it. Box processing and image down sampling technology are also used to improve the processing speed. Finally, the atmospheric light scattering model is used to restore the image. A plenty of experiments show that algorithm is effective, efficient and has a wide range of application.
New robust algorithm for tracking cells in videos of Drosophila morphogenesis based on finding an ideal path in segmented spatio-temporal cellular structures.

PubMed

Bellaïche, Yohanns; Bosveld, Floris; Graner, François; Mikula, Karol; Remesíková, Mariana; Smísek, Michal

2011-01-01

In this paper, we present a novel algorithm for tracking cells in time lapse confocal microscopy movie of a Drosophila epithelial tissue during pupal morphogenesis. We consider a 2D + time video as a 3D static image, where frames are stacked atop each other, and using a spatio-temporal segmentation algorithm we obtain information about spatio-temporal 3D tubes representing evolutions of cells. The main idea for tracking is the usage of two distance functions--first one from the cells in the initial frame and second one from segmented boundaries. We track the cells backwards in time. The first distance function attracts the subsequently constructed cell trajectories to the cells in the initial frame and the second one forces them to be close to centerlines of the segmented tubular structures. This makes our tracking algorithm robust against noise and missing spatio-temporal boundaries. This approach can be generalized to a 3D + time video analysis, where spatio-temporal tubes are 4D objects.
Visual saliency-based fast intracoding algorithm for high efficiency video coding

NASA Astrophysics Data System (ADS)

Zhou, Xin; Shi, Guangming; Zhou, Wei; Duan, Zhemin

2017-01-01

Intraprediction has been significantly improved in high efficiency video coding over H.264/AVC with quad-tree-based coding unit (CU) structure from size 64×64 to 8×8 and more prediction modes. However, these techniques cause a dramatic increase in computational complexity. An intracoding algorithm is proposed that consists of perceptual fast CU size decision algorithm and fast intraprediction mode decision algorithm. First, based on the visual saliency detection, an adaptive and fast CU size decision method is proposed to alleviate intraencoding complexity. Furthermore, a fast intraprediction mode decision algorithm with step halving rough mode decision method and early modes pruning algorithm is presented to selectively check the potential modes and effectively reduce the complexity of computation. Experimental results show that our proposed fast method reduces the computational complexity of the current HM to about 57% in encoding time with only 0.37% increases in BD rate. Meanwhile, the proposed fast algorithm has reasonable peak signal-to-noise ratio losses and nearly the same subjective perceptual quality.
Mixture block coding with progressive transmission in packet video. Appendix 1: Item 2. M.S. Thesis

NASA Technical Reports Server (NTRS)

Chen, Yun-Chung

1989-01-01

Video transmission will become an important part of future multimedia communication because of dramatically increasing user demand for video, and rapid evolution of coding algorithm and VLSI technology. Video transmission will be part of the broadband-integrated services digital network (B-ISDN). Asynchronous transfer mode (ATM) is a viable candidate for implementation of B-ISDN due to its inherent flexibility, service independency, and high performance. According to the characteristics of ATM, the information has to be coded into discrete cells which travel independently in the packet switching network. A practical realization of an ATM video codec called Mixture Block Coding with Progressive Transmission (MBCPT) is presented. This variable bit rate coding algorithm shows how a constant quality performance can be obtained according to user demand. Interactions between codec and network are emphasized including packetization, service synchronization, flow control, and error recovery. Finally, some simulation results based on MBCPT coding with error recovery are presented.
Low-complexity transcoding algorithm from H.264/AVC to SVC using data mining

NASA Astrophysics Data System (ADS)

Garrido-Cantos, Rosario; De Cock, Jan; Martínez, Jose Luis; Van Leuven, Sebastian; Cuenca, Pedro; Garrido, Antonio

2013-12-01

Nowadays, networks and terminals with diverse characteristics of bandwidth and capabilities coexist. To ensure a good quality of experience, this diverse environment demands adaptability of the video stream. In general, video contents are compressed to save storage capacity and to reduce the bandwidth required for its transmission. Therefore, if these compressed video streams were compressed using scalable video coding schemes, they would be able to adapt to those heterogeneous networks and a wide range of terminals. Since the majority of the multimedia contents are compressed using H.264/AVC, they cannot benefit from that scalability. This paper proposes a low-complexity algorithm to convert an H.264/AVC bitstream without scalability to scalable bitstreams with temporal scalability in baseline and main profiles by accelerating the mode decision task of the scalable video coding encoding stage using machine learning tools. The results show that when our technique is applied, the complexity is reduced by 87% while maintaining coding efficiency.
Memory-efficient table look-up optimized algorithm for context-based adaptive variable length decoding in H.264/advanced video coding

NASA Astrophysics Data System (ADS)

Wang, Jianhua; Cheng, Lianglun; Wang, Tao; Peng, Xiaodong

2016-03-01

Table look-up operation plays a very important role during the decoding processing of context-based adaptive variable length decoding (CAVLD) in H.264/advanced video coding (AVC). However, frequent table look-up operation can result in big table memory access, and then lead to high table power consumption. Aiming to solve the problem of big table memory access of current methods, and then reduce high power consumption, a memory-efficient table look-up optimized algorithm is presented for CAVLD. The contribution of this paper lies that index search technology is introduced to reduce big memory access for table look-up, and then reduce high table power consumption. Specifically, in our schemes, we use index search technology to reduce memory access by reducing the searching and matching operations for code_word on the basis of taking advantage of the internal relationship among length of zero in code_prefix, value of code_suffix and code_lengh, thus saving the power consumption of table look-up. The experimental results show that our proposed table look-up algorithm based on index search can lower about 60% memory access consumption compared with table look-up by sequential search scheme, and then save much power consumption for CAVLD in H.264/AVC.
Colour-reproduction algorithm for transmitting variable video frames and its application to capsule endoscopy

PubMed Central

Khan, Tareq; Shrestha, Ravi; Imtiaz, Md. Shamin

2015-01-01

Presented is a new power-efficient colour generation algorithm for wireless capsule endoscopy (WCE) application. In WCE, transmitting colour image data from the human intestine through radio frequency (RF) consumes a huge amount of power. The conventional way is to transmit all R, G and B components of all frames. Using the proposed dictionary-based colour generation scheme, instead of sending all R, G and B frames, first one colour frame is sent followed by a series of grey-scale frames. At the receiver end, the colour information is extracted from the colour frame and then added to colourise the grey-scale frames. After a certain number of grey-scale frames, another colour frame is sent followed by the same number of grey-scale frames. This process is repeated until the end of the video sequence to maintain the colour similarity. As a result, over 50% of RF transmission power can be saved using the proposed scheme, which will eventually lead to a battery life extension of the capsule by 4–7 h. The reproduced colour images have been evaluated both statistically and subjectively by professional gastroenterologists. The algorithm is finally implemented using a WCE prototype and the performance is validated using an ex-vivo trial. PMID:26609405
Complexity reduction in the H.264/AVC using highly adaptive fast mode decision based on macroblock motion activity

NASA Astrophysics Data System (ADS)

Abdellah, Skoudarli; Mokhtar, Nibouche; Amina, Serir

2015-11-01

The H.264/AVC video coding standard is used in a wide range of applications from video conferencing to high-definition television according to its high compression efficiency. This efficiency is mainly acquired from the newly allowed prediction schemes including variable block modes. However, these schemes require a high complexity to select the optimal mode. Consequently, complexity reduction in the H.264/AVC encoder has recently become a very challenging task in the video compression domain, especially when implementing the encoder in real-time applications. Fast mode decision algorithms play an important role in reducing the overall complexity of the encoder. In this paper, we propose an adaptive fast intermode algorithm based on motion activity, temporal stationarity, and spatial homogeneity. This algorithm predicts the motion activity of the current macroblock from its neighboring blocks and identifies temporal stationary regions and spatially homogeneous regions using adaptive threshold values based on content video features. Extensive experimental work has been done in high profile, and results show that the proposed source-coding algorithm effectively reduces the computational complexity by 53.18% on average compared with the reference software encoder, while maintaining the high-coding efficiency of H.264/AVC by incurring only 0.097 dB in total peak signal-to-noise ratio and 0.228% increment on the total bit rate.
Optimal space communications techniques. [discussion of video signals and delta modulation

NASA Technical Reports Server (NTRS)

Schilling, D. L.

1974-01-01

The encoding of video signals using the Song Adaptive Delta Modulator (Song ADM) is discussed. The video signals are characterized as a sequence of pulses having arbitrary height and width. Although the ADM is suited to tracking signals having fast rise times, it was found that the DM algorithm (which permits an exponential rise for estimating an input step) results in a large overshoot and an underdamped response to the step. An overshoot suppression algorithm which significantly reduces the ringing while not affecting the rise time is presented along with formuli for the rise time and the settling time. Channel errors and their effect on the DM encoded bit stream were investigated.
Vision-based system for the control and measurement of wastewater flow rate in sewer systems.

PubMed

Nguyen, L S; Schaeli, B; Sage, D; Kayal, S; Jeanbourquin, D; Barry, D A; Rossi, L

2009-01-01

Combined sewer overflows and stormwater discharges represent an important source of contamination to the environment. However, the harsh environment inside sewers and particular hydraulic conditions during rain events reduce the reliability of traditional flow measurement probes. In the following, we present and evaluate an in situ system for the monitoring of water flow in sewers based on video images. This paper focuses on the measurement of the water level based on image-processing techniques. The developed image-based water level algorithms identify the wall/water interface from sewer images and measure its position with respect to real world coordinates. A web-based user interface and a 3-tier system architecture enable the remote configuration of the cameras and the image-processing algorithms. Images acquired and processed by our system were found to reliably measure water levels and thereby to provide crucial information leading to better understand particular hydraulic behaviors. In terms of robustness and accuracy, the water level algorithm provided equal or better results compared to traditional water level probes in three different in situ configurations.
Study of moving object detecting and tracking algorithm for video surveillance system

NASA Astrophysics Data System (ADS)

Wang, Tao; Zhang, Rongfu

2010-10-01

This paper describes a specific process of moving target detecting and tracking in the video surveillance.Obtain high-quality background is the key to achieving differential target detecting in the video surveillance.The paper is based on a block segmentation method to build clear background,and using the method of background difference to detecing moving target,after a series of treatment we can be extracted the more comprehensive object from original image,then using the smallest bounding rectangle to locate the object.In the video surveillance system, the delay of camera and other reasons lead to tracking lag,the model of Kalman filter based on template matching was proposed,using deduced and estimated capacity of Kalman,the center of smallest bounding rectangle for predictive value,predicted the position in the next moment may appare,followed by template matching in the region as the center of this position,by calculate the cross-correlation similarity of current image and reference image,can determine the best matching center.As narrowed the scope of searching,thereby reduced the searching time,so there be achieve fast-tracking.
Non-contact Real-time heart rate measurements based on high speed circuit technology research

NASA Astrophysics Data System (ADS)

Wu, Jizhe; Liu, Xiaohua; Kong, Lingqin; Shi, Cong; Liu, Ming; Hui, Mei; Dong, Liquan; Zhao, Yuejin

2015-08-01

In recent years, morbidity and mortality of the cardiovascular or cerebrovascular disease, which threaten human health greatly, increased year by year. Heart rate is an important index of these diseases. To address this status, the paper puts forward a kind of simple structure, easy operation, suitable for large populations of daily monitoring non-contact heart rate measurement. In the method we use imaging equipment video sensitive areas. The changes of light intensity reflected through the image grayscale average. The light change is caused by changes in blood volume. We video the people face which include the sensitive areas (ROI), and use high-speed processing circuit to save the video as AVI format into memory. After processing the whole video of a period of time, we draw curve of each color channel with frame number as horizontal axis. Then get heart rate from the curve. We use independent component analysis (ICA) to restrain noise of sports interference, realized the accurate extraction of heart rate signal under the motion state. We design an algorithm, based on high-speed processing circuit, for face recognition and tracking to automatically get face region. We do grayscale average processing to the recognized image, get RGB three grayscale curves, and extract a clearer pulse wave curves through independent component analysis, and then we get the heart rate under the motion state. At last, by means of compare our system with Fingertip Pulse Oximeter, result show the system can realize a more accurate measurement, the error is less than 3 pats per minute.
Swarm Intelligence for Optimizing Hybridized Smoothing Filter in Image Edge Enhancement

NASA Astrophysics Data System (ADS)

Rao, B. Tirumala; Dehuri, S.; Dileep, M.; Vindhya, A.

In this modern era, image transmission and processing plays a major role. It would be impossible to retrieve information from satellite and medical images without the help of image processing techniques. Edge enhancement is an image processing step that enhances the edge contrast of an image or video in an attempt to improve its acutance. Edges are the representations of the discontinuities of image intensity functions. For processing these discontinuities in an image, a good edge enhancement technique is essential. The proposed work uses a new idea for edge enhancement using hybridized smoothening filters and we introduce a promising technique of obtaining best hybrid filter using swarm algorithms (Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO)) to search for an optimal sequence of filters from among a set of rather simple, representative image processing filters. This paper deals with the analysis of the swarm intelligence techniques through the combination of hybrid filters generated by these algorithms for image edge enhancement.
Robust efficient estimation of heart rate pulse from video.

PubMed

Xu, Shuchang; Sun, Lingyun; Rohde, Gustavo Kunde

2014-04-01

We describe a simple but robust algorithm for estimating the heart rate pulse from video sequences containing human skin in real time. Based on a model of light interaction with human skin, we define the change of blood concentration due to arterial pulsation as a pixel quotient in log space, and successfully use the derived signal for computing the pulse heart rate. Various experiments with different cameras, different illumination condition, and different skin locations were conducted to demonstrate the effectiveness and robustness of the proposed algorithm. Examples computed with normal illumination show the algorithm is comparable with pulse oximeter devices both in accuracy and sensitivity.
Robust efficient estimation of heart rate pulse from video

PubMed Central

Xu, Shuchang; Sun, Lingyun; Rohde, Gustavo Kunde

2014-01-01

We describe a simple but robust algorithm for estimating the heart rate pulse from video sequences containing human skin in real time. Based on a model of light interaction with human skin, we define the change of blood concentration due to arterial pulsation as a pixel quotient in log space, and successfully use the derived signal for computing the pulse heart rate. Various experiments with different cameras, different illumination condition, and different skin locations were conducted to demonstrate the effectiveness and robustness of the proposed algorithm. Examples computed with normal illumination show the algorithm is comparable with pulse oximeter devices both in accuracy and sensitivity. PMID:24761294
MoNET: media over net gateway processor for next-generation network

NASA Astrophysics Data System (ADS)

Elabd, Hammam; Sundar, Rangarajan; Dedes, John

2001-12-01

MoNETTM (Media over Net) SX000 product family is designed using a scalable voice, video and packet-processing platform to address applications with channel densities from few voice channels to four OC3 per card. This platform is developed for bridging public circuit-switched network to the next generation packet telephony and data network. The platform consists of a DSP farm, RISC processors and interface modules. DSP farm is required to execute voice compression, image compression and line echo cancellation algorithms for large number of voice, video, fax, and modem or data channels. RISC CPUs are used for performing various packetizations based on RTP, UDP/IP and ATM encapsulations. In addition, RISC CPUs also participate in the DSP farm load management and communication with the host and other MoP devices. The MoNETTM S1000 communications device is designed for voice processing and for bridging TDM to ATM and IP packet networks. The S1000 consists of the DSP farm based on Carmel DSP core and 32-bit RISC CPU, along with Ethernet, Utopia, PCI, and TDM interfaces. In this paper, we will describe the VoIP infrastructure, building blocks of the S500, S1000 and S3000 devices, algorithms executed on these device and associated channel densities, detailed DSP architecture, memory architecture, data flow and scheduling.
Robust cell tracking in epithelial tissues through identification of maximum common subgraphs.

PubMed

Kursawe, Jochen; Bardenet, Rémi; Zartman, Jeremiah J; Baker, Ruth E; Fletcher, Alexander G

2016-11-01

Tracking of cells in live-imaging microscopy videos of epithelial sheets is a powerful tool for investigating fundamental processes in embryonic development. Characterizing cell growth, proliferation, intercalation and apoptosis in epithelia helps us to understand how morphogenetic processes such as tissue invagination and extension are locally regulated and controlled. Accurate cell tracking requires correctly resolving cells entering or leaving the field of view between frames, cell neighbour exchanges, cell removals and cell divisions. However, current tracking methods for epithelial sheets are not robust to large morphogenetic deformations and require significant manual interventions. Here, we present a novel algorithm for epithelial cell tracking, exploiting the graph-theoretic concept of a 'maximum common subgraph' to track cells between frames of a video. Our algorithm does not require the adjustment of tissue-specific parameters, and scales in sub-quadratic time with tissue size. It does not rely on precise positional information, permitting large cell movements between frames and enabling tracking in datasets acquired at low temporal resolution due to experimental constraints such as phototoxicity. To demonstrate the method, we perform tracking on the Drosophila embryonic epidermis and compare cell-cell rearrangements to previous studies in other tissues. Our implementation is open source and generally applicable to epithelial tissues. © 2016 The Authors.
Robust cell tracking in epithelial tissues through identification of maximum common subgraphs

PubMed Central

Bardenet, Rémi; Zartman, Jeremiah J.; Baker, Ruth E.

2016-01-01

Tracking of cells in live-imaging microscopy videos of epithelial sheets is a powerful tool for investigating fundamental processes in embryonic development. Characterizing cell growth, proliferation, intercalation and apoptosis in epithelia helps us to understand how morphogenetic processes such as tissue invagination and extension are locally regulated and controlled. Accurate cell tracking requires correctly resolving cells entering or leaving the field of view between frames, cell neighbour exchanges, cell removals and cell divisions. However, current tracking methods for epithelial sheets are not robust to large morphogenetic deformations and require significant manual interventions. Here, we present a novel algorithm for epithelial cell tracking, exploiting the graph-theoretic concept of a ‘maximum common subgraph’ to track cells between frames of a video. Our algorithm does not require the adjustment of tissue-specific parameters, and scales in sub-quadratic time with tissue size. It does not rely on precise positional information, permitting large cell movements between frames and enabling tracking in datasets acquired at low temporal resolution due to experimental constraints such as phototoxicity. To demonstrate the method, we perform tracking on the Drosophila embryonic epidermis and compare cell–cell rearrangements to previous studies in other tissues. Our implementation is open source and generally applicable to epithelial tissues. PMID:28334699
Measuring Engagement as Students Learn Dynamic Systems and Control with a Video Game

ERIC Educational Resources Information Center

Coller, B. D.; Shernoff, David J.; Strati, Anna

2011-01-01

The paper presents results of a multi-year quasi-experimental study of student engagement during which a video game was introduced into an undergraduate dynamic systems and control course. The video game, "EduTorcs", provided challenges in which students devised control algorithms that drive virtual cars and ride virtual bikes through a…

Quantum Image Processing and Its Application to Edge Detection: Theory and Experiment

NASA Astrophysics Data System (ADS)

Yao, Xi-Wei; Wang, Hengyan; Liao, Zeyang; Chen, Ming-Cheng; Pan, Jian; Li, Jun; Zhang, Kechao; Lin, Xingcheng; Wang, Zhehui; Luo, Zhihuang; Zheng, Wenqiang; Li, Jianzhong; Zhao, Meisheng; Peng, Xinhua; Suter, Dieter

2017-07-01

Processing of digital images is continuously gaining in volume and relevance, with concomitant demands on data storage, transmission, and processing power. Encoding the image information in quantum-mechanical systems instead of classical ones and replacing classical with quantum information processing may alleviate some of these challenges. By encoding and processing the image information in quantum-mechanical systems, we here demonstrate the framework of quantum image processing, where a pure quantum state encodes the image information: we encode the pixel values in the probability amplitudes and the pixel positions in the computational basis states. Our quantum image representation reduces the required number of qubits compared to existing implementations, and we present image processing algorithms that provide exponential speed-up over their classical counterparts. For the commonly used task of detecting the edge of an image, we propose and implement a quantum algorithm that completes the task with only one single-qubit operation, independent of the size of the image. This demonstrates the potential of quantum image processing for highly efficient image and video processing in the big data era.
Video copy protection and detection framework (VPD) for e-learning systems

NASA Astrophysics Data System (ADS)

ZandI, Babak; Doustarmoghaddam, Danial; Pour, Mahsa R.

2013-03-01

This Article reviews and compares the copyright issues related to the digital video files, which can be categorized as contended based and Digital watermarking copy Detection. Then we describe how to protect a digital video by using a special Video data hiding method and algorithm. We also discuss how to detect the copy right of the file, Based on expounding Direction of the technology of the video copy detection, and Combining with the own research results, brings forward a new video protection and copy detection approach in terms of plagiarism and e-learning systems using the video data hiding technology. Finally we introduce a framework for Video protection and detection in e-learning systems (VPD Framework).
Robust video copy detection approach based on local tangent space alignment

NASA Astrophysics Data System (ADS)

Nie, Xiushan; Qiao, Qianping

2012-04-01

We propose a robust content-based video copy detection approach based on local tangent space alignment (LTSA), which is an efficient dimensionality reduction algorithm. The idea is motivated by the fact that the content of video becomes richer and the dimension of content becomes higher. It does not give natural tools for video analysis and understanding because of the high dimensionality. The proposed approach reduces the dimensionality of video content using LTSA, and then generates video fingerprints in low dimensional space for video copy detection. Furthermore, a dynamic sliding window is applied to fingerprint matching. Experimental results show that the video copy detection approach has good robustness and discrimination.
Video Denoising via Dynamic Video Layering

NASA Astrophysics Data System (ADS)

Guo, Han; Vaswani, Namrata

2018-07-01

Video denoising refers to the problem of removing "noise" from a video sequence. Here the term "noise" is used in a broad sense to refer to any corruption or outlier or interference that is not the quantity of interest. In this work, we develop a novel approach to video denoising that is based on the idea that many noisy or corrupted videos can be split into three parts - the "low-rank layer", the "sparse layer", and a small residual (which is small and bounded). We show, using extensive experiments, that our denoising approach outperforms the state-of-the-art denoising algorithms.
Integrating Algorithm Visualization Video into a First-Year Algorithm and Data Structure Course

ERIC Educational Resources Information Center

Crescenzi, Pilu; Malizia, Alessio; Verri, M. Cecilia; Diaz, Paloma; Aedo, Ignacio

2012-01-01

In this paper we describe the results that we have obtained while integrating algorithm visualization (AV) movies (strongly tightened with the other teaching material), within a first-year undergraduate course on algorithms and data structures. Our experimental results seem to support the hypothesis that making these movies available significantly…
Reliable motion detection of small targets in video with low signal-to-clutter ratios

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nichols, S.A.; Naylor, R.B.

1995-07-01

Studies show that vigilance decreases rapidly after several minutes when human operators are required to search live video for infrequent intrusion detections. Therefore, there is a need for systems which can automatically detect targets in live video and reserve the operator`s attention for assessment only. Thus far, automated systems have not simultaneously provided adequate detection sensitivity, false alarm suppression, and ease of setup when used in external, unconstrained environments. This unsatisfactory performance can be exacerbated by poor video imagery with low contrast, high noise, dynamic clutter, image misregistration, and/or the presence of small, slow, or erratically moving targets. This papermore » describes a highly adaptive video motion detection and tracking algorithm which has been developed as part of Sandia`s Advanced Exterior Sensor (AES) program. The AES is a wide-area detection and assessment system for use in unconstrained exterior security applications. The AES detection and tracking algorithm provides good performance under stressing data and environmental conditions. Features of the algorithm include: reliable detection with negligible false alarm rate of variable velocity targets having low signal-to-clutter ratios; reliable tracking of targets that exhibit motion that is non-inertial, i.e., varies in direction and velocity; automatic adaptation to both infrared and visible imagery with variable quality; and suppression of false alarms caused by sensor flaws and/or cutouts.« less
Resolution enhancement of low-quality videos using a high-resolution frame

NASA Astrophysics Data System (ADS)

Pham, Tuan Q.; van Vliet, Lucas J.; Schutte, Klamer

2006-01-01

This paper proposes an example-based Super-Resolution (SR) algorithm of compressed videos in the Discrete Cosine Transform (DCT) domain. Input to the system is a Low-Resolution (LR) compressed video together with a High-Resolution (HR) still image of similar content. Using a training set of corresponding LR-HR pairs of image patches from the HR still image, high-frequency details are transferred from the HR source to the LR video. The DCT-domain algorithm is much faster than example-based SR in spatial domain 6 because of a reduction in search dimensionality, which is a direct result of the compact and uncorrelated DCT representation. Fast searching techniques like tree-structure vector quantization 16 and coherence search1 are also key to the improved efficiency. Preliminary results on MJPEG sequence show promising result of the DCT-domain SR synthesis approach.
Free-viewpoint video of human actors using multiple handheld Kinects.

PubMed

Ye, Genzhi; Liu, Yebin; Deng, Yue; Hasler, Nils; Ji, Xiangyang; Dai, Qionghai; Theobalt, Christian

2013-10-01

We present an algorithm for creating free-viewpoint video of interacting humans using three handheld Kinect cameras. Our method reconstructs deforming surface geometry and temporal varying texture of humans through estimation of human poses and camera poses for every time step of the RGBZ video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem, which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Finally, texture recovery is achieved through jointly optimization on spatio-temporal RGB data using matrix completion. As opposed to previous methods, our algorithm succeeds on free-viewpoint video of human actors under general uncontrolled indoor scenes with potentially dynamic background, and it succeeds even if the cameras are moving.
The Coverage Problem in Video-Based Wireless Sensor Networks: A Survey

PubMed Central

Costa, Daniel G.; Guedes, Luiz Affonso

2010-01-01

Wireless sensor networks typically consist of a great number of tiny low-cost electronic devices with limited sensing and computing capabilities which cooperatively communicate to collect some kind of information from an area of interest. When wireless nodes of such networks are equipped with a low-power camera, visual data can be retrieved, facilitating a new set of novel applications. The nature of video-based wireless sensor networks demands new algorithms and solutions, since traditional wireless sensor networks approaches are not feasible or even efficient for that specialized communication scenario. The coverage problem is a crucial issue of wireless sensor networks, requiring specific solutions when video-based sensors are employed. In this paper, it is surveyed the state of the art of this particular issue, regarding strategies, algorithms and general computational solutions. Open research areas are also discussed, envisaging promising investigation considering coverage in video-based wireless sensor networks. PMID:22163651
Performance of the JPEG Estimated Spectrum Adaptive Postfilter (JPEG-ESAP) for Low Bit Rates

NASA Technical Reports Server (NTRS)

Linares, Irving (Inventor)

2016-01-01

Frequency-based, pixel-adaptive filtering using the JPEG-ESAP algorithm for low bit rate JPEG formatted color images may allow for more compressed images while maintaining equivalent quality at a smaller file size or bitrate. For RGB, an image is decomposed into three color bands--red, green, and blue. The JPEG-ESAP algorithm is then applied to each band (e.g., once for red, once for green, and once for blue) and the output of each application of the algorithm is rebuilt as a single color image. The ESAP algorithm may be repeatedly applied to MPEG-2 video frames to reduce their bit rate by a factor of 2 or 3, while maintaining equivalent video quality, both perceptually, and objectively, as recorded in the computed PSNR values.
“SmartMonitor” — An Intelligent Security System for the Protection of Individuals and Small Properties with the Possibility of Home Automation

PubMed Central

Frejlichowski, Dariusz; Gościewska, Katarzyna; Forczmański, Paweł; Hofman, Radosław

2014-01-01

“SmartMonitor” is an intelligent security system based on image analysis that combines the advantages of alarm, video surveillance and home automation systems. The system is a complete solution that automatically reacts to every learned situation in a pre-specified way and has various applications, e.g., home and surrounding protection against unauthorized intrusion, crime detection or supervision over ill persons. The software is based on well-known and proven methods and algorithms for visual content analysis (VCA) that were appropriately modified and adopted to fit specific needs and create a video processing model which consists of foreground region detection and localization, candidate object extraction, object classification and tracking. In this paper, the “SmartMonitor” system is presented along with its architecture, employed methods and algorithms, and object analysis approach. Some experimental results on system operation are also provided. In the paper, focus is put on one of the aforementioned functionalities of the system, namely supervision over ill persons. PMID:24905854
Lossless Video Sequence Compression Using Adaptive Prediction

NASA Technical Reports Server (NTRS)

Li, Ying; Sayood, Khalid

2007-01-01

We present an adaptive lossless video compression algorithm based on predictive coding. The proposed algorithm exploits temporal, spatial, and spectral redundancies in a backward adaptive fashion with extremely low side information. The computational complexity is further reduced by using a caching strategy. We also study the relationship between the operational domain for the coder (wavelet or spatial) and the amount of temporal and spatial redundancy in the sequence being encoded. Experimental results show that the proposed scheme provides significant improvements in compression efficiencies.
Robust object tracking techniques for vision-based 3D motion analysis applications

NASA Astrophysics Data System (ADS)

Knyaz, Vladimir A.; Zheltov, Sergey Y.; Vishnyakov, Boris V.

2016-04-01

Automated and accurate spatial motion capturing of an object is necessary for a wide variety of applications including industry and science, virtual reality and movie, medicine and sports. For the most part of applications a reliability and an accuracy of the data obtained as well as convenience for a user are the main characteristics defining the quality of the motion capture system. Among the existing systems for 3D data acquisition, based on different physical principles (accelerometry, magnetometry, time-of-flight, vision-based), optical motion capture systems have a set of advantages such as high speed of acquisition, potential for high accuracy and automation based on advanced image processing algorithms. For vision-based motion capture accurate and robust object features detecting and tracking through the video sequence are the key elements along with a level of automation of capturing process. So for providing high accuracy of obtained spatial data the developed vision-based motion capture system "Mosca" is based on photogrammetric principles of 3D measurements and supports high speed image acquisition in synchronized mode. It includes from 2 to 4 technical vision cameras for capturing video sequences of object motion. The original camera calibration and external orientation procedures provide the basis for high accuracy of 3D measurements. A set of algorithms as for detecting, identifying and tracking of similar targets, so for marker-less object motion capture is developed and tested. The results of algorithms' evaluation show high robustness and high reliability for various motion analysis tasks in technical and biomechanics applications.
Real-time Enhancement, Registration, and Fusion for an Enhanced Vision System

NASA Technical Reports Server (NTRS)

Hines, Glenn D.; Rahman, Zia-ur; Jobson, Daniel J.; Woodell, Glenn A.

2006-01-01

Over the last few years NASA Langley Research Center (LaRC) has been developing an Enhanced Vision System (EVS) to aid pilots while flying in poor visibility conditions. The EVS captures imagery using two infrared video cameras. The cameras are placed in an enclosure that is mounted and flown forward-looking underneath the NASA LaRC ARIES 757 aircraft. The data streams from the cameras are processed in real-time and displayed on monitors on-board the aircraft. With proper processing the camera system can provide better-than-human-observed imagery particularly during poor visibility conditions. However, to obtain this goal requires several different stages of processing including enhancement, registration, and fusion, and specialized processing hardware for real-time performance. We are using a real-time implementation of the Retinex algorithm for image enhancement, affine transformations for registration, and weighted sums to perform fusion. All of the algorithms are executed on a single TI DM642 digital signal processor (DSP) clocked at 720 MHz. The image processing components were added to the EVS system, tested, and demonstrated during flight tests in August and September of 2005. In this paper we briefly discuss the EVS image processing hardware and algorithms. We then discuss implementation issues and show examples of the results obtained during flight tests.
Toward automating Hammersmith pulled-to-sit examination of infants using feature point based video object tracking.

PubMed

Dogra, Debi P; Majumdar, Arun K; Sural, Shamik; Mukherjee, Jayanta; Mukherjee, Suchandra; Singh, Arun

2012-01-01

Hammersmith Infant Neurological Examination (HINE) is a set of tests used for grading neurological development of infants on a scale of 0 to 3. These tests help in assessing neurophysiological development of babies, especially preterm infants who are born before (the fetus reaches) the gestational age of 36 weeks. Such tests are often conducted in the follow-up clinics of hospitals for grading infants with suspected disabilities. Assessment based on HINE depends on the expertise of the physicians involved in conducting the examinations. It has been noted that some of these tests, especially pulled-to-sit and lateral tilting, are difficult to assess solely based on visual observation. For example, during the pulled-to-sit examination, the examiner needs to observe the relative movement of the head with respect to torso while pulling the infant by holding wrists. The examiner may find it difficult to follow the head movement from the coronal view. Video object tracking based automatic or semi-automatic analysis can be helpful in this case. In this paper, we present a video based method to automate the analysis of pulled-to-sit examination. In this context, a dynamic programming and node pruning based efficient video object tracking algorithm has been proposed. Pulled-to-sit event detection is handled by the proposed tracking algorithm that uses a 2-D geometric model of the scene. The algorithm has been tested with normal as well as marker based videos of the examination recorded at the neuro-development clinic of the SSKM Hospital, Kolkata, India. It is found that the proposed algorithm is capable of estimating the pulled-to-sit score with sensitivity (80%-92%) and specificity (89%-96%).
Implementation of smart phone video plethysmography and dependence on lighting parameters.

PubMed

Fletcher, Richard Ribón; Chamberlain, Daniel; Paggi, Nicholas; Deng, Xinyue

2015-08-01

The remote measurement of heart rate (HR) and heart rate variability (HRV) via a digital camera (video plethysmography) has emerged as an area of great interest for biomedical and health applications. While a few implementations of video plethysmography have been demonstrated on smart phones under controlled lighting conditions, it has been challenging to create a general scalable solution due to the large variability in smart phone hardware performance, software architecture, and the variable response to lighting parameters. In this context, we present a selfcontained smart phone implementation of video plethysmography for Android OS, which employs both stochastic and deterministic algorithms, and we use this to study the effect of lighting parameters (illuminance, color spectrum) on the accuracy of the remote HR measurement. Using two different phone models, we present the median HR error for five different video plethysmography algorithms under three different types of lighting (natural sunlight, compact fluorescent, and halogen incandescent) and variations in brightness. For most algorithms, we found the optimum light brightness to be in the range 1000-4000 lux and the optimum lighting types to be compact fluorescent and natural light. Moderate errors were found for most algorithms with some devices under conditions of low-brightness (<;500 lux) and highbrightness (>4000 lux). Our analysis also identified camera frame rate jitter as a major source of variability and error across different phone models, but this can be largely corrected through non-linear resampling. Based on testing with six human subjects, our real-time Android implementation successfully predicted the measured HR with a median error of -0.31 bpm, and an inter-quartile range of 2.1bpm.
A bioinspired collision detection algorithm for VLSI implementation

NASA Astrophysics Data System (ADS)

Cuadri, J.; Linan, G.; Stafford, R.; Keil, M. S.; Roca, E.

2005-06-01

In this paper a bioinspired algorithm for collision detection is proposed, based on previous models of the locust (Locusta migratoria) visual system reported by F.C. Rind and her group, in the University of Newcastle-upon-Tyne. The algorithm is suitable for VLSI implementation in standard CMOS technologies as a system-on-chip for automotive applications. The working principle of the algorithm is to process a video stream that represents the current scenario, and to fire an alarm whenever an object approaches on a collision course. Moreover, it establishes a scale of warning states, from no danger to collision alarm, depending on the activity detected in the current scenario. In the worst case, the minimum time before collision at which the model fires the collision alarm is 40 msec (1 frame before, at 25 frames per second). Since the average time to successfully fire an airbag system is 2 msec, even in the worst case, this algorithm would be very helpful to more efficiently arm the airbag system, or even take some kind of collision avoidance countermeasures. Furthermore, two additional modules have been included: a "Topological Feature Estimator" and an "Attention Focusing Algorithm". The former takes into account the shape of the approaching object to decide whether it is a person, a road line or a car. This helps to take more adequate countermeasures and to filter false alarms. The latter centres the processing power into the most active zones of the input frame, thus saving memory and processing time resources.
Hybrid digital-analog video transmission in wireless multicast and multiple-input multiple-output system

NASA Astrophysics Data System (ADS)

Liu, Yu; Lin, Xiaocheng; Fan, Nianfei; Zhang, Lin

2016-01-01

Wireless video multicast has become one of the key technologies in wireless applications. But the main challenge of conventional wireless video multicast, i.e., the cliff effect, remains unsolved. To overcome the cliff effect, a hybrid digital-analog (HDA) video transmission framework based on SoftCast, which transmits the digital bitstream with the quantization residuals, is proposed. With an effective power allocation algorithm and appropriate parameter settings, the residual gains can be maximized; meanwhile, the digital bitstream can assure transmission of a basic video to the multicast receiver group. In the multiple-input multiple-output (MIMO) system, since nonuniform noise interference on different antennas can be regarded as the cliff effect problem, ParCast, which is a variation of SoftCast, is also applied to video transmission to solve it. The HDA scheme with corresponding power allocation algorithms is also applied to improve video performance. Simulations show that the proposed HDA scheme can overcome the cliff effect completely with the transmission of residuals. What is more, it outperforms the compared WSVC scheme by more than 2 dB when transmitting under the same bandwidth, and it can further improve performance by nearly 8 dB in MIMO when compared with the ParCast scheme.
Annotations of Mexican bullfighting videos for semantic index

NASA Astrophysics Data System (ADS)

Montoya Obeso, Abraham; Oropesa Morales, Lester Arturo; Fernando Vázquez, Luis; Cocolán Almeda, Sara Ivonne; Stoian, Andrei; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Montiel Perez, Jesús Yalja; de la O Torres, Saul; Ramírez Acosta, Alejandro Alvaro

2015-09-01

The video annotation is important for web indexing and browsing systems. Indeed, in order to evaluate the performance of video query and mining techniques, databases with concept annotations are required. Therefore, it is necessary generate a database with a semantic indexing that represents the digital content of the Mexican bullfighting atmosphere. This paper proposes a scheme to make complex annotations in a video in the frame of multimedia search engine project. Each video is partitioned using our segmentation algorithm that creates shots of different length and different number of frames. In order to make complex annotations about the video, we use ELAN software. The annotations are done in two steps: First, we take note about the whole content in each shot. Second, we describe the actions as parameters of the camera like direction, position and deepness. As a consequence, we obtain a more complete descriptor of every action. In both cases we use the concepts of the TRECVid 2014 dataset. We also propose new concepts. This methodology allows to generate a database with the necessary information to create descriptors and algorithms capable to detect actions to automatically index and classify new bullfighting multimedia content.
Deep linear autoencoder and patch clustering-based unified one-dimensional coding of image and video

NASA Astrophysics Data System (ADS)

Li, Honggui

2017-09-01

This paper proposes a unified one-dimensional (1-D) coding framework of image and video, which depends on deep learning neural network and image patch clustering. First, an improved K-means clustering algorithm for image patches is employed to obtain the compact inputs of deep artificial neural network. Second, for the purpose of best reconstructing original image patches, deep linear autoencoder (DLA), a linear version of the classical deep nonlinear autoencoder, is introduced to achieve the 1-D representation of image blocks. Under the circumstances of 1-D representation, DLA is capable of attaining zero reconstruction error, which is impossible for the classical nonlinear dimensionality reduction methods. Third, a unified 1-D coding infrastructure for image, intraframe, interframe, multiview video, three-dimensional (3-D) video, and multiview 3-D video is built by incorporating different categories of videos into the inputs of patch clustering algorithm. Finally, it is shown in the results of simulation experiments that the proposed methods can simultaneously gain higher compression ratio and peak signal-to-noise ratio than those of the state-of-the-art methods in the situation of low bitrate transmission.

Implementation of an RBF neural network on embedded systems: real-time face tracking and identity verification.

PubMed

Yang, Fan; Paindavoine, M

2003-01-01

This paper describes a real time vision system that allows us to localize faces in video sequences and verify their identity. These processes are image processing techniques based on the radial basis function (RBF) neural network approach. The robustness of this system has been evaluated quantitatively on eight video sequences. We have adapted our model for an application of face recognition using the Olivetti Research Laboratory (ORL), Cambridge, UK, database so as to compare the performance against other systems. We also describe three hardware implementations of our model on embedded systems based on the field programmable gate array (FPGA), zero instruction set computer (ZISC) chips, and digital signal processor (DSP) TMS320C62, respectively. We analyze the algorithm complexity and present results of hardware implementations in terms of the resources used and processing speed. The success rates of face tracking and identity verification are 92% (FPGA), 85% (ZISC), and 98.2% (DSP), respectively. For the three embedded systems, the processing speeds for images size of 288 /spl times/ 352 are 14 images/s, 25 images/s, and 4.8 images/s, respectively.
D Surface Generation from Aerial Thermal Imagery

NASA Astrophysics Data System (ADS)

Khodaei, B.; Samadzadegan, F.; Dadras Javan, F.; Hasani, H.

2015-12-01

Aerial thermal imagery has been recently applied to quantitative analysis of several scenes. For the mapping purpose based on aerial thermal imagery, high accuracy photogrammetric process is necessary. However, due to low geometric resolution and low contrast of thermal imaging sensors, there are some challenges in precise 3D measurement of objects. In this paper the potential of thermal video in 3D surface generation is evaluated. In the pre-processing step, thermal camera is geometrically calibrated using a calibration grid based on emissivity differences between the background and the targets. Then, Digital Surface Model (DSM) generation from thermal video imagery is performed in four steps. Initially, frames are extracted from video, then tie points are generated by Scale-Invariant Feature Transform (SIFT) algorithm. Bundle adjustment is then applied and the camera position and orientation parameters are determined. Finally, multi-resolution dense image matching algorithm is used to create 3D point cloud of the scene. Potential of the proposed method is evaluated based on thermal imaging cover an industrial area. The thermal camera has 640×480 Uncooled Focal Plane Array (UFPA) sensor, equipped with a 25 mm lens which mounted in the Unmanned Aerial Vehicle (UAV). The obtained results show the comparable accuracy of 3D model generated based on thermal images with respect to DSM generated from visible images, however thermal based DSM is somehow smoother with lower level of texture. Comparing the generated DSM with the 9 measured GCPs in the area shows the Root Mean Square Error (RMSE) value is smaller than 5 decimetres in both X and Y directions and 1.6 meters for the Z direction.
Space-time light field rendering.

PubMed

Wang, Huamin; Sun, Mingxuan; Yang, Ruigang

2007-01-01

In this paper, we propose a novel framework called space-time light field rendering, which allows continuous exploration of a dynamic scene in both space and time. Compared to existing light field capture/rendering systems, it offers the capability of using unsynchronized video inputs and the added freedom of controlling the visualization in the temporal domain, such as smooth slow motion and temporal integration. In order to synthesize novel views from any viewpoint at any time instant, we develop a two-stage rendering algorithm. We first interpolate in the temporal domain to generate globally synchronized images using a robust spatial-temporal image registration algorithm followed by edge-preserving image morphing. We then interpolate these software-synchronized images in the spatial domain to synthesize the final view. In addition, we introduce a very accurate and robust algorithm to estimate subframe temporal offsets among input video sequences. Experimental results from unsynchronized videos with or without time stamps show that our approach is capable of maintaining photorealistic quality from a variety of real scenes.
Study of efficient video compression algorithms for space shuttle applications

NASA Technical Reports Server (NTRS)

Poo, Z.

1975-01-01

Results are presented of a study on video data compression techniques applicable to space flight communication. This study is directed towards monochrome (black and white) picture communication with special emphasis on feasibility of hardware implementation. The primary factors for such a communication system in space flight application are: picture quality, system reliability, power comsumption, and hardware weight. In terms of hardware implementation, these are directly related to hardware complexity, effectiveness of the hardware algorithm, immunity of the source code to channel noise, and data transmission rate (or transmission bandwidth). A system is recommended, and its hardware requirement summarized. Simulations of the study were performed on the improved LIM video controller which is computer-controlled by the META-4 CPU.
Real-time handling of existing content sources on a multi-layer display

NASA Astrophysics Data System (ADS)

Singh, Darryl S. K.; Shin, Jung

2013-03-01

A Multi-Layer Display (MLD) consists of two or more imaging planes separated by physical depth where the depth is a key component in creating a glasses-free 3D effect. Its core benefits include being viewable from multiple angles, having full panel resolution for 3D effects with no side effects of nausea or eye-strain. However, typically content must be designed for its optical configuration in foreground and background image pairs. A process was designed to give a consistent 3D effect in a 2-layer MLD from existing stereo video content in real-time. Optimizations to stereo matching algorithms that generate depth maps in real-time were specifically tailored for the optical characteristics and image processing algorithms of a MLD. The end-to-end process included improvements to the Hierarchical Belief Propagation (HBP) stereo matching algorithm, improvements to optical flow and temporal consistency. Imaging algorithms designed for the optical characteristics of a MLD provided some visual compensation for depth map inaccuracies. The result can be demonstrated in a PC environment, displayed on a 22" MLD, used in the casino slot market, with 8mm of panel seperation. Prior to this development, stereo content had not been used to achieve a depth-based 3D effect on a MLD in real-time
Uranus: a rapid prototyping tool for FPGA embedded computer vision

NASA Astrophysics Data System (ADS)

Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.

2007-01-01

The starting point for all successful system development is the simulation. Performing high level simulation of a system can help to identify, insolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms with support to migrate them to an FPGA environment for algorithm acceleration and embedded processes purposes. The tool includes an integrated library of previous coded operators in software and provides the necessary support to read and display image sequences as well as video files. The user can use the previous compiled soft-operators in a high level process chain, and code his own operators. Additional to the prototyping tool, Uranus offers FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC based system. The Uranus environment is intended for rapid prototyping of machine vision and the migration to FPGA accelerator platform, and it is distributed for academic purposes.
ECG R-R peak detection on mobile phones.

PubMed

Sufi, F; Fang, Q; Cosic, I

2007-01-01

Mobile phones have become an integral part of modern life. Due to the ever increasing processing power, mobile phones are rapidly expanding its arena from a sole device of telecommunication to organizer, calculator, gaming device, web browser, music player, audio/video recording device, navigator etc. The processing power of modern mobile phones has been utilized by many innovative purposes. In this paper, we are proposing the utilization of mobile phones for monitoring and analysis of biosignal. The computation performed inside the mobile phone's processor will now be exploited for healthcare delivery. We performed literature review on RR interval detection from ECG and selected few PC based algorithms. Then, three of those existing RR interval detection algorithms were programmed on Java platform. Performance monitoring and comparison studies were carried out on three different mobile devices to determine their application on a realtime telemonitoring scenario.
Area-delay trade-offs of texture decompressors for a graphics processing unit

NASA Astrophysics Data System (ADS)

Novoa Súñer, Emilio; Ituero, Pablo; López-Vallejo, Marisa

2011-05-01

Graphics Processing Units have become a booster for the microelectronics industry. However, due to intellectual property issues, there is a serious lack of information on implementation details of the hardware architecture that is behind GPUs. For instance, the way texture is handled and decompressed in a GPU to reduce bandwidth usage has never been dealt with in depth from a hardware point of view. This work addresses a comparative study on the hardware implementation of different texture decompression algorithms for both conventional (PCs and video game consoles) and mobile platforms. Circuit synthesis is performed targeting both a reconfigurable hardware platform and a 90nm standard cell library. Area-delay trade-offs have been extensively analyzed, which allows us to compare the complexity of decompressors and thus determine suitability of algorithms for systems with limited hardware resources.
Musculoskeletal motion flow fields using hierarchical variable-sized block matching in ultrasonographic video sequences.

PubMed

Revell, J D; Mirmehdi, M; McNally, D S

2004-04-01

We examine tissue deformations using non-invasive dynamic musculoskeletal ultrasonograhy, and quantify its performance on controlled in vitro gold standard (groundtruth) sequences followed by clinical in vivo data. The proposed approach employs a two-dimensional variable-sized block matching algorithm with a hierarchical full search. We extend this process by refining displacements to sub-pixel accuracy. We show by application that this technique yields quantitatively reliable results.
User-oriented summary extraction for soccer video based on multimodal analysis

NASA Astrophysics Data System (ADS)

Liu, Huayong; Jiang, Shanshan; He, Tingting

2011-11-01

An advanced user-oriented summary extraction method for soccer video is proposed in this work. Firstly, an algorithm of user-oriented summary extraction for soccer video is introduced. A novel approach that integrates multimodal analysis, such as extraction and analysis of the stadium features, moving object features, audio features and text features is introduced. By these features the semantic of the soccer video and the highlight mode are obtained. Then we can find the highlight position and put them together by highlight degrees to obtain the video summary. The experimental results for sports video of world cup soccer games indicate that multimodal analysis is effective for soccer video browsing and retrieval.
A novel key-frame extraction approach for both video summary and video index.

PubMed

Lei, Shaoshuai; Xie, Gang; Yan, Gaowei

2014-01-01

Existing key-frame extraction methods are basically video summary oriented; yet the index task of key-frames is ignored. This paper presents a novel key-frame extraction approach which can be available for both video summary and video index. First a dynamic distance separability algorithm is advanced to divide a shot into subshots based on semantic structure, and then appropriate key-frames are extracted in each subshot by SVD decomposition. Finally, three evaluation indicators are proposed to evaluate the performance of the new approach. Experimental results show that the proposed approach achieves good semantic structure for semantics-based video index and meanwhile produces video summary consistent with human perception.
Adapting hierarchical bidirectional inter prediction on a GPU-based platform for 2D and 3D H.264 video coding

NASA Astrophysics Data System (ADS)

Rodríguez-Sánchez, Rafael; Martínez, José Luis; Cock, Jan De; Fernández-Escribano, Gerardo; Pieters, Bart; Sánchez, José L.; Claver, José M.; de Walle, Rik Van

2013-12-01

The H.264/AVC video coding standard introduces some improved tools in order to increase compression efficiency. Moreover, the multi-view extension of H.264/AVC, called H.264/MVC, adopts many of them. Among the new features, variable block-size motion estimation is one which contributes to high coding efficiency. Furthermore, it defines a different prediction structure that includes hierarchical bidirectional pictures, outperforming traditional Group of Pictures patterns in both scenarios: single-view and multi-view. However, these video coding techniques have high computational complexity. Several techniques have been proposed in the literature over the last few years which are aimed at accelerating the inter prediction process, but there are no works focusing on bidirectional prediction or hierarchical prediction. In this article, with the emergence of many-core processors or accelerators, a step forward is taken towards an implementation of an H.264/AVC and H.264/MVC inter prediction algorithm on a graphics processing unit. The results show a negligible rate distortion drop with a time reduction of up to 98% for the complete H.264/AVC encoder.
Model-Based Analysis of Flow-Mediated Dilation and Intima-Media Thickness

PubMed Central

Bartoli, G.; Menegaz, G.; Lisi, M.; Di Stolfo, G.; Dragoni, S.; Gori, T.

2008-01-01

We present an end-to-end system for the automatic measurement of flow-mediated dilation (FMD) and intima-media thickness (IMT) for the assessment of the arterial function. The video sequences are acquired from a B-mode echographic scanner. A spline model (deformable template) is fitted to the data to detect the artery boundaries and track them all along the video sequence. The a priori knowledge about the image features and its content is exploited. Preprocessing is performed to improve both the visual quality of video frames for visual inspection and the performance of the segmentation algorithm without affecting the accuracy of the measurements. The system allows real-time processing as well as a high level of interactivity with the user. This is obtained by a graphical user interface (GUI) enabling the cardiologist to supervise the whole process and to eventually reset the contour extraction at any point in time. The system was validated and the accuracy, reproducibility, and repeatability of the measurements were assessed with extensive in vivo experiments. Jointly with the user friendliness, low cost, and robustness, this makes the system suitable for both research and daily clinical use. PMID:19360110
Detection of Abnormal Events via Optical Flow Feature Analysis

PubMed Central

Wang, Tian; Snoussi, Hichem

2015-01-01

In this paper, a novel algorithm is proposed to detect abnormal events in video streams. The algorithm is based on the histogram of the optical flow orientation descriptor and the classification method. The details of the histogram of the optical flow orientation descriptor are illustrated for describing movement information of the global video frame or foreground frame. By combining one-class support vector machine and kernel principal component analysis methods, the abnormal events in the current frame can be detected after a learning period characterizing normal behaviors. The difference abnormal detection results are analyzed and explained. The proposed detection method is tested on benchmark datasets, then the experimental results show the effectiveness of the algorithm. PMID:25811227
An adaptive DPCM algorithm for predicting contours in NTSC composite video signals

NASA Astrophysics Data System (ADS)

Cox, N. R.

An adaptive DPCM algorithm is proposed for encoding digitized National Television Systems Committee (NTSC) color video signals. This algorithm essentially predicts picture contours in the composite signal without resorting to component separation. The contour parameters (slope thresholds) are optimized using four 'typical' television frames that have been sampled at three times the color subcarrier frequency. Three variations of the basic predictor are simulated and compared quantitatively with three non-adaptive predictors of similar complexity. By incorporating a dual-word-length coder and buffer memory, high quality color pictures can be encoded at 4.0 bits/pel or 42.95 Mbit/s. The effect of channel error propagation is also investigated.
Experimental application of simulation tools for evaluating UAV video change detection

NASA Astrophysics Data System (ADS)

Saur, Günter; Bartelsen, Jan

2015-10-01

Change detection is one of the most important tasks when unmanned aerial vehicles (UAV) are used for video reconnaissance and surveillance. In this paper, we address changes on short time scale, i.e. the observations are taken within time distances of a few hours. Each observation is a short video sequence corresponding to the near-nadir overflight of the UAV above the interesting area and the relevant changes are e.g. recently added or removed objects. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples for non-relevant changes are versatile objects like trees and compression or transmission artifacts. To enable the usage of an automatic change detection within an interactive workflow of an UAV video exploitation system, an evaluation and assessment procedure has to be performed. Large video data sets which contain many relevant objects with varying scene background and altering influence parameters (e.g. image quality, sensor and flight parameters) including image metadata and ground truth data are necessary for a comprehensive evaluation. Since the acquisition of real video data is limited by cost and time constraints, from our point of view, the generation of synthetic data by simulation tools has to be considered. In this paper the processing chain of Saur et al. (2014) [1] and the interactive workflow for video change detection is described. We have selected the commercial simulation environment Virtual Battle Space 3 (VBS3) to generate synthetic data. For an experimental setup, an example scenario "road monitoring" has been defined and several video clips have been produced with varying flight and sensor parameters and varying objects in the scene. Image registration and change mask extraction, both components of the processing chain, are applied to corresponding frames of different video clips. For the selected examples, the images could be registered, the modelled changes could be extracted and the artifacts of the image rendering considered as noise (slight differences of heading angles, disparity of vegetation, 3D parallax) could be suppressed. We conclude that these image data could be considered to be realistic enough to serve as evaluation data for the selected processing components. Future work will extend the evaluation to other influence parameters and may include the human operator for mission planning and sensor control.
A fast bilinear structure from motion algorithm using a video sequence and inertial sensors.

PubMed

Ramachandran, Mahesh; Veeraraghavan, Ashok; Chellappa, Rama

2011-01-01

In this paper, we study the benefits of the availability of a specific form of additional information—the vertical direction (gravity) and the height of the camera, both of which can be conveniently measured using inertial sensors and a monocular video sequence for 3D urban modeling. We show that in the presence of this information, the SfM equations can be rewritten in a bilinear form. This allows us to derive a fast, robust, and scalable SfM algorithm for large scale applications. The SfM algorithm developed in this paper is experimentally demonstrated to have favorable properties compared to the sparse bundle adjustment algorithm. We provide experimental evidence indicating that the proposed algorithm converges in many cases to solutions with lower error than state-of-art implementations of bundle adjustment. We also demonstrate that for the case of large reconstruction problems, the proposed algorithm takes lesser time to reach its solution compared to bundle adjustment. We also present SfM results using our algorithm on the Google StreetView research data set.
Progressive video coding for noisy channels

NASA Astrophysics Data System (ADS)

Kim, Beong-Jo; Xiong, Zixiang; Pearlman, William A.

1998-10-01

We extend the work of Sherwood and Zeger to progressive video coding for noisy channels. By utilizing a 3D extension of the set partitioning in hierarchical trees (SPIHT) algorithm, we cascade the resulting 3D SPIHT video coder with a rate-compatible punctured convolutional channel coder for transmission of video over a binary symmetric channel. Progressive coding is achieved by increasing the target rate of the 3D embedded SPIHT video coder as the channel condition improves. The performance of our proposed coding system is acceptable at low transmission rate and bad channel conditions. Its low complexity makes it suitable for emerging applications such as video over wireless channels.
Missile signal processing common computer architecture for rapid technology upgrade

NASA Astrophysics Data System (ADS)

Rabinkin, Daniel V.; Rutledge, Edward; Monticciolo, Paul

2004-10-01

Interceptor missiles process IR images to locate an intended target and guide the interceptor towards it. Signal processing requirements have increased as the sensor bandwidth increases and interceptors operate against more sophisticated targets. A typical interceptor signal processing chain is comprised of two parts. Front-end video processing operates on all pixels of the image and performs such operations as non-uniformity correction (NUC), image stabilization, frame integration and detection. Back-end target processing, which tracks and classifies targets detected in the image, performs such algorithms as Kalman tracking, spectral feature extraction and target discrimination. In the past, video processing was implemented using ASIC components or FPGAs because computation requirements exceeded the throughput of general-purpose processors. Target processing was performed using hybrid architectures that included ASICs, DSPs and general-purpose processors. The resulting systems tended to be function-specific, and required custom software development. They were developed using non-integrated toolsets and test equipment was developed along with the processor platform. The lifespan of a system utilizing the signal processing platform often spans decades, while the specialized nature of processor hardware and software makes it difficult and costly to upgrade. As a result, the signal processing systems often run on outdated technology, algorithms are difficult to update, and system effectiveness is impaired by the inability to rapidly respond to new threats. A new design approach is made possible three developments; Moore's Law - driven improvement in computational throughput; a newly introduced vector computing capability in general purpose processors; and a modern set of open interface software standards. Today's multiprocessor commercial-off-the-shelf (COTS) platforms have sufficient throughput to support interceptor signal processing requirements. This application may be programmed under existing real-time operating systems using parallel processing software libraries, resulting in highly portable code that can be rapidly migrated to new platforms as processor technology evolves. Use of standardized development tools and 3rd party software upgrades are enabled as well as rapid upgrade of processing components as improved algorithms are developed. The resulting weapon system will have a superior processing capability over a custom approach at the time of deployment as a result of a shorter development cycles and use of newer technology. The signal processing computer may be upgraded over the lifecycle of the weapon system, and can migrate between weapon system variants enabled by modification simplicity. This paper presents a reference design using the new approach that utilizes an Altivec PowerPC parallel COTS platform. It uses a VxWorks-based real-time operating system (RTOS), and application code developed using an efficient parallel vector library (PVL). A quantification of computing requirements and demonstration of interceptor algorithm operating on this real-time platform are provided.
Full image-processing pipeline in field-programmable gate array for a small endoscopic camera

NASA Astrophysics Data System (ADS)

Mostafa, Sheikh Shanawaz; Sousa, L. Natércia; Ferreira, Nuno Fábio; Sousa, Ricardo M.; Santos, Joao; Wäny, Martin; Morgado-Dias, F.

2017-01-01

Endoscopy is an imaging procedure used for diagnosis as well as for some surgical purposes. The camera used for the endoscopy should be small and able to produce a good quality image or video, to reduce discomfort of the patients, and to increase the efficiency of the medical team. To achieve these fundamental goals, a small endoscopy camera with a footprint of 1 mm×1 mm×1.65 mm is used. Due to the physical properties of the sensors and human vision system limitations, different image-processing algorithms, such as noise reduction, demosaicking, and gamma correction, among others, are needed to faithfully reproduce the image or video. A full image-processing pipeline is implemented using a field-programmable gate array (FPGA) to accomplish a high frame rate of 60 fps with minimum processing delay. Along with this, a viewer has also been developed to display and control the image-processing pipeline. The control and data transfer are done by a USB 3.0 end point in the computer. The full developed system achieves real-time processing of the image and fits in a Xilinx Spartan-6LX150 FPGA.

A full field, 3-D velocimeter for microgravity crystallization experiments

NASA Technical Reports Server (NTRS)

Brodkey, Robert S.; Russ, Keith M.

1991-01-01

The programming and algorithms needed for implementing a full-field, 3-D velocimeter for laminar flow systems and the appropriate hardware to fully implement this ultimate system are discussed. It appears that imaging using a synched pair of video cameras and digitizer boards with synched rails for camera motion will provide a viable solution to the laminar tracking problem. The algorithms given here are simple, which should speed processing. On a heavily loaded VAXstation 3100 the particle identification can take 15 to 30 seconds, with the tracking taking less than one second. It seeems reasonable to assume that four image pairs can thus be acquired and analyzed in under one minute.
New fast DCT algorithms based on Loeffler's factorization

NASA Astrophysics Data System (ADS)

Hong, Yoon Mi; Kim, Il-Koo; Lee, Tammy; Cheon, Min-Su; Alshina, Elena; Han, Woo-Jin; Park, Jeong-Hoon

2012-10-01

This paper proposes a new 32-point fast discrete cosine transform (DCT) algorithm based on the Loeffler's 16-point transform. Fast integer realizations of 16-point and 32-point transforms are also provided based on the proposed transform. For the recent development of High Efficiency Video Coding (HEVC), simplified quanti-zation and de-quantization process are proposed. Three different forms of implementation with the essentially same performance, namely matrix multiplication, partial butterfly, and full factorization can be chosen accord-ing to the given platform. In terms of the number of multiplications required for the realization, our proposed full-factorization is 3~4 times faster than a partial butterfly, and about 10 times faster than direct matrix multiplication.
Video conference quality assessment based on cooperative sensing of video and audio

NASA Astrophysics Data System (ADS)

Wang, Junxi; Chen, Jialin; Tian, Xin; Zhou, Cheng; Zhou, Zheng; Ye, Lu

2015-12-01

This paper presents a method to video conference quality assessment, which is based on cooperative sensing of video and audio. In this method, a proposed video quality evaluation method is used to assess the video frame quality. The video frame is divided into noise image and filtered image by the bilateral filters. It is similar to the characteristic of human visual, which could also be seen as a low-pass filtering. The audio frames are evaluated by the PEAQ algorithm. The two results are integrated to evaluate the video conference quality. A video conference database is built to test the performance of the proposed method. It could be found that the objective results correlate well with MOS. Then we can conclude that the proposed method is efficiency in assessing video conference quality.
Extracting foreground ensemble features to detect abnormal crowd behavior in intelligent video-surveillance systems

NASA Astrophysics Data System (ADS)

Chan, Yi-Tung; Wang, Shuenn-Jyi; Tsai, Chung-Hsien

2017-09-01

Public safety is a matter of national security and people's livelihoods. In recent years, intelligent video-surveillance systems have become important active-protection systems. A surveillance system that provides early detection and threat assessment could protect people from crowd-related disasters and ensure public safety. Image processing is commonly used to extract features, e.g., people, from a surveillance video. However, little research has been conducted on the relationship between foreground detection and feature extraction. Most current video-surveillance research has been developed for restricted environments, in which the extracted features are limited by having information from a single foreground; they do not effectively represent the diversity of crowd behavior. This paper presents a general framework based on extracting ensemble features from the foreground of a surveillance video to analyze a crowd. The proposed method can flexibly integrate different foreground-detection technologies to adapt to various monitored environments. Furthermore, the extractable representative features depend on the heterogeneous foreground data. Finally, a classification algorithm is applied to these features to automatically model crowd behavior and distinguish an abnormal event from normal patterns. The experimental results demonstrate that the proposed method's performance is both comparable to that of state-of-the-art methods and satisfies the requirements of real-time applications.
Image and Video Compression with VLSI Neural Networks

NASA Technical Reports Server (NTRS)

Fang, W.; Sheu, B.

1993-01-01

An advanced motion-compensated predictive video compression system based on artificial neural networks has been developed to effectively eliminate the temporal and spatial redundancy of video image sequences and thus reduce the bandwidth and storage required for the transmission and recording of the video signal. The VLSI neuroprocessor for high-speed high-ratio image compression based upon a self-organization network and the conventional algorithm for vector quantization are compared. The proposed method is quite efficient and can achieve near-optimal results.
Human Signatures for Personnel Detection

DTIC Science & Technology

2010-09-14

work\\ TESIS \\prueba.avi’); switch videoinfo.ImageType case ’truecolor’ video=aviread(’C:\\MATLAB\\R2006a\\work\\ TESIS \\prueba.avi’); case...8217indexed’ video=aviread(’C:\\MATLAB\\R2006a\\work\\ TESIS \\prueba.avi’); video=ind2rgb(video); end 4.1.2 Color to Grayscale Converter The conversion from...algorithm. • Maximum errors for both GTSig and the lattice Boltzmann method were confined to the corners where fthe temperature is ill de ined Numerical
Objective grading of facial paralysis using Local Binary Patterns in video processing.

PubMed

He, Shu; Soraghan, John J; O'Reilly, Brian F

2008-01-01

This paper presents a novel framework for objective measurement of facial paralysis in biomedial videos. The motion information in the horizontal and vertical directions and the appearance features on the apex frames are extracted based on the Local Binary Patterns (LBP) on the temporal-spatial domain in each facial region. These features are temporally and spatially enhanced by the application of block schemes. A multi-resolution extension of uniform LBP is proposed to efficiently combine the micro-patterns and large-scale patterns into a feature vector, which increases the algorithmic robustness and reduces noise effects while still retaining computational simplicity. The symmetry of facial movements is measured by the Resistor-Average Distance (RAD) between LBP features extracted from the two sides of the face. Support Vector Machine (SVM) is applied to provide quantitative evaluation of facial paralysis based on the House-Brackmann (H-B) Scale. The proposed method is validated by experiments with 197 subject videos, which demonstrates its accuracy and efficiency.
Video Completion in Digital Stabilization Task Using Pseudo-Panoramic Technique

NASA Astrophysics Data System (ADS)

Favorskaya, M. N.; Buryachenko, V. V.; Zotin, A. G.; Pakhirka, A. I.

2017-05-01

Video completion is a necessary stage after stabilization of a non-stationary video sequence, if it is desirable to make the resolution of the stabilized frames equalled the resolution of the original frames. Usually the cropped stabilized frames lose 10-20% of area that means the worse visibility of the reconstructed scenes. The extension of a view of field may appear due to the pan-tilt-zoom unwanted camera movement. Our approach deals with a preparing of pseudo-panoramic key frame during a stabilization stage as a pre-processing step for the following inpainting. It is based on a multi-layered representation of each frame including the background and objects, moving differently. The proposed algorithm involves four steps, such as the background completion, local motion inpainting, local warping, and seamless blending. Our experiments show that a necessity of a seamless stitching occurs often than a local warping step. Therefore, a seamless blending was investigated in details including four main categories, such as feathering-based, pyramid-based, gradient-based, and optimal seam-based blending.
State of the "art": a taxonomy of artistic stylization techniques for images and video.

PubMed

Kyprianidis, Jan Eric; Collomosse, John; Wang, Tinghuai; Isenberg, Tobias

2013-05-01

This paper surveys the field of nonphotorealistic rendering (NPR), focusing on techniques for transforming 2D input (images and video) into artistically stylized renderings. We first present a taxonomy of the 2D NPR algorithms developed over the past two decades, structured according to the design characteristics and behavior of each technique. We then describe a chronology of development from the semiautomatic paint systems of the early nineties, through to the automated painterly rendering systems of the late nineties driven by image gradient analysis. Two complementary trends in the NPR literature are then addressed, with reference to our taxonomy. First, the fusion of higher level computer vision and NPR, illustrating the trends toward scene analysis to drive artistic abstraction and diversity of style. Second, the evolution of local processing approaches toward edge-aware filtering for real-time stylization of images and video. The survey then concludes with a discussion of open challenges for 2D NPR identified in recent NPR symposia, including topics such as user and aesthetic evaluation.
An integrated multispectral video and environmental monitoring system for the study of coastal processes and the support of beach management operations

NASA Astrophysics Data System (ADS)

Ghionis, George; Trygonis, Vassilis; Karydis, Antonis; Vousdoukas, Michalis; Alexandrakis, George; Drakopoulos, Panos; Amdreadis, Olympos; Psarros, Fotis; Velegrakis, Antonis; Poulos, Serafim

2016-04-01

Effective beach management requires environmental assessments that are based on sound science, are cost-effective and are available to beach users and managers in an accessible, timely and transparent manner. The most common problems are: 1) The available field data are scarce and of sub-optimal spatio-temporal resolution and coverage, 2) our understanding of local beach processes needs to be improved in order to accurately model/forecast beach dynamics under a changing climate, and 3) the information provided by coastal scientists/engineers in the form of data, models and scientific interpretation is often too complicated to be of direct use by coastal managers/decision makers. A multispectral video system has been developed, consisting of one or more video cameras operating in the visible part of the spectrum, a passive near-infrared (NIR) camera, an active NIR camera system, a thermal infrared camera and a spherical video camera, coupled with innovative image processing algorithms and a telemetric system for the monitoring of coastal environmental parameters. The complete system has the capability to record, process and communicate (in quasi-real time) high frequency information on shoreline position, wave breaking zones, wave run-up, erosion hot spots along the shoreline, nearshore wave height, turbidity, underwater visibility, wind speed and direction, air and sea temperature, solar radiation, UV radiation, relative humidity, barometric pressure and rainfall. An innovative, remotely-controlled interactive visual monitoring system, based on the spherical video camera (with 360°field of view), combines the video streams from all cameras and can be used by beach managers to monitor (in real time) beach user numbers, flow activities and safety at beaches of high touristic value. The high resolution near infrared cameras permit 24-hour monitoring of beach processes, while the thermal camera provides information on beach sediment temperature and moisture, can detect upwelling in the nearshore zone, and enhances the safety of beach users. All data can be presented in real- or quasi-real time and are stored for future analysis and training/validation of coastal processes models. Acknowledgements: This work was supported by the project BEACHTOUR (11SYN-8-1466) of the Operational Program "Cooperation 2011, Competitiveness and Entrepreneurship", co-funded by the European Regional Development Fund and the Greek Ministry of Education and Religious Affairs.
Comparison of turbulence mitigation algorithms

NASA Astrophysics Data System (ADS)

Kozacik, Stephen T.; Paolini, Aaron; Sherman, Ariel; Bonnett, James; Kelmelis, Eric

2017-07-01

When capturing imagery over long distances, atmospheric turbulence often degrades the data, especially when observation paths are close to the ground or in hot environments. These issues manifest as time-varying scintillation and warping effects that decrease the effective resolution of the sensor and reduce actionable intelligence. In recent years, several image processing approaches to turbulence mitigation have shown promise. Each of these algorithms has different computational requirements, usability demands, and degrees of independence from camera sensors. They also produce different degrees of enhancement when applied to turbulent imagery. Additionally, some of these algorithms are applicable to real-time operational scenarios while others may only be suitable for postprocessing workflows. EM Photonics has been developing image-processing-based turbulence mitigation technology since 2005. We will compare techniques from the literature with our commercially available, real-time, GPU-accelerated turbulence mitigation software. These comparisons will be made using real (not synthetic), experimentally obtained data for a variety of conditions, including varying optical hardware, imaging range, subjects, and turbulence conditions. Comparison metrics will include image quality, video latency, computational complexity, and potential for real-time operation. Additionally, we will present a technique for quantitatively comparing turbulence mitigation algorithms using real images of radial resolution targets.
Enhanced video indirect ophthalmoscopy (VIO) via robust mosaicing.

PubMed

Estrada, Rolando; Tomasi, Carlo; Cabrera, Michelle T; Wallace, David K; Freedman, Sharon F; Farsiu, Sina

2011-10-01

Indirect ophthalmoscopy (IO) is the standard of care for evaluation of the neonatal retina. When recorded on video from a head-mounted camera, IO images have low quality and narrow Field of View (FOV). We present an image fusion methodology for converting a video IO recording into a single, high quality, wide-FOV mosaic that seamlessly blends the best frames in the video. To this end, we have developed fast and robust algorithms for automatic evaluation of video quality, artifact detection and removal, vessel mapping, registration, and multi-frame image fusion. Our experiments show the effectiveness of the proposed methods.
Algorithm design for automated transportation photo enforcement camera image and video quality diagnostic check modules

NASA Astrophysics Data System (ADS)

Raghavan, Ajay; Saha, Bhaskar

2013-03-01

Photo enforcement devices for traffic rules such as red lights, toll, stops, and speed limits are increasingly being deployed in cities and counties around the world to ensure smooth traffic flow and public safety. These are typically unattended fielded systems, and so it is important to periodically check them for potential image/video quality problems that might interfere with their intended functionality. There is interest in automating such checks to reduce the operational overhead and human error involved in manually checking large camera device fleets. Examples of problems affecting such camera devices include exposure issues, focus drifts, obstructions, misalignment, download errors, and motion blur. Furthermore, in some cases, in addition to the sub-algorithms for individual problems, one also has to carefully design the overall algorithm and logic to check for and accurately classifying these individual problems. Some of these issues can occur in tandem or have the potential to be confused for each other by automated algorithms. Examples include camera misalignment that can cause some scene elements to go out of focus for wide-area scenes or download errors that can be misinterpreted as an obstruction. Therefore, the sequence in which the sub-algorithms are utilized is also important. This paper presents an overview of these problems along with no-reference and reduced reference image and video quality solutions to detect and classify such faults.
ViCoMo: visual context modeling for scene understanding in video surveillance

NASA Astrophysics Data System (ADS)

Creusen, Ivo M.; Javanbakhti, Solmaz; Loomans, Marijn J. H.; Hazelhoff, Lykele B.; Roubtsova, Nadejda; Zinger, Svitlana; de With, Peter H. N.

2013-10-01

The use of contextual information can significantly aid scene understanding of surveillance video. Just detecting people and tracking them does not provide sufficient information to detect situations that require operator attention. We propose a proof-of-concept system that uses several sources of contextual information to improve scene understanding in surveillance video. The focus is on two scenarios that represent common video surveillance situations, parking lot surveillance and crowd monitoring. In the first scenario, a pan-tilt-zoom (PTZ) camera tracking system is developed for parking lot surveillance. Context is provided by the traffic sign recognition system to localize regular and handicapped parking spot signs as well as license plates. The PTZ algorithm has the ability to selectively detect and track persons based on scene context. In the second scenario, a group analysis algorithm is introduced to detect groups of people. Contextual information is provided by traffic sign recognition and region labeling algorithms and exploited for behavior understanding. In both scenarios, decision engines are used to interpret and classify the output of the subsystems and if necessary raise operator alerts. We show that using context information enables the automated analysis of complicated scenarios that were previously not possible using conventional moving object classification techniques.
An efficient fully unsupervised video object segmentation scheme using an adaptive neural-network classifier architecture.

PubMed

Doulamis, A; Doulamis, N; Ntalianis, K; Kollias, S

2003-01-01

In this paper, an unsupervised video object (VO) segmentation and tracking algorithm is proposed based on an adaptable neural-network architecture. The proposed scheme comprises: 1) a VO tracking module and 2) an initial VO estimation module. Object tracking is handled as a classification problem and implemented through an adaptive network classifier, which provides better results compared to conventional motion-based tracking algorithms. Network adaptation is accomplished through an efficient and cost effective weight updating algorithm, providing a minimum degradation of the previous network knowledge and taking into account the current content conditions. A retraining set is constructed and used for this purpose based on initial VO estimation results. Two different scenarios are investigated. The first concerns extraction of human entities in video conferencing applications, while the second exploits depth information to identify generic VOs in stereoscopic video sequences. Human face/ body detection based on Gaussian distributions is accomplished in the first scenario, while segmentation fusion is obtained using color and depth information in the second scenario. A decision mechanism is also incorporated to detect time instances for weight updating. Experimental results and comparisons indicate the good performance of the proposed scheme even in sequences with complicated content (object bending, occlusion).
A description of a system of programs for mathematically processing on unified series (YeS) computers photographic images of the Earth taken from spacecraft

NASA Technical Reports Server (NTRS)

Zolotukhin, V. G.; Kolosov, B. I.; Usikov, D. A.; Borisenko, V. I.; Mosin, S. T.; Gorokhov, V. N.

1980-01-01

A description of a batch of programs for the YeS-1040 computer combined into an automated system for processing photo (and video) images of the Earth's surface, taken from spacecraft, is presented. Individual programs with the detailed discussion of the algorithmic and programmatic facilities needed by the user are presented. The basic principles for assembling the system, and the control programs are included. The exchange format within whose framework the cataloging of any programs recommended for the system of processing will be activated in the future is displayed.
Real-time Enhancement, Registration, and Fusion for a Multi-Sensor Enhanced Vision System

NASA Technical Reports Server (NTRS)

Hines, Glenn D.; Rahman, Zia-ur; Jobson, Daniel J.; Woodell, Glenn A.

2006-01-01

Over the last few years NASA Langley Research Center (LaRC) has been developing an Enhanced Vision System (EVS) to aid pilots while flying in poor visibility conditions. The EVS captures imagery using two infrared video cameras. The cameras are placed in an enclosure that is mounted and flown forward-looking underneath the NASA LaRC ARIES 757 aircraft. The data streams from the cameras are processed in real-time and displayed on monitors on-board the aircraft. With proper processing the camera system can provide better-than- human-observed imagery particularly during poor visibility conditions. However, to obtain this goal requires several different stages of processing including enhancement, registration, and fusion, and specialized processing hardware for real-time performance. We are using a real-time implementation of the Retinex algorithm for image enhancement, affine transformations for registration, and weighted sums to perform fusion. All of the algorithms are executed on a single TI DM642 digital signal processor (DSP) clocked at 720 MHz. The image processing components were added to the EVS system, tested, and demonstrated during flight tests in August and September of 2005. In this paper we briefly discuss the EVS image processing hardware and algorithms. We then discuss implementation issues and show examples of the results obtained during flight tests. Keywords: enhanced vision system, image enhancement, retinex, digital signal processing, sensor fusion
Bandwidth reduction for video-on-demand broadcasting using secondary content insertion

NASA Astrophysics Data System (ADS)

Golynski, Alexander; Lopez-Ortiz, Alejandro; Poirier, Guillaume; Quimper, Claude-Guy

2005-01-01

An optimal broadcasting scheme under the presence of secondary content (i.e. advertisements) is proposed. The proposed scheme works both for movies encoded in a Constant Bit Rate (CBR) or a Variable Bit Rate (VBR) format. It is shown experimentally that secondary content in movies can make Video-on-Demand (VoD) broadcasting systems more efficient. An efficient algorithm is given to compute the optimal broadcasting schedule with secondary content, which in particular significantly improves over the best previously known algorithm for computing the optimal broadcasting schedule without secondary content.
Parallax-Robust Surveillance Video Stitching

PubMed Central

He, Botao; Yu, Shaohua

2015-01-01

This paper presents a parallax-robust video stitching technique for timely synchronized surveillance video. An efficient two-stage video stitching procedure is proposed in this paper to build wide Field-of-View (FOV) videos for surveillance applications. In the stitching model calculation stage, we develop a layered warping algorithm to align the background scenes, which is location-dependent and turned out to be more robust to parallax than the traditional global projective warping methods. On the selective seam updating stage, we propose a change-detection based optimal seam selection approach to avert ghosting and artifacts caused by moving foregrounds. Experimental results demonstrate that our procedure can efficiently stitch multi-view videos into a wide FOV video output without ghosting and noticeable seams. PMID:26712756
Logo detection and classification in a sport video: video indexing for sponsorship revenue control

NASA Astrophysics Data System (ADS)

Kovar, Bohumil; Hanjalic, Alan

2001-12-01

This paper presents a novel approach to detecting and classifying a trademark logo in frames of a sport video. In view of the fact that we attempt to detect and recognize a logo in a natural scene, the algorithm developed in this paper differs from traditional techniques for logo detection and classification that are applicable either to well-structured general text documents (e.g. invoices, memos, bank cheques) or to specialized trademark logo databases, where logos appear isolated on a clear background and where their detection and classification is not disturbed by the surrounding visual detail. Although the development of our algorithm is still in its starting phase, experimental results performed so far on a set of soccer TV broadcasts are very encouraging.

Image processing for drawing recognition

NASA Astrophysics Data System (ADS)

Feyzkhanov, Rustem; Zhelavskaya, Irina

2014-03-01

The task of recognizing edges of rectangular structures is well known. Still, almost all of them work with static images and has no limit on work time. We propose application of conducting homography for the video stream which can be obtained from the webcam. We propose algorithm which can be successfully used for this kind of application. One of the main use cases of such application is recognition of drawings by person on the piece of paper before webcam.
Evaluation of Super Voxel Methods for Early Video Processing (Author’s Manuscript)

DTIC Science & Technology

2012-07-26

supervoxels in space- time [22]. This property embodies many of the basic Gestalt principles—proximity, continuation, closure, and symmetry—and helps...streaming approach. The mean shift algorithm used in our paper is presented by Paris and Durand [29], who introduce Morse theory to interpret mean...maximum 86 fpv. This data set allows us to evaluate the supervoxel methods against human perception . The third data set is from Grundman et al. [15
Physical and cognitive task analysis in interventional radiology.

PubMed

Johnson, S; Healey, A; Evans, J; Murphy, M; Crawshaw, M; Gould, D

2006-01-01

To identify, describe and detail the cognitive thought processes, decision-making, and physical actions involved in the preparation and successful performance of core interventional radiology procedures. Five commonly performed core interventional radiology procedures were selected for cognitive task analysis. Several examples of each procedure being performed by consultant interventional radiologists were videoed. The videos of those procedures, and the steps required for successful outcome, were analysed by a psychologist and an interventional radiologist. Once a skeleton algorithm of the procedures was defined, further refinement was achieved using individual interview techniques with consultant interventional radiologists. Additionally a critique of each iteration of the established algorithm was sought from non-participating independent consultant interventional radiologists. Detailed task descriptions and decision protocols were developed for five interventional radiology procedures (arterial puncture, nephrostomy, venous access, biopsy-using both ultrasound and computed tomography, and percutaneous transhepatic cholangiogram). Identical tasks performed within these procedures were identified and standardized within the protocols. Complex procedures were broken down and their constituent processes identified. This might be suitable for use as a training protocol to provide a universally acceptable safe practice at the most fundamental level. It is envisaged that data collected in this way can be used as an educational resource for trainees and could provide the basis for a training curriculum in interventional radiology. It will direct trainees towards safe practice of the highest standard. It will also provide performance objectives of a simulator model.
Aerial video mosaicking using binary feature tracking

NASA Astrophysics Data System (ADS)

Minnehan, Breton; Savakis, Andreas

2015-05-01

Unmanned Aerial Vehicles are becoming an increasingly attractive platform for many applications, as their cost decreases and their capabilities increase. Creating detailed maps from aerial data requires fast and accurate video mosaicking methods. Traditional mosaicking techniques rely on inter-frame homography estimations that are cascaded through the video sequence. Computationally expensive keypoint matching algorithms are often used to determine the correspondence of keypoints between frames. This paper presents a video mosaicking method that uses an object tracking approach for matching keypoints between frames to improve both efficiency and robustness. The proposed tracking method matches local binary descriptors between frames and leverages the spatial locality of the keypoints to simplify the matching process. Our method is robust to cascaded errors by determining the homography between each frame and the ground plane rather than the prior frame. The frame-to-ground homography is calculated based on the relationship of each point's image coordinates and its estimated location on the ground plane. Robustness to moving objects is integrated into the homography estimation step through detecting anomalies in the motion of keypoints and eliminating the influence of outliers. The resulting mosaics are of high accuracy and can be computed in real time.
Colour based fire detection method with temporal intensity variation filtration

NASA Astrophysics Data System (ADS)

Trambitckii, K.; Anding, K.; Musalimov, V.; Linß, G.

2015-02-01

Development of video, computing technologies and computer vision gives a possibility of automatic fire detection on video information. Under that project different algorithms was implemented to find more efficient way of fire detection. In that article colour based fire detection algorithm is described. But it is not enough to use only colour information to detect fire properly. The main reason of this is that in the shooting conditions may be a lot of things having colour similar to fire. A temporary intensity variation of pixels is used to separate them from the fire. These variations are averaged over the series of several frames. This algorithm shows robust work and was realised as a computer program by using of the OpenCV library.
Measuring exertion time, duty cycle and hand activity level for industrial tasks using computer vision.

PubMed

Akkas, Oguz; Lee, Cheng Hsien; Hu, Yu Hen; Harris Adamson, Carisa; Rempel, David; Radwin, Robert G

2017-12-01

Two computer vision algorithms were developed to automatically estimate exertion time, duty cycle (DC) and hand activity level (HAL) from videos of workers performing 50 industrial tasks. The average DC difference between manual frame-by-frame analysis and the computer vision DC was -5.8% for the Decision Tree (DT) algorithm, and 1.4% for the Feature Vector Training (FVT) algorithm. The average HAL difference was 0.5 for the DT algorithm and 0.3 for the FVT algorithm. A sensitivity analysis, conducted to examine the influence that deviations in DC have on HAL, found it remained unaffected when DC error was less than 5%. Thus, a DC error less than 10% will impact HAL less than 0.5 HAL, which is negligible. Automatic computer vision HAL estimates were therefore comparable to manual frame-by-frame estimates. Practitioner Summary: Computer vision was used to automatically estimate exertion time, duty cycle and hand activity level from videos of workers performing industrial tasks.
An improved KCF tracking algorithm based on multi-feature and multi-scale

NASA Astrophysics Data System (ADS)

Wu, Wei; Wang, Ding; Luo, Xin; Su, Yang; Tian, Weiye

2018-02-01

The purpose of visual tracking is to associate the target object in a continuous video frame. In recent years, the method based on the kernel correlation filter has become the research hotspot. However, the algorithm still has some problems such as video capture equipment fast jitter, tracking scale transformation. In order to improve the ability of scale transformation and feature description, this paper has carried an innovative algorithm based on the multi feature fusion and multi-scale transform. The experimental results show that our method solves the problem that the target model update when is blocked or its scale transforms. The accuracy of the evaluation (OPE) is 77.0%, 75.4% and the success rate is 69.7%, 66.4% on the VOT and OTB datasets. Compared with the optimal one of the existing target-based tracking algorithms, the accuracy of the algorithm is improved by 6.7% and 6.3% respectively. The success rates are improved by 13.7% and 14.2% respectively.
Dense 3D Face Alignment from 2D Video for Real-Time Use

PubMed Central

Jeni, László A.; Cohn, Jeffrey F.; Kanade, Takeo

2018-01-01

To enable real-time, person-independent 3D registration from 2D video, we developed a 3D cascade regression approach in which facial landmarks remain invariant across pose over a range of approximately 60 degrees. From a single 2D image of a person’s face, a dense 3D shape is registered in real time for each frame. The algorithm utilizes a fast cascade regression framework trained on high-resolution 3D face-scans of posed and spontaneous emotion expression. The algorithm first estimates the location of a dense set of landmarks and their visibility, then reconstructs face shapes by fitting a part-based 3D model. Because no assumptions are required about illumination or surface properties, the method can be applied to a wide range of imaging conditions that include 2D video and uncalibrated multi-view video. The method has been validated in a battery of experiments that evaluate its precision of 3D reconstruction, extension to multi-view reconstruction, temporal integration for videos and 3D head-pose estimation. Experimental findings strongly support the validity of real-time, 3D registration and reconstruction from 2D video. The software is available online at http://zface.org. PMID:29731533
Video face recognition against a watch list

NASA Astrophysics Data System (ADS)

Abbas, Jehanzeb; Dagli, Charlie K.; Huang, Thomas S.

2007-10-01

Due to a large increase in the video surveillance data recently in an effort to maintain high security at public places, we need more robust systems to analyze this data and make tasks like face recognition a realistic possibility in challenging environments. In this paper we explore a watch-list scenario where we use an appearance based model to classify query faces from low resolution videos into either a watch-list or a non-watch-list face. We then use our simple yet a powerful face recognition system to recognize the faces classified as watch-list faces. Where the watch-list includes those people that we are interested in recognizing. Our system uses simple feature machine algorithms from our previous work to match video faces against still images. To test our approach, we match video faces against a large database of still images obtained from a previous work in the field from Yahoo News over a period of time. We do this matching in an efficient manner to come up with a faster and nearly real-time system. This system can be incorporated into a larger surveillance system equipped with advanced algorithms involving anomalous event detection and activity recognition. This is a step towards more secure and robust surveillance systems and efficient video data analysis.
Real-time blood flow visualization using the graphics processing unit

NASA Astrophysics Data System (ADS)

Yang, Owen; Cuccia, David; Choi, Bernard

2011-01-01

Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was integrated with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were performed also at ~10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, with fluid flow through an in vitro phantom, and a demonstration of real-time LSI during laser surgery of a port wine stain birthmark.
Real-time blood flow visualization using the graphics processing unit

PubMed Central

Yang, Owen; Cuccia, David; Choi, Bernard

2011-01-01

Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was integrated with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were performed also at ∼10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, with fluid flow through an in vitro phantom, and a demonstration of real-time LSI during laser surgery of a port wine stain birthmark. PMID:21280915
A bio-inspired method and system for visual object-based attention and segmentation

NASA Astrophysics Data System (ADS)

Huber, David J.; Khosla, Deepak

2010-04-01

This paper describes a method and system of human-like attention and object segmentation in visual scenes that (1) attends to regions in a scene in their rank of saliency in the image, (2) extracts the boundary of an attended proto-object based on feature contours, and (3) can be biased to boost the attention paid to specific features in a scene, such as those of a desired target object in static and video imagery. The purpose of the system is to identify regions of a scene of potential importance and extract the region data for processing by an object recognition and classification algorithm. The attention process can be performed in a default, bottom-up manner or a directed, top-down manner which will assign a preference to certain features over others. One can apply this system to any static scene, whether that is a still photograph or imagery captured from video. We employ algorithms that are motivated by findings in neuroscience, psychology, and cognitive science to construct a system that is novel in its modular and stepwise approach to the problems of attention and region extraction, its application of a flooding algorithm to break apart an image into smaller proto-objects based on feature density, and its ability to join smaller regions of similar features into larger proto-objects. This approach allows many complicated operations to be carried out by the system in a very short time, approaching real-time. A researcher can use this system as a robust front-end to a larger system that includes object recognition and scene understanding modules; it is engineered to function over a broad range of situations and can be applied to any scene with minimal tuning from the user.
Infrared small target detection technology based on OpenCV

NASA Astrophysics Data System (ADS)

Liu, Lei; Huang, Zhijian

2013-05-01

Accurate and fast detection of infrared (IR) dim target has very important meaning for infrared precise guidance, early warning, video surveillance, etc. In this paper, some basic principles and the implementing flow charts of a series of algorithms for target detection are described. These algorithms are traditional two-frame difference method, improved three-frame difference method, background estimate and frame difference fusion method, and building background with neighborhood mean method. On the foundation of above works, an infrared target detection software platform which is developed by OpenCV and MFC is introduced. Three kinds of tracking algorithms are integrated in this software. In order to explain the software clearly, the framework and the function are described in this paper. At last, the experiments are performed for some real-life IR images. The whole algorithm implementing processes and results are analyzed, and those algorithms for detection targets are evaluated from the two aspects of subjective and objective. The results prove that the proposed method has satisfying detection effectiveness and robustness. Meanwhile, it has high detection efficiency and can be used for real-time detection.
Infrared small target detection technology based on OpenCV

NASA Astrophysics Data System (ADS)

Liu, Lei; Huang, Zhijian

2013-09-01

Accurate and fast detection of infrared (IR) dim target has very important meaning for infrared precise guidance, early warning, video surveillance, etc. In this paper, some basic principles and the implementing flow charts of a series of algorithms for target detection are described. These algorithms are traditional two-frame difference method, improved three-frame difference method, background estimate and frame difference fusion method, and building background with neighborhood mean method. On the foundation of above works, an infrared target detection software platform which is developed by OpenCV and MFC is introduced. Three kinds of tracking algorithms are integrated in this software. In order to explain the software clearly, the framework and the function are described in this paper. At last, the experiments are performed for some real-life IR images. The whole algorithm implementing processes and results are analyzed, and those algorithms for detection targets are evaluated from the two aspects of subjective and objective. The results prove that the proposed method has satisfying detection effectiveness and robustness. Meanwhile, it has high detection efficiency and can be used for real-time detection.
Investigation of correlation classification techniques

NASA Technical Reports Server (NTRS)

Haskell, R. E.

1975-01-01

A two-step classification algorithm for processing multispectral scanner data was developed and tested. The first step is a single pass clustering algorithm that assigns each pixel, based on its spectral signature, to a particular cluster. The output of that step is a cluster tape in which a single integer is associated with each pixel. The cluster tape is used as the input to the second step, where ground truth information is used to classify each cluster using an iterative method of potentials. Once the clusters have been assigned to classes the cluster tape is read pixel-by-pixel and an output tape is produced in which each pixel is assigned to its proper class. In addition to the digital classification programs, a method of using correlation clustering to process multispectral scanner data in real time by means of an interactive color video display is also described.
Concept-oriented indexing of video databases: toward semantic sensitive retrieval and browsing.

PubMed

Fan, Jianping; Luo, Hangzai; Elmagarmid, Ahmed K

2004-07-01

Digital video now plays an important role in medical education, health care, telemedicine and other medical applications. Several content-based video retrieval (CBVR) systems have been proposed in the past, but they still suffer from the following challenging problems: semantic gap, semantic video concept modeling, semantic video classification, and concept-oriented video database indexing and access. In this paper, we propose a novel framework to make some advances toward the final goal to solve these problems. Specifically, the framework includes: 1) a semantic-sensitive video content representation framework by using principal video shots to enhance the quality of features; 2) semantic video concept interpretation by using flexible mixture model to bridge the semantic gap; 3) a novel semantic video-classifier training framework by integrating feature selection, parameter estimation, and model selection seamlessly in a single algorithm; and 4) a concept-oriented video database organization technique through a certain domain-dependent concept hierarchy to enable semantic-sensitive video retrieval and browsing.
An Optical Flow-Based Full Reference Video Quality Assessment Algorithm.

PubMed

K, Manasa; Channappayya, Sumohana S

2016-06-01

We present a simple yet effective optical flow-based full-reference video quality assessment (FR-VQA) algorithm for assessing the perceptual quality of natural videos. Our algorithm is based on the premise that local optical flow statistics are affected by distortions and the deviation from pristine flow statistics is proportional to the amount of distortion. We characterize the local flow statistics using the mean, the standard deviation, the coefficient of variation (CV), and the minimum eigenvalue ( λ min ) of the local flow patches. Temporal distortion is estimated as the change in the CV of the distorted flow with respect to the reference flow, and the correlation between λ min of the reference and of the distorted patches. We rely on the robust multi-scale structural similarity index for spatial quality estimation. The computed temporal and spatial distortions, thus, are then pooled using a perceptually motivated heuristic to generate a spatio-temporal quality score. The proposed method is shown to be competitive with the state-of-the-art when evaluated on the LIVE SD database, the EPFL Polimi SD database, and the LIVE Mobile HD database. The distortions considered in these databases include those due to compression, packet-loss, wireless channel errors, and rate-adaptation. Our algorithm is flexible enough to allow for any robust FR spatial distortion metric for spatial distortion estimation. In addition, the proposed method is not only parameter-free but also independent of the choice of the optical flow algorithm. Finally, we show that the replacement of the optical flow vectors in our proposed method with the much coarser block motion vectors also results in an acceptable FR-VQA algorithm. Our algorithm is called the flow similarity index.
Dependency of human target detection performance on clutter and quality of supporting image analysis algorithms in a video surveillance task

NASA Astrophysics Data System (ADS)

Huber, Samuel; Dunau, Patrick; Wellig, Peter; Stein, Karin

2017-10-01

Background: In target detection, the success rates depend strongly on human observer performances. Two prior studies tested the contributions of target detection algorithms and prior training sessions. The aim of this Swiss-German cooperation study was to evaluate the dependency of human observer performance on the quality of supporting image analysis algorithms. Methods: The participants were presented 15 different video sequences. Their task was to detect all targets in the shortest possible time. Each video sequence showed a heavily cluttered simulated public area from a different viewing angle. In each video sequence, the number of avatars in the area was altered to 100, 150 and 200 subjects. The number of targets appearing was kept at 10%. The number of marked targets varied from 0, 5, 10, 20 up to 40 marked subjects while keeping the positive predictive value of the detection algorithm at 20%. During the task, workload level was assessed by applying an acoustic secondary task. Detection rates and detection times for the targets were analyzed using inferential statistics. Results: The study found Target Detection Time to increase and Target Detection Rates to decrease with increasing numbers of avatars. The same is true for the Secondary Task Reaction Time while there was no effect on Secondary Task Hit Rate. Furthermore, we found a trend for a u-shaped correlation between the numbers of markings and RTST indicating increased workload. Conclusion: The trial results may indicate useful criteria for the design of training and support of observers in observational tasks.
A robust close-range photogrammetric target extraction algorithm for size and type variant targets

NASA Astrophysics Data System (ADS)

Nyarko, Kofi; Thomas, Clayton; Torres, Gilbert

2016-05-01

The Photo-G program conducted by Naval Air Systems Command at the Atlantic Test Range in Patuxent River, Maryland, uses photogrammetric analysis of large amounts of real-world imagery to characterize the motion of objects in a 3-D scene. Current approaches involve several independent processes including target acquisition, target identification, 2-D tracking of image features, and 3-D kinematic state estimation. Each process has its own inherent complications and corresponding degrees of both human intervention and computational complexity. One approach being explored for automated target acquisition relies on exploiting the pixel intensity distributions of photogrammetric targets, which tend to be patterns with bimodal intensity distributions. The bimodal distribution partitioning algorithm utilizes this distribution to automatically deconstruct a video frame into regions of interest (ROI) that are merged and expanded to target boundaries, from which ROI centroids are extracted to mark target acquisition points. This process has proved to be scale, position and orientation invariant, as well as fairly insensitive to global uniform intensity disparities.
An embedded processor for real-time atmoshperic compensation

NASA Astrophysics Data System (ADS)

Bodnar, Michael R.; Curt, Petersen F.; Ortiz, Fernando E.; Carrano, Carmen J.; Kelmelis, Eric J.

2009-05-01

Imaging over long distances is crucial to a number of defense and security applications, such as homeland security and launch tracking. However, the image quality obtained from current long-range optical systems can be severely degraded by the turbulent atmosphere in the path between the region under observation and the imager. While this obscured image information can be recovered using post-processing techniques, the computational complexity of such approaches has prohibited deployment in real-time scenarios. To overcome this limitation, we have coupled a state-of-the-art atmospheric compensation algorithm, the average-bispectrum speckle method, with a powerful FPGA-based embedded processing board. The end result is a light-weight, lower-power image processing system that improves the quality of long-range imagery in real-time, and uses modular video I/O to provide a flexible interface to most common digital and analog video transport methods. By leveraging the custom, reconfigurable nature of the FPGA, a 20x speed increase over a modern desktop PC was achieved in a form-factor that is compact, low-power, and field-deployable.

Discriminability limits in spatio-temporal stereo block matching.

PubMed

Jain, Ankit K; Nguyen, Truong Q

2014-05-01

Disparity estimation is a fundamental task in stereo imaging and is a well-studied problem. Recently, methods have been adapted to the video domain where motion is used as a matching criterion to help disambiguate spatially similar candidates. In this paper, we analyze the validity of the underlying assumptions of spatio-temporal disparity estimation, and determine the extent to which motion aids the matching process. By analyzing the error signal for spatio-temporal block matching under the sum of squared differences criterion and treating motion as a stochastic process, we determine the probability of a false match as a function of image features, motion distribution, image noise, and number of frames in the spatio-temporal patch. This performance quantification provides insight into when spatio-temporal matching is most beneficial in terms of the scene and motion, and can be used as a guide to select parameters for stereo matching algorithms. We validate our results through simulation and experiments on stereo video.
Real-Time Control of a Video Game Using Eye Movements and Two Temporal EEG Sensors.

PubMed

Belkacem, Abdelkader Nasreddine; Saetia, Supat; Zintus-art, Kalanyu; Shin, Duk; Kambara, Hiroyuki; Yoshimura, Natsue; Berrached, Nasreddine; Koike, Yasuharu

2015-01-01

EEG-controlled gaming applications range widely from strictly medical to completely nonmedical applications. Games can provide not only entertainment but also strong motivation for practicing, thereby achieving better control with rehabilitation system. In this paper we present real-time control of video game with eye movements for asynchronous and noninvasive communication system using two temporal EEG sensors. We used wavelets to detect the instance of eye movement and time-series characteristics to distinguish between six classes of eye movement. A control interface was developed to test the proposed algorithm in real-time experiments with opened and closed eyes. Using visual feedback, a mean classification accuracy of 77.3% was obtained for control with six commands. And a mean classification accuracy of 80.2% was obtained using auditory feedback for control with five commands. The algorithm was then applied for controlling direction and speed of character movement in two-dimensional video game. Results showed that the proposed algorithm had an efficient response speed and timing with a bit rate of 30 bits/min, demonstrating its efficacy and robustness in real-time control.
Real-Time Control of a Video Game Using Eye Movements and Two Temporal EEG Sensors

PubMed Central

Saetia, Supat; Zintus-art, Kalanyu; Shin, Duk; Kambara, Hiroyuki; Yoshimura, Natsue; Berrached, Nasreddine; Koike, Yasuharu

2015-01-01

EEG-controlled gaming applications range widely from strictly medical to completely nonmedical applications. Games can provide not only entertainment but also strong motivation for practicing, thereby achieving better control with rehabilitation system. In this paper we present real-time control of video game with eye movements for asynchronous and noninvasive communication system using two temporal EEG sensors. We used wavelets to detect the instance of eye movement and time-series characteristics to distinguish between six classes of eye movement. A control interface was developed to test the proposed algorithm in real-time experiments with opened and closed eyes. Using visual feedback, a mean classification accuracy of 77.3% was obtained for control with six commands. And a mean classification accuracy of 80.2% was obtained using auditory feedback for control with five commands. The algorithm was then applied for controlling direction and speed of character movement in two-dimensional video game. Results showed that the proposed algorithm had an efficient response speed and timing with a bit rate of 30 bits/min, demonstrating its efficacy and robustness in real-time control. PMID:26690500
Innovative Video Diagnostic Equipment for Material Science

NASA Technical Reports Server (NTRS)

Capuano, G.; Titomanlio, D.; Soellner, W.; Seidel, A.

2012-01-01

Materials science experiments under microgravity increasingly rely on advanced optical systems to determine the physical properties of the samples under investigation. This includes video systems with high spatial and temporal resolution. The acquisition, handling, storage and transmission to ground of the resulting video data are very challenging. Since the available downlink data rate is limited, the capability to compress the video data significantly without compromising the data quality is essential. We report on the development of a Digital Video System (DVS) for EML (Electro Magnetic Levitator) which provides real-time video acquisition, high compression using advanced Wavelet algorithms, storage and transmission of a continuous flow of video with different characteristics in terms of image dimensions and frame rates. The DVS is able to operate with the latest generation of high-performance cameras acquiring high resolution video images up to 4Mpixels@60 fps or high frame rate video images up to about 1000 fps@512x512pixels.
Video-based heart rate monitoring across a range of skin pigmentations during an acute hypoxic challenge.

PubMed

Addison, Paul S; Jacquel, Dominique; Foo, David M H; Borg, Ulf R

2017-11-09

The robust monitoring of heart rate from the video-photoplethysmogram (video-PPG) during challenging conditions requires new analysis techniques. The work reported here extends current research in this area by applying a motion tolerant algorithm to extract high quality video-PPGs from a cohort of subjects undergoing marked heart rate changes during a hypoxic challenge, and exhibiting a full range of skin pigmentation types. High uptimes in reported video-based heart rate (HR vid ) were targeted, while retaining high accuracy in the results. Ten healthy volunteers were studied during a double desaturation hypoxic challenge. Video-PPGs were generated from the acquired video image stream and processed to generate heart rate. HR vid was compared to the pulse rate posted by a reference pulse oximeter device (HR p ). Agreement between video-based heart rate and that provided by the pulse oximeter was as follows: Bias = - 0.21 bpm, RMSD = 2.15 bpm, least squares fit gradient = 1.00 (Pearson R = 0.99, p < 0.0001), with a 98.78% reporting uptime. The difference between the HR vid and HR p exceeded 5 and 10 bpm, for 3.59 and 0.35% of the reporting time respectively, and at no point did these differences exceed 25 bpm. Excellent agreement was found between the HR vid and HR p in a study covering the whole range of skin pigmentation types (Fitzpatrick scales I-VI), using standard room lighting and with moderate subject motion. Although promising, further work should include a larger cohort with multiple subjects per Fitzpatrick class combined with a more rigorous motion and lighting protocol.
a Sensor Aided H.264/AVC Video Encoder for Aerial Video Sequences with in the Loop Metadata Correction

NASA Astrophysics Data System (ADS)

Cicala, L.; Angelino, C. V.; Ruatta, G.; Baccaglini, E.; Raimondo, N.

2015-08-01

Unmanned Aerial Vehicles (UAVs) are often employed to collect high resolution images in order to perform image mosaicking and/or 3D reconstruction. Images are usually stored on board and then processed with on-ground desktop software. In such a way the computational load, and hence the power consumption, is moved on ground, leaving on board only the task of storing data. Such an approach is important in the case of small multi-rotorcraft UAVs because of their low endurance due to the short battery life. Images can be stored on board with either still image or video data compression. Still image system are preferred when low frame rates are involved, because video coding systems are based on motion estimation and compensation algorithms which fail when the motion vectors are significantly long and when the overlapping between subsequent frames is very small. In this scenario, UAVs attitude and position metadata from the Inertial Navigation System (INS) can be employed to estimate global motion parameters without video analysis. A low complexity image analysis can be still performed in order to refine the motion field estimated using only the metadata. In this work, we propose to use this refinement step in order to improve the position and attitude estimation produced by the navigation system in order to maximize the encoder performance. Experiments are performed on both simulated and real world video sequences.
Tools for Protecting the Privacy of Specific Individuals in Video

NASA Astrophysics Data System (ADS)

Chen, Datong; Chang, Yi; Yan, Rong; Yang, Jie

2007-12-01

This paper presents a system for protecting the privacy of specific individuals in video recordings. We address the following two problems: automatic people identification with limited labeled data, and human body obscuring with preserved structure and motion information. In order to address the first problem, we propose a new discriminative learning algorithm to improve people identification accuracy using limited training data labeled from the original video and imperfect pairwise constraints labeled from face obscured video data. We employ a robust face detection and tracking algorithm to obscure human faces in the video. Our experiments in a nursing home environment show that the system can obtain a high accuracy of people identification using limited labeled data and noisy pairwise constraints. The study result indicates that human subjects can perform reasonably well in labeling pairwise constraints with the face masked data. For the second problem, we propose a novel method of body obscuring, which removes the appearance information of the people while preserving rich structure and motion information. The proposed approach provides a way to minimize the risk of exposing the identities of the protected people while maximizing the use of the captured data for activity/behavior analysis.
Improving Video Based Heart Rate Monitoring.

PubMed

Lin, Jian; Rozado, David; Duenser, Andreas

2015-01-01

Non-contact measurements of cardiac pulse can provide robust measurement of heart rate (HR) without the annoyance of attaching electrodes to the body. In this paper we explore a novel and reliable method to carry out video-based HR estimation and propose various performance improvement over existing approaches. The investigated method uses Independent Component Analysis (ICA) to detect the underlying HR signal from a mixed source signal present in the RGB channels of the image. The original ICA algorithm was implemented and several modifications were explored in order to determine which one could be optimal for accurate HR estimation. Using statistical analysis, we compared the cardiac pulse rate estimation from the different methods under comparison on the extracted videos to a commercially available oximeter. We found that some of these methods are quite effective and efficient in terms of improving accuracy and latency of the system. We have made the code of our algorithms openly available to the scientific community so that other researchers can explore how to integrate video-based HR monitoring in novel health technology applications. We conclude by noting that recent advances in video-based HR monitoring permit computers to be aware of a user's psychophysiological status in real time.
Constructing spherical panoramas of a bladder phantom from endoscopic video using bundle adjustment

NASA Astrophysics Data System (ADS)

Soper, Timothy D.; Chandler, John E.; Porter, Michael P.; Seibel, Eric J.

2011-03-01

The high recurrence rate of bladder cancer requires patients to undergo frequent surveillance screenings over their lifetime following initial diagnosis and resection. Our laboratory is developing panoramic stitching software that would compile several minutes of cystoscopic video into a single panoramic image, covering the entire bladder, for review by an urolgist at a later time or remote location. Global alignment of video frames is achieved by using a bundle adjuster that simultaneously recovers both the 3D structure of the bladder as well as the scope motion using only the video frames as input. The result of the algorithm is a complete 360° spherical panorama of the outer surface. The details of the software algorithms are presented here along with results from both a virtual cystoscopy as well from real endoscopic imaging of a bladder phantom. The software successfully stitched several hundred video frames into a single panoramic with subpixel accuracy and with no knowledge of the intrinsic camera properties, such as focal length and radial distortion. In the discussion, we outline future work in development of the software as well as identifying factors pertinent to clinical translation of this technology.
Novel memory architecture for video signal processor

NASA Astrophysics Data System (ADS)

Hung, Jen-Sheng; Lin, Chia-Hsing; Jen, Chein-Wei

1993-11-01

An on-chip memory architecture for video signal processor (VSP) is proposed. This memory structure is a two-level design for the different data locality in video applications. The upper level--Memory A provides enough storage capacity to reduce the impact on the limitation of chip I/O bandwidth, and the lower level--Memory B provides enough data parallelism and flexibility to meet the requirements of multiple reconfigurable pipeline function units in a single VSP chip. The needed memory size is decided by the memory usage analysis for video algorithms and the number of function units. Both levels of memory adopted a dual-port memory scheme to sustain the simultaneous read and write operations. Especially, Memory B uses multiple one-read-one-write memory banks to emulate the real multiport memory. Therefore, one can change the configuration of Memory B to several sets of memories with variable read/write ports by adjusting the bus switches. Then the numbers of read ports and write ports in proposed memory can meet requirement of data flow patterns in different video coding algorithms. We have finished the design of a prototype memory design using 1.2- micrometers SPDM SRAM technology and will fabricated it through TSMC, in Taiwan.
Efficient temporal and interlayer parameter prediction for weighted prediction in scalable high efficiency video coding

NASA Astrophysics Data System (ADS)

Tsang, Sik-Ho; Chan, Yui-Lam; Siu, Wan-Chi

2017-01-01

Weighted prediction (WP) is an efficient video coding tool that was introduced since the establishment of the H.264/AVC video coding standard, for compensating the temporal illumination change in motion estimation and compensation. WP parameters, including a multiplicative weight and an additive offset for each reference frame, are required to be estimated and transmitted to the decoder by slice header. These parameters cause extra bits in the coded video bitstream. High efficiency video coding (HEVC) provides WP parameter prediction to reduce the overhead. Therefore, WP parameter prediction is crucial to research works or applications, which are related to WP. Prior art has been suggested to further improve the WP parameter prediction by implicit prediction of image characteristics and derivation of parameters. By exploiting both temporal and interlayer redundancies, we propose three WP parameter prediction algorithms, enhanced implicit WP parameter, enhanced direct WP parameter derivation, and interlayer WP parameter, to further improve the coding efficiency of HEVC. Results show that our proposed algorithms can achieve up to 5.83% and 5.23% bitrate reduction compared to the conventional scalable HEVC in the base layer for SNR scalability and 2× spatial scalability, respectively.
Traffic Video Image Segmentation Model Based on Bayesian and Spatio-Temporal Markov Random Field

NASA Astrophysics Data System (ADS)

Zhou, Jun; Bao, Xu; Li, Dawei; Yin, Yongwen

2017-10-01

Traffic video image is a kind of dynamic image and its background and foreground is changed at any time, which results in the occlusion. In this case, using the general method is more difficult to get accurate image segmentation. A segmentation algorithm based on Bayesian and Spatio-Temporal Markov Random Field is put forward, which respectively build the energy function model of observation field and label field to motion sequence image with Markov property, then according to Bayesian' rule, use the interaction of label field and observation field, that is the relationship of label field’s prior probability and observation field’s likelihood probability, get the maximum posterior probability of label field’s estimation parameter, use the ICM model to extract the motion object, consequently the process of segmentation is finished. Finally, the segmentation methods of ST - MRF and the Bayesian combined with ST - MRF were analyzed. Experimental results: the segmentation time in Bayesian combined with ST-MRF algorithm is shorter than in ST-MRF, and the computing workload is small, especially in the heavy traffic dynamic scenes the method also can achieve better segmentation effect.
Baseline Face Detection, Head Pose Estimation, and Coarse Direction Detection for Facial Data in the SHRP2 Naturalistic Driving Study

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paone, Jeffrey R; Bolme, David S; Ferrell, Regina Kay

Keeping a driver focused on the road is one of the most critical steps in insuring the safe operation of a vehicle. The Strategic Highway Research Program 2 (SHRP2) has over 3,100 recorded videos of volunteer drivers during a period of 2 years. This extensive naturalistic driving study (NDS) contains over one million hours of video and associated data that could aid safety researchers in understanding where the driver s attention is focused. Manual analysis of this data is infeasible, therefore efforts are underway to develop automated feature extraction algorithms to process and characterize the data. The real-world nature, volume,more » and acquisition conditions are unmatched in the transportation community, but there are also challenges because the data has relatively low resolution, high compression rates, and differing illumination conditions. A smaller dataset, the head pose validation study, is available which used the same recording equipment as SHRP2 but is more easily accessible with less privacy constraints. In this work we report initial head pose accuracy using commercial and open source face pose estimation algorithms on the head pose validation data set.« less
A clinically viable capsule endoscopy video analysis platform for automatic bleeding detection

NASA Astrophysics Data System (ADS)

Yi, Steven; Jiao, Heng; Xie, Jean; Mui, Peter; Leighton, Jonathan A.; Pasha, Shabana; Rentz, Lauri; Abedi, Mahmood

2013-02-01

In this paper, we present a novel and clinically valuable software platform for automatic bleeding detection on gastrointestinal (GI) tract from Capsule Endoscopy (CE) videos. Typical CE videos for GI tract run about 8 hours and are manually reviewed by physicians to locate diseases such as bleedings and polyps. As a result, the process is time consuming and is prone to disease miss-finding. While researchers have made efforts to automate this process, however, no clinically acceptable software is available on the marketplace today. Working with our collaborators, we have developed a clinically viable software platform called GISentinel for fully automated GI tract bleeding detection and classification. Major functional modules of the SW include: the innovative graph based NCut segmentation algorithm, the unique feature selection and validation method (e.g. illumination invariant features, color independent features, and symmetrical texture features), and the cascade SVM classification for handling various GI tract scenes (e.g. normal tissue, food particles, bubbles, fluid, and specular reflection). Initial evaluation results on the SW have shown zero bleeding instance miss-finding rate and 4.03% false alarm rate. This work is part of our innovative 2D/3D based GI tract disease detection software platform. While the overall SW framework is designed for intelligent finding and classification of major GI tract diseases such as bleeding, ulcer, and polyp from the CE videos, this paper will focus on the automatic bleeding detection functional module.
Task-oriented situation recognition

NASA Astrophysics Data System (ADS)

Bauer, Alexander; Fischer, Yvonne

2010-04-01

From the advances in computer vision methods for the detection, tracking and recognition of objects in video streams, new opportunities for video surveillance arise: In the future, automated video surveillance systems will be able to detect critical situations early enough to enable an operator to take preventive actions, instead of using video material merely for forensic investigations. However, problems such as limited computational resources, privacy regulations and a constant change in potential threads have to be addressed by a practical automated video surveillance system. In this paper, we show how these problems can be addressed using a task-oriented approach. The system architecture of the task-oriented video surveillance system NEST and an algorithm for the detection of abnormal behavior as part of the system are presented and illustrated for the surveillance of guests inside a video-monitored building.
Dynamic image fusion and general observer preference

NASA Astrophysics Data System (ADS)

Burks, Stephen D.; Doe, Joshua M.

2010-04-01

Recent developments in image fusion give the user community many options for ways of presenting the imagery to an end-user. Individuals at the US Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate have developed an electronic system that allows users to quickly and efficiently determine optimal image fusion algorithms and color parameters based upon collected imagery and videos from environments that are typical to observers in a military environment. After performing multiple multi-band data collections in a variety of military-like scenarios, different waveband, fusion algorithm, image post-processing, and color choices are presented to observers as an output of the fusion system. The observer preferences can give guidelines as to how specific scenarios should affect the presentation of fused imagery.
D Reconstruction with a Collaborative Approach Based on Smartphones and a Cloud-Based Server

NASA Astrophysics Data System (ADS)

Nocerino, E.; Poiesi, F.; Locher, A.; Tefera, Y. T.; Remondino, F.; Chippendale, P.; Van Gool, L.

2017-11-01

The paper presents a collaborative image-based 3D reconstruction pipeline to perform image acquisition with a smartphone and geometric 3D reconstruction on a server during concurrent or disjoint acquisition sessions. Images are selected from the video feed of the smartphone's camera based on their quality and novelty. The smartphone's app provides on-the-fly reconstruction feedback to users co-involved in the acquisitions. The server is composed of an incremental SfM algorithm that processes the received images by seamlessly merging them into a single sparse point cloud using bundle adjustment. Dense image matching algorithm can be lunched to derive denser point clouds. The reconstruction details, experiments and performance evaluation are presented and discussed.
What Makes a Message Stick? The Role of Content and Context in Social Media Epidemics

DTIC Science & Technology

2013-09-23

First, we propose visual memes , or frequently re-posted short video segments, for detecting and monitoring latent video interactions at scale. Content...interactions (such as quoting, or remixing, parts of a video). Visual memes are extracted by scalable detection algorithms that we develop, with...high accuracy. We further augment visual memes with text, via a statistical model of latent topics. We model content interactions on YouTube with
Single-Scale Retinex Using Digital Signal Processors

NASA Technical Reports Server (NTRS)

Hines, Glenn; Rahman, Zia-Ur; Jobson, Daniel; Woodell, Glenn

2005-01-01

The Retinex is an image enhancement algorithm that improves the brightness, contrast and sharpness of an image. It performs a non-linear spatial/spectral transform that provides simultaneous dynamic range compression and color constancy. It has been used for a wide variety of applications ranging from aviation safety to general purpose photography. Many potential applications require the use of Retinex processing at video frame rates. This is difficult to achieve with general purpose processors because the algorithm contains a large number of complex computations and data transfers. In addition, many of these applications also constrain the potential architectures to embedded processors to save power, weight and cost. Thus we have focused on digital signal processors (DSPs) and field programmable gate arrays (FPGAs) as potential solutions for real-time Retinex processing. In previous efforts we attained a 21 (full) frame per second (fps) processing rate for the single-scale monochromatic Retinex with a TMS320C6711 DSP operating at 150 MHz. This was achieved after several significant code improvements and optimizations. Since then we have migrated our design to the slightly more powerful TMS320C6713 DSP and the fixed point TMS320DM642 DSP. In this paper we briefly discuss the Retinex algorithm, the performance of the algorithm executing on the TMS320C6713 and the TMS320DM642, and compare the results with the TMS320C6711.
An algorithm for the detection and characterisation of volcanic plumes using thermal camera imagery

NASA Astrophysics Data System (ADS)

Bombrun, Maxime; Jessop, David; Harris, Andrew; Barra, Vincent

2018-02-01

Volcanic plumes are turbulent mixtures of particles and gas which are injected into the atmosphere during a volcanic eruption. Depending on the intensity of the eruption, plumes can rise from a few tens of metres up to many tens of kilometres above the vent and thus, present a major hazard for the surrounding population. Currently, however, few if any algorithms are available for automated plume tracking and assessment. Here, we present a new image processing algorithm for segmentation, tracking and parameters extraction of convective plume recorded with thermal cameras. We used thermal video of two volcanic eruptions and two plumes simulated in laboratory to develop and test an efficient technique for analysis of volcanic plumes. We validated our method by two different approaches. First, we compare our segmentation method to previously published algorithms. Next, we computed plume parameters, such as height, width and spreading angle at regular intervals of time. These parameters allowed us to calculate an entrainment coefficient and obtain information about the entrainment efficiency in Strombolian eruptions. Our proposed algorithm is rapid, automated while producing better visual outlines compared to the other segmentation algorithms, and provides output that is at least as accurate as manual measurements of plumes.

Robust digital image inpainting algorithm in the wireless environment

NASA Astrophysics Data System (ADS)

Karapetyan, G.; Sarukhanyan, H. G.; Agaian, S. S.

2014-05-01

Image or video inpainting is the process/art of retrieving missing portions of an image without introducing undesirable artifacts that are undetectable by an ordinary observer. An image/video can be damaged due to a variety of factors, such as deterioration due to scratches, laser dazzling effects, wear and tear, dust spots, loss of data when transmitted through a channel, etc. Applications of inpainting include image restoration (removing laser dazzling effects, dust spots, date, text, time, etc.), image synthesis (texture synthesis), completing panoramas, image coding, wireless transmission (recovery of the missing blocks), digital culture protection, image de-noising, fingerprint recognition, and film special effects and production. Most inpainting methods can be classified in two key groups: global and local methods. Global methods are used for generating large image regions from samples while local methods are used for filling in small image gaps. Each method has its own advantages and limitations. For example, the global inpainting methods perform well on textured image retrieval, whereas the classical local methods perform poorly. In addition, some of the techniques are computationally intensive; exceeding the capabilities of most currently used mobile devices. In general, the inpainting algorithms are not suitable for the wireless environment. This paper presents a new and efficient scheme that combines the advantages of both local and global methods into a single algorithm. Particularly, it introduces a blind inpainting model to solve the above problems by adaptively selecting support area for the inpainting scheme. The proposed method is applied to various challenging image restoration tasks, including recovering old photos, recovering missing data on real and synthetic images, and recovering the specular reflections in endoscopic images. A number of computer simulations demonstrate the effectiveness of our scheme and also illustrate the main properties and implementation steps of the presented algorithm. Furthermore, the simulation results show that the presented method is among the state-of-the-art and compares favorably against many available methods in the wireless environment. Robustness in the wireless environment with respect to the shape of the manually selected "marked" region is also illustrated. Currently, we are working on the expansion of this work to video and 3-D data.
Analysis and segmentation of images in case of solving problems of detecting and tracing objects on real-time video

NASA Astrophysics Data System (ADS)

Ezhova, Kseniia; Fedorenko, Dmitriy; Chuhlamov, Anton

2016-04-01

The article deals with the methods of image segmentation based on color space conversion, and allow the most efficient way to carry out the detection of a single color in a complex background and lighting, as well as detection of objects on a homogeneous background. The results of the analysis of segmentation algorithms of this type, the possibility of their implementation for creating software. The implemented algorithm is very time-consuming counting, making it a limited application for the analysis of the video, however, it allows us to solve the problem of analysis of objects in the image if there is no dictionary of images and knowledge bases, as well as the problem of choosing the optimal parameters of the frame quantization for video analysis.
Large-Scale Simulation Network Design Study

DTIC Science & Technology

1983-10-01

video displays: three for the tank commander, three for the driver, one for the loader, and one for the gunner. The solid angles subtended by these...Newman Inc Range Sortr This process sorts the expanded display lists into range order for drawing according to the "painter’s algorithm’" The range sorter ...session could then be continued as soon as the network recovered. and the elapsed session time would not be wasted . The SimNet design is much more tolerant
A Fast MEANSHIFT Algorithm-Based Target Tracking System

PubMed Central

Sun, Jian

2012-01-01

Tracking moving targets in complex scenes using an active video camera is a challenging task. Tracking accuracy and efficiency are two key yet generally incompatible aspects of a Target Tracking System (TTS). A compromise scheme will be studied in this paper. A fast mean-shift-based Target Tracking scheme is designed and realized, which is robust to partial occlusion and changes in object appearance. The physical simulation shows that the image signal processing speed is >50 frame/s. PMID:22969397
Robotic Vision-Based Localization in an Urban Environment

NASA Technical Reports Server (NTRS)

Mchenry, Michael; Cheng, Yang; Matthies

2007-01-01

A system of electronic hardware and software, now undergoing development, automatically estimates the location of a robotic land vehicle in an urban environment using a somewhat imprecise map, which has been generated in advance from aerial imagery. This system does not utilize the Global Positioning System and does not include any odometry, inertial measurement units, or any other sensors except a stereoscopic pair of black-and-white digital video cameras mounted on the vehicle. Of course, the system also includes a computer running software that processes the video image data. The software consists mostly of three components corresponding to the three major image-data-processing functions: Visual Odometry This component automatically tracks point features in the imagery and computes the relative motion of the cameras between sequential image frames. This component incorporates a modified version of a visual-odometry algorithm originally published in 1989. The algorithm selects point features, performs multiresolution area-correlation computations to match the features in stereoscopic images, tracks the features through the sequence of images, and uses the tracking results to estimate the six-degree-of-freedom motion of the camera between consecutive stereoscopic pairs of images (see figure). Urban Feature Detection and Ranging Using the same data as those processed by the visual-odometry component, this component strives to determine the three-dimensional (3D) coordinates of vertical and horizontal lines that are likely to be parts of, or close to, the exterior surfaces of buildings. The basic sequence of processes performed by this component is the following: 1. An edge-detection algorithm is applied, yielding a set of linked lists of edge pixels, a horizontal-gradient image, and a vertical-gradient image. 2. Straight-line segments of edges are extracted from the linked lists generated in step 1. Any straight-line segments longer than an arbitrary threshold (e.g., 30 pixels) are assumed to belong to buildings or other artificial objects. 3. A gradient-filter algorithm is used to test straight-line segments longer than the threshold to determine whether they represent edges of natural or artificial objects. In somewhat oversimplified terms, the test is based on the assumption that the gradient of image intensity varies little along a segment that represents the edge of an artificial object.
An adaptive DPCM encoder for NTSC composite video signals

NASA Astrophysics Data System (ADS)

Cox, N. R.

An adaptive DPCM algorithm is proposed for encoding digitized National Television Systems Committee (NTSC) color video signals. This algorithm essentially predicts picture contours in the composite signal without resorting to component separation. Preliminary subjective and objective tests performed on an experimental encoder/simulator indicate that high quality color pictures can be encoded at 4.0 bits/pel or 42.95 Mbit/s. This requires the use of a 4/8 bit dual-word-length coder and buffer memory. Such a system might be useful in certain short hop applications if both large-signal and small-signal responses can be preserved.
Objectification of perceptual image quality for mobile video

NASA Astrophysics Data System (ADS)

Lee, Seon-Oh; Sim, Dong-Gyu

2011-06-01

This paper presents an objective video quality evaluation method for quantifying the subjective quality of digital mobile video. The proposed method aims to objectify the subjective quality by extracting edgeness and blockiness parameters. To evaluate the performance of the proposed algorithms, we carried out subjective video quality tests with the double-stimulus continuous quality scale method and obtained differential mean opinion score values for 120 mobile video clips. We then compared the performance of the proposed methods with that of existing methods in terms of the differential mean opinion score with 120 mobile video clips. Experimental results showed that the proposed methods were approximately 10% better than the edge peak signal-to-noise ratio of the J.247 method in terms of the Pearson correlation.
Quality Scalability Aware Watermarking for Visual Content.

PubMed

Bhowmik, Deepayan; Abhayaratne, Charith

2016-11-01

Scalable coding-based content adaptation poses serious challenges to traditional watermarking algorithms, which do not consider the scalable coding structure and hence cannot guarantee correct watermark extraction in media consumption chain. In this paper, we propose a novel concept of scalable blind watermarking that ensures more robust watermark extraction at various compression ratios while not effecting the visual quality of host media. The proposed algorithm generates scalable and robust watermarked image code-stream that allows the user to constrain embedding distortion for target content adaptations. The watermarked image code-stream consists of hierarchically nested joint distortion-robustness coding atoms. The code-stream is generated by proposing a new wavelet domain blind watermarking algorithm guided by a quantization based binary tree. The code-stream can be truncated at any distortion-robustness atom to generate the watermarked image with the desired distortion-robustness requirements. A blind extractor is capable of extracting watermark data from the watermarked images. The algorithm is further extended to incorporate a bit-plane discarding-based quantization model used in scalable coding-based content adaptation, e.g., JPEG2000. This improves the robustness against quality scalability of JPEG2000 compression. The simulation results verify the feasibility of the proposed concept, its applications, and its improved robustness against quality scalable content adaptation. Our proposed algorithm also outperforms existing methods showing 35% improvement. In terms of robustness to quality scalable video content adaptation using Motion JPEG2000 and wavelet-based scalable video coding, the proposed method shows major improvement for video watermarking.
Submillimeter video imaging with a superconducting bolometer array

NASA Astrophysics Data System (ADS)

Becker, Daniel Thomas

Millimeter wavelength radiation holds promise for detection of security threats at a distance, including suicide bombers and maritime threats in poor weather. The high sensitivity of superconducting Transition Edge Sensor (TES) bolometers makes them ideal for passive imaging of thermal signals at millimeter and submillimeter wavelengths. I have built a 350 GHz video-rate imaging system using an array of feedhorn-coupled TES bolometers. The system operates at standoff distances of 16 m to 28 m with a measured spatial resolution of 1.4 cm (at 17 m). It currently contains one 251-detector sub-array, and can be expanded to contain four sub-arrays for a total of 1004 detectors. The system has been used to take video images that reveal the presence of weapons concealed beneath a shirt in an indoor setting. This dissertation describes the design, implementation and characterization of this system. It presents an overview of the challenges associated with standoff passive imaging and how these problems can be overcome through the use of large-format TES bolometer arrays. I describe the design of the system and cover the results of detector and optical characterization. I explain the procedure used to generate video images using the system, and present a noise analysis of those images. This analysis indicates that the Noise Equivalent Temperature Difference (NETD) of the video images is currently limited by artifacts of the scanning process. More sophisticated image processing algorithms can eliminate these artifacts and reduce the NETD to 100 mK, which is the target value for the most demanding passive imaging scenarios. I finish with an overview of future directions for this system.
First results of the multi-purpose real-time processing video camera system on the Wendelstein 7-X stellarator and implications for future devices

NASA Astrophysics Data System (ADS)

Zoletnik, S.; Biedermann, C.; Cseh, G.; Kocsis, G.; König, R.; Szabolics, T.; Szepesi, T.; Wendelstein 7-X Team

2018-01-01

A special video camera has been developed for the 10-camera overview video system of the Wendelstein 7-X (W7-X) stellarator considering multiple application needs and limitations resulting from this complex long-pulse superconducting stellarator experiment. The event detection intelligent camera (EDICAM) uses a special 1.3 Mpixel CMOS sensor with non-destructive read capability which enables fast monitoring of smaller Regions of Interest (ROIs) even during long exposures. The camera can perform simple data evaluation algorithms (minimum/maximum, mean comparison to levels) on the ROI data which can dynamically change the readout process and generate output signals. Multiple EDICAM cameras were operated in the first campaign of W7-X and capabilities were explored in the real environment. Data prove that the camera can be used for taking long exposure (10-100 ms) overview images of the plasma while sub-ms monitoring and even multi-camera correlated edge plasma turbulence measurements of smaller areas can be done in parallel. These latter revealed that filamentary turbulence structures extend between neighboring modules of the stellarator. Considerations emerging for future upgrades of this system and similar setups on future long-pulse fusion experiments such as ITER are discussed.
Real-time image processing for passive mmW imagery

NASA Astrophysics Data System (ADS)

Kozacik, Stephen; Paolini, Aaron; Bonnett, James; Harrity, Charles; Mackrides, Daniel; Dillon, Thomas E.; Martin, Richard D.; Schuetz, Christopher A.; Kelmelis, Eric; Prather, Dennis W.

2015-05-01

The transmission characteristics of millimeter waves (mmWs) make them suitable for many applications in defense and security, from airport preflight scanning to penetrating degraded visual environments such as brownout or heavy fog. While the cold sky provides sufficient illumination for these images to be taken passively in outdoor scenarios, this utility comes at a cost; the diffraction limit of the longer wavelengths involved leads to lower resolution imagery compared to the visible or IR regimes, and the low power levels inherent to passive imagery allow the data to be more easily degraded by noise. Recent techniques leveraging optical upconversion have shown significant promise, but are still subject to fundamental limits in resolution and signal-to-noise ratio. To address these issues we have applied techniques developed for visible and IR imagery to decrease noise and increase resolution in mmW imagery. We have developed these techniques into fieldable software, making use of GPU platforms for real-time operation of computationally complex image processing algorithms. We present data from a passive, 77 GHz, distributed aperture, video-rate imaging platform captured during field tests at full video rate. These videos demonstrate the increase in situational awareness that can be gained through applying computational techniques in real-time without needing changes in detection hardware.
Automatic segmentation of the optic nerve head for deformation measurements in video rate optical coherence tomography

NASA Astrophysics Data System (ADS)

Hidalgo-Aguirre, Maribel; Gitelman, Julian; Lesk, Mark Richard; Costantino, Santiago

2015-11-01

Optical coherence tomography (OCT) imaging has become a standard diagnostic tool in ophthalmology, providing essential information associated with various eye diseases. In order to investigate the dynamics of the ocular fundus, we present a simple and accurate automated algorithm to segment the inner limiting membrane in video-rate optic nerve head spectral domain (SD) OCT images. The method is based on morphological operations including a two-step contrast enhancement technique, proving to be very robust when dealing with low signal-to-noise ratio images and pathological eyes. An analysis algorithm was also developed to measure neuroretinal tissue deformation from the segmented retinal profiles. The performance of the algorithm is demonstrated, and deformation results are presented for healthy and glaucomatous eyes.
Fire flame detection based on GICA and target tracking

NASA Astrophysics Data System (ADS)

Rong, Jianzhong; Zhou, Dechuang; Yao, Wei; Gao, Wei; Chen, Juan; Wang, Jian

2013-04-01

To improve the video fire detection rate, a robust fire detection algorithm based on the color, motion and pattern characteristics of fire targets was proposed, which proved a satisfactory fire detection rate for different fire scenes. In this fire detection algorithm: (a) a rule-based generic color model was developed based on analysis on a large quantity of flame pixels; (b) from the traditional GICA (Geometrical Independent Component Analysis) model, a Cumulative Geometrical Independent Component Analysis (C-GICA) model was developed for motion detection without static background and (c) a BP neural network fire recognition model based on multi-features of the fire pattern was developed. Fire detection tests on benchmark fire video clips of different scenes have shown the robustness, accuracy and fast-response of the algorithm.
Parallel design patterns for a low-power, software-defined compressed video encoder

NASA Astrophysics Data System (ADS)

Bruns, Michael W.; Hunt, Martin A.; Prasad, Durga; Gunupudi, Nageswara R.; Sonachalam, Sekar

2011-06-01

Video compression algorithms such as H.264 offer much potential for parallel processing that is not always exploited by the technology of a particular implementation. Consumer mobile encoding devices often achieve real-time performance and low power consumption through parallel processing in Application Specific Integrated Circuit (ASIC) technology, but many other applications require a software-defined encoder. High quality compression features needed for some applications such as 10-bit sample depth or 4:2:2 chroma format often go beyond the capability of a typical consumer electronics device. An application may also need to efficiently combine compression with other functions such as noise reduction, image stabilization, real time clocks, GPS data, mission/ESD/user data or software-defined radio in a low power, field upgradable implementation. Low power, software-defined encoders may be implemented using a massively parallel memory-network processor array with 100 or more cores and distributed memory. The large number of processor elements allow the silicon device to operate more efficiently than conventional DSP or CPU technology. A dataflow programming methodology may be used to express all of the encoding processes including motion compensation, transform and quantization, and entropy coding. This is a declarative programming model in which the parallelism of the compression algorithm is expressed as a hierarchical graph of tasks with message communication. Data parallel and task parallel design patterns are supported without the need for explicit global synchronization control. An example is described of an H.264 encoder developed for a commercially available, massively parallel memorynetwork processor device.
A unified and efficient framework for court-net sports video analysis using 3D camera modeling

NASA Astrophysics Data System (ADS)

Han, Jungong; de With, Peter H. N.

2007-01-01

The extensive amount of video data stored on available media (hard and optical disks) necessitates video content analysis, which is a cornerstone for different user-friendly applications, such as, smart video retrieval and intelligent video summarization. This paper aims at finding a unified and efficient framework for court-net sports video analysis. We concentrate on techniques that are generally applicable for more than one sports type to come to a unified approach. To this end, our framework employs the concept of multi-level analysis, where a novel 3-D camera modeling is utilized to bridge the gap between the object-level and the scene-level analysis. The new 3-D camera modeling is based on collecting features points from two planes, which are perpendicular to each other, so that a true 3-D reference is obtained. Another important contribution is a new tracking algorithm for the objects (i.e. players). The algorithm can track up to four players simultaneously. The complete system contributes to summarization by various forms of information, of which the most important are the moving trajectory and real-speed of each player, as well as 3-D height information of objects and the semantic event segments in a game. We illustrate the performance of the proposed system by evaluating it for a variety of court-net sports videos containing badminton, tennis and volleyball, and we show that the feature detection performance is above 92% and events detection about 90%.
Delivery of video-on-demand services using local storages within passive optical networks.

PubMed

Abeywickrama, Sandu; Wong, Elaine

2013-01-28

At present, distributed storage systems have been widely studied to alleviate Internet traffic build-up caused by high-bandwidth, on-demand applications. Distributed storage arrays located locally within the passive optical network were previously proposed to deliver Video-on-Demand services. As an added feature, a popularity-aware caching algorithm was also proposed to dynamically maintain the most popular videos in the storage arrays of such local storages. In this paper, we present a new dynamic bandwidth allocation algorithm to improve Video-on-Demand services over passive optical networks using local storages. The algorithm exploits the use of standard control packets to reduce the time taken for the initial request communication between the customer and the central office, and to maintain the set of popular movies in the local storage. We conduct packet level simulations to perform a comparative analysis of the Quality-of-Service attributes between two passive optical networks, namely the conventional passive optical network and one that is equipped with a local storage. Results from our analysis highlight that strategic placement of a local storage inside the network enables the services to be delivered with improved Quality-of-Service to the customer. We further formulate power consumption models of both architectures to examine the trade-off between enhanced Quality-of-Service performance versus the increased power requirement from implementing a local storage within the network.
Full-frame video stabilization with motion inpainting.

PubMed

Matsushita, Yasuyuki; Ofek, Eyal; Ge, Weina; Tang, Xiaoou; Shum, Heung-Yeung

2006-07-01

Video stabilization is an important video enhancement technology which aims at removing annoying shaky motion from videos. We propose a practical and robust approach of video stabilization that produces full-frame stabilized videos with good visual quality. While most previous methods end up with producing smaller size stabilized videos, our completion method can produce full-frame videos by naturally filling in missing image parts by locally aligning image data of neighboring frames. To achieve this, motion inpainting is proposed to enforce spatial and temporal consistency of the completion in both static and dynamic image areas. In addition, image quality in the stabilized video is enhanced with a new practical deblurring algorithm. Instead of estimating point spread functions, our method transfers and interpolates sharper image pixels of neighboring frames to increase the sharpness of the frame. The proposed video completion and deblurring methods enabled us to develop a complete video stabilizer which can naturally keep the original image quality in the stabilized videos. The effectiveness of our method is confirmed by extensive experiments over a wide variety of videos.
Extended image differencing for change detection in UAV video mosaics

NASA Astrophysics Data System (ADS)

Saur, Günter; Krüger, Wolfgang; Schumann, Arne

2014-03-01

Change detection is one of the most important tasks when using unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. We address changes of short time scale, i.e. the observations are taken in time distances from several minutes up to a few hours. Each observation is a short video sequence acquired by the UAV in near-nadir view and the relevant changes are, e.g., recently parked or moved vehicles. In this paper we extend our previous approach of image differencing for single video frames to video mosaics. A precise image-to-image registration combined with a robust matching approach is needed to stitch the video frames to a mosaic. Additionally, this matching algorithm is applied to mosaic pairs in order to align them to a common geometry. The resulting registered video mosaic pairs are the input of the change detection procedure based on extended image differencing. A change mask is generated by an adaptive threshold applied to a linear combination of difference images of intensity and gradient magnitude. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples for non-relevant changes are stereo disparity at 3D structures of the scene, changed size of shadows, and compression or transmission artifacts. The special effects of video mosaicking such as geometric distortions and artifacts at moving objects have to be considered, too. In our experiments we analyze the influence of these effects on the change detection results by considering several scenes. The results show that for video mosaics this task is more difficult than for single video frames. Therefore, we extended the image registration by estimating an elastic transformation using a thin plate spline approach. The results for mosaics are comparable to that of single video frames and are useful for interactive image exploitation due to a larger scene coverage.
Blastocyst microinjection automation.

PubMed

Mattos, Leonardo S; Grant, Edward; Thresher, Randy; Kluckman, Kimberly

2009-09-01

Blastocyst microinjections are routinely involved in the process of creating genetically modified mice for biomedical research, but their efficiency is highly dependent on the skills of the operators. As a consequence, much time and resources are required for training microinjection personnel. This situation has been aggravated by the rapid growth of genetic research, which has increased the demand for mutant animals. Therefore, increased productivity and efficiency in this area are highly desired. Here, we pursue these goals through the automation of a previously developed teleoperated blastocyst microinjection system. This included the design of a new system setup to facilitate automation, the definition of rules for automatic microinjections, the implementation of video processing algorithms to extract feedback information from microscope images, and the creation of control algorithms for process automation. Experimentation conducted with this new system and operator assistance during the cells delivery phase demonstrated a 75% microinjection success rate. In addition, implantation of the successfully injected blastocysts resulted in a 53% birth rate and a 20% yield of chimeras. These results proved that the developed system was capable of automatic blastocyst penetration and retraction, demonstrating the success of major steps toward full process automation.
The research on visual industrial robot which adopts fuzzy PID control algorithm

NASA Astrophysics Data System (ADS)

Feng, Yifei; Lu, Guoping; Yue, Lulin; Jiang, Weifeng; Zhang, Ye

2017-03-01

The control system of six degrees of freedom visual industrial robot based on the control mode of multi-axis motion control cards and PC was researched. For the variable, non-linear characteristics of industrial robot`s servo system, adaptive fuzzy PID controller was adopted. It achieved better control effort. In the vision system, a CCD camera was used to acquire signals and send them to video processing card. After processing, PC controls the six joints` motion by motion control cards. By experiment, manipulator can operate with machine tool and vision system to realize the function of grasp, process and verify. It has influence on the manufacturing of the industrial robot.

STS-74/MIR Photogrammetric Appendage Structural Dynamics Experiment Preliminary Data Analysis

NASA Technical Reports Server (NTRS)

Gilbert, Michael G.; Welch, Sharon S.; Pappa, Richard S.; Demeo, Martha E.

1997-01-01

The Photogrammetric Appendage Structural Dynamics Experiment was designed, developed, and flown to demonstrate and prove measurement of the structural vibration response of a Russian Space Station Mir solar array using photogrammetric methods. The experiment flew on the STS-74 Space Shuttle mission to Mir in November 1995 and obtained video imagery of solar array structural response to various excitation events. The video imagery has been digitized and triangulated to obtain response time history data at discrete points on the solar array. This data has been further processed using the Eigensystem Realization Algorithm modal identification technique to determine the natural vibration frequencies, damping, and mode shapes of the solar array. The results demonstrate that photogrammetric measurement of articulating, nonoptically targeted, flexible solar arrays and appendages is a viable, low-cost measurement option for the International Space Station.
Optimal camera exposure for video surveillance systems by predictive control of shutter speed, aperture, and gain

NASA Astrophysics Data System (ADS)

Torres, Juan; Menéndez, José Manuel

2015-02-01

This paper establishes a real-time auto-exposure method to guarantee that surveillance cameras in uncontrolled light conditions take advantage of their whole dynamic range while provide neither under nor overexposed images. State-of-the-art auto-exposure methods base their control on the brightness of the image measured in a limited region where the foreground objects are mostly located. Unlike these methods, the proposed algorithm establishes a set of indicators based on the image histogram that defines its shape and position. Furthermore, the location of the objects to be inspected is likely unknown in surveillance applications. Thus, the whole image is monitored in this approach. To control the camera settings, we defined a parameters function (Ef ) that linearly depends on the shutter speed and the electronic gain; and is inversely proportional to the square of the lens aperture diameter. When the current acquired image is not overexposed, our algorithm computes the value of Ef that would move the histogram to the maximum value that does not overexpose the capture. When the current acquired image is overexposed, it computes the value of Ef that would move the histogram to a value that does not underexpose the capture and remains close to the overexposed region. If the image is under and overexposed, the whole dynamic range of the camera is therefore used, and a default value of the Ef that does not overexpose the capture is selected. This decision follows the idea that to get underexposed images is better than to get overexposed ones, because the noise produced in the lower regions of the histogram can be removed in a post-processing step while the saturated pixels of the higher regions cannot be recovered. The proposed algorithm was tested in a video surveillance camera placed at an outdoor parking lot surrounded by buildings and trees which produce moving shadows in the ground. During the daytime of seven days, the algorithm was running alternatively together with a representative auto-exposure algorithm in the recent literature. Besides the sunrises and the nightfalls, multiple weather conditions occurred which produced light changes in the scene: sunny hours that produced sharpen shadows and highlights; cloud coverages that softened the shadows; and cloudy and rainy hours that dimmed the scene. Several indicators were used to measure the performance of the algorithms. They provided the objective quality as regards: the time that the algorithms recover from an under or over exposure, the brightness stability, and the change related to the optimal exposure. The results demonstrated that our algorithm reacts faster to all the light changes than the selected state-of-the-art algorithm. It is also capable of acquiring well exposed images and maintaining the brightness stable during more time. Summing up the results, we concluded that the proposed algorithm provides a fast and stable auto-exposure method that maintains an optimal exposure for video surveillance applications. Future work will involve the evaluation of this algorithm in robotics.
Analysis-Preserving Video Microscopy Compression via Correlation and Mathematical Morphology

PubMed Central

Shao, Chong; Zhong, Alfred; Cribb, Jeremy; Osborne, Lukas D.; O’Brien, E. Timothy; Superfine, Richard; Mayer-Patel, Ketan; Taylor, Russell M.

2015-01-01

The large amount video data produced by multi-channel, high-resolution microscopy system drives the need for a new high-performance domain-specific video compression technique. We describe a novel compression method for video microscopy data. The method is based on Pearson's correlation and mathematical morphology. The method makes use of the point-spread function (PSF) in the microscopy video acquisition phase. We compare our method to other lossless compression methods and to lossy JPEG, JPEG2000 and H.264 compression for various kinds of video microscopy data including fluorescence video and brightfield video. We find that for certain data sets, the new method compresses much better than lossless compression with no impact on analysis results. It achieved a best compressed size of 0.77% of the original size, 25× smaller than the best lossless technique (which yields 20% for the same video). The compressed size scales with the video's scientific data content. Further testing showed that existing lossy algorithms greatly impacted data analysis at similar compression sizes. PMID:26435032
Architecture design of motion estimation for ITU-T H.263

NASA Astrophysics Data System (ADS)

Ku, Chung-Wei; Lin, Gong-Sheng; Chen, Liang-Gee; Lee, Yung-Ping

1997-01-01

Digitalized video and audio system has become the trend of the progress in multimedia, because it provides great performance in quality and feasibility of processing. However, as the huge amount of information is needed while the bandwidth is limitted, data compression plays an important role in the system. Say, for a 176 x 144 monochromic sequence with 10 frames/sec frame rate, the bandwidth is about 2Mbps. This wastes much channel resource and limits the applications. MPEG (moving picttre ezpert groip) standardizes the video codec scheme, and it performs high compression ratio while providing good quality. MPEG-i is used for the frame size about 352 x 240 and 30 frames per second, and MPEG-2 provides scalibility and can be applied on scenes with higher definition, say HDTV (high definition television). On the other hand, some applications concerns the very low bit-rate, such as videophone and video-conferencing. Because the channel bandwidth is much limitted in telephone network, a very high compression ratio must be required. ITU-T announced the H.263 video coding standards to meet the above requirements.8 According to the simulation results of TMN-5,22 it outperforms 11.263 with little overhead of complexity. Since wireless communication is the trend in the near future, low power design of the video codec is an important issue for portable visual telephone. Motion estimation is the most computation consuming parts in the whole video codec. About 60% of the computation is spent on this parts for the encoder. Several architectures were proposed for efficient processing of block matching algorithms. In this paper, in order to meet the requirements of 11.263 and the expectation of low power consumption, a modified sandwich architecture in21 is proposed. Based on the parallel processing philosophy, low power is expected and the generation of either one motion vector or four motion vectors with half-pixel accuracy is achieved concurrently. In addition, we will present our solution how to solve the other addition modes in 11.263 with the proposed architecture.
Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms.

PubMed

Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

2002-12-10

Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the superresolution iterations. A quantitative evaluation of the performance of these algorithms for restoring and superresolving various imagery data captured by diffraction-limited sensing operations are also presented.
Gaze inspired subtitle position evaluation for MOOCs videos

NASA Astrophysics Data System (ADS)

Chen, Hongli; Yan, Mengzhen; Liu, Sijiang; Jiang, Bo

2017-06-01

Online educational resources, such as MOOCs, is becoming increasingly popular, especially in higher education field. One most important media type for MOOCs is course video. Besides traditional bottom-position subtitle accompany to the videos, in recent years, researchers try to develop more advanced algorithms to generate speaker-following style subtitles. However, the effectiveness of such subtitle is still unclear. In this paper, we investigate the relationship between subtitle position and the learning effect after watching the video on tablet devices. Inspired with image based human eye tracking technique, this work combines the objective gaze estimation statistics with subjective user study to achieve a convincing conclusion - speaker-following subtitles are more suitable for online educational videos.
Evaluating Cell Processes, Quality, and Biomarkers in Pluripotent Stem Cells Using Video Bioinformatics

PubMed Central

Lin, Sabrina C.; Bays, Brett C.; Omaiye, Esther; Bhanu, Bir; Talbot, Prue

2016-01-01

There is a foundational need for quality control tools in stem cell laboratories engaged in basic research, regenerative therapies, and toxicological studies. These tools require automated methods for evaluating cell processes and quality during in vitro passaging, expansion, maintenance, and differentiation. In this paper, an unbiased, automated high-content profiling toolkit, StemCellQC, is presented that non-invasively extracts information on cell quality and cellular processes from time-lapse phase-contrast videos. Twenty four (24) morphological and dynamic features were analyzed in healthy, unhealthy, and dying human embryonic stem cell (hESC) colonies to identify those features that were affected in each group. Multiple features differed in the healthy versus unhealthy/dying groups, and these features were linked to growth, motility, and death. Biomarkers were discovered that predicted cell processes before they were detectable by manual observation. StemCellQC distinguished healthy and unhealthy/dying hESC colonies with 96% accuracy by non-invasively measuring and tracking dynamic and morphological features over 48 hours. Changes in cellular processes can be monitored by StemCellQC and predictions can be made about the quality of pluripotent stem cell colonies. This toolkit reduced the time and resources required to track multiple pluripotent stem cell colonies and eliminated handling errors and false classifications due to human bias. StemCellQC provided both user-specified and classifier-determined analysis in cases where the affected features are not intuitive or anticipated. Video analysis algorithms allowed assessment of biological phenomena using automatic detection analysis, which can aid facilities where maintaining stem cell quality and/or monitoring changes in cellular processes are essential. In the future StemCellQC can be expanded to include other features, cell types, treatments, and differentiating cells. PMID:26848582
Evaluating Cell Processes, Quality, and Biomarkers in Pluripotent Stem Cells Using Video Bioinformatics.

PubMed

Zahedi, Atena; On, Vincent; Lin, Sabrina C; Bays, Brett C; Omaiye, Esther; Bhanu, Bir; Talbot, Prue

2016-01-01

There is a foundational need for quality control tools in stem cell laboratories engaged in basic research, regenerative therapies, and toxicological studies. These tools require automated methods for evaluating cell processes and quality during in vitro passaging, expansion, maintenance, and differentiation. In this paper, an unbiased, automated high-content profiling toolkit, StemCellQC, is presented that non-invasively extracts information on cell quality and cellular processes from time-lapse phase-contrast videos. Twenty four (24) morphological and dynamic features were analyzed in healthy, unhealthy, and dying human embryonic stem cell (hESC) colonies to identify those features that were affected in each group. Multiple features differed in the healthy versus unhealthy/dying groups, and these features were linked to growth, motility, and death. Biomarkers were discovered that predicted cell processes before they were detectable by manual observation. StemCellQC distinguished healthy and unhealthy/dying hESC colonies with 96% accuracy by non-invasively measuring and tracking dynamic and morphological features over 48 hours. Changes in cellular processes can be monitored by StemCellQC and predictions can be made about the quality of pluripotent stem cell colonies. This toolkit reduced the time and resources required to track multiple pluripotent stem cell colonies and eliminated handling errors and false classifications due to human bias. StemCellQC provided both user-specified and classifier-determined analysis in cases where the affected features are not intuitive or anticipated. Video analysis algorithms allowed assessment of biological phenomena using automatic detection analysis, which can aid facilities where maintaining stem cell quality and/or monitoring changes in cellular processes are essential. In the future StemCellQC can be expanded to include other features, cell types, treatments, and differentiating cells.
Mapping wide row crops with video sequences acquired from a tractor moving at treatment speed.

PubMed

Sainz-Costa, Nadir; Ribeiro, Angela; Burgos-Artizzu, Xavier P; Guijarro, María; Pajares, Gonzalo

2011-01-01

This paper presents a mapping method for wide row crop fields. The resulting map shows the crop rows and weeds present in the inter-row spacing. Because field videos are acquired with a camera mounted on top of an agricultural vehicle, a method for image sequence stabilization was needed and consequently designed and developed. The proposed stabilization method uses the centers of some crop rows in the image sequence as features to be tracked, which compensates for the lateral movement (sway) of the camera and leaves the pitch unchanged. A region of interest is selected using the tracked features, and an inverse perspective technique transforms the selected region into a bird's-eye view that is centered on the image and that enables map generation. The algorithm developed has been tested on several video sequences of different fields recorded at different times and under different lighting conditions, with good initial results. Indeed, lateral displacements of up to 66% of the inter-row spacing were suppressed through the stabilization process, and crop rows in the resulting maps appear straight.
Multiple player tracking in sports video: a dual-mode two-way bayesian inference approach with progressive observation modeling.

PubMed

Xing, Junliang; Ai, Haizhou; Liu, Liwei; Lao, Shihong

2011-06-01

Multiple object tracking (MOT) is a very challenging task yet of fundamental importance for many practical applications. In this paper, we focus on the problem of tracking multiple players in sports video which is even more difficult due to the abrupt movements of players and their complex interactions. To handle the difficulties in this problem, we present a new MOT algorithm which contributes both in the observation modeling level and in the tracking strategy level. For the observation modeling, we develop a progressive observation modeling process that is able to provide strong tracking observations and greatly facilitate the tracking task. For the tracking strategy, we propose a dual-mode two-way Bayesian inference approach which dynamically switches between an offline general model and an online dedicated model to deal with single isolated object tracking and multiple occluded object tracking integrally by forward filtering and backward smoothing. Extensive experiments on different kinds of sports videos, including football, basketball, as well as hockey, demonstrate the effectiveness and efficiency of the proposed method.
A Comprehensive Motion Estimation Technique for the Improvement of EIS Methods Based on the SURF Algorithm and Kalman Filter.

PubMed

Cheng, Xuemin; Hao, Qun; Xie, Mengdi

2016-04-07

Video stabilization is an important technology for removing undesired motion in videos. This paper presents a comprehensive motion estimation method for electronic image stabilization techniques, integrating the speeded up robust features (SURF) algorithm, modified random sample consensus (RANSAC), and the Kalman filter, and also taking camera scaling and conventional camera translation and rotation into full consideration. Using SURF in sub-pixel space, feature points were located and then matched. The false matched points were removed by modified RANSAC. Global motion was estimated by using the feature points and modified cascading parameters, which reduced the accumulated errors in a series of frames and improved the peak signal to noise ratio (PSNR) by 8.2 dB. A specific Kalman filter model was established by considering the movement and scaling of scenes. Finally, video stabilization was achieved with filtered motion parameters using the modified adjacent frame compensation. The experimental results proved that the target images were stabilized even when the vibrating amplitudes of the video become increasingly large.
Temporally coherent 4D video segmentation for teleconferencing

NASA Astrophysics Data System (ADS)

Ehmann, Jana; Guleryuz, Onur G.

2013-09-01

We develop an algorithm for 4-D (RGB+Depth) video segmentation targeting immersive teleconferencing ap- plications on emerging mobile devices. Our algorithm extracts users from their environments and places them onto virtual backgrounds similar to green-screening. The virtual backgrounds increase immersion and interac- tivity, relieving the users of the system from distractions caused by disparate environments. Commodity depth sensors, while providing useful information for segmentation, result in noisy depth maps with a large number of missing depth values. By combining depth and RGB information, our work signi¯cantly improves the other- wise very coarse segmentation. Further imposing temporal coherence yields compositions where the foregrounds seamlessly blend with the virtual backgrounds with minimal °icker and other artifacts. We achieve said improve- ments by correcting the missing information in depth maps before fast RGB-based segmentation, which operates in conjunction with temporal coherence. Simulation results indicate the e±cacy of the proposed system in video conferencing scenarios.
Detecting dominant motion patterns in crowds of pedestrians

NASA Astrophysics Data System (ADS)

Saqib, Muhammad; Khan, Sultan Daud; Blumenstein, Michael

2017-02-01

As the population of the world increases, urbanization generates crowding situations which poses challenges to public safety and security. Manual analysis of crowded situations is a tedious job and usually prone to errors. In this paper, we propose a novel technique of crowd analysis, the aim of which is to detect different dominant motion patterns in real-time videos. A motion field is generated by computing the dense optical flow. The motion field is then divided into blocks. For each block, we adopt an Intra-clustering algorithm for detecting different flows within the block. Later on, we employ Inter-clustering for clustering the flow vectors among different blocks. We evaluate the performance of our approach on different real-time videos. The experimental results show that our proposed method is capable of detecting distinct motion patterns in crowded videos. Moreover, our algorithm outperforms state-of-the-art methods.
Energy minimization of mobile video devices with a hardware H.264/AVC encoder based on energy-rate-distortion optimization

NASA Astrophysics Data System (ADS)

Kang, Donghun; Lee, Jungeon; Jung, Jongpil; Lee, Chul-Hee; Kyung, Chong-Min

2014-09-01

In mobile video systems powered by battery, reducing the encoder's compression energy consumption is critical to prolong its lifetime. Previous Energy-rate-distortion (E-R-D) optimization methods based on a software codec is not suitable for practical mobile camera systems because the energy consumption is too large and encoding rate is too low. In this paper, we propose an E-R-D model for the hardware codec based on the gate-level simulation framework to measure the switching activity and the energy consumption. From the proposed E-R-D model, an energy minimizing algorithm for mobile video camera sensor have been developed with the GOP (Group of Pictures) size and QP(Quantization Parameter) as run-time control variables. Our experimental results show that the proposed algorithm provides up to 31.76% of energy consumption saving while satisfying the rate and distortion constraints.
Mode-dependent templates and scan order for H.264/AVC-based intra lossless coding.

PubMed

Gu, Zhouye; Lin, Weisi; Lee, Bu-Sung; Lau, Chiew Tong; Sun, Ming-Ting

2012-09-01

In H.264/advanced video coding (AVC), lossless coding and lossy coding share the same entropy coding module. However, the entropy coders in the H.264/AVC standard were original designed for lossy video coding and do not yield adequate performance for lossless video coding. In this paper, we analyze the problem with the current lossless coding scheme and propose a mode-dependent template (MD-template) based method for intra lossless coding. By exploring the statistical redundancy of the prediction residual in the H.264/AVC intra prediction modes, more zero coefficients are generated. By designing a new scan order for each MD-template, the scanned coefficients sequence fits the H.264/AVC entropy coders better. A fast implementation algorithm is also designed. With little computation increase, experimental results confirm that the proposed fast algorithm achieves about 7.2% bit saving compared with the current H.264/AVC fidelity range extensions high profile.
Automatic video shot boundary detection using k-means clustering and improved adaptive dual threshold comparison

NASA Astrophysics Data System (ADS)

Sa, Qila; Wang, Zhihui

2018-03-01

At present, content-based video retrieval (CBVR) is the most mainstream video retrieval method, using the video features of its own to perform automatic identification and retrieval. This method involves a key technology, i.e. shot segmentation. In this paper, the method of automatic video shot boundary detection with K-means clustering and improved adaptive dual threshold comparison is proposed. First, extract the visual features of every frame and divide them into two categories using K-means clustering algorithm, namely, one with significant change and one with no significant change. Then, as to the classification results, utilize the improved adaptive dual threshold comparison method to determine the abrupt as well as gradual shot boundaries.Finally, achieve automatic video shot boundary detection system.
4DCAPTURE: a general purpose software package for capturing and analyzing two- and three-dimensional motion data acquired from video sequences

NASA Astrophysics Data System (ADS)

Walton, James S.; Hodgson, Peter; Hallamasek, Karen; Palmer, Jake

2003-07-01

4DVideo is creating a general purpose capability for capturing and analyzing kinematic data from video sequences in near real-time. The core element of this capability is a software package designed for the PC platform. The software ("4DCapture") is designed to capture and manipulate customized AVI files that can contain a variety of synchronized data streams -- including audio, video, centroid locations -- and signals acquired from more traditional sources (such as accelerometers and strain gauges.) The code includes simultaneous capture or playback of multiple video streams, and linear editing of the images (together with the ancilliary data embedded in the files). Corresponding landmarks seen from two or more views are matched automatically, and photogrammetric algorithms permit multiple landmarks to be tracked in two- and three-dimensions -- with or without lens calibrations. Trajectory data can be processed within the main application or they can be exported to a spreadsheet where they can be processed or passed along to a more sophisticated, stand-alone, data analysis application. Previous attempts to develop such applications for high-speed imaging have been limited in their scope, or by the complexity of the application itself. 4DVideo has devised a friendly ("FlowStack") user interface that assists the end-user to capture and treat image sequences in a natural progression. 4DCapture employs the AVI 2.0 standard and DirectX technology which effectively eliminates the file size limitations found in older applications. In early tests, 4DVideo has streamed three RS-170 video sources to disk for more than an hour without loss of data. At this time, the software can acquire video sequences in three ways: (1) directly, from up to three hard-wired cameras supplying RS-170 (monochrome) signals; (2) directly, from a single camera or video recorder supplying an NTSC (color) signal; and (3) by importing existing video streams in the AVI 1.0 or AVI 2.0 formats. The latter is particularly useful for high-speed applications where the raw images are often captured and stored by the camera before being downloaded. Provision has been made to synchronize data acquired from any combination of these video sources using audio and visual "tags." Additional "front-ends," designed for digital cameras, are anticipated.
Real-time rendering for multiview autostereoscopic displays

NASA Astrophysics Data System (ADS)

Berretty, R.-P. M.; Peters, F. J.; Volleberg, G. T. G.

2006-02-01

In video systems, the introduction of 3D video might be the next revolution after the introduction of color. Nowadays multiview autostereoscopic displays are in development. Such displays offer various views at the same time and the image content observed by the viewer depends upon his position with respect to the screen. His left eye receives a signal that is different from what his right eye gets; this gives, provided the signals have been properly processed, the impression of depth. The various views produced on the display differ with respect to their associated camera positions. A possible video format that is suited for rendering from different camera positions is the usual 2D format enriched with a depth related channel, e.g. for each pixel in the video not only its color is given, but also e.g. its distance to a camera. In this paper we provide a theoretical framework for the parallactic transformations which relates captured and observed depths to screen and image disparities. Moreover we present an efficient real time rendering algorithm that uses forward mapping to reduce aliasing artefacts and that deals properly with occlusions. For improved perceived resolution, we take the relative position of the color subpixels and the optics of the lenticular screen into account. Sophisticated filtering techniques results in high quality images.
Fuzzy Filtering Method for Color Videos Corrupted by Additive Noise

PubMed Central

Ponomaryov, Volodymyr I.; Montenegro-Monroy, Hector; Nino-de-Rivera, Luis

2014-01-01

A novel method for the denoising of color videos corrupted by additive noise is presented in this paper. The proposed technique consists of three principal filtering steps: spatial, spatiotemporal, and spatial postprocessing. In contrast to other state-of-the-art algorithms, during the first spatial step, the eight gradient values in different directions for pixels located in the vicinity of a central pixel as well as the R, G, and B channel correlation between the analogous pixels in different color bands are taken into account. These gradient values give the information about the level of contamination then the designed fuzzy rules are used to preserve the image features (textures, edges, sharpness, chromatic properties, etc.). In the second step, two neighboring video frames are processed together. Possible local motions between neighboring frames are estimated using block matching procedure in eight directions to perform interframe filtering. In the final step, the edges and smoothed regions in a current frame are distinguished for final postprocessing filtering. Numerous simulation results confirm that this novel 3D fuzzy method performs better than other state-of-the-art techniques in terms of objective criteria (PSNR, MAE, NCD, and SSIM) as well as subjective perception via the human vision system in the different color videos. PMID:24688428
Different source image fusion based on FPGA

NASA Astrophysics Data System (ADS)

Luo, Xiao; Piao, Yan

2016-03-01

The fusion technology of video image is to make the video obtained by different image sensors complementary to each other by some technical means, so as to obtain the video information which is rich in information and suitable for the human eye system. Infrared cameras in harsh environments such as when smoke, fog and low light situations penetrating power, but the ability to obtain the details of the image is poor, does not meet the human visual system. Single visible light imaging can be rich in detail, high resolution images and for the visual system, but the visible image easily affected by the external environment. Infrared image and visible image fusion process involved in the video image fusion algorithm complexity and high calculation capacity, have occupied more memory resources, high clock rate requirements, such as software, c ++, c, etc. to achieve more, but based on Hardware platform less. In this paper, based on the imaging characteristics of infrared images and visible light images, the software and hardware are combined to obtain the registration parameters through software matlab, and the gray level weighted average method is used to implement the hardware platform. Information fusion, and finally the fusion image can achieve the goal of effectively improving the acquisition of information to increase the amount of information in the image.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.