Publication


Featured research published by Rahul Vanam.


International Conference on Image Processing | 2007

Improved Rate Control and Motion Estimation for H.264 Encoder

Loren Merritt; Rahul Vanam

In this paper, we describe rate control and motion estimation in x264, an open source H.264/AVC encoder. We compare the rate control methods of x264 with the JM reference encoder and show that our approach performs well in both PSNR and bitrate. In motion estimation, we describe our implementation of initialization and show that it improves PSNR. We also propose an early termination for simplified uneven cross multi hexagon grid search (UMH) in x264 and show that it improves the speed by a factor of 1.5. Finally, we show that x264 performs 50 times faster and provides bitrates within 5% of the JM reference encoder for the same PSNR.
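
PSNR, the quality measure used throughout these encoder comparisons, has a standard definition; a minimal sketch (this is the textbook formula, not code from x264 or the JM encoder) might look like:

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio between two frames (higher is better)."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Comparing encoders at matched PSNR, as done here, then reduces to comparing the bitrates each needs to reach the same value.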


IEEE Transactions on Audio, Speech, and Language Processing | 2008

An Objective Metric of Human Subjective Audio Quality Optimized for a Wide Range of Audio Fidelities

Charles D. Creusere; Kumar D. Kallakuri; Rahul Vanam

The goal of this paper is to develop an audio quality metric that can accurately quantify subjective quality over audio fidelities ranging from highly impaired to perceptually lossless. As one example of its utility, such a metric would allow scalable audio coding algorithms to be easily optimized over their entire operating ranges. We have found that the ITU-recommended objective quality metric, ITU-R BS.1387, does not accurately predict subjective audio quality over the wide range of fidelity levels of interest to us. In developing the desired universal metric, we use as a starting point the model output variables (MOVs) that make up BS.1387 as well as the energy equalization truncation threshold which has been found to be particularly useful for highly impaired audio. To combine these MOVs into a single quality measure that is both accurate and robust, we have developed a hybrid least-squares/minimax optimization procedure. Our test results show that the minimax-optimized metric is up to 36% lower in maximum absolute error compared to a similar metric designed using the conventional least-squares procedure.
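
The paper's hybrid procedure is not reproduced here; as a rough illustration of the least-squares versus minimax distinction it exploits, the sketch below fits weights over feature columns (standing in for the MOVs) by least squares, then greedily reduces the maximum absolute error starting from that solution. The coordinate-search scheme and all names are illustrative, not the authors' method:

```python
import numpy as np

def fit_least_squares(X, y):
    # weights minimizing the sum of squared errors
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fit_minimax(X, y, w0, scale=0.5, iters=60):
    # crude coordinate search that only accepts steps lowering the
    # maximum absolute error, started from the least-squares weights
    w = w0.copy()
    best = np.max(np.abs(X @ w - y))
    for _ in range(iters):
        improved = False
        for j in range(len(w)):
            for step in (scale, -scale):
                cand = w.copy()
                cand[j] += step
                err = np.max(np.abs(X @ cand - y))
                if err < best:
                    best, w, improved = err, cand, True
        if not improved:
            scale *= 0.5  # shrink the step and retry
    return w
```

A least-squares fit minimizes average error and can leave a few badly mispredicted items; a minimax fit trades average accuracy for a smaller worst case, which is the robustness property the paper reports.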


Data Compression Conference | 2007

Distortion-Complexity Optimization of the H.264/MPEG-4 AVC Encoder using the GBFOS Algorithm

Rahul Vanam; Eve A. Riskin; Sheila S. Hemami; Richard E. Ladner

The H.264/AVC standard provides significant improvements in performance over earlier video coding standards at the cost of increased complexity. Our challenge is to determine H.264 parameter settings that have low complexity but still offer high video quality. In this paper, we propose two fast algorithms for finding H.264 parameter settings that take about 1% and 8%, respectively, of the number of tests required by an exhaustive search. Both fast algorithms result in a maximum decrease in peak signal-to-noise ratio of less than 0.71 dB across different data sets and bitrates.
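
GBFOS itself is a tree-pruning algorithm and is not reproduced here; the sketch below illustrates only the selection criterion underlying this kind of search — keeping parameter settings that are not dominated in the complexity-quality plane (the function name and data layout are hypothetical):

```python
def pareto_front(points):
    """Keep settings that no other setting beats on both axes, i.e.
    no other point is both lower-complexity and higher-PSNR.
    points: iterable of (complexity, psnr, setting_name) tuples."""
    front = []
    for c, q, name in sorted(points):
        # sorted by complexity; a setting survives only if it improves
        # on the quality of every cheaper setting kept so far
        if not front or q > front[-1][1]:
            front.append((c, q, name))
    return front
```

An exhaustive search would encode with every setting; the papers' contribution is reaching nearly the same front while testing only a small fraction of the settings.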


Data Compression Conference | 2009

H.264/MPEG-4 AVC Encoder Parameter Selection Algorithms for Complexity Distortion Tradeoff

Rahul Vanam; Eve A. Riskin; Richard E. Ladner

The H.264 encoder has input parameters that determine the bit rate and distortion of the compressed video and the encoding complexity. A set of encoder parameters is referred to as a parameter setting. We previously proposed two offline algorithms for choosing H.264 encoder parameter settings whose distortion-complexity performance is close to that of the settings obtained from an exhaustive search, but which require significantly fewer encodings. However, they generate only a few parameter settings. If no parameter setting is available for a given encode time, the encoder must fall back to a lower complexity setting, resulting in a decrease in peak signal-to-noise ratio (PSNR). In this paper, we propose two algorithms for finding additional parameter settings over our previous algorithm and show that they improve the PSNR by up to 0.71 dB and 0.43 dB, respectively. We test both algorithms on Linux and PocketPC platforms.


Disability and Rehabilitation: Assistive Technology | 2008

MobileASL: Intelligibility of sign language video over mobile phones

Anna C. Cavender; Rahul Vanam; Dane K. Barney; Richard E. Ladner; Eve A. Riskin

For Deaf people, access to the mobile telephone network in the United States is currently limited to text messaging, forcing communication in English as opposed to American Sign Language (ASL), the preferred language. Because ASL is a visual language, mobile video phones have the potential to give Deaf people access to real-time mobile communication in their preferred language. However, even today's best video compression techniques cannot yield intelligible ASL at limited cell phone network bandwidths. Motivated by this constraint, we conducted one focus group and two user studies with members of the Deaf Community to determine the intelligibility effects of video compression techniques that exploit the visual nature of sign language. Inspired by eye tracking results that show high resolution foveal vision is maintained around the face, we studied region-of-interest encodings (where the face is encoded at higher quality) as well as reduced frame rates (where fewer, better quality, frames are displayed every second). At all bit rates studied here, participants preferred moderate quality increases in the face region, sacrificing quality in other regions. They also preferred slightly lower frame rates because they yield better quality frames for a fixed bit rate. The limited processing power of cell phones is a serious concern because a real-time video encoder and decoder will be needed. Choosing less complex settings for the encoder can reduce encoding time, but will affect video quality. We studied the intelligibility effects of this tradeoff and found that we can significantly speed up encoding time without severely affecting intelligibility. These results show promise for real-time access to the current low-bandwidth cell phone network through sign-language-specific encoding techniques.


Picture Coding Symposium | 2013

Perceptual pre-processing filter for user-adaptive coding and delivery of visual information

Rahul Vanam; Yuriy A. Reznik

We describe the design of an adaptive video delivery system employing a perceptual preprocessing filter. The filter receives parameters of the reproduction setup, such as viewing distance, pixel density, and ambient illuminance. It then applies a contrast sensitivity model of human vision to remove spatial oscillations that are invisible under those conditions. By removing such oscillations, the filter simplifies the video content, leading to more efficient encoding without causing any visible alteration of the content. Through experiments, we demonstrate that our filter can yield significant bit rate savings compared to conventional encoding methods that are not tailored to specific viewing conditions.
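
The paper's contrast sensitivity model is not given in the abstract; a minimal sketch of the first step such a filter needs — converting an assumed visual acuity limit into a per-pixel cutoff frequency for a given viewing setup — could look like this (the 30 cycles/degree limit and the function name are assumptions, not values from the paper):

```python
import math

def cutoff_cycles_per_pixel(viewing_distance_in, pixels_per_inch,
                            max_visible_cpd=30.0):
    """Map the highest visible spatial frequency (cycles/degree of
    visual angle) to cycles/pixel for a given screen and distance."""
    # number of pixels subtended by one degree of visual angle
    pixels_per_degree = (pixels_per_inch * viewing_distance_in
                         * math.tan(math.radians(1.0)))
    return max_visible_cpd / pixels_per_degree
```

When the cutoff reaches 0.5 cycles/pixel (the Nyquist limit), everything the screen can show is visible and no filtering helps; as the viewer moves farther away the cutoff drops, and frequencies above it can be low-pass filtered away before encoding without visible change.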


IEEE Global Conference on Signal and Information Processing | 2013

Improving coding and delivery of video by exploiting the oblique effect

Yuriy A. Reznik; Rahul Vanam

The oblique effect implies lower visual sensitivity to diagonally oriented spatial oscillations than to horizontal and vertical ones. To exploit this phenomenon, we propose applying an adaptive anisotropic low-pass filter to video prior to encoding. We then describe the design of such a filter. Through experiments, we demonstrate that the use of this filter can yield appreciable bitrate savings compared to conventional filtering and encoding of the same content.


Data Compression Conference | 2013

Improving the Efficiency of Video Coding by Using Perceptual Preprocessing Filter

Rahul Vanam; Yuriy A. Reznik

We describe the design of a perceptual preprocessing filter for improving the effectiveness of video coding. This filter uses known parameters of the reproduction setup, such as viewing distance, pixel density, and contrast ratio of the screen, as well as a contrast sensitivity model of human vision, to identify spatial oscillations that are invisible. By removing such oscillations, the filter simplifies the video content, leading to more efficient encoding without causing any visible alteration of the content. Through experiments, we demonstrate that the use of our filter can yield significant bit rate savings compared to conventional encoding methods that are not tailored to specific viewing conditions.


International Conference on Acoustics, Speech, and Signal Processing | 2015

Rate control for lossless region of interest coding in HEVC intra-coding with applications to digital pathology images

Victor Sanchez; Francesc Auli-Llinas; Rahul Vanam; Joan Bartrina-Rapesta

This paper proposes a rate control algorithm for lossless region of interest (RoI) coding in HEVC intra-coding. The algorithm is developed for digital pathology images and allows for random access to the data. Based on an input RoI mask, the algorithm first encodes the RoI losslessly. According to the bit rate spent on the RoI, it then encodes the background using rate control in order to meet an overall target bit rate. To increase rate control accuracy, the algorithm uses an R-λ model to approximate the slope of the rate-distortion curve, and updates the related model parameters during the encoding process. Random access is attained by coding the data using independent tiles. Experimental results show that the proposed algorithm attains the overall bit rate very accurately while providing lossless reconstruction of the RoI.
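
The abstract does not give the exact update rules; a hedged sketch of the general R-λ idea — predicting the Lagrange multiplier λ from target bits-per-pixel and nudging the model parameters toward each observed (bpp, λ) pair — could look like this (the learning rates, starting values, and function names are illustrative, not from the paper):

```python
import math

def lam_from_bpp(bpp, alpha, beta):
    """R-lambda model: lambda = alpha * bpp ** beta (beta is negative,
    so lower bit budgets map to larger lambda, i.e. coarser coding)."""
    return alpha * bpp ** beta

def update_params(alpha, beta, bpp, actual_lam, lr_a=0.1, lr_b=0.05):
    """Pull the model toward the lambda actually observed after
    encoding a block at bits-per-pixel bpp."""
    err = math.log(actual_lam) - math.log(lam_from_bpp(bpp, alpha, beta))
    alpha *= math.exp(lr_a * err)          # multiplicative step in alpha
    beta += lr_b * err * math.log(bpp)     # additive step in beta
    return alpha, beta
```

Repeating this update block by block keeps the model tracking the content, which is what lets the encoder hit the overall target rate accurately.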


Visual Communications and Image Processing | 2012

User-adaptive mobile video streaming

Yuriy A. Reznik; Ed Asbun; Zhifeng Chen; Yan Ye; Eldad Zeira; Rahul Vanam; Zheng Yuan; Gregory S. Sternberg; Ariela Zeira; Naresh Soni

We describe the design of a mobile streaming system that optimizes video delivery based on dynamic analysis of user behavior and viewing conditions, including user proximity, viewing angle, and ambient illuminance.

Collaboration


Dive into Rahul Vanam's collaborations.

Top Co-Authors

Eve A. Riskin

University of Washington
