Vinay Melkote | Researchain

Publication

Featured research published by Vinay Melkote.

data compression conference | 2010

Estimation-Theoretic Delayed Decoding of Predictively Encoded Video Sequences

Jingning Han; Vinay Melkote; Kenneth Rose

Current video coding schemes employ motion compensation to exploit the fact that the signal forms an auto-regressive process along the motion trajectory, and remove temporal redundancies with prior reconstructed samples via prediction. However, the decoder may, in principle, also exploit correlations with received encoding information of future frames. In contrast to current decoders that reconstruct every block immediately as the corresponding quantization indices are available, we propose an estimation-theoretic delayed decoding scheme which leverages quantization and motion information of one or more future frames to refine the reconstruction of the current block. The scheme, implemented in the transform domain, efficiently combines all available (including future) information in an appropriately derived conditional pdf, to obtain the optimal delayed reconstruction of each transform coefficient in the frame. Experiments demonstrate substantial gains over the standard H.264 decoder. The scheme learns the autoregressive model from information available to the decoder, and compatibility with the standard syntax and existing encoders is retained.

international conference on image processing | 2010

Transform-domain temporal prediction in video coding: Exploiting correlation variation across coefficients

Jingning Han; Vinay Melkote; Kenneth Rose

Temporal prediction in standard video coding is performed in the spatial domain, where each pixel is predicted from a motion-compensated reconstructed pixel in a prior frame. This paper is premised on the realization that such standard prediction treats each pixel independently and ignores underlying spatial correlations, while transform-domain prediction would eliminate much of the spatial correlation before signal components (transform coefficients) are independently predicted. Moreover, the true temporal correlations emerge after signal decomposition, and vary considerably from low to high frequency components. This precise nature of the temporal dependencies is entirely masked in spatial domain prediction by the high temporal correlation coefficient (ρ ≈ 1) imposed on all pixels by the dominant low frequency components. We derive optimal transform-domain per-coefficient predictors for three main settings: basic inter-frame prediction; bi-directional prediction; and enhancement-layer prediction in scalable coding. Experimental results provide evidence for substantial performance gains in all settings.
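The claim that a single correlation coefficient of ρ ≈ 1 is a poor fit for high-frequency components can be illustrated with a small worked example. The sketch below is not the predictor derived in the paper. It assumes a stationary first-order model per coefficient and uses the textbook result that predicting x_t as gain · x_{t-1} gives a mean squared error of variance · (1 − 2·gain·ρ + gain²). The function name `prediction_mse` and the listed ρ values are hypothetical.

```python
def prediction_mse(rho, gain, variance=1.0):
    """Mean squared error when predicting x_t as gain * x_{t-1}.

    Assumes a stationary process with variance `variance` and
    lag-one correlation coefficient `rho`:
        E[(x_t - gain * x_{t-1})^2] = variance * (1 - 2*gain*rho + gain^2)
    """
    return variance * (1.0 - 2.0 * gain * rho + gain * gain)


# Hypothetical per-coefficient correlations: high at low frequency,
# lower at high frequency (illustrative values, not from the paper).
rhos = [0.99, 0.9, 0.7, 0.5]

for rho in rhos:
    copy_mse = prediction_mse(rho, 1.0)    # conventional: copy the reference
    scaled_mse = prediction_mse(rho, rho)  # per-coefficient gain equal to rho
    print(f"rho={rho:.2f}  copy={copy_mse:.4f}  scaled={scaled_mse:.4f}")
```

Under this model the gap between the two predictors widens as ρ falls, which is consistent with the abstract's point that the benefit lies mainly in components whose temporal correlation is well below 1.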

multimedia signal processing | 2011

Transform-domain temporal prediction in video coding with spatially adaptive spectral correlations

Jingning Han; Vinay Melkote; Kenneth Rose

Temporal prediction in standard video coding is performed in the spatial domain, where each pixel block is predicted from a motion-compensated pixel block in a previously reconstructed frame. Such prediction treats each pixel independently and ignores underlying spatial correlations. In contrast, this paper proposes a paradigm for motion-compensated prediction in the transform domain that eliminates much of the spatial correlation before individual frequency components along a motion trajectory are independently predicted. The proposed scheme exploits the true temporal correlations, which emerge only after signal decomposition and vary considerably from low to high frequency. The scheme spatially and temporally adapts to the evolving source statistics via a recursive procedure to obtain the cross-correlation between transform coefficients on the same motion trajectory. This recursion involves already reconstructed data and precludes the need for any additional side-information in the bit-stream. Experiments demonstrate substantial performance gains in comparison with the standard codec that employs conventional pixel domain motion-compensated prediction.

international conference on image processing | 2010

Estimation-theoretic approach to delayed prediction in scalable video coding

Jingning Han; Vinay Melkote; Kenneth Rose

Scalable video coding (SVC) employs inter-frame prediction at the base and/or the enhancement layers. Since the base layer can be encoded/decoded independently of the enhancement layers, we consider here the potential gains when prediction at the enhancement layers is delayed to accumulate and incorporate additional future information from the base layer. We build on two basic estimation-theoretic (ET) approaches developed by our group: an ET approach for enhancement layer prediction that optimally combines current base layer with prior enhancement layer information, and our recent ET approach for delayed decoding. The proposed technique fully exploits all the available information from the base layer, including any future frame information, and past enhancement layer information. It achieves considerable gains over zero-delay techniques including both standard SVC, and SVC with optimal ET prediction (but with zero encoding delay).

international conference on acoustics, speech, and signal processing | 2012

An estimation-theoretic approach to spatially scalable video coding

Jingning Han; Vinay Melkote; Kenneth Rose

This paper focuses on prediction optimality in spatially scalable video coding. It is inspired by the earlier estimation-theoretic prediction framework developed by our group for quality (SNR) scalability, which achieved optimality by fully accounting for relevant information from the current base layer (e.g., quantization intervals) and the enhancement layer, to efficiently calculate the conditional expectation that forms the optimal predictor. It was central to that approach that all layers reconstruct approximations to the same original transform coefficient. In spatial scalability, however, the layers encode different resolution versions of the signal. To approach optimality in enhancement layer prediction, the current work departs from existing spatially scalable codecs that employ pixel-domain resampling to perform inter-layer prediction. Instead, it incorporates a transform-domain resampling technique that ensures that the base layer quantization intervals are accessible and usable at the enhancement layer, which, in conjunction with prior enhancement layer information, enables optimal prediction. Simulations provide experimental evidence that the proposed approach achieves substantial enhancement layer coding gains over the standard.
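The conditional expectation mentioned above can be illustrated for a single transform coefficient. The following is a minimal sketch rather than the paper's method. It assumes the coefficient follows a Laplacian density centred on a prediction from the prior enhancement layer frame, and that the base layer restricts the coefficient to a known quantization interval [lo, hi). The Laplacian model, the function names, and the numerical integration are all assumptions made for illustration.

```python
import math


def laplacian_pdf(x, centre, scale):
    """Laplacian density with the given centre and scale (assumed model)."""
    return math.exp(-abs(x - centre) / scale) / (2.0 * scale)


def conditional_mean(centre, scale, lo, hi, steps=10000):
    """Approximate E[x | lo <= x < hi] for x ~ Laplacian(centre, scale).

    `centre` stands in for the prediction from the prior enhancement
    layer frame; [lo, hi) stands in for the base layer quantization
    interval. The integrals are approximated with the midpoint rule.
    """
    dx = (hi - lo) / steps
    weighted_sum = 0.0
    mass = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        p = laplacian_pdf(x, centre, scale)
        weighted_sum += x * p
        mass += p
    return weighted_sum / mass


# Interval symmetric about the prediction: the estimate stays at the centre.
print(conditional_mean(0.0, 1.0, -1.0, 1.0))
# Interval lying entirely above the prediction: the estimate is pulled
# towards the interval edge nearest the prediction.
print(conditional_mean(0.0, 1.0, 2.0, 4.0))
```

The design point is that neither source of information is used alone: the interval bounds the estimate, while the density shaped by the enhancement layer prediction decides where inside the interval it falls.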

international conference on image processing | 2011

A unified framework for spectral domain prediction and end-to-end distortion estimation in scalable video coding

Jingning Han; Vinay Melkote; Kenneth Rose

A novel scalable coding approach is proposed for video transmission over lossy networks, which builds on two estimation-theoretic (ET) paradigms previously developed by our group: (1) an ET approach to enhancement layer prediction in scalable video coding (ET-SVC) that optimally combines all available information from both the current base layer and prior enhancement layer frames, and (2) the spectral coefficient-wise optimal recursive estimate (SCORE) of end-to-end distortion. SCORE provides the encoder with an estimate of distortion per decoder-reconstructed transform coefficient, accounting for the effects of quantization, concealment, packet loss and error propagation via the prediction loop. The current work significantly extends the scope of SCORE to encompass the setting of ET-SVC, whose prediction involves non-linear operations. This advance enables optimization of ET-SVC systems for transmission over lossy networks, thereby combining optimal prediction with optimal mode decisions at the enhancement layer. Experiments first demonstrate the estimation accuracy of SCORE in the settings of the ET-SVC coder. They then show considerable gains when SCORE is incorporated into ET-SVC to optimize encoding decisions under a wide range of packet loss and bit rates.

international conference on acoustics, speech, and signal processing | 2010

Optimal delayed decoding of predictively encoded sources

Vinay Melkote; Kenneth Rose

Predictive coding eliminates redundancy due to correlations between the current and past signal samples, so that only the innovation, or prediction residual, needs to be encoded. However, the decoder may, in principle, also exploit correlations with future samples. Prior decoder enhancement work mainly applied a non-causal filter to smooth the regular decoder reconstruction. In this work we broaden the scope to pose the problem: Given an allowed decoding delay, what is the optimal decoding algorithm for predictively encoded sources? To exploit all information available to the decoder, the proposed algorithm recursively estimates conditional probability densities, given both past and available future information, and computes the optimal reconstruction via conditional expectation. We further derive a near-optimal low complexity approximation to the optimal decoder, which employs a time-invariant lookup table or codebook approach. Simulations indicate that the latter method closely approximates the optimal delayed decoder, and that both considerably outperform the competition.

international conference on acoustics, speech, and signal processing | 2010

Joint optimization of the perceptual core and lossless compression layers in scalable audio coding

Emmanuel Ravelli; Vinay Melkote; Tejaswi Nanjundaswamy; Kenneth Rose

MPEG-4 High-Definition Advanced Audio Coding (HD-AAC) enables scalable-to-lossless (SLS) audio coding with an Advanced Audio Coding (AAC) base layer, and fine-grained enhancements based on the MPEG SLS standard. While the AAC core offers better perceptual quality at lossy bit-rates, its inclusion has been observed to compromise the ultimate lossless compression performance as compared to the SLS ‘non-core’ (i.e., without an AAC base layer) codec. In contrast, the latter provides excellent lossless compression but with significantly degraded audio quality at low bit-rates. We propose a trellis-based approach to directly optimize the trade-off between the quality of the AAC core and the lossless compression performance of SLS. Simulations to test the effectiveness of the approach demonstrate the capability to adjust the trade-off to match application specific needs. Moreover, such optimization can in fact achieve an AAC core of superior perceptual quality while maintaining state-of-the-art (and surprisingly sometimes even better) lossless compression, all this in compliance with the HD-AAC standard.

2010 18th International Packet Video Workshop | 2010

A recursive optimal spectral estimate of end-to-end distortion in video communications

Jingning Han; Vinay Melkote; Kenneth Rose

End-to-end distortion estimation is critical to effective error-resilient video coding. The recursive optimal per-pixel estimate (ROPE) is a known approach to compute up to second moments of decoder-reconstructed pixels, and thereby optimally estimate the distortion. ROPE accurately accounts for encoding/decoding operations that are recursive in the pixel domain, and their interaction with packet loss and decoder concealment. The premise of this work is that considerable gains could be recouped by a dual estimation technique that would perform its recursion in the transform domain. This opens the door to accurate distortion estimation in conjunction with estimation-theoretic source coding approaches that involve transform domain operations, including improved prediction in both single-layer and scalable video coding. We present a novel recursive optimal estimate that operates entirely in the transform domain, namely, the spectral coefficient-wise optimal recursive estimate (SCORE). The method overcomes intricacies due to motion compensation from “off-grid” blocks. We first demonstrate that its accuracy matches ROPE in the usual setting where ROPE is known to be optimal. Then we consider an enhanced encoding scenario involving spectral operations that cannot be accurately tracked by ROPE, but for which SCORE still maintains optimality and hence enables substantial end-to-end performance gains over a large range of packet loss rates.
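The pixel-domain recursion that ROPE performs can be sketched for one inter-coded pixel. This is a simplified illustration under stated assumptions, not the full algorithm and not SCORE: it assumes zero motion, a single packet loss probability, and concealment by copying the co-located pixel from the previous decoder frame. The function names are hypothetical.

```python
def rope_update(prev_m1, prev_m2, residual, loss_prob):
    """Update first and second moments of a decoder-reconstructed pixel.

    prev_m1, prev_m2: E[x~] and E[x~^2] of the reference pixel in the
    previous frame. residual: quantized prediction residual sent by the
    encoder. Assumes the lost pixel is concealed by copying the
    reference pixel (a simplifying assumption).
    """
    # Packet received: reconstruction is residual + reference pixel.
    received_m1 = residual + prev_m1
    received_m2 = residual ** 2 + 2.0 * residual * prev_m1 + prev_m2
    # Packet lost: reconstruction is the copied reference pixel.
    m1 = (1.0 - loss_prob) * received_m1 + loss_prob * prev_m1
    m2 = (1.0 - loss_prob) * received_m2 + loss_prob * prev_m2
    return m1, m2


def expected_distortion(original, m1, m2):
    """E[(x - x~)^2] expressed through the two tracked moments."""
    return original ** 2 - 2.0 * original * m1 + m2


# Reference pixel known exactly (value 100), residual 5, original 106.
m1, m2 = rope_update(100.0, 100.0 ** 2, 5.0, 0.1)
print(expected_distortion(106.0, m1, m2))
```

Because only two moments per pixel are carried from frame to frame, the encoder can evaluate the expected distortion without simulating individual loss patterns.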

IEEE Transactions on Image Processing | 2013

Estimation-Theoretic Approach to Delayed Decoding of Predictively Encoded Video Sequences

Jingning Han; Vinay Melkote; Kenneth Rose

Current video coders employ predictive coding with motion compensation to exploit temporal redundancies in the signal. In particular, blocks along a motion trajectory are modeled as an auto-regressive (AR) process, and it is generally assumed that the prediction errors are temporally independent and approximate the innovations of this process. Thus, zero-delay encoding and decoding is considered efficient. This paper is premised on the largely ignored fact that these prediction errors are, in fact, temporally dependent due to quantization effects in the prediction loop. It presents an estimation-theoretic delayed decoding scheme, which exploits information from future frames to improve the reconstruction quality of the current frame. In contrast to the standard decoder that reproduces every block instantaneously once the corresponding quantization indices of residues are available, the proposed delayed decoder efficiently combines all accessible (including any future) information in an appropriately derived probability density function, to obtain the optimal delayed reconstruction per transform coefficient. Experiments demonstrate significant gains over the standard decoder. Requisite information about the source AR model is estimated in a spatio-temporally adaptive manner from a bit-stream conforming to the H.264/AVC standard, i.e., no side information needs to be sent to the decoder in order to employ the proposed approach, so compatibility with the standard syntax and existing encoders is retained.
