Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Debargha Mukherjee is active.

Publication


Featured research published by Debargha Mukherjee.


Storage and Retrieval for Image and Video Databases | 1997

NeTra-V: toward an object-based video representation

Yining Deng; Debargha Mukherjee; B. S. Manjunath

There is a growing need for new representations of video that allow not only compact storage of data but also content-based functionalities such as search and manipulation of objects. We present here a prototype system, called NeTra-V, that is currently being developed to address some of these content-related issues. The system has a two-stage video processing structure: a global feature extraction and clustering stage, and a local feature extraction and object-based representation stage. Key aspects of the system include a new spatio-temporal segmentation and object-tracking scheme, and a hierarchical object-based video representation model. The spatio-temporal segmentation scheme combines color/texture image segmentation and affine motion estimation techniques. Experimental results show that the proposed approach can handle large motion. The output of the segmentation, the alpha plane as it is referred to in MPEG-4 terminology, can be used to compute local image properties. This local information forms the low-level content description module in our video representation. Experimental results illustrating spatio-temporal segmentation and tracking are provided.
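The affine motion component of the segmentation scheme can be made concrete with a small sketch. Below is a minimal NumPy illustration of the 6-parameter affine motion model that such region tracking typically fits; the function name and parameter layout are illustrative, not taken from the paper.

```python
import numpy as np

def affine_warp_coords(coords, params):
    """Map (x, y) pixel coordinates through a 6-parameter affine
    motion model: x' = a0 + a1*x + a2*y,  y' = a3 + a4*x + a5*y."""
    a0, a1, a2, a3, a4, a5 = params
    x, y = coords[:, 0], coords[:, 1]
    return np.stack([a0 + a1 * x + a2 * y,
                     a3 + a4 * x + a5 * y], axis=1)

# A region translating by (2, -1) with a slight rotation/zoom.
coords = np.array([[0.0, 0.0], [10.0, 5.0], [3.0, 7.0]])
print(affine_warp_coords(coords, (2.0, 1.01, -0.02, -1.0, 0.02, 1.01)))
```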


IEEE Transactions on Circuits and Systems for Video Technology | 1996

Subband DCT: definition, analysis, and applications

Sung-Hwan Jung; Sanjit K. Mitra; Debargha Mukherjee

The discrete cosine transform (DCT) is well known for its highly efficient coding performance and is widely used in many image compression applications. However, in low bit rate coding, it produces undesirable block artifacts that are not visually pleasing. In addition, in many practical applications, faster computation and easier VLSI implementation of DCT coefficients are also important issues. The removal of the block artifacts and faster DCT computation are therefore of practical interest. In this paper, we investigate a modified DCT computation scheme, to be called the subband DCT (SB-DCT), that provides a simple, efficient solution to the reduction of the block artifacts while achieving faster computation. We have applied the new approach for the low bit rate coding and decoding of images. Simulation results on real images have verified the improved performance obtained using the proposed method over the standard JPEG method.
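The core SB-DCT idea, approximating the low-frequency DCT coefficients of a block from the DCT of its low-pass subband, can be sketched in a few lines. A minimal NumPy/SciPy illustration, assuming orthonormal transforms and simple 2x2 averaging as the low-pass analysis (the paper's filter bank and scaling details differ):

```python
import numpy as np
from scipy.fft import dctn

def subband_dct_lowband(block):
    """Approximate the 4x4 low-frequency corner of an 8x8 DCT from the
    DCT of the 2x2-averaged (low-low subband) block. With orthonormal
    transforms the two agree up to a factor of 2 for smooth content."""
    low = block.reshape(4, 2, 4, 2).mean(axis=(1, 3))  # 2x2 averaging
    return 2.0 * dctn(low, type=2, norm='ortho')

# A smooth 8x8 test block (low-pass content).
x = np.outer(np.linspace(0, 1, 8), np.linspace(1, 2, 8)) * 100
full = dctn(x, type=2, norm='ortho')[:4, :4]
approx = subband_dct_lowband(x)
print(float(np.abs(full - approx).max()))  # small for smooth blocks
```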


Picture Coding Symposium | 2013

The latest open-source video codec VP9 - An overview and preliminary results

Debargha Mukherjee; Jim Bankoski; Adrian Grange; Jingning Han; John Koleszar; Paul Wilkins; Yaowu Xu; Ronald Sebastiaan Bultje

Google has recently finalized a next-generation open-source video codec called VP9, as part of the libvpx repository of the WebM project (http://www.webmproject.org/). Starting from the VP8 video codec released by Google in 2010 as the baseline, various enhancements and new tools were added, resulting in the next-generation VP9 bit-stream. This paper provides a brief technical overview of VP9 along with comparisons with other state-of-the-art video codecs H.264/AVC and HEVC on standard test sets. Results show VP9 to be quite competitive with mainstream state-of-the-art codecs.


IEEE Transactions on Circuits and Systems for Video Technology | 2003

Vector SPIHT for embedded wavelet video and image coding

Debargha Mukherjee; Sanjit K. Mitra

The set partitioning in hierarchical trees (SPIHT) approach for still-image compression proposed by Said and Pearlman (1996) is one of the most efficient embedded monochrome image compression schemes known to date. The algorithm relies on a very efficient scanning and bit-allocation scheme for quantizing the coefficients obtained by a wavelet decomposition of an image. In this paper, we adapt this approach to scan groups (vectors) of wavelet coefficients, and use successive refinement vector quantization (VQ) techniques with staggered bit-allocation to quantize the groups at once. The scheme is named vector SPIHT (VSPIHT). We present discussions on possible models for the distributions of the coefficient vectors, and show how trained classified tree-multistage VQ techniques can be used to efficiently quantize them. Extensive coding results comparing VSPIHT to scalar SPIHT in the mean-squared-error sense are presented for monochrome images. VSPIHT is found to yield superior performance for most images, especially those with high detail content. The method is also applied to color video coding, where a partially scalable bitstream is generated. We present coding results on QCIF sequences compared against H.263.
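The successive-refinement quantization at the heart of VSPIHT can be illustrated with a toy multistage VQ: each stage quantizes the residual left by the previous one. The random codebooks below stand in for the trained classified tree-multistage codebooks of the paper; everything here is a simplified sketch.

```python
import numpy as np

def msvq_encode(vec, codebooks):
    """Multistage VQ: stage 1 quantizes the vector, each later stage
    quantizes the residual left by the stages before it."""
    residual = np.asarray(vec, dtype=float)
    indices, recon = [], np.zeros_like(residual)
    for cb in codebooks:
        i = int(np.argmin(((cb - residual) ** 2).sum(axis=1)))  # nearest codeword
        indices.append(i)
        recon += cb[i]
        residual = residual - cb[i]
    return indices, recon

rng = np.random.default_rng(1)
stage1 = rng.normal(scale=1.0, size=(16, 4))  # coarse codebook
stage2 = rng.normal(scale=0.3, size=(16, 4))  # refinement codebook
vec = rng.normal(size=4)
idx, rec = msvq_encode(vec, [stage1, stage2])
print(idx, float(np.linalg.norm(vec - rec)))
```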


IEEE Transactions on Image Processing | 2013

Learning-Based, Automatic 2D-to-3D Image and Video Conversion

Janusz Konrad; Meng Wang; Prakash Ishwar; Chen Wu; Debargha Mukherjee

Despite a significant growth in the last few years, the availability of 3D content is still dwarfed by that of its 2D counterpart. To close this gap, many 2D-to-3D image and video conversion methods have been proposed. Methods involving human operators have been most successful but also time-consuming and costly. Automatic methods, which typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality for they rely on assumptions that are often violated in practice. In this paper, we propose a new class of methods that are based on the radically different approach of learning the 2D-to-3D conversion from examples. We develop two types of methods. The first is based on learning a point mapping from local image/video attributes, such as color, spatial position, and, in the case of video, motion at each pixel, to scene-depth at that pixel using a regression type idea. The second method is based on globally estimating the entire depth map of a query image directly from a repository of 3D images (image+depth pairs or stereopairs) using a nearest-neighbor regression type idea. We demonstrate both the efficacy and the computational efficiency of our methods on numerous 2D images and discuss their drawbacks and benefits. Although far from perfect, our results demonstrate that repositories of 3D content can be used for effective 2D-to-3D image conversion. An extension to video is immediate by enforcing temporal continuity of computed depth maps.
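The second (global, nearest-neighbor) method reduces to a simple recipe: describe the query image with a global feature vector, find its nearest neighbors in a 3D repository, and fuse their depth maps. A toy sketch with random stand-in data (the feature dimension, k, and median fusion are illustrative choices, not the paper's exact pipeline):

```python
import numpy as np

def knn_depth_estimate(query_feat, repo_feats, repo_depths, k=5):
    """Estimate a depth map for a query image by finding its k nearest
    neighbours in a repository of (feature, depth-map) pairs and
    median-fusing their depth maps."""
    d = np.linalg.norm(repo_feats - query_feat, axis=1)
    nearest = np.argsort(d)[:k]
    return np.median(repo_depths[nearest], axis=0)

rng = np.random.default_rng(2)
repo_feats = rng.normal(size=(100, 64))        # global descriptors
repo_depths = rng.uniform(size=(100, 24, 32))  # toy depth maps
query = rng.normal(size=64)
depth = knn_depth_estimate(query, repo_feats, repo_depths)
print(depth.shape)
```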


Proceedings of SPIE | 2013

Towards a next generation open-source video codec

Jim Bankoski; Ronald Sebastiaan Bultje; Adrian Grange; Qunshan Gu; Jingning Han; John Koleszar; Debargha Mukherjee; Paul Wilkins; Yaowu Xu

Google has recently been developing a next-generation open-source video codec called VP9, as part of the experimental branch of the libvpx repository included in the WebM project (http://www.webmproject.org/). Starting from the VP8 video codec released by Google in 2010 as the baseline, a number of enhancements and new tools have been added to improve the coding efficiency. This paper provides a technical overview of the current status of this project along with comparisons against other state-of-the-art video codecs H.264/AVC and HEVC. The new tools that have been added so far include: larger prediction block sizes up to 64x64, various forms of compound INTER prediction, more modes for INTRA prediction, 1/8-pel motion vectors and 8-tap switchable sub-pel interpolation filters, improved motion reference generation and motion vector coding, improved entropy coding and frame-level entropy adaptation for various symbols, improved loop filtering, incorporation of Asymmetric Discrete Sine Transforms and larger 16x16 and 32x32 DCTs, frame-level segmentation to group similar areas together, etc. Other tools and various bitstream features are being actively worked on as well. The VP9 bitstream is expected to be finalized by early to mid 2013. Results show VP9 to be quite competitive in performance with mainstream state-of-the-art codecs.
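Of the listed tools, compound INTER prediction is the simplest to illustrate: two motion-compensated predictors are combined into a single prediction. The sketch below shows the basic rounded-averaging form; the actual VP9 bitstream rules and mode signaling are more involved.

```python
import numpy as np

def compound_predict(pred1, pred2):
    """Combine two motion-compensated predictors by rounded
    averaging -- the basic form of compound INTER prediction."""
    s = pred1.astype(np.int32) + pred2.astype(np.int32) + 1
    return (s >> 1).astype(np.uint8)

rng = np.random.default_rng(3)
p1 = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)  # predictor from ref 1
p2 = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)  # predictor from ref 2
print(compound_predict(p1, p2))
```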


Proceedings of SPIE | 2012

Automatic 2D-to-3D image conversion using 3D examples from the internet

Janusz Konrad; G. Brown; Meng Wang; Prakash Ishwar; Chen Wu; Debargha Mukherjee

The availability of 3D hardware has so far outpaced the production of 3D content. Although to date many methods have been proposed to convert 2D images to 3D stereopairs, the most successful ones involve human operators and, therefore, are time-consuming and costly, while the fully-automatic ones have not yet achieved the same level of quality. This subpar performance is due to the fact that automatic methods usually rely on assumptions about the captured 3D scene that are often violated in practice. In this paper, we explore a radically different approach inspired by our work on saliency detection in images. Instead of relying on a deterministic scene model for the input 2D image, we propose to learn the model from a large dictionary of stereopairs, such as YouTube 3D. Our new approach is built upon a key observation and an assumption. The key observation is that among millions of stereopairs available on-line, there likely exist many stereopairs whose 3D content matches that of the 2D input (query). We assume that two stereopairs whose left images are photometrically similar are likely to have similar disparity fields. Our approach first finds a number of on-line stereopairs whose left image is a close photometric match to the 2D query and then extracts depth information from these stereopairs. Since disparities for the selected stereopairs differ due to differences in underlying image content, level of noise, distortions, etc., we combine them by using the median. We apply the resulting median disparity field to the 2D query to obtain the corresponding right image, while handling occlusions and newly-exposed areas in the usual way. We have applied our method in two scenarios. First, we used YouTube 3D videos in search of the most similar frames. Then, we repeated the experiments on a small, but carefully-selected, dictionary of stereopairs closely matching the query. This, to a degree, emulates the results one would expect from the use of an extremely large 3D repository. While far from perfect, the presented results demonstrate that on-line repositories of 3D content can be used for effective 2D-to-3D image conversion. With the continuously increasing amount of 3D data on-line and with the rapidly growing computing power in the cloud, the proposed framework seems a promising alternative to operator-assisted 2D-to-3D conversion.
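The fusion-and-rendering step can be sketched compactly: take the per-pixel median of the candidate disparity fields, then shift the 2D query by the fused disparity to synthesize the right view. The toy code below omits the occlusion and hole handling the paper performs:

```python
import numpy as np

def synthesize_right_view(left, disparity_fields):
    """Median-fuse candidate disparity fields, then shift the left
    image by the fused disparity to synthesize a right view.
    Occlusions and newly exposed areas are left unhandled here."""
    disp = np.median(np.stack(disparity_fields), axis=0).round().astype(int)
    h, w = left.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.clip(xs - disp, 0, w - 1)  # horizontal shift by disparity
    right = np.zeros_like(left)
    right[ys, xt] = left[ys, xs]
    return right

rng = np.random.default_rng(4)
left = rng.integers(0, 256, size=(24, 32), dtype=np.uint8)
fields = [rng.integers(0, 4, size=(24, 32)) for _ in range(5)]  # candidates
print(synthesize_right_view(left, fields).shape)
```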


IEEE Transactions on Circuits and Systems for Video Technology | 2012

Video Super-Resolution Using Codebooks Derived From Key-Frames

Edson M. Hung; R.L. de Queiroz; Fernanda Brandi; K. F. de Oliveira; Debargha Mukherjee

Example-based super-resolution (SR) is an attractive alternative to Bayesian approaches to enhance image resolution. We use a multiresolution approach to example-based SR and discuss codebook construction for video sequences. We match a block to be super-resolved to a low-resolution version of the reference high-resolution image blocks. Once the match is found, we carefully apply the high-frequency contents of the chosen reference block to the one to be super-resolved. In essence, the method relies on “betting” that if the low-frequency contents of two blocks are very similar, their high-frequency contents might also match. In particular, we are interested in scenarios where examples can be picked up from readily available high-resolution images that are strongly related to the frame to be super-resolved. Hence, they constitute an excellent source of material to construct a dynamic codebook. Here, we propose a method to super-resolve a video using multiple overlapped variable-block-size codebooks. We implemented a mixed-resolution video coding scenario, where some frames are encoded at a higher resolution and can be used to enhance the other lower-resolution ones. In another scenario, we consider the framework where the camera captures a video at a lower resolution and also takes periodic snapshots at a higher resolution. Results indicate substantial gains over interpolation and fixed-codebook SR, and significant gains over previous works as well.
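The block-matching "bet" the method relies on is easy to demonstrate: match a low-resolution block against downsampled codebook entries, then add back the high-frequency content of the best match. A minimal single-block sketch with synthetic data (the fixed 4x4/8x8 sizes and nearest-neighbor upsampling are simplifications; the paper uses overlapped variable-block-size codebooks):

```python
import numpy as np

def super_resolve_block(lr_block, codebook_lr, codebook_hf):
    """Match the block against low-resolution versions of the codebook
    entries, then add the high-frequency content of the best match."""
    errs = ((codebook_lr - lr_block) ** 2).sum(axis=(1, 2))
    best = int(np.argmin(errs))
    up = np.repeat(np.repeat(lr_block, 2, axis=0), 2, axis=1)  # naive upsample
    return up + codebook_hf[best]

rng = np.random.default_rng(6)
cb_hr = rng.normal(size=(32, 8, 8))                      # HR reference blocks
cb_lr = cb_hr.reshape(32, 4, 2, 4, 2).mean(axis=(2, 4))  # their LR versions
cb_hf = cb_hr - np.repeat(np.repeat(cb_lr, 2, axis=1), 2, axis=2)
query = cb_lr[7] + rng.normal(scale=0.05, size=(4, 4))   # noisy LR block
print(float(np.abs(super_resolve_block(query, cb_lr, cb_hf) - cb_hr[7]).max()))
```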


International Conference on Image Processing | 1998

Color image embedding using multidimensional lattice structures

Jong Jin Chae; Debargha Mukherjee; B. S. Manjunath

This paper describes a robust data embedding scheme which uses noise resilient channel codes based on a multidimensional lattice structure. Compared to prior work in digital watermarking, the proposed scheme can handle a significantly larger quantity of signature data such as gray-scale or color images. A trade-off between the quantity of hidden data and the quality of the watermarked image is achieved by varying the number of quantization levels for the signature, and a scale factor for data embedding. Experimental results on signature recovery from JPEG compressed watermarked images show that good quality reconstruction is possible even when the images are lossy compressed by as much as 85%. Potential applications of this method include, in addition to watermarking, digital data hiding for security and for bit stream control and manipulation.
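The embedding principle, quantizing host coefficients onto one of several cosets of a lattice selected by the signature data, can be shown in its simplest one-dimensional form: scalar quantization index modulation. This is a deliberately reduced sketch; the paper uses multidimensional lattice codes and embeds full gray-scale or color signature images.

```python
import numpy as np

def qim_embed(coeffs, bits, delta=8.0):
    """Scalar quantization-index-modulation: each host coefficient is
    snapped to one of two interleaved quantizer cosets, chosen by the
    signature bit (the 1-D analogue of lattice-coset embedding)."""
    offset = np.asarray(bits, dtype=float) * (delta / 2.0)  # coset shift
    return np.round((coeffs - offset) / delta) * delta + offset

rng = np.random.default_rng(5)
host = rng.normal(scale=50.0, size=8)  # stand-in transform coefficients
bits = rng.integers(0, 2, size=8)      # signature bits to hide
print(bits.tolist())
print(qim_embed(host, bits))
```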


Proceedings of the IEEE International Forum on Research and Technology Advances in Digital Libraries (ADL '98) | 1998

A robust data hiding technique using multidimensional lattices

Jong Jin Chae; Debargha Mukherjee; B. S. Manjunath

This paper describes a data hiding technique which uses noise-resilient channel codes based on multidimensional lattices. A trade-off between the quantity of hidden data and the quality of the watermarked image is achieved by varying the number of quantization levels for the signature and a scale factor for data embedding. Experimental results show that the embedding remains transparent even for large amounts of hidden data, and the quality of the extracted signature is high even when the watermarked image is subjected to up to 75% wavelet compression and 85% JPEG lossy compression. These results can be combined with a private key-based scheme to make unauthorized retrieval practically impossible, even with the knowledge of the algorithm.
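The recovery side of the same one-dimensional sketch: each received coefficient is assigned to whichever coset it is nearest, which tolerates perturbations up to a quarter of the quantization step. Again, a simplified stand-in for the multidimensional lattice decoding used in the paper.

```python
import numpy as np

def qim_extract(coeffs, delta=8.0):
    """Decide, per coefficient, which of the two interleaved cosets is
    nearest; correct while perturbations stay below delta/4."""
    q0 = np.round(coeffs / delta) * delta                            # coset 0
    q1 = np.round((coeffs - delta / 2) / delta) * delta + delta / 2  # coset 1
    return (np.abs(coeffs - q1) < np.abs(coeffs - q0)).astype(int)

rng = np.random.default_rng(5)
delta = 8.0
bits = rng.integers(0, 2, size=8)
# Coefficients placed on coset 0 or 1 per bit, then mildly perturbed.
marked = rng.integers(-10, 10, size=8) * delta + bits * (delta / 2)
noisy = marked + rng.normal(scale=0.5, size=8)
print(bits.tolist())
print(qim_extract(noisy, delta).tolist())
```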

Collaboration


Dive into Debargha Mukherjee's collaboration.

Top Co-Authors

Meng Wang

Hefei University of Technology
