Amar Mukherjee | Researchain

Publication

Featured research published by Amar Mukherjee.

Archive | 2008

The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching

Donald A. Adjeroh; Tim Bell; Amar Mukherjee

The Burrows-Wheeler Transform is a text transformation scheme that has found applications in different aspects of the data explosion problem, from data compression to index structures and search. The BWT belongs to a new class of compression algorithms, distinguished by its ability to perform compression by sorted contexts. More recently, the BWT has also found various applications in addition to text data compression, such as in lossless and lossy image compression, tree-source identification, bioinformatics, machine translation, shape matching, and test data compression. This book will serve as a reference for seasoned professionals or researchers in the area, while providing a gentle introduction, making it accessible for senior undergraduate students or first-year graduate students embarking upon research in compression, pattern matching, full text retrieval, compressed index structures, or other areas related to the BWT.

Key Features

Comprehensive resource for information related to different aspects of the Burrows-Wheeler Transform, including:
- Gentle introduction to the BWT
- History of the development of the BWT
- Detailed theoretical analysis of algorithmic issues and performance limits
- Searching on BWT compressed data
- Hardware architectures for the BWT

Explores non-traditional applications of the BWT in areas such as:
- Bioinformatics
- Joint source-channel coding
- Modern information retrieval
- Machine translation
- Test data compression for systems-on-chip

Teaching materials ideal for classroom use on courses in:
- Data Compression and Source Coding
- Modern Information Retrieval
- Information Science
- Digital Libraries
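
As a rough illustration of what "compression by sorted contexts" builds on, the sketch below computes a naive forward and inverse BWT. The function names, the `$` end-of-text sentinel, and the quadratic rotation sort are assumptions chosen for brevity; they are not taken from the book, and practical implementations typically rely on suffix arrays instead.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive forward BWT: sort all rotations, keep the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Naive inverse BWT: rebuild the sorted rotation table column by column."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        # Prepend the last column to every row, then re-sort the rows.
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

Note how the output groups identical characters ("nn", "aa"): characters that precede similar contexts end up adjacent, which is what later coding stages exploit.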

IEEE Transactions on Circuits and Systems | 1991

Efficient VLSI designs for data transformation of tree-based codes

Amar Mukherjee; Nagarajan Ranganathan; Mostafa A. Bassiouni

A class of VLSI architectures for data transformation of tree-based codes is proposed, concentrating on transformation functions used for data compression and decompression. Two algorithms are presented: a sequential algorithm that generates the code bits serially one bit per machine cycle, and a parallel algorithm that generates the entire code bits of a symbol in one machine cycle. The algorithms use the principle of propagation of a token in a reverse binary tree constructed from the original codes. The design approaches are applicable to any binary codes, although the static Huffman code is used as an illustration. A hardware algorithm for generating adaptive Huffman codes is proposed, and a VLSI architecture for implementing the algorithm is described. The high speed of the algorithms ensures that data transformation is done on the fly, as data are being transferred from/to high-speed I/O communication devices.

International Conference on Computer Graphics and Interactive Techniques | 2006

Generalized wavelet product integral for rendering dynamic glossy objects

Weifeng Sun; Amar Mukherjee

We consider real-time rendering of dynamic glossy objects with realistic shadows under distant all-frequency environment lighting. Previous PRT approaches pre-compute light transport for a fixed scene and cannot account for cast shadows on high-glossy objects occluded by dynamic neighbors. In this paper, we extend double/triple product integral to generalized multi-function product integral. We represent shading integral at each vertex as the product integral of multiple functions, involving the lighting, BRDF, local visibility and dynamic occlusions. Our main contribution is a new mathematical representation and analysis of multi-function product integral in the wavelet domain. We show that multi-function product integral in the primal corresponds to the summation of the product of basis coefficients and integral coefficients. We propose a novel generalized Haar integral coefficient theorem to evaluate arbitrary Haar integral coefficients. We present an efficient sub-linear algorithm to render dynamic glossy objects under time-variant all-frequency lighting and arbitrary view conditions in a few seconds on a commodity CPU, orders of magnitude faster than previous techniques. To further accelerate shadow computation, we propose a Just-in-time Radiance Transfer (JRT) technique. JRT is a new generalization to PRT for dynamic scenes. It is compact and flexible, and supports glossy materials. By pre-computing radiance transfer vectors at runtime, we demonstrate rendering dynamic view-dependent all-frequency shadows in real-time.

IEEE Transactions on Very Large Scale Integration Systems | 1993

MARVLE: a VLSI chip for data compression using tree-based codes

Amar Mukherjee; Nagarajan Ranganathan; Jeffrey W. Flieder; Tinku Acharya

Describes the architecture and design of a CMOS VLSI chip for data compression and decompression using tree-based codes. The chip, called MARVLE, implements a memory-based architecture for variable length encoding and decoding based on tree-based codes. The architecture is based on an efficient scheme of mapping the tree representing any binary code onto a memory device. A prototype 2-μm CMOS VLSI chip has been designed, verified, and fabricated by the MOSIS facility. The chip has a 512×12 static RAM with an access time of 4 ns and logic circuitry for compression as well as decompression. The chip occupies a silicon area of 6.8 mm × 6.9 mm and consists of 49695 transistors. The prototype chip yields a compression rate of 95.2 Mb/s and a decompression rate of 60.6 Mb/s with a clock rate of 83.3 MHz. The VLSI hardware can be used to implement the JPEG baseline compression scheme.

Computational Systems Bioinformatics | 2002

DNA sequence compression using the Burrows-Wheeler Transform

Donald A. Adjeroh; Yong Zhang; Amar Mukherjee; Matt Powell; Tim Bell

We investigate off-line dictionary oriented approaches to DNA sequence compression, based on the Burrows-Wheeler Transform (BWT). The preponderance of short repeating patterns is an important phenomenon in biological sequences. Here, we propose off-line methods to compress DNA sequences that exploit the different repetition structures inherent in such sequences. Repetition analysis is performed based on the relationship between the BWT and important pattern matching data structures, such as the suffix tree and suffix array. We discuss how the proposed approach can be incorporated in the BWT compression pipeline.

International Conference on Information Technology: Coding and Computing | 2001

LIPT: a lossless text transform to improve compression

Fauzia S. Awan; Amar Mukherjee

We propose an approach to develop a dictionary based reversible lossless text transformation, called LIPT (length index preserving transform), which can be applied to a source text to improve the existing algorithms' ability to compress. In LIPT, the length of the input word and the offset of the words in the dictionary are denoted with letters of the alphabet. Our encoding scheme makes use of the recurrence of same length words in the English language to create context in the transformed text that the entropy coders can exploit. LIPT also achieves some compression at the preprocessing stage and retains enough context and redundancy for the compression algorithms to give better results. Bzip2 with LIPT gives 5.24% improvement in average BPC over Bzip2 without LIPT, and PPMD with LIPT gives 4.46% improvement in average BPC over PPMD without LIPT, for our test corpus.

Data Compression Conference | 2002

Searching BWT compressed text with the Boyer-Moore algorithm and binary search

Tim Bell; Matt Powell; Amar Mukherjee; Donald A. Adjeroh

This paper explores two techniques for on-line exact pattern matching in files that have been compressed using the Burrows-Wheeler transform. We investigate two approaches. The first is an application of the Boyer-Moore algorithm (1977) to a transformed string. The second approach is based on the observation that the transform effectively contains a sorted list of all substrings of the original text, which can be exploited for very rapid searching using a variant of binary search. Both methods are faster than a decompress-and-search approach for small numbers of queries, and binary search is much faster even for large numbers of queries.
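
The binary-search idea can be illustrated with a simplified sketch. It assumes access to the uncompressed text and a naively built suffix array, whereas the paper works on the BWT output itself; the names `suffix_array` and `find_occurrences` are hypothetical and not taken from the paper.

```python
def suffix_array(text: str) -> list[int]:
    """Naive suffix array: start positions of all suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return all start positions of pattern using two binary searches."""
    m = len(pattern)

    def prefix(rank: int) -> str:
        # First m characters of the suffix at the given rank.
        start = sa[rank]
        return text[start:start + m]

    # Lower bound: first rank whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first rank whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(mid) <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[first:lo])


if __name__ == "__main__":
    text = "mississippi"
    sa = suffix_array(text)
    print(find_occurrences(text, sa, "ssi"))  # [2, 5]
```

Because all suffixes sharing a prefix sit in one contiguous range of the sorted order, each query costs a logarithmic number of string comparisons once the sorted structure exists.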

IEEE Transactions on Computers | 1989

Hardware algorithms for determining similarity between two strings

Amar Mukherjee

The author presents pipelined hardware algorithms with time complexity O(n+m) for determining the similarity between two character strings, expressed as the length of the longest common subsequence of the given pair of strings. The algorithms use cellular architecture with simple basic cells and regular nearest-neighbor communication generally suitable for VLSI implementation. Two methods are presented: a sequential method with serial text input and an alternating method in which both the pattern and the text are serially applied to the machine. Theorems are proved that lead to a highly optimized design of a basic cell.
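
For reference, the similarity measure itself (the length of the longest common subsequence) can be computed in software with the textbook dynamic-programming recurrence. The sketch below is that sequential recurrence, not the paper's cellular design, and the name `lcs_length` is an assumption; run sequentially it takes time proportional to n·m rather than the O(n+m) of the pipelined hardware.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (rolling-row DP)."""
    prev = [0] * (len(b) + 1)  # DP row for the previous character of a
    for ch_a in a:
        curr = [0]  # column 0: LCS with an empty prefix of b is 0
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the diagonal value by one.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))  # 4
```

Each cell depends only on its left, upper, and upper-left neighbours, which is the kind of regular nearest-neighbor dependency that lends itself to a cellular array.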

Journal of Computational Biology | 2006

An Optimal Algorithm for Perfect Phylogeny Haplotyping

Ravi Vijayasatya; Amar Mukherjee

Inferring haplotype data from genotype data is a crucial step in linking SNPs to human diseases. Given n genotypes over m SNP sites, the haplotype inference (HI) problem deals with finding a set of haplotypes so that each given genotype can be formed by combining a pair of haplotypes from the set. The perfect phylogeny haplotyping (PPH) problem is one of the many computational approaches to the HI problem. Though it was conjectured that the complexity of the PPH problem was O(nm), the complexity of all the solutions presented until recently was O(nm^2). In this paper, we make complete use of the column-ordering that was presented earlier and show that there must be some interdependencies among the pairwise relationships between SNP sites in order for the given genotypes to allow a perfect phylogeny. Based on these interdependencies, we introduce the FlexTree (flexible tree) data structure that represents all the pairwise relationships in O(m) space. The FlexTree data structure provides a compact representation of all the perfect phylogenies for the given set of genotypes. We also introduce an ordering of the genotypes that allows the genotypes to be added to the FlexTree sequentially. The column ordering, the FlexTree data structure, and the row ordering we introduce make the O(nm) OPPH algorithm possible. We present some results on simulated data which demonstrate that the OPPH algorithm performs quite impressively when compared to the previous algorithms. The OPPH algorithm is one of the first O(nm) algorithms presented for the PPH problem.
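
To make the problem statement concrete, here is a small checker sketch; it is not the OPPH algorithm or the FlexTree. It assumes the common encoding in which haplotypes are 0/1 vectors and a genotype site is 0 or 1 when both haplotypes agree and 2 when they differ. The function names are hypothetical, and the perfect-phylogeny check uses the standard four-gamete condition for the case where the root is not fixed.

```python
from itertools import combinations_with_replacement


def combine(h1, h2):
    """Genotype formed by two haplotypes: 2 marks a heterozygous site."""
    return tuple(a if a == b else 2 for a, b in zip(h1, h2))


def explains(haplotypes, genotypes):
    """True if every genotype is formed by some pair of the haplotypes."""
    formed = {
        combine(h1, h2)
        for h1, h2 in combinations_with_replacement(haplotypes, 2)
    }
    return all(tuple(g) in formed for g in genotypes)


def passes_four_gamete_test(haplotypes):
    """True if no pair of sites shows all four combinations 00, 01, 10, 11."""
    if not haplotypes:
        return True
    m = len(haplotypes[0])
    for i in range(m):
        for j in range(i + 1, m):
            if len({(h[i], h[j]) for h in haplotypes}) == 4:
                return False
    return True


if __name__ == "__main__":
    haps = [(0, 1, 0), (1, 1, 0)]
    print(explains(haps, [(2, 1, 0), (0, 1, 0)]))
    print(passes_four_gamete_test(haps))
```

This brute-force verification only checks a proposed answer; the difficulty addressed by the paper is finding such a haplotype set efficiently.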

Journal of the Association for Information Science and Technology | 1995

Efficient decoding of compressed data

Mostafa A. Bassiouni; Amar Mukherjee

In this article, we discuss the problem of enhancing the speed of Huffman decoding. One viable solution to this problem is the multibit scheme which uses the concept of k‐bit trees to decode up to k bits at a time. A linear‐time optimal solution for the mapping of 2‐bit trees into memory is presented. The optimal solution is derived by formulating the memory mapping problem as a binary string mapping problem and observing that at most four different 4‐bit patterns can occur within any 2‐bit Huffman tree. In addition to reducing the processing time of decoding, the optimal scheme is storage efficient, does not require changes to the encoding process, and is suitable for hardware implementations.
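
The general idea of decoding several bits per step can be sketched with a lookup table indexed by the current tree node and the next k bits. This is a generic table-driven illustration and does not reproduce the article's optimal memory mapping; the function names and the toy code are assumptions, and the sketch assumes a complete prefix code and a bit string whose length is a multiple of k.

```python
from itertools import product


def build_trie(codes):
    """Build internal nodes of the code tree; node 0 is the root.

    Each node maps '0'/'1' to ('leaf', symbol) or ('node', index).
    """
    nodes = [{}]
    for symbol, bits in codes.items():
        cur = 0
        for b in bits[:-1]:
            if b not in nodes[cur]:
                nodes.append({})
                nodes[cur][b] = ("node", len(nodes) - 1)
            cur = nodes[cur][b][1]
        nodes[cur][bits[-1]] = ("leaf", symbol)
    return nodes


def build_table(nodes, k):
    """Map (node, k-bit chunk) to (symbols emitted, node reached)."""
    table = {}
    for state in range(len(nodes)):
        for chunk in map("".join, product("01", repeat=k)):
            cur, emitted = state, []
            for b in chunk:
                kind, value = nodes[cur][b]
                if kind == "leaf":
                    emitted.append(value)
                    cur = 0  # restart at the root after each symbol
                else:
                    cur = value
            table[state, chunk] = (emitted, cur)
    return table


def decode(bits, table, k):
    """Decode a bit string k bits at a time using the precomputed table."""
    state, out = 0, []
    for i in range(0, len(bits), k):
        emitted, state = table[state, bits[i:i + k]]
        out.extend(emitted)
    return out


if __name__ == "__main__":
    codes = {"a": "0", "b": "10", "c": "11"}  # toy prefix code
    table = build_table(build_trie(codes), k=2)
    print("".join(decode("01011010", table, k=2)))  # abcab
```

The encoder is untouched: only the decoder's traversal changes, trading a larger table for fewer steps per decoded symbol.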