Tejaswi Nanjundaswamy
University of California, Santa Barbara
Publication
Featured research published by Tejaswi Nanjundaswamy.
picture coding symposium | 2013
Yue Chen; Jingning Han; Tejaswi Nanjundaswamy; Kenneth Rose
A novel filtering approach that naturally combines information from both intra-frame and motion-compensated referencing for efficient prediction is proposed to fully exploit the spatio-temporal correlations of video signals, thereby achieving superior compression performance. Inspiration was drawn from our recent work on extrapolation-filter-based intra prediction, which views the spatial signal as a non-separable first-order Markov process and employs a 3-tap recursive filter to effectively capture its statistical characteristics. This work significantly extends the scope to further incorporate the motion-compensated reference in a filtering framework, whose coefficients are optimized via a “k-modes”-like iteration that accounts for various factors in the compression process, including variation in statistics in the prediction loop, to minimize the rate-distortion cost. Experiments validate the efficacy of the proposed spatio-temporal approach, which translates into consistent coding performance gains.
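As a rough illustration of the recursive-extrapolation idea underlying this line of work, the sketch below predicts a block with a 3-tap filter applied in raster order from the reconstructed top/left boundary. The coefficient values are placeholders for illustration, not the trained filters from the paper:

```python
import numpy as np

def recursive_intra_predict(top, left, corner, coeffs=(0.45, 0.45, 0.10)):
    """Sketch of 3-tap recursive extrapolation intra prediction.

    Predicts an N x N block from its reconstructed top row, left column and
    top-left corner sample by applying
        p[i, j] = a*p[i-1, j] + b*p[i, j-1] + c*p[i-1, j-1]
    recursively in raster order. The coefficients are illustrative
    placeholders (summing to 1 so a flat boundary stays flat).
    """
    n = len(top)
    a, b, c = coeffs
    # Embed the boundary so the indices i-1 and j-1 stay valid.
    ext = np.zeros((n + 1, n + 1))
    ext[0, 0] = corner
    ext[0, 1:] = top
    ext[1:, 0] = left
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            ext[i, j] = (a * ext[i - 1, j]
                         + b * ext[i, j - 1]
                         + c * ext[i - 1, j - 1])
    return ext[1:, 1:]
```

Because each predicted pixel feeds the next, the filter propagates boundary information across the whole block rather than copying a single row or column.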
international conference on image processing | 2014
Shunyao Li; Yue Chen; Jingning Han; Tejaswi Nanjundaswamy; Kenneth Rose
Conventional “pixel copying” prediction used in current video standards was shown in previous work to be sub-optimal compared to 2-D non-separable Markov model based recursive extrapolation approaches. The premise of this paper is that in order to achieve the full potential of these approaches it is necessary to account for several requirements, namely, the design of prediction modes (and respective extrapolation filters) must optimize a rate-distortion cost rather than minimize the mean squared prediction error; the filters must be of sufficient complexity to cover all necessary directions; and the approach must include adaptation to available information indicative of local statistics. Hence, the proposed system employs four-tap recursive extrapolation filters that can predict from all standard directions, combined with a filter design method that accounts for the overall rate-distortion cost in conjunction with the codec decisions, along with adaptation of filter coefficients to relevant local information provided by encoder decisions on target bit rate and block size. Experimental evidence is provided for substantial coding gains over conventional intra coding.
international conference on acoustics, speech, and signal processing | 2010
Emmanuel Ravelli; Vinay Melkote; Tejaswi Nanjundaswamy; Kenneth Rose
MPEG-4 High-Definition Advanced Audio Coding (HD-AAC) enables scalable-to-lossless (SLS) audio coding with an Advanced Audio Coding (AAC) base layer, and fine-grained enhancements based on the MPEG SLS standard. While the AAC core offers better perceptual quality at lossy bit-rates, its inclusion has been observed to compromise the ultimate lossless compression performance as compared to the SLS ‘non-core’ (i.e., without an AAC base layer) codec. In contrast, the latter provides excellent lossless compression but with significantly degraded audio quality at low bit-rates. We propose a trellis-based approach to directly optimize the trade-off between the quality of the AAC core and the lossless compression performance of SLS. Simulations to test the effectiveness of the approach demonstrate the capability to adjust the trade-off to match application specific needs. Moreover, such optimization can in fact achieve an AAC core of superior perceptual quality while maintaining state-of-the-art (and surprisingly sometimes even better) lossless compression, all this in compliance with the HD-AAC standard.
IEEE Transactions on Audio, Speech, and Language Processing | 2013
Emmanuel Ravelli; Vinay Melkote; Tejaswi Nanjundaswamy; Kenneth Rose
Current scalable audio coders typically optimize performance at a particular layer without regard to the impact on other layers, and are thus unable to provide a performance trade-off between different layers. In the particular case of MPEG Scalable Advanced Audio Coding (S-AAC) and Scalable-to-Lossless (SLS) coding, the base layer is optimized first, followed by successive optimization of higher layers, which ensures optimality of the base layer but results in a scalability penalty that progressively increases with the enhancement layer index. The ability to trade off performance between different layers enables alignment with the real-world requirement for audio quality commensurate with the bandwidth afforded by a user. This work provides the means to better control the performance trade-offs, and the distribution of the scalability penalty, between the base and enhancement layers. Specifically, it proposes an efficient joint optimization algorithm that selects the encoding parameters for each layer while accounting for the rate-distortion costs in all layers. The efficacy of the technique is demonstrated in the two distinct settings of S-AAC and SLS High Definition Advanced Audio Coding. Objective and subjective tests provide evidence for substantial gains, and represent a significant step toward bridging the gap with the non-scalable coder.
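The benefit of joint over sequential layer design can be seen in a toy sketch: when the base-layer parameter choice also affects the enhancement layer's cost, optimizing the base layer in isolation can be globally suboptimal. The cost tables, parameter names, and values below are invented purely for illustration, not taken from the paper:

```python
# Toy two-layer setting. The base-layer parameter ('a' or 'b') affects both
# its own rate-distortion cost and, through the prediction it provides, the
# enhancement layer's cost. All numbers are illustrative.
base_cost = {"a": 10, "b": 11}                # base-layer RD cost per choice
enh_cost = {("a", "x"): 50, ("a", "y"): 52,   # enhancement-layer RD cost,
            ("b", "x"): 10, ("b", "y"): 20}   # conditioned on the base choice

def greedy_cost():
    """Sequential design: optimize the base layer first, then the enhancement."""
    p1 = min(base_cost, key=base_cost.get)
    p2 = min("xy", key=lambda q: enh_cost[(p1, q)])
    return base_cost[p1] + enh_cost[(p1, p2)]

def joint_cost():
    """Joint design: search all parameter combinations for the total cost."""
    return min(base_cost[p1] + enh_cost[(p1, p2)]
               for p1 in base_cost for p2 in "xy")
```

Here the greedy design locks in the cheapest base layer and pays heavily at the enhancement layer, while the joint search accepts a slightly worse base layer for a much lower total cost.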
information theory workshop | 2012
Kumar Viswanatha; Emrah Akyol; Tejaswi Nanjundaswamy; Kenneth Rose
This paper focuses on a new framework for scalable coding of information based on principles derived from the common information of two dependent random variables. In the conventional successive refinement setting, the encoder generates two layers of information called the base layer and the enhancement layer. The first decoder, which receives only the base layer, produces a coarse reconstruction of the source, whereas the second decoder, which receives both layers, uses the enhancement layer to refine the information further, leading to a finer reconstruction. It is well known that asymptotic rate-distortion optimality at both decoders is possible if and only if the source-distortion pair is successively refinable. However, when the source is not successively refinable under the given distortion metric, it is impossible to achieve rate-distortion optimality at both layers simultaneously. For this reason, most practical system designers resort to storing two individual representations of the source, leading to significant overhead in transmission/storage costs. Inspired by the breadth of applications, in this paper, we propose a new framework for scalable coding wherein a subset of the bits sent to the first decoder is not sent to the second decoder. That is, the encoder generates one common bitstream that is routed to both decoders, but, unlike the conventional successive refinement setting, each decoder also receives an additional individual bitstream. By relating the proposed framework to the problem of common information of two dependent random variables, we derive a single-letter characterization of the minimum sum rate achievable for the proposed setting when the two decoders are constrained to receive information at their respective rate-distortion functions. We show using a simple example that the proposed framework provides a strictly better asymptotic sum rate than the conventional scalable coding setup when the source-distortion pair is not successively refinable.
workshop on applications of signal processing to audio and acoustics | 2011
Tejaswi Nanjundaswamy; Kenneth Rose
The long term prediction (LTP) tool is used in audio compression systems to exploit periodicity in signals. This tool capitalizes on the periodic component of the waveform by selecting a past segment as the basis for prediction of the current frame. However, most audio signals are polyphonic in nature, consisting of a mixture of periodic signals. This renders LTP suboptimal, as the mixture's period equals the least common multiple of its individual component periods, which typically extends far beyond the duration over which the signal is stationary. Instead of seeking a past segment that represents a “compromise” for incompatible component periods, we propose a more complex filter that caters to the individual signal components. The proposed technique predicts every periodic component of the signal from its immediate history, achieved by cascading LTP filters, each corresponding to an individual periodic component. We also propose a recursive “divide and conquer” technique to estimate the parameters of all the LTP filters. For a real-world evaluation, we employ this technique within the Bluetooth Sub-band Codec. Considerable gains achieved on a variety of polyphonic signals demonstrate the effectiveness of the proposal.
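A minimal numpy sketch of the cascaded-LTP idea follows, with plain squared-error estimation standing in for the paper's codec-specific parameter search; the lag range and stage count are illustrative assumptions:

```python
import numpy as np

def estimate_ltp(x, min_lag, max_lag):
    """Pick the one-tap LTP lag and gain minimizing residual energy.

    For each candidate lag N the least-squares gain is
    <x[n], x[n-N]> / <x[n-N], x[n-N]>.
    """
    best_lag, best_gain, best_energy = min_lag, 0.0, float(np.dot(x, x))
    for lag in range(min_lag, max_lag + 1):
        past, cur = x[:-lag], x[lag:]
        den = float(np.dot(past, past))
        if den == 0.0:
            continue
        num = float(np.dot(cur, past))
        gain = num / den
        energy = float(np.dot(cur, cur)) - gain * num
        if energy < best_energy:
            best_lag, best_gain, best_energy = lag, gain, energy
    return best_lag, best_gain

def ltp_residual(x, lag, gain):
    """e[n] = x[n] - gain * x[n-lag]; the first `lag` samples pass through."""
    e = x.copy()
    e[lag:] -= gain * x[:-lag]
    return e

def cascaded_ltp(x, min_lag, max_lag, stages=2):
    """'Divide and conquer' sketch: fit one LTP filter, then fit the next
    stage on its residual, so each stage targets one periodic component."""
    filters, residual = [], x
    for _ in range(stages):
        lag, gain = estimate_ltp(residual, min_lag, max_lag)
        filters.append((lag, gain))
        residual = ltp_residual(residual, lag, gain)
    return filters, residual
```

On a mixture of two sinusoids with incommensurate periods, a second cascade stage removes the component the first stage had to compromise on, shrinking the residual well below what a single LTP filter achieves.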
international conference on image processing | 2015
Shunyao Li; Tejaswi Nanjundaswamy; Yue Chen; Kenneth Rose
Current video coders exploit temporal dependencies via prediction that consists of motion-compensated pixel copying operations. Such per-pixel temporal prediction ignores important underlying spatial correlations, as well as considerable variations in temporal correlation across frequency components. In the transform domain, however, spatial decorrelation is first achieved, allowing for the true temporal correlation at each frequency to emerge and be properly accounted for, with particular impact at high frequencies, whose lower correlation is otherwise masked by the dominant low frequencies. This paper focuses on effective design of transform domain temporal prediction that: i) fully accounts for the effects of sub-pixel interpolation filters, and ii) circumvents the challenge of catastrophic design instability due to quantization error propagation through the prediction loop. We design predictors conditioned on frequency and sub-pixel position, employing an iterative open-loop (hence stable) design procedure that, on convergence, approximates closed-loop operation. Experimental results validate the effectiveness of both the asymptotic closed-loop design procedure and the transform-domain temporal prediction paradigm, with significant and consistent performance gains over the standard.
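As a minimal illustration of why temporal prediction benefits from operating per frequency, the sketch below fits one least-squares prediction gain per transform coefficient from training pairs of reference and current blocks. It omits the paper's sub-pixel conditioning and asymptotic closed-loop design, and the data layout is an assumption for illustration:

```python
import numpy as np

def fit_per_frequency_gains(ref_coeffs, cur_coeffs):
    """Least-squares scalar predictor per transform coefficient.

    ref_coeffs, cur_coeffs: arrays of shape (num_blocks, num_freqs) holding
    transform coefficients of the motion-compensated reference and of the
    current block. Returns one prediction gain per frequency.
    """
    num = np.sum(ref_coeffs * cur_coeffs, axis=0)
    den = np.sum(ref_coeffs * ref_coeffs, axis=0)
    # Guard against all-zero frequencies; predict 0 there.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def predict(ref_coeffs, gains):
    """Scale each reference coefficient by its per-frequency gain."""
    return ref_coeffs * gains
```

When the true temporal correlation decays with frequency, the fitted gains fall below 1 at high frequencies, and scaling beats plain pixel copying (gain 1 everywhere) in residual energy.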
international conference on acoustics, speech, and signal processing | 2012
Tejaswi Nanjundaswamy; Kenneth Rose
This paper proposes a frame loss concealment technique for audio signals, designed to overcome the main challenge posed by the polyphonic nature of most music signals and inspired by our recent research on compression of such signals. The underlying idea is to employ a cascade of long term prediction filters (tailored to the periodic components) to circumvent the pitfalls of naive waveform repetition, and to enable effective time-domain prediction of every periodic component from the immediate history. In the first phase, a cascaded filter is designed from available past samples and is used to predict across the lost frame(s). Available future reconstructed samples allow refinement of the filter parameters to minimize the squared prediction error across such samples. In the second phase, a prediction is similarly performed in reverse from future samples. Finally, the lost frame is interpolated as a weighted average of the forward- and backward-predicted samples. Objective and subjective evaluation results for the proposed approach, in comparison with existing techniques, all incorporated within an MPEG AAC low delay decoder, provide strong evidence for considerable gains across a variety of polyphonic signals.
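The final interpolation step can be sketched as a simple cross-fade of the forward- and backward-predicted versions of the lost frame; the linear weighting here is an illustrative choice, not necessarily the paper's:

```python
import numpy as np

def conceal_frame(forward_pred, backward_pred):
    """Cross-fade the two predictions of the lost frame.

    Trusts the forward prediction (from past samples) near the frame start
    and the backward prediction (from future samples) near the frame end.
    """
    n = len(forward_pred)
    w = np.linspace(1.0, 0.0, n)  # weight on the forward prediction
    return w * forward_pred + (1.0 - w) * backward_pred
```

The fade keeps the concealed frame continuous with the reconstructed samples on both sides, since each boundary is dominated by the prediction anchored there.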
international conference on acoustics, speech, and signal processing | 2014
Tejaswi Nanjundaswamy; Kumar Viswanatha; Kenneth Rose
Scalable coders generate hierarchically layered bitstreams to serve content at different quality levels, wherein the base layer provides a coarse quality reconstruction and successive layers incrementally refine the quality. However, it is widely recognized that there is an inherent performance penalty due to the scalable coding structure, when compared to independently encoded copies. To mitigate this loss we propose a layered compression framework, with roots in information-theoretic concepts, which relaxes the strict hierarchical constraints: only a “subset” of the information of a lower quality level is shared with higher quality levels. In other words, there is flexibility to also have “private” information at each quality level, besides information that is common to multiple levels. We employ this framework within MPEG Scalable AAC and propose an optimization scheme to jointly select the parameters of all the layers. Experimental evaluation results demonstrate the utility of the flexibility provided by the proposed framework.
IEEE Transactions on Audio, Speech, and Language Processing | 2014
Tejaswi Nanjundaswamy; Kenneth Rose
Audio compression systems exploit periodicity in signals to remove inter-frame redundancies via the long term prediction (LTP) tool. This simple tool capitalizes on the periodic component of the waveform by selecting a past segment as the basis for prediction of the current frame. However, most audio signals are polyphonic in nature, containing a mixture of several periodic components. While such polyphonic signals may themselves be periodic with an overall period equaling the least common multiple of the individual component periods, the signal rarely remains sufficiently stationary over the extended period, rendering the LTP tool suboptimal. Instead of seeking a past segment that represents a “compromise” for incompatible component periods, we propose a more complex filter that predicts every periodic component of the signal from its immediate history, achieved by cascading LTP filters, each corresponding to an individual periodic component. We also propose a recursive “divide and conquer” technique to estimate the parameters of all the LTP filters. We then demonstrate the effectiveness of cascaded LTP in two distinct settings: the ultra-low-delay Bluetooth Subband Codec and the MPEG Advanced Audio Coding (AAC) standard. In MPEG AAC, we specifically adapt the cascaded LTP parameter estimation to take into account the perceptual distortion criteria, and also propose a low decoder complexity variant. Objective and subjective results for all the settings validate the effectiveness of the proposal on a variety of polyphonic signals.