Regunathan Radhakrishnan
Mitsubishi Electric Research Laboratories
Publication
Featured research published by Regunathan Radhakrishnan.
Archive | 2003
Ajay Divakaran; Kadir A. Peker; Regunathan Radhakrishnan; Ziyou Xiong; Romain Cabasson
We present video summarization and indexing techniques using the MPEG-7 motion activity descriptor. The descriptor can be extracted in the compressed domain and is compact, and hence is easy to extract and match. We establish that the intensity of motion activity of a video shot is a direct indication of its summarizability. We describe video summarization techniques based on sampling in the cumulative motion activity space. We then describe combinations of the motion activity based techniques with generalized sound recognition that enable completely automatic generation of news and sports video summaries. Our summarization is computationally simple and flexible, which allows rapid generation of a summary of any desired length.
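The sampling idea above can be sketched in a few lines: place evenly spaced targets along the cumulative motion-activity curve, so that high-activity stretches contribute more key frames than static ones. This is a minimal illustration, not the paper's implementation; the per-frame activity values and the key-frame budget are assumed inputs.

```python
import numpy as np

def summarize_by_motion_activity(activity, n_keyframes):
    """Pick key-frame indices by sampling uniformly in cumulative
    motion-activity space: high-activity segments get more frames."""
    cum = np.cumsum(activity, dtype=float)
    # evenly spaced target levels over the total accumulated activity
    targets = np.linspace(cum[0], cum[-1], n_keyframes)
    # first frame whose cumulative activity reaches each target
    return np.searchsorted(cum, targets).tolist()
```

Because the targets are spaced in activity (not time), a longer summary simply means more targets, which is what makes a summary of any desired length cheap to produce.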
international conference on consumer electronics | 2006
Isao Otsuka; Regunathan Radhakrishnan; Michael Siracusa; Ajay Divakaran; Hidetoshi Mishima
We extend our sports video browsing framework for personal video recorders, such as DVD recorders, Blu-ray disc recorders, and hard disc recorders, to other genres. We reduce the computational complexity by reducing the number of audio classes to a small set that is useful for both sports video and music video, and by reducing the complexity of the Gaussian mixture models. Our extension to music video content consists of detecting music/song periods while compensating for false alarms. Our results indicate that our enhanced audio-only summarization maintains the sports video performance and works well with music video content. We can therefore integrate the enhancement into our product while in fact reducing the computational complexity.
IEEE Transactions on Circuits and Systems for Video Technology | 2002
Regunathan Radhakrishnan; Nasir D. Memon
We investigate the image authentication system SARI, proposed by Lin and Chang (see ibid., vol.11, p.153-68, Feb. 2001), that distinguishes JPEG compression from malicious manipulations. In particular, we look at the image digest component of this system. We show that if multiple images have been authenticated with the same secret key and the digests of these images are known to an attacker, Oscar, then he can cause arbitrary images to be authenticated with this same but unknown key. We show that the number of such images needed by Oscar to launch a successful attack is quite small, making the attack very practical. We then suggest possible solutions to enhance the security of this authentication system.
conference on image and video communications and processing | 2005
Regunathan Radhakrishnan; Ajay Divakaran
We present a systematic framework for arriving at audio classes for the detection of crimes in elevators. We use a time series analysis framework on low-level features extracted from elevator surveillance audio to perform an inlier/outlier based temporal segmentation. Since suspicious events in elevators are outliers against a background of usual events, such a segmentation helps bring out these events without any a priori knowledge. Then, by automatically clustering the detected outliers, we identify consistent patterns for which we can train supervised detectors. We apply the proposed framework to a collection of elevator surveillance audio data to systematically acquire audio classes such as banging, footsteps, non-neutral speech, and normal speech. Based on the observation that the banging and non-neutral speech classes are indicative of suspicious events in the elevator data set, we are able to detect all of the suspicious activities without any misses.
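The inlier/outlier segmentation can be illustrated crudely: model the background from global feature statistics, then flag time windows whose statistics deviate strongly. This is a simplified stand-in, assuming a scalar feature and a z-score test; the paper's framework operates on multivariate low-level audio features.

```python
import numpy as np

def outlier_segments(features, window=8, z_thresh=3.0):
    """Flag windows whose mean deviates strongly from the global
    (background) statistics: a crude inlier/outlier segmentation."""
    features = np.asarray(features, dtype=float)
    mu, sigma = features.mean(), features.std() + 1e-9
    flags = []
    for start in range(0, len(features) - window + 1, window):
        win = features[start:start + window]
        z = abs(win.mean() - mu) / sigma
        flags.append((start, start + window, bool(z > z_thresh)))
    return flags
```

The key property is that no event model is needed up front: anything rare relative to the background surfaces as an outlier, and only then is it clustered and named.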
electronic imaging | 2003
King-Shy Goh; Koji Miyahara; Regunathan Radhakrishnan; Ziyou Xiong; Ajay Divakaran
Removing commercials from television programs is a much sought-after feature for a personal video recorder. In this paper, we employ an unsupervised clustering scheme (CM_Detect) to detect commercials in television programs. Each program is first divided into fixed-length chunks, and we extract audio and visual features from each chunk. Next, we apply k-means clustering to assign each chunk a commercial/program label. In contrast to other methods, we do not make any assumptions regarding the program content. Thus, our method is highly content-adaptive and computationally inexpensive. Through empirical studies on various content, including American news, Japanese news, and sports programs, we demonstrate that our method is able to filter out most of the commercials without falsely removing the regular program.
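The unsupervised labelling step can be sketched as a tiny 2-means over per-chunk feature vectors. Everything here is illustrative: the features are assumed inputs, the deterministic seeding is for reproducibility, and treating the minority cluster as "commercial" is a stand-in heuristic, not the decision rule from CM_Detect.

```python
import numpy as np

def kmeans_2(x, iters=20):
    """Tiny 2-means: returns a 0/1 cluster label per row. Centers are
    seeded at the extremes of the first feature for determinism."""
    x = np.asarray(x, dtype=float)
    centers = x[[x[:, 0].argmin(), x[:, 0].argmax()]].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        # distance of every chunk to both centers, then reassign
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    return labels

def commercial_chunks(features):
    """Mark the minority cluster as 'commercial' (illustrative only)."""
    labels = kmeans_2(features)
    minority = 0 if (labels == 0).sum() < (labels == 1).sum() else 1
    return labels == minority
```

Because the clustering is re-run per program, no genre- or broadcaster-specific model is needed, which is the source of the method's content-adaptivity.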
electronic imaging | 2002
Regunathan Radhakrishnan; Nasir D. Memon
The goal of audio content authentication techniques is to separate malicious manipulations from legitimate signal processing operations such as compression and filtering. The key difference between the two is that the latter tend to preserve the perceptual content of the underlying audio signal. Hence, in order to separate malicious operations from allowed ones, a content authentication procedure should be based on a model that approximates human perception of audio. In this paper, we propose an audio content authentication technique based on a feature that remains invariant across perceptually similar versions of the audio: the masking curve. We also evaluate the performance of this technique by embedding a hash based on the masking curve into the audio signal using an existing transparent and robust data hiding technique. At the receiver, the same content-based hash is extracted from the audio and compared with the recalculated hash bits. Correlation between calculated and extracted hash bits degrades gracefully with the perceived quality of the received audio, which implies that the authentication threshold can be adapted to the required level of perceptual quality at the receiver. Experimental results show that this content-based hash is able to differentiate allowed signal processing operations such as MP3 compression from malicious operations that modify the perceptual content of the audio.
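The verification step can be sketched as follows: derive bits from the shape of a (precomputed) masking curve, then score agreement between extracted and recalculated bits. The bit-derivation rule here (adjacent-sample rise/fall) is a hypothetical simplification, not the paper's construction; the point is only that bit agreement degrades gradually, so the threshold is tunable.

```python
import numpy as np

def hash_bits(masking_curve, n_bits=32):
    """Derive hash bits from a masking curve by comparing adjacent
    resampled points: bit i is 1 where the curve rises."""
    c = np.asarray(masking_curve, dtype=float)
    idx = np.linspace(0, len(c) - 1, n_bits + 1).astype(int)
    return (np.diff(c[idx]) > 0).astype(int)

def similarity(bits_a, bits_b):
    """Fraction of matching bits; compared against a quality-dependent
    authentication threshold at the receiver."""
    a, b = np.asarray(bits_a), np.asarray(bits_b)
    return float((a == b).mean())
```

A perceptually transparent operation like MP3 compression perturbs the masking curve only slightly, so most bits survive; a content-changing edit flips many bits and drops the score below threshold.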
electronic imaging | 2003
Regunathan Radhakrishnan; Ziyou Xiong; Nasir D. Memon
Robust hash functions are central to the security of multimedia content authentication systems. Such functions are sensitive to a key but robust to many allowed signal processing operations on the underlying content. Robustness of the hash function to changes in the original content implies the existence of a cluster in the feature space around the original content's feature vector; any point within this cluster is hashed to the same output. The shape and size of the cluster determine the trade-off between the robustness offered and the security of the authentication system based on the robust hash function. The clustering itself is based on a secret key and hence unknown to the attacker. However, we show in this paper that the specific clustering arrived at by a robust hash function may be possible to learn. Specifically, we look at a well-known robust hash function for image data called the Visual Hash Function (VHF). Given just an input and its hash value, we show how to construct a statistical model of the hash function, without any knowledge of the secret key used to compute the hash. We also show how to use this model to engineer arbitrary and malicious collisions. Finally, we propose one possible modification to VHF so that constructing a model that mimics its behavior becomes difficult.
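The core vulnerability can be illustrated with a toy version of the attack: if the keyed hash is (approximately) a linear projection, an attacker who observes enough (input, hash) pairs can fit the projection by least squares, with no knowledge of the key. This sketch assumes real-valued hash outputs; VHF additionally quantizes, which the paper's statistical model accounts for.

```python
import numpy as np

def fit_linear_hash_model(inputs, outputs):
    """Least-squares estimate of a secret projection W from observed
    (input, hash) pairs, so that hash(x) ~= x @ W."""
    X = np.asarray(inputs, dtype=float)   # n_samples x n_features
    Y = np.asarray(outputs, dtype=float)  # n_samples x n_hash_dims
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W
```

Once the attacker has a faithful model of the hash, engineering a collision reduces to searching for a second input that the model maps to the target hash value.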
Archive | 2005
Ziyou Xiong; Regunathan Radhakrishnan; Ajay Divakaran; Thomas S. Huang
We summarize our recent work on “highlight” events detection and recognition in sports video. We have developed two different joint audio-visual fusion frameworks for this task, namely “audio-visual coupled hidden Markov model” and “audio classification then visual hidden Markov model verification”. Our comparative study of these two frameworks shows that the second approach outperforms the first approach by a large margin. Our study also suggests the importance of modeling the so-called middle-level features such as audience reactions and camera patterns in sports video.
Storage and Retrieval for Image and Video Databases | 2003
Ajay Divakaran; Regunathan Radhakrishnan; Ziyou Xiong; Michael A. Casey
Casey describes a generalized sound recognition framework based on reduced-rank spectra and minimum-entropy priors. This approach enables successful recognition of a wide variety of sounds such as male speech, female speech, music, and animal sounds. In this work, we apply this recognition framework to news video to enable quick video browsing. We identify speaker change positions in the broadcast news using the sound recognition framework. We combine the speaker change positions with color and motion cues from the video and are able to locate the beginning of each of the topics covered by the news video. We can thus skim the video by merely playing a small portion starting from each of the locations where one of the principal cast begins to speak. In combination with our motion-based video browsing approach, our technique provides simple automatic news video browsing. While similar work has been done before, our approach is simpler and faster than competing techniques, and provides a rich framework for further analysis and description of content.
Archive | 2009
Paris Smaragdis; Regunathan Radhakrishnan; Kevin W. Wilson
Much multimedia content comes with a soundtrack that content analysis applications often leave unexploited. In this chapter we cover some of the essential techniques for performing context analysis on audio signals. We describe the most popular approaches to representing audio signals, learning their structure, and constructing classifiers that recognize specific sounds, as well as algorithms for locating where sounds are coming from. Used in the context of content analysis, these tools provide powerful descriptors that can help us find events which would otherwise be hard to locate.