

Publications


Featured research published by Lalitha Agnihotri.


Proceedings IEEE Workshop on Content-Based Access of Image and Video Libraries (CBAIVL'99) | 1999

Text detection for video analysis

Lalitha Agnihotri; Nevenka Dimitrova

Textual information provides important semantic cues for video content analysis. We describe a method for the detection and representation of text in video segments. The method consists of seven steps: channel separation, image enhancement, edge detection, edge filtering, character detection, text box detection, and text line detection. Our results show that this method can be applied to English as well as non-English text (such as Korean), with precision and recall of 85%.
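The seven-step pipeline named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, thresholds, and the simplified gradient-based edge detector are invented for illustration, and the final character/box/line grouping steps are only stubbed as a row-level heuristic.

```python
import numpy as np

def detect_text_regions(frame: np.ndarray, edge_thresh: float = 0.5):
    """Hypothetical sketch of the seven-step text-detection pipeline.
    `frame` is an RGB image as a float array with values in [0, 1]."""
    # 1. Channel separation: text contrast is strongest in luminance.
    luma = 0.299 * frame[..., 0] + 0.587 * frame[..., 1] + 0.114 * frame[..., 2]
    # 2. Image enhancement: a simple contrast stretch.
    lo, hi = luma.min(), luma.max()
    enhanced = (luma - lo) / (hi - lo + 1e-9)
    # 3. Edge detection: horizontal and vertical finite differences.
    gx = np.abs(np.diff(enhanced, axis=1, prepend=enhanced[:, :1]))
    gy = np.abs(np.diff(enhanced, axis=0, prepend=enhanced[:1, :]))
    edges = np.maximum(gx, gy)
    # 4. Edge filtering: keep only strong edges.
    mask = edges > edge_thresh * edges.max()
    # 5-7. Character, text box, and text line detection would group connected
    # edge pixels into boxes and merge horizontally aligned boxes into lines;
    # here we only return rows containing enough edge pixels as candidates.
    row_counts = mask.sum(axis=1)
    candidate_rows = np.where(row_counts > mask.shape[1] * 0.2)[0]
    return mask, candidate_rows
```

On a synthetic frame with a bright horizontal band, the candidate rows mark the band's top and bottom edges, which is where superimposed text transitions would appear.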


international conference on image processing | 2001

Integrated multimedia processing for topic segmentation and classification

Radu S. Jasinschi; Nevenka Dimitrova; Thomas McGee; Lalitha Agnihotri; John Zimmerman; Dongge Li

We describe integrated multimedia processing for Video Scout, a system that segments and indexes TV programs according to their audio, visual, and transcript information. Video Scout represents a future direction for personal video recorders. In addition to using electronic program guide metadata and a user profile, Scout allows users to request specific topics within a program. For example, users can request a video clip of the U.S. president speaking from a half-hour news program. Video Scout has three modules: (i) video pre-processing, (ii) segmentation and indexing, and (iii) storage and user interface. Segmentation and indexing, the core of the system, incorporates a Bayesian framework that integrates information from the audio, visual, and transcript (closed captions) domains. This framework uses three layers to process low-, mid-, and high-level multimedia information. The high-level layer generates semantic information about TV program topics. This paper describes the elements of the system and presents results from running Video Scout on real TV programs.


international conference on multimedia and expo | 2000

TV program classification based on face and text processing

Gang Wei; Lalitha Agnihotri; Nevenka Dimitrova

In this paper we describe a system that classifies TV programs into predefined categories based on an analysis of their video content. This is useful for intelligent display and storage systems that can select channels and record or skip content according to the consumer's preferences. Distinguishable patterns exist across different categories of TV programs in terms of human faces and superimposed text. By applying face and text tracking to a number of training video segments, including commercials, news, sitcoms, and soaps, we have identified patterns within each category of TV programs in a predefined feature space that reflects the face and text characteristics of the video. A given video segment is projected into the feature space and compared against the distributions of known categories of TV programs. Domain knowledge is used to aid the classification. Encouraging results have been achieved in our initial experiments.
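The idea of projecting a segment into a face/text feature space and comparing it against per-category distributions can be sketched as follows. This is an assumption, not the paper's method: the per-category statistics, the diagonal-Gaussian model, and all names are invented for illustration.

```python
import math

def classify_program(features, category_stats):
    """Assign a segment's feature vector (e.g., face coverage, text density)
    to the category with the highest diagonal-Gaussian log-likelihood.
    `category_stats` maps category -> (per-dimension means, per-dimension stds)."""
    best, best_ll = None, -math.inf
    for cat, (means, stds) in category_stats.items():
        ll = 0.0
        for x, m, s in zip(features, means, stds):
            # Log-density of a 1-D Gaussian, dropping the constant term.
            ll += -0.5 * ((x - m) / s) ** 2 - math.log(s)
        if ll > best_ll:
            best, best_ll = cat, ll
    return best
```

A segment whose features sit close to the news category's means is assigned to news, even without an explicit distance threshold.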


computer vision and pattern recognition | 2003

Evolvable visual commercial detector

Lalitha Agnihotri; Nevenka Dimitrova; Thomas McGee; Sylvie Jeannin; J. David Schaffer; Jan Alexis Daniel Nesvadba

Commercial detection plays an important role in various video segmentation and indexing applications. It provides high-level program segmentation so that other algorithms can be applied to the true program material in the broadcast. It is a challenge to have a robust commercial detection methodology for the various platforms, content formats, and broadcast styles used all over the world. Wide deployment of such an algorithm requires not only the development of new algorithms but also the updating and tuning of parameters for existing algorithms. We present visual commercial detectors that rely on features including luminance, letterbox, and keyframe distance. These detectors were developed after a careful study of the various features that can be extracted in real time during the MPEG encoding process. Due to the intermittent nature of the features and to platform restrictions, the commercial detection relies on a set of thresholds to keep the implementation as simple as possible. We evolved these thresholds using genetic algorithms (GAs) to optimize the performance. We show how a scalar genetic algorithm can locate sets of parameters in a multi-objective space (precision and recall) that outperform the values selected by an expert engineer. We present the results of optimizing a commercial detection algorithm for different data sets and parameter sets. In this paper we show that GAs drastically improved our approach and enabled fast prototyping and performance tuning of commercial detection algorithms.
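Evolving detector thresholds with a scalar GA, as the abstract describes, can be sketched minimally. The selection, crossover, and mutation operators below are generic textbook choices, not the authors' exact operators, and the scalar fitness stands in for any combination of precision and recall (e.g., F1).

```python
import random

def evolve_thresholds(fitness, n_params=3, pop_size=20, generations=40, seed=0):
    """Evolve a vector of detector thresholds in [0, 1] to maximize a scalar
    fitness. A minimal GA sketch: truncation selection, one-point crossover,
    and clamped Gaussian mutation (all assumed operators)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]          # keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params)       # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.3:                 # occasional mutation
                i = rng.randrange(n_params)
                child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

In practice the fitness would be measured by running the commercial detector with the candidate thresholds over labeled broadcast data; here any callable scoring a parameter vector will do.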


Internet multimedia management systems. Conference | 2003

Video summarization: Methods and landscape

Mauro Barbieri; Lalitha Agnihotri; Nevenka Dimitrova

The ability to summarize and abstract information will be an essential part of intelligent behavior in consumer devices. Various summarization methods have been the topic of intensive research in the content-based video analysis community. Summarization in traditional information retrieval is a well-understood problem. While there has been a lot of research in the multimedia community, there is no agreed-upon terminology and classification of the problems in this domain. Although the problem has been researched from different aspects, there is usually no distinction between the various dimensions of summarization. The goal of this paper is to provide basic definitions of widely used terms such as skimming, summarization, and highlighting. The different levels of summarization (local, global, and meta-level) are made explicit. We distinguish among the dimensions of task, content, and method, and provide an extensive classification model along these dimensions. We map the existing summary extraction approaches in the literature into this model and classify the aspects of the systems proposed in the literature. In addition, we outline evaluation methods and provide a brief survey. Finally, we propose future research directions based on the gaps we identified through our analysis of existing systems.


Proceedings of the IEEE | 2008

“You Tube and I Find”—Personalizing Multimedia Content Access

Svetha Venkatesh; Brett Adams; Dinh Q. Phung; Chitra Dorai; Robert G. Farrell; Lalitha Agnihotri; Nevenka Dimitrova

Recent growth in broadband access and the proliferation of small personal devices that capture images and videos have led to explosive growth of multimedia content available everywhere - from personal disks to the Web. While digital media capture and upload have become nearly universal with newer device technology, there is still a need for better tools and technologies to search large collections of multimedia data and to find and deliver the right content to a user according to her current needs and preferences. A renewed focus on the subjective dimension in the multimedia lifecycle, from creation and distribution to delivery and consumption, is required to address this need beyond what is feasible today. Integration of the subjective aspects of the media itself - its affective, perceptual, and physiological potential (both intended and achieved) - together with those of the users themselves will allow for personalizing content access beyond today's facility. This integration, transforming the traditional multimedia information retrieval (MIR) indexes to more effectively answer specific user needs, will allow a richer degree of personalization predicated on user intention and mode of interaction, relationship to the producer, content of the media, and the user's history and lifestyle. In this paper, we identify the challenges in achieving this integration and review current approaches to interpreting content creation processes, to user modelling and profiling, and to personalized content selection, and we detail future directions. The structure of the paper is as follows: In Section I, we introduce the problem and present some definitions. In Section II, we present a review of the aspects of personalized content and current approaches to it. Section III discusses the problem of obtaining the metadata required for personalized media creation and presents eMediate as a case study of an integrated media capture environment.
Section IV presents the MAGIC system as a case study of capturing effective descriptive data and putting users first in distributed learning delivery. Modelling the user is covered in Section V, with a case study on using a user's personality to personalize summaries. Finally, Section VI concludes the paper with a discussion of the emerging challenges and open problems.


international conference on multimedia and expo | 2004

Design and evaluation of a music video summarization system

Lalitha Agnihotri; Nevenka Dimitrova; John R. Kender

We present a system that summarizes the textual, audio, and video information of music videos in a format tuned to the preferences of a focus group of 20 users. First, we analyzed user needs for the content and layout of the music summaries. Then, we designed algorithms that segment individual song videos from full music video programs by noting changes in color palette, transcript, and audio classification. We summarize each song with automatically selected high-level information such as title, artist, duration, title frame, and text, as well as audio and visual segments of the chorus. Our system automatically determines chorus locations, with high recall and precision, from the placement of repeated words and phrases in the text of the song's lyrics. Our Bayesian belief network then selects other significant video and audio content from the multiple media. Overall, we are able to compress content by a factor of 10. Our second user study identified the principal variations between users in their choices of content desired in the summary and in their choices of the platforms that should support their viewing.
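The repetition heuristic for locating the chorus can be sketched in a few lines. This is an invented simplification of the idea in the abstract, not the paper's algorithm: it approximates the chorus by the most frequently repeated n-word phrase in the lyrics and reports its word-offset positions.

```python
from collections import Counter

def locate_chorus(lyrics: str, n: int = 4):
    """Find the most repeated n-word phrase in the lyrics and the word
    offsets where it occurs; those offsets mark candidate chorus locations."""
    words = lyrics.lower().split()
    # Slide an n-word window over the lyrics and count each phrase.
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    phrase, count = Counter(grams).most_common(1)[0]
    positions = [i for i, g in enumerate(grams) if g == phrase]
    return " ".join(phrase), positions
```

In the full system the word offsets would be mapped back to timestamps via the transcript, so the matching audio and video segments of the chorus can be extracted.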


Storage and Retrieval for Image and Video Databases | 1999

Selective video content analysis and filtering

Nevenka Dimitrova; Thomas McGee; Lalitha Agnihotri; Serhan Dagtas; Radu S. Jasinschi

Consumer digital video devices are becoming computing platforms. As computing platforms, digital video devices are capable of crunching the compressed bits into the best displayable picture and delivering enhanced services. Although these devices will primarily continue their traditional functions of display and storage, there are additional functions, such as content management for real-time and stored video, tele-shopping, banking, Internet connectivity, and interactive services, which the device could also handle.


acm multimedia | 2001

Personalizing video recorders using multimedia processing and integration

Nevenka Dimitrova; Radu S. Jasinschi; Lalitha Agnihotri; John Zimmerman; Thomas McGee; Dongge Li

Current personal video recorders make it very easy for consumers to record whole TV programs. Our research, however, focuses on personalizing TV at a sub-program level. We use a traditional Content-Based Information Retrieval system architecture consisting of archiving and retrieval modules. The archiving module employs a three-layered, multimodal integration framework to segment, analyze, characterize, and classify segments. The retrieval module relies on users' personal preferences to deliver both full programs and video segments of interest. We tested retrieval concepts with real users and discovered that they see more value in segmenting non-narrative programs (e.g., news) than narrative programs (e.g., movies). We benchmarked individual algorithms and segment classification for celebrity and financial segments as instances of non-narrative content. For celebrity segments we obtained a total precision of 94.1% and recall of 85.7%, and for financial segments a total precision of 81.1% and recall of 86.9%.


international conference on acoustics, speech, and signal processing | 2002

A probabilistic layered framework for integrating multimedia content and context information

Radu S. Jasinschi; Nevenka Dimitrova; Thomas McGee; Lalitha Agnihotri; John Zimmerman; Dongge Li; Jennifer Louie

Automatic indexing of large collections of multimedia data is important for enabling retrieval functions. Current approaches mostly draw on a single or dual modality of video content analysis. Here we describe a framework for the integration of multimedia content and context information, which generalizes and systematizes current methods. Content information in the visual, audio, and text domains, is described at different levels of granularity and abstraction. Context describes the underlying structural information that can be used to constrain the possible number of interpretations. We introduce a probabilistic framework that combines (a) Bayesian networks that describe both content and context and (b) hierarchical priors that describe the integration of content and context. We present an application that uses this framework to segment and index TV programs. We discuss experimental results on segment classification on six and a half hours of broadcast video. In our experiments we used audio context information. Classification results for financial segments yield 83% and for celebrity segments 89%.
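The combination of per-modality content evidence with a context prior can be illustrated with a toy fusion step. This is a simplified naive-Bayes-style sketch under invented numbers, not the paper's hierarchical network: each modality contributes a likelihood P(features | topic), the context contributes a prior P(topic), and the product is normalized into a posterior.

```python
def classify_segment(likelihoods, prior):
    """Fuse per-modality likelihoods with a context prior.
    `likelihoods` maps modality name -> {topic: P(features | topic)};
    `prior` maps topic -> P(topic) from context. Returns the posterior."""
    unnorm = {}
    for topic, p in prior.items():
        for modality in likelihoods.values():
            p *= modality[topic]          # conditional independence assumed
        unnorm[topic] = p
    z = sum(unnorm.values())              # normalize into a distribution
    return {t: p / z for t, p in unnorm.items()}
```

For example, if audio evidence favors a financial segment and the transcript weakly agrees, the posterior concentrates on the financial topic even under a uniform context prior.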
