Publications


Featured research published by Savitha Srinivasan.


Communications of the ACM | 2006

Service systems, service scientists, SSME, and innovation

Paul P. Maglio; Savitha Srinivasan; Jeffrey Thomas Kreulen; Jim Spohrer

Computer scientists work with formal models of algorithms and computation, and someday service scientists may work with formal models of service systems. The four examples here document some of the early efforts to establish a new academic discipline and new profession.


International ACM SIGIR Conference on Research and Development in Information Retrieval | 2000

Phonetic confusion matrix based spoken document retrieval

Savitha Srinivasan; Dragutin Petkovic

Combined word-based and phonetic indexes have been used to improve the performance of spoken document retrieval systems, primarily by addressing the out-of-vocabulary retrieval problem. However, a known problem with phonetic recognition is its limited accuracy in comparison with word-level recognition. We propose a novel method for phonetic retrieval in the CueVideo system based on a probabilistic formulation of term weighting using phone confusion data in a Bayesian framework. We evaluate this method of spoken document retrieval against word-based retrieval for the search levels identified in a realistic video-based distributed learning setting. Using our test data, we achieved an average recall of 0.88 with an average precision of 0.69 for retrieval of out-of-vocabulary words on phonetic transcripts with a 35% word error rate. For in-vocabulary words, we achieved a 17% improvement in recall over word-based retrieval with a 17% loss in precision for word error rates ranging from 35% to 65%.
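To make the phone-confusion idea concrete, here is a minimal Python sketch of scoring a query's phone string against a phonetic transcript using a confusion matrix. This is an illustration only, not the paper's exact Bayesian term-weighting formulation; the phone inventory, matrix values, and function names are invented.

```python
import numpy as np

# Hypothetical phone inventory; a real system would use the recognizer's
# full phone set and a confusion matrix estimated from aligned
# recognizer output vs. ground-truth transcripts.
PHONES = ["ae", "b", "k", "t", "s", "ih", "n"]
IDX = {p: i for i, p in enumerate(PHONES)}

# Toy confusion matrix: rows = intended phone, cols = observed phone.
CONF = np.full((len(PHONES), len(PHONES)), 0.02)
np.fill_diagonal(CONF, 1.0 - 0.02 * (len(PHONES) - 1))

def term_score(query_phones, transcript_phones):
    """Score a query against a phonetic transcript: the best alignment
    probability of the query phone string over all window positions."""
    q = [IDX[p] for p in query_phones]
    t = [IDX[p] for p in transcript_phones]
    best = 0.0
    for start in range(len(t) - len(q) + 1):
        window = t[start:start + len(q)]
        p = 1.0
        for intended, observed in zip(q, window):
            p *= CONF[intended, observed]
        best = max(best, p)
    return best

print(term_score(["k", "ae", "t"], ["s", "k", "ae", "t", "s"]))
```

The highest-probability window alignment serves as the term's evidence score; a full system would aggregate such scores across occurrences and documents when ranking.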


ACM Multimedia | 1998

Key to effective video retrieval: effective cataloging and browsing

Dulce B. Ponceleon; Savitha Srinivasan; Arnon Amir; Dragutin Petkovic; Dan Diklic

Multimedia data is an increasingly important information medium today. Providing intelligent access for effective use of this information continues to offer challenges in digital library research. As computer vision, image processing, and speech recognition research continue to progress, we examine the effectiveness of these fully automated techniques in architecting effective video retrieval systems. We present semi-automated techniques that combine manual input with video and speech technology for automatic content characterization, integrated into a single system we call CueVideo. CueVideo integrates voice and manual annotation, attachment of related data, visual content search technologies (QBIC™), and novel multiview storyboard generation to provide a system where the user can incorporate the type of semantic information that automatic techniques would fail to obtain.
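One automated building block behind storyboard generation is shot boundary detection. The sketch below shows a generic histogram-difference approach in Python; it is not CueVideo's actual algorithm, and the bin count and threshold are arbitrary.

```python
import numpy as np

def shot_boundaries(frames, bins=16, threshold=0.4):
    """Detect cuts by thresholding the L1 distance between successive
    gray-level histograms. `frames` is a list of 2-D uint8 arrays."""
    hists = []
    for f in frames:
        h, _ = np.histogram(f, bins=bins, range=(0, 256))
        hists.append(h / h.sum())
    cuts = []
    for i in range(1, len(hists)):
        if np.abs(hists[i] - hists[i - 1]).sum() > threshold:
            cuts.append(i)
    return cuts

# Synthetic demo: five dark frames followed by five bright frames.
frames = [np.zeros((48, 64), np.uint8)] * 5 + \
         [np.full((48, 64), 200, np.uint8)] * 5
print(shot_boundaries(frames))   # -> [5]
```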


Storage and Retrieval for Image and Video Databases | 1997

Updates to the QBIC system

Carlton Wayne Niblack; Xiaoming Zhu; James Lee Hafner; Tom Breuel; Dulce B. Ponceleon; Dragutin Petkovic; Myron Flickner; Eli Upfal; Sigfredo I. Nin; Sanghoon Sull; Byron Dom; Boon-Lock Yeo; Savitha Srinivasan; Dan Zivkovic; Mike Penner

QBIC™ (Query By Image Content) is a set of technologies and associated software that allows a user to search, browse, and retrieve image, graphic, and video data from large on-line collections. This paper discusses current research directions of the QBIC project, such as indexing for high-dimensional multimedia data, retrieval of gray-level images, and storyboard generation suitable for video. It describes aspects of QBIC software, including scripting tools, application interfaces, and available GUIs, and gives examples of applications and demonstration systems using it.
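QBIC is known for comparing images by color histograms under a quadratic-form distance, which credits perceptually similar bins rather than requiring exact bin matches. A minimal sketch follows; the 4-bin similarity matrix and histogram values are invented for illustration.

```python
import numpy as np

def quadratic_form_distance(h1, h2, A):
    """Quadratic-form histogram distance: d = (h1-h2)^T A (h1-h2),
    where A[i, j] encodes perceptual similarity between bins i and j."""
    d = h1 - h2
    return float(d @ A @ d)

# Toy similarity matrix that makes adjacent bins partially
# interchangeable (values are illustrative, not QBIC's).
A = np.array([[1.0, 0.5, 0.0, 0.0],
              [0.5, 1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])
h_query = np.array([0.7, 0.3, 0.0, 0.0])
h_image = np.array([0.0, 0.6, 0.4, 0.0])
print(quadratic_form_distance(h_query, h_image, A))
```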


Hawaii International Conference on System Sciences | 2000

Using audio time scale modification for video browsing

Arnon Amir; Dulce B. Ponceleon; Brian Blanchard; Dragutin Petkovic; Savitha Srinivasan; G. Cohen

In the IBM CueVideo project we study various aspects of fully automated video indexing, browsing and retrieval. The technical aspects include audio processing, speech recognition, image processing and information retrieval. Equally important, however, is exploring user expectations and conducting user studies. We focus on the field of video for Training and Education, including Distributed Learning, Remote Education, and Just-in-Time Learning. This paper describes the use of audio processing technology, namely audio Time Scale Modification (TSM), for the novel application of fast video browsing and efficient video-based learning. The paper provides a brief overview of the CueVideo system, technical background of TSM technology, and the way it is being used in our system. The results of our usability study on the effect of TSM on speech comprehension indicate that TSM is very useful for fast video browsing.
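Time Scale Modification changes playback speed without shifting pitch. A minimal overlap-add (OLA) core is sketched below; systems like the one described here typically use SOLA/WSOLA variants, which add a waveform-similarity search that this sketch omits, and the frame and hop sizes are arbitrary.

```python
import numpy as np

def ola_tsm(x, rate, frame=1024, synth_hop=256):
    """Basic overlap-add time-scale modification: play audio `rate`
    times faster without changing pitch."""
    ana_hop = int(round(synth_hop * rate))      # analysis hop on input
    win = np.hanning(frame)
    n_frames = max(0, (len(x) - frame) // ana_hop + 1)
    out = np.zeros(n_frames * synth_hop + frame)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        seg = x[i * ana_hop : i * ana_hop + frame] * win
        out[i * synth_hop : i * synth_hop + frame] += seg
        norm[i * synth_hop : i * synth_hop + frame] += win
    norm[norm < 1e-8] = 1.0                     # avoid divide-by-zero
    return out / norm

sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 220 * t)   # one second of a 220 Hz tone
y = ola_tsm(x, rate=1.5)          # roughly 2/3 the length, same pitch
print(len(x), len(y))
```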


Conference on Information and Knowledge Management | 2004

Grammar-based task analysis of web logs

Savitha Srinivasan; Arnon Amir; Prasad M. Deshpande; Vladimir Zbarsky

The daily use of Internet-based services involves hundreds of different tasks performed by many users, and a single task typically involves invoking a sequence of Web URLs. We study the problem of pattern detection in Web logs to identify tasks performed by users and to analyze task trends over time using a grammar-based framework. Our results are demonstrated on a corporate intranet portal application with 7,000 users over a six-week period and demonstrate compelling business value from this high-level task analysis.
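As a simplified stand-in for the paper's grammar-based framework, the sketch below counts frequent contiguous URL subsequences across sessionized logs; frequent subsequences are candidate "tasks". The session data, thresholds, and function names are invented.

```python
from collections import Counter

def frequent_tasks(sessions, max_len=4, min_support=3):
    """Count contiguous URL subsequences across user sessions and
    return those occurring at least `min_support` times."""
    counts = Counter()
    for urls in sessions:
        for n in range(2, max_len + 1):
            for i in range(len(urls) - n + 1):
                counts[tuple(urls[i:i + n])] += 1
    return [(seq, c) for seq, c in counts.items() if c >= min_support]

sessions = [
    ["/login", "/search", "/report", "/logout"],
    ["/login", "/search", "/report"],
    ["/login", "/search", "/report", "/export"],
]
for seq, c in sorted(frequent_tasks(sessions), key=lambda t: -t[1]):
    print(c, " -> ".join(seq))
```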


ACM Multimedia | 1999

Towards robust features for classifying audio in the CueVideo system

Savitha Srinivasan; Dragutin Petkovic; Dulce B. Ponceleon

The role of audio in the context of multimedia applications involving video is becoming increasingly important. Many efforts in this area focus on audio data that contains some built-in semantic information structure, such as broadcast news, or focus on classification of audio that contains a single type of sound, such as clear speech or clear music only. In the CueVideo system, we detect and classify audio that consists of mixed audio, i.e., combinations of speech and music together with other types of background sounds. Segmentation of mixed audio has applications in detection of story boundaries in video, spoken document retrieval systems, audio retrieval systems, etc. We modify and combine audio features known to be effective in distinguishing speech from music, and examine their behavior on mixed audio. Our preliminary experimental results show that we can achieve a classification accuracy of over 80% for such mixed audio. Our study also provides us with several helpful insights related to analyzing mixed audio in the context of real applications.
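Two classic features for separating speech from music are the zero-crossing rate and short-time energy: speech alternates voiced and unvoiced frames, so its ZCR typically varies more. A minimal extractor is sketched below; the frame sizes are conventional defaults, and this is illustrative rather than the paper's exact feature set.

```python
import numpy as np

def frame_features(x, sr, frame_ms=25, hop_ms=10):
    """Per-frame zero-crossing rate and log energy for a 1-D signal."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    zcrs, energies = [], []
    for start in range(0, len(x) - frame, hop):
        seg = x[start:start + frame]
        zcrs.append(np.mean(np.abs(np.diff(np.sign(seg))) > 0))
        energies.append(np.log(np.sum(seg ** 2) + 1e-10))
    return np.array(zcrs), np.array(energies)

sr = 16000
noise = np.random.randn(sr)                               # noise-like
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)       # music-like
z_n, _ = frame_features(noise, sr)
z_t, _ = frame_features(tone, sr)
print(z_n.mean(), z_t.mean())   # noise ZCR >> tone ZCR
```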


IEEE Computer | 1999

Design patterns in object-oriented frameworks

Savitha Srinivasan

Developing interactive software systems with complex user interfaces has become increasingly common. Given this trend, it is important that new technology be based on flexible architectures that do not require developers to understand all the complexities inherent in a system. Object-oriented frameworks provide an important enabling technology for reusing both the architecture and the functionality of software components. Frameworks typically have a steep learning curve, since the user must understand the abstract design of the underlying framework as well as the object collaboration rules, or contracts (which are often not apparent in the framework interface), prior to using the framework. The author describes her experience with developing an object-oriented framework for speech recognition applications that use IBM's ViaVoice speech recognition technology. Design patterns help to effectively communicate the internal framework design and reduce dependence on the documentation.
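One pattern commonly used in such frameworks is Observer, which lets application code receive recognition results without knowing engine internals. A minimal Python sketch follows; the class and method names are hypothetical, not the actual framework or ViaVoice API.

```python
from abc import ABC, abstractmethod

class RecognitionListener(ABC):
    """Observer interface: the framework calls back into application
    code, so clients never touch engine internals."""
    @abstractmethod
    def on_result(self, text: str) -> None: ...

class SpeechEngine:
    """Subject in the Observer pattern; a real framework would wrap a
    recognizer behind this interface (hypothetical API)."""
    def __init__(self):
        self._listeners = []

    def add_listener(self, listener: RecognitionListener) -> None:
        self._listeners.append(listener)

    def deliver_result(self, text: str) -> None:
        # In a real framework, the decoding loop would call this.
        for listener in self._listeners:
            listener.on_result(text)

class PrintingListener(RecognitionListener):
    def on_result(self, text: str) -> None:
        print("recognized:", text)

engine = SpeechEngine()
engine.add_listener(PrintingListener())
engine.deliver_result("hello world")   # simulate a recognition result
```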


ACM Multimedia | 2000

Detecting topical events in digital video

Tanveer Fathima Syeda-Mahmood; Savitha Srinivasan

The detection of events is essential to high-level semantic querying of video databases. It is also a very challenging problem requiring the detection and integration of evidence for an event available in multiple information modalities, such as audio, video, and language. This paper focuses on the detection of specific types of events, namely, topic of discussion events that occur in classroom/lecture environments. Specifically, we present a query-driven approach to the detection of topic of discussion events with foils used in a lecture as a way to convey a topic. In particular, we use the image content of foils to detect visual events in which the foil is displayed and captured in the video stream. The recognition of a foil in video frames exploits the color and spatial layout of regions on foils using a technique called region hashing. Next, we use the textual phrases listed on a foil as an indication of a topic, and detect topical audio events as places in the audio track where the best evidence for the topical phrases was heard. Finally, we use a probabilistic model of event likelihood to combine the results of visual and audio event detection that exploits their time co-occurrence. The resulting identification of topical events is evaluated in the domain of classroom lectures and talks.
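The fusion step can be illustrated with a toy model that scores a visual detection and an audio detection by the product of their confidences and a Gaussian falloff in their time difference. This is an invented stand-in for the paper's probabilistic model of event likelihood; all values are hypothetical.

```python
import math

def combine_events(visual, audio, sigma=5.0):
    """Fuse (time, score) detections from two modalities: pair scores
    are the product of modality confidences and a Gaussian penalty on
    their time difference, so co-occurring evidence ranks highest."""
    events = []
    for tv, sv in visual:
        for ta, sa in audio:
            co = math.exp(-((tv - ta) ** 2) / (2 * sigma ** 2))
            events.append((min(tv, ta), sv * sa * co))
    return sorted(events, key=lambda e: -e[1])

visual = [(120.0, 0.9), (300.0, 0.7)]   # (seconds, foil-match score)
audio = [(118.0, 0.8), (450.0, 0.6)]    # (seconds, phrase-spot score)
print(combine_events(visual, audio)[:2])
```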


Conference on Information and Knowledge Management | 2001

Advances in phonetic word spotting

Arnon Amir; Alon Efrat; Savitha Srinivasan

Phonetic speech retrieval is used to augment word-based retrieval in spoken document retrieval systems, for both in-vocabulary and out-of-vocabulary words. In this paper, we present a new indexing and ranking scheme using metaphones and a Bayesian phonetic edit distance. We conduct an extensive set of experiments using a hundred hours of HUB4 data with ground-truth transcripts and twenty-four thousand query words. We show an improvement of up to 15% in precision compared to results obtained from speech recognition alone, at a processing time of 0.5 seconds per query.
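A Bayesian phonetic edit distance can be sketched as a weighted edit distance whose substitution cost is the negative log of a confusion probability, so confusable phones substitute cheaply. The probabilities, costs, and phone labels below are invented for illustration, not the paper's trained model.

```python
import math

def phonetic_edit_distance(q, t, sub_prob, ins_cost=3.0, del_cost=3.0):
    """Weighted edit distance between phone strings where substitution
    cost is -log P(observed | intended): exact matches cost ~0,
    confusable phones cost little, everything else costs a lot."""
    m, n = len(q), len(t)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * del_cost
    for j in range(1, n + 1):
        d[0][j] = j * ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = -math.log(sub_prob.get((q[i-1], t[j-1]), 1e-4))
            d[i][j] = min(d[i-1][j] + del_cost,
                          d[i][j-1] + ins_cost,
                          d[i-1][j-1] + sub)
    return d[m][n]

sub_prob = {("t", "t"): 0.9, ("t", "d"): 0.08, ("ae", "ae"): 0.9}
print(phonetic_edit_distance(["t", "ae"], ["d", "ae"], sub_prob))
```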
