Jelena Tesic
IBM
Publication
Featured research published by Jelena Tesic.
IEEE MultiMedia | 2006
Milind R. Naphade; John R. Smith; Jelena Tesic; Shih-Fu Chang; Winston H. Hsu; Lyndon Kennedy; Alexander G. Hauptmann; Jon Curtis
As increasingly powerful techniques emerge for machine tagging multimedia content, it becomes ever more important to standardize the underlying vocabularies. Doing so provides interoperability and lets the multimedia community focus ongoing research on a well-defined set of semantics. This paper describes a collaborative effort of multimedia researchers, library scientists, and end users to develop a large standardized taxonomy for describing broadcast news video. The large-scale concept ontology for multimedia (LSCOM) is the first of its kind designed to simultaneously optimize utility to facilitate end-user access, cover a large semantic space, make automated extraction feasible, and increase observability in diverse broadcast news video data sets.
ACM Multimedia | 2007
Apostol Natsev; Alexander Haubold; Jelena Tesic; Lexing Xie; Rong Yan
We study the problem of semantic concept-based query expansion and re-ranking for multimedia retrieval. In particular, we explore the utility of a fixed lexicon of visual semantic concepts for automatic multimedia retrieval and re-ranking purposes. In this paper, we propose several new approaches for query expansion, in which textual keywords, visual examples, or initial retrieval results are analyzed to identify the most relevant visual concepts for the given query. These concepts are then used to generate additional query results and/or to re-rank an existing set of results. We develop both lexical and statistical approaches for text query expansion, as well as content-based approaches for visual query expansion. In addition, we study several other recently proposed methods for concept-based query expansion. In total, we compare 7 different approaches for expanding queries with visual semantic concepts. They are evaluated using a large video corpus and 39 concept detectors from the TRECVID-2006 video retrieval benchmark. We observe consistent improvement over the baselines for all 7 approaches, leading to an overall performance gain of 77% relative to a text retrieval baseline, and a 31% improvement relative to a state-of-the-art multimodal retrieval baseline.
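The lexical flavor of concept-based query expansion described above can be illustrated with a toy sketch: query keywords are matched against a fixed concept lexicon, and the selected detectors' scores are blended with the initial text-retrieval scores to re-rank results. All names, scores, and the blending weight below are illustrative assumptions, not values from the paper.

```python
# Toy lexicon mapping concept names to detector scores per video shot.
# Scores and shot names are hypothetical.
CONCEPT_SCORES = {
    "sports":   {"shot1": 0.9, "shot2": 0.2, "shot3": 0.6},
    "outdoor":  {"shot1": 0.7, "shot2": 0.1, "shot3": 0.8},
    "building": {"shot1": 0.1, "shot2": 0.9, "shot3": 0.3},
}

def expand_query(query_text):
    """Lexical expansion: select concepts whose name appears in the query."""
    tokens = set(query_text.lower().split())
    return [c for c in CONCEPT_SCORES if c in tokens]

def rerank(query_text, text_scores, alpha=0.5):
    """Blend initial text-retrieval scores with averaged concept scores."""
    concepts = expand_query(query_text)
    fused = {}
    for shot, t in text_scores.items():
        if concepts:
            c = sum(CONCEPT_SCORES[name][shot] for name in concepts) / len(concepts)
        else:
            c = 0.0
        fused[shot] = alpha * t + (1 - alpha) * c
    return sorted(fused, key=fused.get, reverse=True)
```

For the query "outdoor sports event", the concepts "sports" and "outdoor" are selected, and shots with strong detector responses for those concepts move up the ranking even when their text score alone is modest.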
ACM Multimedia | 2005
Apostol Natsev; Milind R. Naphade; Jelena Tesic
In this paper we unify two supposedly distinct tasks in multimedia retrieval. One task involves answering queries with a few examples. The other involves learning models for semantic concepts, also with a few examples. In our view these two tasks are identical, with the only differentiation being the number of examples that are available for training. Once we adopt this unified view, we then apply identical techniques for solving both problems and evaluate the performance using the NIST TRECVID benchmark evaluation data [15]. We propose a combination hypothesis of two complementary classes of techniques, a nearest neighbor model using only positive examples and a discriminative support vector machine model using both positive and negative examples. In the case of queries, where negative examples are rarely provided to seed the search, we create pseudo-negative samples. We then combine the ranked lists generated by evaluating the test database using both methods, to create a final ranked list of retrieved multimedia items. We evaluate this approach for rare concept and query topic modeling using the NIST TRECVID video corpus. In both tasks we find that applying the combination hypothesis across both modeling techniques and a variety of features results in enhanced performance over any of the baseline models, as well as in improved robustness with respect to training examples and visual features. In particular, we observe an improvement of 6% for rare concept detection and 17% for the search task.
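The combination hypothesis above pairs a positive-only nearest-neighbor scorer with a discriminative model trained against pseudo-negatives, then fuses the two ranked lists. The sketch below is a minimal stand-in: the "discriminative" scorer is a simple centroid-difference direction rather than a real SVM, and the fusion averages ranks; all of this is illustrative, not the paper's implementation.

```python
import numpy as np

def knn_scores(positives, items):
    """Positive-only model: score = negative distance to nearest positive."""
    d = np.linalg.norm(items[:, None, :] - positives[None, :, :], axis=2)
    return -d.min(axis=1)

def discriminative_scores(positives, pseudo_negatives, items):
    """Stand-in for the SVM: project items onto the direction from the
    pseudo-negative centroid to the positive centroid."""
    w = positives.mean(axis=0) - pseudo_negatives.mean(axis=0)
    return items @ w

def fuse_ranked_lists(score_a, score_b):
    """Combine the two ranked lists by summing per-item ranks;
    returns item indices, best (lowest summed rank) first."""
    rank_a = np.argsort(np.argsort(-score_a))
    rank_b = np.argsort(np.argsort(-score_b))
    return np.argsort(rank_a + rank_b)
```

An item close to the positive examples under both views ends up at the top of the fused list, which is the robustness the combination is after: neither scorer's failure mode dominates.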
Knowledge Discovery and Data Mining | 2007
Rong Yan; Jelena Tesic; John R. Smith
Typical approaches to the multi-label classification problem require learning an independent classifier for every label from all the examples and features. This can become a computational bottleneck for sizeable datasets with a large label space. In this paper, we propose an efficient and effective multi-label learning algorithm called model-shared subspace boosting (MSSBoost) as an attempt to reduce the information redundancy in the learning process. This algorithm automatically finds, shares, and combines a number of base models across multiple labels, where each model is learned from a random feature subspace and bootstrap data samples. The decision functions for each label are jointly estimated, and thus a small number of shared subspace models can support the entire label space. Our experimental results on both synthetic data and real multimedia collections have demonstrated that the proposed algorithm can achieve better classification performance than the non-ensemble baseline classifiers with a significant speedup in the learning and prediction processes. It can also use a smaller number of base models to achieve the same classification performance as its non-model-shared counterpart.
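The sharing idea can be sketched in a few lines: a small pool of base models is trained, each on a random feature subspace and a bootstrap sample, and every label's decision function is then jointly estimated as a linear combination over that shared pool. The base model and the least-squares weight fit below are simplified stand-ins for MSSBoost's actual boosting procedure; everything here is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(7)

def train_shared_pool(X, Y, n_models=8, subspace=2):
    """Train a pool of toy base models, each on a random feature
    subspace and a bootstrap sample of the data."""
    pool = []
    n, d = X.shape
    for _ in range(n_models):
        feats = rng.choice(d, size=subspace, replace=False)
        boot = rng.choice(n, size=n, replace=True)
        # Toy base model: mean direction of one randomly chosen label's
        # positive examples, restricted to the chosen subspace.
        label = rng.integers(Y.shape[1])
        pos = X[boot][Y[boot, label] == 1]
        w = pos.mean(axis=0)[feats] if len(pos) else np.zeros(subspace)
        pool.append((feats, w))
    return pool

def pool_outputs(pool, X):
    """Evaluate every shared base model on every example."""
    return np.column_stack([X[:, feats] @ w for feats, w in pool])

def fit_label_weights(pool, X, Y):
    """Jointly estimate each label's decision function as a linear
    combination of the shared base-model outputs (least squares)."""
    H = pool_outputs(pool, X)
    W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return W
```

The key property the abstract claims is visible in the shapes: with 8 shared models serving 3 labels, prediction costs 8 base-model evaluations total rather than one full classifier per label.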
IEEE MultiMedia | 2005
Jelena Tesic
Digital image metadata plays a crucial role in managing digital image repositories. It lets us catalog and maintain large image collections as well as search for and find relevant information. Moreover, describing a digital image with defined metadata schemes lets multiple systems with different platforms and interfaces access and process image metadata. Metadata's wide use in commercial, academic, and educational domains as well as on the Web has propelled the development of new standards for digital image data schemes. The Japan Electronics and Information Technology Industries Association has proposed the Exchangeable Image File Format (EXIF) as a standard for storing administrative metadata in digital image files during acquisition. The International Press Telecommunications Council (IPTC) has developed a standard for storing descriptive metadata information within digital images. These metadata schemas, as well as other emerging standards, provide a standard format for creating, processing, and exchanging digital image metadata and enable image management, analysis, indexing, and search applications.
IEEE MultiMedia | 2008
Scott A. Golder; Jelena Tesic
The four articles in this special issue focus on collaborative tagging of multimedia. The papers are summarized here.
Multimedia Information Retrieval | 2006
James Ze Wang; Nozha Boujemaa; Alberto Del Bimbo; Donald Geman; Alexander G. Hauptmann; Jelena Tesic
Multimedia information retrieval is a highly diverse field. A variety of data types, research problems, and methodologies are involved. Researchers in the field come from very different disciplines, ranging from mathematical and physical sciences, computational sciences and engineering, to application domains. The panel, consisting of highly visible active researchers from both academia and industry, opens a discussion on the importance of diversity to the healthy growth of the field. This paper records their opinions expressed at the panel.
International Conference on Multimedia and Expo | 2007
Lexing Xie; Apostol Natsev; Jelena Tesic
We propose effective multimodal fusion strategies for video search. Multimodal search is a widely applicable information-retrieval problem, and fusion strategies are essential to the system in order to utilize all available retrieval experts and to boost the performance. Prior work has focused on hard- and soft-modeling of query classes and learning weights for each class, while the class partition is either manually defined or learned from data but still insensitive to the testing query. We propose a query-dependent fusion strategy that dynamically generates a class among the training queries that are closest to the testing query, based on lightweight query features defined on the outcome of semantic analysis on the query text. A set of optimal weights is then learned on the dynamic class, which aims to model both the co-occurring query features and unusual test queries. Used in conjunction with the rest of our multimodal retrieval system, dynamic query classes perform favorably compared with hard and soft query classes, and the system performance improves upon the best automatic search run of TRECVID05 and TRECVID06 by 34% and 8%, respectively.
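The dynamic-class idea above can be sketched concretely: each training query carries lightweight features and per-expert fusion weights tuned offline; at test time, the nearest training queries form a dynamic class whose weights are averaged and applied to the experts' scores. The feature encoding, the weight values, and the nearest-neighbor rule below are all illustrative assumptions.

```python
import numpy as np

# Hypothetical training queries: lightweight query features
# (e.g. mentions-a-person, mentions-a-place, mentions-an-object)
# mapped to fusion weights for (text, visual, concept) experts.
TRAIN_QUERIES = {
    (1, 0, 0): np.array([0.7, 0.1, 0.2]),
    (1, 1, 0): np.array([0.5, 0.3, 0.2]),
    (0, 1, 1): np.array([0.2, 0.5, 0.3]),
}

def dynamic_weights(query_feats, k=2):
    """Form a dynamic class from the k nearest training queries
    (L1 distance on query features) and average their weights."""
    items = list(TRAIN_QUERIES.items())
    dists = [np.sum(np.abs(np.array(f) - np.array(query_feats))) for f, _ in items]
    nearest = np.argsort(dists)[:k]
    return np.mean([items[i][1] for i in nearest], axis=0)

def fuse_experts(expert_scores, weights):
    """Weighted sum of the retrieval experts' per-item score rows."""
    return weights @ expert_scores
```

Because the class is assembled per test query rather than fixed in advance, an unusual query still gets weights borrowed from its closest training neighbors instead of falling into a poorly matching predefined class.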
conference on image and video retrieval | 2007
Jelena Tesic; Apostol Natsev; John R. Smith
In this paper we present a novel approach to query-by-example using existing high-level semantics in the dataset. Typically with visual topics, the examples are not sufficiently diverse to create a robust model of the user's need in the descriptor space. As a result, direct modeling using the provided topic examples as training data is inadequate. Otherwise, systems resort to multiple content-based searches using each example in turn, which typically provides poor results. We explore the relevance of visual concept models and how they help refine the query topics. We propose a new technique of leveraging the underlying semantics contained in the visual query topic examples to improve the search. We treat the semantic space as the descriptor space, and intelligently model a query in that space. We use unlabeled data to expand the diversity of the topic examples as well as provide a robust set of negative examples that allow direct modeling. The approach intelligently models a positive and pseudo-negative space using unbiased and biased methods for data sampling and data selection, and improves semantic retrieval by 12% over TRECVID 2006 topics. Moreover, we explore the visual context in fusion with text and visual search baselines and examine how this component can improve baseline retrieval results by expanding and re-ranking them. We apply the proposed methods in a multimodal video search system, and show how the underlying semantics of the queries can significantly improve the overall visual search results, improving the baseline by over 46%, and enhancing performance of other modalities by at least 10%. We also demonstrate improved robustness over a range of query topic training examples and query topics with varying visual support in TRECVID.
International Conference on Multimedia and Expo | 2007
Jelena Tesic; Apostol Natsev; Lexing Xie; John R. Smith
In this paper we examine a novel approach to the difficult problem of querying video databases using visual topics with few examples. Typically with visual topics, the examples are not sufficiently diverse to create a robust model of the user's need. As a result, direct modeling using the provided topic examples as training data is inadequate. Otherwise, systems resort to multiple content-based searches using each example in turn, which typically provides poor results. We propose a new technique of leveraging unlabeled data to expand the diversity of the topic examples as well as provide a robust set of negative examples that allow direct modeling. The approach intelligently models a pseudo-negative space using unbiased and biased methods for data sampling and data selection. We apply the proposed method in a fusion framework to improve discriminative support vector machine modeling, and improve the overall system performance. The result is an enhanced performance over any of the baseline models, as well as improved robustness with respect to training examples, visual features, and visual support of video topics in TRECVID. The proposed method outperforms a baseline retrieval approach by more than 18% on the TRECVID 2006 video collection and query topics.
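The distinction between unbiased and biased pseudo-negative sampling that this abstract (and the one above it) relies on can be shown in a small sketch: unbiased sampling draws uniformly from the unlabeled pool, while biased sampling prefers unlabeled points far from the positive examples. The distance criterion below is an illustrative assumption, not the paper's exact selection rule.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_pseudo_negatives(unlabeled, positives, n, biased=False):
    """Unbiased: uniform random sample from the unlabeled pool.
    Biased: take the n unlabeled points farthest from any positive,
    making them safer stand-ins for true negatives."""
    if not biased:
        idx = rng.choice(len(unlabeled), size=n, replace=False)
        return unlabeled[idx]
    d = np.linalg.norm(unlabeled[:, None] - positives[None], axis=2).min(axis=1)
    return unlabeled[np.argsort(-d)[:n]]
```

With pseudo-negatives in hand, a discriminative model such as an SVM can be trained directly even though the query topic supplies only positive examples.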