Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Boris Babenko is active.

Publication


Featured research published by Boris Babenko.


Computer Vision and Pattern Recognition | 2009

Visual tracking with online Multiple Instance Learning

Boris Babenko; Ming-Hsuan Yang; Serge J. Belongie

In this paper, we address the problem of learning an adaptive appearance model for object tracking. In particular, a class of tracking techniques called “tracking by detection” has been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to separate the object from the background. This classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. Slight inaccuracies in the tracker can therefore lead to incorrectly labeled training examples, which degrade the classifier and can cause further drift. In this paper we show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems, and can therefore lead to a more robust tracker with fewer parameter tweaks. We present a novel online MIL algorithm for object tracking that achieves superior results with real-time performance.
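The practical core of the method is how training data is harvested each frame: instead of a single positive patch at the tracker's estimated location, a whole bag of nearby patches is labeled positive, and patches from an annulus farther out serve as negatives, so that MIL can absorb the labeling ambiguity. A minimal sketch of that sampling step, assuming a grayscale numpy frame and hypothetical helper names (not the authors' code):

```python
import numpy as np

def sample_bags(frame, cx, cy, w, h, r_pos=4, r_neg=(8, 30), n_neg=50, rng=None):
    """Harvest MIL training data around the current tracker state (cx, cy).

    Returns one positive bag (all patches whose centre lies within r_pos
    pixels of the tracker centre) and a list of negative patches drawn from
    the annulus r_neg[0] <= distance <= r_neg[1].  Patches are raw crops of
    a 2-D grayscale frame; a real tracker would compute Haar-like features
    from them instead.
    """
    rng = rng if rng is not None else np.random.default_rng(0)

    def crop(x, y):
        x0, y0 = int(x - w // 2), int(y - h // 2)
        return frame[max(y0, 0):y0 + h, max(x0, 0):x0 + w]

    # Positive bag: every offset inside a small disc is a (possibly noisy) positive.
    pos_bag = [crop(cx + dx, cy + dy)
               for dx in range(-r_pos, r_pos + 1)
               for dy in range(-r_pos, r_pos + 1)
               if dx * dx + dy * dy <= r_pos * r_pos]

    # Negatives: random offsets sampled from the surrounding annulus.
    negs = []
    while len(negs) < n_neg:
        dx, dy = rng.integers(-r_neg[1], r_neg[1] + 1, size=2)
        if r_neg[0] ** 2 <= dx * dx + dy * dy <= r_neg[1] ** 2:
            negs.append(crop(cx + dx, cy + dy))
    return pos_bag, negs
```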


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2011

Robust Object Tracking with Online Multiple Instance Learning

Boris Babenko; Ming-Hsuan Yang; Serge J. Belongie

In this paper, we address the problem of tracking an object in a video given its location in the first frame and no other information. Recently, a class of tracking techniques called “tracking by detection” has been shown to give promising results at real-time speeds. These methods train a discriminative classifier in an online manner to separate the object from the background. This classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. Slight inaccuracies in the tracker can therefore lead to incorrectly labeled training examples, which degrade the classifier and can cause drift. In this paper, we show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems and can therefore lead to a more robust tracker with fewer parameter tweaks. We propose a novel online MIL algorithm for object tracking that achieves superior results with real-time performance. We present thorough experimental results (both qualitative and quantitative) on a number of challenging video clips.
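Under the MIL boosting view used here, the per-example log-likelihood is replaced by a per-bag likelihood, commonly a noisy-OR over instance probabilities; its gradient with respect to each instance score gives the weights used to select and combine weak classifiers. A hedged sketch of that bag probability and its per-instance gradient (notation simplified; this is the standard noisy-OR derivation, not a transcription of the paper's online update):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bag_probability(instance_scores):
    """Noisy-OR bag probability: p_bag = 1 - prod_i (1 - p_i)."""
    p = sigmoid(np.asarray(instance_scores, dtype=float))
    return 1.0 - np.prod(1.0 - p)

def instance_weights(instance_scores, bag_label):
    """Gradient of the bag log-likelihood w.r.t. each instance score.

    For a positive bag (label 1) the weight of instance i is
    p_i * (1 - p_bag) / p_bag; for a negative bag it is -p_i.
    These play the role of example weights when choosing the next
    weak classifier in a MIL boosting round.
    """
    p = sigmoid(np.asarray(instance_scores, dtype=float))
    p_bag = 1.0 - np.prod(1.0 - p)
    if bag_label == 1:
        return p * (1.0 - p_bag) / max(p_bag, 1e-12)
    return -p
```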


International Conference on Computer Vision | 2011

End-to-end scene text recognition

Kai Wang; Boris Babenko; Serge J. Belongie

This paper focuses on the problem of word detection and recognition in natural images. The problem is significantly more challenging than reading text in scanned documents, and has only recently gained attention from the computer vision community. Sub-components of the problem, such as text detection and cropped image word recognition, have been studied in isolation [7, 4, 20]. However, what is unclear is how these recent approaches contribute to solving the end-to-end problem of word recognition. We fill this gap by constructing and evaluating two systems. The first, representing the de facto state-of-the-art, is a two-stage pipeline consisting of text detection followed by a leading OCR engine. The second is a system rooted in generic object recognition, an extension of our previous work in [20]. We show that the latter approach achieves superior performance. While scene text recognition has generally been treated with highly domain-specific methods, our results demonstrate the suitability of applying generic computer vision methods. Adopting this approach opens the door for real world scene text recognition to benefit from the rapid advances that have been taking place in object recognition.
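For context, the end-to-end metric that both systems are judged by counts a detection as correct only when its box sufficiently overlaps a ground-truth word box and the recognized string matches it. A small sketch of that matching step; the 50% IoU threshold and case-insensitive comparison are conventional assumptions, not quoted from the paper:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def match_words(detections, ground_truth, min_iou=0.5):
    """Greedily match (box, word) detections against (box, word) ground truth.

    Returns (true_positives, false_positives, false_negatives), from which
    end-to-end precision and recall follow directly.
    """
    unmatched = list(ground_truth)
    tp = fp = 0
    for box, word in detections:
        hit = next((g for g in unmatched
                    if iou(box, g[0]) >= min_iou and word.lower() == g[1].lower()),
                   None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
        else:
            fp += 1
    return tp, fp, len(unmatched)
```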


European Conference on Computer Vision | 2010

Visual recognition with humans in the loop

Steve Branson; Catherine Wah; Florian Schroff; Boris Babenko; Peter Welinder; Pietro Perona; Serge J. Belongie

We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate our methods on Birds-200, a difficult dataset of 200 tightly-related bird species, and on the Animals With Attributes dataset. Our results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.
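The "20 questions" loop can be pictured as: keep a posterior over classes seeded by the vision algorithm, ask the attribute question with the highest expected information gain, update on the answer, repeat. A simplified sketch under the assumption of deterministic binary attributes (the paper additionally models noisy user responses and unreliable vision outputs):

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def pick_question(posterior, attributes, asked):
    """Pick the binary attribute question with the highest expected information gain.

    posterior  : (n_classes,) class probabilities, e.g. seeded by a vision classifier
    attributes : (n_classes, n_attrs) 0/1 matrix of class-attribute truth values
    asked      : set of attribute indices already asked
    """
    best_q, best_gain = None, -1.0
    h = entropy(posterior)
    for q in range(attributes.shape[1]):
        if q in asked:
            continue
        p_yes = float(posterior @ attributes[:, q])
        gain = h
        for ans, p_ans in ((1, p_yes), (0, 1.0 - p_yes)):
            if p_ans <= 0:
                continue
            post = posterior * (attributes[:, q] == ans)
            gain -= p_ans * entropy(post / post.sum())
        if gain > best_gain:
            best_q, best_gain = q, gain
    return best_q

def update_posterior(posterior, attributes, q, answer):
    """Zero out classes inconsistent with the user's answer and renormalize."""
    post = posterior * (attributes[:, q] == answer)
    return post / post.sum()
```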


European Conference on Computer Vision | 2008

Multiple Component Learning for Object Detection

Piotr Dollár; Boris Babenko; Serge J. Belongie; Pietro Perona; Zhuowen Tu

Object detection is one of the key problems in computer vision. In the last decade, discriminative learning approaches have proven effective in detecting rigid objects, achieving very low false positive rates. The field has also seen a resurgence of part-based recognition methods, with impressive results on highly articulated, diverse object categories. In this paper we propose a discriminative learning approach for detection that is inspired by part-based recognition approaches. Our method, Multiple Component Learning (MCL), automatically learns individual component classifiers and combines these into an overall classifier. Unlike previous methods, which rely on either fairly restricted part models or labeled part data, MCL learns powerful component classifiers in a weakly supervised manner, where object labels are provided but part labels are not. The basis of MCL lies in learning a set classifier; we achieve this by combining boosting with weakly supervised learning, specifically the Multiple Instance Learning framework (MIL). MCL is general, and we demonstrate results on a range of data from computer audition and computer vision. In particular, MCL outperforms all existing methods on the challenging INRIA pedestrian detection dataset, and unlike methods that are not part-based, MCL is quite robust to occlusions.
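The set-classifier idea at the core of MCL can be summarized as: each training image is a set of candidate windows, a component scores the set by its best window (the MIL step), and boosting combines several such components. A bare-bones sketch of that scoring and combination, with toy linear components standing in for the learned ones (illustrative only, not the authors' implementation):

```python
import numpy as np

def set_score(component, windows):
    """Score a set of candidate windows by its best-scoring member (the MIL step)."""
    return max(component(w) for w in windows)

def mcl_predict(components, alphas, windows):
    """Boosted combination of component set-scores: sum_t alpha_t * max_j h_t(x_j)."""
    return sum(a * set_score(h, windows) for h, a in zip(components, alphas))

# Toy usage: two random linear "components" scoring one image's candidate windows.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    windows = [rng.normal(size=4) for _ in range(10)]
    components = [lambda w, v=rng.normal(size=4): float(v @ w) for _ in range(2)]
    print(mcl_predict(components, [0.7, 0.3], windows))
```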


Computer Vision and Pattern Recognition | 2006

Improving Web-based Image Search via Content Based Clustering

Nadav Ben-Haim; Boris Babenko; Serge J. Belongie

Current image search engines on the web rely purely on the keywords around the images and the filenames, which produces a lot of garbage in the search results. Alternatively, there exist methods for content based image retrieval that require a user to submit a query image, and return images that are similar in content. We propose a novel approach named ReSPEC (Re-ranking Sets of Pictures by Exploiting Consistency), which is a hybrid of the two methods. Our algorithm first retrieves the results of a keyword query from an existing image search engine, clusters the results based on extracted image features, and returns the cluster that is inferred to be the most relevant to the search query. Furthermore, it ranks the remaining results in order of relevance.
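Operationally the pipeline is: fetch the keyword-search results, embed each image, cluster the embeddings, keep the cluster inferred to be on topic, and order everything else by distance to it. A rough sketch using k-means and "largest cluster wins" as a stand-in relevance rule; both the clustering algorithm and that rule are assumptions made for illustration, not necessarily the paper's exact choices:

```python
import numpy as np
from sklearn.cluster import KMeans

def respec_rerank(features, n_clusters=5, random_state=0):
    """Re-rank keyword-search results by visual consistency.

    features : (n_images, d) array of image descriptors for the raw results
    Returns a list of image indices, most relevant first: members of the
    chosen cluster come first, everything else follows, both ordered by
    distance to that cluster's centre.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state).fit(features)
    # Stand-in relevance rule: the most populous cluster is taken to be "on topic".
    labels, counts = np.unique(km.labels_, return_counts=True)
    top = labels[np.argmax(counts)]
    dists = np.linalg.norm(features - km.cluster_centers_[top], axis=1)
    return sorted(range(len(features)), key=lambda i: (km.labels_[i] != top, dists[i]))
```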


European Conference on Computer Vision | 2008

Weakly Supervised Object Localization with Stable Segmentations

Carolina Galleguillos; Boris Babenko; Andrew Rabinovich; Serge J. Belongie

Multiple Instance Learning (MIL) provides a framework for training a discriminative classifier from data with ambiguous labels. This framework is well suited for the task of learning object classifiers from weakly labeled image data, where only the presence of an object in an image is known, but not its location. Some recent work has explored the application of MIL algorithms to the tasks of image categorization and natural scene classification. In this paper we extend these ideas in a framework that uses MIL to recognize and localize objects in images. To achieve this we employ state-of-the-art image descriptors and multiple stable segmentations. These components, combined with a powerful MIL algorithm, form our object recognition system called MILSS. We show highly competitive object categorization results on the Caltech dataset. To evaluate the performance of our algorithm further, we introduce the challenging Landmarks-18 dataset, a collection of photographs of famous landmarks from around the world. The results on this new dataset show the great potential of our proposed algorithm.
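In MIL terms, each image is a bag and each stable segment is an instance; once an instance-level scorer has been trained from image-level labels, localization is simply the top-scoring segment and the image score is the bag maximum. A bare-bones sketch of that inference step, with feature extraction and the MIL training itself abstracted behind a callable:

```python
import numpy as np

def localize(instance_scorer, segment_features, segment_masks):
    """Localize an object with a trained MIL instance scorer.

    segment_features : (n_segments, d) descriptors, one per stable segment
    segment_masks    : list of n_segments segment masks (or bounding boxes)
    Returns (best_mask, bag_score): the top-scoring segment as the predicted
    location, and the max instance score as the image-level (bag) score.
    """
    scores = np.asarray([instance_scorer(f) for f in segment_features])
    best = int(np.argmax(scores))
    return segment_masks[best], float(scores[best])
```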


International Conference on Computer Vision | 2009

Similarity metrics for categorization: From monolithic to category specific

Boris Babenko; Steve Branson; Serge J. Belongie

Similarity metrics that are learned from labeled training data can be advantageous in terms of performance and/or efficiency. These learned metrics can then be used in conjunction with a nearest neighbor classifier, or can be plugged in as kernels to an SVM. For the task of categorization two scenarios have thus far been explored. The first is to train a single “monolithic” similarity metric that is then used for all examples. The other is to train a metric for each category in a 1-vs-all manner. While the former approach seems to be at a disadvantage in terms of performance, the latter is not practical for large numbers of categories. In this paper we explore the space in between these two extremes. We present an algorithm that learns a few similarity metrics, while simultaneously grouping categories together and assigning one of these metrics to each group. We present promising results and show how the learned metrics generalize to novel categories.
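The training scheme this suggests is an alternation: fit one metric per group of categories, then reassign each category to whichever metric serves it best, and repeat. A schematic sketch using per-group NCA transforms as the "metrics" and held-out 1-NN accuracy as the assignment criterion; both are illustrative substitutions, not the algorithm or objective in the paper:

```python
import numpy as np
from sklearn.neighbors import NeighborhoodComponentsAnalysis, KNeighborsClassifier

def fit_group_metrics(X, y, groups, n_groups=3, n_iters=5, random_state=0):
    """Alternate between fitting one metric per group and reassigning categories.

    X, y   : training features and category labels
    groups : dict mapping each category label (as it appears in y) to an
             initial group id in [0, n_groups), e.g. a random assignment
    Returns (metrics, groups): one fitted NCA transform per group (or None
    if a group has fewer than two categories) and the final assignment.
    """
    rng = np.random.default_rng(random_state)
    cats = np.unique(y)
    metrics = [None] * n_groups
    for _ in range(n_iters):
        # (1) Refit each group's metric on the categories currently assigned to it.
        for g in range(n_groups):
            mask = np.isin(y, [c for c in cats if groups[c] == g])
            if mask.sum() > 1 and len(np.unique(y[mask])) > 1:
                metrics[g] = NeighborhoodComponentsAnalysis(
                    random_state=random_state).fit(X[mask], y[mask])
        # (2) Reassign each category to the metric under which held-out
        #     1-NN classification of its examples works best.
        for c in cats:
            idx_c = np.flatnonzero(y == c)
            rng.shuffle(idx_c)
            hold, keep = idx_c[:len(idx_c) // 2], idx_c[len(idx_c) // 2:]
            if len(hold) == 0:
                continue
            train_idx = np.concatenate([np.flatnonzero(y != c), keep])
            best_g, best_acc = groups[c], -1.0
            for g, m in enumerate(metrics):
                if m is None:
                    continue
                knn = KNeighborsClassifier(n_neighbors=1).fit(
                    m.transform(X[train_idx]), y[train_idx])
                acc = knn.score(m.transform(X[hold]), y[hold])
                if acc > best_acc:
                    best_g, best_acc = g, acc
            groups[c] = best_g
    return metrics, groups
```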


International Conference on Computer Vision | 2007

Task Specific Local Region Matching

Boris Babenko; Piotr Dollár; Serge J. Belongie

Many problems in computer vision require the knowledge of potential point correspondences between two images. The usual approach for automatically determining correspondences begins by comparing small neighborhoods of high saliency in both images. Since speed is of the essence, most current approaches for local region matching involve the computation of a feature vector that is invariant to various geometric and photometric transformations, followed by fast distance computations using standard vector norms. These algorithms include many parameters, and choosing an algorithm and setting its parameters for a given problem is more an art than a science. Furthermore, although invariance of the resulting feature space is in general desirable, there is necessarily a tradeoff between invariance and descriptiveness for any given task. In this paper we pose local region matching as a classification problem, and use powerful machine learning techniques to train a classifier that selects features from a much larger pool. Our algorithm can be trained on specific domains or tasks, and performs better than the state of the art in such cases. Since our method is an application of boosting, we refer to it as boosted region matching (BOOM).
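Casting matching as classification means labeling patch pairs as "same point" versus "different point" and training a boosted classifier on features of the pair, whose score then replaces a fixed descriptor distance. A small sketch with gradient-boosted trees over absolute-difference features, a stand-in for the boosted feature selection the paper describes:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def pair_features(patch_a, patch_b):
    """Simple pair representation: element-wise absolute difference of flattened patches."""
    return np.abs(patch_a.ravel().astype(float) - patch_b.ravel().astype(float))

def train_matcher(pos_pairs, neg_pairs):
    """Train a boosted 'same point / different point' classifier.

    pos_pairs, neg_pairs : lists of (patch_a, patch_b) tuples of equal-sized patches
    """
    X = np.array([pair_features(a, b) for a, b in pos_pairs + neg_pairs])
    y = np.array([1] * len(pos_pairs) + [0] * len(neg_pairs))
    return GradientBoostingClassifier(n_estimators=100, max_depth=2).fit(X, y)

def match_score(clf, patch_a, patch_b):
    """Probability that two patches depict the same scene point."""
    return float(clf.predict_proba(pair_features(patch_a, patch_b)[None])[0, 1])
```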


International Conference on Computer Vision | 2009

A family of online boosting algorithms

Boris Babenko; Ming-Hsuan Yang; Serge J. Belongie

Boosting has become a powerful and useful tool in the machine learning and computer vision communities in recent years, and many interesting boosting algorithms have been developed to solve various challenging problems. In particular, Friedman proposed a flexible framework called gradient boosting, which has been used to derive boosting procedures for regression, multiple instance learning, semi-supervised learning, etc. Recently some attention has been given to online boosting (where the examples become available one at a time). In this paper we develop a boosting framework that can be used to derive online boosting algorithms for various cost functions. Within this framework, we derive online boosting algorithms for Logistic Regression, Least Squares Regression, and Multiple Instance Learning. We present promising results on a wide range of data sets.
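In the gradient-boosting view, each weak learner is fit to the negative functional gradient of the loss at the current ensemble output; the online variant keeps a fixed pool of weak learners and nudges them one example at a time. A toy sketch for the logistic loss with linear weak learners updated by stochastic gradient steps toward that residual; this is a simplification of the framework, not the paper's exact update rules:

```python
import numpy as np

class OnlineGradientBoost:
    """Toy online boosting for the logistic loss with M linear weak learners.

    For each incoming example (x, y) with y in {-1, +1}, learner m is nudged
    toward the negative functional gradient of the loss evaluated at the
    partial ensemble output F_{m-1}(x), processed greedily in order.
    """

    def __init__(self, dim, n_learners=10, lr=0.1):
        self.w = np.zeros((n_learners, dim))   # one linear weak learner per row
        self.lr = lr

    def predict(self, x):
        return float(np.sum(self.w @ x))

    def update(self, x, y):
        partial = 0.0
        for m in range(len(self.w)):
            # Negative gradient of log(1 + exp(-y * F)) w.r.t. F at the partial sum.
            residual = y / (1.0 + np.exp(y * partial))
            # One SGD step on the squared error between learner m's output and the residual.
            h = float(self.w[m] @ x)
            self.w[m] += self.lr * (residual - h) * x
            partial += float(self.w[m] @ x)
```

Swapping the logistic loss for least squares or a MIL bag likelihood changes only the residual computation, which is the point of the framework.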

Collaboration


Dive into Boris Babenko's collaborations.

Top Co-Authors

Pietro Perona (California Institute of Technology)
Steve Branson (California Institute of Technology)
Zhuowen Tu (University of California)
Catherine Wah (University of California)