Publication


Featured research published by Gabriela Csurka.


International Conference on Computer Vision | 2009

A framework for visual saliency detection with applications to image thumbnailing

Luca Marchesotti; Claudio Cifarelli; Gabriela Csurka

We propose a novel framework for visual saliency detection based on a simple principle: images sharing their global visual appearance are likely to share similar saliency. Assuming an annotated image database is available, we first retrieve the images most similar to the target image; second, we build a simple classifier and use it to generate saliency maps. Finally, we refine the maps and extract thumbnails. We show that, in spite of its simplicity, our framework outperforms state-of-the-art approaches. Another advantage is its ability to deal with visual pop-out and application/task-driven saliency, if appropriately annotated images are available.
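The retrieve-then-classify pipeline above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names are invented, and a nearest-mean pixel scorer stands in for the paper's "simple classifier".

```python
import numpy as np

def retrieve_similar(target_desc, db_descs, k=5):
    """Return indices of the k database images whose global
    descriptors are closest (Euclidean) to the target's."""
    dists = np.linalg.norm(db_descs - target_desc, axis=1)
    return np.argsort(dists)[:k]

def saliency_map(image_pixels, retrieved_pixels, retrieved_masks):
    """Score each pixel of the target image with a nearest-mean
    classifier trained on the salient / non-salient pixels of the
    retrieved annotated images."""
    feats = np.vstack(retrieved_pixels)       # (N, d) pixel features
    labels = np.concatenate(retrieved_masks)  # (N,) 1 = salient
    mu_s = feats[labels == 1].mean(axis=0)
    mu_b = feats[labels == 0].mean(axis=0)
    d_s = np.linalg.norm(image_pixels - mu_s, axis=1)
    d_b = np.linalg.norm(image_pixels - mu_b, axis=1)
    # High score where the pixel is closer to the salient mean.
    return d_b / (d_s + d_b + 1e-9)
```

The thumbnailing step would then crop around the highest-scoring region of the resulting map.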


European Conference on Computer Vision | 2006

Adapted vocabularies for generic visual categorization

Florent Perronnin; Christopher R. Dance; Gabriela Csurka; Marco Bressan

Several state-of-the-art Generic Visual Categorization (GVC) systems are built around a vocabulary of visual terms and characterize images with one histogram of visual word counts. We propose a novel and practical approach to GVC based on a universal vocabulary, which describes the content of all the considered classes of images, and class vocabularies obtained through the adaptation of the universal vocabulary using class-specific data. An image is characterized by a set of histograms – one per class – where each histogram describes whether the image content is best modeled by the universal vocabulary or the corresponding class vocabulary. It is shown experimentally on three very different databases that this novel representation outperforms those approaches which characterize an image with a single histogram.
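The per-class bipartite histogram described above can be sketched in a few lines. This is an illustrative assumption of how the representation might be computed (hard nearest-centroid assignment over the concatenated vocabularies); the paper's actual adaptation of the universal vocabulary is a separate, probabilistic step not shown here.

```python
import numpy as np

def bipartite_histogram(descriptors, universal, class_vocab):
    """For one class: assign each local descriptor to its nearest
    centroid in the concatenated (universal + class-adapted)
    vocabulary and return the normalized 2K-bin count histogram.
    Mass falling in the second half indicates the image content is
    better modeled by the class vocabulary."""
    vocab = np.vstack([universal, class_vocab])               # (2K, d)
    d = np.linalg.norm(descriptors[:, None, :] - vocab[None], axis=2)
    assign = d.argmin(axis=1)
    hist = np.bincount(assign, minlength=len(vocab)).astype(float)
    return hist / hist.sum()
```

An image would be characterized by one such histogram per class, each fed to that class's classifier.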


International Conference on Computer Vision | 2011

Assessing the aesthetic quality of photographs using generic image descriptors

Luca Marchesotti; Florent Perronnin; Diane Larlus; Gabriela Csurka

In this paper, we automatically assess the aesthetic properties of images. In the past, this problem has been addressed by hand-crafting features which would correlate with best photographic practices (e.g. “Does this image respect the rule of thirds?”) or with photographic techniques (e.g. “Is this image a macro?”). We depart from this line of research and propose to use generic image descriptors to assess aesthetic quality. We experimentally show that the descriptors we use, which aggregate statistics computed from low-level local features, implicitly encode the aesthetic properties explicitly used by state-of-the-art methods and outperform them by a significant margin.


European Conference on Computer Vision | 2012

Metric learning for large scale image classification: generalizing to new classes at near-zero cost

Thomas Mensink; Jakob J. Verbeek; Florent Perronnin; Gabriela Csurka

We are interested in large-scale image classification and especially in the setting where images corresponding to new or existing classes are continuously added to the training set. Our goal is to devise classifiers which can incorporate such images and classes on-the-fly at (near) zero cost. We cast this problem as one of learning a metric which is shared across all classes and explore k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers. We learn metrics on the ImageNet 2010 challenge data set, which contains more than 1.2M training images of 1K classes. Surprisingly, the NCM classifier compares favorably to the more flexible k-NN classifier, and has comparable performance to linear SVMs. We also study generalization performance, among other things by applying the learned metric to the ImageNet-10K dataset, where we obtain competitive performance. Finally, we explore zero-shot classification, and show how the zero-shot model can be combined very effectively with small training datasets.
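The "near-zero cost" property comes from the fact that, once the shared metric is fixed, adding a class only requires computing its mean. A minimal NumPy sketch, with the caveat that the class name is invented and the projection matrix `W` is supplied rather than learned (the paper learns it discriminatively):

```python
import numpy as np

class NCMClassifier:
    """Nearest class mean classifier under a shared linear
    projection W (here given, not learned)."""
    def __init__(self, W):
        self.W = W

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        PX = X @ self.W.T            # project samples
        PM = self.means_ @ self.W.T  # project class means
        d = ((PX[:, None] - PM[None]) ** 2).sum(axis=2)
        return self.classes_[d.argmin(axis=1)]

    def add_class(self, label, Xc):
        """Adding a class costs only one mean computation."""
        self.classes_ = np.append(self.classes_, label)
        self.means_ = np.vstack([self.means_, Xc.mean(axis=0)])
```

No retraining touches the existing classes when `add_class` is called, which is what makes the setting with a continuously growing class set practical.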


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013

Distance-Based Image Classification: Generalizing to New Classes at Near-Zero Cost

Thomas Mensink; Jakob J. Verbeek; Florent Perronnin; Gabriela Csurka

We study large-scale image classification methods that can incorporate new classes and training images continuously over time at negligible cost. To this end, we consider two distance-based classifiers, the k-nearest neighbor (k-NN) and nearest class mean (NCM) classifiers, and introduce a new metric learning approach for the latter. We also introduce an extension of the NCM classifier to allow for richer class representations. Experiments on the ImageNet 2010 challenge dataset, which contains over 10^6 training images of 1,000 classes, show that, surprisingly, the NCM classifier compares favorably to the more flexible k-NN classifier. Moreover, the NCM performance is comparable to that of linear SVMs which obtain current state-of-the-art performance. Experimentally, we study the generalization performance to classes that were not used to learn the metrics. Using a metric learned on 1,000 classes, we show results for the ImageNet-10K dataset which contains 10,000 classes, and obtain performance that is competitive with the current state-of-the-art while being orders of magnitude faster. Furthermore, we show how a zero-shot class prior based on the ImageNet hierarchy can improve performance when few training images are available.


International Conference on Multimedia Retrieval | 2011

Semantic combination of textual and visual information in multimedia retrieval

Stéphane Clinchant; Julien Ah-Pine; Gabriela Csurka

The goal of this paper is to introduce a set of techniques, which we call semantic combination, to efficiently fuse text and image retrieval systems in the context of multimedia information access. These techniques emerge from the observation that image and textual queries are expressed at different semantic levels and that a single image query is often ambiguous. Overall, the semantic combination techniques overcome a conceptual barrier rather than a technical one: these methods can be seen as a combination of late fusion and image reranking. Albeit simple, this approach has not been used before. We assess the proposed techniques against late and cross-media fusion using four different ImageCLEF datasets. Compared to late fusion, performance increases significantly on two datasets and remains similar on the other two.
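The "late fusion plus image reranking" view can be made concrete with a small sketch. This is one plausible reading under stated assumptions (min-max normalized scores, a convex combination, and a text-side shortlist), not the paper's exact formulation; the function name and parameters are invented for illustration:

```python
def semantic_combination(text_scores, image_scores, top_k=100, alpha=0.5):
    """Restrict fusion to the text system's top-k documents (the
    text query fixes the semantic level), then rerank that shortlist
    by a convex combination of the normalized text and image scores."""
    shortlist = sorted(text_scores, key=text_scores.get, reverse=True)[:top_k]

    def norm(scores, keys):
        vals = [scores.get(k, 0.0) for k in keys]
        lo, hi = min(vals), max(vals)
        rng = (hi - lo) or 1.0
        return {k: (scores.get(k, 0.0) - lo) / rng for k in keys}

    t = norm(text_scores, shortlist)
    v = norm(image_scores, shortlist)
    fused = {d: alpha * t[d] + (1 - alpha) * v[d] for d in shortlist}
    return sorted(fused, key=fused.get, reverse=True)
```

Documents that the text system never retrieved are discarded, which is exactly where this differs from plain late fusion over the full collection.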


Computer Vision and Pattern Recognition | 2011

Learning structured prediction models for interactive image labeling

Thomas Mensink; Jakob J. Verbeek; Gabriela Csurka

We propose structured models for image labeling that take into account the dependencies among the image labels explicitly. These models are more expressive than independent label predictors, and lead to more accurate predictions. While the improvement is modest for fully-automatic image labeling, the gain is significant in an interactive scenario where a user provides the value of some of the image labels. Such an interactive scenario offers an interesting trade-off between accuracy and manual labeling effort. The structured models are used to decide which labels should be set by the user, and transfer the user input to more accurate predictions on other image labels. We also apply our models to attribute-based image classification, where attribute predictions of a test image are mapped to class probabilities by means of a given attribute-class mapping. In this case the structured models are built at the attribute level. We also consider an interactive system where the system asks a user to set some of the attribute values in order to maximally improve class prediction performance. Experimental results on three publicly available benchmark data sets show that in all scenarios our structured models lead to more accurate predictions, and leverage user input much more effectively than state-of-the-art independent models.
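One ingredient of such an interactive loop is deciding which label to ask the user about next. The paper selects questions so as to maximally improve the structured model's predictions; the sketch below substitutes a much simpler, model-agnostic heuristic (ask about the label with the most uncertain marginal), purely to illustrate the loop. Names and the entropy criterion are assumptions, not the authors' method:

```python
import math

def next_question(marginals, answered):
    """Pick the still-unanswered label whose predicted marginal
    probability is most uncertain (maximum binary entropy)."""
    def H(p):
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    candidates = {lbl: H(p) for lbl, p in marginals.items()
                  if lbl not in answered}
    return max(candidates, key=candidates.get)
```

In the structured setting, each user answer would additionally be propagated through the label dependencies to sharpen the remaining marginals before the next question is chosen.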


British Machine Vision Conference | 2008

A Simple High Performance Approach to Semantic Segmentation

Gabriela Csurka; Florent Perronnin

We propose a simple approach to semantic image segmentation. Our system scores low-level patches according to their class relevance, propagates these posterior probabilities to pixels and uses low-level segmentation to guide the semantic segmentation. The two main contributions of this paper are as follows. First, for the patch scoring, we describe each patch with a high-level descriptor based on the Fisher kernel and use a set of linear classifiers. While the Fisher kernel methodology was shown to lead to high accuracy for image classification, it has not been applied to the segmentation problem. Second, we use global image classifiers to take into account the context of the objects to be segmented. If an image as a whole is unlikely to contain an object class, then the corresponding class is not considered in the segmentation pipeline. This increases the classification accuracy and reduces the computational cost. We will show that despite its apparent simplicity, this system provides above state-of-the-art performance on the PASCAL VOC 2007 dataset and state-of-the-art performance on the MSRC 21 dataset.
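The global-classifier gating in the second contribution is straightforward to sketch. A minimal NumPy illustration, assuming per-patch scores have already been propagated to a dense grid; array shapes, the threshold, and the function name are assumptions for illustration:

```python
import numpy as np

def gated_segmentation(patch_scores, image_scores, threshold=0.5):
    """patch_scores: (H, W, C) per-location class posteriors;
    image_scores: (C,) whole-image classifier scores.
    Classes the global classifier rejects are masked out before the
    per-location argmax, improving accuracy and skipping work for
    absent classes."""
    active = image_scores >= threshold                          # (C,) gate
    masked = np.where(active[None, None, :], patch_scores, -np.inf)
    return masked.argmax(axis=2)                                # (H, W) labels
```

In a real pipeline the gate would be applied before patch scoring, so rejected classes are never evaluated at all, which is where the computational saving comes from.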


Multimedia Tools and Applications | 2009

Crossing textual and visual content in different application scenarios

Julien Ah-Pine; Marco Bressan; Stéphane Clinchant; Gabriela Csurka; Yves Hoppenot; Jean-Michel Renders

This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the more general multimodal scenario. Both approaches fall in the trans-media pseudo-relevance feedback category. Our first method uses a mixture model of the aggregate components, considering them as a single relevance concept. In our second approach, we define trans-media similarities as an aggregation of monomodal similarities between the elements of the aggregate and the new multimodal object. We also introduce the monomodal similarity measures for text and images that serve as basic components for both proposed trans-media similarities. We show how a large variety of problems can be framed so as to be addressed with the proposed techniques: image annotation or captioning, text illustration, and multimedia retrieval and clustering. Finally, we present how these methods can be integrated into two applications: a travel blog assistant system and a tool for browsing Wikipedia that takes into account the multimedia nature of its content.
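The second, aggregation-based trans-media similarity can be sketched as follows: retrieve an aggregate of documents in one modality, then score everything by similarity to that aggregate in the other modality. All names and the averaging choice are illustrative assumptions, not the paper's exact definitions:

```python
def transmedia_feedback(query, visual_sim, text_sim, docs, k=3):
    """Trans-media pseudo-relevance feedback, aggregation variant:
    take the k documents most visually similar to the query, then
    rank every document by its mean TEXT similarity to that
    aggregate, crossing from the visual to the textual modality."""
    ranked = sorted(docs, key=lambda d: visual_sim(query, d), reverse=True)
    aggregate = ranked[:k]
    scores = {d: sum(text_sim(d, a) for a in aggregate) / k for d in docs}
    return sorted(docs, key=scores.get, reverse=True)
```

Swapping the roles of `visual_sim` and `text_sim` gives the opposite crossing (text query, image-side scoring), which is how the same mechanism serves both image annotation and text illustration.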


International Conference on Pattern Recognition | 2006

Probabilistic Automatic Red Eye Detection and Correction

Jutta Willamowski; Gabriela Csurka

In this paper we propose a new probabilistic approach to red eye detection and correction, based on the stepwise refinement of a pixel-wise red eye probability map. Red eye detection starts with a fast rejection step for non-red-eye regions. A classification step then adjusts the probabilities attributed to the detected red eye candidates. The correction step finally applies a soft red eye correction based on the resulting probability map. The proposed approach is fast, achieves an excellent correction of strong red eyes, and still significantly corrects weaker ones.
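The probability-map idea can be sketched in two steps: an initial pixel-wise probability and a soft, probability-weighted correction. The redness measure below is a hand-crafted stand-in for the paper's refined probabilities, and both function names are invented:

```python
import numpy as np

def redness_probability(rgb):
    """Initial pixel-wise red-eye probability from a simple redness
    measure: how much the red channel exceeds the green/blue mean."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.clip((r - (g + b) / 2.0) / 255.0, 0.0, 1.0)

def soft_correction(rgb, prob):
    """Soft correction: blend the red channel toward the green/blue
    mean, weighted per pixel by the probability map, so strong red
    eyes are corrected heavily and uncertain pixels only slightly."""
    out = rgb.astype(float).copy()
    neutral = (rgb[..., 1] + rgb[..., 2]) / 2.0
    out[..., 0] = (1 - prob) * rgb[..., 0] + prob * neutral
    return out
```

The soft blend is what avoids the hard artifacts of binary red-eye masks: pixels with low probability are left essentially untouched.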
