Svebor Karaman | Researchain

Publication

Featured research published by Svebor Karaman.

International Conference on Computer Vision | 2012

Identity inference: generalizing person re-identification scenarios

Svebor Karaman; Andrew D. Bagdanov

In this article we introduce the problem of identity inference as a generalization of the re-identification problem. Identity inference is applicable in situations where a large number of unknown persons must be identified without knowing a priori that groups of test images represent the same individual. Standard single- and multi-shot person re-identification are special cases of our formulation. We present an approach to solving identity inference problems by modeling them as a labeling problem in a Conditional Random Field (CRF). The CRF model ensures that the final labeling gives similar labels to detections that are similar in feature space, and is flexible enough to incorporate constraints in the temporal and spatial domains. Experimental results are given on the ETHZ dataset. Our approach yields state-of-the-art performance for the multi-shot re-identification task and promising results for more general identity inference problems.
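
The labeling idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not give the energy terms or the inference algorithm, so the sketch assumes a unary cost per detection and label, a Potts-style pairwise penalty on edges between similar detections, and iterated conditional modes (ICM) for inference. The function name `icm_labeling` and the toy numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def icm_labeling(unary: List[List[float]],
                 edges: Dict[Tuple[int, int], float],
                 n_iters: int = 10) -> List[int]:
    """Greedy labeling with iterated conditional modes (assumed inference).

    unary[i][l] is the cost of giving label l to detection i.
    edges[(i, j)] is the penalty paid when detections i and j,
    which are similar in feature space, receive different labels.
    """
    # Start from the label with the lowest unary cost at each node.
    labels = [min(range(len(u)), key=u.__getitem__) for u in unary]

    # Build an adjacency list from the undirected edge dictionary.
    neighbors: Dict[int, List[Tuple[int, float]]] = {i: [] for i in range(len(unary))}
    for (i, j), w in edges.items():
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    for _ in range(n_iters):
        changed = False
        for i, u in enumerate(unary):
            # Local cost of a label: unary term plus penalties for disagreeing neighbors.
            def cost(label: int) -> float:
                return u[label] + sum(w for j, w in neighbors[i] if labels[j] != label)

            best = min(range(len(u)), key=cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Toy example: three detections, two identities. Detection 1 weakly prefers
# identity 1 on its own, but it is similar to detections 0 and 2.
unary = [[0.0, 1.0], [0.6, 0.4], [0.0, 1.0]]
edges = {(0, 1): 0.5, (1, 2): 0.5}
print(icm_labeling(unary, edges))  # [0, 0, 0]
print(icm_labeling(unary, {}))     # [0, 1, 0]
```

With the similarity edges in place the middle detection is pulled to the label of its neighbors; without them each detection keeps its individually cheapest label.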

Multimedia Tools and Applications | 2016

Personalized multimedia content delivery on an interactive table by passive observation of museum visitors

Svebor Karaman; Andrew D. Bagdanov; Lea Landucci; Gianpaolo D'Amico; Andrea Ferracani; Daniele Pezzatini; Alberto Del Bimbo

The amount of multimedia data collected in museum databases is growing fast, while the capacity of museums to display information to visitors is acutely limited by physical space. Museums must seek the perfect balance of information given on individual pieces in order to provide sufficient information to aid visitor understanding while maintaining sparse usage of the walls and guaranteeing high appreciation of the exhibit. Moreover, museums often target the interests of average visitors instead of the entire spectrum of different interests each individual visitor might have. Finally, visiting a museum should not be an experience contained in the physical space of the museum but a door opened onto a broader context of related artworks, authors, artistic trends, etc. In this paper we describe the MNEMOSYNE system that attempts to address these issues through a new multimedia museum experience. Based on passive observation, the system builds a profile of the artworks of interest for each visitor. These profiles of interest are then used to drive an interactive table that personalizes multimedia content delivery. The natural user interface on the interactive table uses the visitor’s profile, an ontology of museum content and a recommendation system to personalize exploration of multimedia content. At the end of their visit, the visitor can take home a personalized summary of their visit on a custom mobile application. In this article we describe in detail each component of our approach as well as the first field trials of our prototype system built and deployed at our permanent exhibition space at LeMurate (http://www.lemurate.comune.fi.it/lemurate/) in Florence together with the first results of the evaluation process during the official installation in the National Museum of Bargello (http://www.uffizi.firenze.it/musei/?m=bargello).

British Machine Vision Conference | 2014

Adaptive Structured Pooling for Action Recognition

Svebor Karaman; Lorenzo Seidenari; Shugao Ma; Alberto Del Bimbo; Stan Sclaroff

We propose an adaptive structured pooling strategy to solve the action recognition problem in videos. Our method aims at identifying several spatio-temporal pooling regions, each corresponding to a consistent spatial and temporal subset of the video. Each region yields a pooling weight map and is represented as a Fisher vector computed from the soft-weighted contributions of all dense trajectories evolving in it. We further represent each video through a graph structure, defined over multiple granularities of spatio-temporal subsets. The graph structures extracted from all videos are compared with an efficient graph matching kernel.

International Conference on Multimedia and Expo | 2015

Efficient Hough forest object detection for low-power devices

Andrea Ciolini; Lorenzo Seidenari; Svebor Karaman; Alberto Del Bimbo

An important task in computer vision is object localization and recognition within images and video. Achieving real-time object localization and recognition on low-power devices is especially relevant in the context of wearable technologies. Indeed, wearable devices have a reduced size and cost and limited computational power leading to a challenging scenario for classical computer vision algorithms. This paper improves the Hough Forest approach with several contributions: a faster computation of the features and a faster evaluation of the learned model with minimal loss in accuracy. Our method is characterized by a low computational requirement and allows real-time detection on a wearable device.

Computer Vision and Pattern Recognition | 2015

MuseumVisitors: A dataset for pedestrian and group detection, gaze estimation and behavior understanding

Federico Bartoli; Giuseppe Lisanti; Lorenzo Seidenari; Svebor Karaman; Alberto Del Bimbo

In this paper we describe a new dataset, under construction, acquired inside the National Museum of Bargello in Florence. It was recorded with three IP cameras at a resolution of 1280 × 800 pixels and an average framerate of five frames per second. Sequences were recorded following two scenarios. The first scenario consists of visitors watching different artworks (individuals), while the second one consists of groups of visitors watching the same artworks (groups). This dataset is specifically designed to support research on group detection, occlusion handling, tracking, re-identification and behavior analysis. In order to ease the annotation process we designed a user-friendly web interface for annotating bounding boxes, occlusion area, body orientation and head gaze, group belonging, and artwork under observation. We provide a comparison with other existing datasets that have group and occlusion annotations. In order to assess the difficulties of this dataset we have also performed some tests exploiting seven representative state-of-the-art pedestrian detectors.

International Conference on Image Analysis and Processing | 2013

Passive Profiling and Natural Interaction Metaphors for Personalized Multimedia Museum Experiences

Svebor Karaman; Andrew D. Bagdanov; Gianpaolo D'Amico; Lea Landucci; Andrea Ferracani; Daniele Pezzatini; Alberto Del Bimbo

Museums must balance the amount of information given on individual pieces or exhibitions in order to provide sufficient information to aid visitor understanding. At the same time they must avoid cluttering the environment and reducing the enjoyment of the exhibit. Moreover, each visitor has different interests and each might prefer more (or less) information on different artworks depending on their individual profile of interest. Finally, visiting a museum should not be a closed experience but a door opened onto a broader context of related artworks, authors, artistic trends, etc. In this paper we describe the MNEMOSYNE system that attempts to provide such a museum experience. Based on passive observation of visitors, the system builds a profile of the artworks of interest for each visitor. These profiles of interest are then used to personalize content delivery on an interactive table. The natural user interface on the interactive table uses the visitor's profile, a museum content ontology and a recommendation system to personalize the user's exploration of available multimedia content. At the end of their visit, the visitor can take home a personalized summary of their visit on a custom mobile application. In this article we describe each component of our approach as well as the first field trials of our prototype system built and deployed at our permanent exhibition space at Le Murate in the city of Florence.

International Conference on Pattern Recognition | 2014

Unsupervised Scene Adaptation for Faster Multi-scale Pedestrian Detection

Federico Bartoli; Giuseppe Lisanti; Svebor Karaman; Andrew D. Bagdanov; Alberto Del Bimbo

In this paper we describe an approach to automatically improving the efficiency of soft cascade-based person detectors. Our technique addresses the two fundamental bottlenecks in cascade detectors: the number of weak classifiers that need to be evaluated in each cascade, and the total number of detection windows to be evaluated. By simply observing a soft cascade operating on a scene, we learn scale-specific linear approximations of cascade traces that allow us to eliminate a large fraction of the classifier evaluations. Independently, this time by observing regions of support in the soft cascade on a training set, we learn a coarse geometric model of the scene that allows our detector to propose candidate detection windows and significantly reduce the number of windows run through the cascade. Our approaches are unsupervised and require no additional labeled person images for learning. Our linear cascade approximation results in about 28% savings in detection, while our geometric model gives a saving of over 95%, without appreciable loss of accuracy.
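
For readers unfamiliar with the first bottleneck named above, the following is a minimal sketch of how a generic soft cascade evaluates one detection window. It shows only the baseline early-rejection loop, not the linear trace approximation or the geometric model described in the abstract. The function name `soft_cascade`, the toy weak classifiers and the thresholds are all hypothetical.

```python
from typing import Callable, Sequence, Tuple


def soft_cascade(window: float,
                 weak_classifiers: Sequence[Callable[[float], float]],
                 rejection_thresholds: Sequence[float]) -> Tuple[bool, float, int]:
    """Evaluate one window; return (accepted, score, n_evaluated).

    The running sum of weak classifier responses (the cascade trace) is
    compared with a per-stage rejection threshold, and evaluation stops
    as soon as the trace drops below it.
    """
    score = 0.0
    n_evaluated = 0
    for weak, threshold in zip(weak_classifiers, rejection_thresholds):
        score += weak(window)
        n_evaluated += 1
        if score < threshold:
            return False, score, n_evaluated  # rejected early
    return True, score, n_evaluated


# Toy setup (hypothetical): a "window" is a single number and the weak
# classifiers are simple functions of it.
weak = [lambda x: x, lambda x: -2.0 * x, lambda x: x]
thresholds = [-0.5, -0.5, -0.5]

for window in (0.0, 1.0, -1.0):
    print(window, soft_cascade(window, weak, thresholds))
```

The third element of the result counts how many weak classifiers were actually evaluated, which is the quantity the abstract's first contribution aims to reduce.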

Computer Vision and Pattern Recognition | 2017

Learning Discriminative and Transformation Covariant Local Feature Detectors

Xu Zhang; Felix X. Yu; Svebor Karaman; Shih-Fu Chang

Robust covariant local feature detectors are important for detecting local features that are (1) discriminative of the image content and (2) can be repeatably detected at consistent locations when the image undergoes diverse transformations. Such detectors are critical for applications such as image search and scene reconstruction. Many learning-based local feature detectors address one of these two problems while overlooking the other. In this work, we propose a novel learning-based method to simultaneously address both issues. Specifically, we extend the covariant constraint proposed by Lenc and Vedaldi [8] by defining the concepts of standard patch and canonical feature and leverage these to train a novel robust covariant detector. We show that the introduction of these concepts greatly simplifies the learning stage of the covariant detector, and also makes the detector much more robust. Extensive experiments show that our method outperforms previous hand-crafted and learning-based detectors by large margins in terms of repeatability.

Person Re-Identification | 2014

From Re-identification to Identity Inference: Labeling Consistency by Local Similarity Constraints

Svebor Karaman; Giuseppe Lisanti; Andrew D. Bagdanov; Alberto Del Bimbo

In this chapter, we introduce the problem of identity inference as a generalization of person re-identification. It is most appropriate to distinguish identity inference from re-identification in situations where a large number of observations must be identified without knowing a priori that groups of test images represent the same individual. The standard single- and multishot person re-identification common in the literature are special cases of our formulation. We present an approach to solving identity inference by modeling it as a labeling problem in a Conditional Random Field (CRF). The CRF model ensures that the final labeling gives similar labels to detections that are similar in feature space. Experimental results are given on the ETHZ, i-LIDS and CAVIAR datasets. Our approach yields state-of-the-art performance for multishot re-identification, and our results on the more general identity inference problem demonstrate that we are able to infer the identity of very many examples even with very few labeled images in the gallery.

International Conference on Image Analysis and Processing | 2013

Multi-target Data Association Using Sparse Reconstruction

Andrew D. Bagdanov; Alberto Del Bimbo; Dario Di Fina; Svebor Karaman; Giuseppe Lisanti; Iacopo Masi

In this paper we describe a solution to the multi-target data association problem based on l1-regularized sparse basis expansions. Assuming we have sufficient training samples per subject, our idea is to create a discriminative basis of observations that we can use to reconstruct and associate a new target. The use of l1-regularized basis expansions allows our approach to exploit multiple instances of the target when performing data association rather than relying on an average representation of target appearance. Preliminary experimental results on the PETS dataset are encouraging and demonstrate that our approach is accurate and efficient.
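
A rough sketch of association by l1-regularized reconstruction follows. The abstract does not specify the solver or the decision rule, so two assumptions are made here: the sparse code is computed with plain ISTA (iterative soft thresholding), and the new observation is assigned to the subject whose atoms alone reconstruct it with the smallest residual. The names `ista` and `associate`, the value of `lam` and the toy basis are hypothetical.

```python
from typing import List


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(atoms: List[List[float]], y: List[float],
         lam: float = 0.05, iters: int = 500) -> List[float]:
    """Minimize 0.5 * ||y - sum_j x_j * atoms[j]||^2 + lam * ||x||_1 with ISTA."""
    k, d = len(atoms), len(y)
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the gradient.
    lipschitz = sum(a * a for atom in atoms for a in atom)
    x = [0.0] * k
    for _ in range(iters):
        recon = [sum(x[j] * atoms[j][i] for j in range(k)) for i in range(d)]
        resid = [r - yi for r, yi in zip(recon, y)]
        grad = [sum(atoms[j][i] * resid[i] for i in range(d)) for j in range(k)]
        x = [soft_threshold(x[j] - grad[j] / lipschitz, lam / lipschitz)
             for j in range(k)]
    return x


def associate(atoms: List[List[float]], labels: List[str],
              y: List[float], lam: float = 0.05) -> str:
    """Assign y to the subject whose atoms give the smallest reconstruction error."""
    x = ista(atoms, y, lam)
    best_label, best_err = None, float("inf")
    for label in sorted(set(labels)):
        # Reconstruct using only the coefficients that belong to this subject.
        recon = [sum(x[j] * atoms[j][i] for j in range(len(atoms)) if labels[j] == label)
                 for i in range(len(y))]
        err = sum((r - yi) ** 2 for r, yi in zip(recon, y))
        if err < best_err:
            best_label, best_err = label, err
    return best_label


# Toy basis (hypothetical): two training observations per subject.
atoms = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["a", "a", "b", "b"]
print(associate(atoms, labels, [0.95, 0.05, 0.0]))  # a
print(associate(atoms, labels, [0.0, 0.95, 0.05]))  # b
```

Because every training observation of a subject is kept as its own atom, the reconstruction can draw on several instances of the target rather than a single averaged appearance, which is the property the abstract highlights.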
