Slawomir Bak
French Institute for Research in Computer Science and Automation
Publication
Featured research published by Slawomir Bak.
advanced video and signal based surveillance | 2011
Slawomir Bak; Etienne Corvee; Francois Bremond; Monique Thonnat
Human re-identification is the task of determining whether a given individual has already appeared over a network of cameras. This problem is made particularly hard by significant appearance changes across different camera views. In order to re-identify people, a human signature should handle differences in illumination, pose and camera parameters. We propose a new appearance model that combines information from multiple images to obtain a highly discriminative human signature, called Mean Riemannian Covariance Grid (MRCG). The method is evaluated and compared with the state of the art using benchmark video sequences from the ETHZ and the i-LIDS datasets. We demonstrate that the proposed approach outperforms state-of-the-art methods. Finally, the results of our approach are shown on two other more pertinent datasets.
ieee international conference on automatic face gesture recognition | 2013
Piotr Tadeusz Bilinski; Etienne Corvee; Slawomir Bak; Francois Bremond
This paper addresses the problem of recognizing human actions in video sequences for home care applications. Recent studies have shown that approaches which use a bag-of-words representation reach high action recognition accuracy. Unfortunately, these approaches have difficulty discriminating similar actions because they ignore the spatial information of features. As we focus on recognizing subtle differences in the behaviour of patients, we propose a novel method which significantly enhances the discriminative properties of the bag-of-words technique. Our approach is based on a dynamic coordinate system, which introduces spatial information to the bag-of-words model by computing relative tracklets. We perform an extensive evaluation of our approach on three datasets: the popular KTH dataset, the challenging ADL dataset and our own collected Hospital dataset. Experiments show that our representation enhances the discriminative power of features and of the bag-of-words model, bringing significant improvements in action recognition performance.
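As a rough illustration of what "relative tracklets" could mean, the sketch below re-expresses tracklet points in a coordinate system whose origin moves from frame to frame. The abstract does not say how the origin is chosen, so the per-frame reference points (`origins`) and the function name `relative_tracklet` are assumptions made for this example only.

```python
def relative_tracklet(tracklet, origins):
    """Express each tracklet point relative to a per-frame origin.

    tracklet: list of (x, y) image positions, one per frame.
    origins:  list of (x, y) reference points, one per frame (hypothetical;
              for instance a tracked body part).
    """
    if len(tracklet) != len(origins):
        raise ValueError("tracklet and origins must cover the same frames")
    # Subtract the moving origin so the result encodes position
    # relative to the reference point rather than to the image corner.
    return [(x - ox, y - oy) for (x, y), (ox, oy) in zip(tracklet, origins)]


tracklet = [(10, 20), (12, 21)]
origins = [(8, 18), (11, 19)]
print(relative_tracklet(tracklet, origins))
```

Because the origin moves with the subject, two similar local motions performed at different positions relative to the reference point yield different relative tracklets, which is the kind of spatial cue a plain bag-of-words histogram discards.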
advanced video and signal based surveillance | 2014
Slawomir Bak; Sofia Zaidenberg; Bernard Boulay; Francois Bremond
Re-identifying people in a network of cameras requires an invariant human representation. State-of-the-art algorithms are likely to fail in real-world scenarios due to serious perspective changes. Most existing approaches focus on invariant and discriminative features, while ignoring the body alignment issue. In this paper we propose three methods for improving the performance of person re-identification. We focus on eliminating perspective distortions by using 3D scene information. (1) Perspective changes are minimized by affine transformations of cropped images containing the target. Further, we estimate the human pose for (2) clustering data from a video stream and (3) weighting image features. The pose is estimated using 3D scene information and the motion of the target. We validated our approach on a publicly available dataset with a network of 8 cameras. The results demonstrated a significant increase in re-identification performance over the state of the art.
workshop on applications of computer vision | 2014
Slawomir Bak; Ratnesh Kumar; Francois Bremond
This paper introduces an image region descriptor and applies it to the problem of appearance matching. The proposed descriptor can be seen as a natural extension of covariance. Driven by recent studies in mathematical statistics related to Brownian motion, we design the Brownian descriptor. In contrast to the classical covariance descriptor, which measures the degree of linear relationship between features, our novel descriptor measures the degree of all kinds of possible relationships between features. We argue that the proposed covariance is a richer descriptor than the classical covariance, especially when fusing non-linearly dependent features. We evaluate our approach on tracking-related applications, demonstrating that the Brownian descriptor outperforms the classical covariance in terms of matching accuracy and efficiency.
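The contrast between linear and general dependence can be illustrated with the sample distance covariance statistic from the statistics literature, which is the quantity usually meant by Brownian covariance. The sketch below works on scalar samples only and is not the authors' descriptor; the function names are chosen for this example.

```python
def _double_centered_distances(values):
    """Pairwise |v_i - v_j| matrix with row, column and grand means removed."""
    n = len(values)
    dist = [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in dist]
    grand_mean = sum(row_mean) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[dist[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_covariance_sq(x, y):
    """Squared sample distance covariance of two equally long scalar samples."""
    n = len(x)
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)


def classical_covariance(x, y):
    """Population-style covariance (divides by n)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n


x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]  # y depends on x, but not linearly
print(classical_covariance(x, y))    # zero: no linear relationship detected
print(distance_covariance_sq(x, y))  # strictly positive: dependence detected
```

Here `y` is fully determined by `x`, yet the classical covariance vanishes because the relationship is symmetric around zero, while the distance covariance stays positive.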
international conference on image processing | 2012
Slawomir Bak; Duc Phu Chau; Julien Badie; Etienne Corvee; Francois Bremond; Monique Thonnat
This paper addresses the problem of multi-target tracking in crowded scenes from a single camera. We propose an algorithm for learning discriminative appearance models for different targets. These appearance models are based on a covariance descriptor extracted from tracklets given by a short-term tracking algorithm. Short-term tracking relies on object descriptors tuned by a controller which copes with context variation over time. We link tracklets by using discriminative analysis on a Riemannian manifold. Our evaluation shows that by applying this discriminative analysis, we can reduce false alarms and identity switches, not only for tracking in a single camera but also for matching object appearances between non-overlapping cameras.
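For readers unfamiliar with covariance descriptors, the following minimal sketch computes one for an image region: each pixel contributes a feature vector, and the region is summarized by the sample covariance matrix of those vectors. The choice of features (here two made-up channels) is an assumption for illustration; the abstract does not list the features used.

```python
def covariance_descriptor(features):
    """Sample covariance matrix of per-pixel feature vectors.

    features: list of equally long lists, one feature vector per pixel.
    Returns a d x d matrix as nested lists (d = feature dimension).
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    d = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for f in features:
        dev = [f[k] - mean[k] for k in range(d)]
        # Accumulate the outer product of the deviation with itself.
        for i in range(d):
            for j in range(d):
                cov[i][j] += dev[i] * dev[j]
    return [[c / (n - 1) for c in row] for row in cov]


# Three pixels with two hypothetical feature channels each.
region = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
print(covariance_descriptor(region))
```

The descriptor has a fixed size regardless of how many pixels the region contains, which is one reason it is convenient for comparing regions of different sizes.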
advanced video and signal based surveillance | 2012
Julien Badie; Slawomir Bak; Silviu-Tudor Serban; Francois Bremond
This paper presents a new approach for tracking multiple persons in a single camera. This approach focuses on recovering tracked individuals that have been lost and are detected again, after being misdetected (e.g. occluded) or after leaving the scene and coming back. In order to correct tracking errors, a multi-camera re-identification method is adapted, with a real-time constraint. The proposed approach uses a highly discriminative human signature based on a covariance matrix, improved using background subtraction, and a people detection confidence. The problem of linking several tracklets belonging to the same individual is also handled as a ranking problem using a learned parameter. The objective is to create clusters of tracklets describing the same individual. The evaluation is performed on the PETS2009 dataset, showing promising results.
advanced video and signal based surveillance | 2014
Piotr Tadeusz Bilinski; Michal Koperski; Slawomir Bak; Francois Bremond
This paper addresses the problem of recognizing human actions in video sequences. Recent studies have shown that methods which use bag-of-features and space-time features achieve high recognition accuracy. Such methods extract both appearance-based and motion-based features. This paper focuses only on appearance features. We propose to model relationships between different pixel-level appearance features such as intensity and gradient using Brownian covariance, which is a natural extension of the classical covariance measure. While classical covariance can model only linear relationships, Brownian covariance models all kinds of possible relationships. We propose a method to compute Brownian covariance on a space-time volume of a video sequence. We show that the proposed Video Brownian Covariance (VBC) descriptor carries complementary information to the Histogram of Oriented Gradients (HOG) descriptor. The fusion of these two descriptors gives a significant improvement in performance on three challenging action recognition datasets.
electronic imaging | 2015
Slawomir Bak; Filipe Martins; Francois Bremond
The person re-identification problem is a well-known retrieval task that requires finding a person of interest in a network of cameras. In a real-world scenario, state-of-the-art algorithms are likely to fail due to serious perspective and pose changes as well as variations in lighting conditions across the camera network. The most effective approaches try to cope with all these changes by applying metric learning tools to find a transfer function between a camera pair. Unfortunately, this transfer function is usually dependent on the camera pair and requires labeled training data for each camera. This might be unattainable in a large camera network. In this paper, instead of learning the transfer function that addresses all appearance changes, we propose to learn a generic metric pool that only focuses on pose changes. This pool consists of metrics, each one learned to match a specific pair of poses. Automatically estimated poses determine the proper metric, thus improving matching. We show that metrics learned using a single camera improve the matching across the whole camera network, providing a scalable solution. We validated our approach on a publicly available dataset, demonstrating an increase in re-identification performance.
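The idea of a pose-indexed metric pool can be sketched as a lookup table from pose pairs to distance metrics. Everything concrete below is hypothetical: the pose labels, the two-dimensional signatures, and the hand-written matrices stand in for metrics that would be learned from data, and the Mahalanobis form is an assumption rather than something the abstract states.

```python
def mahalanobis_sq(x, y, metric):
    """Squared Mahalanobis-style distance (x - y)^T M (x - y)."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    d = len(diff)
    return sum(diff[i] * metric[i][j] * diff[j]
               for i in range(d) for j in range(d))


# Hypothetical pool: one metric per pair of pose labels.
METRIC_POOL = {
    ("front", "front"): [[1.0, 0.0], [0.0, 1.0]],
    ("front", "side"): [[2.0, 0.0], [0.0, 0.5]],
}


def match_distance(sig_a, pose_a, sig_b, pose_b, pool=METRIC_POOL):
    """Pick the metric for the estimated pose pair, then compare signatures."""
    key = (pose_a, pose_b) if (pose_a, pose_b) in pool else (pose_b, pose_a)
    if key not in pool:
        raise KeyError(f"no metric learned for poses {pose_a!r}/{pose_b!r}")
    return mahalanobis_sq(sig_a, sig_b, pool[key])


print(match_distance([1.0, 0.0], "front", [0.0, 0.0], "front"))  # 1.0
print(match_distance([1.0, 0.0], "front", [0.0, 0.0], "side"))   # 2.0
```

The pool is keyed by poses rather than by camera pairs, so its size depends on the number of pose categories and not on the number of cameras, which is what makes the approach scale to large networks.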
Journal of Electronic Imaging | 2015
Slawomir Bak; Francois Bremond
The person reidentification task applied in a real-world scenario is addressed. Finding people in a network of cameras is challenging due to significant variations in lighting conditions, different color responses, and different camera viewpoints. State-of-the-art algorithms are likely to fail due to serious perspective and pose changes. Most of the existing approaches try to cope with all these changes by applying metric learning tools to find a transfer function between a camera pair while ignoring the body alignment issue. Additionally, this transfer function usually depends on the camera pair and requires labeled training data for each camera. This might be unattainable in a large camera network. We employ three-dimensional scene information for minimizing perspective distortions and estimating the target pose. The estimated pose is further used for splitting a target trajectory into reliable chunks, each one with a uniform pose. These chunks are matched through a network of cameras using a previously learned metric pool. However, instead of learning transfer functions that cope with all appearance variations, we propose to learn a generic metric pool that only focuses on pose changes. This pool consists of metrics, each one learned to match a specific pair of poses and not limited to a specific camera pair. Automatically estimated poses determine the proper metric, thus improving matching. We show that metrics learned using only a single camera can significantly improve the matching across the whole camera network, providing a scalable solution. We validated our approach on publicly available datasets, demonstrating an increase in reidentification performance.
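Splitting a trajectory into chunks of uniform pose can be sketched with a run-length grouping over per-frame pose labels. The labels and the `(frame_index, pose)` input layout are assumptions for this example; the abstract does not describe how poses are discretized or how unreliable frames are filtered.

```python
from itertools import groupby


def split_into_pose_chunks(trajectory):
    """Group consecutive frames that share the same estimated pose.

    trajectory: list of (frame_index, pose_label) in temporal order.
    Returns a list of (pose_label, [frame_index, ...]) chunks.
    """
    return [
        (pose, [frame for frame, _ in run])
        for pose, run in groupby(trajectory, key=lambda item: item[1])
    ]


trajectory = [(0, "front"), (1, "front"), (2, "side"), (3, "side"), (4, "front")]
for pose, frames in split_into_pose_chunks(trajectory):
    print(pose, frames)
```

Note that `groupby` only merges adjacent frames, so a pose that reappears later starts a new chunk, which matches the notion of a chunk as a contiguous part of the trajectory.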
international conference on multimedia communications | 2011
Slawomir Bak; Krzysztof Kurowski; Krystyna Napierala
The paper presents a new approach to the human reidentification problem using covariance features. In many cases a distance operator between signatures based on generalized eigenvalues has to be computed efficiently, especially when a real-time response is expected from the system. This is a challenging problem, as many procedures are computationally intensive and must be repeated constantly. To deal with this problem we have successfully designed and tested a new video surveillance system. To obtain the required high efficiency we took advantage of highly parallel computing architectures such as FPGA, GPU and CPU units to perform calculations. However, we had to propose a new GPU-based implementation of the distance operator for querying the example database. In this paper we present an experimental evaluation of the proposed solution in terms of the database response time as a function of database size.
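A distance between covariance matrices based on generalized eigenvalues is commonly written as the square root of the sum of squared logarithms of the generalized eigenvalues of the matrix pair. The sketch below assumes that form (the abstract does not spell it out) and restricts itself to 2x2 symmetric positive-definite matrices, where the eigenvalues follow from a quadratic equation. It is a CPU illustration only, not the GPU implementation the paper describes.

```python
import math


def generalized_eigenvalues_2x2(a, b):
    """Solve det(A - lambda * B) = 0 for symmetric positive-definite 2x2 A, B."""
    det_a = a[0][0] * a[1][1] - a[0][1] ** 2
    det_b = b[0][0] * b[1][1] - b[0][1] ** 2
    # Expanding the determinant gives det_b * l^2 - mid * l + det_a = 0.
    mid = a[0][0] * b[1][1] + a[1][1] * b[0][0] - 2.0 * a[0][1] * b[0][1]
    # Clamp tiny negative values caused by floating-point rounding.
    root = math.sqrt(max(mid * mid - 4.0 * det_a * det_b, 0.0))
    return ((mid - root) / (2.0 * det_b), (mid + root) / (2.0 * det_b))


def covariance_distance(a, b):
    """sqrt(sum(ln^2 lambda_i)) over the generalized eigenvalues of (A, B)."""
    return math.sqrt(sum(math.log(lam) ** 2
                         for lam in generalized_eigenvalues_2x2(a, b)))


A = [[2.0, 0.5], [0.5, 1.0]]
B = [[1.0, 0.0], [0.0, 3.0]]
print(covariance_distance(A, A))  # identical matrices give zero distance
print(covariance_distance(A, B))
```

Each database query repeats this computation against every stored signature, and for realistic feature dimensions the eigenvalue problem is far more expensive than in the 2x2 case, which is why the response time grows with database size and why a parallel implementation is attractive.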