Slobodan Ilic
Siemens
Publication
Featured research published by Slobodan Ilic.
Computer Vision and Pattern Recognition | 2010
Bertram Drost; Markus Ulrich; Nassir Navab; Slobodan Ilic
This paper addresses the problem of recognizing free-form 3D objects in point clouds. Compared to traditional approaches based on point descriptors, which depend on local information around points, we propose a novel method that creates a global model description based on oriented point pair features and matches that model locally using a fast voting scheme. The global model description consists of all model point pair features and represents a mapping from the point pair feature space to the model, where similar features on the model are grouped together. Such a representation allows using much sparser object and scene point clouds, resulting in very fast performance. Recognition is done locally using an efficient voting scheme on a reduced two-dimensional search space. We demonstrate the efficiency of our approach and show its high recognition performance in the presence of noise, clutter and partial occlusions. Compared to state-of-the-art approaches we achieve better recognition rates, and demonstrate that with little or no sacrifice in recognition performance our method is much faster than the current state of the art.
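As an illustration of the core data structure, the sketch below (not the authors' code; the quantization steps are illustrative choices) computes the four-dimensional oriented point pair feature and groups all model point pairs by their quantized feature, i.e. the global model description that the voting scheme later queries.

```python
# Minimal sketch of oriented point pair features and the global model description.
# Assumes unit-length normals; dist_step and angle_step are illustrative parameters.
import numpy as np
from collections import defaultdict

def point_pair_feature(p1, n1, p2, n2):
    """F(m1, m2) = (||d||, angle(n1, d), angle(n2, d), angle(n1, n2))."""
    d = p2 - p1
    dist = np.linalg.norm(d)
    d_unit = d / (dist + 1e-12)
    angle = lambda a, b: np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
    return dist, angle(n1, d_unit), angle(n2, d_unit), angle(n1, n2)

def build_model_description(points, normals, dist_step=0.01, angle_step=np.deg2rad(12)):
    """Hash table mapping quantized features to the model point pairs that produce them."""
    table = defaultdict(list)
    for i, (pi, ni) in enumerate(zip(points, normals)):
        for j, (pj, nj) in enumerate(zip(points, normals)):
            if i == j:
                continue
            f = point_pair_feature(pi, ni, pj, nj)
            key = (int(f[0] / dist_step), int(f[1] / angle_step),
                   int(f[2] / angle_step), int(f[3] / angle_step))
            table[key].append((i, j))
    return table
```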
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012
Stefan Hinterstoisser; Cedric Cagniart; Slobodan Ilic; Peter F. Sturm; Nassir Navab; Pascal Fua; Vincent Lepetit
We present a method for real-time 3D object instance detection that does not require a time-consuming training stage, and can handle untextured objects. At its core, our approach is a novel image representation for template matching designed to be robust to small image transformations. This robustness is based on spread image gradient orientations and allows us to test only a small subset of all possible pixel locations when parsing the image, and to represent a 3D object with a limited set of templates. In addition, we demonstrate that if a dense depth sensor is available, we can extend our approach to achieve even better performance by also taking 3D surface normal orientations into account. We show how to take advantage of the architecture of modern computers to build an efficient but very discriminative representation of the input images that can be used to consider thousands of templates in real time. We demonstrate in many experiments on real data that our method is much faster and more robust with respect to background clutter than current state-of-the-art methods.
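The sketch below shows one plausible reading of the "spread gradient orientations" idea (a minimal sketch, not the authors' implementation): gradient orientations are quantized into a small number of bins encoded as bits, and each pixel's bitmask is OR-ed over a local neighborhood so that a template orientation matches anywhere within that radius. The bin count, magnitude threshold and radius are illustrative.

```python
# Minimal sketch: quantize gradient orientations to bit masks, then spread them locally.
import numpy as np

def quantize_gradients(gx, gy, n_bins=8, mag_thresh=10.0):
    """One bit per orientation bin (orientation taken modulo 180 degrees)."""
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(np.uint8), n_bins - 1)
    masks = (1 << bins).astype(np.uint8)
    masks[mag < mag_thresh] = 0          # drop weak gradients
    return masks

def spread_orientations(quantized, radius=2):
    """OR each pixel's bitmask over a (2*radius+1)^2 neighborhood."""
    h, w = quantized.shape
    spread = quantized.copy()
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            src_y = slice(max(0, -dy), min(h, h - dy))
            src_x = slice(max(0, -dx), min(w, w - dx))
            dst_y = slice(max(0, dy), min(h, h + dy))
            dst_x = slice(max(0, dx), min(w, w + dx))
            spread[dst_y, dst_x] |= quantized[src_y, src_x]
    return spread
```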
Asian Conference on Computer Vision | 2012
Stefan Hinterstoisser; Vincent Lepetit; Slobodan Ilic; Stefan Johannes Josef Holzer; Gary R. Bradski; Kurt Konolige; Nassir Navab
We propose a framework for automatic modeling, detection, and tracking of 3D objects with a Kinect. The detection part is mainly based on the recent template-based LINEMOD approach [1] for object detection. We show how to build the templates automatically from 3D models, and how to estimate the 6 degrees-of-freedom pose accurately and in real time. The pose estimation and the color information allow us to check the detection hypotheses and improve the correct detection rate by 13% with respect to the original LINEMOD. These improvements make our framework suitable for object manipulation in robotics applications. Moreover, we propose a new dataset made of 15 registered video sequences of 1100+ frames each, covering 15 different objects, for the evaluation of future competing methods.
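Building templates automatically from a 3D model amounts to rendering it from viewpoints spread over a view sphere. The sketch below is a minimal stand-in for that sampling step only; the Fibonacci spiral, the viewpoint count and the radius are illustrative assumptions, not necessarily the sampling scheme used in the paper.

```python
# Minimal sketch: roughly uniform camera positions on a sphere around the object,
# from which the 3D model would be rendered to extract one template per view.
import numpy as np

def fibonacci_sphere_viewpoints(n=162, radius=0.6):
    """Return n camera positions on a sphere of the given radius (metres, illustrative)."""
    i = np.arange(n)
    phi = np.arccos(1.0 - 2.0 * (i + 0.5) / n)   # polar angle
    theta = np.pi * (1.0 + 5 ** 0.5) * i         # golden-angle increments around the axis
    x = radius * np.sin(phi) * np.cos(theta)
    y = radius * np.sin(phi) * np.sin(theta)
    z = radius * np.cos(phi)
    return np.stack([x, y, z], axis=1)
```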
Computer Vision and Pattern Recognition | 2010
Stefan Hinterstoisser; Vincent Lepetit; Slobodan Ilic; Pascal Fua; Nassir Navab
We present a method for real-time 3D object detection that does not require a time-consuming training stage, and can handle untextured objects. At its core is a novel template representation that is designed to be robust to small image transformations. This robustness, based on dominant gradient orientations, lets us test only a small subset of all possible pixel locations when parsing the image, and represent a 3D object with a limited set of templates. We show that, together with a binary representation that makes evaluation very fast and a branch-and-bound approach to efficiently scan the image, it can detect untextured objects in complex situations and provide their 3D pose in real time.
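One way to see why the binary representation makes evaluation fast: each template feature is just an offset plus one orientation bit, and scoring reduces to bit tests against the image's orientation bitmasks. The sketch below is a minimal, illustrative version of such a scoring loop (names and the scoring normalization are assumptions, not the paper's exact formulation).

```python
# Minimal sketch: score a template of (offset, orientation-bit) features at location (x, y)
# against a per-pixel orientation bitmask image.
import numpy as np

def score_template(bitmask_image, template, x, y):
    """template: iterable of (dx, dy, orientation_bit); bitmask_image: (H, W) uint8."""
    h, w = bitmask_image.shape
    hits = 0
    for dx, dy, bit in template:
        u, v = x + dx, y + dy
        if 0 <= v < h and 0 <= u < w and (bitmask_image[v, u] & bit):
            hits += 1
    return hits / float(len(template))   # fraction of matched features
```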
International Conference on Computer Vision | 2011
Stefan Hinterstoisser; Stefan Johannes Josef Holzer; Cedric Cagniart; Slobodan Ilic; Kurt Konolige; Nassir Navab; Vincent Lepetit
We present a method for detecting 3D objects using multiple modalities. While it is generic, we demonstrate it on the combination of an image and a dense depth map, which give complementary object information. It works in real time, under heavy clutter, does not require a time-consuming training stage, and can handle untextured objects. It is based on an efficient representation of templates that capture the different modalities, and we show in many experiments on commodity hardware that our approach significantly outperforms state-of-the-art methods on single modalities.
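The depth modality has to be turned into a quantized cue before it can be stored in a template alongside image gradients. A minimal sketch of one plausible pipeline is given below, assuming normals estimated from depth-map gradients and binned against a fixed set of reference directions; the bin count and the cone angle of the reference directions are illustrative.

```python
# Minimal sketch: per-pixel surface normals from a depth map, quantized into bins.
import numpy as np

def depth_to_quantized_normals(depth, n_bins=8):
    """depth: (H, W) float array in metres; returns per-pixel normal-bin indices."""
    dzdx = np.gradient(depth, axis=1)
    dzdy = np.gradient(depth, axis=0)
    normals = np.dstack([-dzdx, -dzdy, np.ones_like(depth)])
    normals /= np.linalg.norm(normals, axis=2, keepdims=True)
    # reference directions on a cone around the viewing axis (illustrative choice)
    angles = 2 * np.pi * np.arange(n_bins) / n_bins
    refs = np.stack([np.cos(angles) * np.sin(np.pi / 4),
                     np.sin(angles) * np.sin(np.pi / 4),
                     np.full(n_bins, np.cos(np.pi / 4))], axis=1)
    dots = normals.reshape(-1, 3) @ refs.T
    return np.argmax(dots, axis=1).reshape(depth.shape)
```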
European Conference on Computer Vision | 2010
Cedric Cagniart; Edmond Boyer; Slobodan Ilic
In this paper, we address the problem of tracking the temporal evolution of arbitrary shapes observed in multi-camera setups. This is motivated by the ever-growing number of applications that require consistent shape information along temporal sequences. The approach we propose considers a temporal sequence of independently reconstructed surfaces and iteratively deforms a reference mesh to fit these observations. To effectively cope with outlying and missing geometry, we introduce a novel probabilistic mesh deformation framework. Using generic local rigidity priors and accounting for the uncertainty in the data acquisition process, this framework effectively handles missing data, relatively large reconstruction artefacts and multiple objects. Extensive experiments demonstrate the effectiveness and robustness of the method on various 4D datasets.
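The way outliers are typically absorbed in such probabilistic frameworks is by soft assignments with an explicit outlier class. The sketch below is a minimal, illustrative E-step of that kind, assuming a Gaussian noise model around per-patch reference positions plus a uniform outlier component; sigma and outlier_weight are assumed parameters, not values from the paper.

```python
# Minimal sketch: responsibilities of observed points over mesh patches plus an outlier class.
import numpy as np

def soft_assignments(obs_points, patch_centers, sigma=0.01, outlier_weight=0.1):
    """obs_points: (N, 3), patch_centers: (K, 3); returns (N, K+1) responsibilities."""
    d2 = ((obs_points[:, None, :] - patch_centers[None, :, :]) ** 2).sum(-1)
    lik = np.exp(-0.5 * d2 / sigma ** 2)                       # Gaussian likelihoods
    lik = np.hstack([lik, np.full((len(obs_points), 1), outlier_weight)])
    return lik / lik.sum(axis=1, keepdims=True)                # last column = outlier class
```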
Computer Vision and Pattern Recognition | 2004
Miodrag Dimitrijevic; Slobodan Ilic; Pascal Fua
We propose a face reconstruction technique that produces models that not only look good when texture mapped, but are also metrically accurate. Our method is designed to work with short uncalibrated video or movie sequences, even when the lighting is poor, resulting in specularities and shadows that complicate the algorithm's task. Our approach relies on optimizing the shape parameters of a sophisticated PCA-based model given pairwise image correspondences as input. All that is required is enough relative motion between camera and subject so that we can derive structure from motion. By matching the results against laser scanning data, we will show that its precision is excellent and can be predicted as a function of the number and quality of the correspondences. This is important if one wishes to obtain the appropriate compromise between processing speed and quality of the results. Furthermore, our method is in fact not specific to faces and could equally be applied to any shape for which a shape model controlled by a relatively small number of parameters exists.
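To make the "optimize the shape parameters of a PCA-based model" step concrete, here is a minimal sketch under simplifying assumptions: a linear model x = mean + B·alpha, a known subset of model vertices constrained by reconstructed 3D points, and a regularized linear least-squares solve acting as the shape prior. The function and parameter names are illustrative, and the paper's actual optimization is more involved.

```python
# Minimal sketch: fit PCA shape coefficients to 3D point constraints on known vertices.
import numpy as np

def fit_pca_shape(mean_shape, basis, eigvals, vertex_ids, target_points, reg=1.0):
    """
    mean_shape: (3V,) stacked vertex coordinates, basis: (3V, K) PCA modes,
    vertex_ids: indices of constrained vertices, target_points: (M, 3) observed positions.
    """
    rows = np.concatenate([[3 * v, 3 * v + 1, 3 * v + 2] for v in vertex_ids])
    A = basis[rows]                                    # (3M, K)
    b = target_points.reshape(-1) - mean_shape[rows]   # residual w.r.t. the mean shape
    # Tikhonov regularization weighted by inverse eigenvalues (Mahalanobis-style prior)
    AtA = A.T @ A + reg * np.diag(1.0 / eigvals)
    alpha = np.linalg.solve(AtA, A.T @ b)
    return (mean_shape + basis @ alpha).reshape(-1, 3), alpha
```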
Computer Vision and Pattern Recognition | 2014
Vasileios Belagiannis; Sikandar Amin; Mykhaylo Andriluka; Bernt Schiele; Nassir Navab; Slobodan Ilic
In this work, we address the problem of 3D pose estimation of multiple humans from multiple views. This is a more challenging problem than single human 3D pose estimation due to the much larger state space, partial occlusions, as well as across-view ambiguities when the identities of the humans are not known in advance. To address these problems, we first create a reduced state space by triangulation of corresponding body joints obtained from part detectors in pairs of camera views. In order to resolve the ambiguities of wrong and mixed body parts of multiple humans after triangulation, and also those coming from false positive body part detections, we introduce a novel 3D pictorial structures (3DPS) model. Our model infers 3D human body configurations from our reduced state space. The 3DPS model is generic and applicable to both single and multiple human pose estimation. In order to compare to the state of the art, we first evaluate our method on single human 3D pose estimation on the HumanEva-I [22] and KTH Multiview Football II [8] datasets. Then, we introduce and evaluate our method on two datasets for multiple human 3D pose estimation.
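The triangulation step that builds the reduced state space is standard two-view geometry. The sketch below lifts a joint detected at pixel locations x1 and x2 in two calibrated views to a 3D point with the linear (DLT) method; it is a minimal illustration, not the authors' code.

```python
# Minimal sketch: DLT triangulation of one body joint from two calibrated views.
import numpy as np

def triangulate_joint(P1, P2, x1, x2):
    """P1, P2: (3, 4) projection matrices; x1, x2: (2,) pixel coordinates of the same joint."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]   # inhomogeneous 3D joint hypothesis
```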
Computer Vision and Pattern Recognition | 2010
Cedric Cagniart; Edmond Boyer; Slobodan Ilic
In this paper, we consider the problem of tracking nonrigid surfaces and propose a generic data-driven mesh deformation framework. In contrast to methods using strong prior models, this framework assumes little about the observed surface and hence easily generalizes to most free-form surfaces, while effectively handling large deformations. To this aim, the reference surface is divided into elementary surface cells, or patches. This strategy ensures robustness by providing natural integration domains over the surface for noisy data, while making it possible to express simple patch-level rigidity constraints. In addition, we associate to this scheme a robust numerical optimization that solves for physically plausible surface deformations given arbitrary constraints. In order to demonstrate the versatility of the proposed framework, we conducted experiments on open and closed surfaces, with possibly non-connected components, that undergo large deformations and fast motions. We also performed quantitative and qualitative evaluations in multi-camera and monocular environments, and with different types of data including 2D correspondences and 3D point clouds.
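A patch-level rigidity constraint can be written as a penalty on the disagreement between the positions a patch predicts for its vertices and the positions predicted by its neighbouring patches' rigid motions. The sketch below is a minimal, illustrative version of such an energy; the per-patch rotations R and translations t and the data layout are assumptions rather than the paper's exact parameterization.

```python
# Minimal sketch: rigidity energy between adjacent patches of a deformed mesh.
import numpy as np

def patch_rigidity_energy(rest_vertices, patch_of_vertex, neighbours, R, t):
    """
    rest_vertices: (V, 3) reference positions, patch_of_vertex: (V,) patch index per vertex,
    neighbours: list of (k, l) adjacent patch pairs, R: (P, 3, 3), t: (P, 3) rigid motions.
    """
    energy = 0.0
    for k, l in neighbours:
        idx = np.where(patch_of_vertex == k)[0]
        xk = rest_vertices[idx] @ R[k].T + t[k]   # positions predicted by patch k itself
        xl = rest_vertices[idx] @ R[l].T + t[l]   # positions predicted by neighbour l
        energy += ((xk - xl) ** 2).sum()
    return energy
```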
Computer Vision and Pattern Recognition | 2009
Stefan Johannes Josef Holzer; Stefan Hinterstoisser; Slobodan Ilic; Nassir Navab
We propose a new approach for detecting low-textured planar objects and estimating their 3D pose. Standard matching and pose estimation techniques often depend on texture and feature points; they fail when little or no texture is available. Edge-based approaches can mostly deal with these limitations but are slow in practice when they have to search over six degrees of freedom. We overcome these problems by introducing distance transform templates, generated by applying the distance transform to standard edge-based templates. We obtain robustness against perspective transformations by training a classifier for various template poses. In addition, spatial relations between multiple contours on the template are learnt and later used for outlier removal. At runtime, the classifier provides the identity and a rough 3D pose of the distance transform template, which is further refined by a modified template matching algorithm that is also based on the distance transform. We qualitatively and quantitatively evaluate our approach on synthetic and real-life examples and demonstrate robust real-time performance.
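The basic ingredient, matching edge templates through a distance transform, can be sketched in a few lines: compute, for every scene pixel, its distance to the nearest edge, then score a template by averaging that distance map at the template's edge offsets (lower is better). This chamfer-style score is a minimal illustration under stated assumptions, not the paper's refined matching algorithm.

```python
# Minimal sketch: edge distance transform plus a chamfer-style template score.
import numpy as np
from scipy.ndimage import distance_transform_edt

def edge_distance_transform(edge_map):
    """edge_map: (H, W) boolean edge image; returns per-pixel distance to the nearest edge."""
    return distance_transform_edt(~edge_map)

def chamfer_score(dist_map, template_offsets, x, y):
    """template_offsets: (N, 2) integer (dx, dy) edge positions relative to (x, y)."""
    xs = np.clip(x + template_offsets[:, 0], 0, dist_map.shape[1] - 1)
    ys = np.clip(y + template_offsets[:, 1], 0, dist_map.shape[0] - 1)
    return dist_map[ys, xs].mean()   # mean distance of template edges to scene edges
```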