Minh-Tri Pham
Toshiba
Publication
Featured research published by Minh-Tri Pham.
International Journal of Computer Vision | 2014
Oliver Woodford; Minh-Tri Pham; Atsuto Maki; Frank Perbet; Björn Stenger
In applying the Hough transform to the problem of 3D shape recognition and registration, we develop two new and powerful improvements to this popular inference method. The first, intrinsic Hough, solves the problem of exponential memory requirements of the standard Hough transform by exploiting the sparsity of the Hough space. The second, minimum-entropy Hough, explains away incorrect votes, substantially reducing the number of modes in the posterior distribution of class and pose, and improving precision. Our experiments demonstrate that these contributions make the Hough transform not only tractable but also highly accurate for our example application. Both contributions can be applied to other tasks that already use the standard Hough transform.
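As a hedged illustration of the sparsity argument behind the intrinsic Hough variant, the Python sketch below accumulates pose votes in a hash map keyed by quantized pose hypotheses, so memory grows with the number of distinct votes rather than with the size of a dense accumulator over the full pose space. The 3-parameter pose tuples, the bin width and the toy votes are illustrative assumptions, not the paper's actual parameterization or inference scheme.

# Minimal sketch of sparse Hough voting: votes are stored in a hash map
# keyed by the quantized pose, so memory grows with the number of distinct
# votes rather than with the size of the full (dense) Hough space.
# The pose tuple layout and the bin width are illustrative assumptions.
from collections import defaultdict

def sparse_hough_vote(votes, bin_width=0.05):
    """Accumulate continuous pose votes into a sparse accumulator.

    votes: iterable of (pose_vector, weight) pairs, where pose_vector is a
           tuple of floats (e.g. translation and rotation parameters).
    Returns the best bin (as a tuple of bin indices) and its total weight.
    """
    accumulator = defaultdict(float)
    for pose, weight in votes:
        key = tuple(int(round(p / bin_width)) for p in pose)
        accumulator[key] += weight
    best_bin = max(accumulator, key=accumulator.get)
    return best_bin, accumulator[best_bin]

# Example: three noisy votes around the same pose and one outlier.
votes = [((0.10, 0.20, 0.31), 1.0),
         ((0.11, 0.19, 0.30), 1.0),
         ((0.09, 0.21, 0.30), 1.0),
         ((0.90, 0.10, 0.00), 1.0)]
print(sparse_hough_vote(votes))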
International Conference on Computer Vision | 2011
Minh-Tri Pham; Oliver Woodford; Frank Perbet; Atsuto Maki; Björn Stenger; Roberto Cipolla
This paper presents a method for vote-based 3D shape recognition and registration, in particular using mean shift on 3D pose votes in the space of direct similarity transforms for the first time. We introduce a new distance between poses in this space—the SRT distance. It is left-invariant, unlike Euclidean distance, and has a unique, closed-form mean, in contrast to Riemannian distance, so is fast to compute. We demonstrate improved performance over the state of the art in both recognition and registration on a real and challenging dataset, by comparing our distance with others in a mean shift framework, as well as with the commonly used Hough voting approach.
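The sketch below shows a basic mean-shift iteration over pose votes using a plain Gaussian kernel and Euclidean distance; it is only meant to illustrate where a pose distance enters the computation. The paper's contribution, the left-invariant SRT distance on direct similarity transforms and its closed-form weighted mean, is not reproduced here.

# Minimal sketch of mean shift over pose votes. The votes here are plain
# vectors and the kernel uses Euclidean distance; the paper replaces this
# distance (and the weighted mean) with the left-invariant SRT distance on
# direct similarity transforms, which is not reproduced in this sketch.
import numpy as np

def mean_shift_step(x, votes, bandwidth=0.1):
    """Move x towards the kernel-weighted mean of the votes."""
    d2 = np.sum((votes - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return (w[:, None] * votes).sum(axis=0) / w.sum()

def mean_shift(x0, votes, bandwidth=0.1, tol=1e-6, max_iter=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, votes, bandwidth)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

votes = np.array([[0.10, 0.20], [0.11, 0.19], [0.09, 0.21], [0.90, 0.10]])
print(mean_shift(votes[0], votes))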
Computer Vision and Pattern Recognition | 2015
Christopher Zach; Adrian Penate-Sanchez; Minh-Tri Pham
Joint object recognition and pose estimation solely from range images is an important task, e.g., in robotics applications and in automated manufacturing environments. The lack of color information and limitations of current commodity depth sensors make this task a challenging computer vision problem, and a standard random-sampling-based approach is prohibitively time-consuming. We propose to address this difficult problem by generating promising inlier sets for pose estimation through early rejection of clear outliers with the help of local belief propagation (or dynamic programming). By exploiting data-parallelism our method is fast, and we also do not rely on a computationally expensive training phase. We demonstrate state-of-the-art performance on a standard dataset and illustrate our approach on challenging real sequences.
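As a loose, hedged illustration of the early-rejection idea, the sketch below discards candidate model-to-scene correspondences that are geometrically inconsistent with most others (a rigid transform preserves pairwise point distances) before any pose estimation is attempted. This simple pairwise consistency filter stands in for the paper's local belief propagation / dynamic programming formulation, and the helper names and thresholds are assumptions.

# Illustrative sketch of early rejection: before spending time on pose
# estimation, drop candidate model-to-scene correspondences that are
# geometrically inconsistent with the others (a rigid transform preserves
# pairwise point distances). This is a simplification standing in for the
# paper's belief-propagation / dynamic-programming formulation.
import numpy as np

def pairwise_consistent(c1, c2, tol=0.02):
    """c1, c2: correspondences given as (model_point, scene_point) pairs."""
    (m1, s1), (m2, s2) = c1, c2
    return abs(np.linalg.norm(m1 - m2) - np.linalg.norm(s1 - s2)) < tol

def filter_correspondences(correspondences, min_support=2, tol=0.02):
    """Keep only correspondences consistent with at least min_support others."""
    kept = []
    for i, ci in enumerate(correspondences):
        support = sum(pairwise_consistent(ci, cj, tol)
                      for j, cj in enumerate(correspondences) if j != i)
        if support >= min_support:
            kept.append(ci)
    return kept

# Toy example: three correspondences related by a pure translation and one
# corrupted match; only the three consistent ones survive the filter.
model = [np.array([0.0, 0.0, 0.0]), np.array([1.0, 0.0, 0.0]),
         np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]
scene = [m + np.array([0.5, 0.0, 0.0]) for m in model]
scene[3] = np.array([5.0, 5.0, 5.0])
print(len(filter_correspondences(list(zip(model, scene)))))  # -> 3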
Computer Vision and Pattern Recognition | 2014
Frank Perbet; Sam Johnson; Minh-Tri Pham; Björn Stenger
This paper proposes a method for estimating the 3D body shape of a person with robustness to clothing. We formulate the problem as optimization over the manifold of valid depth maps of body shapes learned from synthetic training data. The manifold itself is represented using a novel data structure, a Multi-Resolution Manifold Forest (MRMF), which contains vertical edges between tree nodes as well as horizontal edges between nodes across trees that correspond to overlapping partitions. We show that this data structure allows both efficient localization and navigation on the manifold for on-the-fly building of local linear models (manifold charting). We demonstrate shape estimation of clothed users, showing significant improvement in accuracy over global shape models and models using pre-computed clusters. We further compare the MRMF with alternative manifold charting methods on a public dataset for estimating 3D motion from noisy 2D marker observations, obtaining state-of-the-art results.
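The following Python sketch mimics the two kinds of edges described above: vertical parent-child links within a tree, and horizontal links between nodes of different trees whose partitions overlap, which can then be used to gather the points for a local linear model. The node layout, the overlap test and the helper names are illustrative assumptions rather than the MRMF as published.

# Hedged sketch of a forest with "vertical" (parent-child) and "horizontal"
# (cross-tree, overlapping-partition) links, loosely following the MRMF
# description above. The overlap test and thresholds are assumptions.
class Node:
    def __init__(self, indices, depth):
        self.indices = set(indices)   # data points covered by this node
        self.depth = depth
        self.children = []            # vertical edges (finer resolution)
        self.neighbours = []          # horizontal edges (other trees)

def link_across_trees(nodes_a, nodes_b, min_overlap=1):
    """Add horizontal edges between nodes of two trees whose partitions overlap."""
    for na in nodes_a:
        for nb in nodes_b:
            if len(na.indices & nb.indices) >= min_overlap:
                na.neighbours.append(nb)
                nb.neighbours.append(na)

def local_chart_indices(node):
    """Collect a node's points plus those of its horizontal neighbours,
    i.e. the points that would feed a local linear model (manifold charting)."""
    indices = set(node.indices)
    for nb in node.neighbours:
        indices |= nb.indices
    return indices

# Two trees over the same six points, partitioned differently at depth 1:
tree_a = [Node([0, 1, 2], 1), Node([3, 4, 5], 1)]
tree_b = [Node([0, 1, 3], 1), Node([2, 4, 5], 1)]
link_across_trees(tree_a, tree_b, min_overlap=2)
print(sorted(local_chart_indices(tree_a[0])))   # -> [0, 1, 2, 3]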
Computer Vision and Pattern Recognition | 2014
Stephan Liwicki; Minh-Tri Pham; Stefanos Zafeiriou; Maja Pantic; Björn Stenger
In this paper we introduce a new distance for robustly matching vectors of 3D rotations. A special representation of 3D rotations, which we coin full-angle quaternion (FAQ), allows us to express this distance as Euclidean. We apply the distance to the problems of 3D shape recognition from point clouds and 2D object tracking in color video. For the former, we introduce a hashing scheme for scale and translation which outperforms the previous state-of-the-art approach on a public dataset. For the latter, we incorporate online subspace learning with the proposed FAQ representation to highlight the benefits of the new representation.
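A hedged reading of the FAQ idea is sketched below: a rotation by angle theta about unit axis n is encoded as the 4-vector [cos(theta), sin(theta) * n], i.e. using the full angle rather than the half angle of a standard unit quaternion, so the two equivalent axis-angle descriptions of a rotation map to the same vector and plain Euclidean distance can be used to compare rotations. The paper's exact definition, normalization and hashing scheme may differ.

# Hedged sketch of the full-angle quaternion (FAQ) idea described above:
# encode a rotation by angle theta about unit axis n as the 4-vector
# [cos(theta), sin(theta) * n]. The two equivalent parameterizations
# (theta, n) and (2*pi - theta, -n) then give the same vector, so Euclidean
# distance can be used directly. The paper's exact definition may differ.
import numpy as np

def faq_from_axis_angle(axis, theta):
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate(([np.cos(theta)], np.sin(theta) * axis))

def faq_distance(q1, q2):
    # Plain Euclidean distance between two FAQ vectors.
    return np.linalg.norm(q1 - q2)

# Two nearby rotations about the z-axis have a small FAQ distance, and the
# redundant axis-angle parameterization yields an identical vector:
q_a = faq_from_axis_angle([0, 0, 1], 0.50)
q_b = faq_from_axis_angle([0, 0, 1], 0.55)
q_c = faq_from_axis_angle([0, 0, -1], 2 * np.pi - 0.50)
print(faq_distance(q_a, q_b), faq_distance(q_a, q_c))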
International Conference on Computer Vision | 2011
Carlos Hernández; Frank Perbet; Minh-Tri Pham; George Vogiatzis; Oliver Woodford; Atsuto Maki; Björn Stenger; Roberto Cipolla
We present a video-based system which interactively captures the geometry of a 3D object in the form of a point cloud, then recognizes and registers known objects in this point cloud in a matter of seconds (fig. 1). In order to achieve interactive speed, we exploit both efficient inference algorithms and parallel computation, often on a GPU. The system can be broken down into two distinct phases: geometry capture, and object inference. We now discuss these in further detail.
International Journal of Computer Vision | 2015
Minh-Tri Pham; Oliver Woodford; Frank Perbet; Atsuto Maki; Riccardo Gherardi; Björn Stenger; Roberto Cipolla
The non-Euclidean nature of direct isometries in a Euclidean space, i.e. transformations consisting of a rotation and a translation, creates difficulties when computing distances, means and distributions over them, which have been well studied in the literature. Direct similarities, transformations consisting of a direct isometry and a positive uniform scaling, present even more of a challenge—one which we demonstrate and address here. In this article, we investigate divergences (a superset of distances without constraints on symmetry and sub-additivity) for comparing direct similarities, and means induced by them via minimizing a sum of squared divergences. We analyze several standard divergences: the Euclidean distance using the matrix representation of direct similarities, a divergence from Lie group theory, and the family of all left-invariant distances derived from Riemannian geometry. We derive their properties and those of their induced means, highlighting several shortcomings. In addition, we introduce a novel family of left-invariant divergences, called SRT divergences, which resolve several issues associated with the standard divergences. In our evaluation we empirically demonstrate the derived properties of the divergences and means, both qualitatively and quantitatively, on synthetic data. Finally, we compare the divergences in a real-world application: vote-based, scale-invariant object recognition. Our results show that the new divergences presented here, and their means, are both more effective and faster to compute for this task.
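For reference, a direct similarity g = (s, R, t) acts on a point x in R^3 as g·x = sRx + t with s > 0, R a rotation and t a translation, and, as stated above, each divergence d induces a mean by minimizing a sum of squared divergences. In LaTeX notation (argument order and any weighting follow the article and are not reproduced exactly here):

\[
  g \cdot x = s\,R\,x + t, \qquad s > 0,\ R \in \mathrm{SO}(3),\ t \in \mathbb{R}^3,
\]
\[
  \bar{g} \;=\; \operatorname*{arg\,min}_{g} \sum_{i=1}^{n} d\!\left(g,\, g_i\right)^{2}.
\]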
Machine Learning for Computer Vision | 2013
Minh-Tri Pham; Oliver Woodford; Frank Perbet; Atsuto Maki; Riccardo Gherardi; Björn Stenger; Roberto Cipolla
This chapter presents a method for vote-based 3D shape recognition and registration, in particular using mean shift on 3D pose votes in the space of direct similarity transformations for the first time. We introduce a new distance between poses in this space—the SRT distance. It is left-invariant, unlike Euclidean distance, and has a unique, closed-form mean, in contrast to Riemannian distance, so is fast to compute. We demonstrate improved performance over the state of the art in both recognition and registration on a real and challenging dataset, by comparing our distance with others in a mean shift framework, as well as with the commonly used Hough voting approach.
Archive | 2012
Frank Perbet; Atsuto Maki; Minh-Tri Pham; Björn Stenger; Oliver Woodford
Archive | 2012
Oliver Woodford; Minh-Tri Pham; Atsuto Maki; Frank Perbet; Björn Stenger