M José Oramas
Katholieke Universiteit Leuven
Publications
Featured research published by M José Oramas.
Computer Vision and Pattern Recognition | 2015
Basura Fernando; Efstratios Gavves; M José Oramas; Amir Ghodrati; Tinne Tuytelaars
In this paper we present a method to capture video-wide temporal information for action recognition. We postulate that a function capable of ordering the frames of a video temporally (based on the appearance) captures well the evolution of the appearance within the video. We learn such ranking functions per video via a ranking machine and use the parameters of these as a new video representation. The proposed method is easy to interpret and implement, fast to compute and effective in recognizing a wide variety of actions. We perform a large number of evaluations on datasets for generic action recognition (Hollywood2 and HMDB51), fine-grained actions (MPII Cooking Activities) and gestures (ChaLearn). Results show that the proposed method brings an absolute improvement of 7-10%, while being compatible with and complementary to further improvements in appearance and local motion based methods.
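To make the idea concrete, here is a minimal sketch of rank pooling in Python. It is a simplified stand-in, not the paper's exact pipeline: frame features are assumed to be given, a scikit-learn LinearSVR regressing the frame index plays the role of the ranking machine, and the time-varying-mean smoothing is one common preprocessing choice.

import numpy as np
from sklearn.svm import LinearSVR

def rank_pool(frame_features, C=1.0):
    # Simplified rank pooling sketch (SVR on frame index stands in for
    # the ranking machine). frame_features: (T, D), one row per frame.
    T = frame_features.shape[0]
    # Time-varying mean: each frame is averaged with all preceding
    # frames, a common smoothing step in rank pooling implementations.
    smoothed = np.cumsum(frame_features, axis=0) / np.arange(1, T + 1)[:, None]
    # Fit a linear function whose scores increase with frame index;
    # its weights encode how appearance evolves over the whole video.
    svr = LinearSVR(C=C).fit(smoothed, np.arange(1, T + 1))
    return svr.coef_  # (D,) vector used as the video representation

# Example: 120 frames of 512-dim features -> one 512-dim descriptor.
descriptor = rank_pool(np.random.randn(120, 512))

The resulting descriptor can then be fed to any standard classifier (e.g. a linear SVM) for action recognition, which is what makes the representation easy to plug into existing pipelines.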
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2017
Basura Fernando; Efstratios Gavves; M José Oramas; Amir Ghodrati; Tinne Tuytelaars
We propose a function-based temporal pooling method that captures the latent structure of the video sequence data - e.g., how frame-level features evolve over time in a video. We show how the parameters of a function that has been fit to the video data can serve as a robust new video representation. As a specific example, we learn a pooling function via ranking machines. By learning to rank the frame-level features of a video in chronological order, we obtain a new representation that captures the video-wide temporal dynamics of a video, suitable for action recognition. Other than ranking functions, we explore different parametric models that could also explain the temporal changes in videos. The proposed functional pooling methods, and rank pooling in particular, are easy to interpret and implement, fast to compute and effective in recognizing a wide variety of actions. We evaluate our method on various benchmarks for generic action, fine-grained action and gesture recognition. Results show that rank pooling brings an absolute improvement of 7-10% over the average pooling baseline. At the same time, rank pooling is compatible with and complementary to several appearance and local motion based methods and features, such as improved trajectories and deep learning features.
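As a hedged illustration of what functional pooling means beyond ranking functions, the sketch below fits a straight line per feature dimension over time and pools the slopes into a descriptor. This is an illustrative parametric alternative, not one of the paper's exact models.

import numpy as np

def linear_trend_pool(frame_features):
    # Illustrative functional pooling: fit feature_d(t) ~ a_d * t + b_d
    # for every dimension d and return the slopes a_d, so the parameters
    # of a function fit to the video serve as its representation.
    T, D = frame_features.shape
    t = np.arange(T, dtype=float)
    t = (t - t.mean()) / (t.std() + 1e-8)  # normalized time axis
    centered = frame_features - frame_features.mean(axis=0)
    return centered.T @ t / (t @ t)  # (D,) closed-form least-squares slopes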
Computer Vision and Image Understanding | 2016
M José Oramas; Tinne Tuytelaars
Highlights: Sampling context-based object proposals is effective for recovering missed detections. Topic models can effectively model higher-order relations between object instances. Context-based proposals are effective for spotting regions that contain objects. Object proposal generation should not be employed solely as a pre-detection step.
In this paper we focus on improving object detection performance in terms of recall. We propose a post-detection stage during which we explore the image with the objective of recovering missed detections. This exploration is performed by sampling object proposals in the image. We analyse four different strategies to perform this sampling, giving special attention to strategies that exploit spatial relations between objects. In addition, we propose a novel method to discover higher-order relations between groups of objects. Experiments on the challenging KITTI dataset show that our proposed relations-based proposal generation strategies can help improve recall at the cost of a relatively small number of object proposals.
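A hedged sketch of the relations-based sampling strategy: given confident detections and relative box offsets observed between co-occurring objects at training time, proposals are sampled around each detection. The function name and the offset parametrization below are hypothetical; they only illustrate the mechanism described in the abstract.

import numpy as np

def sample_context_proposals(detections, train_offsets, n_per_det=50, seed=0):
    # Hypothetical sketch of relations-based proposal sampling.
    # detections: (N, 4) confident boxes as (x, y, w, h).
    # train_offsets: (M, 4) relative offsets (dx, dy, dw, dh) collected
    # from pairs of co-occurring objects in the training set.
    rng = np.random.default_rng(seed)
    proposals = []
    for (x, y, w, h) in detections:
        # Reuse offsets seen between training-set object pairs, scaled
        # by the anchor detection's size, to place candidate boxes.
        idx = rng.integers(0, len(train_offsets), size=n_per_det)
        for dx, dy, dw, dh in train_offsets[idx]:
            proposals.append([x + dx * w, y + dy * h,
                              w * np.exp(dw), h * np.exp(dh)])
    return np.asarray(proposals)

Each sampled box would then be scored by the detector; boxes that score well correspond to detections the initial pass missed.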
Workshop on Applications of Computer Vision | 2014
M José Oramas; Luc De Raedt; Tinne Tuytelaars
It is by now generally accepted that reasoning about the relationships between objects (and object hypotheses) can improve the accuracy of object detection methods. Relations between objects make it possible to reject inconsistent hypotheses and to reduce the uncertainty of the initial hypotheses. However, most methods to date reason about object relations in a relatively crude way. In this paper we propose an alternative using cautious inference. Building on ideas from Collective Classification, we favor the most confident hypotheses as sources of contextual information and give higher relevance to the object relations observed during training. Additionally, we propose to cluster the pairwise relations into relationships. Our experiments on part of the KITTI benchmark and the MIT StreetScenes dataset show that both steps improve the performance of relational classifiers.
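The cautious-inference idea can be sketched as a variant of iterative collective classification in which only the most confident hypotheses are committed per round and used as context for the rest. The helper relational_update below is a hypothetical stand-in for whatever relational classifier re-scores a hypothesis given committed neighbors.

import numpy as np

def cautious_collective_inference(local_scores, neighbors, relational_update,
                                  n_rounds=5, commit_frac=0.2):
    # Hypothetical sketch of cautious collective classification.
    # local_scores: (N, K) initial per-hypothesis class scores.
    # neighbors: list of neighbor index lists, one per hypothesis.
    scores = local_scores.copy()
    committed = np.zeros(len(scores), dtype=bool)
    for _ in range(n_rounds):
        open_idx = np.flatnonzero(~committed)
        if len(open_idx) == 0:
            break
        # Cautious step: commit only the most confident open hypotheses;
        # only committed ones act as sources of contextual information.
        confidence = scores.max(axis=1)
        k = max(1, int(commit_frac * len(scores)))
        committed[open_idx[np.argsort(-confidence[open_idx])[:k]]] = True
        # Re-score the remaining hypotheses using committed context only.
        for i in np.flatnonzero(~committed):
            scores[i] = relational_update(scores, committed, neighbors, i)
    return scores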
British Machine Vision Conference | 2014
M José Oramas; Tinne Tuytelaars
Object viewpoint classification, also referred to as object pose estimation, is a task of interest for several applications. However, since the early days of computer vision, it has been addressed from a very "local" perspective. This perspective focuses on learning from the features on the object itself, e.g. color, texture, or gradients [1, 2], to identify the different viewpoints in which an object may appear in an image. Lately, this trend has been extended from reasoning about local visual properties of the object in the image space to properties in the 3D scene [3, 4, 5]. Despite the effectiveness of the mentioned methods, they have the weakness of ignoring scene-related cues that can assist the classification process.
Computer Vision and Image Understanding | 2017
M José Oramas; Luc De Raedt; Tinne Tuytelaars
The task of object viewpoint estimation has been a challenge since the early days of computer vision. To estimate the viewpoint (or pose) of an object, people have mostly looked at object intrinsic features, such as shape or appearance. Surprisingly, informative features provided by other, extrinsic elements in the scene have so far mostly been ignored. At the same time, contextual cues have been proven to be of great benefit for related tasks such as object detection or action recognition. In this paper, we explore how information from other objects in the scene can be exploited for viewpoint estimation. In particular, we look at object configurations by following a relational neighbor-based approach for reasoning about object relations. We show that, starting from noisy object detections and viewpoint estimates, exploiting the estimated viewpoint and location of other objects in the scene can lead to improved object viewpoint predictions. Experiments on the KITTI dataset demonstrate that object configurations can indeed be used as a complementary cue to appearance-based viewpoint estimation. Our analysis reveals that the proposed context-based method can improve object viewpoint estimation by reducing specific types of viewpoint estimation errors commonly made by methods that only consider local information. Moreover, considering contextual information produces superior performance in scenes where a high number of object instances occur. Finally, our results suggest that a cautious relational neighbor formulation brings improvements over its aggressive counterpart for the task of object viewpoint estimation.
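One way to picture the relational neighbor update is the sketch below: each object's appearance-based viewpoint distribution is blended with what its neighbors' current estimates imply, through a viewpoint co-occurrence table estimated on training data. This is a hypothetical simplification of the paper's formulation; the table, the blending weight and the neighborhood definition are all assumptions.

import numpy as np

def refine_viewpoints(appearance_probs, neighbor_ids, cooc, alpha=0.5):
    # Hypothetical simplification of a relational neighbor update.
    # appearance_probs: (N, V) per-object viewpoint distributions.
    # neighbor_ids: list of neighbor index lists, one per object.
    # cooc: (V, V) row-normalized table; cooc[u, v] estimates
    # P(this object has viewpoint v | a neighbor has viewpoint u).
    refined = appearance_probs.copy()
    for i, nbrs in enumerate(neighbor_ids):
        if not nbrs:
            continue  # no context available; keep the local estimate
        # Context message: average what each neighbor's estimate
        # implies about object i's viewpoint.
        context = np.mean([appearance_probs[j] @ cooc for j in nbrs], axis=0)
        refined[i] = (1 - alpha) * appearance_probs[i] + alpha * context
        refined[i] /= refined[i].sum()
    return refined

In a cautious variant, only neighbors whose estimates are sufficiently confident would contribute to the context term, mirroring the paper's finding that caution beats aggressive propagation.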
International Conference on Pattern Recognition Applications and Methods | 2012
Laura Antanas; Martijn van Otterlo; M José Oramas; Tinne Tuytelaars; Luc De Raedt
International Conference on Computer Vision | 2013
M José Oramas; Luc De Raedt; Tinne Tuytelaars
arXiv: Computer Vision and Pattern Recognition | 2016
M José Oramas; Tinne Tuytelaars
International Conference on Pattern Recognition Applications and Methods | 2013
Lieven Billiet; M José Oramas; McElory Hoffmann; Wannes Meert; Laura Antanas