Julieta Martinez
University of British Columbia
Publications
Featured research published by Julieta Martinez.
computer vision and pattern recognition | 2014
Ankur Gupta; Julieta Martinez; James J. Little; Robert J. Woodham
We describe a new approach to transfer knowledge across views for action recognition by using examples from a large collection of unlabelled mocap data. We achieve this by directly matching purely motion based features from videos to mocap. Our approach recovers 3D pose sequences without performing any body part tracking. We use these matches to generate multiple motion projections and thus add view invariance to our action recognition model. We also introduce a closed form solution for approximate non-linear Circulant Temporal Encoding (nCTE), which allows us to efficiently perform the matches in the frequency domain. We test our approach on the challenging unsupervised modality of the IXMAS dataset, and use publicly available motion capture data for matching. Without any additional annotation effort, we are able to significantly outperform the current state of the art.
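As a rough illustration of the frequency-domain matching at the core of this approach, the sketch below scores all circular temporal shifts between two per-frame descriptor sequences with an FFT-based circular cross-correlation. The array shapes, zero-padding, and scoring function are illustrative assumptions, not the closed-form nCTE formulation used in the paper.

# A minimal sketch of frequency-domain matching of temporal feature
# sequences; names and shapes are assumptions for illustration.
import numpy as np

def circulant_match_score(video_feats, mocap_feats):
    """Score all circular temporal shifts between two (T, D) feature sequences.
    The shorter sequence is zero-padded to the common length T."""
    T = max(len(video_feats), len(mocap_feats))
    D = video_feats.shape[1]
    a = np.zeros((T, D)); a[:len(video_feats)] = video_feats
    b = np.zeros((T, D)); b[:len(mocap_feats)] = mocap_feats
    # Circular cross-correlation per dimension, summed over D.
    # In the time domain this costs O(T^2 D); via the FFT it is O(T D log T).
    A = np.fft.rfft(a, axis=0)
    B = np.fft.rfft(b, axis=0)
    scores = np.fft.irfft((np.conj(A) * B).sum(axis=1), n=T)
    return scores  # scores[s] = similarity at circular shift s

best_shift = np.argmax(circulant_match_score(np.random.randn(64, 16),
                                             np.random.randn(64, 16)))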
computer vision and pattern recognition | 2017
Julieta Martinez; Michael J. Black; Javier Romero
Human motion modelling is a classical problem at the intersection of graphics and computer vision, with applications spanning human-computer interaction, motion synthesis, and motion prediction for virtual and augmented reality. Following the success of deep learning methods in several computer vision tasks, recent work has focused on using deep recurrent neural networks (RNNs) to model human motion, with the goal of learning time-dependent representations that perform tasks such as short-term motion prediction and long-term human motion synthesis. We examine recent work, with a focus on the evaluation methodologies commonly used in the literature, and show that, surprisingly, state-of-the-art performance can be achieved by a simple baseline that does not attempt to model motion at all. We investigate this result, and analyze recent RNN methods by looking at the architectures, loss functions, and training procedures used in state-of-the-art approaches. We propose three changes to the standard RNN models typically used for human motion, which result in a simple and scalable RNN architecture that obtains state-of-the-art performance on human motion prediction.
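One example of a baseline that does not attempt to model motion at all is a zero-velocity prediction that simply repeats the last observed frame. The sketch below illustrates this idea with assumed array shapes and a simplified error metric; it is not the paper's evaluation code.

# A minimal sketch of a trivial "zero-velocity" motion-prediction baseline.
import numpy as np

def zero_velocity_baseline(observed_poses, horizon):
    """observed_poses: (T, D) array of joint-angle frames.
    Returns (horizon, D): the last observed frame repeated."""
    last = observed_poses[-1]
    return np.tile(last, (horizon, 1))

def mean_pose_error(pred, target):
    """Mean Euclidean error per predicted frame (a simplified metric)."""
    return np.linalg.norm(pred - target, axis=1).mean()

# The baseline is competitive at short horizons because human motion
# changes slowly over a few hundred milliseconds.
history = np.cumsum(np.random.randn(50, 54) * 0.01, axis=0)
future = history[-1] + np.cumsum(np.random.randn(10, 54) * 0.01, axis=0)
print(mean_pose_error(zero_velocity_baseline(history, 10), future))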
european conference on computer vision | 2016
Julieta Martinez; Joris Clement; Holger H. Hoos; James J. Little
We revisit Additive Quantization (AQ), an approach to vector quantization that uses multiple, full-dimensional, and non-orthogonal codebooks. Despite its elegant and simple formulation, AQ has failed to achieve state-of-the-art performance on standard retrieval benchmarks, because the encoding problem, which amounts to MAP inference in multiple fully-connected Markov Random Fields (MRFs), has proven to be hard to solve. We demonstrate that the performance of AQ can be improved to surpass the state of the art by leveraging iterated local search, a stochastic local search approach known to work well for a range of NP-hard combinatorial problems. We further show a direct application of our approach to a recent formulation of vector quantization that enforces sparsity of the codebooks. Unlike previous work, which required specialized optimization techniques, our formulation can be plugged directly into state-of-the-art lasso optimizers. This results in a conceptually simple, easily implemented method that outperforms the previous state of the art in solving sparse vector quantization. Our implementation is publicly available (https://github.com/jltmtz/local-search-quantization).
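A rough sketch of the encoding step under iterated local search: alternate ICM-style conditional sweeps over the codebooks with random perturbations of the current codes, keeping the best solution found. The code below is an illustrative, unoptimized version of this general technique, not the authors' released implementation.

# A minimal sketch of encoding one vector with multiple full-dimensional
# codebooks via iterated local search (ICM sweeps + random perturbations).
import numpy as np

def encode_ils(x, codebooks, n_restarts=8, icm_sweeps=3, seed=0):
    """x: (D,) vector; codebooks: list of M arrays, each (K, D).
    Returns M code indices whose codewords sum to an approximation of x."""
    rng = np.random.default_rng(seed)
    M, K = len(codebooks), codebooks[0].shape[0]

    def reconstruction(codes):
        return sum(codebooks[m][codes[m]] for m in range(M))

    def icm(codes):
        # Conditional sweeps: update one codebook at a time,
        # keeping the other M-1 assignments fixed.
        for _ in range(icm_sweeps):
            for m in range(M):
                partial = reconstruction(codes) - codebooks[m][codes[m]]
                dists = np.linalg.norm(x - partial - codebooks[m], axis=1)
                codes[m] = int(np.argmin(dists))
        return codes

    best = icm([int(rng.integers(K)) for _ in range(M)])
    best_err = np.linalg.norm(x - reconstruction(best))
    for _ in range(n_restarts):
        # Perturb a random subset of the codes, then re-run local search.
        cand = best.copy()
        for m in rng.choice(M, size=max(1, M // 4), replace=False):
            cand[m] = int(rng.integers(K))
        cand = icm(cand)
        err = np.linalg.norm(x - reconstruction(cand))
        if err < best_err:
            best, best_err = cand, err
    return best

codebooks = [np.random.randn(256, 128) for _ in range(8)]
codes = encode_ils(np.random.randn(128), codebooks)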
european conference on computer vision | 2016
Julieta Martinez; Holger H. Hoos; James J. Little
We focus on the problem of vector compression using multi-codebook quantization (MCQ). MCQ is a generalization of k-means where the centroids arise from the combinatorial sums of entries in multiple codebooks, and has become a critical component of large-scale, state-of-the-art approximate nearest neighbour search systems. MCQ is often addressed in an iterative manner, where learning the codebooks can be solved exactly via least-squares, but finding the optimal codes results in a large number of combinatorial NP-hard problems. Recently, we have demonstrated that an algorithm based on stochastic local search for this problem outperforms all previous approaches. In this paper we introduce a GPU implementation of our method, which achieves a 30× speedup over a single-threaded CPU implementation. Our code is publicly available (https://github.com/jltmtz/local-search-quantization).
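The least-squares codebook update mentioned above can be sketched as follows. With the codes fixed, all codebooks are the exact solution of one linear least-squares problem; the sketch uses dense matrices and NumPy's generic solver purely for clarity (the real systems are large and sparse), and it is not the GPU implementation described in the paper.

# A minimal sketch of the exact least-squares codebook update in MCQ.
import numpy as np

def update_codebooks(X, codes, K):
    """X: (N, D) data; codes: (N, M) integer assignments; K codewords per book.
    Returns the M codebooks, shape (M, K, D), minimizing ||B C - X||_F,
    where B is the one-hot indicator matrix of the codes."""
    N, M = codes.shape
    B = np.zeros((N, M * K))
    rows = np.arange(N)
    for m in range(M):
        B[rows, m * K + codes[:, m]] = 1.0
    # Exact solution of the (possibly rank-deficient) least-squares problem.
    C, *_ = np.linalg.lstsq(B, X, rcond=None)
    return C.reshape(M, K, -1)

X = np.random.randn(1000, 32)
codes = np.random.randint(0, 16, size=(1000, 4))
codebooks = update_codebooks(X, codes, K=16)  # shape (4, 16, 32)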
workshop on applications of computer vision | 2015
Frederick Tung; Julieta Martinez; Holger H. Hoos; James J. Little
We explore a novel paradigm in learning binary codes for large-scale image retrieval applications. Instead of learning a single globally optimal quantization model as in previous approaches, we encode the database points in a data-specific manner using a bank of quantization models. Each individual database point selects the quantization model that minimizes its individual quantization error. We apply the idea of a bank of quantization models to data independent and data-driven hashing methods for learning binary codes, obtaining state-of-the-art performance on three benchmark datasets.
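The selection rule at the heart of this idea, where each database point keeps whichever model in the bank quantizes it with the smallest error, can be sketched as below. The RandomRotationPQ model and its interface are illustrative stand-ins, not the hashing methods evaluated in the paper.

# A minimal sketch of encoding with a bank of quantization models.
import numpy as np

class RandomRotationPQ:
    """A toy stand-in for one quantization model: a random rotation
    followed by per-subspace codebooks (random here, for illustration)."""
    def __init__(self, D, n_subspaces=4, K=16, seed=0):
        rng = np.random.default_rng(seed)
        self.R, _ = np.linalg.qr(rng.standard_normal((D, D)))
        self.d = D // n_subspaces
        self.codebooks = rng.standard_normal((n_subspaces, K, self.d))

    def encode(self, x):
        z = self.R @ x
        codes, err = [], 0.0
        for s, cb in enumerate(self.codebooks):
            sub = z[s * self.d:(s + 1) * self.d]
            dists = np.linalg.norm(sub - cb, axis=1)
            c = int(np.argmin(dists))
            codes.append(c)
            err += dists[c] ** 2
        return codes, err

def encode_with_bank(x, bank):
    # Try every model in the bank; keep the one with the lowest error.
    results = [(i, *model.encode(x)) for i, model in enumerate(bank)]
    return min(results, key=lambda r: r[2])  # (model_index, codes, error)

bank = [RandomRotationPQ(D=64, seed=s) for s in range(4)]
model_idx, codes, err = encode_with_bank(np.random.randn(64), bank)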
workshop on applications of computer vision | 2014
Julieta Martinez; James J. Little; Nando de Freitas
Nearest Neighbour Search in high-dimensional spaces is a common problem in Computer Vision. Although no exact algorithm is known that outperforms linear search in general, approximate algorithms are commonly used to tackle this problem. The drawback of using such algorithms is that their performance depends heavily on parameter tuning. While this process can be automated using standard empirical optimization techniques, tuning is still time-consuming. In this paper, we propose to use Empirical Hardness Models to reduce the number of parameter configurations that Bayesian Optimization has to try, speeding up the optimization process. Evaluation on standard benchmarks of SIFT and GIST descriptors shows the viability of our approach.
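A rough sketch of the general idea: fit a performance predictor on configurations already evaluated on previous, similar benchmarks, and pass only the most promising candidates to the Bayesian optimizer. The random-forest predictor, feature encoding, and pruning fraction below are assumptions for illustration, not the paper's exact empirical hardness models.

# A minimal sketch of pruning a configuration space with a learned
# performance model before running Bayesian optimization.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def prune_configurations(train_configs, train_runtimes, candidate_configs,
                         keep_fraction=0.25):
    """Fit a predictor on previously evaluated configurations, then keep
    only the candidates predicted to be fastest, so the Bayesian optimizer
    searches a much smaller space."""
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(train_configs, train_runtimes)
    predicted = model.predict(candidate_configs)
    n_keep = max(1, int(keep_fraction * len(candidate_configs)))
    keep = np.argsort(predicted)[:n_keep]  # lowest predicted runtime first
    return candidate_configs[keep]

# Example with synthetic ANN-style parameters (e.g., number of trees and
# number of checks in a randomized k-d forest).
old_configs = np.random.randint(1, 64, size=(200, 2)).astype(float)
old_runtimes = old_configs[:, 0] * 0.3 + old_configs[:, 1] * 0.1
candidates = np.random.randint(1, 64, size=(500, 2)).astype(float)
shortlist = prune_configurations(old_configs, old_runtimes, candidates)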
european conference on computer vision | 2018
Julieta Martinez; Shobhit Zakhmi; Holger H. Hoos; James J. Little
Multi-codebook quantization (MCQ) is the task of expressing a set of vectors as accurately as possible in terms of discrete entries in multiple bases. Work in MCQ is heavily focused on lowering quantization error, thereby improving distance estimation and recall on benchmarks of visual descriptors at a fixed memory budget. However, recent studies and methods in this area are hard to compare against each other, because they use different datasets, different protocols, and, perhaps most importantly, different computational budgets. In this work, we first benchmark a series of MCQ baselines on an equal footing and provide an analysis of their recall-vs-running-time performance. We observe that local search quantization (LSQ) is in practice much faster than its competitors, but is not the most accurate method in all cases. We then introduce two novel improvements that render LSQ (i) more accurate and (ii) faster. These improvements are easy to implement, and define a new state of the art in MCQ.
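The recall-oriented evaluation referred to above can be sketched as follows: estimate query-to-database distances from the quantized reconstructions and check whether the true nearest neighbour appears among the top R retrieved points. The brute-force ground truth and array shapes are illustrative; real benchmarks use precomputed ground truth and asymmetric distance tables.

# A minimal sketch of the recall@R protocol for comparing quantizers.
import numpy as np

def recall_at_r(queries, database, reconstructed_db, R=1):
    """queries: (Q, D); database: (N, D) original vectors;
    reconstructed_db: (N, D) decompressed approximations of the database."""
    # Ground-truth nearest neighbours use exact distances.
    true_nn = np.argmin(
        ((queries[:, None, :] - database[None, :, :]) ** 2).sum(-1), axis=1)
    # Approximate distances use the quantized reconstruction.
    approx_d = ((queries[:, None, :] - reconstructed_db[None, :, :]) ** 2).sum(-1)
    top_r = np.argsort(approx_d, axis=1)[:, :R]
    hits = (top_r == true_nn[:, None]).any(axis=1)
    return hits.mean()

db = np.random.randn(2000, 32)
decoded = db + 0.1 * np.random.randn(*db.shape)  # stands in for decoded codes
print(recall_at_r(np.random.randn(20, 32), db, decoded, R=10))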
workshop on applications of computer vision | 2016
Ankur Gupta; John He; Julieta Martinez; James J. Little; Robert J. Woodham
We present a novel and scalable approach for retrieval and flexible alignment of 3D human motion examples given a video query. Our method efficiently searches a large set of motion capture (mocap) files accounting for speed variations in motion. To align a short video clip with a part of a longer mocap sequence, we experiment with different feature representations comparable across the two modalities. We also evaluate two different Dynamic Time Warping (DTW) approaches that allow sub-sequence matching and suggest additional local constraints for a smooth alignment. Finally, to quantify video-based mocap retrieval, we introduce a benchmark providing a novel set of per-frame action labels for 2,000 files of the CMU-mocap dataset, as well as a collection of realistic video queries taken from YouTube. Our experiments show that temporal flexibility is not only required for the correct alignment of pose and motion, but it also improves the retrieval accuracy.
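A minimal sketch of sub-sequence DTW matching, where a short query is allowed to align with any portion of a longer mocap sequence by leaving its start and end along the target unconstrained. Feature extraction and the paper's additional local constraints are omitted; the inputs are assumed to be comparable per-frame descriptors.

# A minimal sketch of sub-sequence Dynamic Time Warping.
import numpy as np

def subsequence_dtw(query, target):
    """query: (n, D); target: (m, D), typically with m >= n.
    Returns (best_cost, end_index) of the best matching target sub-sequence."""
    n, m = len(query), len(target)
    cost = np.linalg.norm(query[:, None, :] - target[None, :, :], axis=2)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = 0.0  # free start anywhere along the target
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost[i - 1, j - 1] + min(D[i - 1, j],      # insertion
                                               D[i, j - 1],      # deletion
                                               D[i - 1, j - 1])  # match
    end = int(np.argmin(D[n, 1:]))  # free end anywhere along the target
    return D[n, end + 1], end

q = np.random.randn(20, 8)
t = np.random.randn(200, 8)
print(subsequence_dtw(q, t))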
international conference on computer vision | 2017
Julieta Martinez; Rayat Hossain; Javier Romero; James J. Little
arXiv: Computer Vision and Pattern Recognition | 2014
Julieta Martinez; Holger H. Hoos; James J. Little