Marco Cuturi
Kyoto University
Publications
Featured research published by Marco Cuturi.
international conference on acoustics, speech, and signal processing | 2007
Marco Cuturi; Jean-Philippe Vert; Øystein Birkenes; Tomoko Matsui
We propose in this paper a new family of kernels to handle time series, notably speech data, within the framework of kernel methods, which includes popular algorithms such as the support vector machine. These kernels elaborate on the well-known dynamic time warping (DTW) family of distances by considering the same set of elementary operations, namely substitutions and repetitions of tokens, to map one sequence onto another. Whereas DTW algorithms associate a score to each of these operations and use dynamic programming to compute an optimal sequence of operations with high overall score, we consider instead the scores spanned by all possible alignments and take a smoothed version of their maximum, deriving a kernel from this formulation. We prove that this kernel is positive definite under favorable conditions and show how it can be tuned effectively for practical applications, reporting encouraging results on a speech recognition task.
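The smoothed-maximum construction over all alignments admits a compact dynamic program. Below is a minimal sketch, for illustration only, of a global-alignment-style kernel on one-dimensional sequences; the squared-Euclidean local cost, the bandwidth parameter gamma, and the function name are our assumptions, not details taken from the paper.

```python
import numpy as np

def global_alignment_kernel(x, y, gamma=1.0):
    """Smoothed score over all monotonic alignments of two 1-D sequences.

    Summing exp(-cost/gamma) over every alignment path is the soft-max
    counterpart of DTW's single best path (illustrative sketch only).
    """
    n, m = len(x), len(y)
    # local similarity of each pair of tokens
    K = np.exp(-np.subtract.outer(x, y) ** 2 / gamma)
    # dynamic program accumulating all substitution/repetition paths
    M = np.zeros((n + 1, m + 1))
    M[0, 0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            M[i, j] = K[i - 1, j - 1] * (M[i - 1, j - 1] + M[i - 1, j] + M[i, j - 1])
    return M[n, m]
```

Positive definiteness of kernels built this way is what the paper establishes under favorable conditions; in practice one would work in the log domain to avoid underflow on long sequences.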
SIAM Journal on Scientific Computing | 2015
Jean-David Benamou; Guillaume Carlier; Marco Cuturi; Luca Nenna; Gabriel Peyré
This article details a general numerical framework to approximate solutions to linear programs related to optimal transport. The general idea is to introduce an entropic regularization of the initial linear program. This regularized problem corresponds to a Kullback-Leibler Bregman divergence projection of a vector (representing some initial joint distribution) on the polytope of constraints. We show that for many problems related to optimal transport, the set of linear constraints can be split into an intersection of a few simple constraints, for which the projections can be computed in closed form. This allows us to make use of iterative Bregman projections (when there are only equality constraints) or, more generally, Bregman-Dykstra iterations (when inequality constraints are involved). We illustrate the usefulness of this approach on several variational problems related to optimal transport: barycenters for the optimal transport metric, tomographic reconstruction, multi-marginal optimal transport and in particular its application to Brenier's relaxed solutions of incompressible Euler equations, partial unbalanced optimal transport, and optimal transport with capacity constraints.
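When only the two marginal equality constraints are present, the iterative Bregman projections described above reduce to alternating diagonal scalings, since each KL projection has a closed form. A minimal sketch, with variable names of our choosing:

```python
import numpy as np

def sinkhorn_bregman(a, b, C, eps=0.01, iters=500):
    """Entropic OT via iterative Bregman projections.

    Each KL projection of the current plan onto one marginal constraint
    set has a closed form: a diagonal rescaling of rows or columns.
    """
    K = np.exp(-C / eps)                 # kernel from the regularized LP
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(iters):
        u = a / (K @ v)                  # project onto {P : P 1 = a}
        v = b / (K.T @ u)                # project onto {P : P^T 1 = b}
    return u[:, None] * K * v[None, :]   # regularized transport plan
```

With inequality constraints (e.g., capacities), the projections no longer take this simple alternating form, which is where the Bregman-Dykstra iterations mentioned in the abstract come in.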
international conference on computer graphics and interactive techniques | 2015
Justin Solomon; Fernando de Goes; Gabriel Peyré; Marco Cuturi; Adrian Butscher; Andy Nguyen; Tao Du; Leonidas J. Guibas
This paper introduces a new class of algorithms for optimization problems involving optimal transportation over geometric domains. Our main contribution is to show that optimal transportation can be made tractable over large domains used in graphics, such as images and triangle meshes, improving performance by orders of magnitude compared to previous work. To this end, we approximate optimal transportation distances using entropic regularization. The resulting objective contains a geodesic distance-based kernel that can be approximated with the heat kernel. This approach leads to simple iterative numerical schemes with linear convergence, in which each iteration only requires Gaussian convolution or the solution of a sparse, pre-factored linear system. We demonstrate the versatility and efficiency of our method on tasks including reflectance interpolation, color transfer, and geometry processing.
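On a grid, the key computational point is that applying the heat-kernel matrix is approximately a Gaussian blur, so the iterative scalings never need to materialize a full pairwise kernel. A hedged sketch, assuming 2-D histograms (images) of equal shape and using a Gaussian filter as a stand-in for the heat kernel:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def convolutional_sinkhorn(a, b, sigma=2.0, iters=200):
    """Entropic OT scalings between two images of equal shape.

    The matrix-vector product with the heat kernel is replaced by a
    Gaussian convolution, keeping each iteration linear-time in pixels.
    """
    u, v = np.ones_like(a), np.ones_like(b)
    for _ in range(iters):
        u = a / np.maximum(gaussian_filter(v, sigma), 1e-30)
        v = b / np.maximum(gaussian_filter(u, sigma), 1e-30)
    return u, v   # the plan is diag(u) K diag(v), never formed explicitly
```

On a triangle mesh, as the abstract notes, the blur would instead be a heat-diffusion step solved through a sparse, pre-factored linear system.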
Siam Journal on Imaging Sciences | 2016
Marco Cuturi; Gabriel Peyré
Variational problems that involve Wasserstein distances have been recently proposed to summarize and learn from probability measures. Despite being conceptually simple, such problems are computationally challenging because they involve minimizing over quantities (Wasserstein distances) that are themselves hard to compute. We show that the dual formulation of Wasserstein variational problems introduced recently by G. Carlier, A. Oberman, and E. Oudet [ESAIM Math. Model. Numer. Anal., 6 (2015), pp. 1621--1642] can be regularized using an entropic smoothing, which leads to smooth, differentiable, convex optimization problems that are simpler to implement and numerically more stable. We illustrate the versatility of this approach by applying it to the computation of Wasserstein barycenters and gradient flows of spatial regularization functionals. (A correction is attached.)
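For reference, the entropic smoothing in question replaces the transport linear program by a strictly convex problem; in our notation (not necessarily the paper's):

```latex
% Entropic regularization of optimal transport between histograms a, b
% with ground cost C and regularization strength \varepsilon > 0:
\[
W_\varepsilon(a,b) = \min_{P \in U(a,b)} \langle P, C \rangle
  + \varepsilon \sum_{i,j} P_{ij}\,(\log P_{ij} - 1),
\qquad
U(a,b) = \{ P \geq 0 : P\mathbf{1} = a,\; P^{\top}\mathbf{1} = b \}.
\]
```

The entropy term makes the regularized distance smooth and differentiable in its arguments, which is what turns barycenter and gradient-flow problems built on top of it into tractable convex programs.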
international conference on computer graphics and interactive techniques | 2016
Nicolas Bonneel; Gabriel Peyré; Marco Cuturi
This article defines a new way to perform intuitive and geometrically faithful regressions on histogram-valued data. It leverages the theory of optimal transport, and in particular the definition of Wasserstein barycenters, to introduce for the first time the notion of barycentric coordinates for histograms. These coordinates take into account the underlying geometry of the ground space on which the histograms are defined, and are thus particularly meaningful for applications in graphics to shapes, color or material modification. Beside this abstract construction, we propose a fast numerical optimization scheme to solve this backward problem (finding the barycentric coordinates of a given histogram) with a low computational overhead with respect to the forward problem (computing the barycenter). This scheme relies on a backward algorithmic differentiation of the Sinkhorn algorithm which is used to optimize the entropic regularization of Wasserstein barycenters. We showcase an illustrative set of applications of these Wasserstein coordinates to various problems in computer graphics: shape approximation, BRDF acquisition and color editing.
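The backward problem described here, recovering barycentric weights from a target histogram, can be illustrated end to end. The sketch below uses the standard scaling iteration for entropic barycenters as the forward pass and, purely for illustration, finite-difference gradients in place of the paper's algorithmic differentiation of Sinkhorn; all names, step sizes, and tolerances are our assumptions.

```python
import numpy as np

def fit_barycentric_weights(target, hists, C, eps=0.1, steps=50, lr=0.5):
    """Find simplex weights w so the entropic barycenter of `hists`
    (rows = histograms on a shared grid) best matches `target`."""
    K = np.exp(-C / eps)

    def bary(w, iters=100):
        v = np.ones_like(hists)
        for _ in range(iters):
            u = hists / (v @ K.T)            # match each input marginal
            Ktu = u @ K                      # K^T u_k, one row per input
            b = np.exp(w @ np.log(Ktu))      # weighted geometric mean
            v = b[None, :] / Ktu
        return b

    loss = lambda w: np.sum((bary(w) - target) ** 2)
    w = np.full(len(hists), 1.0 / len(hists))
    for _ in range(steps):
        g = np.array([(loss(w + 1e-4 * e) - loss(w)) / 1e-4
                      for e in np.eye(len(w))])
        w = np.maximum(w - lr * g, 1e-12)
        w /= w.sum()                         # crude simplex projection
    return w
```

Backward algorithmic differentiation of the Sinkhorn iterations, as the paper proposes, yields exact gradients at far lower cost than these finite differences.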
information processing in medical imaging | 2015
Alexandre Gramfort; Gabriel Peyré; Marco Cuturi
Knowing how the human brain is anatomically and functionally organized at the level of a group of healthy individuals or patients is the primary goal of neuroimaging research. Yet computing an average of brain imaging data defined over a voxel grid or a triangulation remains a challenge. Data are large, the geometry of the brain is complex, and between-subject variability leads to spatially or temporally non-overlapping effects of interest. To address the problem of variability, data are commonly smoothed before performing a linear group averaging. In this work we build on ideas originally introduced by Kantorovich to propose a new algorithm that can efficiently average non-normalized data defined over arbitrary discrete domains using transportation metrics. We show how Kantorovich means can be linked to Wasserstein barycenters in order to take advantage of an entropic smoothing approach. This leads to a smooth convex optimization problem and an algorithm with strong convergence guarantees. We illustrate the versatility of this tool and its empirical behavior on functional neuroimaging data, functional MRI and magnetoencephalography (MEG) source estimates, defined on voxel grids and triangulations of the folded cortical surface.
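A minimal sketch of the scaling-based fixed point for entropy-smoothed Wasserstein barycenters, the smooth convex problem the abstract refers to. It assumes inputs normalized to unit mass on a shared discrete domain; handling non-normalized data, as the paper does, requires further care.

```python
import numpy as np

def entropic_barycenter(hists, weights, C, eps=0.1, iters=100):
    """Entropy-smoothed Wasserstein barycenter of the rows of `hists`.

    `weights` sums to one; `C` is the ground metric on the domain
    (voxel grid or triangulation vertices, flattened to 1-D).
    """
    K = np.exp(-C / eps)
    v = np.ones_like(hists)                   # one scaling per input
    for _ in range(iters):
        u = hists / (v @ K.T)                 # match each input marginal
        Ktu = u @ K                           # second marginals K^T u_k
        bary = np.exp(weights @ np.log(Ktu))  # weighted geometric mean
        v = bary[None, :] / Ktu               # pull all plans to the mean
    return bary
```

Unlike Euclidean averaging after smoothing, this mean displaces mass along the domain's geometry, so spatially non-overlapping effects can reinforce rather than cancel.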
Siam Journal on Imaging Sciences | 2018
Morgan A. Schmitz; Matthieu Heitz; Nicolas Bonneel; Fred Maurice Ngolè Mboula; David Coeurjolly; Marco Cuturi; Gabriel Peyré; Jean-Luc Starck
This paper introduces a new nonlinear dictionary learning method for histograms in the probability simplex. The method leverages optimal transport theory, in the sense that our aim is to reconstruct histograms using so-called displacement interpolations (a.k.a. Wasserstein barycenters) between dictionary atoms; such atoms are themselves synthetic histograms in the probability simplex. Our method simultaneously estimates such atoms, and, for each datapoint, the vector of weights that can optimally reconstruct it as an optimal transport barycenter of such atoms. Our method is computationally tractable thanks to the addition of an entropic regularization to the usual optimal transportation problem, leading to an approximation scheme that is efficient, parallel and simple to differentiate. Both atoms and weights are learned using a gradient-based descent method. Gradients are obtained by automatic differentiation of the generalized Sinkhorn iterations that yield barycenters with entropic smoothing. Because of its formulation relying on Wasserstein barycenters instead of the usual matrix product between dictionary and codes, our method allows for nonlinear relationships between atoms and the reconstruction of input data. We illustrate its application in several different image processing settings.
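One practical detail when both atoms and weights are learned by gradient descent is keeping them in the probability simplex. A common device, shown here as an assumption rather than a claim about the paper's exact parameterization, is a softmax change of variables:

```python
import numpy as np

def softmax(z):
    """Map unconstrained parameters to the probability simplex."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Unconstrained parameters: 3 atoms on 64 bins, weights for 10 datapoints.
# Gradient steps act on z_atoms / z_weights; softmax keeps the resulting
# atoms and barycentric weights valid histograms at every iteration.
rng = np.random.default_rng(0)
z_atoms = rng.standard_normal((3, 64))
z_weights = rng.standard_normal((10, 3))
atoms, weights = softmax(z_atoms), softmax(z_weights)
```

Gradients with respect to the unconstrained parameters then come from automatic differentiation through the generalized Sinkhorn iterations, as the abstract describes.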
SIAM Journal on Scientific Computing | 2018
Elsa Cazelles; Vivien Seguy; Jérémie Bigot; Marco Cuturi; Nicolas Papadakis
This paper is concerned with the statistical analysis of datasets whose elements are random histograms. For the purpose of learning principal modes of variation from such data, we consider the issue of computing the principal component analysis (PCA) of histograms with respect to the 2-Wasserstein distance between probability measures. To this end, we propose comparing the methods of log-PCA and geodesic PCA in the Wasserstein space as introduced in [J. Bigot et al., Ann. Inst. Henri Poincare Probab. Stat., 53 (2017), pp. 1--26; V. Seguy and M. Cuturi, Principal geodesic analysis for probability measures under the optimal transport metric, in Advances in Neural Information Processing Systems 28, C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, eds., Curran Associates, Inc., Red Hook, NY, 2015, pp. 3294--3302]. Geodesic PCA involves solving a nonconvex optimization problem. To solve it approximately, we propose a novel forward-backward algorithm. This allows us to give a detailed comparison bet...
Wavelets and Sparsity XVII | 2017
Morgan A. Schmitz; Matthieu Heitz; Fred-Maurice Ngolè-Mboula; Nicolas Bonneel; David Coeurjolly; Marco Cuturi; Gabriel Peyré; Jean-Luc Starck
Optimal Transport theory enables the definition of a distance across the set of measures on any given space. This Wasserstein distance naturally accounts for geometric warping between measures (including, but not limited to, images). We introduce a new, Optimal Transport-based representation learning method in close analogy with the usual Dictionary Learning problem. That approach typically relies on a matrix dot-product between the learned dictionary and the codes making up the new representation; the relationship between atoms and data is thus ultimately linear. By reconstructing our data as Wasserstein barycenters of learned atoms instead, our approach yields a representation making full use of the Wasserstein distance's attractive properties and allowing for non-linear relationships between the dictionary atoms and the datapoints. We apply our method to a dataset of Euclid-like simulated PSFs (Point Spread Functions). ESA's Euclid mission will cover a large area of the sky in order to accurately measure the shape of billions of galaxies. PSF estimation and correction is one of the main sources of systematic errors on those galaxy shape measurements. PSF variations across the field of view and with the incoming light's wavelength can be highly non-linear, while still retaining strong geometrical information, making the use of Optimal Transport distances an attractive prospect. We show that our representation does indeed succeed at capturing the PSF variations.
Machine Learning | 2015
Tam T. Le; Marco Cuturi
Learning distances that are specifically designed to compare histograms in the probability simplex has recently attracted the attention of the machine learning community. Learning such distances is important because most machine learning problems involve bags of features rather than simple vectors. Ample empirical evidence suggests that the Euclidean distance in general and Mahalanobis metric learning in particular may not be suitable to quantify distances between points in the simplex. We propose in this paper a new contribution to address this problem by generalizing a family of embeddings proposed by Aitchison (J R Stat Soc 44:139–177, 1982) to map the probability simplex onto a suitable Euclidean space. We provide algorithms to estimate the parameters of such maps by building on previous work on metric learning approaches. The criterion we study is not convex, and we consider alternating optimization schemes as well as accelerated gradient descent approaches. These algorithms lead to representations that outperform alternative approaches to compare histograms in a variety of contexts.
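A classical instance of the Aitchison embeddings being generalized here is the centered log-ratio map, which sends the simplex into a Euclidean space where ordinary distances behave more sensibly than on raw histograms. A short sketch; the small-constant guard is our addition, and the paper's learned, parameterized maps generalize transforms of this kind.

```python
import numpy as np

def clr(p, eps=1e-12):
    """Centered log-ratio map, a classical Aitchison embedding of the
    probability simplex into Euclidean space."""
    logp = np.log(p + eps)   # eps guards empty histogram bins
    return logp - logp.mean(axis=-1, keepdims=True)

# Comparing two histograms through the embedding rather than directly:
x = np.array([0.7, 0.2, 0.1])
y = np.array([0.6, 0.3, 0.1])
dist = np.linalg.norm(clr(x) - clr(y))
```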