Olivier Delalleau
Université de Montréal
Publication
Featured research published by Olivier Delalleau.
Neural Computation | 2004
Yoshua Bengio; Olivier Delalleau; Nicolas Le Roux; Jean-François Paiement; Pascal Vincent; Marie Ouimet
In this letter, we show a direct relation between spectral embedding methods and kernel principal components analysis and how both are special cases of a more general learning problem: learning the principal eigenfunctions of an operator defined from a kernel and the unknown data-generating density. Whereas spectral embedding methods provided only coordinates for the training points, the analysis justifies a simple extension to out-of-sample examples (the Nyström formula) for multidimensional scaling (MDS), spectral clustering, Laplacian eigenmaps, locally linear embedding (LLE), and Isomap. The analysis provides, for all such spectral embedding methods, the definition of a loss function, whose empirical average is minimized by the traditional algorithms. The asymptotic expected value of that loss defines a generalization performance and clarifies what these algorithms are trying to learn. Experiments with LLE, Isomap, spectral clustering, and MDS show that this out-of-sample embedding formula generalizes well, with a level of error comparable to the effect of small perturbations of the training set on the embedding.
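As a rough illustration of the out-of-sample idea in the abstract above, the sketch below applies a Nyström-style formula to a plain Gaussian kernel. It is a simplified sketch under stated assumptions, not the authors' implementation: the kernel choice, the absence of centering or data-dependent normalization, and the names `rbf_kernel`, `fit_embedding`, and `nystrom_embed` are all hypothetical. A training point gets coordinate sqrt(lambda_k) * v_ik, and a new point x gets (1 / sqrt(lambda_k)) * sum_i v_ik K(x, x_i), so the two agree on the training set.

```python
import numpy as np


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian kernel matrix between rows of a and rows of b (assumed kernel)."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_embedding(x_train, n_components=2, gamma=0.5):
    """Eigendecompose the (uncentered) Gram matrix and embed the training points."""
    k = rbf_kernel(x_train, x_train, gamma)
    eigvals, eigvecs = np.linalg.eigh(k)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, v = eigvals[order], eigvecs[:, order]
    train_coords = v * np.sqrt(lam)  # coordinate k of point i: sqrt(lam_k) * v_ik
    return lam, v, train_coords


def nystrom_embed(x_new, x_train, lam, v, gamma=0.5):
    """Out-of-sample coordinates: (1 / sqrt(lam_k)) * sum_i v_ik K(x, x_i)."""
    k_new = rbf_kernel(x_new, x_train, gamma)
    return k_new @ v / np.sqrt(lam)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(20, 3))
    lam, v, coords = fit_embedding(x)
    # Embedding the training points "out of sample" recovers their coordinates.
    print(np.allclose(nystrom_embed(x, x, lam, v), coords))
```

The consistency check at the end follows from K v_k = lambda_k v_k: evaluating the formula at a training point x_j gives lambda_k v_jk / sqrt(lambda_k) = sqrt(lambda_k) v_jk.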
Neural Computation | 2009
Yoshua Bengio; Olivier Delalleau
We study an expansion of the log likelihood in undirected graphical models such as the restricted Boltzmann machine (RBM), where each term in the expansion is associated with a sample in a Gibbs chain alternating between two random variables (the visible vector and the hidden vector in RBMs). We are particularly interested in estimators of the gradient of the log likelihood obtained through this expansion. We show that its residual term converges to zero, justifying the use of a truncation (running only a short Gibbs chain), which is the main idea behind the contrastive divergence (CD) estimator of the log-likelihood gradient. By truncating even more, we obtain a stochastic reconstruction error, related through a mean-field approximation to the reconstruction error often used to train autoassociators and stacked autoassociators. The derivation is not specific to the particular parametric forms used in RBMs and requires only convergence of the Gibbs chain. We present theoretical and empirical evidence linking the number of Gibbs steps k and the magnitude of the RBM parameters to the bias in the CD estimator. These experiments also suggest that the sign of the CD estimator is correct most of the time, even when the bias is large, so that CD-k is a good descent direction even for small k.
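The CD-k estimator mentioned in the abstract above can be sketched for a binary RBM as follows. This is a minimal illustration under assumptions, not the paper's code: the function name `cd_k_gradient`, the parameter layout (weights of shape visible by hidden), and the common choice of using hidden-unit probabilities rather than samples in the gradient statistics are all hypothetical choices made here.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd_k_gradient(w, b_vis, b_hid, v0, k, rng):
    """CD-k estimate of the log-likelihood gradient for a binary RBM.

    w: (n_vis, n_hid) weights; b_vis, b_hid: biases; v0: (batch, n_vis) data.
    The Gibbs chain starts at the data and is truncated after k steps.
    """
    h0_prob = sigmoid(v0 @ w + b_hid)  # positive-phase hidden probabilities
    v, h_prob = v0, h0_prob
    for _ in range(k):
        # Sample hidden given visible, then visible given hidden.
        h = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_prob = sigmoid(h @ w.T + b_vis)
        v = (rng.random(v_prob.shape) < v_prob).astype(float)
        h_prob = sigmoid(v @ w + b_hid)
    n = v0.shape[0]
    # Positive-phase statistics minus statistics at the end of the short chain.
    grad_w = (v0.T @ h0_prob - v.T @ h_prob) / n
    grad_b_vis = (v0 - v).mean(axis=0)
    grad_b_hid = (h0_prob - h_prob).mean(axis=0)
    return grad_w, grad_b_vis, grad_b_hid


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = 0.01 * rng.normal(size=(6, 4))
    v0 = (rng.random((10, 6)) < 0.5).astype(float)
    grad_w, grad_bv, grad_bh = cd_k_gradient(w, np.zeros(6), np.zeros(4), v0, 1, rng)
    print(grad_w.shape, grad_bv.shape, grad_bh.shape)
```

A parameter update would add a small multiple of these estimates to the parameters, since they approximate the gradient of the log likelihood. Larger k lengthens the chain at a higher cost per update.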
algorithmic learning theory | 2011
Yoshua Bengio; Olivier Delalleau
Deep architectures are families of functions corresponding to deep circuits. Deep Learning algorithms are based on parametrizing such circuits and tuning their parameters so as to approximately optimize some training objective. Whereas it was thought too difficult to train deep architectures, several successful algorithms have been proposed in recent years. We review some of the theoretical motivations for deep architectures, as well as some of their practical successes, and propose directions of investigations to address some of the remaining challenges.
computational intelligence | 2010
Yoshua Bengio; Olivier Delalleau; Clarence Simard
The family of decision tree learning algorithms is among the most widespread and studied. Motivated by the desire to develop learning algorithms that can generalize when learning highly varying functions such as those presumably needed to achieve artificial intelligence, we study some theoretical limitations of decision trees. We demonstrate formally that they can be seriously hurt by the curse of dimensionality in a sense that is a bit different from other nonparametric statistical methods, but most importantly, that they cannot generalize to variations not seen in the training set. This is because a decision tree creates a partition of the input space and needs at least one example in each of the regions associated with a leaf to make a sensible prediction in that region. A better understanding of the fundamental reasons for this limitation suggests that one should use forests or even deeper architectures instead of trees, which provide a form of distributed representation and can generalize to variations not encountered in the training data.
Journal of Computer-aided Molecular Design | 2004
Pierre-Jean L'Heureux; Julie Carreau; Yoshua Bengio; Olivier Delalleau; Shi Yi Yue
Current practice in Quantitative Structure Activity Relationship (QSAR) methods usually involves generating a great number of chemical descriptors and then cutting them back with variable selection techniques. Variable selection is an effective method to reduce the dimensionality but may discard some valuable information. This paper introduces Locally Linear Embedding (LLE), a local non-linear dimensionality reduction technique, that can statistically discover a low-dimensional representation of the chemical data. LLE is shown to create more stable representations than other non-linear dimensionality reduction algorithms, and to be capable of capturing non-linearity in chemical data.
computational intelligence | 2012
Yoshua Bengio; Nicolas Chapados; Olivier Delalleau; Hugo Larochelle; Xavier Saint-Mleux; Christian Hudon; Jérôme Louradour
We compare the recently proposed Discriminative Restricted Boltzmann Machine (DRBM) to the classical Support Vector Machine (SVM) on a challenging classification task consisting of identifying weapon classes from audio signals. The three weapon classes considered in this work (mortar, rocket, and rocket-propelled grenade) are difficult to reliably classify with standard techniques because they tend to have similar acoustic signatures. In addition, specificities of the data available in this study make it challenging to rigorously compare classifiers, and we address methodological issues arising from this situation. Experiments show good classification accuracy that could make these techniques suitable for fielding on autonomous devices. DRBMs appear to yield better accuracy than SVMs, and are less sensitive to the choice of signal preprocessing and model hyperparameters. This last property is especially appealing in such a task where the lack of data makes model validation difficult.
computational intelligence and games | 2013
Eric Laufer; Raul Chandias Ferrari; Li Yao; Olivier Delalleau; Yoshua Bengio
We consider an industrial strength application of recommendation systems for video-game matchmaking in which off-policy policy evaluation is important but where standard approaches can hardly be applied. The objective of the policy is to sequentially form teams of players from those waiting to be matched, in such a way as to produce well-balanced matches. Unfortunately, the available training data comes from a policy that is not known perfectly and that is not stochastic, making it impossible to use methods based on importance weights. Furthermore, we observe that when the estimated reward function and the policy are obtained by training from the same off-policy dataset, the policy evaluation using the estimated reward function is biased. We present a simple calibration procedure that is similar to stacked regression and that removes most of the bias, in the experiments we performed. Data collected during beta tests of Ghost Recon Online, a first person shooter from Ubisoft, were used for the experiments.
neural information processing systems | 2003
Yoshua Bengio; Jean-François Paiement; Pascal Vincent; Olivier Delalleau; Nicolas Le Roux; Marie Ouimet