
Publication


Featured research published by Grégoire Mesnil.


Conference on Information and Knowledge Management (CIKM) | 2014

A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval

Yelong Shen; Xiaodong He; Jianfeng Gao; Li Deng; Grégoire Mesnil

In this paper, we propose a new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents. In order to capture the rich contextual structures in a query or a document, we start with each word within a temporal context window in a word sequence to directly capture contextual features at the word n-gram level. Next, the salient word n-gram features in the word sequence are discovered by the model and are then aggregated to form a sentence-level feature vector. Finally, a non-linear transformation is applied to extract high-level semantic information to generate a continuous vector representation for the full text string. The proposed convolutional latent semantic model (CLSM) is trained on clickthrough data and is evaluated on a Web document ranking task using a large-scale, real-world data set. Results show that the proposed model effectively captures salient semantic information in queries and documents for the task while significantly outperforming previous state-of-the-art semantic models.
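The pipeline the abstract describes (word n-gram convolution, max-pooling over positions, then a non-linear projection to a semantic vector) can be sketched in a few lines of NumPy. All dimensions and weights below are illustrative placeholders, not the paper's configuration, and word hashing is skipped for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not the paper's actual configuration).
vocab_dim, conv_dim, sem_dim, window = 50, 30, 10, 3

W_conv = rng.normal(scale=0.1, size=(window * vocab_dim, conv_dim))
W_sem = rng.normal(scale=0.1, size=(conv_dim, sem_dim))

def clsm_vector(word_vectors):
    """Map a word-vector sequence to one semantic vector:
    n-gram convolution -> max-pooling -> non-linear projection."""
    T = len(word_vectors)
    # Contextual features: one vector per sliding window of `window` words.
    windows = [np.concatenate(word_vectors[t:t + window])
               for t in range(T - window + 1)]
    local = np.tanh(np.stack(windows) @ W_conv)   # (T - window + 1, conv_dim)
    pooled = local.max(axis=0)                    # max-pooling keeps salient n-grams
    return np.tanh(pooled @ W_sem)                # sentence-level semantic vector

doc = [rng.normal(size=vocab_dim) for _ in range(7)]  # a 7-word "document"
v = clsm_vector(doc)
print(v.shape)  # prints (10,)
```

In the full model, query and document vectors produced this way are compared by cosine similarity and trained on clickthrough data.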


International World Wide Web Conference (WWW) | 2014

Learning semantic representations using convolutional neural networks for web search

Yelong Shen; Xiaodong He; Jianfeng Gao; Li Deng; Grégoire Mesnil

This paper presents a series of new latent semantic models based on a convolutional neural network (CNN) to learn low-dimensional semantic vectors for search queries and Web documents. By using the convolution-max pooling operation, local contextual information at the word n-gram level is modeled first. Then, salient local features in a word sequence are combined to form a global feature vector. Finally, the high-level semantic information of the word sequence is extracted to form a global vector representation. The proposed models are trained on clickthrough data by maximizing the conditional likelihood of clicked documents given a query, using stochastic gradient ascent. The new models are evaluated on a Web document ranking task using a large-scale, real-world data set. Results show that our model significantly outperforms other semantic models, which were state-of-the-art in retrieval performance prior to this work.


European Conference on Machine Learning (ECML) | 2011

Higher order contractive auto-encoder

Salah Rifai; Grégoire Mesnil; Pascal Vincent; Xavier Muller; Yoshua Bengio; Yann N. Dauphin; Xavier Glorot

We propose a novel regularizer when training an auto-encoder for unsupervised feature extraction. We explicitly encourage the latent representation to contract the input space by regularizing the norm of the Jacobian (analytically) and the Hessian (stochastically) of the encoder's output with respect to its input, at the training points. While the penalty on the Jacobian's norm ensures robustness to tiny corruption of samples in the input space, constraining the norm of the Hessian extends this robustness when moving further away from the sample. From a manifold learning perspective, balancing this regularization with the auto-encoder's reconstruction objective yields a representation that varies most when moving along the data manifold in input space, and is most insensitive in directions orthogonal to the manifold. The second-order regularization, using the Hessian, penalizes curvature, and thus favors a smooth manifold. We show that our proposed technique, while remaining computationally efficient, yields representations that are significantly better suited for initializing deep architectures than previously proposed approaches, beating state-of-the-art performance on a number of datasets.
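As a rough illustration of the two penalties the abstract mentions, here is a NumPy sketch for a small sigmoid encoder: the Jacobian norm is computed analytically, and the Hessian term is estimated stochastically as the expected change of the Jacobian under small input perturbations. Weights and dimensions are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 5, 3
W = rng.normal(scale=0.5, size=(d_in, d_hid))
b = np.zeros(d_hid)

def encode(x):
    # Sigmoid encoder h(x)
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))

def jacobian(x):
    # Analytic Jacobian dh/dx of the sigmoid encoder, shape (d_hid, d_in)
    h = encode(x)
    return (h * (1.0 - h))[:, None] * W.T

def contractive_penalty(x, sigma=0.1, n_noise=8):
    # First-order term: squared Frobenius norm of the Jacobian (analytic).
    J = jacobian(x)
    jac_term = (J ** 2).sum()
    # Second-order term: stochastic estimate of the Hessian norm, via the
    # expected squared change of the Jacobian under small input noise.
    hess_term = np.mean([
        ((jacobian(x + rng.normal(scale=sigma, size=d_in)) - J) ** 2).sum()
        for _ in range(n_noise)
    ])
    return float(jac_term + hess_term)

x = rng.normal(size=d_in)
penalty = contractive_penalty(x)
```

In training, this penalty would be added (with a weighting coefficient) to the usual reconstruction loss.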


IEEE Transactions on Audio, Speech, and Language Processing | 2015

Using recurrent neural networks for slot filling in spoken language understanding

Grégoire Mesnil; Yann N. Dauphin; Kaisheng Yao; Yoshua Bengio; Li Deng; Dilek Hakkani-Tür; Xiaodong He; Larry P. Heck; Gokhan Tur; Dong Yu; Geoffrey Zweig

Semantic slot filling is one of the most challenging problems in spoken language understanding (SLU). In this paper, we propose to use recurrent neural networks (RNNs) for this task, and present several novel architectures designed to efficiently model past and future temporal dependencies. Specifically, we implemented and compared several important RNN architectures, including Elman, Jordan, and hybrid variants. To facilitate reproducibility, we implemented these networks with the publicly available Theano neural network toolkit and completed experiments on the well-known airline travel information system (ATIS) benchmark. In addition, we compared the approaches on two custom SLU data sets from the entertainment and movies domains. Our results show that the RNN-based models outperform the conditional random field (CRF) baseline by 2% in absolute error reduction on the ATIS benchmark. We improve the state-of-the-art by 0.5% in the Entertainment domain, and 6.7% for the movies domain.
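The simplest of the architectures compared above, an Elman RNN, tags each word with a slot label from a hidden state that summarizes the past. A toy NumPy forward pass (all sizes and weights are hypothetical, and the real models were trained in Theano rather than hand-rolled like this):

```python
import numpy as np

rng = np.random.default_rng(0)
emb_dim, hid_dim, n_slots, vocab = 8, 6, 4, 20

E = rng.normal(scale=0.1, size=(vocab, emb_dim))     # word embeddings
Wx = rng.normal(scale=0.1, size=(emb_dim, hid_dim))  # input-to-hidden
Wh = rng.normal(scale=0.1, size=(hid_dim, hid_dim))  # hidden-to-hidden (recurrence)
Wy = rng.normal(scale=0.1, size=(hid_dim, n_slots))  # hidden-to-slot-label

def slot_tags(word_ids):
    """Elman RNN forward pass: one slot label per input word."""
    h = np.zeros(hid_dim)
    tags = []
    for w in word_ids:
        h = np.tanh(E[w] @ Wx + h @ Wh)   # hidden state carries past context
        scores = h @ Wy
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()              # softmax over slot labels
        tags.append(int(probs.argmax()))
    return tags

sentence = [3, 7, 1, 12, 5]  # e.g. "flights from boston to denver" as word ids
print(slot_tags(sentence))   # one label index per word
```

The Jordan variant feeds the previous output (rather than the hidden state) back into the recurrence, and the paper's bidirectional and hybrid variants also use future context.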


ICPRAM (Selected Papers) | 2015

Unsupervised Learning of Semantics of Object Detections for Scene Categorization

Grégoire Mesnil; Salah Rifai; Antoine Bordes; Xavier Glorot; Yoshua Bengio; Pascal Vincent

Classifying scenes (e.g. into “street”, “home” or “leisure”) is an important but complicated task, because images come with variability, ambiguity, and a wide range of illumination and scale conditions. Standard approaches build an intermediate representation of the global image and learn classifiers on it. Recently, it has been proposed to depict an image as an aggregation of its contained objects: the representation on which classifiers are trained is composed of many heterogeneous feature vectors derived from various object detectors. In this paper, we propose to study different approaches to efficiently learn contextual semantics out of these object detections. We use the features provided by Object-Bank [24] (177 different object detectors producing 252 attributes each), and show on several benchmarks for scene categorization that careful combinations, taking into account the structure of the data, allow us to greatly improve over the original results (from +5 to +11 %) while drastically reducing the dimensionality of the representation by 97 % (from 44,604 to 1,000). We also show that the uncertainty relative to object detectors hampers the use of external semantic knowledge to improve detector combination, unlike our unsupervised learning approach.
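To make the numbers quoted above concrete: Object-Bank's 177 detectors × 252 attributes give the 44,604-dimensional raw representation. A trivially simple structure-aware combination, shown only as an illustration of the idea and much cruder than the learned combinations studied in the paper, pools each detector's attributes down to a single response:

```python
import numpy as np

rng = np.random.default_rng(0)
n_detectors, attrs_per_det = 177, 252  # numbers quoted from the paper

# Hypothetical raw Object-Bank features for one image: 177 x 252 = 44,604 values.
raw = rng.random((n_detectors, attrs_per_det))

# Pool within each detector (respecting the data's structure) instead of
# treating all 44,604 values as one flat vector.
per_detector = raw.max(axis=1)  # (177,) strongest response per detector
print(per_detector.shape, raw.size)
```

The paper's actual reduction to 1,000 dimensions is learned, not a fixed pooling like this.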


Machine Learning | 2014

Learning semantic representations of objects and their parts

Grégoire Mesnil; Antoine Bordes; Jason Weston; Gal Chechik; Yoshua Bengio

Recently, large scale image annotation datasets have been collected with millions of images and thousands of possible annotations. Latent variable models, or embedding methods, that simultaneously learn semantic representations of object labels and image representations can provide tractable solutions on such tasks. In this work, we are interested in jointly learning representations both for the objects in an image, and the parts of those objects, because such deeper semantic representations could bring a leap forward in image retrieval or browsing. Despite the size of these datasets, the amount of annotated data for objects and parts can be costly and may not be available. In this paper, we propose to bypass this cost with a method able to learn to jointly label objects and parts without requiring exhaustively labeled data. We design a model architecture that can be trained under a proxy supervision obtained by combining standard image annotation (from ImageNet) with semantic part-based within-label relations (from WordNet). The model itself is designed to model both object image to object label similarities, and object label to object part label similarities in a single joint system. Experiments conducted on our combined data and a precisely annotated evaluation set demonstrate the usefulness of our approach.
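A toy NumPy sketch of the kind of joint embedding model the abstract outlines, scoring image-to-object-label and object-label-to-part-label similarities in one shared space. All matrices, dimensions, and ids are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
img_dim, emb_dim, n_labels = 12, 6, 5

V = rng.normal(scale=0.1, size=(img_dim, emb_dim))   # image -> embedding map
U = rng.normal(scale=0.1, size=(n_labels, emb_dim))  # label embeddings
                                                     # (objects and parts share U)

def score_label(image_feat, label_id):
    """Image-to-object-label similarity in the shared embedding space."""
    return float((image_feat @ V) @ U[label_id])

def score_part(object_id, part_id):
    """Object-label-to-part-label similarity in the same space."""
    return float(U[object_id] @ U[part_id])

img = rng.normal(size=img_dim)
best = max(range(n_labels), key=lambda l: score_label(img, l))
```

Training would push both kinds of scores up for observed (image, object) pairs from ImageNet and (object, part) relations from WordNet, which is how the proxy supervision avoids exhaustive part labels.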


Conference of the International Speech Communication Association (INTERSPEECH) | 2013

Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding.

Grégoire Mesnil; Xiaodong He; Li Deng; Yoshua Bengio


International Conference on Machine Learning (ICML) | 2013

Better Mixing via Deep Representations

Yoshua Bengio; Grégoire Mesnil; Yann N. Dauphin; Salah Rifai


International Conference on Machine Learning (ICML) | 2012

Unsupervised and Transfer Learning Challenge: a Deep Learning Approach

Grégoire Mesnil; Yann N. Dauphin; Xavier Glorot; Salah Rifai; Yoshua Bengio; Ian J. Goodfellow; Erick Lavoie; Xavier Muller; Guillaume Desjardins; David Warde-Farley; Pascal Vincent; Aaron C. Courville; James Bergstra


arXiv: Computation and Language | 2014

Ensemble of Generative and Discriminative Techniques for Sentiment Analysis of Movie Reviews

Grégoire Mesnil; Tomas Mikolov; Marc'Aurelio Ranzato; Yoshua Bengio

Collaboration


Dive into Grégoire Mesnil's collaborations.

Top Co-Authors

Yoshua Bengio (Université de Montréal)
Salah Rifai (Université de Montréal)
Pascal Vincent (Université de Montréal)
Xavier Glorot (Université de Montréal)
Xavier Muller (Université de Montréal)