Harm de Vries | Researchain

Publication

Featured research published by Harm de Vries.

computer vision and pattern recognition | 2017

GuessWhat?! Visual Object Discovery through Multi-modal Dialogue

Harm de Vries; Florian Strub; Sarath Chandar; Olivier Pietquin; Hugo Larochelle; Aaron C. Courville

We introduce GuessWhat?!, a two-player guessing game as a testbed for research on the interplay of computer vision and dialogue systems. The goal of the game is to locate an unknown object in a rich image scene by asking a sequence of questions. Higher-level image understanding, like spatial reasoning and language grounding, is required to solve the proposed task. Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images. We explain our design decisions in collecting the dataset and introduce the oracle and questioner tasks that are associated with the two players of the game. We prototyped deep learning models to establish initial baselines of the introduced tasks.

international joint conference on artificial intelligence | 2017

End-to-end optimization of goal-driven and visually grounded dialogue systems

Florian Strub; Harm de Vries; Jérémie Mary; Aaron C. Courville; Olivier Pietquin

End-to-end design of dialogue systems has recently become a popular research topic thanks to powerful tools such as encoder-decoder architectures for sequence-to-sequence learning. Yet, most current approaches cast human-machine dialogue management as a supervised learning problem, aiming at predicting the next utterance of a participant given the full history of the dialogue. This vision is too simplistic to render the intrinsic planning problem inherent to dialogue as well as its grounded nature, making the context of a dialogue larger than the sole history. This is why only chitchat and question answering tasks have been addressed so far using end-to-end architectures. In this paper, we introduce a Deep Reinforcement Learning method to optimize visually grounded task-oriented dialogues, based on the policy gradient algorithm. This approach is tested on a dataset of 120k dialogues collected through Mechanical Turk and provides encouraging results at solving both the problem of generating natural dialogues and the task of discovering a specific object in a complex picture.

european conference on computer vision | 2018

Visual Reasoning with Multi-hop Feature Modulation

Florian Strub; Mathieu Seurin; Ethan Perez; Harm de Vries; Jérémie Mary; Philippe Preux; Aaron C. Courville; Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue. For such tasks, one successful approach is to condition image-based convolutional network computation on language via Feature-wise Linear Modulation (FiLM) layers, i.e., per-channel scaling and shifting. We propose to generate the parameters of FiLM layers going up the hierarchy of a convolutional network in a multi-hop fashion rather than all at once, as in prior work. By alternating between attending to the language input and generating FiLM layer parameters, this approach is better able to scale to settings with longer input sequences such as dialogue. We demonstrate that multi-hop FiLM generation significantly outperforms prior state-of-the-art on the GuessWhat?! visual dialogue task and matches state-of-the-art on the ReferIt object retrieval task, and we provide additional qualitative analysis.
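
The per-channel scaling and shifting that the abstract above attributes to FiLM layers can be illustrated with a small sketch. This is not the authors' code: the function name `film`, the nested-list layout, and the toy values are assumptions made for illustration, and in a real model the scales and shifts would be predicted from the language input rather than written by hand. The multi-hop generation of those parameters is not shown.

```python
def film(features, gamma, beta):
    """Apply feature-wise linear modulation to a feature map.

    features: list of channels, each a 2-D list (height x width) of floats.
    gamma, beta: one scale and one shift per channel.
    """
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need one gamma and one beta per channel")
    # Every value in a channel is scaled and shifted by that channel's pair.
    return [
        [[g * value + b for value in row] for row in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Two channels of a 2x2 feature map (toy values, assumed for illustration).
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.5, 0.5], [1.0, 1.0]],
]
gamma = [2.0, 0.0]   # hypothetical scales; normally produced from language
beta = [1.0, -1.0]   # hypothetical shifts; normally produced from language

modulated = film(features, gamma, beta)
```

With these values the first channel is doubled and shifted up by one, while the second channel is zeroed by its scale and replaced by its shift, which shows how the conditioning signal can amplify or suppress individual feature maps.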

workshop on self organizing maps | 2017

Empirical evaluation of gradient methods for matrix learning vector quantization

Michael LeKander; Michael Biehl; Harm de Vries

Generalized Matrix Learning Vector Quantization (GMLVQ) critically relies on the use of an optimization algorithm to train its model parameters. We test various schemes for automated control of learning rates in gradient-based training. We evaluate these algorithms in terms of their achieved performance and their practical feasibility. We find that some algorithms do indeed perform better than others across multiple benchmark datasets. These algorithms produce GMLVQ models which not only better fit the training data, but also perform better upon validation. In particular, we find that the Variance-based Stochastic Gradient Descent algorithm consistently performs best across all experiments.

neural information processing systems | 2015