The 16th International Conference on the Foundations of Digital Games (FDG) 2021 | 2021

Meta-Learning a Solution to the Hanabi Ad-Hoc Challenge

Abstract


In this work we demonstrate that a model trained with the First-Order Model-Agnostic Meta-Learning (FOMAML) algorithm on the Hanabi Open Agent Dataset (HOAD) outplays both a naive MLP baseline and a randomly selected partner in the Hanabi Ad-Hoc Challenge, in both low-shot and zero-shot setups. We first show that HOAD is well suited to the meta-learning task because its agents are high quality and use diverse strategies, thereby confirming that MAML is generalizing rather than memorizing agent strategies. We then detail our application of FOMAML to the cooperative decision-making problem that Hanabi poses, and we provide evidence supporting recent results that the task update of MAML gives little to no test-time performance boost. The pretrained models and game data are available online at https://github.com/aronsar/hoad.

DOI 10.1145/3472538.3472546
Language English
Journal The 16th International Conference on the Foundations of Digital Games (FDG) 2021
