Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining | 2021

HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem


Abstract


Despite the success of existing meta reinforcement learning methods, they still have difficulty learning a meta policy effectively for RL problems with sparse reward. To address this, we develop a novel meta reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse-reward RL problems. It consists of three modules: the cross-environment meta state embedding module, which constructs a common meta state space to adapt to different environments; the environment-specific meta reward shaping module, which, based on the meta state, effectively extends the original sparse reward trajectory through cross-environment knowledge complementarity; and the meta policy module, which achieves better generalization and efficiency with the shaped meta reward. Experiments in sparse-reward environments show the superiority of HMRL in both transferability and policy learning efficiency.
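The core idea described in the abstract, densifying a sparse environment reward via a shaping term computed over a shared meta-state embedding, can be illustrated with a minimal sketch. This is an assumed, simplified formulation: the embedding `phi`, the linear `potential`, and the potential-based shaping form are illustrative stand-ins, not the paper's exact method.

```python
import numpy as np

def phi(state, W):
    """Embed a raw environment state into a common meta state space.
    A linear map with tanh stands in for the learned embedding module."""
    return np.tanh(W @ state)

def potential(meta_state, v):
    """Toy scalar potential over meta states (a learned critic in practice)."""
    return float(v @ meta_state)

def shape_reward(r_env, s, s_next, W, v, gamma=0.99):
    """Sparse environment reward plus a potential-based shaping term
    computed in the shared meta state space."""
    return r_env + gamma * potential(phi(s_next, W), v) - potential(phi(s, W), v)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # embedding weights (hypothetical)
v = rng.standard_normal(4)        # potential weights (hypothetical)
s, s_next = rng.standard_normal(3), rng.standard_normal(3)

# Even when the sparse reward is 0, the shaped reward provides a dense signal.
r_shaped = shape_reward(0.0, s, s_next, W, v)
```

The potential-based form is chosen for the sketch because it is known not to change the optimal policy; the paper's actual shaping is learned across environments rather than fixed.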

DOI 10.1145/3447548.3467242
