Transferable Environment Poisoning: Training-time Attack on Reinforcement Learning

Abstract


Studying adversarial attacks on Reinforcement Learning (RL) agents has become a key aspect of developing robust, RL-based solutions. Test-time attacks, which target the post-learning performance of an RL agent's policy, have been well studied in both white- and black-box settings. More recently, however, state-of-the-art works have shifted to investigating training-time attacks on RL agents, i.e., forcing the learning process towards a target policy designed by the attacker. Yet these state-of-the-art works continue to rely on white-box settings and/or a reward-poisoning approach. In contrast, this paper studies environment-dynamics poisoning attacks at training time. Furthermore, while environment-dynamics poisoning presumes a transfer-learning-capable agent, it also allows us to extend our approach to black-box attacks. Our overall framework, inspired by hierarchical RL, seeks the minimal environment-dynamics manipulation that prompts the agent's momentary policy to change in a desired manner. We show the attack's efficiency by comparing it with the reward-poisoning approach, and empirically demonstrate the transferability of the environment-poisoning attack strategy. Finally, we exploit this transferability to handle black-box settings.
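To make the bi-level structure of such an attack concrete, the following Python sketch illustrates the general shape of training-time environment-dynamics poisoning on a toy tabular MDP: an outer attacker loop searches for a small perturbation of the transition tensor that nudges an inner Q-learning victim's greedy policy toward an attacker-chosen target policy. This is not the paper's hierarchical-RL algorithm; the MDP, the Q-learning victim, the random-search attacker, and every parameter below are illustrative assumptions.

```python
"""Minimal sketch of training-time environment-dynamics poisoning.

All components here (toy MDP, Q-learning victim, random-search
attacker) are illustrative assumptions, not the paper's method.
"""
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy MDP: 5 states, 2 actions, random dynamics/rewards.
S, A = 5, 2
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] = distribution over next states
R = rng.normal(size=(S, A))                 # reward for taking a in s
gamma = 0.9

target_policy = rng.integers(A, size=S)     # policy the attacker wants learned

def q_learning(P, R, episodes=200, steps=30, alpha=0.1, eps=0.2):
    """Victim learner: tabular Q-learning under (possibly poisoned) dynamics P."""
    Q = np.zeros((S, A))
    for _ in range(episodes):
        s = rng.integers(S)
        for _ in range(steps):
            a = rng.integers(A) if rng.random() < eps else int(Q[s].argmax())
            s2 = rng.choice(S, p=P[s, a])
            Q[s, a] += alpha * (R[s, a] + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q

def perturb(P, scale=0.15):
    """Attacker action: small additive noise on the dynamics, re-normalised."""
    noisy = np.clip(P + rng.normal(scale=scale, size=P.shape), 1e-6, None)
    return noisy / noisy.sum(axis=-1, keepdims=True)

# Outer (attacker) loop: pick the perturbation that best aligns the victim's
# greedy policy with the target, penalising the size of the manipulation
# (a crude stand-in for the paper's "minimal manipulation" objective).
best_P, best_score = P, -np.inf
for _ in range(20):
    P_poisoned = perturb(P)
    policy = q_learning(P_poisoned, R).argmax(axis=1)
    agreement = (policy == target_policy).mean()
    cost = np.abs(P_poisoned - P).sum()
    score = agreement - 0.01 * cost
    if score > best_score:
        best_P, best_score = P_poisoned, score

print(f"best attacker score: {best_score:.3f}")
print("victim policy:", q_learning(best_P, R).argmax(axis=1))
print("target policy:", target_policy)
```

The key design point the sketch mirrors is that the attacker manipulates only the transition dynamics, never the reward signal, and trades policy agreement against the magnitude of the perturbation.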

Pages 1398-1406
DOI 10.5555/3463952.3464113
Language English
