2021 IEEE International Intelligent Transportation Systems Conference (ITSC) | 2021

Learning Normalizing Flow Policies Based on Highway Demonstrations


Abstract


Imitation learning on real-world data has the potential to improve the simulation of real-world traffic. However, learning from human demonstrations can be challenging because the recorded behavior is typically noisy and multimodal. Most policies used in such setups parameterize a Gaussian distribution from which actions are then sampled. This limits the expressiveness of the outputs; more sophisticated policies based on normalizing flows could model more complex and multimodal behavior. We show how to combine the recently proposed Discriminator Actor Critic algorithm with conditional normalizing flow policies and test this setup on real-world driving data. The results show that normalizing flow policies are able to model complex behavior and lead to superior policies. To demonstrate the general applicability of our proposed approach, we also tested it on imitation learning for typical continuous control benchmarks. While the expert demonstrations of these benchmarks are not based on noisy and multimodal real-world behavior, our approach still leads to competitive results compared to state-of-the-art approaches.
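
To make the contrast with a Gaussian policy concrete, the following is a minimal sketch of a conditional normalizing flow policy. It is not the architecture used in the paper, which the abstract does not specify. It assumes a scalar state, a two-dimensional action, and two affine coupling layers; the `conditioner` function, the weights in `LAYERS`, and all other names are hypothetical and chosen only for illustration. The sketch shows the two operations such a policy needs: sampling an action by pushing Gaussian noise through state-dependent invertible maps, and evaluating the exact log-density of an action through the change-of-variables formula.

```python
import math
import random


def conditioner(state, x, w):
    """Hypothetical tiny conditioner: (state, x) -> (log_scale, shift)."""
    h = math.tanh(w[0] * state + w[1] * x + w[2])
    return w[3] * h, w[4] * h + w[5] * state


def forward(z, state, layers):
    """Push base noise z through the flow; returns (action, log|det J|)."""
    x1, x2 = z
    log_det = 0.0
    for w in layers:
        log_scale, shift = conditioner(state, x1, w)
        x2 = x2 * math.exp(log_scale) + shift  # affine map of x2 given x1 and the state
        log_det += log_scale
        x1, x2 = x2, x1  # swap so the next layer transforms the other coordinate
    return (x1, x2), log_det


def inverse(action, state, layers):
    """Map an action back to base noise; returns (z, log|det J| of forward)."""
    x1, x2 = action
    log_det = 0.0
    for w in reversed(layers):
        x1, x2 = x2, x1  # undo the swap
        log_scale, shift = conditioner(state, x1, w)
        x2 = (x2 - shift) * math.exp(-log_scale)
        log_det += log_scale
    return (x1, x2), log_det


def base_log_prob(z):
    """Log-density of a standard normal in len(z) dimensions."""
    return sum(-0.5 * v * v - 0.5 * math.log(2.0 * math.pi) for v in z)


def sample_action(state, layers, rng):
    """Draw base noise and transform it into an action conditioned on the state."""
    z = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    action, _ = forward(z, state, layers)
    return action, z


def log_prob(action, state, layers):
    """Exact log pi(action | state) via the change-of-variables formula."""
    z, log_det = inverse(action, state, layers)
    return base_log_prob(z) - log_det


# Hypothetical fixed weights; a real policy would learn these.
LAYERS = [
    [0.5, -1.0, 0.1, 0.8, 1.5, 0.3],
    [-0.7, 0.9, 0.0, 0.6, -1.2, 0.2],
]

if __name__ == "__main__":
    rng = random.Random(0)
    action, _ = sample_action(0.4, LAYERS, rng)
    print(action, log_prob(action, 0.4, LAYERS))
```

Because every layer is invertible with a triangular Jacobian, the density stays exact however many layers are stacked, whereas a single Gaussian head can only ever produce one mode per state. How the paper connects such a policy to Discriminator Actor Critic is not described in the abstract, so that part is left out here.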

Pages 22-29
DOI 10.1109/itsc48978.2021.9564456
Language English
Journal 2021 IEEE International Intelligent Transportation Systems Conference (ITSC)
