IEEE Transactions on Multimedia | 2019

Deep Alignment Network Based Multi-Person Tracking With Occlusion and Motion Reasoning


Abstract


Tracking-by-detection is one of the typical paradigms for multi-person tracking, owing to the availability of automatic pedestrian detectors. However, existing multi-person trackers are greatly challenged by occlusion and by misalignment in the pedestrian detections (i.e., excessive background and missing body parts). To handle these problems effectively, we propose a deep alignment network-based multi-person tracking method with occlusion and motion reasoning. Specifically, the inaccurate detections are first corrected via a deep alignment network, in which an alignment estimation module automatically learns the spatial transformation of these detections. As a result, the deep features from our alignment network have better representation power and thus lead to more consistent tracks. Then, a coarse-to-fine scheme is designed for constructing a discriminative association cost matrix from spatial, motion, and appearance information. Meanwhile, a principled approach is developed that allows our method to handle occlusion using motion reasoning and the re-identification ability of the pedestrian alignment network. Finally, the association problem is solved with the Hungarian algorithm, which is simple and runs in real time. Comprehensive experiments on the MOT16, ISSIA soccer, PETS09, and TUD datasets validate the effectiveness and robustness of our proposed tracker.
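
The association step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the IoU gate, the cost weights, the constant-velocity prediction, the dictionary layout of tracks and detections, and all function names are assumptions made for illustration, and the appearance features stand in for the output of the alignment network. A coarse stage rules out pairs with too little spatial overlap, a fine stage combines motion and appearance costs for the remaining pairs, and a standard Hungarian (Kuhn-Munkres) solver finds the minimum-cost assignment.

```python
import math

GATE = 1e6  # assumed cost for pairs rejected by the coarse spatial stage


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(f, g):
    """1 - cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(f, g))
    nf = math.sqrt(sum(x * x for x in f))
    ng = math.sqrt(sum(y * y for y in g))
    return 1.0 - dot / (nf * ng) if nf > 0 and ng > 0 else 1.0


def association_cost(tracks, detections, min_iou=0.1, w_motion=0.5, w_app=0.5):
    """Coarse-to-fine cost matrix (thresholds and weights are hypothetical)."""
    cost = []
    for t in tracks:
        dx, dy = t["velocity"]
        x1, y1, x2, y2 = t["box"]
        predicted = (x1 + dx, y1 + dy, x2 + dx, y2 + dy)  # constant-velocity guess
        row = []
        for d in detections:
            if iou(t["box"], d["box"]) < min_iou:  # coarse: spatial gate
                row.append(GATE)
            else:  # fine: motion + appearance
                motion = 1.0 - iou(predicted, d["box"])
                appearance = cosine_distance(t["feature"], d["feature"])
                row.append(w_motion * motion + w_app * appearance)
        cost.append(row)
    return cost


def hungarian(cost):
    """Minimum-cost assignment on a square matrix; returns (row, col) pairs."""
    n = len(cost)
    inf = float("inf")
    u, v = [0.0] * (n + 1), [0.0] * (n + 1)  # row and column potentials
    p, way = [0] * (n + 1), [0] * (n + 1)    # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [inf] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:  # flip the augmenting path back to the new row
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, n + 1)]


def associate(tracks, detections):
    """Return sorted (track_index, detection_index) matches that pass the gate."""
    cost = association_cost(tracks, detections)
    n, m = len(tracks), len(detections)
    size = max(n, m)
    if size == 0:
        return []
    # Pad to a square matrix so unmatched tracks or detections absorb GATE cost.
    square = [[cost[i][j] if i < n and j < m else GATE for j in range(size)]
              for i in range(size)]
    return sorted((i, j) for i, j in hungarian(square)
                  if i < n and j < m and square[i][j] < GATE)


tracks = [
    {"box": (0, 0, 10, 10), "velocity": (0, 0), "feature": [1.0, 0.0]},
    {"box": (100, 100, 110, 110), "velocity": (0, 0), "feature": [0.0, 1.0]},
]
detections = [
    {"box": (101, 100, 111, 110), "feature": [0.0, 1.0]},
    {"box": (1, 0, 11, 10), "feature": [1.0, 0.0]},
    {"box": (500, 500, 510, 510), "feature": [1.0, 0.0]},
]
matches = associate(tracks, detections)
```

In this toy input the third detection overlaps no track, so it is left unmatched; in a full tracker such detections would typically start new tracks, while unmatched tracks would be passed to occlusion handling.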

Volume 21
Pages 1183-1194
DOI 10.1109/TMM.2018.2875360
Language English
Journal IEEE Transactions on Multimedia
