2021 IEEE Winter Conference on Applications of Computer Vision (WACV) | 2021

Understanding the impact of mistakes on background regions in crowd counting

Abstract


In crowd counting we often observe wrong predictions on image regions that do not contain any person. But how often do these mistakes happen, and how much do they affect overall performance? In this paper we study this problem in depth and present an extensive analysis on five of the most important crowd counting datasets. The analysis has two parts. First, we quantify the number of mistakes made. Our results show that (i) mistakes on background are substantial and are responsible for 18-49% of the total error, (ii) models do not generalize well to different kinds of backgrounds and perform poorly on images that are entirely background, and (iii) models make many more mistakes than those captured by the standard Mean Absolute Error (MAE) metric, because counting on background compensates substantially for misses on foreground. Second, we quantify the performance gained by helping the model deal with this problem. We enrich a popular crowd counting network with a segmentation branch trained to suppress background predictions. This simple addition (i) reduces background error by 10-83%, (ii) reduces foreground error by up to 26%, and (iii) improves overall crowd counting performance by up to 20%. When compared against the literature, this simple technique achieves very competitive results on all datasets, showing the importance of tackling the background problem.
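
The compensation effect in point (iii) can be illustrated with a toy calculation. The sketch below is not the paper's evaluation code: it assumes predictions and ground truth are given as density maps whose sums are counts, and that a boolean foreground mask is available. The function name `split_count_error` and the mask are hypothetical, introduced only for illustration.

```python
import numpy as np

def split_count_error(pred_density, gt_density, fg_mask):
    """Split one image's counting error into foreground and background parts.

    Assumption: fg_mask is a boolean array marking crowd (foreground) pixels;
    how such a mask is obtained is not specified here.
    """
    pred = np.asarray(pred_density, dtype=float)
    gt = np.asarray(gt_density, dtype=float)
    fg = np.asarray(fg_mask, dtype=bool)

    # Signed count differences inside and outside the foreground mask.
    fg_signed = pred[fg].sum() - gt[fg].sum()
    bg_signed = pred[~fg].sum() - gt[~fg].sum()

    return {
        "foreground": abs(fg_signed),
        "background": abs(bg_signed),
        # The usual per-image absolute error only sees the sum of both parts.
        "overall": abs(fg_signed + bg_signed),
    }

# Toy 2x2 image: 10 people in the top row, empty background in the bottom row.
gt = np.array([[5.0, 5.0], [0.0, 0.0]])
# The model misses 3 people on foreground and counts 3 on background.
pred = np.array([[4.0, 3.0], [2.0, 1.0]])
mask = np.array([[True, True], [False, False]])

errors = split_count_error(pred, gt, mask)
print(errors)
```

In this toy case the overall absolute error is zero even though the foreground and background errors are each 3, which is the kind of cancellation a per-image absolute count error cannot reveal.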

Pages 1649-1658
DOI 10.1109/WACV48630.2021.00169
Language English
Journal 2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
