2019 16th Conference on Computer and Robot Vision (CRV) | 2019

Traffic Risk Assessment: A Two-Stream Approach Using Dynamic-Attention


Abstract


This research addresses traffic risk assessment on visual scenes captured by outward-facing dashcam videos. A two-stream dynamic-attention recurrent convolutional neural architecture assigns a categorical risk level to each frame of an input video sequence. The two-stream approach consists of a spatial stream, which analyzes individual video frames and computes high-level appearance features, and a temporal stream, which analyzes optical flow between adjacent frames and computes high-level motion features. The features from each stream are then fed into a respective recurrent neural network (RNN) that explicitly models the sequence of features over time. A dynamic-attention mechanism is added that allows the network to learn to focus on relevant objects in the visual scene. These objects are detected by a state-of-the-art object detector and correspond to vehicles, pedestrians, traffic signs, etc. The dynamic-attention mechanism not only improves classification performance but also provides a way to visualize what the network attends to when predicting a risk level; in this way the network implicitly learns to focus on hazardous objects. Additionally, this research introduces an offline model and an online model that differ slightly in their implementations. The offline model analyzes the complete video sequence and achieves a classification accuracy of 84.89%. The online model processes a continuous, unbounded stream of data and produces results in near real time (7 frames per second), at the cost of a lower classification accuracy (79.90%).
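
As a rough illustration of the attention step described in the abstract, the sketch below weights the feature vectors of detected objects by their relevance to a recurrent state and pools them into one context vector. It is not the paper's implementation: the dot-product scoring, the function names `softmax` and `attend`, and the toy feature values are all assumptions made for this example, and the actual model may score objects with a learned function.

```python
import math


def softmax(scores):
    """Turn raw relevance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(object_features, hidden_state):
    """Pool per-object features into one context vector.

    object_features: one feature vector per detected object (hypothetical).
    hidden_state: current recurrent state, same length as each feature vector.
    """
    # Assumed scoring rule: dot product between object feature and state.
    scores = [
        sum(f * h for f, h in zip(feat, hidden_state))
        for feat in object_features
    ]
    weights = softmax(scores)
    dim = len(hidden_state)
    # Weighted sum of object features, one component at a time.
    context = [
        sum(w * feat[i] for w, feat in zip(weights, object_features))
        for i in range(dim)
    ]
    return weights, context


# Toy data: three detected objects with 2-dimensional features.
objects = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
state = [2.0, 0.0]
weights, context = attend(objects, state)
```

The returned weights are what an attention visualization would display: objects with larger weights contribute more to the risk prediction for that frame.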

Pages 166-173
DOI 10.1109/CRV.2019.00030
Language English
Journal 2019 16th Conference on Computer and Robot Vision (CRV)
