The Journal of Supercomputing | 2021

An improved DQN path planning algorithm

Abstract


Aiming at the problems of vehicle-model tracking error and over-dependence on prior map information in the traditional path planning of intelligent driving vehicles, a path planning method based on deep reinforcement learning is proposed. First, an abstract model of the real environment is extracted, and a deep reinforcement learning end-to-end (DRL-ETE) strategy is combined with a vehicle dynamics model to train a reinforcement learning model that approaches optimal intelligent driving. Second, the real scene is transferred to the virtual abstract model through a model transfer strategy, and control and trajectory sequences are computed in this environment with the trained deep reinforcement learning model. Finally, the optimal trajectory sequence is selected in the real environment according to an evaluation function.

The experience replay mechanism of the standard Deep Q-Network (DQN) stores transitions in FIFO order and samples them uniformly during training, so replay is inefficient and the vehicle converges slowly toward the target and along the route; in addition, the ε-greedy strategy leaves the exploration of the map incomplete. To address these two problems, an improved DQN with prioritized experience replay (IDQNPER) is proposed. Each sample is assigned a weight when it is stored and is fed to the network for training in priority order; at the same time, important data sequences are retained in the replay buffer while sequences with high similarity are removed. The total reward obtained is about 10% higher than that of the original DQN, showing that the vehicle approaches target points more accurately.

To further realize autonomous decision making and to reduce the heavy reliance on map information in traditional manually designed planning frameworks, an end-to-end path planning method based on deep reinforcement learning is presented, which maps action commands directly from sensor information and issues them to the vehicle. A CNN and an LSTM are used to process camera and radar information, and after comparing DQN, Double DQN, Dueling DQN, and prioritized experience replay (PER), the IDQNPER algorithm is adopted to train automatic path planning. Simulation and verification experiments in a static-obstacle environment show that IDQNPER adapts to intelligent vehicles in different environments: the method handles continuous input states and generates continuous steering-control sequences, which reduces the lateral tracking error, while experience replay improves the generalization of the model and alleviates the over-dependence problem.
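
The abstract gives no code, so the following is only a minimal sketch of the replay-buffer idea it describes: weighted storage, priority-ordered sampling instead of uniform sampling, non-FIFO eviction, and removal of highly similar transitions. The class name `PrioritizedReplayBuffer`, the parameters `alpha` and `similarity_threshold`, and the use of cosine similarity as the similarity measure are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from collections import namedtuple

# Illustrative transition container; the field layout is an assumption.
Transition = namedtuple("Transition", "state action reward next_state done")

class PrioritizedReplayBuffer:
    """Sketch of the IDQNPER buffer idea: every sample carries a weight
    (priority), training batches are drawn according to priority rather
    than uniformly, and transitions too similar to stored ones are
    rejected to keep the buffer diverse."""

    def __init__(self, capacity=10000, alpha=0.6, similarity_threshold=0.99):
        self.capacity = capacity
        self.alpha = alpha                      # how strongly priority biases sampling
        self.similarity_threshold = similarity_threshold
        self.buffer, self.priorities = [], []

    def _too_similar(self, state):
        # Cosine similarity against stored states; the abstract only says
        # "sequences with high similarity are removed", so this concrete
        # measure is an assumption.
        q = state.ravel()
        for t in self.buffer:
            s = t.state.ravel()
            denom = np.linalg.norm(s) * np.linalg.norm(q) + 1e-8
            if float(np.dot(s, q)) / denom > self.similarity_threshold:
                return True
        return False

    def push(self, transition, td_error):
        if self._too_similar(transition.state):
            return
        if len(self.buffer) >= self.capacity:
            # Evict the lowest-priority sample, not the oldest (non-FIFO).
            worst = int(np.argmin(self.priorities))
            self.buffer.pop(worst)
            self.priorities.pop(worst)
        self.buffer.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        probs = np.asarray(self.priorities)
        probs = probs / probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        return [self.buffer[i] for i in idx], idx
```

Evicting by lowest priority rather than insertion order, and rejecting near-duplicate states on insertion, are the two deviations from the vanilla DQN buffer that the abstract emphasizes.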

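The abstract also describes processing camera images with a CNN and radar information with an LSTM before mapping the fused features to action commands. Below is a minimal PyTorch sketch under assumed tensor shapes (84×84 grayscale camera frames, radar scans of 32 beams over a short history); the fusion-by-concatenation design and all layer sizes are assumptions, since the abstract does not give the architecture details.

```python
import torch
import torch.nn as nn

class SensorFusionQNet(nn.Module):
    """Sketch: a CNN encodes camera frames, an LSTM encodes a radar
    history, and a fully connected head outputs Q-values for discrete
    steering actions. All shapes and sizes are illustrative."""

    def __init__(self, n_actions=5, radar_beams=32):
        super().__init__()
        self.cnn = nn.Sequential(                   # camera branch, 1x84x84 input
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 256), nn.ReLU(),
        )
        self.lstm = nn.LSTM(input_size=radar_beams,  # radar branch
                            hidden_size=128, batch_first=True)
        self.head = nn.Sequential(                   # fused Q-value head
            nn.Linear(256 + 128, 128), nn.ReLU(),
            nn.Linear(128, n_actions),
        )

    def forward(self, image, radar_seq):
        # image: (B, 1, 84, 84); radar_seq: (B, seq_len, radar_beams)
        img_feat = self.cnn(image)
        _, (h_n, _) = self.lstm(radar_seq)
        fused = torch.cat([img_feat, h_n[-1]], dim=1)
        return self.head(fused)                      # (B, n_actions) Q-values

# Example forward pass with dummy tensors.
net = SensorFusionQNet()
q = net(torch.zeros(2, 1, 84, 84), torch.zeros(2, 8, 32))
print(q.shape)  # torch.Size([2, 5])
```
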
Pages 1-24
DOI 10.1007/s11227-021-03878-2
Language English
Journal The Journal of Supercomputing
