2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) | 2021

Reinforcement Learning Based Green Rate-Constrained UAV Trajectory and User Association Design for IoT Networks


Abstract


In this paper, we propose an energy-efficient unmanned aerial vehicle (UAV) assisted Internet of Things (IoT) network in which a low-altitude UAV is employed as a mobile data collector. We develop a novel optimization framework that minimizes the total energy consumption of all devices by jointly optimizing the UAV’s trajectory, the device association, and the respective transmit power allocation at every time slot, while ensuring that every device satisfies a given transmission rate constraint. As this joint optimization problem is nonconvex and combinatorial, we adopt a reinforcement learning (RL) based solution methodology that effectively decouples it into three individual optimization problems. The formulated problem is transformed into a Markov decision process (MDP) in which the UAV learns its trajectory according to its current state and corresponding action, aiming to maximize the reward under the current policy. Finally, we apply state-action-reward-state-action (SARSA), a low-complexity iterative algorithm, to update the current policy in the case of randomly deployed IoT devices; numerical results show that it achieves a good tradeoff between computational complexity and optimality. We find that the proposed methodology reduces the total energy consumption of all devices by 9.23%, 14.06%, and 15.87% when the UAV has 80, 100, and 120 available time slots, respectively.
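
For readers unfamiliar with SARSA, the following is a minimal tabular sketch of the on-policy update the abstract refers to. It is not the authors' implementation: the grid size, the action set, the device positions, the reward (negative squared distance to the nearest device, used as a crude stand-in for device transmit energy), and all function names are assumptions made only for illustration. The abstract does not specify the paper's actual state, action, or reward definitions.

```python
import random

# Hypothetical toy setup (not from the paper): the UAV moves on a small grid,
# and the reward is the negative squared distance to the nearest device, a
# rough proxy for the transmit energy a device would need.
GRID = 5
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four moves plus hover
DEVICES = [(0, 4), (4, 0), (4, 4)]  # assumed device positions


def step(state, action):
    """Move the UAV, clipping to the grid, and return (next_state, reward)."""
    x = min(max(state[0] + action[0], 0), GRID - 1)
    y = min(max(state[1] + action[1], 0), GRID - 1)
    nearest = min((x - dx) ** 2 + (y - dy) ** 2 for dx, dy in DEVICES)
    return (x, y), -float(nearest)


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = [q.get((state, a), 0.0) for a in range(len(ACTIONS))]
    return values.index(max(values))


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """SARSA rule: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    old = q.get((s, a), 0.0)
    target = r + gamma * q.get((s_next, a_next), 0.0)
    q[(s, a)] = old + alpha * (target - old)
    return q[(s, a)]


def train(episodes=200, slots=20, epsilon=0.1, seed=0):
    """Run tabular SARSA with a fixed number of time slots per episode."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        s = (0, 0)
        a = epsilon_greedy(q, s, epsilon, rng)
        for _ in range(slots):
            s_next, r = step(s, ACTIONS[a])
            # The next action is chosen by the same policy before updating,
            # which is what makes SARSA on-policy.
            a_next = epsilon_greedy(q, s_next, epsilon, rng)
            sarsa_update(q, s, a, r, s_next, a_next)
            s, a = s_next, a_next
    return q
```

Calling `train()` returns a dictionary mapping `(state, action_index)` pairs to learned values. In this sketch the number of slots per episode plays the role of the UAV's available time slots, although the toy reward ignores the rate constraint, device association, and power allocation that the paper optimizes jointly.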

Pages 1345-1350
DOI 10.1109/pimrc50174.2021.9569434
Language English
Journal 2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC)
