IEEE Transactions on Parallel and Distributed Systems | 2021

Cuttlefish: Neural Configuration Adaptation for Video Analysis in Live Augmented Reality


Abstract


Instead of relying on remote clouds, today’s Augmented Reality (AR) applications usually send videos to nearby edge servers for analysis (such as object detection) so as to optimize the user’s quality of experience (QoE), which is determined not only by detection latency but also by detection accuracy, playback fluency, etc. Therefore, many studies have been conducted to help adaptively choose the best video configuration, e.g., resolution and frames per second (fps), based on network bandwidth to further improve QoE. However, we notice that the video content itself has a significant impact on configuration selection; e.g., videos with high-speed objects must be encoded with a high fps to meet the user’s fluency requirement. In this article, we aim to adaptively select configurations that match the time-varying network condition as well as the video content. We design Cuttlefish, a system that generates video configuration decisions using reinforcement learning (RL). Cuttlefish trains a neural network model that picks a configuration for the next encoding slot based on observations collected by AR devices. Cuttlefish does not rely on any pre-programmed models or specific assumptions about the environment. Instead, it learns to make configuration decisions solely through observations of the resulting performance of historical decisions. Cuttlefish automatically learns the adaptive configuration policy for diverse AR video streams and obtains a gratifying QoE. We compared Cuttlefish to several state-of-the-art bandwidth-based and velocity-based methods using trace-driven and real-world experiments. The results show that Cuttlefish achieves an 18.4-25.8 percent higher QoE than the others.
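To make the abstract's idea concrete, the following is a minimal sketch (not the authors' implementation) of a neural policy that observes recent network and content statistics and picks a (resolution, fps) configuration for the next encoding slot, with a QoE-style reward combining accuracy, latency, and fluency. The configuration set, observation features, QoE weights, and all function names below are illustrative assumptions.

import torch
import torch.nn as nn
from torch.distributions import Categorical

# Hypothetical discrete configuration space: (resolution, frames per second).
CONFIGS = [(r, f) for r in (360, 480, 720, 1080) for f in (15, 30, 60)]

class ConfigPolicy(nn.Module):
    """Maps an observation vector to a distribution over configurations."""
    def __init__(self, obs_dim: int, num_configs: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_configs),
        )

    def forward(self, obs: torch.Tensor) -> Categorical:
        return Categorical(logits=self.net(obs))

def qoe_reward(accuracy: float, latency_s: float, fluency: float,
               w_acc: float = 1.0, w_lat: float = 0.5, w_flu: float = 0.5) -> float:
    """Illustrative QoE: reward accuracy and fluency, penalize detection latency."""
    return w_acc * accuracy - w_lat * latency_s + w_flu * fluency

# One decision step with made-up observations:
# [recent bandwidth (Mbps), estimated object velocity (px/frame), last latency (s), last accuracy]
obs = torch.tensor([12.0, 8.5, 0.08, 0.74])
policy = ConfigPolicy(obs_dim=4, num_configs=len(CONFIGS))
dist = policy(obs)
action = dist.sample()                    # index into CONFIGS
resolution, fps = CONFIGS[action.item()]  # configuration for the next encoding slot
# After the slot plays out, the measured QoE would serve as the RL reward used to
# update the policy, e.g., with a policy-gradient method.

In this sketch the agent needs no pre-programmed model of the network or the scene: the policy is shaped only by the rewards observed for past configuration choices, which mirrors the learning-from-observation approach the abstract describes.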

Volume 32
Pages 830-841
DOI 10.1109/TPDS.2020.3035044
Language English
Journal IEEE Transactions on Parallel and Distributed Systems
