2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC) | 2021
Real-Time and Energy-Efficient Inference at GPU-Based Network Edge using PON
Abstract
In recent years, advances in deep learning (DL) technology have greatly improved research and services related to artificial intelligence (AI). In particular, real-time object recognition has become an important technology in smart cities. Achieving it requires low-cost network deployment and low-latency data transfer. In this paper, we focus on inference systems based on the Time- and Wavelength-Division Multiplexed Passive Optical Network (TWDM-PON), which allow cost-efficient networks that accommodate many network cameras. A significant issue for a GPU-based inference system over TWDM-PON is allocating upstream wavelength and bandwidth optimally to enable real-time inference. However, increasing the batch size of data arriving at edge servers while ensuring low-latency transmission has not yet been considered. Therefore, this paper proposes a concept of an inference system in which a large number of cameras periodically upload image data to a GPU-based server via TWDM-PON. We also propose a cooperative wavelength and bandwidth allocation algorithm to ensure low-latency and time-synchronized data arrival at the edge. The performance of the proposed scheme is verified by computer simulation.