2021 Picture Coding Symposium (PCS) | 2021

Rate-Distortion-Time Cost Aware CNN Training for Fast VVC Intra-Picture Partitioning Decisions


Abstract


This paper presents a new method for fast VVC intra-picture encoding using a CNN. The CNN operates on the original samples of $\mathbf{32}\times\mathbf{32}$ blocks. Given a current block, it derives a parameter pair for each of the block's multi-type trees (MTTs), which are nested in quad-tree (QT) nodes. The parameter pairs constrain the minimum width and height of the sub-blocks in their MTTs. This enables the CNN to control the number of tested MTT splits with fine granularity. To skip modes while maintaining the rate-distortion (RD) performance, we train the CNN considering the Lagrangian rate-distortion-time (RDT) cost caused by the derived parameters. First, we generate training data by encoding: when reaching a QT node in a $\mathbf{32}\times\mathbf{32}$ block, we encode the associated MTT with varying parameter pair values and record the resulting RD and time costs. Then, when the CNN outputs parameters during training, we estimate the related RDT cost of the $\mathbf{32}\times\mathbf{32}$ block using the recorded data. For this, we model the dependency between the RDT cost and the parameters by emulating the encoder's RD optimization process. This way, we train the CNN while considering the RDT cost with an accuracy that is sufficient to outperform existing approaches. The approach achieves an encoding time reduction of 50% with a bit rate increase of only 0.7% for VTM-10.2.
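
The exact form of the Lagrangian RDT cost is not spelled out in the abstract; a plausible formulation, which extends the standard rate-distortion Lagrangian with a time term weighted by a hypothetical multiplier $\lambda_{T}$, is

$$J_{\mathrm{RDT}} = D + \lambda \cdot R + \lambda_{T} \cdot T,$$

where $D$ denotes the distortion, $R$ the rate, and $T$ the encoding time recorded for a $\mathbf{32}\times\mathbf{32}$ block under a given parameter pair, and $\lambda$ is the encoder's RD Lagrange multiplier. Under this reading, the CNN is trained to favor parameter pairs whose estimated $J_{\mathrm{RDT}}$, computed from the recorded per-pair RD and time costs, is small, so that encoding-time savings are traded off against bit-rate loss.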

Pages 1-5
DOI 10.1109/PCS50896.2021.9477452
Language English
Journal 2021 Picture Coding Symposium (PCS)
