This paper proposes a neural video compression (NVC) codec, DCVC-FM, which addresses two critical challenges in NVC: supporting a wide quality range and handling long prediction chains. The proposed method introduces feature modulation techniques to tackle both challenges.
To support a wide quality range, the paper introduces a learnable quantization scaler together with a uniform quantization-parameter sampling mechanism during training. This lets a single NVC model cover a much larger quality range (11.4 dB) than previous NVCs (3.8 dB). The quantization scaler is trained to adaptively modulate the latent feature, so the NVC can adjust its quality level anywhere within this range. The paper also demonstrates the rate-control capability this enables, which is essential for practical applications.
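The following is a minimal PyTorch sketch of the idea described above: a learnable, qp-dependent scaler that modulates the latent before quantization, with the qp sampled uniformly at training time. The class name, the two-endpoint interpolation scheme, and the qp range are illustrative assumptions, not the exact DCVC-FM implementation.

```python
import torch
import torch.nn as nn


class LearnableQuantScaler(nn.Module):
    """Modulate the latent with a learnable, qp-dependent quantization scaler (sketch)."""

    def __init__(self, channels: int, qp_num: int = 64):
        super().__init__()
        # Learnable per-channel scaler endpoints for the lowest- and highest-rate settings.
        self.scaler_low = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.scaler_high = nn.Parameter(torch.ones(1, channels, 1, 1) * 0.1)
        self.qp_num = qp_num

    def forward(self, latent: torch.Tensor, qp: int) -> torch.Tensor:
        # Interpolate between the endpoints so every integer qp maps to a scaler.
        t = qp / (self.qp_num - 1)
        scaler = (1 - t) * self.scaler_low + t * self.scaler_high
        return latent * scaler  # the scaled latent is then quantized (e.g., rounded)


if __name__ == "__main__":
    mod = LearnableQuantScaler(channels=96)
    latent = torch.randn(1, 96, 16, 16)
    qp = int(torch.randint(0, 64, (1,)))  # uniform qp sampling during training
    y_hat = torch.round(mod(latent, qp))  # straight-through estimator omitted for brevity
```

Because the scaler is a smooth function of qp, any quality level in between the trained endpoints can be requested at inference time, which is what makes the wide-range and rate-control behavior possible with one model.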
To handle long prediction chains, the paper proposes a periodic refresh mechanism for the temporal feature modulation. This mechanism alleviates error propagation across frames in NVC. The paper further shows that the proposed NVC maintains quality over long sequences at a lower bitrate cost than previous NVCs.
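Below is a minimal sketch of a periodic-refresh schedule for the propagated temporal feature, assuming a PyTorch-style decoding loop. The refresh period, the way the feature is reset, and the stand-in codec are illustrative assumptions rather than the paper's exact design.

```python
import torch


def decode_sequence(frames, codec, refresh_period: int = 32):
    """Decode frames while periodically refreshing the propagated temporal feature."""
    temporal_feature = None
    recon_frames = []
    for idx, frame in enumerate(frames):
        if idx % refresh_period == 0:
            # Refresh: drop (or re-derive) the propagated feature so accumulated
            # error stops propagating along a long prediction chain.
            temporal_feature = None
        recon, temporal_feature = codec(frame, temporal_feature)
        recon_frames.append(recon)
    return recon_frames


if __name__ == "__main__":
    def dummy_codec(frame, feat):
        # Stand-in for the real conditional codec: returns the frame as its own
        # "reconstruction" and a simple running temporal feature.
        feat = frame if feat is None else 0.5 * (feat + frame)
        return frame, feat

    frames = [torch.randn(1, 3, 64, 64) for _ in range(100)]
    print(len(decode_sequence(frames, dummy_codec, refresh_period=32)))
```

The key point is that the refresh operates on the propagated feature rather than inserting extra intra frames, so error accumulation is curbed without paying the full bitrate cost of periodic intra coding.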
The proposed DCVC-FM supports both RGB and YUV colorspaces within a single model and enables low-precision inference, which significantly reduces running time and memory cost with minimal impact on compression ratio. Experiments show that DCVC-FM outperforms existing NVCs and traditional codecs in compression ratio and quality. For example, under the intra-period –1 setting, DCVC-FM achieves a 29.7% bitrate saving and a 16% MAC reduction compared to the previous SOTA NVC.
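As a rough illustration of low-precision inference in general (not DCVC-FM's specific integerization pipeline), the sketch below runs a toy model in half precision with PyTorch autocast, assuming a CUDA device is available; the model and tensor shapes are placeholders.

```python
import torch
import torch.nn as nn

# Placeholder network standing in for a codec sub-module.
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
).cuda().eval()

frame = torch.randn(1, 3, 256, 256, device="cuda")

with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
    recon = model(frame)  # convolutions run in fp16, cutting time and memory

print(recon.dtype)  # torch.float16
```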
The paper also compares DCVC-FM with other NVCs and traditional codecs on various datasets, confirming its superior performance. The results suggest that DCVC-FM is a significant milestone in the development of NVC technology. The code for DCVC-FM is available at https://github.com/microsoft/DCVC.