Boosting Neural Representations for Videos with a Conditional Decoder

16 Mar 2024 | Xinjie Zhang, Ren Yang, Dailan He, Xingtong Ge, Tongda Xu, Yan Wang, Hongwei Qin, Jun Zhang
This paper introduces a universal boosting framework for implicit neural representations (INRs) in video processing. The framework enhances the representation capability and convergence speed of existing video INRs by incorporating a conditional decoder built from a temporal-aware affine transform (TAT) module and sinusoidal NeRV-like (SNeRV) blocks. The TAT module aligns intermediate features with the target frame by modulating them with temporal embeddings, while the SNeRV block generates more diverse features and distributes parameters more evenly across the network. For compression, the framework further integrates a consistent entropy minimization technique that keeps the quantization used during training consistent with that used at inference, improving rate-distortion performance. Experiments on the UVG dataset show that the boosted methods significantly outperform the baseline INRs in video regression, compression, inpainting, and interpolation, demonstrating that the proposed techniques improve both the efficiency and the quality of INR-based video processing.
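To make the two decoder components concrete, below is a minimal PyTorch-style sketch of how a TAT modulation step and a sinusoidal NeRV-like upsampling block could look. The class names, signatures, and the exact placement of the sine activation are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn as nn

class TATBlock(nn.Module):
    """Sketch of a temporal-aware affine transform (TAT): a temporal
    embedding predicts per-channel scale and shift parameters that
    modulate the decoder's intermediate feature map (assumed design)."""
    def __init__(self, channels: int, embed_dim: int):
        super().__init__()
        # Small projections from the temporal embedding to per-channel
        # affine parameters; layer choices here are illustrative.
        self.to_scale = nn.Linear(embed_dim, channels)
        self.to_shift = nn.Linear(embed_dim, channels)

    def forward(self, feat: torch.Tensor, t_embed: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) intermediate features; t_embed: (B, D).
        scale = self.to_scale(t_embed)[:, :, None, None]
        shift = self.to_shift(t_embed)[:, :, None, None]
        return feat * scale + shift

class SNeRVBlock(nn.Module):
    """Sketch of a sinusoidal NeRV-like block: a NeRV-style
    conv + PixelShuffle upsampling step followed by a sine
    activation instead of the usual GELU (details assumed)."""
    def __init__(self, c_in: int, c_out: int, up: int = 2):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out * up * up, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(up)  # trades channels for spatial size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sin(self.shuffle(self.conv(x)))
```

A conditional decoder stage would then interleave these pieces, e.g. applying a TATBlock to align features for frame index t before each SNeRVBlock upsamples them toward the target resolution.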
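The consistent entropy minimization idea hinges on evaluating the same quantized weights during training as at inference. A common way to approximate this is straight-through rounding, sketched below as a generic illustration; the paper's exact quantization and rate model may differ.

```python
import torch

def ste_quantize(w: torch.Tensor, step: float = 1.0) -> torch.Tensor:
    """Straight-through rounding (generic sketch, not the paper's scheme).

    Forward pass: returns the same quantized weights that would be
    stored and decoded at inference time. Backward pass: rounding is
    treated as identity, so gradients flow to the latent weights,
    keeping the training objective consistent with the inference-time
    representation.
    """
    quantized = torch.round(w / step) * step
    # detach() blocks the gradient of the rounding residual,
    # implementing the straight-through estimator.
    return w + (quantized - w).detach()
```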