Enhancing Blind Video Quality Assessment with Rich Quality-aware Features

14 May 2024 | Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai
This paper presents a method to enhance blind video quality assessment (BVQA) models for social media videos by incorporating rich quality-aware features. The proposed model builds on SimpleVQA, which comprises a trainable spatial quality module and a fixed temporal quality module. To improve performance, the model integrates features from three pre-trained models: LIQE, Q-Align, and FAST-VQA. These features capture frame-level quality-aware information, scene-specific information, and spatiotemporal quality-aware information, respectively. The features are concatenated and regressed into video quality scores by a multi-layer perceptron (MLP) network. The model achieves the best performance on three public social media VQA datasets and won first place in the CVPR NTIRE 2024 Short-form UGC Video Quality Assessment Challenge.

The core contributions include enhancing the SimpleVQA framework with three types of quality-aware pre-trained features, improving the model's robustness and generalization, and using a multi-head self-attention (MHSA) module to capture the salient frame regions that influence visual quality. The model demonstrates superior performance in handling the complex distortions and diverse content of social media videos.
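The fusion step described above (concatenate quality-aware features, then regress to a score with an MLP) is straightforward to sketch. The snippet below is a minimal illustration under stated assumptions, not the authors' released code: the class name `QualityFusionMLP`, the feature dimensions, and the placeholder tensors standing in for the frozen LIQE / Q-Align / FAST-VQA extractors are all hypothetical choices made for readability.

```python
import torch
import torch.nn as nn

class QualityFusionMLP(nn.Module):
    """Regress concatenated quality-aware features to one quality score.

    Feature dimensions here are illustrative placeholders, not the
    paper's actual values.
    """
    def __init__(self, in_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, 1),  # scalar video quality score
        )

    def forward(self, *feature_groups: torch.Tensor) -> torch.Tensor:
        # Concatenate per-video features from each extractor branch.
        fused = torch.cat(feature_groups, dim=-1)
        return self.mlp(fused).squeeze(-1)

# Stand-ins for the trainable spatial-module features and the frozen
# LIQE / Q-Align / FAST-VQA features (batch of 4 videos; dims assumed).
spatial_feats = torch.randn(4, 256)
liqe_feats    = torch.randn(4, 495)
qalign_feats  = torch.randn(4, 128)
fastvqa_feats = torch.randn(4, 768)

model = QualityFusionMLP(in_dim=256 + 495 + 128 + 768)
scores = model(spatial_feats, liqe_feats, qalign_feats, fastvqa_feats)
print(scores.shape)  # torch.Size([4])
```

In the actual model the three pre-trained branches stay fixed while the spatial quality module and the MLP regressor are trained, so the fused representation combines learned, dataset-specific cues with general quality priors.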