Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

29 May 2024 | Hanwei Zhu, Haoning Wu, Yixuan Li, Zicheng Zhang, Baoliang Chen, Lingyu Zhu, Yuming Fang, Guangtao Zhai, Weisi Lin, Shiqi Wang
This paper introduces Compare2Score, a no-reference image quality assessment (IQA) model that teaches large multimodal models (LMMs) to bridge the gap between discrete comparative levels and continuous quality scores. The model is trained on scaled-up comparative instructions derived from comparing images within the same IQA dataset, enabling more flexible integration of diverse IQA datasets than absolute-score supervision allows. During inference, a soft comparison method translates discrete comparative levels into continuous quality scores by computing the likelihood that a test image is preferred over each of several anchor images; the final quality score is then obtained via maximum a posteriori (MAP) estimation over the resulting probability matrix.

Extensive experiments on nine IQA datasets validate that Compare2Score effectively connects the text-defined comparative levels used during training with the converted single-image quality scores used at inference, surpassing state-of-the-art IQA models across diverse scenarios. The probability-matrix-based inference conversion not only improves the rating accuracy of Compare2Score but also boosts zero-shot general-purpose LMMs, suggesting it is effective in its own right. The model generalizes well across synthetic, realistic, and generative distortions, outperforming existing NR-IQA models in prediction accuracy and Spearman rank-order correlation coefficient (SRCC). Overall, the results highlight the soft comparison approach as an effective and efficient way to convert discrete textual responses into continuous quality scores across diverse IQA contexts.
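To make the inference pipeline concrete, below is a minimal sketch of the soft-comparison-to-score conversion described above. The five comparative levels and their numeric weights, the Thurstone-style preference model P(test > anchor_i) = Phi(q - s_i), the Gaussian prior, and all anchor scores are illustrative assumptions for this sketch; the paper's actual probability-matrix construction and MAP formulation may differ in detail.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# Illustrative mapping from five comparative levels to a soft probability
# that the test image is preferred over one anchor (assumed level names).
LEVEL_WEIGHTS = {"inferior": 0.0, "worse": 0.25, "similar": 0.5,
                 "better": 0.75, "superior": 1.0}

def soft_preference(level_probs):
    """Collapse the LMM's distribution over comparative levels into a
    single preference probability for one test-vs-anchor comparison."""
    return sum(LEVEL_WEIGHTS[lvl] * p for lvl, p in level_probs.items())

def map_quality_score(pref_probs, anchor_scores, prior_mu=0.5, prior_sigma=1.0):
    """MAP estimate of the test image's quality q under a Thurstone-style
    model, P(test preferred over anchor i) = Phi(q - s_i), with a Gaussian
    prior on q. A stand-in for the paper's probability-matrix-based MAP
    estimation, not its exact formulation."""
    pref = np.clip(np.asarray(pref_probs, dtype=float), 1e-6, 1.0 - 1e-6)
    s = np.asarray(anchor_scores, dtype=float)

    def neg_log_posterior(q):
        p = norm.cdf(q - s)  # model-predicted preference over each anchor
        log_lik = np.sum(pref * np.log(p) + (1.0 - pref) * np.log(1.0 - p))
        log_prior = -0.5 * ((q - prior_mu) / prior_sigma) ** 2
        return -(log_lik + log_prior)

    return minimize_scalar(neg_log_posterior, bounds=(-5.0, 5.0),
                           method="bounded").x

# Toy usage: LMM-derived level distributions against three anchors with
# hypothetical known quality scores 0.2, 0.5, and 0.8.
level_dists = [
    {"inferior": 0.02, "worse": 0.08, "similar": 0.15, "better": 0.45, "superior": 0.30},
    {"inferior": 0.10, "worse": 0.25, "similar": 0.35, "better": 0.20, "superior": 0.10},
    {"inferior": 0.40, "worse": 0.35, "similar": 0.15, "better": 0.07, "superior": 0.03},
]
prefs = [soft_preference(d) for d in level_dists]
print(map_quality_score(prefs, anchor_scores=[0.2, 0.5, 0.8]))
```

In this toy run the test image is strongly preferred over the low-quality anchor and rarely preferred over the high-quality one, so the MAP estimate lands between the two anchor scores, illustrating how discrete comparative responses are turned into a continuous quality score.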