Adaptive Image Quality Assessment via Teaching Large Multimodal Model to Compare

29 May 2024 | Hanwei Zhu, Haoning Wu, Xixuan Li, Zicheng Zhang, Baoliang Chen, Lingyu Zhu, Yuming Fang, Guangtao Zhai, Weisi Lin, Shiqi Wang
The paper introduces Compare2Score, a no-reference image quality assessment (NR-IQA) model that leverages large multimodal models (LMMs) to bridge the gap between discrete comparative levels and continuous quality scores. The key contributions are:

1. **Training Dataset**: The model generates scaled-up comparative instructions by comparing images within the same IQA dataset, allowing more flexible integration of diverse IQA datasets. This approach simulates subjective testing by posing questions such as, "Compared with the first image, how is the quality of the second image?" Responses fall into five comparative levels: inferior, worse, similar, better, and superior.

2. **Inference Conversion Strategy**: An adaptive soft comparison method translates the discrete comparative levels into continuous quality scores. It computes the likelihood that the input image is preferred over each of several anchor images as a weighted summation of the softmax-transformed log probabilities of the five comparative levels; the quality score is then obtained by maximum a posteriori (MAP) estimation on the resulting probability matrix.

3. **State-of-the-Art Performance**: Extensive experiments on nine IQA datasets validate the effectiveness of Compare2Score. The model consistently outperforms state-of-the-art NR-IQA models across synthetic and realistic distortions, demonstrating enhanced generalization capability. The probability matrix-based inference conversion also significantly improves rating accuracy and extends these improvements to zero-shot general-purpose LMMs.

The paper also examines the choice and number of anchor images, showing that a small set (e.g., one per quality interval) suffices for achieving promising results.
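The inference conversion above can be sketched in a few lines. This is a minimal illustration, not the paper's exact implementation: the level-to-weight mapping (0, 0.25, 0.5, 0.75, 1) and the level ordering are assumptions, and a Thurstone Case V likelihood stands in for the paper's MAP estimation on the full probability matrix.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

# Assumed mapping of the five comparative levels to preference weights,
# ordered: inferior, worse, similar, better, superior.
LEVEL_WEIGHTS = np.array([0.0, 0.25, 0.5, 0.75, 1.0])

def soft_preference(log_probs):
    """Weighted summation of softmax-transformed log probabilities:
    the probability that the test image is preferred over one anchor."""
    p = np.exp(log_probs - np.max(log_probs))  # numerically stable softmax
    p /= p.sum()
    return float(LEVEL_WEIGHTS @ p)

def map_score(prefs, anchor_scores):
    """Recover a continuous quality score from preference probabilities
    against anchors with known scores, by maximizing a Thurstone Case V
    likelihood (a stand-in for the paper's MAP formulation)."""
    prefs = np.asarray(prefs)
    anchors = np.asarray(anchor_scores)

    def neg_log_lik(q):
        p = np.clip(norm.cdf(q - anchors), 1e-6, 1 - 1e-6)
        return -np.sum(prefs * np.log(p) + (1 - prefs) * np.log(1 - p))

    return minimize_scalar(neg_log_lik, bounds=(-10.0, 10.0), method="bounded").x

# Uniform log probabilities over the five levels give a 0.5 preference (a tie),
# and ties against anchors scored 1, 2, 3 place the test image near the middle.
tie = soft_preference(np.zeros(5))
score = map_score([0.5, 0.5, 0.5], [1.0, 2.0, 3.0])
```

Because the softmax weights every level rather than taking the argmax, small shifts in the model's log probabilities translate into smooth changes in the estimated score, which is what makes the discrete-to-continuous conversion work.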
Overall, Compare2Score demonstrates robustness and effectiveness in various IQA contexts, contributing to the broader adoption of responsible AI technologies.