Deep Portrait Quality Assessment. A NTIRE 2024 Challenge Survey

17 Apr 2024 | Nicolas Chahine, Marcos V. Conde, Daniela Carfora, Gabriel Pacianotto, Benoit Pochon, Sira Ferradans, Radu Timofte, Yipo Huang, Quan Yuan, Xiangfei Sheng, Zhichao Yang, Leida Li, Fangyuan Kong, Yifang Xu, Wei Sun, Weixia Zhang, Yanwei Jiang, Haoning Wu, Zicheng Zhang, Jun Jia, Yingjie Zhou, Zhongpeng Ji, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Xiaoqi Wang, Junqi Liu, Zixi Guo, Juan Wang, Bing Li, Xinrui Xu, Zewen Chen
The NTIRE 2024 Portrait Quality Assessment Challenge aimed to develop efficient deep neural networks for estimating the perceptual quality of real portrait photos. Models were required to generalize across diverse scenes, lighting conditions, and challenging scenarios such as movement and blur. A total of 140 participants registered, and 35 submitted results; the top 5 submissions were analyzed to assess the current state of the art in portrait quality assessment.

The challenge used the public PIQ23 dataset and a private portrait dataset to evaluate models on overall quality and on generalization. PIQ23 consists of 50 scenes covering diverse skin tones and conditions that are challenging for smartphone cameras. Each scene was annotated on three image quality attributes using pairwise comparisons, yielding roughly 600k comparisons from 30 experts.
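To illustrate how pairwise preferences like those above can be turned into per-image quality scores, here is a minimal Bradley-Terry sketch in NumPy. This is a generic maximum-likelihood scaling method shown for illustration only; PIQ23's actual psychometric scaling procedure may differ.

```python
import numpy as np

def bradley_terry_scores(wins, n_iters=200, eps=1e-9):
    """Estimate latent quality scores from a pairwise win-count matrix.

    wins[i, j] = number of times image i was preferred over image j.
    Returns log-strengths normalized to zero mean, using the standard
    Bradley-Terry MM update (Hunter, 2004).
    """
    n = wins.shape[0]
    p = np.ones(n)                 # strengths, initialized uniformly
    total = wins + wins.T          # total comparisons per pair (diag = 0)
    for _ in range(n_iters):
        w = wins.sum(axis=1)       # total wins of each image
        denom = (total / (p[:, None] + p[None, :] + eps)).sum(axis=1)
        p = w / (denom + eps)
        p /= p.sum()               # fix the scale (BT is scale-invariant)
    log_p = np.log(p + eps)
    return log_p - log_p.mean()

# Toy usage: 3 images, image 0 usually preferred.
wins = np.array([[0, 8, 9],
                 [2, 0, 6],
                 [1, 4, 0]], dtype=float)
print(bradley_terry_scores(wins))
```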
The challenge introduced the PIQ benchmark, which evaluates models on overall quality and on generalization. The testing set comprised 96 single-person scenes of 7 images each, captured with high-quality smartphones and a DSLR camera. Participants did not have access to the generalization test set; results were obtained by executing the submitted models. Evaluation focused on the overall quality attribute and on a "generalization split" that measures how well models generalize beyond the training scenes.

Several novel frameworks were proposed to address the shortcomings of current portrait quality assessment methods, particularly in handling domain shifts and ensuring generalizability. RQ-Net uses a global and local quality perception approach with a scale-shift invariant loss and pre-training on mixed multi-source data. BDVQA uses a ranking-based vision transformer with a merged ranking loss and test-time augmentation. PQE analyzes facial and full-image characteristics to assess portrait quality. MoNet uses a mean-opinion network that collects diverse opinion features and fuses them into a comprehensive quality score. SECE-SYSU uses a scene-adaptive global-context and local facial-perception network that adaptively selects scene-specific regressors.

The results showed that models struggled with the quality domain gap: performance depends heavily on the device used to capture the data. The top methods demonstrated varying degrees of success in generalization and robustness, with some achieving high correlation metrics. Overall, the challenge highlighted the importance of scene-specific semantics and domain adaptation in portrait quality assessment.
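RQ-Net is described above as using a scale-shift invariant loss. The following is a minimal sketch of such an objective, assuming a closed-form per-batch least-squares scale and shift (as in MiDaS-style alignment); it is not the team's exact formulation.

```python
import torch

def scale_shift_invariant_loss(pred, target, eps=1e-8):
    """MSE after optimally aligning predictions to targets with a
    single scale a and shift b, solved in closed form per batch."""
    pred, target = pred.flatten(), target.flatten()
    # Solve min_{a,b} ||a * pred + b - target||^2.
    p_mean, t_mean = pred.mean(), target.mean()
    p_c, t_c = pred - p_mean, target - t_mean
    a = (p_c * t_c).sum() / (p_c.pow(2).sum() + eps)
    b = t_mean - a * p_mean
    return (a * pred + b - target).pow(2).mean()

# Toy usage: predictions on an arbitrary scale incur ~zero loss
# as long as they space the images like the targets do.
target = torch.tensor([1.0, 2.0, 3.0, 4.0])
pred = 10.0 * target + 5.0
print(scale_shift_invariant_loss(pred, target))  # ~0
```

The point of the invariance is that the model is penalized only for mis-ordering or mis-spacing images, not for predicting on a different numeric scale than the ground-truth scores.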
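BDVQA's merged ranking loss is likewise not spelled out in this summary; a generic pairwise hinge ranking loss, sketched below, illustrates the underlying idea of learning from relative quality orderings rather than absolute scores.

```python
import torch

def pairwise_ranking_loss(scores, labels, margin=0.5):
    """Hinge-style ranking loss over all pairs in a batch: the image
    with the higher ground-truth quality should score at least
    `margin` above the other one."""
    ds = scores[:, None] - scores[None, :]   # s_i - s_j
    dl = labels[:, None] - labels[None, :]   # y_i - y_j
    sign = torch.sign(dl)                    # +1 if i should outrank j
    loss = torch.relu(margin - sign * ds)    # hinge on ordered margins
    mask = sign.abs()                        # ignore equal-label pairs
    return (loss * mask).sum() / mask.sum().clamp(min=1)

# Toy usage
scores = torch.tensor([0.2, 0.9, 0.5], requires_grad=True)
labels = torch.tensor([1.0, 3.0, 2.0])
pairwise_ranking_loss(scores, labels).backward()
```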
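Finally, the "correlation metrics" mentioned in the results refer to the standard IQA agreement measures between predicted and ground-truth scores. Assuming the usual choices, SRCC and PLCC, they can be computed with SciPy as follows.

```python
from scipy.stats import pearsonr, spearmanr

def correlation_metrics(pred, mos):
    # Spearman rank-order correlation (SRCC): monotonic agreement.
    srcc = spearmanr(pred, mos)[0]
    # Pearson linear correlation (PLCC): linear agreement.
    plcc = pearsonr(pred, mos)[0]
    return srcc, plcc

pred = [0.31, 0.77, 0.52, 0.90]
mos = [1.2, 3.4, 2.1, 3.9]
print(correlation_metrics(pred, mos))  # SRCC = 1.0, PLCC close to 1
```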