Consistency-diversity-realism Pareto fronts of conditional image generative models

14 Jun 2024 | Pietro Astolfi, Marlene Carell, Melissa Hall, Oscar Mañas, Matthew Muckley, Jakob Verbeek, Adriana Romero-Soriano, Michal Drozdzal
The paper "Consistency-Diversity-Realism Pareto Fronts of Conditional Image Generative Models" by Pietro Astolfi et al. examines the trade-offs between consistency, diversity, and realism in conditional image generative models, with the aim of evaluating their potential as world models. The authors take state-of-the-art text-to-image (T2I) and image-and-text-to-image (I-T2I) models, vary their inference-time knobs, and draw the resulting Pareto fronts, which give a holistic view of the multi-objective optimization problem. They find that recent models excel in consistency and realism but sacrifice representation diversity, whereas older models, such as LDM$_{1.5}$ and LDM$_{2.1}$, are better at maintaining it. On geographically diverse data, the oldest model, LDM$_{1.5}$, outperforms more recent models on all axes of evaluation, and performance differs significantly across geographical regions. The study concludes that there is no single best model and that the choice of model should be determined by the specific downstream application. The authors invite the research community to use Pareto fronts as an analytical tool for measuring progress towards world models.
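To make the notion of a Pareto front concrete, here is a minimal sketch of how non-dominated points can be selected from a set of (consistency, diversity, realism) scores. It is not the authors' code: the function names `dominates` and `pareto_front`, the configuration labels, and all score values are hypothetical and chosen only for illustration, and it assumes every metric is oriented so that higher is better.

```python
from typing import Dict, List, Tuple

# (consistency, diversity, realism); assumed convention: higher is better on every axis.
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if `a` is at least as good as `b` on every axis and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: Dict[str, Scores]) -> List[str]:
    """Return the names of all points that no other point dominates."""
    return [
        name
        for name, scores in points.items()
        if not any(
            dominates(other_scores, scores)
            for other_name, other_scores in points.items()
            if other_name != name
        )
    ]


# Hypothetical scores for four model/knob configurations (not taken from the paper).
points = {
    "config_a": (0.80, 0.30, 0.70),  # strong consistency and realism, low diversity
    "config_b": (0.60, 0.60, 0.50),  # balanced
    "config_c": (0.55, 0.55, 0.45),  # worse than config_b on every axis
    "config_d": (0.40, 0.80, 0.40),  # high diversity, weaker elsewhere
}

front = pareto_front(points)
print(front)
```

In this toy example `config_c` is dropped because `config_b` beats it on every axis, while the other three remain: each is best on at least one objective, which mirrors the paper's conclusion that no single model is best overall.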