20 Mar 2024 | Yibo Wang, Ruiyuan Gao, Kai Chen, Kaiqiang Zhou, Yingjie Cai, Lanqing Hong, Zhenguo Li, Lihui Jiang, Dit-Yan Yeung, Qiang Xu, Kai Zhang
DetDiffusion is a novel framework that synergizes generative and perceptive models to enhance both data generation and perception. It introduces a perception-aware loss (P.A. loss), derived from segmentation maps and object masks, and perception-aware attributes (P.A. Attr), extracted from a trained detection model, to improve image generation quality and controllability and to boost downstream perceptive tasks. Evaluated on the COCO-Thing-Stuff dataset, DetDiffusion achieves state-of-the-art results in layout-guided generation and significantly improves detector training: its perception-aware images serve as effective data augmentation, yielding better object detection performance. The framework demonstrates the potential of integrating generative and perceptive models for enhanced data generation and perception.
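The idea of a perception-aware loss can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's exact formulation: a standard diffusion denoising (MSE) loss is reweighted so that pixels inside object masks contribute more, pushing the generator toward regions that matter for perception. The function name, the weighting scheme, and the `lambda_pa` parameter are all assumptions for illustration.

```python
import numpy as np

def perception_aware_loss(noise_pred, noise_true, obj_mask, lambda_pa=2.0):
    """Illustrative sketch of a perception-aware loss (hypothetical form).

    noise_pred, noise_true: arrays of the same shape (predicted vs. true noise).
    obj_mask: binary array, 1 inside annotated object regions, 0 elsewhere.
    lambda_pa: extra weight given to object-mask pixels (assumed hyperparameter).
    """
    # Per-pixel squared error: the standard diffusion denoising objective.
    err = (noise_pred - noise_true) ** 2
    # Up-weight pixels covered by object masks derived from annotations.
    weights = 1.0 + lambda_pa * obj_mask
    return float((weights * err).mean())

# With a uniform error of 1, an all-zero mask gives the plain MSE,
# while an all-one mask scales it by (1 + lambda_pa).
pred = np.ones((4, 4))
true = np.zeros((4, 4))
loss_bg = perception_aware_loss(pred, true, np.zeros((4, 4)))   # 1.0
loss_fg = perception_aware_loss(pred, true, np.ones((4, 4)))    # 3.0
```

In the actual framework the weighting signal comes from segmentation maps and object masks rather than a single binary mask, but the principle is the same: errors in perception-relevant regions are penalized more heavily.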