Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

1 Jun 2024 | Jiaming Li, Jiacheng Zhang, Jichang Li, Ge Li, Si Liu, Liang Lin, Guanbin Li
This paper introduces LBP, a framework for open-vocabulary object detection (OVD) that learns background prompts to harness implicit background knowledge, with the aim of improving the detector's ability to recognize both base and novel categories.
The framework consists of three modules: Background Category-specific Prompt (BCP), Background Object Discovery (BOD), and Inference Probability Rectification (IPR). BCP learns category-specific prompts that discover and represent the categories underlying background proposals, so that the diverse implicit knowledge in those proposals is captured. BOD explores the implicit object knowledge related to these categories and exploits it to alleviate model overfitting. IPR rectifies conceptual overlaps between the estimated background categories and novel categories during inference, so that novel categories receive accurate probability scores. The method is evaluated on two benchmark datasets, OV-COCO and OV-LVIS, where it is reported to outperform existing state-of-the-art approaches and to significantly improve detection of both base and novel categories, particularly in scenarios where background knowledge is crucial. The framework is designed to be compatible with existing OVD methods, so it can be integrated into various detection frameworks. By addressing background interpretation and model overfitting, it aims at more accurate and robust object detection in open-vocabulary settings.
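The summary does not give the formulas behind these modules, so the following is only a minimal sketch of the general idea, not the authors' implementation. It scores a region embedding against foreground class embeddings plus several background prompt embeddings, then moves the probability of any background prompt that nearly coincides with a novel class onto that class. The function names, the `temperature` and `overlap_threshold` parameters, and the mass-transfer rule itself are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_region(region, class_embs, background_embs, temperature=0.1):
    """Softmax over foreground classes plus several background prompts.

    Returns (class_probs, background_probs). Using more than one
    background embedding stands in for category-specific background
    prompts; how they are learned is outside this sketch.
    """
    all_embs = class_embs + background_embs
    logits = [cosine(region, e) / temperature for e in all_embs]
    probs = softmax(logits)
    n = len(class_embs)
    return probs[:n], probs[n:]


def rectify(class_probs, bg_probs, class_embs, background_embs,
            novel_idx, overlap_threshold=0.9):
    """Hypothetical rectification rule (an assumption, not from the paper).

    If a background prompt is very similar to a novel class embedding,
    its probability mass is moved to that novel class at inference.
    """
    class_probs = list(class_probs)
    bg_probs = list(bg_probs)
    for j, bg in enumerate(background_embs):
        best = max(novel_idx, key=lambda i: cosine(bg, class_embs[i]))
        if cosine(bg, class_embs[best]) >= overlap_threshold:
            class_probs[best] += bg_probs[j]
            bg_probs[j] = 0.0
    return class_probs, bg_probs
```

In this sketch the total probability is unchanged by rectification; only mass that a background prompt took from an overlapping novel class is handed back to that class.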