VRP-SAM: SAM with Visual Reference Prompt

30 Mar 2024 | Yanpeng Sun, Jiahui Chen, Shan Zhang, Xinyu Zhang, Qiang Chen, Gang Zhang, Errui Ding, Jingdong Wang, Zechao Li
The paper introduces VRP-SAM, an extension of the Segment Anything Model (SAM) that incorporates a Visual Reference Prompt (VRP) encoder to enhance its segmentation capabilities. VRP-SAM allows SAM to use annotated reference images as prompts for segmenting specific objects in target images, supporting various annotation formats such as points, boxes, scribbles, and masks. The VRP encoder employs meta-learning to improve the model's generalization and adaptability. Extensive experiments on the Pascal and COCO datasets demonstrate that VRP-SAM achieves state-of-the-art performance in visual reference segmentation with minimal learnable parameters, showing strong generalization capabilities, especially in handling novel objects and cross-domain scenarios. The paper also includes a detailed analysis of the model's components, loss functions, and ablation studies, highlighting the effectiveness of the proposed approach.
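The summary does not describe how the VRP encoder processes the different annotation formats, so the following is only an illustrative sketch of one common approach from few-shot segmentation, not VRP-SAM's actual implementation. It assumes that each annotation (point, box, scribble, or mask) is first rasterized into a binary mask and that the reference feature map is then averaged over that mask to give a single reference vector. The function names `annotation_to_mask` and `masked_average_pool`, the coordinate conventions, and the feature shapes are all hypothetical.

```python
import numpy as np

def annotation_to_mask(kind, data, shape):
    """Rasterize a reference annotation into a boolean mask of shape (H, W).

    Assumed conventions (hypothetical, not from the paper):
      - "mask": data is an (H, W) array-like of 0/1 values
      - "box": data is (x0, y0, x1, y1), with x1 and y1 exclusive
      - "point" / "scribble": data is a list of (x, y) pixel coordinates
    """
    h, w = shape
    if kind == "mask":
        mask = np.asarray(data, dtype=bool)
        if mask.shape != (h, w):
            raise ValueError("mask shape does not match the target shape")
        return mask
    mask = np.zeros((h, w), dtype=bool)
    if kind == "box":
        x0, y0, x1, y1 = data
        mask[y0:y1, x0:x1] = True
    elif kind in ("point", "scribble"):
        for x, y in data:
            mask[y, x] = True
    else:
        raise ValueError(f"unknown annotation kind: {kind}")
    return mask

def masked_average_pool(features, mask):
    """Average a (C, H, W) feature map over the True pixels of an (H, W) mask."""
    weights = mask.astype(features.dtype)
    total = weights.sum()
    if total == 0:
        raise ValueError("annotation covers no pixels")
    # Broadcasting applies the (H, W) mask to every channel.
    return (features * weights).sum(axis=(1, 2)) / total

# Toy reference features: 2 channels on a 4x4 grid.
features = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
box_mask = annotation_to_mask("box", (1, 1, 3, 3), (4, 4))
reference_vector = masked_average_pool(features, box_mask)
```

In this sketch every annotation format ends up as the same kind of object (a binary mask), so whatever consumes the reference vector does not need to know which format the user supplied. How VRP-SAM turns reference information into prompts for SAM is described in the paper itself.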