February 7, 2024 | Fudan Zheng, Jindong Cao, Weijiang Yu, Zhiguang Chen, Nong Xiao, Yutong Lu
This paper addresses the challenge of low-resource medical image classification by proposing a weakly supervised prompt learning method called *MedPrompt*. The method leverages large-scale pre-trained vision-language models, such as CLIP, to learn transferable representations from unlabeled medical images and texts. *MedPrompt* combines an unsupervised pre-trained vision-language model with a weakly supervised prompt learning model: the former is pre-trained on large-scale medical images and texts, while the latter generates medical prompts automatically using only class labels. This reduces the reliance on domain experts for manual prompt design, making medical image classification more efficient and cost-effective. Experimental results on four benchmark datasets (CheXpert, MIMIC-CXR, COVID, and RSNA) show that *MedPrompt* outperforms hand-crafted prompts in both zero-shot and few-shot settings, demonstrating superior generalization and performance. The proposed prompt generator is lightweight, with only 86,016 parameters and 86,112 FLOPs, making it easy to integrate into various network architectures. Remaining limitations include performance on datasets containing entirely new diseases and applicability to other medical imaging modalities.
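To make the mechanism concrete, here is a minimal sketch of how a label-driven prompt generator of this kind could look. This is a CoOp-style learnable-context illustration under assumed shapes, not the paper's actual architecture; `image_encoder`, `text_encoder`, and `label_embeds` are hypothetical stand-ins for a frozen CLIP-like backbone and the token embeddings of the class names.

```python
import torch
import torch.nn as nn

class PromptGenerator(nn.Module):
    """Sketch of a lightweight, label-driven prompt generator.

    Assumptions: CoOp-style learnable context vectors prepended to the
    class-name token embeddings; all sizes are illustrative, not the
    paper's (which reports only ~86K parameters for its generator).
    """

    def __init__(self, embed_dim: int = 512, n_ctx: int = 16):
        super().__init__()
        # Learnable context tokens stand in for hand-crafted prompt words
        # such as "a photo of a ..." -- the only trained parameters here.
        self.ctx = nn.Parameter(0.02 * torch.randn(n_ctx, embed_dim))

    def forward(self, label_embeds: torch.Tensor) -> torch.Tensor:
        # label_embeds: (n_classes, n_label_tokens, embed_dim), the token
        # embeddings of the class labels -- the only supervision used.
        n_classes = label_embeds.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        # Prepend the learned context to each label to form a full prompt
        # sequence for a frozen text encoder.
        return torch.cat([ctx, label_embeds], dim=1)

# Hypothetical zero-shot scoring with a frozen CLIP-like backbone:
#   prompts    = PromptGenerator()(label_embeds)        # (C, T, D)
#   text_feats = text_encoder(prompts)                  # (C, D), frozen
#   img_feats  = image_encoder(images)                  # (B, D), frozen
#   logits     = 100.0 * img_feats @ text_feats.t()     # cosine scores
```

In such a setup, only the context vectors would be updated during training, which is consistent with the reported parameter count being tiny relative to the backbone.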