In-context learning enables multimodal large language models to classify cancer pathology images


12 Mar 2024 | Dyke Ferber, Georg Wölflein, Isabella C. Wiest, Marta Ligero, Srividhya Sainath, Narmin Ghaffari Laleh, Omar S.M. El Nahhas, Gustav Müller-Franzes, Dirk Jäger, Daniel Truhn, Jakob Nikolas Kather
This study evaluates the effectiveness of in-context learning with multimodal large language models (LLMs) for classifying cancer pathology images. The researchers used the Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) to perform in-context learning on three critical histopathology tasks: tissue subtype classification in colorectal cancer, colon polyp subtyping, and breast tumor detection in lymph node sections. The results show that GPT-4V can match or even surpass the performance of specialized neural networks trained for these tasks, using only a minimal number of samples. This demonstrates that large vision language models trained on non-domain-specific data can be effectively applied to medical image processing tasks, democratizing access to generalist AI models for medical experts, especially in areas where annotated data is scarce.
The study highlights the potential of in-context learning for improving the performance of foundation vision models and suggests that it may be a more efficient and scalable approach than traditional deep learning models.
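The in-context learning setup described here can be sketched as a chat-style prompt that interleaves a few labeled example tiles with the unlabeled query tile. Below is a minimal sketch in Python, assuming an OpenAI-style multimodal message format; the function names, prompt wording, and label set are illustrative, not the authors' exact pipeline.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read an image file and return it as a base64 data URL (JPEG assumed)."""
    data = base64.b64encode(Path(path).read_bytes()).decode("utf-8")
    return f"data:image/jpeg;base64,{data}"


def build_few_shot_prompt(examples, query_path, labels):
    """Assemble a chat-style message list for in-context classification.

    `examples` is a list of (image_path, label) pairs shown as demonstrations;
    the query tile is appended last without a label.
    """
    content = [{
        "type": "text",
        "text": ("Classify the final histopathology tile into one of: "
                 + ", ".join(labels) + ". Labeled examples follow."),
    }]
    for path, label in examples:
        content.append({"type": "image_url",
                        "image_url": {"url": encode_image(path)}})
        content.append({"type": "text", "text": f"Label: {label}"})
    # The unlabeled query tile the model is asked to classify.
    content.append({"type": "image_url",
                    "image_url": {"url": encode_image(query_path)}})
    content.append({"type": "text", "text": "Label:"})
    return [{"role": "user", "content": content}]
```

The returned message list could then be passed to a vision-capable chat completion endpoint; varying the number of (image, label) pairs is what lets one probe zero-shot versus few-shot performance without any model fine-tuning.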