This study evaluates the image recognition capabilities of GPT-4V, a multimodal large language model (LLM), in the context of the Japanese National Medical Licensing Examination. The researchers tested GPT-4V on 108 questions that included images and compared its performance when presented with both the question text and images versus only the question text. The results showed that GPT-4V's accuracy was 68% when provided with images and 72% without images, with no significant difference between the two conditions (P=.36). For clinical questions, the accuracy was 71% with images and 78% without, while for general questions, it was 30% with images and 20% without. The study concluded that the additional information from images did not significantly improve GPT-4V's performance, suggesting that it currently lacks effective image interpretation skills in the medical domain.
The findings highlight the need for further development of domain-specific multimodal models to enhance the accuracy of LLMs in medical applications.
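
The comparison above is a paired design: the same questions are answered once with images and once without. The summary does not say which statistical test produced the reported P value, so the following is only a minimal sketch of one common choice for paired correct/incorrect outcomes, an exact McNemar test. The function names and the toy answer vectors are hypothetical and are not the study's data.

```python
from math import comb


def accuracy(correct_flags):
    """Fraction of questions answered correctly (flags are 1 or 0)."""
    return sum(correct_flags) / len(correct_flags)


def mcnemar_exact(with_images, text_only):
    """Two-sided exact McNemar test on paired correct/incorrect outcomes.

    Only discordant pairs (correct in one condition, wrong in the other)
    carry information about a difference between the two conditions.
    """
    if len(with_images) != len(text_only):
        raise ValueError("both conditions must cover the same questions")

    # b: correct only with images; c: correct only with text alone.
    b = sum(1 for w, t in zip(with_images, text_only) if w and not t)
    c = sum(1 for w, t in zip(with_images, text_only) if t and not w)
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference

    # Under the null hypothesis, discordant pairs split 50/50 (binomial).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data for 8 questions (1 = correct, 0 = incorrect).
toy_with_images = [1, 1, 0, 1, 0, 1, 1, 0]
toy_text_only = [1, 1, 1, 1, 1, 1, 0, 1]

print(accuracy(toy_with_images), accuracy(toy_text_only))
print(mcnemar_exact(toy_with_images, toy_text_only))
```

A paired test is used in this sketch because both conditions score the same questions, so the two accuracy figures are not independent samples.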