30 April 2024 | Matthew Christensen, Milos Vukadinovic, Neal Yuan, David Ouyang
The paper introduces EchoCLIP, a vision-language foundation model designed specifically for echocardiography, which learns the relationship between cardiac ultrasound images and expert cardiologist interpretations. Trained on 1,032,975 cardiac ultrasound videos and their corresponding expert text, EchoCLIP performs well on a range of benchmarks for cardiac image interpretation, including assessing cardiac function and identifying implanted intracardiac devices. The model is evaluated on both internal and external test datasets to assess its robustness and generalizability. A long-context variant, EchoCLIP-R, is developed to improve text retrieval, which enables identifying unique patients across multiple videos and characterizing clinical changes over time. The study also introduces a saliency mapping approach, PromptCAM, which visualizes the image regions most relevant to a given text prompt. Overall, EchoCLIP represents a significant step toward automating echocardiography interpretation, with the potential to improve access to cardiac imaging and support clinical decision-making.
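As a rough illustration of how a CLIP-style model can match a video to text, the sketch below ranks candidate text prompts by cosine similarity to a video embedding. It is not taken from the paper: the function names `cosine_similarity` and `rank_prompts`, the prompt wording, and the three-dimensional embedding values are all hypothetical stand-ins for what trained image and text encoders would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_prompts(video_embedding, prompt_embeddings):
    """Return prompt labels sorted from most to least similar to the video."""
    scores = {
        label: cosine_similarity(video_embedding, embedding)
        for label, embedding in prompt_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (hypothetical values, not outputs of any real model).
video_embedding = [0.9, 0.1, 0.2]
prompt_embeddings = {
    "pacemaker present": [1.0, 0.0, 0.1],
    "no intracardiac device": [0.0, 1.0, 0.3],
}

ranking = rank_prompts(video_embedding, prompt_embeddings)
print(ranking[0])  # the prompt whose embedding is closest to the video
```

In a real pipeline the embeddings would come from the model's image and text encoders, and the same similarity ranking could in principle be run over a library of report texts for retrieval.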