MICA: Towards Explainable Skin Lesion Diagnosis via Multi-Level Image-Concept Alignment


2024-01-16 | Yequan Bie, Luyang Luo, Hao Chen
This paper proposes MICA, a multi-modal explainable framework for skin disease diagnosis that semantically aligns medical images and clinically related concepts at multiple levels: the global image level, the regional token level, and the concept subspace level. This multi-level alignment enhances both interpretability and performance, permits model intervention, and yields textual and visual explanations in terms of human-interpretable concepts. Experiments on three skin image datasets show that MICA achieves high performance and label efficiency for concept detection and disease diagnosis while preserving model interpretability. The code is available at https://github.com/Tommy-Bie/MICA.

The paper motivates the work with the challenges of black-box deep learning in medical image analysis and the resulting need for Explainable Artificial Intelligence (XAI). Existing concept-based methods often fail to capture the nuanced semantic relationships between image sub-regions and concepts, leaving valuable medical information underutilized. MICA addresses this by aligning medical images and clinically related concepts at the three levels above: a CNN-based image encoder and a large language model-based concept encoder extract semantic visual and textual features, which are then aligned at the image, token, and concept levels so the model learns finer-grained correspondences between images and concepts.

The framework also includes an explainable diagnosis stage in which the model detects concepts before making the final diagnosis, enabling both visual and textual explanations of its decisions. Evaluated on three skin image datasets, MICA achieves superior performance and label efficiency compared with other methods.
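Image-concept alignment of the kind described above is typically trained with a contrastive objective. Below is a minimal image-level sketch using a symmetric InfoNCE-style loss, assuming CLIP-style encoders that produce L2-normalized embeddings; the function names and toy data are illustrative, not taken from the paper's code.

```python
import numpy as np

def info_nce(image_emb, concept_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    image_emb, concept_emb: (batch, dim) arrays, assumed L2-normalized,
    where matched image/concept pairs share the same row index.
    """
    logits = image_emb @ concept_emb.T / temperature  # (batch, batch) similarity matrix
    labels = np.arange(len(logits))                   # matched pair sits on the diagonal

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the image-to-concept and concept-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

# Toy batch: correctly paired embeddings yield a lower loss than shuffled ones.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
aligned = info_nce(emb, emb)
shuffled = info_nce(emb, emb[::-1])
print(aligned < shuffled)
```

The token-level and concept-level objectives in the paper refine this idea at finer granularity (region tokens and concept subspaces), but follow the same pull-matched/push-unmatched pattern.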
The method is also evaluated for explainability using multiple XAI metrics, demonstrating its ability to provide faithful, plausible, and understandable explanations. The results show that MICA can achieve competitive diagnosis results with a small proportion of diagnosis labels, indicating that it effectively learns the correspondences between medical images and clinically related concepts.
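The detect-concepts-then-diagnose stage described above follows the concept-bottleneck pattern, which is also what makes model intervention possible: because the final prediction depends only on the concept layer, a clinician can override a detected concept and the diagnosis updates accordingly. A simplified sketch with made-up concepts and hand-set weights (not the paper's trained model):

```python
import numpy as np

CONCEPTS = ["asymmetry", "irregular_border", "atypical_pigment_network"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def detect_concepts(image_features, w_concept, b_concept):
    """Stage 1: predict per-concept probabilities from image features."""
    return sigmoid(image_features @ w_concept + b_concept)

def diagnose(concept_probs, w_disease, b_disease):
    """Stage 2: classify disease from the concept bottleneck only."""
    return sigmoid(concept_probs @ w_disease + b_disease)

rng = np.random.default_rng(1)
w_concept = rng.normal(size=(8, 3))
b_concept = np.zeros(3)
w_disease = np.array([1.5, 2.0, 2.5])  # each concept pushes toward "malignant"
b_disease = -3.0

features = rng.normal(size=8)
probs = detect_concepts(features, w_concept, b_concept)
risk = diagnose(probs, w_disease, b_disease)

# Intervention: a clinician asserts the border really is irregular; the
# prediction changes because it is computed from concepts alone.
intervened = probs.copy()
intervened[CONCEPTS.index("irregular_border")] = 1.0
risk_after = diagnose(intervened, w_disease, b_disease)
print(risk, risk_after)
```

The textual explanation falls out of this structure for free: the predicted concept probabilities can be read off and reported alongside the diagnosis.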