The paper introduces RadGenome-Chest CT, a comprehensive and large-scale dataset for 3D chest CT interpretation. This dataset is built upon the CT-RATE dataset, which contains 25,692 non-contrast 3D chest CT volumes and reports from 20,000 patients. The main contributions of RadGenome-Chest CT include:
1. **Organ-Level Segmentation Masks**: Segmentation masks for 197 organ-level anatomical categories, covering all critical regions mentioned in clinical CT reports.
2. **Grounded Reports**: 665,000 multi-granularity grounded reports, where each sentence in the report is linked to the corresponding anatomical region in the CT volume.
3. **Grounded VQA Pairs**: 1.3 million grounded Visual Question Answering (VQA) pairs, where questions and answers are linked to reference segmentation masks, enabling models to associate visual evidence with textual explanations (a sketch of a possible record layout follows this list).
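The paper does not prescribe a file schema here, but a record tying a report sentence or a VQA pair to its reference segmentation mask might look roughly like the Python sketch below. All field names (`volume_id`, `region`, `mask_path`, and so on) and the example values are illustrative assumptions, not the dataset's actual keys.

```python
from dataclasses import dataclass

# Hypothetical record layouts -- field names are illustrative assumptions,
# not the actual keys released with RadGenome-Chest CT.

@dataclass
class GroundedSentence:
    volume_id: str   # which CT volume the sentence describes
    region: str      # anatomical region, e.g. "left lower lobe"
    mask_path: str   # path to that region's segmentation mask
    sentence: str    # the report sentence grounded in the region

@dataclass
class GroundedVQAPair:
    volume_id: str
    region: str
    mask_path: str   # reference mask linking the answer to visual evidence
    question: str
    answer: str

# Toy instance with made-up values, for illustration only:
pair = GroundedVQAPair(
    volume_id="case_0001",
    region="left lower lobe",
    mask_path="masks/case_0001/left_lower_lobe.nii.gz",
    question="Is there consolidation in the left lower lobe?",
    answer="Yes, consolidation is present in the left lower lobe.",
)
```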
The dataset is constructed through a three-stage process:
1. **Segmentation Mask Generation**: Using the SAT model to segment primary anatomical targets in the CT volumes.
2. **Region-Specific Report Division**: Splitting each report into an anatomically structured, hierarchical format and linking every sentence to specific anatomical regions.
3. **Rule-Based Question Generation**: Generating visual question-answering pairs from both region-level and case-level findings (a minimal sketch follows this list).
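As a rough illustration of the third stage, the snippet below sketches how rule-based question generation from region-level findings could work. The question templates, the structure of `region_findings`, and the helper name `generate_vqa_pairs` are assumptions made for illustration and do not reproduce the paper's actual rules.

```python
from typing import Dict, List, Tuple

# Minimal sketch of rule-based VQA generation from region-level findings.
# Templates and input format are assumptions, not the paper's actual rules.

def generate_vqa_pairs(region_findings: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Turn {region: [abnormality, ...]} into (question, answer) pairs."""
    pairs = []
    for region, abnormalities in region_findings.items():
        # Region-level presence question.
        question = f"Are there any abnormalities in the {region}?"
        if abnormalities:
            answer = f"Yes: {', '.join(abnormalities)}."
        else:
            answer = "No abnormality is observed."
        pairs.append((question, answer))

        # Abnormality-specific follow-up questions.
        for abnormality in abnormalities:
            pairs.append((
                f"Is {abnormality} present in the {region}?",
                "Yes.",
            ))
    return pairs

# Toy usage with made-up findings:
print(generate_vqa_pairs({
    "left lower lobe": ["consolidation"],
    "heart": [],
}))
```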
The paper also includes a detailed analysis of the dataset, visualizing the distribution of normal and abnormal cases and the frequency of various abnormalities. The authors believe that RadGenome-Chest CT will significantly advance the development of multimodal medical AI models by enabling them to generate text grounded in segmentation regions, improving interpretability and, ultimately, patient care. All segmentation masks, grounded reports, and VQA pairs will be released to support future research.