Curated benchmark dataset for ultrasound based breast lesion analysis

The BrEaST dataset is a curated benchmark dataset for ultrasound-based breast lesion analysis, containing 256 breast ultrasound scans from 256 patients. The scans show benign and malignant lesions as well as examples of normal tissue, and each was manually annotated by experienced radiologists. The dataset includes patient-level labels, image-level annotations, and tumor-level labels, with all cases confirmed by follow-up care or core needle biopsy results. It is the first breast ultrasound dataset to include such comprehensive labels and is available under a CC-BY 4.0 license.

Breast cancer is the most commonly diagnosed cancer in women, with over 2.2 million new cases reported in 2020. Ultrasound is a widely used imaging modality for breast examination, but it is highly operator-dependent. The BrEaST dataset aims to support radiologists in breast disease detection, tumor segmentation, and classification by providing high-quality, annotated data. Previous datasets have had limitations, such as lack of manual annotations, poor quality, or limited utility. The BrEaST dataset addresses these issues by providing detailed annotations, including tumor segmentation, BI-RADS features, and histopathological diagnoses.

The data were collected by five radiologists at medical centers in Poland between 2019 and 2022. Images were anonymized, transferred, and annotated using a cloud-based system. Each image was manually annotated by radiologists, with labels including tumor characteristics, BI-RADS categories, and histopathological diagnoses. The dataset includes 154 benign tumors, 98 malignancies, and 4 normal breasts. It is structured as images with corresponding masks, plus an .xlsx file with detailed labels for each case.

The BrEaST dataset is available for download from The Cancer Imaging Archive (TCIA) and for viewing on a dedicated webpage. It is designed for developing and evaluating algorithms for detecting, segmenting, and classifying abnormalities in breast ultrasound scans. The dataset includes a variety of cases, with example images for each BI-RADS category. It has also undergone technical validation to check the quality and reliability of the data. The dataset has limitations, such as a small number of cases for some diagnoses, but it provides valuable information for research and development of machine learning models in breast ultrasound analysis.
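
The class counts quoted in the summary (154 benign, 98 malignant, 4 normal) imply a marked imbalance, which matters when the data are split for training and evaluation. The following is a minimal sketch, not the authors' procedure: it computes the class shares from those counts and performs a simple stratified split using only the Python standard library. The label strings, the `stratified_split` helper, the 20% test fraction, and the seed are all illustrative assumptions; the actual column names and values in the dataset's .xlsx file may differ.

```python
import random
from collections import defaultdict

# Case counts as stated in the summary; the label strings are illustrative
# and may not match the values used in the dataset's .xlsx file.
CLASS_COUNTS = {"benign": 154, "malignant": 98, "normal": 4}


def class_fractions(counts):
    """Return each class's share of the total number of cases."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split case indices into sorted train/test lists, class by class.

    Each class contributes round(n * test_fraction) cases to the test set,
    so every class keeps roughly the same proportion in both subsets.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Build a hypothetical label list with the stated class counts.
labels = [name for name, n in CLASS_COUNTS.items() for _ in range(n)]
fractions = class_fractions(CLASS_COUNTS)
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)

for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
print(f"train cases: {len(train_idx)}, test cases: {len(test_idx)}")
```

With only four normal cases, any held-out subset contains at most a handful of them, so metrics for that class would be very noisy. This is one concrete form of the "limited cases" caveat noted in the summary.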

2024 | Anna Pawłowska¹, Anna Ćwierz-Pieńkowska², Agnieszka Domalik², Dominika Jaguś¹, Piotr Kasprzak³, Rafał Matkowski³, Łukasz Fura³, Andrzej Nowicki¹ & Norbert Żołek¹