This narrative review provides an overview of the techniques and tools used to ensure high-quality data for machine learning and artificial intelligence (AI) applications in radiology. The review emphasizes the importance of consistent, standardized, traceable, correctly annotated, and de-identified data, considering local regulations such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA). Key topics include image resolution, pixel depth, file formats for medical image storage, free software solutions for image processing, anonymization and pseudonymization to protect patient privacy, methods to eliminate patient-identifying features, free and commercial tools for image annotation, and techniques for data harmonization and normalization.
The review highlights the significance of high-quality datasets in AI, noting that "garbage in, garbage out" is a universally recognized principle. It discusses the time-consuming task of image annotation and the need for efficient tools to speed up the process. Anonymization is crucial for compliance with privacy regulations and can be performed manually or with automated methods. The review also covers the importance of standardizing image curation and annotation to enable federated learning, which addresses data governance and privacy concerns by training models without exchanging the underlying data.
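As a rough illustration of what an automated pseudonymization step might look like, the sketch below replaces a patient identifier with a keyed hash (HMAC-SHA256), so that the same patient maps to the same pseudonym across studies while the original identifier cannot be recovered without the key. The function names `pseudonymize_id` and `pseudonymize_record`, the record fields, and the key are hypothetical and are not taken from the review or from any tool it names. A real DICOM de-identification workflow would also have to handle many other header tags and any text burned into the pixel data.

```python
import hashlib
import hmac


def pseudonymize_id(patient_id: str, secret_key: bytes, length: int = 16) -> str:
    """Return a stable pseudonym for patient_id using HMAC-SHA256 (hypothetical helper)."""
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncate the hex digest to keep the pseudonym short; this is an illustrative choice.
    return digest[:length]


def pseudonymize_record(record: dict, secret_key: bytes) -> dict:
    """Copy a metadata record, replacing the ID and dropping direct identifiers."""
    # Hypothetical field names; a real header contains many more identifying tags.
    direct_identifiers = {"patient_name", "birth_date"}
    cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
    cleaned["patient_id"] = pseudonymize_id(record["patient_id"], secret_key)
    return cleaned


if __name__ == "__main__":
    key = b"replace-with-a-securely-stored-key"  # assumption: key is managed outside the code
    record = {
        "patient_id": "12345",
        "patient_name": "DOE^JANE",
        "birth_date": "19700101",
        "modality": "CT",
    }
    print(pseudonymize_record(record, key))
```

Because the mapping depends on a secret key, this corresponds to pseudonymization rather than full anonymization: whoever holds the key can re-link records by recomputing the pseudonym for a known identifier.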
Key software tools such as ImageJ, 3D Slicer, ITK-SNAP, and VGG Image Annotator are discussed, along with their features and applications. The review emphasizes the importance of patient consent, data encryption, access control, and defined data retention periods to ensure compliance with regulations. Additionally, it explores the impact of different image normalization techniques on tasks like texture classification and the role of harmonization in ensuring consistency across images acquired on different machines and with different protocols.
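To make the idea of intensity normalization concrete, the following minimal sketch shows two common schemes, min-max rescaling and z-score standardization, applied to a flat list of pixel intensities. These two schemes are chosen here only as familiar examples and are not necessarily the techniques compared in the review. The function names are hypothetical, and a practical pipeline would more likely operate on array data from an imaging library.

```python
from statistics import mean, pstdev


def min_max_normalize(pixels, new_min=0.0, new_max=1.0):
    """Linearly rescale intensities to the range [new_min, new_max]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A constant image has no dynamic range; map every pixel to new_min.
        return [new_min for _ in pixels]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (p - lo) * scale for p in pixels]


def z_score_normalize(pixels):
    """Shift and scale intensities to zero mean and unit (population) standard deviation."""
    mu = mean(pixels)
    sigma = pstdev(pixels)
    if sigma == 0:
        # A constant image has zero variance; return zeros to avoid division by zero.
        return [0.0 for _ in pixels]
    return [(p - mu) / sigma for p in pixels]


if __name__ == "__main__":
    intensities = [0, 50, 100, 150, 200]  # made-up example values
    print(min_max_normalize(intensities))
    print(z_score_normalize(intensities))
```

Min-max rescaling preserves the relative spacing of intensities but is sensitive to outliers, whereas z-score standardization centers each image on its own mean, which is one reason the choice of scheme can affect downstream tasks such as texture classification.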
The review concludes by highlighting the importance of ensuring high-quality images and annotations for AI-based algorithms in radiology, emphasizing the availability of both open-source and commercial tools that can be effectively used under the guidance of local regulations.