29 February 2024 | Ali Kore, Elyar Abbasi Bavi, Vallijah Subasri, Moustafa Abdalla, Benjamin Fine, Elham Dolatabadi & Mohamed Abdalla
This study evaluates drift detection on real-world medical imaging data, covering drift caused by (a) natural changes (e.g., the emergence of COVID-19 in chest X-rays) and (b) synthetic shifts. Four monitoring approaches are compared: tracking model performance, image-based drift detection (TorchXRayVision AutoEncoder, TAE), model-output-based drift detection (Black Box Shift Detection, BBSD), and combined image-and-output-based drift detection (TAE + BBSD).
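The image-based approaches monitor the images themselves rather than the model's predictions, typically by embedding each image into a low-dimensional representation and testing whether that representation's distribution has shifted. As a rough, illustrative sketch (PCA reconstruction error stands in for the study's TorchXRayVision autoencoder, and the data are random arrays rather than X-rays):

```python
# Illustrative image-based drift check: compare reconstruction-error
# distributions between a reference window and incoming data.
# PCA is a stand-in for the study's autoencoder; data are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def fit_pca(X, k):
    """Fit a k-component PCA basis to reference data."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k]

def recon_error(X, mu, comps):
    """Per-sample reconstruction error after projecting to the latent space."""
    Z = (X - mu) @ comps.T        # encode
    Xhat = Z @ comps + mu         # decode
    return np.linalg.norm(X - Xhat, axis=1)

ref = rng.normal(size=(300, 64))        # reference "images" (flattened)
new = rng.normal(size=(300, 64)) * 2.0  # drifted stream: higher variance
mu, comps = fit_pca(ref, k=8)

# A two-sample KS test on reconstruction errors flags distribution shift
p = stats.ks_2samp(recon_error(ref, mu, comps),
                   recon_error(new, mu, comps)).pvalue
print("drift detected:", p < 0.05)
```

The key design point is that no ground-truth labels are needed: only the reference images and the incoming images, which is what makes image-based monitoring usable when labels arrive late or never.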
The results show that monitoring model performance alone is not a reliable proxy for detecting data drift, and that detection sensitivity depends strongly on sample size and patient features. Drift detection matters for the safety and reliability of deployed AI models in healthcare because it enables intervention before risk reaches patients, and it is especially valuable when gold-standard labels are unavailable or real-time evaluation is impractical. Data-based methods such as TAE + BBSD proved more sensitive to drift than performance-based monitoring, and sensitivity varied with the type of drift, including changes in patient demographics, patient types, and pathologies. Overall, the findings position data drift detection as a critical tool for maintaining reliable performance of deployed clinical AI models.
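The model-output-based approach, BBSD, works by testing whether the distribution of a fixed classifier's softmax outputs differs between a reference window and incoming data. A minimal sketch with simulated logits (the model, data, and significance threshold here are illustrative, not taken from the study):

```python
# Illustrative Black Box Shift Detection (BBSD): per-class two-sample
# KS tests on a classifier's softmax outputs. Logits are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Reference window vs. a drifted window where one class becomes
# more confident/frequent (mimicking label shift, e.g. a new pathology)
ref_logits = rng.normal(size=(500, 3))
new_logits = rng.normal(size=(500, 3))
new_logits[:, 0] += 1.5

ref_probs, new_probs = softmax(ref_logits), softmax(new_logits)

def bbsd_ks(ref, new, alpha=0.05):
    """Flag drift if any per-class KS test is significant (Bonferroni-corrected)."""
    k = ref.shape[1]
    pvals = [stats.ks_2samp(ref[:, j], new[:, j]).pvalue for j in range(k)]
    return min(pvals) < alpha / k, pvals

drifted, pvals = bbsd_ks(ref_probs, new_probs)
print("drift detected:", drifted)
```

Because BBSD needs only the model's outputs, it can run continuously in production without labels, which is consistent with the study's point that performance metrics alone (which do require labels) are a poor and often delayed drift signal.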