Generalization—a key challenge for responsible AI in patient-facing clinical applications

21 May 2024 | Lea Goetz, Nabeel Seedat, Robert Vandersluis & Mihaela van der Schaar
Generalization is a major challenge for responsible AI in patient-facing clinical applications: AI systems must be able to apply what they have learned to new data that may differ from their training data. The current debate in bioethics proposes selective deployment as a solution. This paper explores the data-based reasons for generalization challenges and examines how selective predictions might be implemented technically, focusing on clinical AI applications in real-world healthcare settings. Generalization is a core challenge for real-world impact in all areas of human-centric AI, and there are currently few technical solutions that address it in patient-facing clinical applications of machine learning. To address this, recent work in bioethics advocates the selective deployment of AI in healthcare and provides a thorough analysis of its ethical implications.

"Selective deployment" holds that algorithms should not be deployed for groups underrepresented in their training datasets, because of the risk of poor or unpredictable algorithm performance for those groups.

Why is generalization a challenge in clinical AI? Expressive ML models, especially deep neural networks, are prone to overfitting: they over-rely on low-level features and learn spurious correlations in a dataset. Furthermore, training data that reflects societal prejudices or lacks diversity can produce algorithmic biases, causing models to generalize less well to underrepresented groups. These problems are exacerbated in clinical applications, where datasets are high-dimensional, contain the inherent uncertainties of biological systems, are often small and noisy, contain large numbers of missing values, and may not be representative of the target population. ML models that do not generalize may fail silently, i.e., perform significantly worse on new samples or individuals without this being noticed, especially if they are not externally validated. Ignoring these challenges and applying ML models in the clinic regardless is irresponsible, as it may harm patients from underrepresented groups.
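One way to make such silent failures visible is a data-centric support check: before the model predicts on a new patient record, test whether that record resembles the training data at all. The sketch below illustrates this idea with a simple k-nearest-neighbour distance heuristic; the choice of k, the 95th-percentile threshold, and the function names are illustrative assumptions, not methods from the paper.

```python
# Hypothetical data-centric "support check": flag incoming patient records
# that lie far from the training data, where silent failure is most likely.
# k, the 95th-percentile threshold, and all names are illustrative choices.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def fit_support_check(X_train: np.ndarray, k: int = 10):
    """Index the training features and calibrate a distance threshold."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    # Mean distance of each training point to its k nearest neighbours
    # (the point itself is included at distance 0, which is fine for a
    # rough calibration); the 95th percentile serves as a crude estimate
    # of the "edge" of the training support.
    dists, _ = nn.kneighbors(X_train)
    threshold = np.percentile(dists.mean(axis=1), 95)
    return nn, threshold

def in_training_support(nn, threshold, X_new: np.ndarray) -> np.ndarray:
    """True where a new record looks well-supported by the training data."""
    dists, _ = nn.kneighbors(X_new)
    return dists.mean(axis=1) <= threshold
```

Records failing such a check would be candidates for abstention or clinician review rather than an automated prediction.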
"Selective deployment" suggests that algorithms should not be deployed for groups underrepresented in their training datasets due to risks around poor or unpredictable algorithm performance. Why is generalization a challenge in clinical AI? Expressive ML models, especially deep neural networks, are prone to overfitting, i.e., they over rely on low-level features and learn spurious correlations in a dataset. Furthermore, training data reflecting societal prejudices or lacking diversity can result in algorithmic biases that can cause models to generalize less well to underrepresented groups. These problems are exacerbated in clinical applications, where datasets are high dimensional, contain the inherent uncertainties of biological systems, are often small and noisy, contain large numbers of missing values, and may not be representative of the target population. ML models that do not generalize may fail silently, i.e. perform significantly worse on new samples or individuals unnoticed, especially if not externally validated. Ignoring these challenges and applying ML models in the clinic regardless is irresponsible as it may harm patients from underrepresented groups. The paper discusses the ethical considerations of selective deployment in clinical AI, highlighting the need for responsible AI that balances practicality with equity. It also explores technical approaches for achieving trustworthy predictions, including data-centric and model-centric methods. The paper concludes that while selective deployment is a potentially contentious option, it may represent the most ethical tradeoff between competing considerations around utility, safety, and equity.