Machine Learning Interpretability: A Survey on Methods and Metrics

26 July 2019 | Diogo V. Carvalho, Eduardo M. Pereira, Jaime S. Cardoso
The article "Machine Learning Interpretability: A Survey on Methods and Metrics" by Diogo V. Carvalho, Eduardo M. Pereira, and Jaime S. Cardoso provides an in-depth review of the field of machine learning interpretability. The authors highlight the increasing importance of interpretability as machine learning systems become more prevalent in various domains, including healthcare, finance, and criminal justice. These systems are increasingly making high-stakes decisions that have significant social impacts, making it crucial to understand their internal logic and rationale. The article discusses the challenges and motivations behind the need for interpretability, emphasizing the lack of transparency in black box models and the regulatory requirements for verifiability and accountability. It reviews the historical context of interpretability research, noting that while there has been sporadic interest since the 1970s, recent years have seen a surge in attention due to the growing complexity and impact of ML systems. The authors also explore the societal impact of interpretability, including its role in ensuring fairness, privacy, reliability, and trust. They present a taxonomy of interpretability methods, categorizing them based on when they are applied (pre-model, in-model, post-model), whether they are intrinsic or post hoc, model-specific or model-agnostic, and the type of explanation they provide (feature summary, model internals, data point, surrogate intrinsically interpretable model). The article concludes by emphasizing the interdisciplinary nature of interpretability research, highlighting the need for collaboration between data science, human science, and human-computer interaction (HCI) to advance the field. It also discusses the importance of interpretability in achieving the goals of AI, such as intuition, reasoning, and social acceptance.The article "Machine Learning Interpretability: A Survey on Methods and Metrics" by Diogo V. Carvalho, Eduardo M. Pereira, and Jaime S. Cardoso provides an in-depth review of the field of machine learning interpretability. The authors highlight the increasing importance of interpretability as machine learning systems become more prevalent in various domains, including healthcare, finance, and criminal justice. These systems are increasingly making high-stakes decisions that have significant social impacts, making it crucial to understand their internal logic and rationale. The article discusses the challenges and motivations behind the need for interpretability, emphasizing the lack of transparency in black box models and the regulatory requirements for verifiability and accountability. It reviews the historical context of interpretability research, noting that while there has been sporadic interest since the 1970s, recent years have seen a surge in attention due to the growing complexity and impact of ML systems. The authors also explore the societal impact of interpretability, including its role in ensuring fairness, privacy, reliability, and trust. They present a taxonomy of interpretability methods, categorizing them based on when they are applied (pre-model, in-model, post-model), whether they are intrinsic or post hoc, model-specific or model-agnostic, and the type of explanation they provide (feature summary, model internals, data point, surrogate intrinsically interpretable model). 
The article concludes by emphasizing the interdisciplinary nature of interpretability research, highlighting the need for collaboration among data science, the human sciences, and human-computer interaction (HCI) to advance the field. It also notes the role of interpretability in achieving broader goals of AI, such as intuition, reasoning, and social acceptance.