The Entropy Enigma: Success and Failure of Entropy Minimization

2024 | Ori Press, Ravid Schwartz-Ziv, Yann LeCun, Matthias Bethge
Entropy minimization (EM) is a test-time adaptation (TTA) method that improves model accuracy on new datasets without additional labeled data. The paper analyzes why EM succeeds at first and then fails: early in adaptation, EM pulls the embeddings of test images closer to those of the training images, which increases accuracy; as optimization continues, the embeddings drift away from the training distribution and accuracy degrades. The analysis shows that during EM the embeddings of the input data form distinct clusters. Initially this clustering aligns with the training data, but the later divergence is what drives the accuracy drop.

Building on these insights, the authors propose a method to estimate a model's accuracy on an arbitrary dataset without labels by analyzing how embeddings change during EM optimization. The method, called Weighted Flips (WF), counts how many predicted labels flip during EM and weights each flip by the model's initial prediction confidence. Across experiments on 23 datasets, WF estimates accuracy with a mean absolute error of 5.75%, outperforming previous methods by 29.62%, and it is practical, efficient, and effective across a range of models and architectures.

The study highlights the dual role of EM: it both enhances model performance and reveals the dynamics of data embeddings, offering new insights into why TTA methods work and into the challenge of accuracy estimation in real-world scenarios.
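To make the mechanics concrete, below is a minimal PyTorch-style sketch of an EM adaptation loop that also tracks a Weighted-Flips-style score. This is an illustrative reading of the summary above, not the authors' implementation: the function name, the SGD optimizer over all parameters, and the normalization of the score are assumptions, and EM methods such as TENT typically update only normalization-layer parameters and iterate over mini-batches rather than the full tensor.

import torch
import torch.nn.functional as F

def softmax_entropy(logits):
    # Mean Shannon entropy of the softmax predictions: the EM objective.
    probs = F.softmax(logits, dim=1)
    return -(probs * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

def em_with_weighted_flips(model, x, steps=10, lr=1e-3):
    """Adapt `model` on unlabeled inputs `x` by entropy minimization and
    return a Weighted-Flips-style score. A hypothetical sketch, not the
    paper's reference code."""
    # Record the model's initial predictions and confidences.
    model.eval()
    with torch.no_grad():
        probs0 = F.softmax(model(x), dim=1)
        conf0, pred0 = probs0.max(dim=1)

    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    flipped = torch.zeros_like(conf0, dtype=torch.bool)

    for _ in range(steps):
        optimizer.zero_grad()
        loss = softmax_entropy(model(x))  # minimize prediction entropy
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            pred = model(x).argmax(dim=1)
        # A sample counts as flipped once its label differs from the initial one.
        flipped |= pred != pred0

    # Weight each flip by the initial prediction confidence (one plausible
    # reading of "weighted by initial prediction confidence"; the paper's
    # exact weighting may differ) and normalize by the total confidence mass.
    wf = (conf0 * flipped.float()).sum() / conf0.sum()
    return wf.item()

Under this reading, a model that is accurate on the test data yields stable predictions and a low WF score, while confident predictions that flip during EM signal degradation, so a higher score corresponds to a lower accuracy estimate.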