Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation

31 Mar 2024 | Wenxiao Deng, Wenbin Li*, Tianyu Ding, Lei Wang, Hongguang Zhang, Kuihua Huang, Jing Huo, Yang Gao
This paper introduces two novel constraints for dataset distillation: class centralization and covariance matching. These constraints aim to improve the performance of distribution matching-based methods by addressing two key limitations: dispersed feature distributions within classes and an exclusive focus on mean feature consistency. The class centralization constraint enhances class discrimination by clustering samples within classes, while the covariance matching constraint achieves more accurate feature distribution matching between real and synthetic datasets through local feature covariance matrices, which is particularly beneficial when sample sizes are small. Experiments on CIFAR10, SVHN, CIFAR100, and TinyImageNet show significant performance improvements, with up to 6.6% improvement on CIFAR10 compared to state-of-the-art methods. The method also maintains robust performance across different architectures, with a maximum performance drop of 1.7%. The proposed constraints are plug-and-play and can be integrated with existing methods. The code is available at https://github.com/VincenDen/IID.
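To make the two constraints concrete, below is a minimal PyTorch sketch of how such loss terms could look. This is an illustration, not the authors' implementation: the function names are hypothetical, the centralization term simply pulls features toward their class mean, and the covariance term matches the full per-batch feature covariance, whereas the paper works with local feature covariance matrices.

```python
import torch


def class_centralization_loss(feats: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Pull each sample's features toward its class centroid, tightening
    within-class clusters (one plausible reading of the constraint)."""
    classes = labels.unique()
    loss = feats.new_zeros(())
    for c in classes:
        f_c = feats[labels == c]                # features of one class
        center = f_c.mean(dim=0, keepdim=True)  # class centroid
        loss = loss + ((f_c - center) ** 2).sum(dim=1).mean()
    return loss / classes.numel()


def covariance_matching_loss(real_feats: torch.Tensor, syn_feats: torch.Tensor) -> torch.Tensor:
    """Penalize the squared Frobenius distance between the feature covariance
    of real and synthetic samples (a simplification: the paper uses local
    feature covariance matrices rather than one global covariance)."""
    def cov(f: torch.Tensor) -> torch.Tensor:
        f = f - f.mean(dim=0, keepdim=True)       # center the features
        return f.t() @ f / max(f.shape[0] - 1, 1)  # unbiased covariance estimate
    diff = cov(real_feats) - cov(syn_feats)
    return torch.linalg.matrix_norm(diff, ord="fro") ** 2
```

In a distribution matching pipeline, either term would be added with its own weight to the base objective (e.g., the mean-feature matching loss), which is what makes the constraints plug-and-play.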