11 Apr 2017 | Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He
The paper "Aggregated Residual Transformations for Deep Neural Networks" introduces ResNeXt, a network architecture for image classification. The network is highly modular: it is built by repeating a single building block that aggregates a set of transformations sharing the same topology. Because every block follows the same template, only a few hyper-parameters need to be set. The design also exposes a new dimension, "cardinality" (the size of the set of transformations), alongside the familiar dimensions of depth and width. The authors argue that increasing cardinality improves classification accuracy even when model complexity is held fixed, and that it is more effective than going deeper or wider when capacity is increased. On the ImageNet-1K dataset, ResNeXt outperforms ResNet and other state-of-the-art models, and it secured 2nd place in the ILSVRC 2016 classification task. Further experiments on an ImageNet-5K set and the COCO object detection dataset also show better results for ResNeXt than for its ResNet counterpart. The code and models are publicly available.
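To make "increasing cardinality while maintaining complexity" concrete, the sketch below counts the convolution weights in a standard bottleneck block and in an aggregated block built from many narrow paths of the same topology. The function names are hypothetical, and the settings (256 input channels, a bottleneck width of 64, and 32 paths of width 4) are example values assumed here rather than stated in this summary. Biases and normalization layers are ignored.

```python
def bottleneck_params(in_channels: int, width: int) -> int:
    """Weights in a 1x1 -> 3x3 -> 1x1 bottleneck (biases and batch norm ignored)."""
    reduce = in_channels * width      # 1x1 conv: in_channels -> width
    spatial = 3 * 3 * width * width   # 3x3 conv: width -> width
    expand = width * in_channels      # 1x1 conv: width -> in_channels
    return reduce + spatial + expand


def aggregated_params(in_channels: int, cardinality: int, path_width: int) -> int:
    """Weights in `cardinality` parallel bottleneck paths with identical topology."""
    return cardinality * bottleneck_params(in_channels, path_width)


# Assumed example settings: one wide path versus 32 narrow paths.
resnet_block = bottleneck_params(256, 64)
resnext_block = aggregated_params(256, cardinality=32, path_width=4)
print(resnet_block, resnext_block)
```

Under these assumptions the two blocks have nearly the same number of weights (roughly 70 thousand each), so any accuracy difference between them can be attributed to cardinality rather than to added capacity.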