The paper addresses the challenge of catastrophic forgetting in incremental learning, where performance on old classes degrades as new classes are learned. The authors propose a method called Bias Correction (BiC) to address the data imbalance between old and new classes, which becomes particularly severe when the number of classes is large. BiC uses a two-stage training process: the first stage trains the convolution and fully connected layers with a knowledge distillation loss, while the second stage corrects the bias in the fully connected layer's outputs with a small linear model learned on a held-out validation set. The method is effective on large datasets such as ImageNet (1000 classes) and MS-Celeb-1M (10000 classes), outperforming state-of-the-art algorithms by 11.1% and 13.2%, respectively. The paper also includes experiments on smaller datasets and an ablation study to validate the effectiveness of BiC.
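To make the second stage concrete, the sketch below shows how the bias correction could look in PyTorch: a learned scale `alpha` and offset `beta` are applied only to the new-class logits, trained on a small validation split while the rest of the network stays frozen. This is a minimal illustration under those assumptions, not the authors' code; names such as `frozen_model`, `val_loader`, and `num_old_classes` are hypothetical.

```python
import torch
import torch.nn as nn

class BiasCorrection(nn.Module):
    """Two-parameter linear correction applied to new-class logits only."""
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))   # scale for new-class logits
        self.beta = nn.Parameter(torch.zeros(1))   # offset for new-class logits

    def forward(self, logits, num_old_classes):
        old = logits[:, :num_old_classes]          # old-class logits pass through
        new = self.alpha * logits[:, num_old_classes:] + self.beta
        return torch.cat([old, new], dim=1)

def fit_bias_correction(frozen_model, val_loader, num_old_classes, epochs=2):
    """Stage two (sketch): learn alpha/beta on a held-out validation set
    while the convolution and fully connected layers stay frozen."""
    bic = BiasCorrection()
    optimizer = torch.optim.SGD(bic.parameters(), lr=0.001)
    criterion = nn.CrossEntropyLoss()
    frozen_model.eval()
    for _ in range(epochs):
        for images, labels in val_loader:
            with torch.no_grad():                  # backbone is frozen
                logits = frozen_model(images)
            loss = criterion(bic(logits, num_old_classes), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return bic
```

Because only two parameters are fit, the correction needs very little validation data, which is what makes the approach practical even when only a few exemplars per old class are retained.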