Baichuan2-Sum: Instruction Finetune Baichuan2-7B Model for Dialogue Summarization

4 Apr 2024 | Jianfei Xiao, Yancan Chen, Yimin Ou, Hanyi Yu, Kai Shu, Yiyong Xiao
This paper proposes Baichuan2-Sum, an instruction fine-tuned model for role-oriented dialogue summarization. The model is based on the Baichuan2 large language model and is trained on two public dialogue summarization datasets, CSDS and SAMSUM. It generates summaries for the different roles in a dialogue by pairing the dialogue with a different instruction for each role. To further improve performance, the NEFTune technique is applied, which adds noise to the embedding layer during training.

Experiments show that Baichuan2-Sum achieves state-of-the-art results on both datasets, with significant ROUGE improvements over previous models: on SAMSUM, a 21% increase in ROUGE-1, a 32% increase in ROUGE-2, and a 9% increase in ROUGE-L. In human evaluations, the model also demonstrates superior accuracy, coherence, and grammatical correctness. Training runs on a single RTX 4090 GPU in under 6 hours for CSDS and under 4 hours for SAMSUM.

The model is publicly available on GitHub, and the code supports training and evaluation of other large language models such as LLaMA2, Bloom, and ChatGLM. The paper also reviews related work, including instruction fine-tuning and noisy embedding techniques, and presents the model's architecture, training details, and evaluation results, showing that the approach is effective for dialogue summarization and a useful basis for future research.
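To make the role-oriented setup concrete, the sketch below shows one plausible way to pair each role with its own instruction when constructing training examples. The template strings and the `build_prompt` helper are hypothetical illustrations, not the paper's actual prompts.

```python
# Hypothetical role-specific instruction templates for a customer-service
# dialogue (CSDS distinguishes overall, user, and agent summaries).
INSTRUCTIONS = {
    "overall": "Summarize the following dialogue.",
    "user": "Summarize the customer's requests in the following dialogue.",
    "agent": "Summarize the agent's responses in the following dialogue.",
}

def build_prompt(role: str, dialogue: str) -> str:
    # One training example per role: the same dialogue paired with a
    # role-specific instruction, so the model learns to condition on it.
    return f"{INSTRUCTIONS[role]}\n\nDialogue:\n{dialogue}\n\nSummary:"
```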
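NEFTune itself is simple to sketch. Following the published NEFTune recipe (uniform noise with scale alpha / sqrt(seq_len * hidden_dim) added to token embeddings during training), a minimal PyTorch version can be attached to the embedding layer with a forward hook. The default `alpha=5.0` is a common setting from the NEFTune paper; whether Baichuan2-Sum uses the same value is not stated here.

```python
import torch

def neftune_hook(module, inputs, output, alpha=5.0):
    # During training only: add uniform noise in [-scale, scale] to the
    # token embeddings, with scale = alpha / sqrt(seq_len * hidden_dim).
    if module.training:
        seq_len, hidden_dim = output.size(1), output.size(2)
        scale = alpha / (seq_len * hidden_dim) ** 0.5
        return output + torch.empty_like(output).uniform_(-scale, scale)
    return output

# Attach to the model's input embedding layer (works for any Hugging Face
# style causal LM, e.g. Baichuan2, LLaMA2, Bloom, ChatGLM):
# model.get_input_embeddings().register_forward_hook(neftune_hook)
```

Because the hook fires only when `module.training` is true, inference is unaffected; removing the hook after fine-tuning is optional.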