Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models

3 Apr 2024 | Haoran Sun*, Lixin Liu, Junjie Li, Fengyu Wang, Baohua Dong, Ran Lin, Ruohui Huang
Conifer is a novel instruction-tuning dataset designed to enhance large language models (LLMs) in following complex, constrained instructions. The dataset is constructed using GPT-4 to generate high-quality, diverse, and complex instructions through query reframing, constraint generation, recombination, and a two-stage filtering process that ensures instruction quality. A progressive learning scheme is also proposed, emphasizing an easy-to-hard progression and learning from process feedback. Models trained on the dataset show significant improvements in instruction-following ability, especially for complex constraints.

Conifer is evaluated on several benchmarks, including IFEval, FollowBench, and InFoBench, where models trained on it outperform state-of-the-art open-source models. Compared with related works such as WizardLM and Muffin, Conifer-trained models handle complex constraints better. Extensive experiments validate its effectiveness in improving LLMs' instruction-following capabilities, and the dataset is publicly available for research and development.
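The easy-to-hard progression described above can be illustrated with a minimal sketch. This is not the authors' code: the base query, the constraint texts, and the helper functions below are invented for illustration; the idea is simply that composing a query with progressively more constraints yields instructions of increasing difficulty, which can be ordered into a curriculum.

```python
# Illustrative sketch (hypothetical, not from the Conifer paper's code):
# compose a base query with an increasing number of constraints to form
# an easy-to-hard curriculum of instructions.

def compose(query: str, constraints: list[str]) -> str:
    """Append each constraint to the query as a numbered requirement."""
    lines = [query]
    for i, c in enumerate(constraints, start=1):
        lines.append(f"{i}. {c}")
    return "\n".join(lines)

def easy_to_hard(query: str, constraints: list[str]) -> list[str]:
    """Return instructions of increasing difficulty: 1 constraint, 2, ... all."""
    return [compose(query, constraints[:k]) for k in range(1, len(constraints) + 1)]

base = "Summarize the article."
cons = [
    "Use at most 50 words.",
    "Write in a formal tone.",
    "Include exactly one direct quotation.",
]
curriculum = easy_to_hard(base, cons)
# curriculum[0] carries one constraint; curriculum[-1] carries all three.
```

In the actual pipeline, GPT-4 generates and recombines the constraints and a two-stage filter removes low-quality combinations; the sketch only captures the ordering principle used for progressive training.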