Data-efficient Fine-tuning for LLM-based Recommendation


July 14–18, 2024 | Xinyu Lin, Wenjie Wang*, Yongqi Li, Shuo Yang, Fuli Feng*, Yinwei Wei, Tat-Seng Chua
The paper addresses the challenge of fine-tuning Large Language Models (LLMs) for recommendation tasks, which is computationally expensive and time-consuming. To overcome this, the authors propose Data pruning for Efficient LLM-based Recommendation (DEALRec), a method that identifies a small subset of representative samples for few-shot fine-tuning of LLMs, cutting computational cost while maintaining or improving performance.

**Key Contributions:**

1. **Task Formulation:** The paper introduces the task of data pruning for efficient LLM-based recommendation: selecting a subset of representative samples whose fine-tuning yields high overall performance.
2. **DEALRec Method:** DEALRec combines two scores, an *influence score* and an *effort score*, to efficiently identify influential samples. The influence score estimates the impact of removing a sample on the empirical risk, while the effort score measures how difficult a sample is for the LLM to learn. The effort score regularizes the influence score so that the selected subset is both representative of the full data and informative for the LLM (see the sketch after this list).
3. **Empirical Validation:** Extensive experiments on three real-world datasets validate the effectiveness of DEALRec: fine-tuning on only 2% of the full dataset achieves high accuracy while reducing fine-tuning time by 97%.
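The two-score combination lends itself to a compact sketch. Below is a minimal, hypothetical PyTorch illustration: the paper estimates influence with influence functions (inverse-Hessian-vector products) on a small surrogate model, which this sketch simplifies to a first-order gradient-alignment proxy; the effort score is the gradient norm of the model's loss on each sample. All names (`surrogate`, `llm`, `lam`, `budget`) are illustrative, not the paper's API.

```python
# A self-contained sketch of DEALRec-style scoring, NOT the paper's implementation:
#   - influence is approximated by gradient alignment on a small surrogate model
#     (a first-order stand-in for the paper's inverse-Hessian-vector estimate),
#   - effort is the gradient norm of the model's loss on each sample.
import torch
import torch.nn as nn

def flat_grad(loss, params):
    """Flatten the gradients of `loss` w.r.t. `params` into one vector."""
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence_scores(surrogate, loss_fn, samples):
    """First-order influence proxy: alignment of each sample's gradient with
    the mean gradient (removing a well-aligned sample shifts the risk most)."""
    params = [p for p in surrogate.parameters() if p.requires_grad]
    per_sample = [flat_grad(loss_fn(surrogate(x), y), params) for x, y in samples]
    mean_g = torch.stack(per_sample).mean(dim=0)
    return torch.stack([g @ mean_g for g in per_sample])

def effort_scores(llm, loss_fn, samples):
    """Effort score: gradient norm of the loss on each sample, a proxy for
    how hard the sample is for the (LoRA-adapted) LLM to learn."""
    params = [p for p in llm.parameters() if p.requires_grad]
    return torch.stack([flat_grad(loss_fn(llm(x), y), params).norm()
                        for x, y in samples])

def select_subset(samples, surrogate, llm, loss_fn, budget, lam=0.5):
    """Combine the two scores and keep the top-`budget` samples."""
    inf = influence_scores(surrogate, loss_fn, samples)
    eff = effort_scores(llm, loss_fn, samples)
    # Normalize so the regularization weight `lam` is scale-free.
    inf = (inf - inf.mean()) / (inf.std() + 1e-8)
    eff = (eff - eff.mean()) / (eff.std() + 1e-8)
    return torch.topk(inf + lam * eff, budget).indices.tolist()

# Toy usage with random data and two tiny linear "models".
torch.manual_seed(0)
surrogate = nn.Linear(8, 1)
llm = nn.Linear(8, 1)  # stand-in for the LLM's trainable parameters
loss_fn = nn.MSELoss()
samples = [(torch.randn(8), torch.randn(1)) for _ in range(100)]
print(select_subset(samples, surrogate, llm, loss_fn, budget=2))  # ~2% of data
```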
**Related Work:**

- **LLM-based Recommendation:** Prior work explores various fine-tuning strategies for LLM-based recommenders, but they typically require extensive computational resources.
- **Coreset Selection:** Coreset selection methods aim to choose a small subset of samples that represents the full dataset, but existing methods are either heuristic or inapplicable to LLM-based recommendation because of the cost of their underlying optimization.

**Conclusion:** DEALRec offers a novel, efficient approach to data pruning for LLM-based recommendation, delivering significant gains in both accuracy and computational efficiency.