The paper addresses the challenge of fine-tuning Large Language Models (LLMs) for recommendation tasks, which is computationally expensive and time-consuming. To overcome this, the authors propose a novel method called Data Pruning for Efficient LLM-based Recommendation (DEALRec). DEALRec aims to identify a subset of representative samples that can be used for few-shot fine-tuning of LLMs, reducing the computational cost while maintaining or improving performance.
**Key Contributions:**
1. **Task Formulation:** The paper introduces the task of data pruning for efficient LLM-based recommendation, focusing on selecting a subset of samples that are representative and can lead to high overall performance.
2. **DEALRec Method:** DEALRec combines two scores—the *influence score* and the *effort score*—to efficiently identify influential samples. The influence score estimates the impact of removing a sample on the empirical risk, while the effort score measures how difficult the sample is for the LLM to learn. The effort score regularizes the influence-based selection, balancing accuracy and efficiency.
3. **Empirical Validation:** Extensive experiments on three real-world datasets validate the effectiveness of DEALRec. The method achieves high accuracy with only 2% of the full dataset, reducing fine-tuning time by 97%.
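The selection procedure described above can be sketched as follows. This is an illustrative reconstruction, not the paper's exact formulation: the function names, the min-max normalization, and the weighting parameter `lam` are assumptions; the actual influence and effort computations (e.g., via influence functions and training loss) are abstracted away as precomputed arrays.

```python
import numpy as np

def select_samples(influence, effort, budget, lam=0.5):
    """Rank samples by a combined score and keep the top `budget`.

    influence, effort: 1-D arrays of per-sample scores (higher means
    more impact on empirical risk / harder for the LLM to learn).
    `lam` weights the effort-based regularization. All names here are
    illustrative, not taken from the paper.
    """
    def minmax(x):
        # Normalize to [0, 1] so the two scores are comparable.
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)

    combined = minmax(influence) + lam * minmax(effort)
    # Indices of the `budget` highest-scoring samples.
    return np.argsort(combined)[::-1][:budget]
```

For example, with a budget of 2% of the full dataset, `budget` would be set to `int(0.02 * len(influence))`, and only the returned indices would be used for few-shot fine-tuning.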
**Related Work:**
- **LLM-based Recommendation:** Previous work has explored various fine-tuning strategies for LLMs, but they often require extensive computational resources.
- **Coreset Selection:** Coreset selection methods aim to select a small subset of samples that can represent the full dataset, but they are either heuristic or inapplicable to LLM-based recommendation due to the complexity of optimization.
**Conclusion:**
DEALRec provides a novel and efficient approach to data pruning for LLM-based recommendation, demonstrating significant improvements in both accuracy and computational efficiency.