Stealthy Attack on Large Language Model based Recommendation

5 Jun 2024 | Jinghao Zhang¹², Yuting Liu³, Qiang Liu¹², Shu Wu¹²*, Guibing Guo³, Liang Wang¹²
This paper presents a stealthy attack on large language model (LLM)-based recommendation systems, highlighting a critical security vulnerability. Because LLM-based recommendation models emphasize textual content, they are susceptible to attacks in which an adversary subtly modifies an item's text to increase its exposure without degrading overall recommendation performance. The attack is highly stealthy: it does not alter the model's training data, and the textual changes are difficult for users and platforms to detect.

The authors demonstrate that simple text modifications, such as inserting positive words or using GPT-based rewriting, can significantly boost item exposure. They further show that black-box text attacks, which require no access to the model's internal parameters, are also effective. Evaluations across four mainstream LLM-based recommendation models confirm that the attack is both effective and stealthy. The authors additionally examine the influence of model fine-tuning and item popularity on attack performance, demonstrate that the attack transfers across different models and tasks, and propose a simple rewriting defense strategy that mitigates the issue to some extent.

The study reveals a significant security gap in LLM-based recommendation systems and calls for further research on protecting them. The findings underscore the need for stronger security measures to prevent malicious attacks that could promote low-quality products or spread misinformation.
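To make the attack setting concrete, the sketch below illustrates the kind of black-box text modification the paper describes: inserting a few positive words into an item's description that the attacker controls, then checking whether the item's rank improves. The word list, function names, and the toy scoring function are illustrative assumptions, not the paper's actual implementation; in the paper, item scores would come from an LLM-based recommender conditioned on each item's text.

```python
import random

# Hypothetical positive words an attacker might insert; the paper's actual
# word list and insertion strategy are not reproduced here.
POSITIVE_WORDS = ["best", "premium", "top-rated", "must-have"]


def insert_positive_words(item_text: str, n_words: int = 2, seed: int = 0) -> str:
    """Black-box attack sketch: insert a few positive words into an item's
    textual description. No access to model parameters is required."""
    rng = random.Random(seed)
    tokens = item_text.split()
    for word in rng.sample(POSITIVE_WORDS, k=min(n_words, len(POSITIVE_WORDS))):
        tokens.insert(rng.randrange(len(tokens) + 1), word)
    return " ".join(tokens)


def exposure_at_k(ranked_item_ids, target_id, k: int = 10) -> int:
    """1 if the target item appears in the top-k recommendation list, else 0."""
    return int(target_id in ranked_item_ids[:k])


def toy_score(user_profile: str, item_text: str) -> float:
    # Stand-in for the black-box recommender's score; here, simple word overlap.
    return len(set(user_profile.lower().split()) & set(item_text.lower().split()))


# Example: the attacker only edits the text of the item they control ("i2").
items = {
    "i1": "wireless noise cancelling headphones",
    "i2": "budget wired earbuds",
}
items["i2"] = insert_positive_words(items["i2"])

user = "looking for the best headphones"
ranking = sorted(items, key=lambda i: toy_score(user, items[i]), reverse=True)
print(ranking, exposure_at_k(ranking, "i2", k=1))
```

This sketch only conveys the threat model (text-only edits, no training-data poisoning, no model access); the paper's evaluation measures exposure changes on real LLM-based recommenders rather than a toy scorer.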