27 May 2024 | Dixuan Wang, Yanda Li, Junyuan Jiang, Zepeng Ding, Guochao Jiang, Jiaqing Liang, Deqing Yang
This paper investigates a vulnerability of Large Language Models (LLMs) rooted in their tokenization process, a step critical for accurate language understanding and generation. The authors construct an adversarial dataset called ADT (Adversarial Dataset for Tokenizer) to challenge LLMs' tokenization and expose its limitations. ADT consists of two subsets: ADT-Human, which is manually constructed, and ADT-Auto, which is automatically generated. The dataset is designed around challenging tokenization cases that can mislead LLMs into producing incorrect responses. The study evaluates ADT on various LLMs, including GPT-4o, Llama-3, Qwen2.5-max, and others, demonstrating that these models often answer inaccurately when their tokenization goes wrong. The authors also propose an efficient and robust automatic data generation framework for creating such adversarial examples. The results show that the proposed dataset and method effectively challenge LLMs' tokenization, highlighting the importance of improving tokenization algorithms to enhance LLM performance. The study contributes a new perspective on LLM vulnerabilities and offers a dataset for further research on improving tokenization.
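To make the failure mode concrete, the sketch below shows how a subword tokenizer can segment a character sequence at points that do not match the intended word boundaries, which is the kind of mismatch ADT is built to exploit. This is a minimal illustration, not the paper's ADT construction pipeline: it assumes the Hugging Face `transformers` library, uses the openly available GPT-2 BPE tokenizer as a stand-in for the evaluated models' tokenizers, and the example string is an illustrative assumption rather than an item from ADT.

```python
# Minimal sketch (not the ADT pipeline): show that a BPE tokenizer commits to
# one segmentation of an ambiguous character sequence, which may not align
# with the word boundaries a reader (or an LLM) would intend.
from transformers import AutoTokenizer

# GPT-2 is used here only because it is openly downloadable; any BPE-style
# tokenizer exhibits the same boundary-commitment behavior.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# The same characters with and without an explicit space: the spaced version
# preserves the intended words, the unspaced one is segmented however the
# merge rules dictate.
for text in ["expertsexchange", "experts exchange"]:
    print(f"{text!r} -> {tokenizer.tokenize(text)}")
```

Running this prints two different token sequences for the same underlying characters, which illustrates why inputs crafted to straddle token boundaries can degrade an LLM's downstream answers.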