Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models

6 Jun 2024 | Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Wayne Xin Zhao, Furu Wei, Ji-Rong Wen
The paper explores the multilingual capabilities of large language models (LLMs) and introduces a novel method, Language Activation Probability Entropy (LAPE), to identify language-specific neurons within these models. LAPE measures how likely each neuron is to activate on text in each language and selects neurons whose activation probability is concentrated on one or two specific languages.

The study finds that an LLM's proficiency in a given language is governed largely by a small subset of such neurons, located primarily in the model's top and bottom layers, where they handle language-specific vocabulary, grammar, and idiomatic expressions.

The research also demonstrates that the output language of an LLM can be "steered" by selectively activating or deactivating these language-specific neurons, which can help mitigate off-target generation and improve cross-lingual generation tasks. Together, the findings shed light on the mechanisms underlying LLMs' multilingual capabilities and offer a practical approach to improving their performance in multilingual scenarios.
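To make the scoring idea concrete, below is a minimal sketch of LAPE-style neuron selection in NumPy. It assumes FFN activations have already been collected per language over a reference corpus; the hyperparameter names and values (`top_pct`, `min_prob`, the activation threshold) are illustrative placeholders, not the paper's exact settings.

```python
import numpy as np

def activation_probability(acts, threshold=0.0):
    """acts: (num_tokens, num_neurons) FFN activations for one language's corpus.
    Returns each neuron's probability of being activated (activation > threshold)."""
    return (acts > threshold).mean(axis=0)

def lape_scores(probs, eps=1e-12):
    """probs: (num_languages, num_neurons) activation probabilities per language.
    Normalizes each neuron's probabilities into a distribution over languages
    and returns its entropy: low entropy = activation concentrated on few languages."""
    dist = probs / (probs.sum(axis=0, keepdims=True) + eps)
    return -(dist * np.log(dist + eps)).sum(axis=0)

def select_language_specific(probs, top_pct=0.01, min_prob=0.9):
    """Keep the lowest-entropy (most language-specific) neurons whose activation
    probability exceeds min_prob for at least one language, and map each
    selected neuron to the language indices it fires for."""
    ent = lape_scores(probs)
    k = max(1, int(top_pct * probs.shape[1]))
    candidates = np.argsort(ent)[:k]          # smallest entropy first
    keep = [n for n in candidates if probs[:, n].max() >= min_prob]
    return {int(n): np.where(probs[:, n] >= min_prob)[0].tolist() for n in keep}
```

Steering the output language would then amount to intervening on the selected neurons at inference time. A hedged sketch of such an intervention as a PyTorch forward hook follows; the module path in the usage comment is hypothetical and depends on the model architecture, and the fixed activation value for the "activate" mode is an illustrative stand-in for whatever statistic one chooses (e.g., a high quantile of observed activations).

```python
def make_steering_hook(neuron_ids, mode="deactivate", value=1.0):
    """Forward hook for an FFN activation module: zero out (deactivate) or
    clamp (activate) the chosen language-specific neurons in its output."""
    def hook(module, inputs, output):
        output = output.clone()               # avoid mutating the original tensor
        if mode == "deactivate":
            output[..., neuron_ids] = 0.0
        else:
            output[..., neuron_ids] = value
        return output                         # returned tensor replaces the output
    return hook

# Hypothetical usage on one transformer block's activation module:
# handle = model.model.layers[0].mlp.act_fn.register_forward_hook(
#     make_steering_hook(selected_neurons, mode="deactivate"))
```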