Text Embedding Inversion Security for Multilingual Language Models

This paper investigates the security risks that text embedding inversion attacks pose to multilingual large language models (LLMs). Text embeddings, widely used in natural language processing (NLP), can be vulnerable to inversion attacks, in which a malicious actor reconstructs the original text from its embedding. Previous research has focused on English-based models; this study extends the analysis to multilingual settings and finds that multilingual LLMs may be more susceptible to such attacks because of the limitations of English-based defenses.

The study explores two types of inversion attack: multilingual and cross-lingual. Multilingual inversion attacks involve reconstructing text in multiple languages, while cross-lingual attacks assume the attacker does not know the target language. To evaluate cross-lingual attacks, the authors introduce an Ad hoc Translation method that overcomes the limitations of current string-matching metrics. The results show that multilingual models can reconstruct text more effectively than monolingual ones, and that existing defenses for monolingual models are insufficient for multilingual models.

The authors evaluate two defenses: noise insertion and masking. Noise insertion is shown to be effective in protecting monolingual models, while the proposed masking defense is simple, requires no additional training, and effectively protects both monolingual and multilingual models.

The study highlights the need for further research into multilingual security, as current defenses are not sufficient for non-English languages and multilingual models. The authors advocate for a broader approach to LLM and NLP security that includes a wider range of languages. The study also emphasizes the importance of addressing data contamination in pre-trained models, which can affect the validity of inversion attack experiments.
Overall, the paper contributes to the understanding of text embedding inversion attacks and provides new insights into the security of multilingual language models.
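
To make the two defenses named in the summary more concrete, here is a minimal Python sketch. It is not the authors' implementation: the summary does not say how either defense is defined, so the additive Gaussian form of the noise, the choice of which coordinate to mask, and every function name and parameter (`add_gaussian_noise`, `mask_dimension`, `noise_level`, `index`, `value`) are assumptions made purely for illustration.

```python
import numpy as np


def add_gaussian_noise(embedding, noise_level=0.01, rng=None):
    """Noise insertion (assumed form): embedding + noise_level * eps, eps ~ N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    embedding = np.asarray(embedding, dtype=float)
    return embedding + noise_level * rng.standard_normal(embedding.shape)


def mask_dimension(embedding, index=0, value=0.0):
    """Masking (assumed form): overwrite one coordinate with a fixed constant.

    No model is retrained; only the vector that gets shared is altered.
    """
    masked = np.array(embedding, dtype=float, copy=True)
    masked[..., index] = value
    return masked


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a sentence embedding; a real one would come from an encoder.
    original = rng.standard_normal(768)

    noisy = add_gaussian_noise(original, noise_level=0.01, rng=rng)
    masked = mask_dimension(original, index=0, value=0.0)

    # How far each defended vector drifts from the original is a rough proxy
    # for how much downstream utility is retained.
    print("cosine(original, noisy): ", cosine_similarity(original, noisy))
    print("cosine(original, masked):", cosine_similarity(original, masked))
```

Both functions operate only on the embedding that is released, which is consistent with the summary's point that masking needs no additional training. Whether a given noise level or masked coordinate actually hinders an inversion model has to be measured against a real attack.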

5 Jun 2024 | Yiyi Chen, Heather Lent, Johannes Bjerva