6 Jun 2024 | Robin San Roman*¹,²; Pierre Fernandez*¹,²; Hady Elsahar*¹; Alexandre Défossez³; Teddy Furon²; Tuan Tran¹
**AudioSeal** is a novel audio watermarking technique designed to detect AI-generated speech at the sample level. It employs a generator-detector architecture trained with a localization loss to enable precise watermark detection and a perceptual loss inspired by auditory masking to achieve better imperceptibility. AudioSeal outperforms existing methods in robustness to real-life audio manipulations and speed, achieving up to two orders of magnitude faster detection. The method is evaluated on various metrics, including SI-SNR, PESQ, STOI, and ViSQOL, and shows superior perceptual quality compared to traditional watermarking methods. AudioSeal also demonstrates strong localization capabilities, allowing for precise detection of AI-generated segments in longer audio clips. Additionally, it supports multi-bit watermarking for model version attribution. The code for AudioSeal is available on GitHub.
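
To make the idea of sample-level localization concrete, the following is a minimal sketch and not AudioSeal's actual code. It assumes a detector has already produced one score between 0 and 1 per audio sample, and it groups consecutive high-scoring samples into time segments. The function name `watermarked_segments`, the 0.5 threshold, and the 16 kHz default sample rate are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def watermarked_segments(
    scores: Sequence[float],
    threshold: float = 0.5,      # hypothetical decision threshold
    sample_rate: int = 16_000,   # hypothetical sample rate in Hz
) -> List[Tuple[float, float]]:
    """Return (start_seconds, end_seconds) spans whose per-sample
    scores are strictly above `threshold`."""
    segments: List[Tuple[float, float]] = []
    start = None  # index where the current flagged run began, if any
    for i, score in enumerate(scores):
        flagged = score > threshold
        if flagged and start is None:
            start = i  # a new flagged run begins here
        elif not flagged and start is not None:
            # the run ended at the previous sample; the end index is exclusive
            segments.append((start / sample_rate, i / sample_rate))
            start = None
    if start is not None:
        # the clip ended while a run was still open
        segments.append((start / sample_rate, len(scores) / sample_rate))
    return segments


# Toy input: 4 clean samples, 8 flagged samples, 4 clean samples at 4 Hz.
toy_scores = [0.1] * 4 + [0.9] * 8 + [0.2] * 4
print(watermarked_segments(toy_scores, sample_rate=4))  # [(1.0, 3.0)]
```

Because a decision is made for every sample, segment boundaries in this sketch are resolved to a single sample rather than to a fixed analysis window. A practical pipeline would likely smooth the scores or merge very short runs before reporting segments.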