Towards Privacy-Aware Sign Language Translation at Scale

7 Aug 2024 | Phillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgöz, Jean Maillard
This paper addresses data scarcity and privacy risks in sign language translation (SLT) by proposing SSVP-SLT, a two-stage framework that first performs self-supervised video pretraining on anonymized, unannotated videos and then finetunes for SLT on a curated parallel dataset. SSVP-SLT achieves state-of-the-art performance on the How2Sign dataset, outperforming existing methods by over 3 BLEU-4. The authors also introduce DailyMoth-70h, a new benchmark dataset consisting of 70 hours of continuous signing in native ASL.

The paper analyzes the effectiveness of self-supervised pretraining and the impact of facial blurring for anonymization, highlighting the trade-off between privacy and translation quality. The results demonstrate that self-supervised learning can alleviate data scarcity and scale up SLT while preserving signer privacy through techniques such as facial blurring. The authors conclude by discussing limitations and future directions, emphasizing the need for more sophisticated anonymization methods and support for a broader range of sign languages.
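To make the two-stage recipe concrete, the sketch below illustrates the general pattern: pretrain a video encoder with a self-supervised masked-reconstruction objective on unannotated clips, then reuse that encoder for supervised translation finetuning. All module names, shapes, and the specific objective here are illustrative assumptions; the paper's actual architecture and losses may differ.

```python
# Minimal sketch of a two-stage pretrain-then-finetune recipe like
# SSVP-SLT, assuming a transformer video encoder and a masked-
# reconstruction pretraining objective. Names and shapes are
# illustrative, not taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VideoEncoder(nn.Module):
    """Maps a clip of per-frame features to contextual representations."""

    def __init__(self, frame_dim=512, hidden_dim=256):
        super().__init__()
        self.proj = nn.Linear(frame_dim, hidden_dim)
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, frames):                    # (B, T, frame_dim)
        return self.encoder(self.proj(frames))    # (B, T, hidden_dim)


def pretrain_step(encoder, recon_head, frames, mask_ratio=0.5):
    """Stage 1: self-supervised masked reconstruction.

    Consumes only anonymized, unannotated video -- no text labels.
    """
    mask = torch.rand(frames.shape[:2], device=frames.device) < mask_ratio
    corrupted = frames.masked_fill(mask.unsqueeze(-1), 0.0)
    recon = recon_head(encoder(corrupted))        # (B, T, frame_dim)
    # Score reconstruction only at the masked positions.
    return ((recon - frames) ** 2)[mask].mean()


def finetune_step(encoder, text_head, frames, targets):
    """Stage 2: supervised SLT finetuning on a parallel (video, text) corpus.

    A per-frame linear head stands in for a full seq2seq text decoder.
    """
    logits = text_head(encoder(frames))           # (B, T, vocab)
    return F.cross_entropy(logits.flatten(0, 1), targets.flatten())


if __name__ == "__main__":
    encoder = VideoEncoder()
    recon_head = nn.Linear(256, 512)              # back to frame_dim
    text_head = nn.Linear(256, 1000)              # toy vocab of 1000
    frames = torch.randn(2, 16, 512)              # 2 clips x 16 frames
    print(pretrain_step(encoder, recon_head, frames).item())
    print(finetune_step(encoder, text_head, frames,
                        torch.randint(0, 1000, (2, 16))).item())
```

The property this sketch mirrors is that stage 1 consumes no text labels, so it can scale to large anonymized video corpora, while stage 2 needs only the smaller curated parallel dataset.

On the anonymization side, facial blurring can be sketched as follows. This version uses OpenCV's bundled Haar-cascade face detector and a Gaussian blur as a generic illustration; it is not the authors' exact pipeline.

```python
# Generic face-blurring anonymization for sign language video, using
# OpenCV's bundled Haar-cascade detector. Illustrative only; the
# paper's actual anonymization method may differ.
import cv2


def blur_faces(frame, cascade, kernel=(51, 51)):
    """Gaussian-blur every detected face region in a BGR frame."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        roi = frame[y:y + h, x:x + w]
        frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, kernel, 0)
    return frame


def anonymize_video(src_path, dst_path):
    """Read a video, blur detected faces frame by frame, write the result."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, size)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(blur_faces(frame, cascade))
    cap.release()
    out.release()
```

This also makes the measured trade-off intuitive: blurring hides identity, but in ASL the face carries grammatical information as well (e.g., mouthing and eyebrow movements), so aggressive blurring can cost translation quality.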