Look Once to Hear: Target Speech Hearing with Noisy Examples

29 May 2024 | Bandhav Veluri, Malek Itani, Tuochao Chen, Takuya Yoshioka, Shyamnath Gollakota
The paper introduces an intelligent hearable system that lets a user hear a target speaker in a crowded environment after looking at them for a few seconds. During that time, the system captures a noisy binaural audio example of the target speaker and uses it to learn the speaker's speech traits. It then uses this example to extract the target speaker's speech from interfering speakers and noise. The system achieves a signal quality improvement of 7.01 dB using less than 5 seconds of noisy enrollment audio, and it processes each 8 ms audio chunk in 6.24 ms on an embedded CPU. User studies demonstrate generalization to real-world static and mobile speakers in previously unseen indoor and outdoor multipath environments. Enrolling with noisy examples causes no performance degradation compared to clean examples, while remaining convenient and user-friendly. The paper provides code and data at <https://github.com/vb000/LookOnceToHear>.
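The timing figures above imply that the system keeps up with incoming audio: each chunk is processed in less time than it takes to arrive. A minimal sketch of that arithmetic follows. The function name `real_time_factor` and the variable names are illustrative assumptions, not taken from the paper or its code; only the 8 ms and 6.24 ms values come from the summary.

```python
def real_time_factor(chunk_ms: float, processing_ms: float) -> float:
    """Ratio of compute time to audio duration; values below 1.0 keep up with the input."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return processing_ms / chunk_ms


# Figures reported in the summary: 8 ms chunks processed in 6.24 ms.
CHUNK_MS = 8.0
PROCESSING_MS = 6.24

rtf = real_time_factor(CHUNK_MS, PROCESSING_MS)
headroom_ms = CHUNK_MS - PROCESSING_MS  # spare time per chunk before the next one arrives

print(f"real-time factor: {rtf:.2f}")
print(f"headroom per chunk: {headroom_ms:.2f} ms")
```

A real-time factor below 1.0 means processing finishes before the next chunk is due, so the per-chunk compute does not accumulate delay on top of the chunk duration itself.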