SDR - HALF-BAKED OR WELL DONE?

6 Nov 2018 | Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey
The paper examines how the signal-to-distortion ratio (SDR) is misused in speech enhancement and source separation, particularly in single-channel scenarios. The BSS_eval toolkit, widely used to evaluate these tasks, handles scaling and channel variations in ways that can produce misleading results in that setting. The authors propose a simpler and more robust alternative, the scale-invariant SDR (SI-SDR), which accounts for scaling discrepancies between the estimate and the reference and so gives a more accurate measure of performance. They also introduce a scale-dependent SDR (SD-SDR) to address downscaling issues, and define SI-SIR and SI-SAR, which provide a direct relationship between SDR, SIR, and SAR. The paper presents examples where BSS_eval's SDR fails, such as a filter that removes most of the signal's spectrum while SDR remains high; SI-SDR overcomes these issues. Finally, the authors compare SI-SDR and BSS_eval's SDR on a speech separation task, showing that SI-SDR provides more reliable results, and conclude that SI-SDR is a better alternative for evaluating single-channel separation tasks.
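As a rough illustration of the two measures named above, the sketch below computes SI-SDR and SD-SDR with NumPy. It assumes the usual definitions: the reference s is rescaled by α = ŝᵀs / ‖s‖² to give the target αs, SI-SDR is 10 log10(‖αs‖² / ‖αs − ŝ‖²), and SD-SDR is 10 log10(‖αs‖² / ‖s − ŝ‖²). The function names `si_sdr` and `sd_sdr`, the `eps` guard, and the synthetic test signals are illustrative choices, not part of the paper or of BSS_eval, and the inputs are assumed to be zero-mean, one-dimensional, and time-aligned.

```python
import numpy as np


def _optimal_scale(estimate, reference, eps):
    # alpha = <estimate, reference> / ||reference||^2: the scaling of the
    # reference that makes the residual orthogonal to the reference.
    return np.dot(estimate, reference) / (np.dot(reference, reference) + eps)


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB (hypothetical helper, assumes zero-mean 1-D inputs)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    target = _optimal_scale(estimate, reference, eps) * reference
    residual = estimate - target
    return 10.0 * np.log10((np.sum(target ** 2) + eps) / (np.sum(residual ** 2) + eps))


def sd_sdr(estimate, reference, eps=1e-12):
    """Scale-dependent SDR in dB: same target energy, but the error is
    measured against the unscaled reference, so rescaling is penalized."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    target = _optimal_scale(estimate, reference, eps) * reference
    error = reference - estimate
    return 10.0 * np.log10((np.sum(target ** 2) + eps) / (np.sum(error ** 2) + eps))


# Synthetic example: a noisy estimate of a random "speech-like" reference.
rng = np.random.default_rng(0)
reference = rng.standard_normal(16000)
estimate = reference + 0.1 * rng.standard_normal(16000)

print("SI-SDR, original estimate:", si_sdr(estimate, reference))
print("SI-SDR, estimate x 0.1:   ", si_sdr(0.1 * estimate, reference))
print("SD-SDR, original estimate:", sd_sdr(estimate, reference))
print("SD-SDR, estimate x 0.1:   ", sd_sdr(0.1 * estimate, reference))
```

Rescaling the estimate leaves SI-SDR essentially unchanged, while SD-SDR drops sharply for the downscaled estimate, which is the behavior the two names suggest.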