ICASSP 2024 SPEECH SIGNAL IMPROVEMENT CHALLENGE

2024 | Nicolae-Cătălin Ristea, Ando Saabas, Ross Cutler, Babak Naderi, Sebastian Braun, Solomiya Branets
The ICASSP 2024 Speech Signal Improvement Grand Challenge aims to advance research in enhancing speech signal quality in communication systems. Building on the success of the 2023 challenge, this year's event introduces several enhancements, including a dataset synthesizer, an objective metric (SIGMOS), and the addition of the Word Accuracy (WAcc) metric. The challenge evaluates 13 systems in the real-time track and 11 systems in the non-real-time track using both subjective P.804 and objective metrics.

Key changes from the 2023 challenge include:

1. **Dataset Synthesizer**: A tool to generate realistic datasets for better baseline performance.
2. **Objective Metric (SIGMOS)**: A new metric correlated with extended P.804 tests, assessing full-band audio.
3. **Improved Evaluation Procedure**: Introduction of WAcc to provide a more comprehensive evaluation.
4. **Transcripts Released**: Public transcripts for the 2023 test set to facilitate independent model evaluation.

The blind dataset consists of 500 clips recorded on a variety of devices and in diverse environments, covering a broad range of impairment areas. The evaluation methodology combines subjective P.804 listening tests with WAcc computed using Azure Cognitive Services, and the final score is a weighted average of SIG, OVRL, and WAcc. The top five performers in the real-time track are 1024k, B&N, Nju-AALab, Sluice, and IIP, while the top three teams in the non-real-time track are 1024k, SpeechGroup-IoA, and B&N. Statistical testing shows significant differences in performance among the top teams. The challenge aims to advance the state of the art in signal enhancement.
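The scoring described above can be sketched in a few lines. This is a minimal illustration, not the official challenge implementation: the exact score weights and normalization are not given in this summary, so the weights in `final_score` below are illustrative assumptions, and WAcc is computed here in the common way as 1 − WER using a word-level edit distance.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1] / max(len(ref), 1)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """WAcc = 1 - WER, clipped at zero."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


def final_score(sig: float, ovrl: float, wacc: float,
                w_sig: float = 0.25, w_ovrl: float = 0.25,
                w_wacc: float = 0.5) -> float:
    """Weighted average of SIG and OVRL (MOS on a 1-5 scale, rescaled to
    [0, 1]) and WAcc in [0, 1]. The weights are illustrative assumptions,
    not the official challenge values."""
    return w_sig * (sig - 1) / 4 + w_ovrl * (ovrl - 1) / 4 + w_wacc * wacc


# Example: one recognition error out of four reference words -> WAcc = 0.75.
wacc = word_accuracy("the cat sat down", "the cat sat own")
score = final_score(sig=3.8, ovrl=3.5, wacc=wacc)
```

In the challenge itself, the hypothesis transcript would come from an ASR service (the summary mentions Azure Cognitive Services) run on the enhanced clips, and SIG/OVRL from the P.804 subjective test.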