11 Mar 2024 | Carlos Lassance, Hervé Déjean, Thibault Formal, Stéphane Clinchant
SPLADE-v3: New Baselines for SPLADE
This paper accompanies the latest release of the SPLADE library and presents SPLADE-v3, the authors' newest series of sparse retrieval models. It describes changes to the training structure and compares SPLADE-v3 with BM25, SPLADE++, and cross-encoder re-rankers, demonstrating its effectiveness through a meta-analysis over more than 40 query sets. SPLADE-v3 is statistically significantly more effective than both BM25 and SPLADE++, and compares reasonably to cross-encoder re-rankers: it achieves more than 40 MRR@10 on the MS MARCO dev set and improves out-of-domain results on the BEIR benchmark by 2%.
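For context, MRR@10 rewards a system for placing the first relevant document near the top of its ranking, cut off at depth 10. A minimal sketch of the computation, using toy data (all query and document ids below are made up for illustration):

```python
def mrr_at_10(rankings, relevant):
    """Mean Reciprocal Rank at depth 10.

    rankings: dict mapping query id -> ranked list of doc ids
    relevant: dict mapping query id -> set of relevant doc ids
    """
    total = 0.0
    for qid, ranked in rankings.items():
        for rank, doc_id in enumerate(ranked[:10], start=1):
            if doc_id in relevant.get(qid, set()):
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)

# Toy example: q1's relevant doc sits at rank 2, q2's at rank 1.
rankings = {"q1": ["d3", "d7", "d1"], "q2": ["d9", "d2"]}
relevant = {"q1": {"d7"}, "q2": {"d9"}}
print(mrr_at_10(rankings, relevant))  # (1/2 + 1/1) / 2 = 0.75
```

The figure of "over 40 MRR@10" follows the common MS MARCO convention of reporting the score multiplied by 100, i.e. an MRR@10 above 0.40.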
The authors detail several improvements to the training of SPLADE models: using multiple negatives per batch, better distillation scores, a combination of two distillation losses (KL-divergence and MarginMSE), and further fine-tuning of an already trained SPLADE model. They also introduce three new variants: SPLADE-v3-DistilBERT, SPLADE-v3-Lexical, and SPLADE-v3-Doc. These variants differ in their starting checkpoints and training methods, leading to different performance characteristics.
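To make the loss combination concrete, here is a minimal PyTorch sketch of a KL-divergence plus MarginMSE distillation objective. This is not the paper's code; the tensor layout (positive passage in column 0) and the weighting `lambda_mse` are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_scores, teacher_scores, lambda_mse=0.05):
    """Combine KL-divergence and MarginMSE distillation over a batch.

    student_scores, teacher_scores: [batch, n_docs] relevance scores,
    with the positive passage in column 0 and negatives after it.
    """
    # KL divergence between score distributions over the candidate passages.
    kl = F.kl_div(
        F.log_softmax(student_scores, dim=-1),
        F.softmax(teacher_scores, dim=-1),
        reduction="batchmean",
    )
    # MarginMSE: match the student's positive-minus-negative margins
    # to the teacher's margins.
    student_margins = student_scores[:, :1] - student_scores[:, 1:]
    teacher_margins = teacher_scores[:, :1] - teacher_scores[:, 1:]
    margin_mse = F.mse_loss(student_margins, teacher_margins)
    return kl + lambda_mse * margin_mse
```

In practice the two losses operate at different scales, so the relative weight is a tuning knob; the value above is a placeholder, not the paper's setting.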
The authors evaluate SPLADE-v3 on a range of datasets, including MS MARCO, BEIR, and LoTTE, comparing it to BM25, SPLADE++SelfDistil, and cross-encoder re-rankers. The results show that SPLADE-v3 outperforms BM25 and rivals some re-rankers. The SPLADE-v3-DistilBERT variant is less effective than the BERT-based model but more effective than the lexical variant on BEIR. SPLADE-v3-Doc, which spends no neural computation on the query, is the least effective, showing that even a minimal amount of computation on the query side matters.
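As background for these comparisons: SPLADE represents queries and documents as sparse vectors over the model's vocabulary, and relevance is their dot product. The sketch below shows SPLADE-style encoding with Hugging Face Transformers, following the published SPLADE formulation (log(1 + ReLU(logits)) with max pooling over tokens); the checkpoint name is illustrative and may not match the released SPLADE-v3 models:

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Checkpoint name is illustrative, not necessarily the released model id.
model_name = "naver/splade-v3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

def splade_encode(text):
    """Map text to a sparse vocabulary-sized vector:
    log(1 + ReLU(logits)), max-pooled over the token positions."""
    tokens = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**tokens).logits  # [1, seq_len, vocab_size]
    weights = torch.log1p(torch.relu(logits))
    # Mask out padding, then max-pool over the sequence dimension.
    mask = tokens["attention_mask"].unsqueeze(-1)
    return (weights * mask).max(dim=1).values.squeeze(0)

query_vec = splade_encode("what causes tides")
doc_vec = splade_encode("Tides are caused by the gravitational pull of the moon.")
score = torch.dot(query_vec, doc_vec)  # relevance = sparse dot product
```

The Doc variant drops the query encoder entirely, keeping only a bag-of-words query, which is why it is cheapest at query time and least effective.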
The authors conclude that SPLADE-v3 is statistically significantly more effective than previous iterations of SPLADE. It outperforms BM25 and rivals some re-rankers on most query sets, including zero-shot settings, while the new variants offer different trade-offs between effectiveness and efficiency.