RankMamba: Benchmarking Mamba's Document Ranking Performance in the Era of Transformers

7 Apr 2024 | Zhichao Xu
This paper evaluates Mamba, a state-space-model-based architecture, on document ranking, a critical component of information retrieval. Mamba has shown transformer-equivalent performance in sequence modeling tasks, and the study compares Mamba models with transformer-based models in terms of both ranking performance and training efficiency. Under similar training recipes, Mamba models achieve performance competitive with transformer-based models, and encoder-only transformers such as RoBERTa remain strong performers on document ranking. However, Mamba models have lower training throughput than efficient transformer implementations such as Flash Attention. The paper benchmarks Mamba and transformer-based models on the MS MARCO dataset and other benchmark datasets, and discusses Mamba's limitations, most notably its lower training throughput and the need for further research to improve its efficiency. The study concludes that Mamba is a viable alternative to transformer-based models for document ranking, though its training efficiency still needs improvement. The code and trained checkpoints are made publicly available for reproducibility.
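To make the ranking setup concrete, below is a minimal sketch of pointwise cross-encoder reranking, the standard formulation in which a model scores each (query, document) pair and documents are sorted by score. It uses a `roberta-base` backbone from Hugging Face `transformers` purely as a stand-in; the paper's actual Mamba-based models, training recipe, and released checkpoints may differ, and the untuned head here would need fine-tuning on MS MARCO-style relevance labels to produce meaningful scores.

```python
# Hedged sketch: pointwise cross-encoder reranking with a stand-in backbone.
# Requires `torch` and `transformers`. The checkpoint name and scoring head
# are illustrative assumptions, not the paper's released RankMamba models.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "roberta-base"  # assumption: generic backbone, needs fine-tuning for ranking
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1)
model.eval()

query = "what is a state space model"
documents = [
    "Mamba is a selective state space model for sequence modeling.",
    "The capital of France is Paris.",
]

# Score each (query, document) pair; a higher score means more relevant.
with torch.no_grad():
    inputs = tokenizer(
        [query] * len(documents),
        documents,
        padding=True,
        truncation=True,
        max_length=512,
        return_tensors="pt",
    )
    scores = model(**inputs).logits.squeeze(-1)

# Sort documents by descending relevance score.
ranking = sorted(zip(documents, scores.tolist()), key=lambda x: x[1], reverse=True)
for doc, score in ranking:
    print(f"{score:.3f}  {doc}")
```

The same pointwise formulation applies regardless of backbone: swapping the encoder for a Mamba-style sequence model changes the representation layer, not the scoring-and-sorting pipeline.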