ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT

July 25–30, 2020 | Omar Khattab, Matei Zaharia
ColBERT is a novel ranking model that leverages contextualized late interaction over BERT for efficient and effective passage search. Its late interaction architecture independently encodes queries and documents using BERT, then applies a cheap yet powerful interaction step to model their fine-grained similarity. By delaying yet retaining this fine-grained interaction, ColBERT exploits the expressiveness of deep language models while allowing document representations to be pre-computed offline, which dramatically speeds up query processing. Because the interaction mechanism is pruning-friendly, ColBERT can also leverage vector-similarity indexes for end-to-end retrieval directly from large document collections.

Extensive evaluations on two recent passage search datasets show that ColBERT is competitive with existing BERT-based models and outperforms non-BERT baselines, while executing two orders of magnitude faster and requiring four orders of magnitude fewer FLOPs per query. Indexing, the only stage at which documents are fed through BERT, is practical: the MS MARCO collection can be indexed in about three hours.

ColBERT's effectiveness is attributed to late interaction, its implementation via MaxSim operations, and key design choices within its BERT-based encoders. The main contributions are: proposing late interaction as a paradigm for efficient and effective neural ranking; presenting ColBERT as a highly effective instantiation of this paradigm; showing how to leverage ColBERT both for re-ranking and for full-collection search; and evaluating it on MS MARCO and TREC CAR.

Architecturally, ColBERT consists of a query encoder, a document encoder, and a late interaction mechanism. Both encoders use BERT to produce contextualized token embeddings. Relevance is scored via late interaction: for each query token embedding, ColBERT takes the maximum similarity against all document token embeddings, then sums these maxima. This mechanism is differentiable, so the model is end-to-end trainable.
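The MaxSim scoring described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes the token embeddings have already been produced by the encoders and L2-normalized, so a dot product equals cosine similarity.

```python
import numpy as np

def maxsim_score(Q, D):
    """Late-interaction relevance score between a query and a document.

    Q: (num_query_tokens, dim) query token embeddings (assumed normalized)
    D: (num_doc_tokens, dim) document token embeddings (assumed normalized)
    """
    sim = Q @ D.T                 # pairwise token-level similarities
    return sim.max(axis=1).sum()  # max over doc tokens, summed over query tokens
```

Because each document's embedding matrix `D` is independent of the query, it can be computed once at indexing time; only the small query matrix `Q` and the cheap matrix product are needed at query time.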
ColBERT's indexing process pre-computes and stores all document token embeddings, enabling efficient retrieval. Its end-to-end retrieval effectiveness is demonstrated by retrieving the top-k results directly from a large document collection, rather than only re-ranking a candidate set. Indexing throughput and space footprint are also evaluated: the MS MARCO collection can be indexed in about three hours with a space footprint of as little as tens of GiBs. Ablation studies further validate the design, showing the importance of late interaction, query augmentation, and end-to-end retrieval. Together, this efficiency and effectiveness make ColBERT a promising approach for passage search at scale.