[slides and audio] From Matching to Generation: A Survey on Generative Information Retrieval

The paper "From Matching to Generation: A Survey on Generative Information Retrieval" by Xiaoxi Li et al. provides a comprehensive review of the latest research in generative information retrieval (GenIR). The authors highlight the evolution of IR systems from traditional similarity-based methods to generative approaches, driven by advancements in pre-trained language models. The paper categorizes GenIR into two main areas: generative document retrieval (GR) and reliable response generation. GR leverages generative models to directly generate document identifiers, eliminating the need for large-scale document indices. Reliable response generation, on the other hand, uses language models to generate user-centric responses, enhancing flexibility, efficiency, and creativity. The review covers advancements in GR, including model training, document identifier design, incremental learning, downstream task adaptation, multi-modal GR, and generative recommendation. It also discusses reliable response generation, focusing on internal knowledge memorization, external knowledge augmentation, generating responses with citations, and improving personal information assistance. The paper aims to offer a systematic reference for researchers in the GenIR field, addressing current challenges and future prospects.

From Matching to Generation: A Survey on Generative Information Retrieval

16 May 2024 | Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yuyao Zhang, Peitian Zhang, Yutao Zhu, and Zhicheng Dou