USimAgent: Large Language Models for Simulating Search Users

July 14–18, 2024 | Erhan Zhang, Xingzhu Wang, Peiyuan Gong, Yankai Lin, Jiaxin Mao
USimAgent is a large language model (LLM)-based simulator of user search behavior, designed to generate realistic search sessions for information retrieval systems. It simulates the querying, clicking, and stopping actions a user takes during a search, so it can generate a complete search sequence for a specific task. The simulator relies on the LLM's ability to understand and process natural language, which lets it integrate information from the session context and the external environment to make the simulation more realistic. Zero-shot and few-shot learning let it adapt to various scenarios without additional training for each task. Inspired by the ReAct method, USimAgent expands the action space to combine reasoning steps with action steps: it reasons about the current context before executing each action, which produces coherent behavioral outputs.

In empirical tests on a real user behavior dataset, USimAgent outperformed existing methods in query generation and was comparable to traditional click models and stopping strategies in predicting user clicks and stopping behavior. That it did not surpass the traditional models on clicks and stopping may have two causes: its click prediction does not account for position bias, and it uses zero-shot learning, which may be less effective than models trained on large datasets. These results validate the effectiveness of using LLMs for user simulation and highlight the need for more robust and general user simulators. The study concludes that USimAgent is a promising framework for simulating search user behavior, and that future research could focus on combining LLMs with broader datasets to improve performance in user search simulation.
The code and data are available at https://github.com/Meow-E/USimAgent.
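To make the reason-then-act loop described in the summary concrete, here is a minimal Python sketch of a session simulator with three actions (query, click, stop). It is an illustration under stated assumptions, not the authors' implementation: the names `Step`, `Session`, `simulate`, `scripted_policy`, and `toy_search` are hypothetical, and the scripted policy stands in for the LLM call that would produce a thought and an action in the real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    thought: str        # free-text reasoning produced before acting
    action: str         # one of "query", "click", "stop"
    argument: str = ""  # query text or clicked result; empty for "stop"


@dataclass
class Session:
    task: str
    history: List[Step] = field(default_factory=list)


# Hypothetical interfaces: a policy maps (session, current results) to the
# next step; a search function maps a query string to a ranked result list.
Policy = Callable[[Session, List[str]], Step]
Search = Callable[[str], List[str]]


def simulate(task: str, policy: Policy, search: Search, max_steps: int = 10) -> Session:
    """Run a reason-then-act loop until the policy stops or max_steps is hit."""
    session = Session(task)
    results: List[str] = []
    for _ in range(max_steps):
        step = policy(session, results)  # in a real system: an LLM call
        session.history.append(step)
        if step.action == "query":
            results = search(step.argument)  # new result page replaces the old one
        elif step.action == "stop":
            break
        # "click" only needs to be recorded in the history
    return session


def scripted_policy(session: Session, results: List[str]) -> Step:
    """Stand-in for the LLM: query once, click the top result, then stop."""
    n = len(session.history)
    if n == 0:
        return Step("I need an initial query for the task.", "query", session.task)
    if n == 1 and results:
        return Step("The first result looks relevant.", "click", results[0])
    return Step("I have enough information.", "stop")


def toy_search(query: str) -> List[str]:
    """Toy search engine returning three placeholder results."""
    return [f"doc about {query} #{i}" for i in range(1, 4)]


session = simulate("python list sorting", scripted_policy, toy_search)
for step in session.history:
    print(step.action, "|", step.argument, "|", step.thought)
```

Keeping the policy behind a plain callable means the scripted stand-in could be swapped for a function that prompts an LLM with the task, the session history, and the current results, without changing the loop itself.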