March 11–14, 2024, Boulder, CO, USA | Callie Y. Kim, Christine P. Lee, Bilge Mutlu
This study explores design requirements for integrating large language models (LLMs) into robots to enhance human-robot interaction (HRI). The research compares an LLM-powered social robot with text-based and voice-based agents across four task types: choose, generate, execute, and negotiate. Key findings indicate that the LLM-powered robot is preferred for tasks involving connection-building and deliberation, where its non-verbal cues and social interaction are an advantage. It is less favored in tasks requiring logical communication, however, and can induce anxiety through verbose responses and communication errors. The study offers design implications for integrating LLMs into robots, emphasizing rich non-verbal cues and task-specific fine-tuning of LLMs, and highlights the importance of accounting for task characteristics and user preferences when optimizing HRI experiences.