Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction

27 Mar 2024 | Inhwan Bae, Junoh Lee, Hae-Gon Jeon
This paper proposes LMTraj, a language-based multimodal trajectory predictor that recasts trajectory prediction as a question-answering task. Unlike traditional numerical regression models, which treat trajectory coordinates as continuous signals, LMTraj treats them as discrete signals and converts them into text prompts.

The method first transforms trajectory coordinates and scene images into text using numerical tokenization and image captioning, then integrates the resulting text into a question-answering template that guides the language model to reason about social relationships and scene context. A numerical tokenizer is trained to separate the integer and decimal parts of each coordinate, enabling the model to capture correlations between consecutive numbers. The language model is then trained on these prompts, with beam search used for deterministic predictions and temperature-based sampling for stochastic, multimodal ones.

Evaluated in both zero-shot and supervised settings, LMTraj outperforms existing numerical-based predictors on public pedestrian trajectory benchmarks, accurately capturing social interactions and multimodal futures. By combining prompt engineering, multi-task learning, and language model training, the approach achieves state-of-the-art results and shows that language models, with their capacity for understanding and reasoning, offer a viable new route to pedestrian trajectory prediction.
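As a rough illustration of the pipeline summarized above, the Python sketch below turns an observed trajectory and a scene caption into a question-answering prompt and decodes it with a sequence-to-sequence language model. The template wording, the t5-small checkpoint, and the helper names (coords_to_text, make_prompt) are illustrative assumptions for this sketch, not the authors' released implementation.

    # Hypothetical sketch of the prompt-construction and decoding steps; names and
    # template wording are assumptions, not LMTraj's actual code.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    def coords_to_text(track, precision=2):
        """Render (x, y) coordinates as text, writing the integer and decimal
        parts explicitly so a numerical tokenizer can split them into digit groups."""
        return " ".join(f"({x:.{precision}f}, {y:.{precision}f})" for x, y in track)

    def make_prompt(history, scene_caption, pred_len=12):
        """Fill a question-answering template with the observed trajectory and a
        scene caption (wording is illustrative only)."""
        return (
            f"Scene: {scene_caption} "
            f"The pedestrian's last {len(history)} positions were {coords_to_text(history)}. "
            f"Question: what are the next {pred_len} positions?"
        )

    tokenizer = T5Tokenizer.from_pretrained("t5-small")            # stand-in checkpoint
    model = T5ForConditionalGeneration.from_pretrained("t5-small")

    history = [(3.12, 4.05), (3.25, 4.11), (3.39, 4.18)]            # toy observed track
    prompt = make_prompt(history, "a sidewalk next to a parked car")
    inputs = tokenizer(prompt, return_tensors="pt")

    # Deterministic prediction: beam search keeps the single most likely answer.
    det = model.generate(**inputs, num_beams=5, max_new_tokens=128)

    # Stochastic (multimodal) prediction: temperature sampling draws K distinct futures.
    sto = model.generate(**inputs, do_sample=True, temperature=1.0,
                         num_return_sequences=20, max_new_tokens=128)

    print(tokenizer.decode(det[0], skip_special_tokens=True))

In this sketch, beam search yields the single most likely future (the deterministic mode), while temperature sampling returns several candidate futures whose decoded text would then be parsed back into coordinate sequences for multimodal evaluation.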