[slides and audio] Approaching Human-Level Forecasting with Language Models

The paper "Approaching Human-Level Forecasting with Language Models" by Danny Halawi et al. explores whether language models (LMs) can forecast at the level of competitive human forecasters. The authors develop a retrieval-augmented LM system designed to automatically search for relevant information, generate forecasts, and aggregate predictions. To evaluate the system, they collect a large dataset of questions from competitive forecasting platforms and test the system's performance against the aggregates of human forecasts. The results show that the system's average Brier score approaches that of the human aggregate, and in some settings it even surpasses it. The work suggests that using LMs for forecasting can provide accurate and scalable predictions, aiding institutional decision-making. Key contributions include the creation of the largest and most recent dataset of real-world forecasting questions, the development of a retrieval-augmented LM system, and a self-supervised fine-tuning method to improve LM performance in forecasting tasks.
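For readers unfamiliar with the metric, the Brier score for binary questions is the mean squared difference between the forecast probability and the outcome (1 if the event happened, 0 otherwise); lower is better, and always answering 0.5 scores 0.25. The following is a minimal sketch of that standard definition, not the paper's evaluation code. The function names and sample numbers are hypothetical, and the simple-mean aggregation is an illustrative assumption rather than the method the authors use.

```python
from statistics import mean


def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and binary outcomes.

    forecasts: probabilities in [0, 1], one per question.
    outcomes:  resolved answers, 1 if the event occurred and 0 otherwise.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return mean((p - o) ** 2 for p, o in zip(forecasts, outcomes))


def aggregate_mean(predictions):
    """Combine several forecasts for ONE question by simple averaging.

    Assumption for illustration only: the summary above does not say
    which aggregation rule the system uses.
    """
    return mean(predictions)


# Hypothetical data: three questions with their resolved outcomes.
forecasts = [0.9, 0.2, 0.6]
outcomes = [1, 0, 1]
score = brier_score(forecasts, outcomes)  # roughly 0.07

# An uninformed forecaster who always says 0.5 scores exactly 0.25.
baseline = brier_score([0.5, 0.5, 0.5], outcomes)
```

Comparing `score` against `baseline` gives a quick sense of scale: a system is only useful if its average Brier score sits clearly below 0.25.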

Approaching Human-Level Forecasting with Language Models

28 Feb 2024 | Danny Halawi, Fred Zhang, Chen Yueh-Han, Jacob Steinhardt