April 26, 2024 | Benjamin S. Manning, Kehang Zhu, John J. Horton
The paper presents an approach for automatically generating and testing social-scientific hypotheses using large language models (LLMs) and structural causal models (SCMs). The key innovation is using SCMs to organize the research process: they provide a language for stating hypotheses, a blueprint for constructing LLM-based agents, an experimental design, and a plan for data analysis. The fitted SCM then serves as an object for prediction or for follow-on experiments. The authors demonstrate the approach in several scenarios: a negotiation, a bail hearing, a job interview, and an auction. In each case, the system proposes and tests causal relationships, finding evidence for some and not for others.

The insights from these simulations are not elicited directly from the LLM; they are derived from the fitted SCM. The LLMs are good at predicting the signs of estimated effects but struggle with their magnitudes, and in the auction experiment the LLM's predictions improve dramatically when conditioned on the fitted SCM. The paper also discusses the advantages of SCMs over other methods for studying causal relationships in social interactions, emphasizing the precision and automation they offer. The authors conclude by highlighting the potential of this approach to efficiently generate new insights about human behavior.
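To make the "SCM as research blueprint" idea concrete, here is a minimal sketch (not the authors' implementation) of the pipeline the summary describes: a hypothesized cause and outcome, an experiment that varies the cause across simulated interactions, and an ordinary-least-squares fit of a linear SCM whose coefficient can then be used for prediction. The variables (a buyer's budget and a final negotiated price) and the `simulate_negotiation` function are illustrative assumptions; in the paper that step would be an LLM-agent interaction, which is stubbed out here so the example runs on its own.

```python
# Sketch of an SCM-organized experiment, assuming a single linear path:
#   price = intercept + slope * budget + noise
import random
import statistics

def simulate_negotiation(budget: float) -> float:
    """Placeholder for an LLM-agent negotiation; returns a final price.
    Here it is a noisy linear response so the example is self-contained."""
    return 0.6 * budget + 10 + random.gauss(0, 5)

def run_experiment(budgets, reps=30):
    """Vary the hypothesized cause and record the outcome for each run."""
    data = []
    for b in budgets:
        for _ in range(reps):
            data.append((b, simulate_negotiation(b)))
    return data

def fit_linear_scm(data):
    """Estimate the intercept and path coefficient (slope) by OLS."""
    xs, ys = zip(*data)
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in data) / \
            sum((x - x_bar) ** 2 for x in xs)
    intercept = y_bar - slope * x_bar
    return intercept, slope

if __name__ == "__main__":
    data = run_experiment(budgets=[20, 40, 60, 80])
    intercept, slope = fit_linear_scm(data)
    # The fitted SCM is now an object for prediction or follow-on experiments.
    print(f"price ~ {intercept:.1f} + {slope:.2f} * budget")
```

The point of the sketch is the division of labor the paper emphasizes: the SCM fixes what to vary, what to measure, and how to analyze the results, while the (here stubbed) LLM agents only supply the behavioral responses.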