Efficient Causal Graph Discovery Using Large Language Models

20 Jul 2024 | Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, Yoshua Bengio
This paper proposes a novel framework for full causal graph discovery using large language models (LLMs). The framework organizes its queries around a breadth-first search (BFS), which reduces the number of LLM queries needed: each variable is expanded with a single query asking for all of its direct effects, so the method needs only $ O(n) $ queries for $ n $ variables, compared with the $ O(n^2) $ queries of methods that interrogate every pair of variables separately. The framework uses only metadata, such as variable names and descriptions, without requiring numerical observations, much as human domain experts construct causal graphs from their domain knowledge; when observational data are available, they can be incorporated to further improve performance. The method achieves state-of-the-art or competitive results on real-world causal graphs of varying sizes.

The method consists of three stages: initialization, expansion, and insertion. In the initialization stage, the LLM is prompted to identify the variables that are not causally affected by any other variable; these become the roots of the search. In the expansion stage, the LLM is asked which variables are directly caused by the node currently being expanded. In the insertion stage, the variables proposed by the LLM are added to the BFS queue, and each proposed edge is checked to ensure it does not create a cycle before it is added to the graph.
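The three stages can be made concrete with a short sketch. The Python below is a minimal illustration of the BFS loop, not the paper's implementation: `query_llm` stands in for a hypothetical helper that sends a prompt to an LLM and parses the returned variable names, and the prompt strings are placeholders.

```python
from collections import deque

def would_create_cycle(edges, u, v):
    """Return True if adding edge u -> v would close a cycle,
    i.e. if v can already reach u through the existing edges."""
    stack, seen = [v], set()
    while stack:
        node = stack.pop()
        if node == u:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(w for (x, w) in edges if x == node)
    return False

def discover_causal_graph(variables, query_llm):
    # Initialization: ask the LLM which variables are not caused by
    # any other variable; these seed the BFS frontier.
    roots = query_llm(
        f"Which of {variables} are not causally affected by any others?"
    )
    queue = deque(roots)
    visited = set(roots)
    edges = set()

    while queue:
        node = queue.popleft()
        # Expansion: one query per node, hence O(n) queries overall.
        children = query_llm(
            f"Which of {variables} are directly caused by {node}?"
        )
        for child in children:
            # Insertion: keep the graph acyclic before adding the edge,
            # and enqueue newly discovered variables for later expansion.
            if child != node and not would_create_cycle(edges, node, child):
                edges.add((node, child))
                if child not in visited:
                    visited.add(child)
                    queue.append(child)
    return edges
```

Because each variable is dequeued and expanded at most once, the number of LLM calls grows linearly with the number of variables, which is the source of the $ O(n) $ query complexity.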
The paper compares the proposed method against existing approaches: statistical methods, including Greedy Equivalence Search (GES), the Peter-Clark (PC) algorithm, NOTEARS (Non-combinatorial Optimization via Trace Exponential and Augmented Lagrangian for Structure Learning), and DAGMA (Directed Acyclic Graphs via M-matrices for Acyclicity), as well as LLM-based pairwise querying. The proposed method outperforms these baselines on the Asia and Child causal graphs and performs reasonably well on the large Neuropathic Pain causal graph, which is intractable for the other methods. Unlike previous LLM-based methods, it does not require observational data, yet still achieves state-of-the-art or competitive performance on all three graphs.
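For contrast, a pairwise-query baseline of the kind the paper compares against might look like the following sketch; `query_llm` and the fixed answer format it assumes are again hypothetical placeholders.

```python
from itertools import combinations

def pairwise_discovery(variables, query_llm):
    """Baseline: query every unordered pair once, so the number of
    queries grows as n * (n - 1) / 2 -- quadratic in n."""
    edges = set()
    for a, b in combinations(variables, 2):
        # Assumes a hypothetical fixed response format for parsing.
        answer = query_llm(
            f"Does {a} cause {b}, does {b} cause {a}, or neither?"
        )
        if answer == f"{a} causes {b}":
            edges.add((a, b))
        elif answer == f"{b} causes {a}":
            edges.add((b, a))
    return edges
```

For the 8-variable Asia graph, for example, this baseline issues $ 8 \times 7 / 2 = 28 $ queries, versus on the order of 8 for the BFS approach; the gap widens quadratically on larger graphs such as Neuropathic Pain.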
The method has limitations: it relies on the LLM's knowledge, its generated outputs lack attribution or grounding, and its performance depends on the quality of the underlying LLM. Future work includes exploring hybrid statistical/LLM approaches, more advanced prompting strategies, and evaluation with different LLMs. Code to reproduce all experiments is provided in an anonymized repository.