22 May 2024 | Congzhi Zhang, Linhai Zhang, Jialong Wu, Deyu Zhou, Yulan He
Causal Prompting: Debiasing Large Language Model Prompting Based on Front-Door Adjustment
This paper proposes a novel method for debiasing large language models (LLMs) using causal inference, specifically front-door adjustment. The method addresses the issue of biases in LLMs, which can lead to incorrect or unfaithful reasoning and answers. Traditional debiasing methods focus on model training, but they struggle with complex biases in LLMs. The proposed method uses a structural causal model to uncover the causal relationships behind prompting methods and employs front-door adjustment to mitigate biases.
The method treats the chain-of-thought (CoT) generated by the LLM as a mediator variable and computes the causal effect between the input prompt and the output answer via front-door adjustment. To represent CoTs accurately and estimate this causal effect, contrastive learning is used to fine-tune the CoT encoder, aligning its representation space with that of the LLM.
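For reference, the standard front-door adjustment formula for this setting is shown below; the symbols (X for the input prompt, R for the CoT mediator, A for the answer) are illustrative and may differ from the paper's exact notation:

$$
P(A \mid \mathrm{do}(X = x)) \;=\; \sum_{r} P(R = r \mid X = x) \sum_{x'} P(A \mid R = r, X = x')\, P(X = x')
$$

Loosely read against the description below, the outer sum over r corresponds to the representative CoTs obtained by clustering, while the inner term is approximated through demonstration retrieval and weighted voting.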
The proposed method combines CoT, self-consistency (SC), and in-context learning (ICL) through front-door adjustment to mitigate LLM biases in natural language processing tasks. Experimental results on seven natural language processing datasets show that the method performs strongly with both open-source and closed-source LLMs.
Concretely, the CoTs generated by the LLM are clustered and the cluster centers serve as representative CoTs. For each representative CoT, optimal demonstration examples are retrieved by an encoder-based intervention algorithm, and the final answer is obtained by weighted voting over the results of multiple queries to the LLM.
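A minimal sketch of the clustering and weighted-voting step, assuming CoT embeddings come from the fine-tuned encoder; the helper names, the choice of scikit-learn's KMeans, and the use of cluster size as the vote weight are illustrative assumptions rather than details from the paper:

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans


def representative_cots(cot_texts, cot_embeddings, n_clusters=5):
    """Cluster sampled CoTs and return the CoT closest to each cluster center."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(cot_embeddings)
    reps = []
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]
        # pick the member closest to the cluster center as the representative CoT
        center = km.cluster_centers_[c]
        best = idx[np.argmin(np.linalg.norm(cot_embeddings[idx] - center, axis=1))]
        # weight each representative by the fraction of sampled CoTs in its cluster
        reps.append((cot_texts[best], len(idx) / len(cot_texts)))
    return reps


def weighted_vote(answers_with_weights):
    """Aggregate answers from multiple LLM queries by summing their cluster weights."""
    scores = Counter()
    for answer, weight in answers_with_weights:
        scores[answer] += weight
    return scores.most_common(1)[0][0]
```

In this sketch, each representative CoT would be paired with its retrieved demonstrations, sent to the LLM, and the returned answers aggregated with `weighted_vote`.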
The method also aligns the representation spaces of the encoder and the LLM using contrastive learning to improve the accuracy of causal effect estimation. The results demonstrate that the proposed method significantly improves performance across various NLP tasks, particularly in math reasoning and multi-hop question answering. The method is robust to adversarial data and generalizes well to datasets with significant bias.
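As a rough illustration of this contrastive alignment between the encoder and the LLM, the sketch below uses a standard InfoNCE-style loss; the function name, the temperature value, and the assumption that both representations share the same dimensionality are mine, not details from the paper:

```python
import torch
import torch.nn.functional as F


def info_nce_alignment_loss(encoder_reps, llm_reps, temperature=0.07):
    """InfoNCE loss: the i-th encoder representation should match the i-th LLM
    representation of the same CoT (the diagonal) and repel all others.
    Assumes both tensors are (batch, dim) with the same dimensionality."""
    enc = F.normalize(encoder_reps, dim=-1)
    llm = F.normalize(llm_reps, dim=-1)
    # pairwise cosine similarities scaled by temperature
    logits = enc @ llm.t() / temperature
    # positives lie on the diagonal
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)


# usage sketch: encoder_reps requires grad, llm_reps are fixed LLM-side embeddings
# loss = info_nce_alignment_loss(encoder_reps, llm_reps)
# loss.backward()  # fine-tunes the encoder toward the LLM's representation space
```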