Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment

22 May 2024 | Congzhi Zhang, Linhai Zhang, Jialong Wu, Deyu Zhou, Yulan He
The paper introduces a novel method called Causal Prompting to address biases in large language models (LLMs) by leveraging front-door adjustment. Traditional debiasing methods focus on the training stage, such as data augmentation and reweighting, but struggle with complex biases in LLMs. Causal Prompting identifies the causal relationship behind prompting methods using a structural causal model and employs a mediator variable, the chain-of-thought (CoT), to estimate the causal effect between input prompts and output answers. The method combines CoT prompting with an encoder-based clustering algorithm to estimate the causal effect from prompts to CoTs and uses the normalized weighted geometric mean (NWGM) approximation to estimate the causal effect from CoTs to answers. Contrastive learning is used to align the representation space of the encoder with that of the LLMs for more accurate causal effect estimation. Experimental results show that Causal Prompting significantly improves performance across seven natural language processing (NLP) datasets on both open-source and closed-source LLMs, demonstrating its effectiveness in mitigating biases in LLMs.
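To make the mechanism concrete, the standard front-door adjustment formula (Pearl) that such a method builds on can be written as follows; the variable names here (X for the input prompt, M for the CoT mediator, A for the answer) are illustrative assumptions, not notation taken from the paper:

```latex
P(A = a \mid \mathrm{do}(X = x))
  = \sum_{m} \underbrace{P(m \mid x)}_{\text{prompt} \to \text{CoT}}
    \sum_{x'} \underbrace{P(a \mid m, x')}_{\text{CoT} \to \text{answer}} \, P(x')
```

In the summary above, the inner and outer factors correspond to the two estimation steps described: the prompt-to-CoT term is approximated via CoT sampling with encoder-based clustering, and the CoT-to-answer term via the NWGM approximation.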