1 Feb 2024 | Xin Quan, Marco Valentino, Louise A. Dennis, André Freitas
This paper addresses the challenge of enhancing the ethical reasoning and explainability of Large Language Models (LLMs) in Natural Language Inference (NLI) tasks, particularly in complex domains that require multi-step reasoning. The authors propose a neuro-symbolic framework called *Logic-Explainer*, which integrates LLMs with an external backward-chaining solver to refine step-wise natural language explanations. The framework aims to improve the logical validity, completeness, and non-redundancy of the ethical explanations produced by LLMs.
*Logic-Explainer* follows an iterative symbolic refinement methodology: the LLM first generates an initial explanation and a hypothesis about the underlying moral violation, which are then verified by a symbolic solver. If the explanation is logically invalid or contains redundant premises, the LLM refines it through abductive and deductive inference, generating missing premises and revising the hypothesis. This process is repeated iteratively until a complete and non-redundant explanation is constructed.
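The loop can be pictured as follows. The sketch below is an illustrative Python rendering under stated assumptions: the callables passed in (`generate`, `verify`, `abduce`, `revise`) are hypothetical stand-ins for the LLM prompts and the backward-chaining solver, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Illustrative sketch only; the helper callables are hypothetical stand-ins
# for the LLM calls and the backward-chaining solver described in the paper.

@dataclass
class ProofResult:
    valid: bool                 # does the explanation entail the hypothesis?
    used_premises: List[str]    # premises actually used in the proof

def logic_explainer(
    statement: str,
    generate: Callable[[str], Tuple[str, List[str]]],    # LLM: hypothesis + initial explanation
    verify: Callable[[List[str], str], ProofResult],     # symbolic solver (backward chaining)
    abduce: Callable[[str, str, List[str]], List[str]],  # LLM: missing premises (abduction)
    revise: Callable[[str, List[str]], str],             # LLM: revised hypothesis (deduction)
    max_iterations: int = 3,
) -> Tuple[str, List[str]]:
    hypothesis, explanation = generate(statement)
    for _ in range(max_iterations):
        result = verify(explanation, hypothesis)
        if result.valid:
            # Keep only premises the proof actually used, so the final
            # explanation is non-redundant.
            explanation = [p for p in explanation if p in result.used_premises]
            return hypothesis, explanation
        # Abductive step adds missing premises; deductive step revises the hypothesis.
        explanation = explanation + abduce(statement, hypothesis, explanation)
        hypothesis = revise(statement, explanation)
    return hypothesis, explanation  # best effort within the iteration budget
```

The design point worth noting is that the solver acts purely as a verifier: the LLM remains responsible for producing and repairing natural language premises, while the symbolic component decides validity and flags unused premises.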
The paper evaluates *Logic-Explainer* on ethical NLI benchmarks, demonstrating significant improvements in the accuracy of identifying underlying moral violations compared to in-context learning and Chain-of-Thought (CoT) methods. The results show that *Logic-Explainer* can improve logical validity from 22.9% to 65.1% and reduce redundancy from 86.6% to 4.6%. The authors also release a corpus of structured natural language explanations for ethical NLI, named ExplainEthics, to support future research in this area.
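The two aggregate figures can be read as simple proportions over the evaluated explanations. The sketch below shows one hypothetical way to tally them from per-example solver outputs; the `ProofResult` fields and the redundancy criterion are assumptions for illustration, not the paper's evaluation code.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hedged sketch of how the aggregate figures (logical validity, redundancy)
# could be tallied from per-example solver outputs.

@dataclass
class ProofResult:
    valid: bool                 # did the solver prove the hypothesis from the premises?
    used_premises: List[str]    # premises actually used in the derivation

def aggregate_metrics(results: List[ProofResult],
                      explanations: List[List[str]]) -> Dict[str, float]:
    n = len(results)
    valid = sum(r.valid for r in results)
    # Assumed operationalisation: an explanation counts as redundant when the
    # solver's derivation leaves some of its premises unused.
    redundant = sum(
        len(r.used_premises) < len(expl)
        for r, expl in zip(results, explanations)
    )
    return {
        "logical_validity_pct": 100.0 * valid / n,
        "redundancy_pct": 100.0 * redundant / n,
    }
```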
The contributions of the paper include the introduction of *Logic-Explainer*, extensive experiments on multi-step NLI tasks, and the release of the ExplainEthics corpus. The paper highlights the effectiveness of neuro-symbolic methods in enhancing the logical consistency, reliability, and alignment of LLMs in ethical reasoning.