HD-EVAL: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition

24 Feb 2024 | Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
HD-EVAL is a novel framework that aligns large language model (LLM) evaluators with human preferences through hierarchical criteria decomposition. The framework addresses limitations of LLM-based evaluation, such as limited scope and potential bias, by decomposing an evaluation task into finer-grained criteria, aggregating them according to estimated human preferences, pruning insignificant criteria, and further decomposing the significant ones. The iterative alignment training process integrates these steps to obtain a hierarchical decomposition of criteria that captures aspects of natural language quality at multiple granularities. Implemented as a white-box model, HD-EVAL's human preference-guided aggregator is efficient to train and more explainable than solely prompting LLMs. Extensive experiments on three evaluation domains (summarization, conversation, and data-to-text) demonstrate the superiority of HD-EVAL in aligning state-of-the-art evaluators and in providing deeper insight into both the evaluation results and the task itself. Key contributions include the proposed framework, its white-box nature, its applicability to both open-source and API-hosted LLMs, and comprehensive experimental results showing superior performance in aligning LLM-based evaluators.
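To make the iterative pipeline concrete, below is a minimal Python sketch of the decompose-aggregate-prune loop described above. It is one possible reading, not the authors' implementation: the helpers `llm_score` and `decompose`, the least-squares aggregator, and the thresholds `prune_eps` and `2 * prune_eps` are all hypothetical stand-ins.

```python
import numpy as np

def llm_score(sample: str, criterion: str) -> float:
    """Placeholder: prompt an evaluator LLM to rate `sample` on `criterion`."""
    raise NotImplementedError  # hypothetical; wire up your LLM call here

def decompose(criterion: str) -> list[str]:
    """Placeholder: prompt an LLM to split a criterion into finer sub-criteria."""
    raise NotImplementedError  # hypothetical

def hd_eval_align(samples, human_scores, criteria, iterations=3, prune_eps=0.05):
    """Sketch of an iterative decompose-aggregate-prune alignment loop."""
    weights = {}
    for step in range(iterations):
        # 1. Score every sample on every current leaf criterion with the LLM.
        X = np.array([[llm_score(s, c) for c in criteria] for s in samples])
        # 2. Fit a white-box linear aggregator to the human preference scores.
        w, *_ = np.linalg.lstsq(X, np.asarray(human_scores, dtype=float), rcond=None)
        weights = dict(zip(criteria, w))
        # 3. Prune criteria whose learned weight is negligible.
        kept = [c for c in criteria if abs(weights[c]) >= prune_eps]
        # 4. Further decompose the most significant criteria (skip on last round
        #    so the returned criteria match the last fitted weights). The
        #    2 * prune_eps significance cutoff is an arbitrary illustration.
        if step < iterations - 1:
            criteria = [sub for c in kept
                        for sub in (decompose(c) if abs(weights[c]) > 2 * prune_eps
                                    else [c])]
        else:
            criteria = kept
    return criteria, weights
```

The key design point this sketch tries to capture is that the aggregator is a plain linear model over criterion scores, so each criterion's contribution to the final evaluation is directly inspectable, which is what makes the approach more explainable than prompting an LLM for a single holistic score.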