HD-EVAL: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition

24 Feb 2024 | Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
HD-EVAL is a novel framework that aligns large language model (LLM) evaluators with human preferences through hierarchical criteria decomposition. The framework iteratively decomposes evaluation tasks into finer-grained criteria, aggregates the per-criterion results according to human preferences, and prunes insignificant criteria so that the evaluator concentrates on the aspects that matter. This hierarchical decomposition captures multiple levels of granularity in natural language, enabling more comprehensive and accurate evaluations.

The framework comprises three key stages: hierarchical criteria decomposition, human preference-guided aggregation, and attribution pruning. These stages are integrated into an iterative alignment training process that refines both the evaluation criteria and the aggregator. Because the aggregator is a white-box model rather than a prompt, HD-EVAL is efficient to train and more explainable than relying on prompting alone, and it is applicable to both open-source and closed-source LLMs.

Extensive experiments across three evaluation domains demonstrate that HD-EVAL significantly improves the alignment of state-of-the-art evaluators, outperforms existing automatic evaluation metrics, and provides deeper insights into evaluation results and tasks. The framework also remains effective in data-scarce scenarios and, through its white-box aggregator, yields explainable results. The work highlights the importance of hierarchical thinking in evaluation and offers a promising alternative to traditional human evaluation.
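To make the three-stage loop concrete, below is a minimal Python sketch of one alignment iteration. It is illustrative only, not the paper's exact algorithm: `llm_score` and `decompose` are hypothetical stand-ins for LLM prompts, and a simple least-squares linear model stands in for HD-EVAL's human preference-guided white-box aggregator.

```python
import numpy as np

# --- Hypothetical stubs: in practice these would prompt an LLM. ---

def llm_score(response: str, criterion: str) -> float:
    """Stand-in for an LLM rating `response` on one criterion (1-5 scale)."""
    return 1 + (hash((response, criterion)) % 500) / 125.0  # deterministic dummy

def decompose(criterion: str) -> list[str]:
    """Stand-in for an LLM splitting a criterion into finer-grained sub-criteria."""
    return [f"{criterion} / sub-{i}" for i in (1, 2)]

# --- One HD-EVAL-style alignment iteration (illustrative sketch). ---

def align_iteration(responses, human_scores, criteria, prune_threshold=0.05):
    # 1) Score every response on every current leaf criterion.
    X = np.array([[llm_score(r, c) for c in criteria] for r in responses])

    # 2) Human preference-guided aggregation: fit a white-box linear
    #    aggregator mapping criterion scores to human scores (least squares).
    w, *_ = np.linalg.lstsq(X, np.array(human_scores), rcond=None)

    # 3) Attribution pruning: drop criteria whose learned weight
    #    contributes little to the aggregated score.
    attribution = np.abs(w) / np.abs(w).sum()
    kept = [c for c, a in zip(criteria, attribution) if a >= prune_threshold]

    # 4) Hierarchical decomposition: expand the surviving criteria
    #    into finer-grained ones for the next iteration.
    next_criteria = [sub for c in kept for sub in decompose(c)]
    return dict(zip(criteria, w)), next_criteria

if __name__ == "__main__":
    responses = [f"response {i}" for i in range(8)]
    human = [3.1, 4.0, 2.2, 3.8, 4.5, 2.9, 3.3, 4.1]  # made-up human ratings
    criteria = ["fluency", "coherence", "relevance"]
    weights, criteria = align_iteration(responses, human, criteria)
    print("learned weights:", weights)
    print("next-round criteria:", criteria)
```

In HD-EVAL this loop repeats, so the criteria hierarchy deepens where it helps alignment and is pruned where it does not; the learned aggregator weights are what make the final evaluator explainable, since each criterion's contribution to a score can be read off directly.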