Fairness in Large Language Models: A Taxonomic Survey


August 2024 | Zhibo Chu, Zichong Wang, and Wenbin Zhang
This survey provides a comprehensive overview of recent advances in fairness in large language models (LLMs). It begins with an introduction to LLMs, followed by an analysis of the factors that contribute to bias in them. The concept of fairness in LLMs is then discussed categorically: metrics for evaluating bias and existing algorithms for promoting fairness are summarized, along with resources for evaluating bias, including toolkits and datasets. Finally, existing research challenges and open questions are discussed.

LLMs have demonstrated remarkable capabilities across diverse domains, from chatbots to medical diagnosis and financial advising. However, they can inherit biases from the real world and even exacerbate them, which may lead to discrimination against certain populations, especially in socially sensitive applications. The research community has made many efforts to address bias and discrimination in LLMs. Nevertheless, the notions of fairness studied vary across works, which can be confusing and impede further progress. Moreover, different algorithms are developed to achieve different fairness notions, and the lack of a clear framework mapping these notions to their corresponding methodologies complicates the design of algorithms for future fair LLMs. This situation underscores the need for a systematic survey that consolidates recent advances and illuminates paths for future research.

The survey categorizes recent studies on the fairness of LLMs from three perspectives: i) metrics for quantifying bias in LLMs, ii) algorithms for mitigating bias in LLMs, and iii) resources for evaluating bias in LLMs. Metrics are further categorized by the data format they operate on: i) embedding-based, ii) probability-based, and iii) generation-based metrics (an illustrative sketch of an embedding-based metric appears below). Mitigation techniques are organized by the stage of the LLM workflow at which they intervene: i) pre-processing, ii) in-training, iii) intra-processing, and iv) post-processing (a sketch of a pre-processing technique also appears below). Resources are grouped into toolkits and datasets; datasets are further classified by the metric type for which they are most appropriate: i) probability-based and ii) generation-based. Finally, the survey explores current research challenges and future directions in the field of fair LLMs.

The main contributions of this work are: i) Introduction to LLMs: an introduction to the fundamental principles of LLMs, their training process, and the bias stemming from that training, which lays the groundwork for a more in-depth exploration of the fairness of LLMs; ii) Comprehensive metrics and algorithms review: a comprehensive overview of the three categories of bias-quantification metrics and the four categories of bias-mitigation algorithms; iii) Resources for evaluation: a collection of toolkits and datasets for evaluating bias in LLMs; and iv) Challenges and future directions: a discussion of current research challenges and open questions.
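To make the embedding-based category concrete, below is a minimal sketch of a WEAT-style association test (Word Embedding Association Test), a widely used embedding-based bias metric. The random vectors are toy stand-ins for embeddings that would, in practice, be extracted from an LLM, and the set names (career/family, male/female terms) are illustrative assumptions rather than data from the survey.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus to set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """WEAT-style effect size (Cohen's d over per-word associations)
    between target sets X, Y and attribute sets A, B."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    pooled_std = np.std(s_x + s_y, ddof=1)
    return (np.mean(s_x) - np.mean(s_y)) / pooled_std

# Toy example: random vectors stand in for real LLM embeddings, e.g.
# career vs. family words (targets) against male vs. female terms (attributes).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 50))  # target set 1 (e.g., career words)
Y = rng.normal(size=(8, 50))  # target set 2 (e.g., family words)
A = rng.normal(size=(8, 50))  # attribute set 1 (e.g., male terms)
B = rng.normal(size=(8, 50))  # attribute set 2 (e.g., female terms)
print(f"effect size d = {weat_effect_size(X, Y, A, B):.3f}")  # near 0 for random vectors
```

Values of the effect size far from zero indicate that one target set sits systematically closer to one attribute set in embedding space, which is the signal embedding-based metrics treat as bias.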
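As an example of the pre-processing stage of the mitigation taxonomy, the following is a minimal sketch of counterfactual data augmentation (CDA), a common pre-processing strategy in the fairness literature. The term-pair lexicon and corpus are illustrative assumptions; a real pipeline would use a curated lexicon and handle morphology, names, and ambiguous pronouns (e.g., possessive vs. objective "her").

```python
import re

# Illustrative, non-exhaustive gendered term pairs.
SWAP = {"he": "she", "she": "he", "him": "her", "her": "him",
        "his": "her", "hers": "his", "man": "woman", "woman": "man"}

def counterfactual(sentence: str) -> str:
    """Swap gendered terms to produce a counterfactual training example."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        # Preserve capitalization of the original token.
        return swapped.capitalize() if word[0].isupper() else swapped
    pattern = r"\b(" + "|".join(SWAP) + r")\b"
    return re.sub(pattern, repl, sentence, flags=re.IGNORECASE)

corpus = ["He is a talented engineer.", "The nurse said she would help him."]
# Train on the union of original and counterfactual sentences so that
# gendered contexts are balanced before the model ever sees the data.
augmented = corpus + [counterfactual(s) for s in corpus]
for s in augmented:
    print(s)
```

The design choice that makes this a pre-processing method is that it touches only the training data: the model architecture, training objective, and decoding procedure are left unchanged.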