9 May 2024 | LENA ARMSTRONG, ABBEY LIU, STEPHEN MACNEIL, DANAË METAXA
This study investigates the potential racial and gender biases in large language models (LLMs), specifically OpenAI's GPT-3.5, in the context of hiring practices. The researchers conducted two algorithmic audits: one assessing how GPT scores resumes based on names indicating different racial and gender identities, and another generating resumes for those names. The findings reveal that GPT reflects and exacerbates existing biases, such as lower scores for resumes with names associated with women and people of color, and generating resumes with stereotypes, such as assigning less experienced roles to women and immigrant markers to Asian and Hispanic names. The study highlights the need for algorithmic auditing to address biases in automated hiring systems, as LLMs may perpetuate systemic inequalities. The results contribute to the growing body of research on LLM biases, emphasizing the importance of auditing these systems to ensure fairness and equity in hiring practices. The study also underscores the potential for a "silicon ceiling," an invisible barrier that disproportionately affects marginalized groups in hiring. The findings suggest that future research should consider a broader range of biases, including age, educational status, and nationality, when auditing automated systems. The study calls for ongoing research and policy development to address the ethical and social implications of LLMs in hiring.This study investigates the potential racial and gender biases in large language models (LLMs), specifically OpenAI's GPT-3.5, in the context of hiring practices. The researchers conducted two algorithmic audits: one assessing how GPT scores resumes based on names indicating different racial and gender identities, and another generating resumes for those names. The findings reveal that GPT reflects and exacerbates existing biases, such as lower scores for resumes with names associated with women and people of color, and generating resumes with stereotypes, such as assigning less experienced roles to women and immigrant markers to Asian and Hispanic names. The study highlights the need for algorithmic auditing to address biases in automated hiring systems, as LLMs may perpetuate systemic inequalities. The results contribute to the growing body of research on LLM biases, emphasizing the importance of auditing these systems to ensure fairness and equity in hiring practices. The study also underscores the potential for a "silicon ceiling," an invisible barrier that disproportionately affects marginalized groups in hiring. The findings suggest that future research should consider a broader range of biases, including age, educational status, and nationality, when auditing automated systems. The study calls for ongoing research and policy development to address the ethical and social implications of LLMs in hiring.