February 5, 2024 | Sierra Wyllie, Ilia Shumailov, and Nicolas Papernot
Fairness feedback loops occur when model outputs influence the data used to train subsequent models, leading to biased outcomes. These shifts, known as model-induced distribution shifts (MIDS), can cause performance, fairness, and representation losses even in initially unbiased datasets. Over successive generations, MIDS amplify existing biases in both generative and supervised models, producing unfairness and reduced representation for minoritized groups.

However, models can also be used to intentionally counteract these harms through algorithmic reparation (AR), which aims to correct historical discrimination by adjusting training data to promote equity. AR interventions such as STratified AR (STAR) mitigate the negative impacts of MIDS by ensuring more representative training data.

The study evaluates MIDS in sequential classifier and generator settings, showing that they can lead to significant performance and fairness degradation. AR interventions, particularly STAR, reduce these harms by improving fairness and representation, and counteract the effects of MIDS even in datasets with high proportions of synthetic data. The results highlight the importance of understanding and addressing MIDS to ensure fair and equitable outcomes in machine learning systems.
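To make the feedback loop concrete, here is a minimal Python sketch of a MIDS-style sequential-classifier simulation with a simplified stand-in for a STAR-like stratified resampling step. The toy population model, the `sample_population`, `star_resample`, and `run` helpers, and all parameters are hypothetical illustrations, not the paper's experimental setup: each generation's classifier labels the data the next generation trains on, so its errors feed forward, and the resampling step rebalances group representation before retraining.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def sample_population(n, minority_frac=0.2):
    """Draw features, group membership, and true labels for a toy population."""
    group = (rng.random(n) < minority_frac).astype(int)  # 1 = minoritized group
    # The minoritized group sits closer to the decision boundary, so small
    # labelling errors there compound across generations.
    x = rng.normal(loc=np.where(group == 1, 0.4, 1.0), scale=0.8, size=n)
    y = (x + rng.normal(scale=0.3, size=n) > 0.5).astype(int)
    return x.reshape(-1, 1), group, y

def star_resample(x, group, y, target_frac=0.5):
    """Simplified STAR-style step (an assumption, not the paper's algorithm):
    stratified resampling toward a target group distribution before retraining."""
    n = len(y)
    idx_min = np.flatnonzero(group == 1)
    idx_maj = np.flatnonzero(group == 0)
    take_min = rng.choice(idx_min, size=int(target_frac * n), replace=True)
    take_maj = rng.choice(idx_maj, size=n - take_min.size, replace=True)
    keep = np.concatenate([take_min, take_maj])
    return x[keep], group[keep], y[keep]

def run(generations=8, use_star=False):
    x, group, y = sample_population(5000)  # generation 0 trains on real labels
    for gen in range(generations):
        model = LogisticRegression().fit(x, y)
        # MIDS step: the next generation's "ground truth" is this model's
        # predictions, so its mistakes become part of the training data.
        x, group, _ = sample_population(5000)
        y = model.predict(x)
        if use_star:
            x, group, y = star_resample(x, group, y)
        rate = y[group == 1].mean()
        print(f"gen {gen}: positive rate in minoritized group = {rate:.2f}")

run(use_star=False)  # outcomes for the minoritized group drift as errors compound
run(use_star=True)   # stratified rebalancing dampens the drift
```

Comparing the two printed trajectories captures the spirit of the sequential-classifier experiments: in this toy, the minoritized group's outcomes drift as synthetic labels accumulate, while the reparative resampling holds representation closer to its target.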