Julia Dressel and Hany Farid examine the accuracy, fairness, and limits of predicting recidivism with algorithms. They analyze COMPAS, a widely used commercial risk assessment tool that predicts the likelihood that a criminal defendant will reoffend. Their study shows that COMPAS is no more accurate or fair than predictions made by people with little or no criminal justice expertise, and that a simple linear predictor with only two features, a defendant's age and total number of prior convictions, is nearly as effective as COMPAS with its 137 features. The study also finds that more sophisticated classifiers do not improve prediction accuracy or fairness.
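A minimal sketch of such a two-feature linear predictor, assuming the publicly available ProPublica COMPAS data; the file path and column names (`age`, `priors_count`, `two_year_recid`) are illustrative assumptions, not specifics from the paper:

```python
# Sketch of a two-feature logistic regression for two-year recidivism,
# in the spirit of the paper's simple linear predictor. The file path
# and column names are assumptions for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("compas-scores-two-years.csv")  # hypothetical path

X = df[["age", "priors_count"]]  # the two features: age, prior convictions
y = df["two_year_recid"]         # 1 if rearrested within two years

clf = LogisticRegression()
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
print(f"Mean 10-fold accuracy: {scores.mean():.3f}")
```

Cross-validation here stands in for the repeated random train/test splits the authors used; the point is only that two features and a linear model suffice to approach COMPAS-level accuracy.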
The researchers compare the performance of COMPAS with human assessments. Human participants, given only a short description of each defendant, achieved accuracy and fairness similar to COMPAS: 62.1% accuracy on average versus 65.2% for COMPAS. When individual responses were pooled in a crowd-based approach, human accuracy rose to 67.0%, which was still not significantly better than COMPAS, and the AUC-ROC for human participants was nearly identical to that of COMPAS.
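A hedged sketch of the crowd-based aggregation (a majority vote across participants) and the AUC-ROC computation, using simulated responses since the study's raw data are not reproduced here:

```python
# Sketch: wisdom-of-the-crowd aggregation and AUC-ROC on simulated data.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
n_raters, n_defendants = 20, 50
# responses[i, j] = 1 if rater i predicted defendant j would reoffend
responses = rng.integers(0, 2, size=(n_raters, n_defendants))
y_true = rng.integers(0, 2, size=n_defendants)  # simulated outcomes

# Majority vote across raters gives the crowd's prediction per defendant
crowd_pred = (responses.mean(axis=0) >= 0.5).astype(int)
print("Crowd accuracy:", accuracy_score(y_true, crowd_pred))

# The fraction of raters predicting "reoffend" serves as a score for AUC-ROC
print("Crowd AUC-ROC:", roc_auc_score(y_true, responses.mean(axis=0)))
```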
The study also examines fairness. Human participants' overall accuracy was similar for black and white defendants, suggesting no racial bias at that level. However, the false-positive rate was significantly higher for black defendants than for white defendants, and the false-negative rate was significantly higher for white defendants. These results mirror those of COMPAS, whose predictions showed the same pattern of racial bias.
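These fairness measures come directly from per-group confusion matrices. A sketch, using a hypothetical DataFrame whose `race`, `predicted`, and `reoffended` columns are illustrative assumptions:

```python
# Sketch: false-positive and false-negative rates by racial group.
import pandas as pd

# Hypothetical toy data; a real analysis would use the COMPAS dataset.
df = pd.DataFrame({
    "race":       ["black", "black", "black", "white", "white", "white"],
    "predicted":  [1, 1, 0, 0, 0, 1],
    "reoffended": [0, 1, 1, 0, 1, 1],
})

def error_rates(g):
    # FP rate: predicted reoffend, among those who did not reoffend
    fpr = ((g.predicted == 1) & (g.reoffended == 0)).sum() / (g.reoffended == 0).sum()
    # FN rate: predicted no reoffense, among those who did reoffend
    fnr = ((g.predicted == 0) & (g.reoffended == 1)).sum() / (g.reoffended == 1).sum()
    return pd.Series({"false_positive_rate": fpr, "false_negative_rate": fnr})

print(df.groupby("race").apply(error_rates))
```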
The researchers also tested the effect of including the defendant's race in the description shown to participants. Including race did not significantly affect overall accuracy or fairness: accuracy remained similar for black and white defendants, and the disparity persisted, with false-positive rates still significantly higher for black defendants and false-negative rates still significantly higher for white defendants.
The study concludes that COMPAS is no more accurate or fair than human predictions, and that a simple linear classifier can achieve results similar to COMPAS. The researchers suggest that further research is needed to determine whether additional factors, such as dynamic risk factors, could improve prediction accuracy. They also caution that the use of algorithms in criminal justice decisions should be carefully evaluated to ensure fairness and accuracy.