October 12-16, 2015, Denver, Colorado, USA | Matt Fredrikson, Somesh Jha, Thomas Ristenpart
This paper presents a new class of model inversion attacks that exploit confidence information revealed by machine learning (ML) models, along with countermeasures to mitigate these attacks. The attacks target ML models used in privacy-sensitive applications such as personalized medicine, lifestyle surveys, and facial recognition. The authors demonstrate that these attacks can infer sensitive information, such as individuals' marital infidelity or pornographic viewing habits, from decision tree models, and recover recognizable images of people's faces from facial recognition models.
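At a high level, the facial-recognition attack casts inversion as an optimization problem: starting from a blank image, it repeatedly adjusts the pixels so as to increase the confidence the model assigns to the target person's label. The sketch below illustrates that idea only; the toy linear-softmax "model", its random weights, the 32x32 image size, and the step size are assumptions made for illustration, and the paper's full attack includes refinements (such as post-processing of candidate images) that are omitted here.

```python
# Minimal sketch of confidence-guided model inversion for a face classifier,
# assuming white-box access to a differentiable model. The toy linear-softmax
# "model" below (random weights, 32x32 images, 10 identities) is a placeholder
# for illustration only.
import numpy as np

rng = np.random.default_rng(0)
N_PIXELS, N_CLASSES = 32 * 32, 10
W = rng.normal(scale=0.01, size=(N_PIXELS, N_CLASSES))  # placeholder weights

def confidences(x):
    """Softmax confidence vector the model would reveal for image x."""
    z = x @ W
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def invert(target_label, steps=500, lr=0.1):
    """Gradient ascent on the confidence assigned to `target_label`,
    starting from a uniform gray image."""
    x = np.full(N_PIXELS, 0.5)
    for _ in range(steps):
        p = confidences(x)
        # d p[target] / d x for the linear-softmax toy model above
        grad = p[target_label] * (W[:, target_label] - W @ p)
        x = np.clip(x + lr * grad, 0.0, 1.0)   # keep pixels in [0, 1]
    return x.reshape(32, 32)

reconstruction = invert(target_label=3)
print(confidences(reconstruction.ravel())[3])  # confidence achieved for the target
```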
The paper evaluates these attacks on real-world data, showing that exploiting confidence information substantially improves inversion efficacy compared with attacks that ignore it. For example, an adversarial client can estimate whether a respondent in a lifestyle survey admitted to cheating on their significant other, and an attacker can recover recognizable images of people's faces from a facial recognition model given only their name and access to the model.
The paper also explores countermeasures, including privacy-aware decision tree training algorithms and rounding confidence values. These countermeasures are shown to significantly reduce the effectiveness of the attacks. The authors conclude that while model inversion attacks pose a privacy risk, they can be mitigated with minimal impact on the utility of the ML models.
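The rounding countermeasure can be applied directly at the prediction API: the returned label is still computed from the exact scores, but the confidence values exposed to clients are coarsened. Below is a minimal sketch of that idea, assuming a scikit-learn-style `predict_proba` interface; the function name and the rounding granularity are illustrative choices, not parameters taken from the paper.

```python
# Sketch of the confidence-rounding countermeasure: the label is decided on
# the exact scores, but clients only see confidences rounded to a coarse grid.
# `model.predict_proba` is a hypothetical scikit-learn-style call.
import numpy as np

def predict_with_rounded_confidence(model, x, grain=0.05):
    """Return (label, confidences) with confidences rounded to multiples of `grain`."""
    probs = model.predict_proba(x.reshape(1, -1))[0]
    label = int(np.argmax(probs))              # label decided on exact scores
    rounded = np.round(probs / grain) * grain  # coarsened scores exposed to clients
    return label, rounded
```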
The paper also situates its contributions relative to prior work: the earlier Fredrikson et al. attack is not effective in settings where the sensitive feature takes on a large number of possible values. The new attacks instead exploit the confidence information revealed by ML APIs and apply to a variety of settings, including decision trees and facial recognition models.
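For models such as decision trees that expose confidence scores over a small output space, the confidence-exploiting inversion can be sketched as a simple search: an adversary who knows every feature except the sensitive one, plus a marginal prior over its values, queries the model once per candidate value and keeps the candidate with the highest confidence-weighted score. The code below is a simplified illustration under those assumptions; `query_model`, `prior`, and the function name are hypothetical stand-ins, not the paper's exact estimator.

```python
# Simplified sketch of confidence-weighted inversion of a single sensitive
# feature against a black-box model that reveals confidence scores.
def invert_sensitive_feature(query_model, known_features, observed_label,
                             candidates, prior):
    """Return the candidate value with the highest confidence-weighted score.

    query_model(features) -> dict mapping labels to confidence scores
    known_features        -> dict of the non-sensitive feature values
    candidates            -> possible values of the sensitive feature
    prior                 -> dict mapping candidate values to marginal priors
    """
    best_value, best_score = None, float("-inf")
    for value in candidates:
        features = dict(known_features, sensitive=value)
        confidence = query_model(features).get(observed_label, 0.0)
        score = confidence * prior.get(value, 0.0)   # weight confidence by the prior
        if score > best_score:
            best_value, best_score = value, score
    return best_value
```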
Experimental results show that the new attacks recover recognizable images of people's faces from facial recognition models, and that skilled humans shown a reconstructed image can correctly identify the target person in a lineup with high accuracy.
The paper concludes that model inversion attacks pose a significant privacy risk, but that countermeasures can be implemented to mitigate this risk. The authors suggest that further research is needed to develop more robust countermeasures to protect against these attacks.