April 9, 2013 | vol. 110 | no. 15 | Michal Kosinski, David Stillwell, Thore Graepel
The study by Michal Kosinski, David Stillwell, and Thore Graepel demonstrates that digital records of human behavior, specifically Facebook Likes, can be used to predict a wide range of sensitive personal attributes, including sexual orientation, ethnicity, political views, personality traits, intelligence, happiness, substance use, parental separation, age, and gender. The analysis is based on a dataset of over 58,000 volunteers who provided their Facebook Likes, demographic profiles, and psychometric test results. The model applies dimensionality reduction to the Likes data and then predicts each attribute with logistic regression (for dichotomous attributes) or linear regression (for numeric ones). It correctly distinguishes between homosexual and heterosexual men in 88% of cases, African Americans and Caucasian Americans in 95% of cases, and Democrats and Republicans in 85% of cases. For the personality trait "Openness," the prediction accuracy is close to the test-retest accuracy of a standard personality test. The study highlights the potential for online personalization and the privacy implications of such predictions.
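
The two-stage approach described above (reduce the user-Like matrix to a few components, then fit a regression on those components) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny `likes` matrix and `labels` are synthetic, the function names are hypothetical, and power iteration with deflation stands in for whatever dimensionality reduction routine the authors actually ran. It is not the study's code and does not reproduce its results.

```python
import math
import random

def top_components(X, k, iters=300, seed=0):
    """Approximate the top-k right singular vectors of X
    using power iteration on X^T X with deflation."""
    rng = random.Random(seed)
    n_rows, n_cols = len(X), len(X[0])
    comps = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
        for _ in range(iters):
            # v <- X^T (X v)
            Xv = [sum(row[j] * v[j] for j in range(n_cols)) for row in X]
            v = [sum(X[i][j] * Xv[i] for i in range(n_rows)) for j in range(n_cols)]
            # Remove the directions already found (deflation).
            for c in comps:
                d = sum(a * b for a, b in zip(v, c))
                v = [a - d * b for a, b in zip(v, c)]
            norm = math.sqrt(sum(a * a for a in v))
            v = [a / norm for a in v]
        comps.append(v)
    return comps

def project(X, comps):
    """Map each user's Like vector onto the reduced components."""
    return [[sum(x * c for x, c in zip(row, comp)) for comp in comps] for row in X]

def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def fit_logistic(Z, y, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent."""
    w, b, n = [0.0] * len(Z[0]), 0.0, len(Z)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            err = sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b) - t
            gw = [g + err * zi for g, zi in zip(gw, z)]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(Z, w, b):
    return [sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b) for z in Z]

# Synthetic user-Like matrix: rows are users, columns are pages (1 = liked).
likes = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical dichotomous attribute

components = top_components(likes, k=2)
reduced = project(likes, components)
weights, bias = fit_logistic(reduced, labels)
probs = predict_proba(reduced, weights, bias)

accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, labels)) / len(labels)
print("training accuracy on the toy data:", accuracy)
```

For a numeric attribute such as age, the same reduced features would feed a linear regression instead of a logistic one. A realistic analysis would also evaluate on held-out users rather than on the training data, which this toy example does not do.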