2011 | Michel, Jean-Baptiste, Yuan Kui Shen, Aviva P. Aiden, Adrian Veres, Matthew K. Gray, The Google Books Team, Joseph P. Pickett, et al.
The article "Quantitative Analysis of Culture Using Millions of Digitized Books" by Jean-Baptiste Michel and colleagues presents a comprehensive study of cultural trends and phenomena through the analysis of a large corpus of digitized books. The corpus, comprising approximately 4% of all books ever printed, was created by Google's book digitization project and includes over 5 million books. The study focuses on linguistic and cultural changes reflected in the English language between 1800 and 2000, using computational methods to investigate various aspects of culture.
Key findings include:
1. **Culturomics**: The application of quantitative methods to cultural data, extending scientific inquiry to new phenomena.
2. **Lexicography**: Analysis of the English lexicon, showing its growth and the gap between dictionaries and the actual lexicon.
3. **Grammar**: Examination of the evolution of irregular verbs, highlighting slow and gradual changes.
4. **Collective Memory**: Study of how interest in historical events decays over time, with faster forgetting rates in recent decades.
5. **Cultural Adoption**: Evidence that new technologies and inventions are adopted more rapidly over time.
6. **Fame and Celebrity**: Tracking the rise and fall of fame, showing that people become famous earlier and faster, but that their fame fades sooner.
7. **Censorship**: Detection of censorship through changes in the frequency of names and topics in different languages, particularly during the Nazi regime.
The authors argue that culturomic tools can aid in various fields, including lexicography, linguistics, history, and social sciences, by providing quantitative insights into cultural trends and phenomena.