AI and the Problem of Knowledge Collapse

April 2024 | Andrew J. Peterson

While artificial intelligence has the potential to process vast amounts of data, generate new insights, and unlock greater productivity, its widespread adoption may entail unforeseen consequences. We identify conditions under which AI, by reducing the cost of access to certain modes of knowledge, can paradoxically harm public understanding. Although large language models are trained on vast amounts of diverse data, they naturally generate output towards the 'center' of the distribution. This is generally useful, but widespread reliance on recursive AI systems could lead to a process we define as "knowledge collapse", which we argue could harm innovation and the richness of human understanding and culture. However, unlike AI models, which cannot choose what data they are trained on, humans may strategically seek out diverse forms of knowledge if they perceive them to be worthwhile. To investigate this, we provide a simple model in which a community of learners or innovators chooses between traditional methods and a discounted AI-assisted process, and we identify conditions under which knowledge collapse occurs. In our default model, a 20% discount on AI-generated content generates public beliefs 2.3 times further from the truth than when there is no discount. We describe an empirical approach to measuring the distribution of LLM outputs and illustrate it with a specific example comparing the diversity of outputs across different models and prompting styles. Finally, we consider further research directions to counteract these harmful outcomes.
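To make the mechanism described in the abstract concrete, here is a minimal simulation sketch, not the paper's specification or calibration: a "true" knowledge distribution, a cheaper AI-mediated source that returns only samples near its center, agents who weigh a private value for tail knowledge against the cost difference, and a pooled public belief whose distance from the truth is tracked. The normal truth, the truncation threshold, the value distribution, and the error measure are all illustrative assumptions, and the sketch is not expected to reproduce the paper's 2.3x figure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters -- assumptions for this sketch, not the paper's calibration.
TRUE_MU, TRUE_SIGMA = 0.0, 1.0      # the "true" distribution of knowledge
TRUNC = 0.75                        # AI returns only samples within 0.75 sigma of the mean
FULL_COST = 1.0                     # cost of one sample from the full distribution
N_AGENTS, N_ROUNDS = 100, 50


def sample_full(n):
    """Costly 'traditional' search: draw from the full true distribution."""
    return rng.normal(TRUE_MU, TRUE_SIGMA, n)


def sample_ai(n):
    """Cheap AI-mediated search: draw only from the center of the distribution."""
    out = np.empty(0)
    while out.size < n:
        x = rng.normal(TRUE_MU, TRUE_SIGMA, n)
        out = np.concatenate([out, x[np.abs(x - TRUE_MU) <= TRUNC * TRUE_SIGMA]])
    return out[:n]


def run(discount, tail_value_scale=0.5):
    """Pool one sample per agent per round and measure the error of the public belief.

    Each agent weighs a private value for tail knowledge (drawn uniformly) against the
    extra cost of a full-distribution sample; this decision rule is an illustrative
    stand-in for the strategic choice described in the paper.
    """
    pooled = []
    extra_cost = FULL_COST - (1.0 - discount) * FULL_COST
    for _ in range(N_ROUNDS):
        values = rng.uniform(0.0, tail_value_scale, N_AGENTS)
        for v in values:
            pooled.append(sample_full(1)[0] if v > extra_cost else sample_ai(1)[0])
    pooled = np.asarray(pooled)
    # Error in the estimated spread is used as a simple proxy for the distance
    # between the public belief and the true distribution.
    return abs(pooled.std() - TRUE_SIGMA)


print("error with no discount   :", round(run(0.0), 3))
print("error with a 20% discount:", round(run(0.2), 3))
```

Under these assumptions, a larger discount pushes more agents toward the truncated AI source, so the pooled sample understates the spread of true knowledge and the public belief drifts further from the truth.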
The rise of AI-generated content and AI-mediated access to information may harm the future of human thought, information-seeking, and knowledge. AI-generated information initially has limited effects, but widespread adoption could lead to a "curse of recursion," in which access to the original diversity of human knowledge is increasingly mediated by a narrow subset of views. This could reinforce an "echo chamber" effect, in which individuals come to believe that neglected knowledge is of little value. By discounting the cost of access to certain kinds of information, AI may cause further harm through the "streetlight effect," in which search is disproportionately concentrated in the most easily accessible areas. We argue that the resulting curtailment of the tails of human knowledge would have significant consequences for fairness, inclusion, innovation, and cultural preservation.

In our simulation model, we also consider the possibility that humans strategically curate their information sources: if there is significant value in neglected knowledge, some individuals may invest effort to realize those gains. We identify a dynamic in which AI may nonetheless lead to "knowledge collapse," neglecting the long tails of knowledge and creating a degenerately narrow perspective. We provide a model in which individuals decide whether to rely on cheaper AI technology or to invest in samples from the full distribution of true knowledge, and we examine the conditions under which individuals can prevent knowledge collapse. We then outline an approach to defining and measuring output diversity and provide an illustrative example, and we conclude with possible solutions to prevent knowledge collapse in the AI era. Previous work highlights concerns about the impact of technology on knowledge access.
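To illustrate how the diversity of LLM outputs might be quantified in practice, the sketch below scores a set of completions with two generic lexical measures, distinct n-grams and mean pairwise Jaccard distance. These metrics, the condition names, and the placeholder completions are assumptions for the example; the paper's own measurement approach may differ.

```python
from itertools import combinations


def ngrams(text, n=2):
    """Lowercased word n-grams of one completion."""
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(max(len(toks) - n + 1, 0))}


def distinct_n(outputs, n=2):
    """Share of all generated n-grams that are unique (higher = more lexical diversity)."""
    grams = [g for out in outputs for g in ngrams(out, n)]
    return len(set(grams)) / max(len(grams), 1)


def mean_pairwise_jaccard_distance(outputs, n=2):
    """Average (1 - Jaccard similarity of n-gram sets) over all pairs of completions."""
    sets = [ngrams(out, n) for out in outputs]
    dists = [1.0 - len(a & b) / max(len(a | b), 1) for a, b in combinations(sets, 2)]
    return sum(dists) / max(len(dists), 1)


# Hypothetical usage: map each model or prompting style to completions sampled from the
# same prompt. The keys and placeholder completions below are not taken from the paper.
outputs_by_condition = {
    "model_a / default prompt": ["completion 1 ...", "completion 2 ...", "completion 3 ..."],
    "model_a / diversity-encouraging prompt": ["completion 1 ...", "completion 2 ...", "completion 3 ..."],
}
for condition, outputs in outputs_by_condition.items():
    print(condition, distinct_n(outputs), mean_pairwise_jaccard_distance(outputs))
```

Comparing such scores across models and prompting styles gives one simple, reproducible way to ask whether AI-mediated outputs cluster around the center of the distribution or preserve its tails.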