This paper introduces the Collective Predictive Coding (CPC) hypothesis, which proposes that symbol emergence in human societies can be understood as decentralized Bayesian inference. The hypothesis is grounded in probabilistic generative models and language games, including the Metropolis–Hastings naming game. CPC extends predictive coding (PC) from adaptation by an individual to adaptation by a whole society, suggesting that symbol systems emerge through collaborative, decentralized inference. The hypothesis posits that symbol emergence follows a society-wide version of the free-energy principle (FEP), a principle that has been recognized as a general account of the human brain and cognition. CPC also offers a new explanation for why large language models (LLMs) appear to possess experience-based knowledge about the world even though they have neither sensory organs nor bodies. The paper reviews past approaches to symbol emergence systems (SESs), surveys related prior studies, discusses CPC-based generalizations, and highlights future challenges and potential cross-disciplinary research avenues. CPC is proposed as a general framework for computational models of SESs, built on pre-existing constructive models and their variants. It provides an approach for developing such computational models and introduces a learning algorithm for artificial agents that realize symbol emergence through decentralized communication, and it establishes a theoretical connection between PC, FEP, and symbol emergence. The paper is organized into sections that review SESs, describe probabilistic generative models for symbol emergence, discuss the CPC hypothesis and its relationship with existing theories, and conclude with future directions.
The main contribution of the CPC hypothesis is a computational perspective on symbol emergence that integrates PC, FEP, and symbol emergence. The hypothesis suggests that symbol systems emerge through decentralized Bayesian inference, which can be regarded as an extension of the Bayesian brain concept to a Bayesian society. The paper also discusses the potential link between the CPC hypothesis and the free-energy principle, positing that symbol emergence follows the society-wide free-energy principle.
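As a rough illustration of how the Metropolis–Hastings naming game mentioned above can act as decentralized Bayesian inference, the sketch below simulates a toy version of the game. It is not taken from the paper and rests on several simplifying assumptions: a single object, two agents, fixed and strictly positive likelihood tables, a uniform prior over signs, and one shared "current sign" that the listener either keeps or replaces. The names `mh_naming_game`, `product_of_experts`, `lik_a`, and `lik_b` are hypothetical. The speaker proposes a sign from its own posterior, and the listener accepts it using only its own likelihoods, so neither agent ever inspects the other's internal state.

```python
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def product_of_experts(lik_a, lik_b):
    """Posterior over signs given BOTH agents' observations (uniform prior)."""
    return normalize([a * b for a, b in zip(lik_a, lik_b)])


def mh_naming_game(lik_a, lik_b, steps, seed=0):
    """Toy Metropolis-Hastings naming game for one object (illustrative only).

    lik_a[w] and lik_b[w] are hypothetical likelihoods
    P(agent's own observation | sign w), assumed strictly positive.
    Returns the empirical frequency of each sign over the run.
    """
    rng = random.Random(seed)
    n_signs = len(lik_a)
    likelihoods = [lik_a, lik_b]
    # With a uniform prior, each agent's posterior is its normalized likelihood.
    posteriors = [normalize(lik_a), normalize(lik_b)]
    current = rng.randrange(n_signs)  # assumed shared current sign
    counts = [0] * n_signs
    for t in range(steps):
        speaker = t % 2          # agents alternate roles
        listener = 1 - speaker
        # Speaker proposes a sign sampled from its own posterior.
        proposal = rng.choices(range(n_signs), weights=posteriors[speaker])[0]
        # Listener judges the proposal with its own likelihoods only.
        ratio = likelihoods[listener][proposal] / likelihoods[listener][current]
        if rng.random() < min(1.0, ratio):
            current = proposal
        counts[current] += 1
    return [c / steps for c in counts]


if __name__ == "__main__":
    lik_a = [0.6, 0.3, 0.1]
    lik_b = [0.2, 0.5, 0.3]
    print("empirical:", mh_naming_game(lik_a, lik_b, steps=200_000))
    print("target:   ", product_of_experts(lik_a, lik_b))
```

Under these assumptions the listener's acceptance ratio coincides with the standard Metropolis–Hastings ratio for a target proportional to the product of the two agents' likelihoods, so over a long run the empirical sign frequencies should approach the posterior that a single observer with access to both agents' observations would compute.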