CYC: A Large-Scale Investment in Knowledge Infrastructure

November 1995/Vol. 38, No. 11 | Douglas B. Lenat
CYC is a large-scale knowledge infrastructure project. About a person-century of effort has gone into building it: a universal schema of roughly 10^5 general concepts, with approximately 10^6 commonsense axioms crafted by hand and millions more inferred. The article discusses the assumptions behind such a project, the technical lessons learned, and the range of applications the technology enables.
CYC can be viewed as an expert system whose domain spans all everyday objects and actions. It includes assertions such as "You have to be awake to eat" and "You cannot remember events that have not happened yet." Such assertions are rarely written down in traditional texts, because they are fundamental knowledge that every reader is assumed to have already. They serve as a foundation for standardizing information retrieval, information integration, and consistency checking on the World-Wide Web and in electronic commerce.
The project deliberately avoided relying on natural language understanding (NLU) and automatic machine learning (ML) at the outset. Instead, a million axioms were crafted manually to build a critical mass of knowledge, which in turn makes further knowledge collection through NLU and ML possible. Developing CYC required handling causality, time, space, intention, contradiction, and uncertainty, as well as devising methods for browsing and editing large knowledge bases.
For knowledge representation, CYC uses first-order predicate calculus with second-order extensions. It avoids numeric certainty factors: assertions are assumed true by default, and meta-level assertions are used to state likelihoods. The project emphasizes pragmatism, aiming for a balance between expressiveness and efficiency.
CYC has various commercial applications, including information retrieval, dynamic linking of external information sources, word processing, and simulations. It can help with content checking, speech recognition, email routing, and direct marketing. The project also enables semantic file systems and supports role-playing games with intelligent agents.
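The default-reasoning idea summarized above (assertions assumed true unless overridden, no numeric certainty factors, likelihood stated as a separate meta-level assertion) can be illustrated with a minimal Python sketch. This is a toy, not CycL and not CYC's inference engine: the class TinyKB, its method names, the tuple encoding of facts, and the sample individuals Fred and Pat are all hypothetical, and treating likelihood as a pairwise "more likely than" relation is an assumption made here for illustration. The consistency check hard-codes a single rule taken from the text, "You have to be awake to eat."

```python
# Toy sketch only: hypothetical names, not CycL and not CYC's actual machinery.

class TinyKB:
    """Assertions are plain tuples; none carries a numeric certainty factor."""

    def __init__(self):
        self.defaults = set()   # assumed true unless explicitly overridden
        self.overrides = set()  # explicit denials; these beat defaults
        self.likelier = set()   # meta-level pairs (a, b): a is more likely than b

    def assert_default(self, fact):
        self.defaults.add(fact)

    def deny(self, fact):
        self.overrides.add(fact)

    def assert_likelier(self, a, b):
        # Assumption: likelihood is stated as a relation between two assertions.
        self.likelier.add((a, b))

    def holds(self, fact):
        return fact in self.defaults and fact not in self.overrides

    def prefer(self, a, b):
        """Return the assertion stated to be likelier, or None if unstated."""
        if (a, b) in self.likelier:
            return a
        if (b, a) in self.likelier:
            return b
        return None


def eating_violations(kb, events):
    """Flag eating events whose actor is not known to be awake at that time."""
    return [
        (who, when)
        for (action, who, when) in events
        if action == "eating" and not kb.holds(("awake", who, when))
    ]


kb = TinyKB()
kb.assert_default(("awake", "Fred", "noon"))
kb.assert_default(("awake", "Pat", "noon"))
kb.deny(("awake", "Pat", "noon"))  # an explicit exception overrides the default

events = [("eating", "Fred", "noon"), ("eating", "Pat", "noon")]
print(eating_violations(kb, events))  # [('Pat', 'noon')]
```

Keeping likelihood in a separate relation, rather than attaching a number to each fact, means an assertion is either in force or overridden; a comparison between two assertions is recorded only when someone actually needs to state it.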
CYC's development highlights the importance of large-scale AI projects, emphasizing the need for a robust and flexible system. The project has faced challenges, including handling context and ambiguity, but has made significant progress in creating a comprehensive knowledge base. The future of CYC lies in integrating with other technologies like neural networks and decision theory, offering a powerful tool for various applications.