Traditional Chinese Medicine Knowledge Graph Construction Based on Large Language Models

2024 | Yichong Zhang and Yongtao Hao
This study explores the use of large language models (LLMs) to construct a knowledge graph for Traditional Chinese Medicine (TCM) to enhance the representation, storage, and application of TCM knowledge. The knowledge graph, based on a graph structure, organizes entities, attributes, and relationships within the TCM domain. By leveraging LLMs, substantial TCM-related data is collected and embedded, generating precise representations that are transformed into a knowledge graph format. Experimental evaluations confirm the accuracy and effectiveness of the constructed graph, which captures various entities and their relationships and provides a solid foundation for TCM learning, research, and application. The knowledge graph has significant potential in TCM: it can aid teaching, disease diagnosis, and treatment decisions, and it contributes to TCM modernization. The paper concludes that an LLM-constructed knowledge graph offers a vital foundation for knowledge representation and application in the field, with potential for future expansion and refinement.

Keywords: traditional Chinese medicine; large language modeling; knowledge graph; interdisciplinary research

Traditional Chinese Medicine (TCM) embodies the unique wisdom of the Chinese nation regarding life, health, and medical treatment. With rich theoretical knowledge and clinical expertise, TCM holds significant academic and practical value. However, challenges persist in the modernization and intelligent application of TCM, hindering the effective utilization of knowledge and information within the field. Applying ontological theories and techniques from computer science to TCM knowledge organization, by constructing a Chinese medicine ontology and restructuring Chinese medicine information around knowledge, can provide a foundational data structure for data mining and knowledge discovery in the field of TCM.
Knowledge Graphs (KGs) serve as crucial data resources for knowledge management and applications, playing a key role in various fields such as semantic retrieval, knowledge inference, decision-making, question-answering, and system recommendations. The objective of constructing a knowledge graph for TCM is to structurally represent and link entities, relationships, and attributes related to TCM, forming a comprehensive and accurate network of TCM knowledge. Such a knowledge graph can assist healthcare professionals in disease differentiation and treatment, support clinical decision-making, and provide rich data for TCM research. Additionally, a TCM knowledge graph facilitates the integration of TCM with modern medicine, opening new possibilities for interdisciplinary medical research and applications.

The paper introduces the construction process of the TCM knowledge graph, which includes knowledge acquisition, knowledge extraction, knowledge fusion, and data storage. The data collection and cleaning process involves obtaining semi-structured and unstructured data from various online sources and ensuring high-quality standards through meticulous filtering. The prompt construction methodology, including named entity recognition (NER) and entity relationship extraction, is detailed, emphasizing the use of few-shot prompting techniques to guide LLMs effectively. The experimental results demonstrate the effectiveness of the proposed approach, with the iFLYTEK Spark Cognitive Large Model outperforming the ChatGPT model in terms of accuracy and efficiency.
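
To make the extraction and fusion steps more concrete, the following is a minimal Python sketch of few-shot prompt construction, parsing of a model response into triples, and a naive fusion step. It is an illustration under stated assumptions, not the authors' implementation: the prompt wording, the `head | relation | tail` output format, the example sentence, the relation names, and the names `Triple`, `build_prompt`, `parse_triples`, and `fuse` are all hypothetical, and no real LLM API is called.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One knowledge-graph edge: head entity, relation, tail entity."""
    head: str
    relation: str
    tail: str


# Hypothetical few-shot example (assumption: not taken from the paper).
FEW_SHOT_EXAMPLES = [
    (
        "Ginseng tonifies qi and is used for fatigue.",
        [("Ginseng", "has_effect", "tonifies qi"),
         ("Ginseng", "treats", "fatigue")],
    ),
]


def build_prompt(text: str) -> str:
    """Build a few-shot prompt that asks for one triple per line."""
    lines = [
        "Extract (head | relation | tail) triples from the TCM text.",
        "Output one triple per line.",
        "",
    ]
    for example_text, triples in FEW_SHOT_EXAMPLES:
        lines.append(f"Text: {example_text}")
        lines.append("Triples:")
        lines.extend(f"{h} | {r} | {t}" for h, r, t in triples)
        lines.append("")
    lines.append(f"Text: {text}")
    lines.append("Triples:")
    return "\n".join(lines)


def parse_triples(response: str) -> list[Triple]:
    """Keep only well-formed 'head | relation | tail' lines."""
    triples = []
    for line in response.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3 and all(parts):
            triples.append(Triple(*parts))
    return triples


def fuse(triples: list[Triple]) -> list[Triple]:
    """Naive fusion: drop case-insensitive duplicates, keep first occurrence."""
    seen = set()
    fused = []
    for t in triples:
        key = (t.head.lower(), t.relation.lower(), t.tail.lower())
        if key not in seen:
            seen.add(key)
            fused.append(t)
    return fused
```

In use, the string returned by `build_prompt` would be sent to whichever model is being evaluated, and the model's reply would be passed through `parse_triples` and `fuse` before storage. Real knowledge fusion would typically also need entity alignment (for example, merging synonyms of the same herb), which this sketch does not attempt.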