3 Jun 2024 | Rasoul Nikbakht, Mohamed Benzaghta, and Giovanni Geraci
The article introduces *Tspec-LLM*, an open-source dataset containing all 3GPP documents from Release 8 to Release 19 (1999–2023), totaling 13.5 GB across 30,137 documents and 535 million words. The dataset is designed to help large language models (LLMs) understand 3GPP specifications: it retains the original content, including tables and formulas, and preserves the structure of the source documents.

To evaluate its effectiveness, the authors built a questionnaire based on 3GPP Releases 15–17 and tested LLMs such as GPT-3.5, GPT-4, and Gemini Pro 1.0, which achieved accuracies of 44%, 46%, and 51%, respectively. Incorporating a retrieval-augmented generation (RAG) framework, which retrieves relevant passages from *Tspec-LLM* and supplies them to the model as context, raised those accuracies to 71%, 75%, and 72%, respectively. The authors also used the questionnaire to compare RAG performance across datasets, finding that *Tspec-LLM* yields better accuracy than the SPEC5G dataset.

They conclude that *Tspec-LLM* is a valuable resource for research on LLMs in the telecommunications domain, and that future work will focus on improving the RAG framework and developing more comprehensive questionnaires for specific tasks.
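To make the RAG step concrete, here is a minimal, self-contained sketch of the retrieval stage: rank document chunks by similarity to the question and prepend the best match to the prompt. It uses a simple bag-of-words cosine similarity as a stand-in for the embedding-based retrieval the paper uses, and the 3GPP-style snippets below are hypothetical placeholders, not actual Tspec-LLM chunks.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Sparse term-count vector for a text chunk (naive whitespace tokenization)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = vectorize(query)
    return sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)[:k]

# Hypothetical 3GPP-style snippets standing in for dataset chunks.
chunks = [
    "TS 38.211 defines the physical channels and modulation for NR.",
    "TS 23.501 specifies the system architecture for the 5G System.",
    "TS 36.331 covers the E-UTRA Radio Resource Control protocol.",
]

question = "Which spec defines NR physical channels and modulation?"
context = retrieve(question, chunks, k=1)[0]
# The retrieved chunk is prepended to the question before calling the LLM.
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(context)
```

In a production pipeline the bag-of-words scoring would be replaced by dense embeddings over properly chunked specification text, but the overall flow (retrieve top-k chunks, build an augmented prompt, query the LLM) is the same.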