NetConfEval: Can LLMs Facilitate Network Configuration?


June 2024 | Changjie Wang (KTH Royal Institute of Technology, Sweden), Mariano Scazzariello (KTH Royal Institute of Technology, Sweden), Alireza Farshin (NVIDIA, Sweden), Simone Ferlin (Red Hat, Sweden), Dejan Kostić (KTH Royal Institute of Technology, Sweden), Marco Chiesa (KTH Royal Institute of Technology, Sweden)
This paper explores the potential of Large Language Models (LLMs) to simplify and automate network configuration, making it more human-friendly. The authors design a benchmark, NetConfEval, to evaluate how well different LLMs facilitate and automate network configuration tasks, such as translating high-level policies and requirements into low-level network configurations and Python code.

The study focuses on four tasks: generating formal specifications from high-level requirements, translating requirements into API/function calls, developing routing algorithms, and generating low-level configurations for existing and new protocols. The results show that while some LLMs, particularly GPT-4, achieve high accuracy in translating human-language inputs into formal specifications and in generating basic routing algorithms, they also face challenges such as context-window limitations and the need for fine-tuning.

The authors propose principles for designing LLM-based systems that facilitate network configuration: splitting complex tasks into smaller subtasks, supporting task-specific verifiers, and keeping humans in the loop to ensure accuracy and reliability. Two GPT-4-based prototypes are presented: one that automatically configures P4-enabled devices from high-level requirements, and another that integrates LLMs into existing network synthesizers.

The paper also discusses the challenges and costs of running the experiments and the importance of prompt engineering. Overall, the study highlights the potential of LLMs to enhance network configuration while emphasizing the need for further research and development to address current limitations.
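To make the routing-algorithm task concrete: the benchmark asks models to produce working code such as a shortest-path computation over a network topology. The sketch below is a hedged illustration of the kind of output being evaluated; it is not taken from the paper or the benchmark itself, and the graph format and function name are assumptions chosen for this example. It implements Dijkstra's algorithm over a dict-of-dicts adjacency map:

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra's algorithm over a dict-of-dicts adjacency map.

    graph[u][v] is the (non-negative) cost of link u -> v.
    Returns (total_cost, [src, ..., dst]), or (inf, []) if dst is unreachable.
    """
    dist = {src: 0}   # best known cost to each node
    prev = {}         # predecessor on the best path
    heap = [(0, src)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == dst:
            break
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return float("inf"), []
    # Reconstruct the path by walking predecessors back from dst.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]

# Tiny example topology: the two-hop path r1 -> r2 -> r3 (cost 2)
# is cheaper than the direct link r1 -> r3 (cost 5).
net = {"r1": {"r2": 1, "r3": 5}, "r2": {"r3": 1}, "r3": {}}
cost, path = shortest_path(net, "r1", "r3")
# cost == 2, path == ["r1", "r2", "r3"]
```

A verifier in the spirit of the paper's task-specific checks could run candidate code like this against reference topologies and compare the returned costs and paths.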