April 14-20, 2024 | Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-Hsun (Peter) Chen, Shaowei Wang
The paper "LLMParser: An Exploratory Study on Using Large Language Models for Log Parsing" by Zeyang Ma et al. explores the potential of using Large Language Models (LLMs) for log parsing, a critical step in log-based analyses. Traditional log parsers struggle with diverse log formats, leading to suboptimal performance in downstream tasks. The authors propose LLMParser, an LLM-based log parser that leverages generative LLMs and few-shot tuning. Four LLMs—Flan-T5-small, Flan-T5-base, LLaMA-7B, and ChatGLM-6B—are used in the study. The evaluation on 16 open-source systems shows that LLMParser achieves statistically significantly higher parsing accuracy (96% on average) than state-of-the-art parsers. The study also finds that smaller LLMs can be more effective than larger, more complex ones, and that few-shot tuning is more efficient than in-context learning. Additionally, the authors analyze the impact of training size, model size, and pre-training on log parsing accuracy, revealing that LLMs pre-trained on logs from other systems do not always improve accuracy. The research provides empirical evidence for using LLMs in log parsing and highlights future research directions.
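For readers unfamiliar with the task: log parsing means abstracting the dynamic fields of a raw log line (IDs, addresses, counts) into a static template, and a few-shot setup supplies a handful of labeled (log, template) pairs to guide the model. The sketch below is illustrative only — the helper names and the placeholder convention `<*>` are common in log-parsing work but are not taken from the paper's implementation, and the heuristic masker stands in for what the LLM actually learns to do.

```python
import re

def mask_dynamic_fields(log_line: str) -> str:
    """Toy log 'parser': replace obviously dynamic tokens
    (IPv4 addresses, hex IDs, integers) with the <*> placeholder
    conventionally used in log templates."""
    line = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<*>", log_line)  # IPv4 addresses
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<*>", line)               # hex identifiers
    line = re.sub(r"\b\d+\b", "<*>", line)                          # bare integers
    return line

def build_few_shot_prompt(examples, query):
    """Assemble a few-shot prompt from labeled (log, template) pairs,
    ending with the unlabeled query log line for the model to complete."""
    shots = "\n".join(f"Log: {log}\nTemplate: {tpl}" for log, tpl in examples)
    return f"{shots}\nLog: {query}\nTemplate:"
```

A quick usage example: `mask_dynamic_fields("Connected to 10.0.0.1 in 52 ms")` yields `"Connected to <*> in <*> ms"`, and feeding such pairs to `build_few_shot_prompt` produces the kind of labeled context that few-shot tuning or in-context learning would condition on.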