Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing

6 Mar 2024 | Asmita¹, Yaroslav Oliinyk², Michael Scott², Ryan Tsang¹, Chongzhou Fang¹, Houman Homayoun¹
This paper studies vulnerabilities in BusyBox, an open-source program that bundles over 300 essential Linux commands into a single executable and is commonly used in embedded devices. The research aims to improve software testing for embedded Linux systems, BusyBox in particular, by leveraging Large Language Models (LLMs) and crash reuse, and it introduces two techniques to enhance fuzz testing. The first uses LLMs to generate target-specific initial seeds for mutation-based, coverage-guided fuzzing. This significantly increases the number of crashes found, showing that LLMs can efficiently handle the labor-intensive task of generating initial seeds. The second technique reuses crash data previously acquired from similar fuzzed targets: the crash data is provided directly to a new target before fuzzing on that target begins, which streamlines the testing process. The research demonstrates that these techniques can identify crashes in the latest BusyBox version without traditional fuzzing, and the authors triaged those crashes manually to determine their nature. The study also stresses the importance of updating BusyBox in commercial embedded devices, many of which still use older versions with known vulnerabilities. Fuzz testing was performed with AFL++, with a focus on the BusyBox awk applet. LLM-generated initial seeds led to significantly more crashes than random seeds, and crash reuse identified crashes in the latest BusyBox version without extensive fuzzing. Crash analysis revealed crashes in GLIBC functions, including free, malloc, write, strlen, strdup, regex, and strftime.
These crashes were linked to known vulnerabilities, such as CVE-2010-4051 and CVE-2015-8776, which persist in multiple software versions. The paper concludes that the proposed techniques are applicable to other BusyBox applets and can be used to improve software testing and vulnerability detection in embedded systems. The techniques have the potential to significantly reduce the time and resource demands of fuzz testing, which makes them valuable for black-box fuzzing when a comprehensive crash database is available.
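
The crash reuse idea, feeding previously collected crashing inputs to a new target before any fuzzing, can be sketched roughly as follows. This is a minimal illustration and not the authors' tooling: the function name `replay_crashes`, the directory layout, and the use of `@@` as a placeholder for the input file path (borrowed from the AFL++ command-line convention) are all assumptions made for this example.

```python
import signal
import subprocess
from pathlib import Path


def signal_name(signum):
    """Return a readable name for a signal number, falling back to the number."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return f"signal {signum}"


def replay_crashes(target_cmd, crash_dir, timeout=5.0):
    """Run every saved input against target_cmd; return the ones that still crash.

    target_cmd is a list of arguments. Each "@@" entry is replaced by the path
    of the input file (hypothetical convention chosen for this sketch).
    """
    reproduced = []
    for path in sorted(Path(crash_dir).iterdir()):
        if not path.is_file():
            continue
        cmd = [str(path) if arg == "@@" else arg for arg in target_cmd]
        try:
            result = subprocess.run(
                cmd,
                stdin=subprocess.DEVNULL,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # hangs are skipped, not counted as crashes, in this sketch
        # On POSIX, a negative return code means the process was killed by a signal.
        if result.returncode < 0:
            reproduced.append((path.name, signal_name(-result.returncode)))
    return reproduced
```

As a hypothetical usage, `replay_crashes(["./busybox", "awk", "-f", "@@"], "old_crashes")` would run each file in `old_crashes` as an awk program against a newer BusyBox build and list the inputs that still end in a fatal signal. Whether the saved inputs are awk programs or data files depends on how the original fuzzing campaign was set up, which this summary does not state. Inputs that reproduce would still need manual triage, as the paper describes.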