BetterV: Controlled Verilog Generation with Discriminative Guidance

2 May 2024 | Zehua Pei, Hui-Ling Zhen, Mingxuan Yuan, Yu Huang, Bei Yu
BetterV is a framework for generating Verilog code with large language models (LLMs) under discriminative guidance. It fine-tunes LLMs on a processed, domain-specific dataset and pairs them with generative discriminators that steer generation toward better Verilog implementations for electronic design automation (EDA) tasks. The dataset is built by collecting, filtering, and processing open-source Verilog modules from the internet into a clean, abundant corpus. Instruct-tuning then teaches the LLMs Verilog-specific knowledge, while data augmentation both enriches the training set and supplies labels for training task-specific generative discriminators. Each discriminator is trained with a hybrid loss function and guides the LLM to generate or modify Verilog directly from natural-language descriptions.

BetterV generates syntactically and functionally correct Verilog, surpassing GPT-4 on the VerilogEval benchmark. With task-specific discriminator guidance, it also achieves significant improvements on downstream EDA tasks, such as reducing netlist node counts in synthesis and cutting verification runtime in Boolean satisfiability (SAT) solving. Its data-augmentation strategy further helps address the scarcity of open Verilog resources.
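The paper does not publish its data pipeline, but the collect-filter-deduplicate step it describes can be sketched with simple heuristics. Everything below is illustrative: a real pipeline would validate files with an actual Verilog parser rather than keyword counting.

```python
import hashlib
import re

def looks_like_complete_module(src: str) -> bool:
    """Cheap structural check: balanced module/endmodule keywords.
    Stand-in for running a real Verilog parser over the file."""
    opens = len(re.findall(r"\bmodule\b", src))    # \b excludes "endmodule"
    closes = len(re.findall(r"\bendmodule\b", src))
    return opens > 0 and opens == closes

def dedup_and_filter(sources):
    """Drop exact duplicates (by content hash) and broken files."""
    seen, kept = set(), []
    for src in sources:
        digest = hashlib.sha256(src.encode()).hexdigest()
        if digest in seen or not looks_like_complete_module(src):
            continue
        seen.add(digest)
        kept.append(src)
    return kept

corpus = [
    "module add(input a, b, output y); assign y = a ^ b; endmodule",
    "module add(input a, b, output y); assign y = a ^ b; endmodule",  # duplicate
    "module broken(input a);",  # missing endmodule, filtered out
]
print(len(dedup_and_filter(corpus)))  # 1
```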
The discriminator-guided approach improves both training efficiency and practical utility, marking a pioneering advance in Verilog generation. The framework's contributions include applying controllable text generation to engineering optimization problems and introducing a downstream-task-driven method for Verilog generation, with potential applications to optimization problems in other domains. The reported results indicate that BetterV can significantly reduce the number of design iterations in industrial production and improve the robustness and generalization of the fine-tuned model. Its combination of data augmentation and discriminative guidance points to a promising direction for future research in automated circuit design.
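The guidance mechanism summarized above, reweighting the LLM's next-token distribution by a discriminator that scores partial outputs against a target property, can be sketched in the style of weighted decoding. The function names, toy vocabulary, and probabilities below are all hypothetical stand-ins for real model calls.

```python
import math

def guided_next_token(lm_logprobs, disc_score, prefix, vocab):
    """Greedy weighted decoding: combine the LM's next-token log-probability
    with a discriminator's estimate that the extended prefix satisfies the
    target property (e.g. "synthesizes to fewer netlist nodes").
        log p(x) = log p_LM(x | prefix) + log p_disc(property | prefix + x)
    """
    scored = {
        tok: lm_logprobs[tok] + math.log(disc_score(prefix + [tok]))
        for tok in vocab
    }
    return max(scored, key=scored.get)

# Toy example: the LM slightly prefers "always", but the discriminator
# strongly prefers "assign" (say, combinational logic scores better here).
vocab = ["assign", "always"]
lm = {"assign": math.log(0.4), "always": math.log(0.6)}
disc = lambda seq: 0.9 if seq[-1] == "assign" else 0.2
print(guided_next_token(lm, disc, ["wire y;"], vocab))  # assign
```

In practice this reranking would run once per decoding step over the model's full vocabulary (or a top-k subset, for efficiency), with the discriminator evaluated on each candidate continuation.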