COMPOSERX: MULTI-AGENT SYMBOLIC MUSIC COMPOSITION WITH LLMS

30 Apr 2024 | Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo
**Abstract:** Music composition is a complex task that requires understanding and generating information under long-range dependency and harmony constraints. Current large language models (LLMs) often fail at this task, even with modern techniques such as in-context learning and chain-of-thought prompting. To enhance LLMs' potential in music composition, the authors propose ComposerX, a multi-agent symbolic music generation framework. The multi-agent approach significantly improves the quality of music composed by GPT-4, producing coherent polyphonic compositions with captivating melodies while adhering to user instructions.

**Introduction:** Music shares structural similarities with language, which makes it a promising domain for language models. Recent advances in LLMs have opened pathways toward Artificial General Intelligence (AGI), but music creation remains comparatively unexplored. Current methods often struggle with advanced musical instructions and offer limited control options. ComposerX is a training-free, cost-effective, and unified approach that leverages the internal musical capabilities of GPT-4-turbo to generate high-quality polyphonic music.

**Method:** ComposerX uses a structured conversation chain between agents role-played by GPT-4: a Group Leader, Melody Agent, Harmony Agent, Instrument Agent, Reviewer Agent, and Arrangement Agent. Each agent has a specific role and communication pattern, which keeps the composition process structured and efficient. The system is evaluated on user prompts and compared with single-agent systems and specialized music generation models.
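The conversation chain described in the Method paragraph can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the AutoGen API: the `Agent` class, the `compose` function, the `echo_llm` stub, the role prompts, and the round-robin speaking order are all hypothetical. Only the six role names and the twelve-round cap come from the summary, and treating one round as one agent turn is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in for a chat-model call: (system_prompt, history) -> reply.
LLM = Callable[[str, str], str]


@dataclass
class Agent:
    name: str
    system_prompt: str  # illustrative role description, not the paper's prompt

    def speak(self, llm: LLM, transcript: List[Tuple[str, str]]) -> str:
        # Every agent sees the full shared transcript before replying.
        history = "\n".join(f"{who}: {msg}" for who, msg in transcript)
        return llm(self.system_prompt, history)


# Role names come from the summary; the speaking order is an assumption.
ROLES = [
    "Group Leader", "Melody Agent", "Harmony Agent",
    "Instrument Agent", "Reviewer Agent", "Arrangement Agent",
]


def compose(user_prompt: str, llm: LLM, max_rounds: int = 12) -> List[Tuple[str, str]]:
    """Run a round-robin conversation and return the transcript."""
    agents = [Agent(r, f"You are the {r} in a music composition team.") for r in ROLES]
    transcript = [("User", user_prompt)]
    for turn in range(max_rounds):  # assumption: one round = one agent turn
        agent = agents[turn % len(agents)]
        transcript.append((agent.name, agent.speak(llm, transcript)))
    return transcript


def echo_llm(system_prompt: str, history: str) -> str:
    # Offline stub so the sketch runs without network access;
    # swap in a real model call to generate actual music notation.
    return f"[{len(history.splitlines())} prior messages seen]"


if __name__ == "__main__":
    for who, msg in compose("A cheerful waltz in C major", echo_llm, max_rounds=6):
        print(f"{who}: {msg}")
```

A shared transcript is the simplest way to let later agents (for example the Reviewer Agent) react to earlier contributions. A real system would likely route messages more selectively than this round-robin loop does.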
**Experiments:** The experimental setup runs a multi-agent conversation on the AutoGen framework, with a maximum of twelve rounds of interaction. Both automatic evaluation and human listening tests are used to assess the system's performance. Results show that ComposerX outperforms single-agent baselines in controllability, cost-effectiveness, and musical quality, achieving a 32.2% rate of being perceived as human-like in the Turing test.

**Discussion:** ComposerX improves composition quality but remains limited in generating nuanced musical expression, translating natural language instructions into musical notation, and handling complex polyphonic music. Future work will address these challenges to further improve the system's capabilities.

**Conclusion:** ComposerX demonstrates the potential of LLMs in music composition by leveraging their reasoning abilities and their large knowledge base of music history and theory. The multi-agent approach significantly improves composition quality, offering a cost-effective and controllable solution for music generation.