2024 | Shengchao Hu, Li Shen, Ya Zhang, Dacheng Tao
This paper introduces CommFormer, a novel approach for learning multi-agent communication from a graph modeling perspective. The method conceptualizes the communication architecture among agents as a learnable graph, enabling the simultaneous optimization of the communication graph and the architectural parameters through a bi-level optimization process. By leveraging a continuous relaxation of the graph representation and incorporating attention mechanisms, CommFormer searches the communication graph and refines the architectural parameters in an end-to-end manner. Extensive experiments on a variety of cooperative tasks demonstrate the robustness and effectiveness of CommFormer: it consistently outperforms strong baselines and achieves performance comparable to methods that allow unrestricted information sharing among all agents. The method remains effective as the number of agents varies, enabling agents to develop more coordinated and sophisticated strategies. CommFormer addresses the communication challenges of multi-agent reinforcement learning by dynamically adjusting the communication graph during inference, ensuring efficient and effective communication. The approach is validated on environments including Predator-Prey, Predator-Capture-Prey, the StarCraft II Multi-Agent Challenge, and Google Research Football, demonstrating its adaptability and performance across diverse cooperative scenarios. The results highlight the effectiveness of the proposed method in achieving efficient communication and strong cooperation among agents.
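The core idea, treating the communication topology as a learnable object that is relaxed continuously so it can be trained by gradient descent and then used to mask inter-agent attention, can be sketched as follows. This is a minimal illustration in PyTorch, not the authors' implementation: the names (`CommGraphAttention`, `n_agents`, `d_model`, `tau`) are assumptions, the relaxation is shown here with a per-edge Gumbel-Softmax, and the paper's bi-level optimization (alternating updates of the graph parameters and the remaining network weights) is omitted for brevity.

```python
# Minimal sketch (assumed PyTorch, illustrative only): a learnable agent-to-agent
# communication graph, relaxed continuously so it trains end-to-end, used as an
# attention mask that routes messages only along sampled edges.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CommGraphAttention(nn.Module):
    def __init__(self, n_agents: int, d_model: int, tau: float = 1.0):
        super().__init__()
        # Edge logits: one learnable score per directed agent pair (the "graph").
        self.edge_logits = nn.Parameter(torch.zeros(n_agents, n_agents))
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.tau = tau
        self.d_model = d_model

    def sample_graph(self) -> torch.Tensor:
        # Continuous relaxation: per-edge Gumbel-Softmax over {keep, drop}.
        # Hard 0/1 sample in the forward pass; gradients flow via the soft sample.
        logits = torch.stack([self.edge_logits, -self.edge_logits], dim=-1)
        keep = F.gumbel_softmax(logits, tau=self.tau, hard=True)[..., 0]
        # Keep self-edges so every agent always attends to its own embedding.
        eye = torch.eye(keep.size(0), device=keep.device)
        return torch.clamp(keep + eye, max=1.0)

    def forward(self, obs_emb: torch.Tensor) -> torch.Tensor:
        # obs_emb: (batch, n_agents, d_model) per-agent observation embeddings.
        graph = self.sample_graph()                        # (n_agents, n_agents)
        q, k, v = self.q(obs_emb), self.k(obs_emb), self.v(obs_emb)
        scores = q @ k.transpose(-2, -1) / self.d_model ** 0.5
        scores = scores.masked_fill(graph.unsqueeze(0) == 0, float("-inf"))
        return F.softmax(scores, dim=-1) @ v               # messages along sampled edges


if __name__ == "__main__":
    layer = CommGraphAttention(n_agents=4, d_model=32)
    out = layer(torch.randn(2, 4, 32))
    print(out.shape)  # torch.Size([2, 4, 32])
```

In the paper's bi-level formulation, the edge parameters and the remaining architectural parameters would be updated in alternating steps rather than jointly as a single-loop sketch like this would do.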