Understanding Leveraging Symmetry in RL-based Legged Locomotion Control

This paper explores the use of symmetry in model-free reinforcement learning (RL) for legged locomotion control. The authors address the problem of underexploration in symmetric states, which leads to unnatural and suboptimal behaviors in robotic systems with morphological symmetries, such as legged robots.

They propose two ways to incorporate symmetry, each realized as a variant of Proximal Policy Optimization (PPO). PPOeqic modifies the network architectures to be strictly equivariant/invariant, so the symmetry holds as a hard constraint. PPOaug uses data augmentation to train actor-critics that are only approximately equivariant/invariant. Both are compared to a vanilla PPO baseline on challenging tasks such as loco-manipulation and bipedal locomotion.

PPOeqic consistently outperforms the other methods in training returns and sample efficiency. It also achieves more symmetric task performance and greater robustness in real-world scenarios. PPOaug performs similarly to vanilla PPO in some tasks but is less robust. Overall, the symmetry-incorporated approaches show better gait quality and higher robustness, and can be deployed zero-shot in real-world experiments.

The experiments demonstrate that symmetry-incorporated policies, particularly PPOeqic, lead to more symmetric and efficient locomotion in both simulation and real-world tasks, with improved robustness and generalization. The study highlights the importance of symmetry constraints in improving the performance of RL algorithms for legged robots, especially in tasks requiring symmetric behaviors.
The results suggest that incorporating symmetry can significantly enhance the effectiveness of RL in complex, dynamic environments.
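
The difference between the two approaches can be made concrete with a small sketch. The Python below is illustrative only and is not the authors' implementation: the three-element observation, the two-element action, and every function name are hypothetical. The `symmetrize` wrapper also uses plain group averaging over a single left-right reflection, a simpler stand-in for the equivariant network architectures the paper describes. `augment` mirrors each sample in the spirit of PPOaug, which only encourages symmetry, while `symmetrize` guarantees it by construction in the spirit of PPOeqic.

```python
import math
import random

# Hypothetical toy layout (an assumption, not taken from the paper):
#   observation = [left_hip_angle, right_hip_angle, lateral_velocity]
#   action      = [left_hip_torque, right_hip_torque]
# The left-right reflection swaps the legs and negates lateral quantities.

def mirror_obs(obs):
    left, right, lateral_vel = obs
    return [right, left, -lateral_vel]

def mirror_act(act):
    left, right = act
    return [right, left]

def augment(batch):
    """Data-augmentation style: add the mirrored copy of every sample.

    Each sample is (obs, act, ret). The return is kept unchanged, assuming
    the task assigns the same value to a state and its mirror image.
    """
    mirrored = [(mirror_obs(o), mirror_act(a), r) for o, a, r in batch]
    return batch + mirrored

def symmetrize(policy):
    """Hard-constraint style: make any policy exactly equivariant.

    Averages the direct output with the mirrored output computed on the
    mirrored observation (a reflection is its own inverse).
    """
    def equivariant_policy(obs):
        direct = policy(obs)
        via_mirror = mirror_act(policy(mirror_obs(obs)))
        return [0.5 * (d + m) for d, m in zip(direct, via_mirror)]
    return equivariant_policy

def make_random_policy(seed=0):
    """An arbitrary asymmetric policy: tanh of a random linear map."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    def policy(obs):
        return [math.tanh(sum(w * x for w, x in zip(row, obs)))
                for row in weights]
    return policy

if __name__ == "__main__":
    pi_eq = symmetrize(make_random_policy())
    obs = [0.3, -0.1, 0.5]
    # Acting on the mirrored state gives the mirrored action.
    print(pi_eq(mirror_obs(obs)))
    print(mirror_act(pi_eq(obs)))
```

With augmentation, the network is free to deviate from symmetry wherever the mirrored data does not pin it down. The wrapped policy satisfies the equivariance condition for every input, which is the property a strictly equivariant architecture provides without evaluating the network twice.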

Leveraging Symmetry in RL-based Legged Locomotion Control

27 Mar 2024 | Zhi Su, Xiaoyu Huang, Daniel Ordoñez-Apraez, Yunfei Li, Zhongyu Li, Qiayuan Liao, Giulio Turrisi, Massimiliano Pontil, Claudio Semini, Yi Wu, Koushil Sreenath